United States
Environmental Protection
Agency
EPA/600/R-07/106
July 2007
Transit Bus Load-Based
Modal Emission Rate Model
Development

-------
                                              EPA/600/R-07/106
                                                      July 2007
   TRANSIT BUS LOAD-BASED
MODAL EMISSION RATE MODEL DEVELOPMENT
                    by

               Chunxia Feng
              Randall Guensler
              Michael Rodgers
 School of Civil and Environmental Engineering
        Georgia Institute of Technology
                Atlanta, GA
        Contract No: EP-05C-000033
     EPA Project Officer: Sue Kimbrough
     U.S. Environmental Protection Agency
National Risk Management Research Laboratory
Air Pollution Prevention and Control Laboratory
      Research Triangle Park, NC 27711
     U.S. Environmental Protection Agency
      Office of Research and Development
           Washington, DC 20460

-------
                                     ABSTRACT
       Heavy-duty diesel vehicle (HDDV) operations are a major source of oxides of nitrogen
(NOx) and particulate matter (PM) emissions in metropolitan areas nationwide. Although HDDVs
constitute a small portion of the onroad fleet, they typically contribute more than 45% of
NOx and 75% of PM onroad mobile source emissions (U.S. EPA 2003). HDDV emissions are also a
large source of global greenhouse gas and toxic air contaminant emissions. Over the last several
decades, both government and private industry have made extensive efforts to regulate and control
mobile source emissions. The relative importance of emissions from HDDVs has increased
significantly because today's gasoline-powered vehicles are more than 95% cleaner than vehicles
in 1968.
       In current regional and microscale modeling conducted in every state except California,
HDDV emission rates are taken from the U.S. Environmental Protection Agency's (EPA's)
MOBILE6.2 model (U.S. EPA 2001a). EPA is currently developing a new set of modeling tools
for the estimation of emissions produced by onroad and off-road mobile sources. The new
Multi-scale mOtor Vehicle & equipment Emission System, known as MOVES (U.S. EPA 2001a),
is a modeling system designed to better predict emissions from onroad operations.
       The major effort of this research is to develop a new heavy-duty vehicle load-based modal
emission rate model that overcomes some of the limitations of existing models and emission
rate prediction methods. This model is part of the proposed Heavy-Duty Diesel Vehicle Modal
Emission Model (HDDV-MEM), which was developed by the Georgia Institute of Technology
(Guensler, et al. 2006). HDDV-MEM differs from other proposed HDDV modal models (Barth,
et al. 2004; Frey, et al. 2002; Nam 2003) in that the modeling framework first predicts
second-by-second engine power demand as a function of vehicle operating conditions and then
applies brake-specific emission rates to these activity predictions.
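       The two-step structure described above (predict engine power demand for each second,
then apply a brake-specific emission rate) can be illustrated with a minimal sketch. This is
not the HDDV-MEM engine power module: the road-load equation, the function names, and every
numeric value (vehicle mass, rolling resistance, drag area, accessory load, emission rate) are
assumptions chosen only for illustration.

```python
# Illustrative sketch only: all parameter values below are hypothetical.
G = 9.81  # gravitational acceleration, m/s^2


def engine_power_kw(speed_mps, accel_mps2, grade=0.0, mass_kg=15000.0,
                    c_rr=0.008, rho=1.2, cd_a=6.0, accessory_kw=5.0):
    """Estimate engine power demand (kW) for one second of operation.

    Uses a generic road-load equation (inertia + grade + rolling
    resistance + aerodynamic drag), assumed here for illustration.
    Negative tractive power (e.g. braking) is floored at zero, so the
    engine then carries only the assumed accessory load.
    """
    force_n = (mass_kg * accel_mps2            # inertial force
               + mass_kg * G * grade           # grade force (small-angle)
               + c_rr * mass_kg * G            # rolling resistance
               + 0.5 * rho * cd_a * speed_mps ** 2)  # aerodynamic drag
    tractive_kw = force_n * speed_mps / 1000.0
    return max(tractive_kw, 0.0) + accessory_kw


def emission_rate_g_per_s(power_kw, brake_specific_g_per_kwh):
    """Convert power (kW) and a brake-specific rate (g/kWh) to g/s."""
    return power_kw * brake_specific_g_per_kwh / 3600.0


def trip_emissions_g(trace, brake_specific_g_per_kwh):
    """Sum emissions over a second-by-second trace.

    `trace` is an iterable of (speed_mps, accel_mps2, grade) tuples,
    one tuple per second of vehicle activity.
    """
    return sum(
        emission_rate_g_per_s(engine_power_kw(v, a, g),
                              brake_specific_g_per_kwh)
        for v, a, g in trace
    )
```

       In this sketch, vehicle activity enters only through the power estimate, so the same
brake-specific rate can be applied to any speed, acceleration, and grade trace.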

-------
                                     FOREWORD
       The U.S. Environmental Protection Agency (EPA) is charged by Congress with protecting
the Nation's land, air, and water resources. Under a mandate of national environmental laws,
the agency strives to formulate and implement actions leading to a compatible balance between
human activities and the ability of natural systems to support and nurture life. To meet this
mandate, EPA's research program is providing data and technical support for solving environmental
problems today and building a science knowledge base necessary to manage our ecological
resources wisely, understand how pollutants affect our health, and prevent or reduce environmental
risks in the future.
       The National Risk Management Research Laboratory (NRMRL) is the agency's center
for investigation of technological and management approaches for preventing and reducing risks
from pollution that threaten human health and the environment. The focus of the laboratory's
research program is on methods and their cost-effectiveness for prevention and control of
pollution to air, land, water, and subsurface resources; protection of water quality in public water
systems; remediation of contaminated sites, sediments, and ground water; prevention and control
of indoor air pollution; and restoration of ecosystems. NRMRL collaborates with both public and
private sector partners to foster technologies that reduce the cost of compliance and to
anticipate emerging problems. NRMRL's research provides solutions to environmental problems by:
developing and promoting technologies that protect and improve the environment; advancing
scientific and engineering information to support regulatory and policy decisions; and providing
the technical support and information transfer to ensure implementation of environmental
regulations and strategies at the national, state, and community levels.
       This publication has been produced as part of the laboratory's strategic long-term
research plan. It is published and made available by EPA's Office of Research and Development to
assist the user community and to link researchers with their clients.
                                         Sally Gutierrez, Director
                                         National Risk Management Research Laboratory

-------
                               EPA REVIEW NOTICE

       This report has been peer and administratively reviewed by the U.S. Environmental
Protection Agency and approved for publication. Mention of trade names or commercial products
does not constitute endorsement or recommendation for use. This document is available to the
public through the National Technical Information Service, Springfield, Virginia 22161.

-------
                              TABLE OF CONTENTS

ABSTRACT	ii
FOREWORD	iii
EPA REVIEW NOTICE	iv
LIST OF ACRONYMS	xxii
SUMMARY	xxv
1.  INTRODUCTION	1-1
   1.1 Emissions from Heavy-Duty Diesel Vehicles	1-1
   1.2 Current Heavy-Duty Vehicle Emissions Modeling Practices	1-2
   1.3 Research Approaches and Objectives	1-2
   1.4 Summary of Research Contributions	1-3
   1.5 Report Organization	1-4
2.  HEAVY-DUTY DIESEL VEHICLE EMISSIONS	2-1
   2.1 How Diesel Engine Works	2-1
      2.1.1 The Internal Combustion Engine	2-1
      2.1.2 Comparison with the Gasoline Engine	2-3
   2.2 Diesel Engine Emissions	2-4
      2.2.1 Oxides of Nitrogen and Ozone Formation	2-4
      2.2.2 Fine Particulate Matter (PM2.5)	2-5
   2.3 Heavy-Duty Diesel Vehicle Emission Regulations	2-6
      2.3.1 National Ambient Air Quality Standards	2-6
      2.3.2 Heavy-Duty Engine Certification Standards	2-7
      2.3.3 Heavy-Duty Engine Emission Regulations	2-8
   2.4 Heavy-Duty Diesel Vehicle Emission Modeling	2-8
3.  HEAVY-DUTY DIESEL VEHICLE EMISSIONS MODELING	3-1
   3.1 VMT-Based Vehicle Emission Models	3-1
      3.1.1 MOBILE	3-1
      3.1.2 EMFAC	3-5
      3.1.3 Summary	3-6
   3.2 Fuel-Based Vehicle Emission Models	3-7
   3.3 Modal Emission Rate Models	3-8
      3.3.1 CMEM	3-8
      3.3.2 MEASURE	3-9
      3.3.3 MOVES	3-10
      3.3.4 HDDV-MEM	3-11
          3.3.4.1 Model Development Approaches	3-12
          3.3.4.2 Vehicle Activity Module	3-13
          3.3.4.3 Engine Power Module	3-14
          3.3.4.4 Emission Rate Module	3-18
          3.3.4.5 Emission Outputs	3-19
4.  EMISSION DATASET DESCRIPTION AND POST-PROCESSING PROCEDURE	4-1
   4.1 Transit Bus Dataset	4-1
      4.1.1 Data Collection Method	4-2
      4.1.2 Transit Bus Data Parameters	4-4
      4.1.3 Sensors, Inc. Data Processing Procedure	4-5
      4.1.4 Data Quality Assurance/Quality Check	4-6
      4.1.5 Database Formation	4-11
      4.1.6 Data Summary	4-12
   4.2 Heavy-duty Vehicle  Dataset	4-14
      4.2.1 Data Collection Method	4-14
      4.2.2 Heavy-duty Vehicle Data Parameters	4-16
      4.2.3 Data Quality Assurance/Quality Control Check	4-17
      4.2.4 Database Formation	4-20
      4.2.5 Data Summary	4-20
5.  METHODOLOGICAL APPROACH	5-1
   5.1 Modeling  Goal and Objectives	5-1
   5.2 Statistical  Method	5-2
      5.2.1 Parametric Methods	5-2
          5.2.1.1 The t-Test	5-2
          5.2.1.2 Ordinary Least Squares Regression	5-3
          5.2.1.3 Robust Regression	5-5
      5.2.2 Nonparametric Methods	5-5
          5.2.2.1 Chi-Square Test	5-5
          5.2.2.2 Kolmogorov-Smirnov Two-Sample Test	5-6
          5.2.2.3 Wilcoxon Mann-Whitney Test	5-7
          5.2.2.4 Analysis of Variance (ANOVA)	5-7
          5.2.2.5 HTBR	5-8
   5.3 Modeling Approach	5-11
   5.4 Model Validation	5-13
6.  DATA SET SELECTION AND ANALYSIS OF EXPLANATORY VARIABLES	6-1
   6.1 Data Set Used for Model Development	6-1
   6.2 Representative Ability of the Transit Bus Data Set	6-3
   6.3 Variability in Emissions Data	6-5
      6.3.1 Inter-bus Variability	6-5
      6.3.2 Descriptive Statistics for Emissions Data	6-8
   6.4 Potential Explanatory Variables	6-18
      6.4.1 Vehicle Characteristics 	6-19
      6.4.2 Roadway Characteristics	6-22
      6.4.3 Onroad Load Parameters	6-23
      6.4.4 Environmental Conditions	6-23
      6.4.5 Summary	6-24
   6.5 Selection of Explanatory Variables 	6-24
7.  MODAL ACTIVITY DEFINITIONS DEVELOPMENT	7-1
   7.1 Overview of Current Modal Activity Definitions 	7-1
   7.2 Proposed Modal Activity Definitions and Validation	7-3
   7.3 Conclusions	7-11
8. IDLE MODE DEVELOPMENT	8-1
   8.1 Critical Value for Speed in Idle Mode	8-1
   8.2 Critical Value for Acceleration in Idle Mode 	8-4
   8.3 Emission Rate Distribution by Bus in Idle Mode	8-8
   8.4 Discussions	8-13
      8.4.1 High HC Emissions	8-13
      8.4.2 High Engine Operating Parameters	8-15
   8.5 Idle Emission Rates Estimation	8-16
   8.6 Conclusions and Further Considerations	8-19
9. DECELERATION MODE DEVELOPMENT	9-1
   9.1 Critical Value for Deceleration Rates in Deceleration Mode	9-1
   9.2 Analysis of Deceleration Mode Data	9-5
      9.2.1 Emission Rate Distribution by Bus in Deceleration Mode	9-5
      9.2.2 Engine Power Distribution by Bus in Deceleration Mode	9-9
  9.3 The Deceleration Motoring Mode	9-12
  9.4 Deceleration Emission Rate Estimations	9-15
  9.5  Conclusions and Further Considerations	9-19
10. ACCELERATION MODE DEVELOPMENT	10-1
  10.1 Critical Value for Acceleration in Acceleration Mode 	10-1
  10.2 Analysis of Acceleration Mode Data	10-6
      10.2.1 Emission Rate Distribution by Bus in Acceleration Mode 	10-6
      10.2.2 Engine Power Distribution by Bus in Acceleration Mode 	10-10
   10.3 Model Development and Refinement	10-12
      10.3.1 HTBR Tree Model Development	10-12
          10.3.1.1 NOx HTBR Tree Model Development	10-16
          10.3.1.2 CO HTBR Tree Model Development	10-20
          10.3.1.3 HC HTBR Tree Model Development	10-22
      10.3.2 OLS  Model Development and Refinement	10-29
          10.3.2.1 NOx Emission Rate Model Development for Acceleration Mode	10-29
             10.3.2.1.1 Linear Regression Model with Engine Power	10-29
             10.3.2.1.2 Linear Regression Model with Engine Power and Vehicle
                      Speed	10-34
             10.3.2.1.3 Linear Regression Model with Dummy Variables	10-36
             10.3.2.1.4 Model Discussions	10-38
          10.3.2.2 CO Emission Rate Model Development for Acceleration Mode	10-42
             10.3.2.2.1 Linear Regression Model with Engine Power	10-42
             10.3.2.2.2 Linear Regression Model with Engine Power and Vehicle
                      Speed	10-46
             10.3.2.2.3 Linear Regression Model with Dummy Variables	10-47
          10.3.2.3 HC Emission Rate Model Development for Acceleration Mode	10-54
             10.3.2.3.1 Linear Regression with Engine Power	10-55
             10.3.2.3.2 Linear Regression Model with Dummy Variables	10-59
             10.3.2.3.3 Model Discussions	10-61
  10.4 Conclusions and Further Considerations	10-63
11. CRUISE MODE DEVELOPMENT	11-1
  11.1 Analysis of Cruise Mode Data	11-1
      11.1.1 Emission Rate Distribution by Bus in Cruise Mode	11-2
      11.1.2 Engine Power Distribution by Bus in Cruise Mode	11-5
  11.2 Model Development and Refinement	11-7
      11.2.1 HTBR Tree Model Development	11-7
          11.2.1.1 NOx HTBR Tree Model Development	11-11
          11.2.1.2 CO HTBR Tree Model Development	11-15
          11.2.1.3 HC HTBR Tree Model Development	11-19
      11.2.2 OLS Model Development and Refinement 	11-25
          11.2.2.1 NOx Emission Rate Model Development for Cruise Mode	11-25
             11.2.2.1.1 Linear Regression Model with Engine Power	11-25
             11.2.2.1.2 Linear Regression Model with Dummy Variables	11-30
             11.2.2.1.3 Model Discussion	11-32
          11.2.2.2 CO Emission Rate Model Development for Cruise Mode	11-35
             11.2.2.2.1 Linear Regression Model with Engine Power	11-35
             11.2.2.2.2 Linear Regression Model with Dummy Variables	11-39
             11.2.2.2.3 Model Discussion	11-41
          11.2.2.3 HC Emission Rate Model Development for Cruise Mode	11-43
             11.2.2.3.1 Linear Regression Model with Engine Power	11-43
             11.2.2.3.2 Linear Regression Model with Dummy Variables	11-47
             11.2.2.3.3 Model Discussion	11-49
   11.3 Conclusions and Further Considerations	11-51
12. MODEL VERIFICATION	12-1
   12.1 Engine Power vs. Surrogate Power Variables	12-1
   12.2 Mean Emission Rates vs. Linear Regression Model	12-4
   12.3 Mode-specific Load Based Modal Emission Rate Model vs. Emission Rate
       Models as a Function of Engine Load	12-6
   12.4 Separation of Acceleration and Cruise Modes	12-11
   12.5 MOBILE6.2 vs. Load-Based Modal Emission Rate Model	12-12
   12.6  Conclusions	12-13
13. CONCLUSIONS	13-1
   13.1 Transit Bus Emission Rate Models	13-3
   13.2 Model Limitations	13-4
   13.3 Lessons Learned	13-5
   13.4 Contributions	13-6
   13.5 Recommendation for Further Studies	13-6
14.  REFERENCES	14-1

-------
                                TABLE OF FIGURES

Figure 2-1 Actions of a four-stroke gasoline internal combustion engine, adapted
           from (HowStuffWorks 2005)	2-2
Figure 2-2 Actions of a four-stroke diesel engine (HowStuffWorks 2005)	2-3
Figure 3-1 FTP Transient Cycle (DieselNet 2006)	3-3
Figure 3-2 Urban Dynamometer Driving Schedule Cycle for Heavy-Duty
           Vehicle (DieselNet 2006)	3-3
Figure 3-3 CARB's Four Mode Cycles (CARB 2002)	3-6
Figure 3-4 A Framework of Heavy-Duty Diesel Vehicle Modal Emission
           Model (Guensler et al. 2005)	3-12
Figure 3-5 Primary Elements in the Drivetrain (Gillespie 1992)	3-14
Figure 4-1 Bus Routes Tested for U.S. EPA (Ensfield 2002)	4-3
Figure 4-2 SEMTECH-D in Back of Bus (Ensfield 2002)	4-4
Figure 4-3 Bus 380 GPS vs. ECM Vehicle  Speed (Ensfield 2002)	4-6
Figure 4-4 Example Check for Erroneous GPS Data for Bus 360 (Ensfield 2002)	4-8
Figure 4-5 Example Check for Synchronization Errors for Bus 360	4-9
Figure 4-6 Histograms of Engine Power for Zero Speed Data Based on Three
           Different Time Delays	4-10
Figure 4-7 General Criteria for Maximum Grades (Roess et al. 2004)	4-11
Figure 4-8 Onroad Diesel Emissions Characterization Facility (U.S. EPA 2001c)	4-14
Figure 4-9 Example Check for Erroneous Measured Horsepower for Test 3DRI2-2	4-18
Figure 4-10 Vehicle Speed Correlation (U.S. EPA 2001c)	4-19
Figure 4-11 Vehicle Speed Error for Different Speed Ranges (U.S. EPA 2001c)	4-19
Figure 6-1 HTBR Regression Tree Result for NOx Emission Rate for All Data Sets	6-2
Figure 6-2 HTBR Regression Tree Result for CO Emission Rate for All Data Sets	6-2
Figure 6-3 HTBR Regression Tree Result for HC Emission Rate for All Data Sets	6-3
Figure 6-4 Transit Bus Speed-Acceleration Matrix	6-4
Figure 6-5 Test Environmental Conditions	6-5
Figure 6-6 Median and Mean of NOx Emission Rates by Bus	6-6
Figure 6-7 Median and Mean of CO Emission Rates by Bus	6-7
Figure 6-8 Median and Mean of HC Emission Rates by Bus	6-7
Figure 6-9 Empirical Cumulative Distribution Function Based on Bus Based
           Median Emission Rates for Transit Buses	6-8
Figure 6-10 Histogram, Boxplot, and Probability Plot of NOx Emission Rate	6-9
Figure 6-11 Histogram, Boxplot, and Probability Plot of CO Emission Rate 	6-10
Figure 6-12 Histogram, Boxplot, and Probability Plot of HC Emission Rate	6-10
Figure 6-13 Histogram, Boxplot, and Probability Plot of Truncated NOx
           Emission Rate	6-12
Figure 6-14 Histogram, Boxplot, and Probability Plot of Truncated CO
           Emission Rate	6-13
Figure 6-15 Histogram, Boxplot, and Probability Plot of Truncated HC
           Emission Rate	6-13
Figure 6-16 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           NOx Emission Rate	6-15
Figure 6-17 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           CO Emission Rate  	6-15
Figure 6-18 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           HC Emission Rate  	6-16
Figure 6-19 The X Classes and  Typical Vehicle Configurations	6-20
Figure 6-20 Throttle Position vs. Engine Power for Transit Bus Data Set	6-27
Figure 6-21  Scatter plots for environmental parameters	6-29
Figure 7-1 Average NOx Modal Emission Rates for Different Activity Definitions	7-5
Figure 7-2 Average CO Modal Emission Rates for Different Activity Definitions	7-5
Figure 7-3 Average HC Modal Emission Rates for Different Activity Definitions	7-6
Figure 7-4 HTBR Regression Tree Result for NOx Emission Rate	7-8
Figure 7-5 HTBR Regression Tree Result for CO Emission Rate	7-8
Figure 7-6 HTBR Regression Tree Result for HC Emission Rate	7-9
Figure 8-1 Engine Power vs. NOx Emission Rate for Three Critical Values	8-2
Figure 8-2 Engine Power vs. CO Emission Rate for Three Critical Values	8-2
Figure 8-3 Engine Power vs. HC Emission Rate for Three Critical Values	8-3
Figure 8-4 Engine Power Distribution for Three Critical Values based on NOx Emissions	8-3
Figure 8-5 Engine Power vs. NOx Emission Rate for Four Options	8-5
Figure 8-6 Engine Power vs. CO Emission Rate for Four Options	8-6
Figure 8-7 Engine Power vs. HC Emission Rate for Four Options	8-6
Figure 8-8 Engine Power Distribution for Four Options based on NOx Emission Rates	8-7
Figure 8-9 Histograms of Three Pollutants for Idle Mode	8-9
Figure 8-10 Median and Mean of NOx Emission Rates in Idle Mode by Bus	8-9
Figure 8-11 Median and Mean of CO Emission Rates in Idle Mode by Bus	8-10
Figure 8-12 Median and Mean of HC Emission Rates in Idle Mode by Bus	8-10
Figure 8-13 Histograms of Engine Power in Idle Mode by Bus	8-12
Figure 8-14 Tree Analysis Results for High HC Emission Rates by Bus and Trip	8-14
Figure 8-15 Time Series Plot for Bus 360 Trip 4 Idle Segment 1 (130 Seconds)	8-14
Figure 8-16 Time Series Plot for Bus 360 Trip 4 Idle Segment 38 (516 Seconds)	8-15
Figure 8-17 Time Series Plot for Bus 372 Trip 1 Idle Segment 1 (500 Seconds)	8-15
Figure 8-18 Time Series Plot for Bus 383 Trip 1 Idle Segment 12 (1258 Seconds)	8-16
Figure 8-19 Graphical Illustration of Bootstrap (Adapted from Li 2004)	8-17
Figure 8-20 Bootstrap Results for Idle Emission Rate Estimation	8-18
Figure 9-1 Engine Power Distribution for Three Options	9-3
Figure 9-2 Engine Power vs. NOx Emission Rate for Three Options	9-3
Figure 9-3 Engine Power vs. CO Emission Rate for Three Options	9-4
Figure 9-4 Engine Power vs. HC Emission Rate for Three Options	9-4
Figure 9-5 Histograms of Three Pollutants for Deceleration Mode	9-5
Figure 9-6 Median and Mean of NOx Emission Rates in Deceleration Mode by Bus	9-6
Figure 9-7 Median and Mean of CO Emission Rates in Deceleration Mode by Bus	9-6
Figure 9-8 Median and Mean of HC Emission Rates in Deceleration Mode by Bus	9-7
Figure 9-9 Histograms of Engine Power in Deceleration Mode by Bus	9-11
Figure 9-10 Engine Power vs. Vehicle Speed, Engine Power vs. Engine Speed,
           and Vehicle Speed vs. Engine Speed	9-12
Figure 9-11 Histograms for Three Pollutants in Deceleration Motoring Mode (a)
           and Deceleration Non-Motoring Mode (b)	9-13
Figure 9-12 Bootstrap Results for NOx Emission Rate Estimation in
           Deceleration Mode	9-16
Figure 9-13 Bootstrap Results for CO Emission Rate Estimation in
           Deceleration Mode	9-16
Figure 9-14 Bootstrap Results for HC Emission Rate Estimation in
           Deceleration Mode	9-17
Figure 9-15 Emission Rate Estimation Based on Bootstrap for Deceleration Mode	9-17
Figure 10-1 Engine Power Distribution for Three Options	10-2
Figure 10-2 Engine Power vs. NOx Emission Rate (g/s) for Three Options	10-2
Figure 10-3 Engine Power vs. CO Emission Rate (g/s) for Three Options	10-3
Figure 10-4 Engine Power vs. HC Emission Rate (g/s) for Three Options	10-3
Figure 10-5 Engine Power vs. Emission Rate for Acceleration Mode and Cruise Mode	10-6
Figure 10-6 Histograms of Three Pollutants for Acceleration Mode	10-7
Figure 10-7 Median and Mean of NOx Emission Rates in Acceleration Mode by Bus	10-8
Figure 10-8 Median and Mean of CO Emission Rates in Acceleration Mode by Bus	10-9
Figure 10-9 Median and Mean of HC Emission Rates in Acceleration Mode by Bus	10-9
Figure 10-10 Histograms of Engine Power in Acceleration Mode by Bus	10-11
Figure 10-11 Histogram, Boxplot, and Probability Plot of Truncated NOx
           Emission Rate in Acceleration Mode	10-13
Figure 10-12 Histogram, Boxplot, and Probability Plot of Truncated CO
           Emission Rate in Acceleration Mode	10-14
Figure 10-13 Histogram, Boxplot, and Probability Plot of Truncated HC
           Emission Rate in Acceleration Mode	10-14
Figure 10-14 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           NOx Emission Rate in Acceleration Mode	10-15
Figure 10-15 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           CO Emission Rate in Acceleration Mode	10-15
Figure 10-16 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           HC Emission Rate in Acceleration Mode	10-16
Figure 10-17 Original Untrimmed Regression Tree Model for Truncated
           Transformed NOx Emission Rate in Acceleration Mode	10-17
Figure 10-18 Reduction in Deviation with the Addition of Nodes of Regression
           Tree for Truncated Transformed NOx Emission Rate in Acceleration Mode	10-18
Figure 10-19 Trimmed Regression Tree Model for Truncated Transformed NOx
           Emission Rate in Acceleration Mode	10-18
Figure 10-20 Original Untrimmed Regression Tree Model for Truncated
           Transformed CO Emission Rate in Acceleration Mode	10-20
Figure 10-21 Reduction in Deviation with the Addition of Nodes of Regression
           Tree for Truncated Transformed CO Emission Rate in Acceleration Mode	10-21
Figure 10-22 Trimmed Regression Tree Model for Truncated Transformed CO
           Emission Rate in Acceleration Mode	10-21
Figure 10-23 Original Untrimmed Regression Tree Model for Truncated
           Transformed HC Emission Rate in Acceleration Mode	10-23
Figure 10-25 Trimmed Regression Tree Model for Truncated Transformed
           HC in Acceleration Mode	10-25
Figure 10-26 Secondary Trimmed Regression Tree Model for Truncated
           Transformed HC Emission Rate in Acceleration Mode	10-26
Figure 10-27 Final Regression Tree Model for Truncated Transformed HC
           and Engine Power in Acceleration Mode	10-28
Figure 10-28 QQ and Residual vs. Fitted Plot for NOx Model 1.1	10-31
Figure 10-29 QQ and Residual vs. Fitted Plot for NOx Model 1.2	10-32
Figure 10-30 QQ and Residual vs. Fitted Plot for NOx Model 1.3	10-33
Figure 10-31 QQ and Residual vs. Fitted Plot for NOx Model 1.4	10-36
Figure 10-32 QQ and Residual vs. Fitted Plot for NOx Model 1.5	10-38
Figure 10-33 QQ and Residual vs. Fitted Plot for CO Model 2.1	10-43
Figure 10-34 QQ and Residual vs. Fitted Plot for CO Model 2.2	10-44
Figure 10-35 QQ and Residual vs. Fitted Plot for CO Model 2.3	10-45
Figure 10-36 QQ and Residual vs. Fitted Plot for CO Model 2.4	10-47
Figure 10-37 QQ and Residual vs. Fitted Plot for CO Model 2.5	10-50
Figure 10-38 QQ and Residual vs. Fitted Plot for CO Model 2.6	10-52
Figure 10-39 QQ and Residual vs. Fitted Plot for HC Model 3.1	10-56
Figure 10-40 QQ and Residual vs. Fitted Plot for HC Model 3.2	10-57
Figure 10-41 QQ and Residual vs. Fitted Plot for HC Model 3.3	10-58
Figure 10-42 QQ and Residual vs. Fitted Plot for HC Model 3.4	10-61
Figure 11-1 Histograms of Three Pollutants for Cruise Mode	11-2
Figure 11-2 Median and Mean of NOx Emission Rates in Cruise Mode by Bus	11-3
Figure 11-3 Median and Mean of CO Emission Rates in Cruise Mode by Bus	11-3
Figure 11-4 Median and Mean of HC Emission Rates in Cruise Mode by Bus	11-4
Figure 11-5 Histograms of Engine Power in Cruise Mode by Bus	11-6
Figure 11-6 Histogram, Boxplot, and Probability Plot of Truncated NOx Emission
           Rates in Cruise Mode	11-8
Figure 11-7 Histogram, Boxplot, and Probability Plot of Truncated CO Emission
           Rate in Cruise Mode	11-9
Figure 11-8 Histogram, Boxplot, and Probability Plot of Truncated HC Emission
           Rate in Cruise Mode	11-9
Figure 11-9 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           NOx Emission Rate in Cruise Mode	11-10
Figure 11-10 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           CO Emission Rate in Cruise Mode	11-10
Figure 11-11 Histogram, Boxplot, and Probability Plot of Truncated Transformed
           HC Emission Rate in Cruise Mode	11-11
Figure 11-12 Original Untrimmed Regression Tree Model for Truncated
           Transformed NOx Emission Rate in Cruise Mode	11-12
Figure 11-13 Reduction in Deviation with the Addition of Nodes of Regression
           Tree for Truncated Transformed NOx Emission Rate in Cruise Mode	11-12
Figure 11-14 Trimmed Regression Tree Model for Truncated Transformed NOx
           Emission Rate in Cruise Mode	11-14
Figure 11-15 Original Untrimmed Regression Tree Model for Truncated
           Transformed CO Emission Rate in Cruise Mode	11-16
Figure 11-16 Reduction in Deviation with the Addition of Nodes of Regression
           Tree for Truncated Transformed CO Emission Rate in Cruise Mode	11-16
Figure 11-17 Trimmed Regression Tree Model for Truncated Transformed CO
           Emission Rate in Cruise Mode	11-18
Figure 11-18 Original Untrimmed Regression Tree Model for Truncated
           Transformed HC Emission Rate in Cruise Mode	11-19
Figure 11-19 Trimmed Regression Tree Model for Truncated Transformed HC
           Emission Rate in Cruise Mode	11-21
Figure 11-20 Secondary Trimmed Regression Tree Model for Truncated
           Transformed HC in Cruise Mode	11-22
Figure 11-21 Final Regression Tree Model for Truncated Transformed HC
           and Engine Power in Cruise Mode	11-24
Figure 11-22 QQ and Residual vs. Fitted Plot for NOx Model 1.1	11-27
Figure 11-23 QQ and Residual vs. Fitted Plot for NOx Model 1.2	11-28
Figure 11-24 QQ and Residual vs. Fitted Plot for NOx Model 1.3	11-29
Figure 11-25 QQ and Residual vs. Fitted Plot for NOx Model 1.4	11-32
Figure 11-26 QQ and Residual vs. Fitted Plot for CO Model 2.1	11-36
Figure 11-27 QQ and Residual vs. Fitted Plot for CO Model 2.2	11-37
Figure 11-28 QQ and Residual vs. Fitted Plot for CO Model 2.3	11-38
Figure 11-29 QQ and Residual vs. Fitted Plot for CO Model 2.4	11-40
Figure 11-30 QQ and Residual vs. Fitted Plot for HC Model 3.1	11-44
Figure 11-31 QQ and Residual vs. Fitted Plot for HC Model 3.2	11-45
Figure 11-32 QQ and Residual vs. Fitted Plot for HC Model 3.3	11-46
Figure 11-33 QQ and Residual vs. Fitted Plot for HC Model 3.4	11-48
Figure 12-1 QQ and Residual vs. Fitted Plot for NOx Model 1	12-4
Figure 12-2 Trimmed Regression Tree Model for Truncated Transformed NOx	12-7
Figure 12-3 QQ and Residual vs. Fitted Plot for Load-Based Only NOx Emission
           Rate Model	12-9

-------
                                 TABLE OF TABLES

Table 2-1 National Ambient Air Quality Standards (U.S. EPA 2006)	2-6
Table 2-2 Heavy-Duty Engine Emissions Standards (U.S. EPA 1997)	2-8
Table 3-1 Heavy-Duty Vehicle NOx Emission Rates in MOBILE6	3-4
Table 3-2 Heavy-Duty Vehicle CO Emission Rates in MOBILE6	3-4
Table 3-3 Heavy-Duty Vehicle HC Emission Rates in MOBILE6	3-4
Table 4-1 Buses Tested for U.S. EPA (Ensfield 2002)	4-2
Table 4-2 Transit Bus Parameters Given by the U.S. EPA (Ensfield 2002)	4-4
Table 4-3 List of Parameters Used in Explanatory Analysis for Transit Bus	4-12
Table 4-4 Summary of Transit Bus Database	4-13
Table 4-5 Onroad Tests Conducted with Pre-Rebuild Engine	4-15
Table 4-6 Onroad Tests Conducted with Post-Rebuild Engine	4-16
Table 4-7 List of Parameters Given in Heavy-duty Vehicle Dataset Provided by
         U.S. EPA	4-17
Table 4-8 List of Parameters Used in Explanatory Analysis for HDDV	4-20
Table 4-9 Summary of Heavy-Duty Vehicle Data (U.S. EPA 2001c)	4-21
Table 5-1 ANOVA Table for Single-Factor Study (Neter et al. 1996)	5-8
Table 6-1 Basic Summary Statistics for Emissions Rate Data for Transit Bus	6-9
Table 6-2 Basic Summary Statistics for Truncated Emissions Rate Data	6-12
Table 6-4 Percent of High Emission Points by Bus 	6-18
Table 6-5 Correlation Matrix for Transit Bus Data Set	6-25
Table 7-1 Comparison of Modal Activity Definition	7-3
Table 7-2 Four Different Mode Definitions and Modal Variables	7-4
Table 7-3 Results for Pairwise Comparison for Modal Average Estimates
         In Terms of P-value 	7-7
Table 7-4 Sensitivity Test Results for Four Mode Definition	7-9
Table 8-1 Engine Power Distribution for Three Critical Values for Three Pollutants	8-4
Table 8-2 Percentage of Engine Power Distribution for Three Critical Values for
         Three Pollutants	8-4
Table 8-3 Engine Power Distribution for Four Options for Three Pollutants	8-7
Table 8-4 Percentage of Engine Power Distribution for Three Critical Values
         for Three Pollutants	8-8
Table 8-5 Median, and Mean of Three Pollutants in Idle Mode by Bus	8-11
Table 8-6 Engine Power Distribution in Idle Mode by Bus	8-13
Table 8-7 Idle Mode Statistical Analysis Results for NOx, CO, and HC	8-17
Table 8-8 Idle Emission Rates Estimation and 95% Confidence Intervals
         Based on Bootstrap	8-18
Table 9-1 Engine Power Distribution for Three Options for Three Pollutants	9-2
Table 9-2 Percentage of Engine Power Distribution for Three Options for
         Three Pollutants	9-2
Table 9-3 Median, and Mean for  NOx, CO, and HC in Deceleration Mode by Bus	9-7
Table 9-4 High HC Emissions Distribution by Bus and Trip for Deceleration Mode	9-9
Table 9-5 Engine Power Distributions in Deceleration Mode by Bus	9-10
Table 9-6 Comparison of Emission Distributions between Deceleration Mode and
         Two Sub-Modes (Deceleration Motoring Mode and Deceleration
         Non-Motoring Mode)	9-14
Table 9-7 Emission Rate Estimation and 95% Confidence Intervals Based on
         Bootstrap for Deceleration Mode 	9-18
Table 10-1 Engine Power Distribution for Three Options for Three Pollutants	10-4
Table 10-2 Percentage of Engine Power Distribution for Three Options for Three
         Pollutants	10-4
Table 10-3  Engine Power Distribution for Acceleration Mode and Cruise Mode	10-5
Table 10-4 Median and Mean of  Three Pollutants in Acceleration Mode by Bus	10-7
Table 10-5 Engine Power Distribution in Acceleration Mode by Bus	10-10
Table 10-6 Original Untrimmed Regression Tree Results for Truncated Transformed
         NOx Emission Rate in Acceleration Mode	10-17

Table 10-7 Trimmed Regression Tree Results for Truncated Transformed NOx
         Emission Rate in Acceleration Mode	10-19

Table 10-8 Original Untrimmed Regression Tree Results for Truncated
         Transformed CO Emission Rate in Acceleration Mode	10-20

Table 10-9 Trimmed Regression Tree Results for Truncated Transformed CO
         Emission Rate in Acceleration Mode	10-22

Table 10-10 Original Untrimmed Regression Tree Results for Truncated
         Transformed HC Emission Rate in Acceleration Mode	10-23

Table 10-11 Trimmed Regression Tree Results for Truncated Transformed HC
         in Acceleration Mode	10-25

Table 10-12 Secondary Trimmed Regression Tree Results for Truncated
         Transformed HC Emission Rate in Acceleration Mode	10-27

Table 10-13 Final Regression Tree Results for Truncated Transformed HC
         and Engine Power in Acceleration Mode	10-28

Table 10-14 Regression Result for NOx Model 1.1	10-30

Table 10-15 Regression Result for NOx Model 1.2	10-32

Table 10-16 Regression Result for NOx Model 1.3	10-33

Table 10-17 Regression Result for NOx Model 1.4	10-35

Table 10-18 Regression Result for NOx Model 1.5	10-37

Table 10-19 Comparative Performance Evaluation of NOx Emission Rate Models	10-40

Table 10-20 Regression Result for CO Model 2.1	10-42

Table 10-21 Regression Result for CO Model 2.2	10-44

Table 10-22 Regression Result for CO Model 2.3	10-45

Table 10-23 Regression Result for CO Model 2.4	10-46

Table 10-24 Regression Result for CO Model 2.5	10-49

Table 10-25  Regression Result for CO Model 2.6	10-51

Table 10-26 Comparative Performance Evaluation of CO Emission Rate Models	10-53

Table 10-27  Regression Result for HC Model 3.1	10-55

Table 10-28 Regression Result for HC Model 3.2	10-57

Table 10-29 Regression Result for HC Model 3.3	10-58

-------
Table 10-31 Comparative Performance Evaluation of HC Emission Rate Models	10-62

Table 11-1 Engine Power Distribution for Cruise Mode  	11-1

Table 11-2 Median and Mean of Three Pollutants in Cruise Mode by Bus	11-4

 Table 11-3 Engine Power Distribution in Cruise Mode by Bus	11-5

Table 11-4 Original Untrimmed Regression Tree Results for Truncated Transformed
         NOx Emission Rate in Cruise Mode	11-13

Table 11-5 Trimmed Regression Tree Results for Truncated Transformed NOx
         Emission Rate in Cruise Mode	11-14

Table 11-6 Original Untrimmed Regression Tree Results for Truncated Transformed
         CO Emission Rate in Cruise Mode	11-17

Table 11-7 Trimmed Regression Tree Results for Truncated Transformed CO
         Emission Rate in Cruise Mode	11-18

Table 11-8 Original Untrimmed Regression Tree Results for Truncated Transformed
         HC Emission Rate in Cruise Mode	11-20

Table 11-9 Trimmed Regression Tree Results for Truncated Transformed HC
         Emission Rate in Cruise Mode	11-21

Table 11-10 Trimmed Regression Tree Results for Truncated Transformed HC in
         Cruise Mode	11-23

Table 11-11 Final Regression Tree Results for Truncated Transformed HC and
         Engine Power in Cruise Mode	11-24

Table 11-12 Regression Result for NOx Model 1.1	11-26

Table 11-13 Regression Result for NOx Model 1.2	11-28

Table 11-15 Regression Result for NOx Model 1.4	11-31

Table 11-16 Comparative Performance Evaluation of NOx Emission Rate Models	11-33

Table 11-17 Regression Result for CO Model 2.1	11-35

Table 11-18 Regression Result for CO Model 2.2	11-37

Table 11-19 Regression Result for CO Model 2.3	11-38

Table 11-20 Regression Result for CO Model 2.4	11-40

Table 11-21 Comparative Performance Evaluation of CO Emission Rate Models	11-41

Table 11-22 Regression Result for HC Model 3.1	11-43

Table 11-23 Regression Result for HC Model 3.2	11-45

Table 11-24 Regression Result for HC Model 3.3	11-46

-------
Table 11-25 Regression Result for HC Model 3.4	11-48
Table 11-26 Comparative Performance Evaluation of HC Emission Rate Models	11-49
Table 12-1 Regression Result for NOx Model 1	12-3
Table 12-2 Comparative Performance Evaluation between Mode-Only Models
         and Linear Regression Models	12-6
Table 12-3 Trimmed Regression Tree Results for Truncated Transformed NOx	12-8
Table 12-4 Regression Result for NOx Load-Based Only Emission Rate Model	12-9
Table 12-5 Comparative Performance Evaluation Between Load-Based Only
         Emission Rate (ER) Model and Load-Based Modal Emission Rate Model	12-10
Table 12-6 Comparative Performance Evaluation between Linear Regression with
         Combined Mode and Linear Regression with Acceleration and Cruise
         Modes	12-12
Table 12-7 Comparative Performance Evaluation between MOBILE 6.2 and Load-Based
         Modal ER Model	12-13
Table 13-1 Load Based Modal Emission Models	13-3

-------
                                 LIST OF ACRONYMS
%             percent
AADT         annual average daily traffic
AATA          Ann Arbor Transit Authority
Acc            acceleration
ANOVA        analysis of variance
APPCD        Air Pollution Prevention and Control Division
bhp            brake horsepower
BSFC          brake specific fuel consumption
C              Celsius
CARB          California Air Resources Board
CART          classification and regression tree
CE-CERT      College  of Engineering - Center for Environmental Research and Technology
CMEM        Comprehensive Modal Emissions Model
CO            carbon monoxide
deg            degree
df             degrees  of freedom
DPS           drag power surrogate
DVD          digital video disc
ECM          electronic control module
EMFAC        CARB's mobile source emission factor model
E(MS)          expected mean square
EPA           Environmental Protection Agency
F              Fahrenheit
FHWA          Federal Highway Administration
FR            Federal Register
FTP           Federal Test Procedure
g/bhp-hr        grams per brake-horsepower-hour
g/h            grams per hour
g/s            grams per second
GIS            geographic information system
GPS           global positioning system
GVWR        gross  vehicle weight rating
HC            hydrocarbon
HDD          heavy-duty diesel
HDDV         heavy-duty diesel vehicle

-------
HDDV-MEM   Heavy-Duty Diesel Vehicle-Modal Emission Model
HDV          heavy-duty vehicle
HDV8B        heavy-duty vehicle 8B
HDV-UDDS    heavy-duty vehicle urban dynamometer driving schedule
Hg            mercury
HHDDE        heavy-heavy duty diesel engine
HTBR         hierarchical tree-based regression
Hz            hertz
IC             internal combustion
IPS            inertial power surrogate
kPa            kilopascal
K/S            Kolmogorov-Smirnov
LAFY         Los Angeles freeway
LANF         Los Angeles non-freeway
lb             pound
lb-ft           pound-feet
LDV          light-duty vehicle
LHDDE        light-heavy duty diesel engine
MARTA        Metropolitan Atlanta Rapid Transit Authority
MDPV         medium duty passenger vehicle
MEASURE     Mobile Emissions Assessment System for Urban and Regional Evaluation
MM           method of moments
mg/m3         milligrams per cubic meter
MHDDE       medium-heavy duty diesel engine
MOBILE       EPA's mobile source emission rate model
MOBILE6      EPA's mobile source emission rate model
MOVES        Motor Vehicle Emission Simulator
MPE          mean prediction error
mpg           miles per gallon
mph           miles per hour
mph/s          miles per hour per second
MS            mean square
N2            nitrogen
NAAQS        National Ambient Air Quality Standards
NCSU         North Carolina State University
NGM          EPA's Next Generation Model (mobile sources)
NIPER         National Institute for Petroleum and Energy Research
NIST          National Institute of Standards and Technology
NO            nitric oxide
NO2           nitrogen dioxide
NONROAD    EPA's emission rate model for non-road sources

-------
NOx           nitrogen oxides
NRMRL         National Risk Management Research Laboratory
NYNF          New York non-freeway
O2            oxygen
O3            ozone
ODEC          Onroad Diesel Emissions Characterization
OLS           ordinary least squares
OTAQ          Office of Transportation and Air Quality
Pb            lead
PCV           positive crankcase ventilation
PERE          Physical Emission Rate Estimator
PM            particulate matter
PM10          particulate matter < 10 microns
PM2.5         particulate matter < 2.5 microns
ppmv          parts per million by volume
QA/QC         quality assurance/quality control
QQ            quantile-quantile
RARE          Regional Applied Research Effort
RMSE          root mean square error
RPM           revolutions per minute
SCFM          standard cubic feet per minute
SO2           sulfur dioxide
SS            sum of squares
SSE           sum of squares due to errors
SSTO          total sum of squares
SUV           sport utility vehicle
TB-EPDS       transit bus engine power demand simulator
TIUS          Truck Inventory and Use Survey
TRB           Transportation Research Board
UCR-CERT      University of California Riverside - Center for Environmental Research and Technology
UDDS          urban dynamometer driving schedule
U.S. EPA      U.S. Environmental Protection Agency
UV            ultraviolet
VIF           variance inflation factor
VMT           vehicle miles traveled
VOCs          volatile organic compounds
VSP           vehicle specific power
µg/m3         micrograms per cubic meter
µm            micron

-------
                                      SUMMARY

       Heavy-duty diesel vehicle (HDDV) operations are a major source of pollutant emissions
in major metropolitan areas. Accurate estimation of HDDV emissions is essential in air quality
planning efforts because highway and non-road heavy-duty diesel emissions account for a
significant fraction of the oxides of nitrogen (NOx) and particulate matter (PM) emissions
inventories. MOBILE6 (U.S. EPA 2002a), EPA's mobile source emission rate model, uses an
"average trip-based" approach to modeling as opposed to a more fundamental and robust modal
modeling approach.
       The major effort of this research is to develop a new heavy-duty vehicle load-based
modal emission rate model that overcomes some of the limitations of existing models and
emission rate prediction methods. This model is part of the proposed Heavy-Duty Diesel
Vehicle Modal Emission Model (HDDV-MEM), which was developed by the Georgia Institute of
Technology. HDDV-MEM first predicts second-by-second engine power demand as a function of
vehicle operating conditions and then applies brake-specific emission rates to these activity
predictions.
       To provide better estimates of microscale emissions, this modeling approach is designed
to predict second-by-second emissions from on-road vehicle operations. This research
statistically analyzes the database provided by EPA and yields a model for predicting
emissions at a microscale level based on engine power demand and driving mode. Research
results demonstrate the importance of including the influence of engine power demand on
emissions and of simulating engine power in real-world applications. The modeling approach
provides a significant improvement in HDDV emissions modeling compared to the current average
speed cycle-based emissions models.
-------
                                     CHAPTER 1
                                 1. INTRODUCTION

                     1.1 Emissions from Heavy-Duty Diesel Vehicles

             Heavy-duty diesel vehicle (HDDV) operations are a major source of oxides of
nitrogen (NOx) and particulate matter (PM) emissions in metropolitan areas nationwide.
Although HDDVs constitute a small portion of the on-road fleet, they typically contribute more
than 45% of NOx and 75% of PM on-road mobile source emissions (U.S. EPA 2003). HDDV
emissions are a large source of global greenhouse gas and toxic air contaminant emissions.
According to a 2002 Environmental Defense report, NOx causes many environmental problems,
including acid rain, haze, global warming, and nutrient overloading that leads to water quality
degradation (CEDF 2002). HDDV emissions are also harmful to human health and the environment
(SCAQMD 2000). Groundbreaking long-term studies of children's health conducted in California
have demonstrated that particle pollution may significantly reduce lung function growth in
children (Avol 2001, Gauderman 2002, Peters 1999). Previous studies have stressed the
significance of emissions from HDVs in urban non-attainment areas, especially for ozone (for
which nitrogen oxides are a precursor) and PM2.5 (Gautam and Clark 2003, Lloyd and Cackette
2001).
       Over the last several decades, both government and private industry have made extensive
efforts to regulate and control mobile source emissions. In 1961, the first automotive emissions
control technology in the nation, positive crankcase ventilation (PCV), was mandated by the
California Motor Vehicle State Bureau of Air Sanitation to control hydrocarbon crankcase
emissions, and the PCV requirement went into effect on domestic passenger vehicles for sale in
California in 1963 (CARB 2004). At the same time, the first federal Clean Air Act was enacted.
Although this act initially dealt only with reducing air pollution by setting emissions
standards for stationary sources such as power plants and steel mills, the amendments of 1965,
1966, and 1967 focused on establishing standards for automobile emissions (AMS 2005). Emission
control was first required on light-duty gasoline vehicles (LDVs) by U.S. EPA in the 1968 model
year. Developed and refined over a period of more than 30 years, these controls have become
more effective at reducing LDV emissions (FCAP 2004).


-------
       The relative importance of emissions from HDDVs has increased significantly because
today's gasoline-powered vehicles are more than 95% cleaner than vehicles in 1968. HDDVs
typically have a life cycle of over one million miles and may be on the road as long as 30
years. Given their high durability and reliability, they will continue to play a major role in
the emission inventory as goods movement increases, so modeling of HDDV emissions will become
increasingly important in air quality planning.

              1.2 Current Heavy-Duty Vehicle Emissions Modeling Practices

       In current regional and microscale modeling conducted in every state except California,
HDDV emission rates are taken from the U.S. Environmental Protection Agency's (EPA's)
MOBILE 6.2 model (U.S. EPA 2001b). MOBILE 6.2¹ emission rates were derived from baseline
emission rates (grams/brake horsepower-hour) developed in the laboratory using engine
dynamometer test cycles. While different driving cycles have been developed over the years,
dynamometer testing is conceptually designed to obtain a "representative sample" of vehicle
operations. These work-based emission rates are then modified through a series of conversion
and correction factors to obtain approximate emission rates in units of grams/mile that can be
applied to on-road vehicle activity (vehicle miles traveled) as a function of temperature,
humidity, altitude, average vehicle speed, etc. (Guensler 1993). The conversion process used to
translate laboratory emission rates to on-road emission rates employs fuel density, brake
specific fuel consumption, and fuel economy for each HDDV technology class. However, the
emission rate conversion process does not appropriately account for the impacts of roadway
operating conditions on brake specific fuel consumption and fuel economy (Guensler, et al. 1991).
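
       As a rough illustration of the conversion described above, the sketch below turns a
work-based rate (g/bhp-hr) into a distance-based rate (g/mile) with a conversion factor formed
from fuel density, brake specific fuel consumption, and fuel economy. The function names and
the numeric inputs are illustrative assumptions, not values taken from MOBILE 6.2.

```python
def conversion_factor(fuel_density_lb_per_gal, bsfc_lb_per_bhp_hr, fuel_economy_mpg):
    """Engine work required per mile traveled, in bhp-hr/mile.

    Units: (lb/gal) / [(lb/bhp-hr) * (mile/gal)] = bhp-hr/mile.
    """
    return fuel_density_lb_per_gal / (bsfc_lb_per_bhp_hr * fuel_economy_mpg)


def grams_per_mile(rate_g_per_bhp_hr, cf_bhp_hr_per_mile):
    """Convert a brake-specific emission rate to a per-mile emission rate."""
    return rate_g_per_bhp_hr * cf_bhp_hr_per_mile


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
cf = conversion_factor(fuel_density_lb_per_gal=7.1,
                       bsfc_lb_per_bhp_hr=0.4,
                       fuel_economy_mpg=5.0)
nox_g_per_mile = grams_per_mile(rate_g_per_bhp_hr=4.0, cf_bhp_hr_per_mile=cf)
```

       Because a single brake specific fuel consumption and fuel economy value is used per
technology class, the factor is constant across roadway operating conditions, which is the
limitation noted in the paragraph above.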
       The U.S. Environmental Protection Agency (U.S. EPA) is currently developing a new
set of modeling tools for the estimation of emissions produced by on-road and off-road mobile
sources. The new Motor Vehicle Emissions Simulator, known as MOVES² (Koupal, et al.
2004), is a modeling system designed to better predict emissions from on-road operations. The
philosophy behind MOVES is to develop a model that is as directly data-driven as possible,
meaning that emission rates are developed from second-by-second or binned emission rate data.

¹ MOBILE = Current mobile source emissions model used for State Implementation Plan emission inventories.
² MOVES = Motor Vehicle Emission Simulator, next generation mobile source emissions model. The model will
be used for State Implementation Plan emission inventories and will replace the current MOBILE model.

                         1.3 Research Approaches and Objectives

       The major effort of this research is to develop a new heavy-duty vehicle load-based
modal emission rate model that overcomes some of the limitations of existing models and
emission rate prediction methods. This model is part of the proposed Heavy-Duty Diesel Vehicle
Modal Emission Model (HDDV-MEM), which was developed by the Georgia Institute of Technology
(Guensler, et al. 2006). HDDV-MEM differs from other proposed HDDV modal models (Barth,
et al. 2004, Frey, et al. 2002, Nam 2003) in that the modeling framework first predicts
second-by-second engine power demand as a function of vehicle operating conditions and then
applies brake-specific emission rates to these activity predictions. This means that HDDV
emission rates are predicted as a function of engine horsepower loads for different driving
modes. Hence, the basic algorithm and matrix calculation in HDDV-MEM should be transferable to
MOVES. The new model implementation is similar in general structure to a previous modal
emission rate model, the Mobile Emissions Assessment System for Urban and Regional Evaluation
(MEASURE¹) model, developed by the Georgia Institute of Technology several years ago (Bachman
1998, Guensler, et al. 1998, Bachman, et al. 2000).
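
       The two-step structure described above (predict engine power each second, then apply a
brake-specific emission rate) can be sketched as follows. This is a minimal illustration, not
the HDDV-MEM implementation: the function name, the mode labels, the emission rates, and the
choice to treat negative (motoring) power as zero are all assumptions made for the example.

```python
SECONDS_PER_HOUR = 3600.0

# Hypothetical brake-specific rates (g/bhp-hr) by driving mode; illustrative only.
RATE_G_PER_BHP_HR = {"acceleration": 6.0, "cruise": 5.0}


def second_by_second_emissions(power_bhp, modes, rates=RATE_G_PER_BHP_HR):
    """Return grams emitted in each one-second interval.

    power_bhp: predicted engine power demand for each second (bhp)
    modes:     driving mode label for each second
    rates:     brake-specific emission rate (g/bhp-hr) for each mode
    """
    grams = []
    for power, mode in zip(power_bhp, modes):
        # Work done in one second, in bhp-hr; negative power is treated as zero (assumption).
        work_bhp_hr = max(power, 0.0) / SECONDS_PER_HOUR
        grams.append(rates[mode] * work_bhp_hr)
    return grams


# Three seconds of made-up activity data.
power = [120.0, 180.0, 90.0]
modes = ["acceleration", "acceleration", "cruise"]
per_second = second_by_second_emissions(power, modes)
total_grams = sum(per_second)
```

       Keeping the power prediction separate from the emission rate lookup means either step
can be replaced independently, for example when measured engine load is available.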
       This research has the following specific objectives:

       •   Develop a new load-based modal emission rate model to improve spatial/temporal
          emissions modeling;

       •   Develop an HDDV modal emission rate model to more accurately estimate on-road
          HDDV emissions;

       •   Develop a modal model that can be verified at multiple levels;

       •   Develop an HDDV modal emission rate model that can be integrated into MOVES.

                        1.4 Summary of Research Contributions

       This research makes four major contributions. First, a framework for emission rate
modeling suitable for predicting emissions at different scales (microscale, mesoscale, and
macroscale) is established. Because this model is developed using on-board emissions data
collected under real-world conditions, it will provide capabilities for integrating the
necessary vehicle activity data and emission rate algorithms to support second-by-second and
link-based emissions prediction. Combined with a GIS framework, this model will improve
spatial/temporal emissions modeling.
¹ MEASURE = Mobile Emissions Assessment System for Urban and Regional Evaluation Model. This model is a
prototype GIS-based modal emissions model.

-------
       Second, the relationship between engine power and emissions is explored and integrated
into the modeling framework. Research results indicate that engine power represents load in the
proposed model better than surrogate variables do. Because engine power plays an important role
in explaining the variability of emissions, load data should be measured during emission data
collection. Development of methods to simulate real-world engine power is equally important.
       Third, this research verifies that vehicle emission rates are highly correlated with modal
vehicle activity. To better understand driving modes, it is important to examine not only
emission distributions but also engine power distributions.
       Finally, a dynamic framework is created for further improvement. As more databases
become available, this approach could be re-run to obtain a more reliable load-based modal
emission model based on the same philosophy.

                                 1.5 Report Organization

       Chapter 2 examines the diesel fuel combustion process and its relationship to diesel
engine emissions formation. Chapter 3 reviews the existing heavy-duty vehicle emission models
and presents the proposed heavy-duty diesel vehicle modal emission model (HDDV-MEM).
Chapter 4 provides an overview of the emission rate testing databases provided by U.S. EPA, the
quality assurance and quality control (QA/QC) procedures used to review the validity of the
data, and the methods used to post-process these databases to correct data deficiencies. Chapter
5 discusses the statistical models considered for data analysis. Chapter 6 selects the database
used to develop the conceptual model and discusses the influence of explanatory variables
on emissions. Chapter 7 covers sensitivity tests of driving mode definitions and outlines the
potential impacts on derived models. Chapters 8 to 11 describe the emission models developed
for the idle, deceleration, acceleration, and cruise driving modes. Chapter 12 verifies the
research results. Finally, Chapter 13 presents a discussion of the research results and
conclusions.

-------
                                     CHAPTER 2
                   2. HEAVY-DUTY DIESEL VEHICLE EMISSIONS
             Diesel engines differ from gasoline engines in terms of the combustion processes
and engine size, giving rise to their different emission properties and therefore different
emissions standards. This chapter examines the diesel fuel combustion process and its
relationship to diesel engine emissions formation, followed by a summary of the emission
regulations for diesel engines.

                             2.1 How the Diesel Engine Works

             By far the predominant engine design for transportation vehicles is the
reciprocating internal combustion (IC) engine, which operates on either a four-stroke or a
two-stroke cycle. The two-stroke engine is commonly found in lower-power applications such as
snowmobiles, lawnmowers, mopeds, outboard motors, and motorcycles, while both gasoline and
diesel automotive engines are classified as four-stroke engines. To understand the formation
and control of emissions, it is necessary to first develop an understanding of the operation of
the internal combustion engine.

2.1.1 The Internal Combustion Engine

      Internal combustion engines generate power by converting the chemical energy stored in
fuels into mechanical energy. The engine is termed "internal combustion" because combustion
occurs in a confined space called a combustion chamber. Combustion of the fuel charge inside
the chamber causes a rapid rise in the temperature and pressure of the gases in the chamber,
which are permitted to expand. The expanding gases are used to move a piston, turbine blades,
a rotor, or the engine itself.
      The four-stroke gasoline engine cycle is also called the Otto cycle, in honor of Nikolaus
Otto, who is credited with inventing the process in 1867. The four piston strokes are
illustrated in Figure 2-1. The following processes take place during one cycle of operation:

-------
      1.  Intake stroke: the piston starts at the top, the intake valve opens, and the piston
moves down to let the engine take in a fresh charge composed of a mixture of fuel and air (for
a spark-ignition or gasoline engine) or air only (for an auto-ignition or diesel engine). (Part 1
of the figure.)

      2.  Compression stroke: the piston then moves back up to compress this fuel/air mixture
(gasoline engines) or the air only (diesel engines). In gasoline engines, combustion is started
by ignition from a spark plug. In diesel engines, auto-ignition occurs when fuel is injected into
the compressed air, which has been heated by compression to a temperature high enough to cause
self-ignition. (Part 2 of the figure.)

      3.  Expansion stroke: when the piston reaches the top of its stroke, the combustion process
results in a substantial increase in the gas temperature and pressure and drives the piston
down. (Part 3 of the figure.)

      4.  Exhaust stroke: once the piston hits the bottom of its stroke, the exhaust valve
opens and the exhaust leaves the cylinder into the exhaust manifold and then into the tail pipe.
Discharge of the burnt gases (exhaust) from the cylinder occurs to make room for the next cycle.
(Part 4 of the figure.)

  Figure 2-1 Actions of a four-stroke gasoline internal combustion engine (adapted from
                                   HowStuffWorks 2005)


       Figure 2-1 is a diagrammatic representation of the four strokes of an internal combustion
engine. The upper end of the cylinder consists of a clearance space in which ignition and
combustion occur. The expanding medium pushes against the piston head inside the cylinder,
causing the piston to move; this straight-line motion of the piston is converted into the desired
rotary motion of the wheels by means of a drivetrain consisting of a connecting rod and
crankshaft. Figure 2-1 illustrates that the only stroke that delivers useful work is the
expansion stroke; the other three strokes are thus termed idle strokes. The reader interested in
a detailed description of the internal combustion engine is referred to specialized texts, such
as Heywood (Heywood 1998) and Newton et al. (Newton, et al. 1996).

2.1.2 Comparison with the Gasoline Engine

       The diesel engine employs the compression ignition cycle. German engineer Rudolf Diesel
developed the idea for the diesel engine and received the patent on February 23, 1893. His
goal was to create an engine with high efficiency. Figure 2-2 is a diagrammatic representation
of the four strokes of a diesel engine. The main differences between the gasoline engine and the
diesel engine are:

     •   A gasoline engine compresses at a ratio of 8:1 to 12:1, while a diesel engine compresses
        at a ratio of 14:1 to as high as 25:1. The higher compression ratio of the diesel engine
        leads to higher peak combustion temperatures and better fuel efficiency.

     •   Unlike a gasoline engine, which takes in a mixture of gas and air, compresses it, and
        ignites the mixture with a spark, a diesel engine takes in just air, compresses it, and
        then injects fuel into the compressed air. The heat of the compressed air spontaneously
        ignites the fuel.

     •   Gasoline engines generally use either carburetion, in which the air and fuel are mixed
        long before the air enters the cylinder, or port fuel injection, in which the fuel is
        injected just prior to the intake stroke (outside the cylinder), while diesel engines use
        direct fuel injection: the diesel fuel is injected directly into the cylinder.
           Figure 2-2 Actions of a four-stroke diesel engine (HowStuffWorks 2005)
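
       To make the link between compression ratio and efficiency concrete, the sketch below
evaluates the textbook ideal air-standard Otto-cycle efficiency, 1 - r^(1 - gamma), at the ends
of the compression ratio ranges quoted above. This is a deliberately idealized assumption: real
engines have heat and friction losses, and diesel combustion is better described by other ideal
cycles, so the numbers indicate the trend only. The function name and the value gamma = 1.4
(air) are choices made for this example.

```python
def ideal_otto_efficiency(compression_ratio, gamma=1.4):
    """Ideal air-standard Otto-cycle thermal efficiency: 1 - r**(1 - gamma)."""
    if compression_ratio <= 1.0:
        raise ValueError("compression ratio must be greater than 1")
    return 1.0 - compression_ratio ** (1.0 - gamma)


# Compression ratios quoted in the text: 8:1 to 12:1 (gasoline), 14:1 to 25:1 (diesel).
ratios = [8.0, 12.0, 14.0, 25.0]
efficiencies = {r: ideal_otto_efficiency(r) for r in ratios}

for r in ratios:
    print(f"r = {r:4.0f}:1  ideal efficiency = {efficiencies[r]:.3f}")
```

       The ideal efficiency rises monotonically with compression ratio, which is consistent
with the fuel efficiency advantage attributed to the diesel engine in the first bullet.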

-------
                               2.2 Diesel Engine Emissions

       Like any other internal combustion engine, diesel engines convert the chemical energy
contained in diesel fuel into mechanical power. Diesel fuel is injected under pressure into the
engine cylinder, where it mixes with air and combustion occurs. Diesel fuel is heavier and oilier
than gasoline. Diesel fuel evaporates much more slowly than gasoline, with a boiling point that
is actually higher than that of water. The lean nature of the diesel-air mixture results in a
combustion environment that produces lower emission rates of carbon monoxide (CO) and
hydrocarbons (HC) compared to gasoline-powered engines. However, diesel engines do produce
relatively high levels of oxides of nitrogen (NOx) and particulate matter (PM) emissions,
especially fine particulate matter. This section discusses oxides of nitrogen and particulate
emissions in detail.

2.2.1 Oxides of Nitrogen and Ozone Formation

       Oxides of nitrogen, a mixture of nitric oxide (NO) and nitrogen dioxide (NO2), are
produced from the destruction of atmospheric nitrogen (N2) during the combustion process.
Atmospheric air generally consists of 80% N2 and 20% O2, and these gases are stable at
moderate temperatures and pressures. However, under the high temperature and pressure
conditions of combustion, excess oxygen in the combustion chamber reacts with N2 to create NO,
which is quickly transformed into NO2. The role of the nitrogen contained in the air in NO
formation was initially postulated by Zeldovich (Zeldovich, et al. 1947). In near-stoichiometric
or lean systems, the mechanisms associated with NO formation (as many as 30 or so independent
chemical reactions that also involve participation of hydrocarbon species) can generally be
simplified to the following:
              Reaction 1:    $\mathrm{O_2 \rightarrow O + O}$
              Reaction 2:    $\mathrm{O + N_2 \rightarrow NO + N}$
              Reaction 3:    $\mathrm{N + O_2 \rightarrow NO + O}$

       In near-stoichiometric and fuel-rich mixtures, where the concentration of OH radicals can
be high, the following reaction also takes place:

              Reaction 4:    $\mathrm{N + OH \rightarrow NO + H}$

       Reactions 1 through 4 together are known as the extended Zeldovich mechanism. It is
also important to note that emitted nitric oxide (NO) will oxidize to nitrogen dioxide (NO2)
in the atmosphere over a period of a few hours.

-------
       Oxides of nitrogen (NOx) are reactive gases that cause a host of environmental concerns
that adversely affect human health and welfare. Nitrogen dioxide (NO2), in particular, is a
brownish gas that has been linked with higher susceptibility to respiratory infection, increased
airway resistance in asthmatics, and decreased pulmonary function. Most importantly, NOx
emitted from heavy-duty vehicles plays a major role in the formation of ground-level ozone
pollution, which causes wide-ranging damage to human health and the environment (U.S. EPA
1995). Ozone is a colorless, highly reactive gas with a distinctive odor. In nature, ozone is
formed by electrical discharge (lightning) and in the upper atmosphere at altitudes between 15
and 35 km. Stratospheric ozone protects the Earth from harmful ultraviolet radiation from the
sun. Ground-level ozone, however, is formed by chemical reactions involving NOx and volatile
organic compounds (VOCs) in the presence of heat and sunlight. These two categories of
pollutants are also referred to as ozone precursors. The production of photochemical oxidants
usually occurs over several hours, which means that the highest concentrations of ozone
normally occur on summer afternoons in areas downwind of major sources of ozone precursors.
The simplified reaction processes are illustrated as:

              NO2 + VOC + sunlight (UV) => NO2 + O2 + sunlight (UV) => NO + O3

       At ground level, elevated ozone concentrations can cause health and environmental
problems. Ozone can affect the human cardiac and respiratory systems,  irritating the eyes,  nose,
throat, and lungs. Symptoms of ozone exposure include itchy  and watery eyes, sore throats,
swelling within the nasal passages and nasal congestion.  Effects from ozone are experienced
only for the period of exposure to elevated levels.  EPA promulgated 8-hour ozone standards in
1997 and designates an area as nonattainment if it has violated, or has contributed to violations
of, the national 8-hour ozone standard over a three-year period.

2.2.2 Fine Particulate Matter (PM2.5)

       Particulate matter (PM) is a complex mixture of solid and liquid particles (excluding
water) that are suspended in air. These particles typically consist of a mixture of inorganic and
organic chemicals, including carbon, sulfates, nitrates, metals, acids, and semivolatile compounds.
The size of PM in air ranges from approximately 0.005 to 100 micrometers (µm) in aerodynamic
diameter, from the size of just a few atoms to about the thickness of a human hair.  U.S. EPA defined
three general categories for PM: coarse (10 to 2.5 µm), fine (2.5 µm or smaller), and ultrafine
(0.1 µm or smaller).
       Heavy-duty diesel vehicles are known to emit large quantities of small particles (Kittelson,
et al. 1978). A majority of the PM found in diesel exhaust is in the nanometer size range.
Lloyd found that more than 90% of fine particles from heavy-duty vehicles are smaller than 1 µm
in diameter (Lloyd and Cackette 2001).
       Fine PM can not only cause human health problems and property damage, but also adversely
affect the environment by reducing visibility and retarding plant growth (Davis, et
al. 1998).  Health studies have shown a significant association between exposure to fine particles
and premature death from heart or lung diseases. Other important effects include aggravation of
respiratory and cardiovascular disease, lung disease, decreased lung function, and asthma attacks.
Individuals particularly sensitive to fine particle exposure include older adults, people with heart
and lung disease, and children (U.S. EPA 2005).  EPA promulgated the PM2.5 standard in 1997,
which included a 24-hour standard of 65 micrograms per cubic meter (µg/m3) and an
annual standard of 15 µg/m3.

                   2.3 Heavy-Duty Diesel Vehicle Emission Regulations
2.3.1 National Ambient Air Quality Standards

       The Clean Air Act, which was last amended in 1990, requires the U.S. EPA to set National
Ambient Air Quality Standards (NAAQS) to safeguard public health against six common
air pollutants: ozone (O3), particulate matter (PM), sulfur dioxide (SO2), carbon monoxide (CO),
nitrogen dioxide (NO2) and lead (Pb). The Clean Air Act established two types of national air
quality standards.  Primary standards set limits to protect public health, including the health of
"sensitive" populations such as asthmatics, children, and the elderly.  Secondary standards set
limits to protect public welfare, including protection against decreased visibility, damage to
animals, crops, vegetation, and buildings (CFR 2004a).  Table 2-1 illustrates the current NAAQS
for ambient concentrations of various pollutants.  Units of measure for the standards are parts per
million by volume (ppmv), milligrams per cubic meter of air (mg/m3), and micrograms per cubic
meter of air (µg/m3).

Table 2-1. National Ambient Air Quality Standards (U.S. EPA 2006)

Pollutant                Averaging Time            Standard Value             Standard Type
Carbon Monoxide (CO)     8-hour Average            9 ppmv (10 mg/m3)          Primary
                         1-hour Average            35 ppmv (40 mg/m3)         Primary
Nitrogen Dioxide (NO2)   Annual Arithmetic Mean    0.053 ppmv (100 µg/m3)     Primary & Secondary
Ozone (O3)               1-hour Average            0.12 ppmv (235 µg/m3)      Primary & Secondary
                         8-hour Average            0.08 ppmv (157 µg/m3)      Primary & Secondary
Lead (Pb)                Quarterly Average         1.5 µg/m3                  Primary & Secondary
Particulate (PM10)       Annual Arithmetic Mean    50 µg/m3                   Primary & Secondary
                         24-hour Average           150 µg/m3                  Primary & Secondary
Particulate (PM2.5)      Annual Arithmetic Mean    15 µg/m3                   Primary & Secondary
                         24-hour Average           65 µg/m3                   Primary & Secondary
Sulfur Dioxide (SO2)     Annual Arithmetic Mean    0.030 ppmv (80 µg/m3)      Primary
                         24-hour Average           0.14 ppmv (365 µg/m3)      Primary
                         3-hour Average            0.50 ppmv (1300 µg/m3)     Secondary

2.3.2 Heavy-Duty Engine Certification Standards

       Heavy-duty vehicles are defined as vehicles of GVWR (gross vehicle weight rating)
above 8,500 lbs in the federal jurisdiction and above 14,000 lbs in California (model year 1995
and later). Diesel engines used in heavy-duty vehicles are further divided into service classes by
GVWR, as follows:

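       •   Light heavy-duty diesel engines: 8,500 < LHDDE < 19,500
       •   Medium heavy-duty diesel engines: 19,500 < MHDDE < 33,000
       •   Heavy heavy-duty diesel engines: HHDDE > 33,000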

       Under the federal light-duty Tier 2 regulation (phased in beginning 2004), vehicles of
GVWR up to 10,000 lbs used for personal transportation have been re-classified as
"medium-duty passenger vehicles" (MDPV - primarily larger SUVs and passenger vans) and are subject to
the light-duty vehicle legislation. Thus, the same diesel engine model used for the 8,500-10,000
lbs vehicle category may be classified as either light- or heavy-duty and certified to different
standards, depending on the manufacturer-defined application (CFR 2004b). Except for the
heavy-duty vehicles classified as LDVs, all heavy-duty vehicle emissions standards are established
using the engine dynamometer certification process.
2.3.3 Heavy-Duty Engine Emission Regulations

       EPA regulates heavy-duty vehicle emissions for compliance with emissions standards
over the useful life of the engine. Useful life is defined as follows (U.S. EPA and California)
(CFR 2004c):
           LHDDE - 8 years/110,000 miles (whichever occurs first)
           MHDDE - 8 years/185,000 miles
           HHDDE - 8 years/290,000 miles
       Federal useful life requirements were later increased to 10 years, with no change to
the above mileage numbers, for the urban bus PM standard (1994+) and for the NOx standard
(1998+). The emission warranty period is 5 years/100,000 miles (5 years/100,000 miles/3,000
hours in California), but no less than the basic mechanical warranty for the engine family. Table
2-2 shows the heavy-duty engine emissions standards by model year group.


Table 2-2. Heavy-Duty Engine Emissions Standards (U.S. EPA 1997)
Year
HC (g/bhp-hr)
CO (g/bhp-hr)
N
-------
                                     CHAPTER 3
            3.  HEAVY-DUTY DIESEL VEHICLE EMISSIONS MODELING

       Several models are currently used to estimate emissions from heavy-duty vehicles. A
comprehensive review of the existing heavy-duty vehicle emission models will help modelers
understand the different approaches and how they can contribute to the development of enhanced
emission rate modeling techniques.
       The most common emission rate models are VMT-based or cycle-based models developed from
driving cycle data collected in laboratory test facilities. Fuel-based models estimate emissions
as a function of fuel usage rate as well as other parameters. In the 1990s, even the proposed
enhanced modal models, designed to predict emissions as a function of the speed and acceleration
profiles of vehicles, were still based upon statistical analysis of cycle-based data (Bachman 2000;
Fomunung 2000). More recent emission rate modeling frameworks propose to model modal emission
rates on a second-by-second basis directly from the vehicle operating mode.

                        3.1 VMT-Based Vehicle Emission Models

       The current emission rate models used by state and federal agencies include the Mobile
Source Emission Model (MOBILE) series of models developed by the U.S. Environmental
Protection Agency (U.S. EPA) and the Emission Factor Emission Inventory Model (EMFAC) series
developed by the California Air Resources Board (CARB).

3.1.1 MOBILE

       MOBILE (U.S. EPA 1993), developed by the U.S. EPA in the late 1970s to estimate
vehicle emissions, has since become the nation's standard for assessing the emission impacts of
various transportation inputs.  MOBILE uses the method of base emission rates and correction
factors.  The model has undergone significant expansion and improvement over the years.  The
latest version is MOBILE6, released in February 2002 (U.S. EPA 2002a).
       MOBILE is based on engine dynamometer test data from selected driving cycles.  The
Federal Test Procedure (FTP) transient cycle is composed of a unique profile of stops, starts,
constant speed cruises, accelerations and decelerations. Different driving cycles are developed
to simulate both urban and freeway driving. A concern with driving cycles is that they may not
be sufficiently representative of real-world emissions (Kelly and Groblicki 1993; Denis et al.
1994). For HDV emission rates, MOBILE uses the method of base emission rates and conversion
factors, which convert the g/bhp-hr emissions estimates observed in the laboratory to g/mile
emission rates, to be consistent with available travel information. These conversion factors
contribute a large source of uncertainty to the MOBILE model since the BSFC (brake specific
fuel consumption) data are aggregated for the fleet and may not represent in-use vehicle
characteristics (Guensler et al. 1991).  The accuracy of the conversion factors improved in
MOBILE6 due to improved data, but fundamental flaws remain (Guensler et al. 2006).
       3.1.1.1 Diesel Engine Test Cycles
       EPA currently uses the transient Federal Test Procedure (FTP) engine dynamometer
cycle, which includes both engine cold and warm start operations, for heavy-duty vehicles (CFR
Title 40, Part 86.1333).  Unlike the chassis dynamometer test for light-duty vehicles, the engine is
removed from the vehicle's chassis, mounted on an engine dynamometer test stand, and operated
over the transient FTP test cycle. The transient cycle (Figure 3-1) consists of four phases: the
first is a NYNF (New York Non Freeway) phase typical of light urban traffic with frequent stops
and starts, the second is a LANF (Los Angeles Non Freeway) phase typical of crowded urban
traffic with few stops, the third is a LAFY (Los Angeles Freeway) phase simulating crowded
expressway traffic in Los Angeles, and the fourth phase repeats the first NYNF phase.  This cycle
consists of a cold start after parking overnight, followed by idling, acceleration and deceleration
phases, and a wide variety of different speeds and loads sequenced to simulate the running of the
vehicle that corresponds to the engine being tested. There are few stabilized running conditions,
and the average load factor is about 20 to 25% of the maximum horsepower available at a given
speed.
       Emission and operation parameters are measured while the engine operates during the
test cycle.  The engine torque is determined by applying performance percentages to the engine
lug curve (maximum torque curve). Engine torque is then converted to engine brake horsepower
using engine speed in revolutions per minute (RPM).  Brake specific emissions rates are reported
in g/bhp-hr and then converted to g/mile using pre-defined conversion factors (CFR Title 40, Part
86.1342-90).
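
       The torque-to-horsepower step described above can be sketched as follows. This is a
minimal illustration, not the regulatory calculation: the relation bhp = torque (lb-ft) x RPM / 5252
is the standard unit identity, while the function names, the record layout, and the treatment of
negative (motoring) torque as zero brake work are assumptions made for this example.

```python
def brake_horsepower(torque_lbft, rpm):
    """Brake horsepower from torque (lb-ft) and engine speed (rev/min)."""
    return torque_lbft * rpm / 5252.0


def brake_specific_rate(records, dt_s=1.0):
    """Cycle-average emission rate in g/bhp-hr.

    records: iterable of (pct_torque, max_torque_lbft, rpm, grams_per_s),
    one tuple per time step of length dt_s seconds. max_torque_lbft is the
    lug-curve (maximum) torque at that engine speed.
    """
    work_bhp_hr = 0.0
    mass_g = 0.0
    for pct_torque, max_torque_lbft, rpm, grams_per_s in records:
        # Assumption: motoring (negative) torque contributes no brake work.
        torque = max(pct_torque, 0.0) / 100.0 * max_torque_lbft
        work_bhp_hr += brake_horsepower(torque, rpm) * dt_s / 3600.0
        mass_g += grams_per_s * dt_s
    if work_bhp_hr == 0.0:
        raise ValueError("no positive brake work in the supplied records")
    return mass_g / work_bhp_hr


# Hypothetical one-hour steady point: 50% of a 1000 lb-ft lug-curve torque
# at 1500 rpm while emitting 0.1 g/s.
steady_hour = [(50.0, 1000.0, 1500.0, 0.1)] * 3600
rate = brake_specific_rate(steady_hour)
print(f"{rate:.3f} g/bhp-hr")
```

The example values are invented for illustration and do not come from any test program.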
       [Figure 3-1 phase labels, in sequence: NYNF, LANF, LAFY, NYNF]
                     Figure 3-1 FTP Transient Cycle (DieselNet 2006)
       Because the engine dynamometer test procedure does not directly account for the impacts
of load and grade changes, a chassis dynamometer test procedure and a cycle known as the
HDV urban dynamometer driving schedule (HDV-UDDS), sometimes referred to as "cycle D",
were developed [CFR Title 40, Part 86, App. I].  This cycle is different from the UDDS cycle for
light-duty vehicles (FTP-72). The HDV cycle lasts 1060 seconds and covers 5.55 miles. The
average speed for the HDV UDDS is 18.86 mph while the maximum speed is 58 mph.  Figure 3-2
shows the speed profile for the chassis UDDS test.
       [Figure 3-2: speed profile plotted against time (s), axis ticks from 200 to 1000 s]
Figure 3-2 Urban Dynamometer Driving Schedule Cycle for Heavy-Duty Vehicle (DieselNet 2006)
       3.1.1.2 Baseline Emission Rates
       Baseline emission rates (g/bhp-hr) for heavy-duty vehicles are obtained from the engine
dynamometer test results collected during U.S. EPA's cooperative test program with engine
manufacturers. The zero mile levels and deterioration rates for NOx, CO, and HC are presented
in the following tables for heavy-duty gasoline and diesel engines.  All the emission rates are
available from "Update of Heavy-Duty Emission Levels (Model Years 1998-2004+) for Use in
MOBILE6" (Lindhjem and Jackson 1999).
Table 3-1. Heavy-Duty Vehicle NOx Emission Rates in MOBILE6

               Zero Mile Level (g/bhp-hr)             Deterioration (g/bhp-hr/10,000 miles)
Model Year     Gasoline       Diesel Engine           Gasoline       Diesel Engine
Class          Engine    Heavy    Med.    Light       Engine    Heavy    Med.     Light
1988-1989      4.96      6.28     6.43    4.34        0.044     0.01     0.009    0.002
1990           3.61      4.85     4.85    4.85        0.026     0.004    0.006    0.011
1991-1993      3.24      4.56     4.53    1.38        0.038     0.004    0.007    0.003
1994-1997      3.24      4.61     4.61    1.08        0.038     0.003    0.001    0.001
1998-2003      2.59      3.68     3.69    3.26        0.038     0.003    0.001    0.001
2004+          2.59      1.84     1.84    1.63        0.038     0.003    0.001    0.001

Table 3-2. Heavy-Duty Vehicle CO Emission Rates in MOBILE6

               Zero Mile Level (g/bhp-hr)             Deterioration (g/bhp-hr/10,000 miles)
Model Year     Gasoline       Diesel Engine           Gasoline       Diesel Engine
Class          Engine    Heavy    Med.    Light       Engine    Heavy    Med.     Light
1988-1989      13.84     1.34     1.70    1.21        0.246     0.008    0.018    0.022
1990           6.89      1.81     1.81    1.81        0.213     0.005    0.007    0.012
1991-1993      7.10      1.82     1.26    0.40        0.255     0.003    0.010    0.004
1994-1997      7.10      1.07     0.85    1.19        0.255     0.004    0.009    0.003
1998-2003      7.10      1.07     0.85    1.19        0.255     0.004    0.009    0.003
2004+          7.10      1.07     0.85    1.19        0.255     0.004    0.009    0.003

Table 3-3. Heavy-Duty Vehicle HC Emission Rates in MOBILE6

               Zero Mile Level (g/bhp-hr)             Deterioration (g/bhp-hr/10,000 miles)
Model Year     Gasoline       Diesel Engine           Gasoline       Diesel Engine
Class          Engine    Heavy    Med.    Light       Engine    Heavy    Med.     Light
1988-1989      0.62      0.47     0.66    0.64        0.023     0.001    0.002    0.002
1990           0.35      0.52     0.52    0.52        0.023     0.000    0.001    0.001
1991-1993      0.33      0.30     0.40    0.47        0.021     0.000    0.001    0.001
1994-1997      0.33      0.22     0.31    0.26        0.021     0.001    0.001    0.001
1998-2003      0.33      0.22     0.31    0.26        0.021     0.001    0.001    0.001
2004+          0.33      0.22     0.31    0.26        0.021     0.001    0.001    0.001
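
       As an illustration of how a zero mile level and a deterioration rate from the tables above
might be combined, the sketch below applies the deterioration linearly with accumulated mileage.
The linear, uncapped form and the function name are assumptions made for this example; MOBILE6
itself may cap or otherwise adjust deterioration. The inputs are the heavy diesel NOx values for
model years 1998-2003 from Table 3-1.

```python
def deteriorated_rate(zero_mile_level, deterioration_per_10k, odometer_miles):
    """Emission rate (g/bhp-hr) at a given odometer reading.

    zero_mile_level:        g/bhp-hr at zero miles
    deterioration_per_10k:  g/bhp-hr added per 10,000 miles
    Assumption: deterioration is linear in mileage and is not capped.
    """
    if odometer_miles < 0:
        raise ValueError("odometer_miles must be non-negative")
    return zero_mile_level + deterioration_per_10k * (odometer_miles / 10_000.0)


# Heavy diesel NOx, model years 1998-2003 (Table 3-1): 3.68 and 0.003.
nox_at_200k = deteriorated_rate(3.68, 0.003, 200_000)
print(f"{nox_at_200k:.2f} g/bhp-hr")
```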
       3.1.1.3 Conversion Factors
       Because emission standards for both gasoline and diesel heavy-duty vehicles are ex-
pressed in terms of grams per brake-horsepower hour (g/bhp-hr), the MOBILE6.2 model em-
ploys conversion factors of brake horsepower-hour per mile (bhp-hr/mile) to convert the emis-
sion certification data from engine testing to grams per mile.  Conversion factors are a function
of fuel  density, brake-specific fuel consumption (BSFC), and fuel economy for each HDV class
(U.S. EPA 2002b). The conversion factors were calculated using Equation 3-1:

       \text{Conversion Factor (bhp-hr/mi)} = \frac{\text{Fuel Density (lb/gal)}}{\text{BSFC (lb/bhp-hr)} \times \text{Fuel Economy (mi/gal)}}    (Equation 3-1)
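
       A short numerical sketch of Equation 3-1 follows. The function names and the input values
(fuel density, BSFC, and fuel economy) are hypothetical and chosen only to show the unit handling;
they are not MOBILE6 values.

```python
def conversion_factor(fuel_density_lb_per_gal, bsfc_lb_per_bhp_hr, fuel_economy_mi_per_gal):
    """Equation 3-1: conversion factor in bhp-hr/mi."""
    if bsfc_lb_per_bhp_hr <= 0 or fuel_economy_mi_per_gal <= 0:
        raise ValueError("BSFC and fuel economy must be positive")
    return fuel_density_lb_per_gal / (bsfc_lb_per_bhp_hr * fuel_economy_mi_per_gal)


def grams_per_mile(rate_g_per_bhp_hr, cf_bhp_hr_per_mi):
    """Convert a brake-specific rate (g/bhp-hr) to a travel-based rate (g/mi)."""
    return rate_g_per_bhp_hr * cf_bhp_hr_per_mi


# Hypothetical inputs: 7.1 lb/gal diesel, 0.4 lb/bhp-hr BSFC, 6.0 mi/gal.
cf = conversion_factor(7.1, 0.4, 6.0)
# 3.68 g/bhp-hr is the 1998-2003 heavy diesel NOx zero mile level from Table 3-1.
nox_g_per_mi = grams_per_mile(3.68, cf)
print(f"CF = {cf:.3f} bhp-hr/mi, NOx = {nox_g_per_mi:.2f} g/mi")
```

Because the factor is a ratio of fleet-aggregated quantities, any error in BSFC or fuel economy
passes through proportionally to the g/mi result.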

       To calculate BSFC, U.S. EPA first obtained data from model year 1987 through 1996
supplied by six engine manufacturers (U.S. EPA 2002d). U.S. EPA then performed regression
analysis for BSFCs by model year for each weight class and used a logarithmic curve to extrapolate
values prior to 1988 and after 1995, since sales data were only available for model years 1988
through 1995 (U.S. EPA 2002d).
       Fuel economy was calculated using a regression curve derived from the 1992 Truck
Inventory and Use Survey (TIUS) conducted by the U.S. Census Bureau. Fuel densities were
determined from National Institute for Petroleum and Energy Research  (NIPER) publications
for both gasoline and diesel (Browning 1998).  Using the equation defining the conversion factor
together with the data described above, weight class specific conversion factors were calculated
for gasoline and diesel vehicles for model years 1987 through 1996 (U.S. EPA 2002c).

3.1.2 EMFAC

       EMFAC (CARB 2007) was developed by CARB separately from MOBILE to reflect
the vehicle technologies in the California on-road fleet, which are subject to more stringent
standards, and the fuels used in California.  The latest version, EMFAC 2002, was released in
September 2002. EMFAC can estimate emissions for calendar years 1970 to 2040.
       Beginning with EMFAC 2000, EMFAC abandoned the use of conversion factors and used chassis
dynamometer data collected for 70 trucks tested over the Urban Dynamometer Driving Schedule
(UDDS). Although the use of UDDS test data marked a significant improvement, it is not clear
that the UDDS adequately represents the full range of heavy-duty diesel operation. Although
the cycle was constructed from actual truck activity data, it lacks extended cruises known to
cause many trucks to default to a high NOx-emitting, fuel-saving mode referred to as "Off-Cycle"
NOx. The cycle also lacks hard accelerations known to result in high emissions of particulate
matter (CARB 2002).
       CARB continues to develop additional test cycle modes designed to better depict the
emissions of HDDVs under real world conditions, including emissions from engines programmed
to go "off-cycle" at certain speeds. Activity data from instrumented truck studies conducted by
Battelle and Jack Faucett Associates for CARB (CARB 2002) have been used to develop a
four-mode heavy-heavy-duty diesel cycle. Figure 3-3 shows the four modes of the cycle developed
by CARB. The creep mode produced the greatest gram per mile results, followed by the transient
and the cruise modes. The transient and cruise modes produced higher and lower emissions,
respectively, than the HDDS (CARB 2002).
       [Figure 3-3 labels: Creep, Cruise, Idle, Transient]
                   Figure 3-3 CARB's Four Mode Cycles (CARB 2002)
3.1.3 Summary
       EPA's MOBILE series of models has improved significantly through successive model
revisions since the 1970s. However, the MOBILE series still has major modeling defects
for the heavy-duty components. These defects have been widely recognized for more than
10 years (Guensler et al. 1991). One of the most frequently stated defects is that vehicle
emission rates are characterized by fleet average speed, which aggregates other vehicle activity
factors and may therefore yield significant bias in emissions characterization.
       In developing emissions inventories using the MOBILE and EMFAC (CARB 2007)
emission rate models, vehicle activity is estimated using travel demand models.  The estimation
of VMT was based on EPA's fleet characterization study (U.S. EPA 1998).  It is common to
estimate heavy-duty travel as a fixed percentage of predicted traffic volumes (TRB 1995). This
practice is problematic because heavy-duty truck travel does not follow the same spatial and
temporal patterns as light-duty vehicle travel (Schlappi et al. 1993).

                         3.2 Fuel-Based Vehicle Emission Models

       The fuel-based emission inventory models for heavy-duty diesel trucks combine vehicle
activity data (i.e., volume of diesel fuel consumed) with emission rates normalized to fuel
consumption (i.e., mass of pollutant emitted per unit volume of fuel burned) to estimate emissions
within a region of interest (Dreher and Harley 1998). This approach was proposed to increase
the accuracy of truck VMT estimation by combining state-level truck VMT with statewide fuel sales
to estimate total heavy-duty truck activity, using the amount of fuel consumed as a measure of
activity.
       In California, fuel consumption data are available through tax records at the statewide
level and this statewide fuel consumption can be apportioned to provide emission estimates for
an individual air basin by month, day of week,  and time of day. At the same time, emission rates
are normalized to fuel consumption using Equation 3-2:
                            EI_P = \frac{S_P}{BSFC}                        (Equation 3-2)
        where  EI_P :   emission index for pollutant P, in units of mass of pollutant emitted
                         per unit mass of fuel burned;
                S_P :    brake specific pollutant emission rate obtained from the dynamometer
                           test, expressed in g/bhp-hr units;
                BSFC :  brake specific fuel consumption of the engine being tested, also in
                         g/bhp-hr.
       Exhaust emissions are estimated by multiplying vehicle activity, as measured by the
volume of fuel used, by emission rates which are normalized to fuel consumption and expressed as
grams of pollutant emitted per gallon of diesel fuel burned instead of grams of pollutant per mile
(Dreher and Harley 1998).  Average emission rates for subgroups of vehicles are weighted by
the fraction of total fuel used by each vehicle subgroup to obtain an overall fleet-average
emission rate. The fleet-average emission rate is multiplied by regional fuel sales to compute
pollutant emissions (Singer and Harley 1996).
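
       The fuel-based calculation described above can be sketched in a few lines: an emission
index from Equation 3-2, a fuel-weighted fleet-average rate, and a regional total. The function
names, the two vehicle subgroups, and every number below are hypothetical and serve only to show
the arithmetic.

```python
def emission_index(s_p_g_per_bhp_hr, bsfc_g_per_bhp_hr):
    """Equation 3-2: grams of pollutant per gram of fuel burned."""
    return s_p_g_per_bhp_hr / bsfc_g_per_bhp_hr


def fleet_average_rate(subgroups):
    """Fuel-weighted average emission rate (g/gal).

    subgroups: list of (fraction_of_total_fuel, rate_g_per_gal).
    """
    total_fraction = sum(fraction for fraction, _ in subgroups)
    if total_fraction <= 0:
        raise ValueError("fuel fractions must sum to a positive value")
    return sum(fraction * rate for fraction, rate in subgroups) / total_fraction


def regional_emissions_g(fleet_rate_g_per_gal, fuel_sales_gal):
    """Total emissions (g) = fleet-average rate x regional fuel sales."""
    return fleet_rate_g_per_gal * fuel_sales_gal


# Hypothetical engine: 4.0 g/bhp-hr NOx at a BSFC of 180 g/bhp-hr.
ei = emission_index(4.0, 180.0)
# Hypothetical subgroups: 75% of fuel at 70 g/gal, 25% of fuel at 90 g/gal.
fleet_rate = fleet_average_rate([(0.75, 70.0), (0.25, 90.0)])
total_g = regional_emissions_g(fleet_rate, 1_000_000)
print(f"EI = {ei:.4f} g/g, fleet rate = {fleet_rate:.1f} g/gal, total = {total_g:.0f} g")
```

Converting the emission index (g/g) to g/gal additionally requires a fuel density, which is left
out here to keep the sketch short.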
       The advantages of the fuel-based approach include the fact that fuel-use data are
available from tax records in California. Furthermore, emission rates normalized to fuel consumption
vary considerably less over the full range of driving conditions than travel-normalized emission
factors (Singer and Harley 1996). The disadvantage is that comparable tax records are not
available for other states, so input data are difficult to obtain outside of California, which limits
the scope of the modeling approach.  Furthermore, users must run two models in sequence, first to
predict fuel use and then to predict emission rates, which is not statistically efficient.

                            3.3 Modal Emission Rate Models

       Modal emission rate models work on the premise that emissions are better modeled as a
function of specific modes of vehicle operation (idle, steady-state cruise, various levels of
acceleration/deceleration, etc.) than as a function of average vehicle speed (Bachman 1998;
Ramamurthy et al. 1998; U.S. EPA 2001b). Emissions from heavy-duty vehicles powered by diesel
cycle engines are more likely than those from typical gasoline vehicles to be a function of the
brake work output of the engine, because instantaneous emissions levels of diesel engines are
highly correlated with the instantaneous work output of the engine (U.S. EPA 2001b).
       With the consideration of vehicle modal activity, EPA and various research communities
have been developing modal activity-based emission  models. The report published by National
Research Council (NRC 2000) comprehensively reviewed the modeling of mobile source emis-
sions and provided recommendations for the improvement of future mobile source emission
models. The following sections will introduce the most representative modal emission models
one by one.

3.3.1 CMEM

       The Comprehensive Modal Emissions Model (CMEM) (Barth et al. 2000) was developed
by the Center for Environmental Research and Technology at the University of California, Riverside
(UCR-CERT). Development of CMEM was first funded by a National Cooperative Highway Research
Program project (1995-2000) and has since been enhanced and improved with EPA funding
(2000-present).  Since 2001, CE-CERT has created a modal-based inventory at the micro-
(intersection), meso- (highway link), and macro- (region) scale levels for light-duty vehicles (LDV)
and heavy-duty diesel (HDD) vehicles. The CMEM model derives a fuel rate from road load and a
simple powertrain model.  Emissions rates are then derived empirically from the fuel rate. Fuel
rate, or fuel consumption per unit time, forms the basis for CMEM.
       The CMEM HDD emissions model (Barth et al. 2004) adopted the same approach as the
light-duty vehicle model. In that model, second-by-second tailpipe emissions are modeled as the
product of three components: fuel rate (FR), engine-out emission indices (grams of emissions/
gram of fuel), and an emission after-treatment pass fraction.  The model is composed of six
modules: 1) engine power demand; 2) engine speed; 3) fuel-rate; 4) engine control unit; 5) engine-out
emissions; and 6) after-treatment pass fraction. The vehicle power demand is determined based
on operating variables [second-by-second vehicle speed (from which acceleration can be derived;
note that acceleration can also be supplied as a separate input variable), grade, and accessory use
(such as air conditioning)] and specific vehicle parameters (vehicle mass, engine displacement,
cross-sectional area, aerodynamics, vehicle accessory load, transmission efficiency,
drive-train efficiency, and so on).  The core of the model is the fuel rate calculation, which is a
function of power demand and engine speed.  Engine speed is determined based on vehicle velocity,
gear shift schedule and power demand (Barth et al. 2004). The model uses a total of 35 parameters to
estimate vehicle tailpipe emissions.
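
       The three-component product described above can be sketched as follows. This is only an
illustration of the multiplication, not CMEM itself: the function names and the sample trace are
hypothetical, and the fuel rate, emission index, and pass fraction are taken as given inputs rather
than computed by the six modules.

```python
def tailpipe_rate_g_per_s(fuel_rate_g_per_s, engine_out_index_g_per_g, pass_fraction):
    """Tailpipe emission rate = fuel rate x engine-out index x pass fraction."""
    if not 0.0 <= pass_fraction <= 1.0:
        raise ValueError("pass_fraction must lie between 0 and 1")
    return fuel_rate_g_per_s * engine_out_index_g_per_g * pass_fraction


def trace_emissions_g(trace, dt_s=1.0):
    """Sum second-by-second tailpipe emissions over a trace.

    trace: iterable of (fuel_rate_g_per_s, engine_out_index_g_per_g, pass_fraction).
    """
    return sum(tailpipe_rate_g_per_s(fr, ei, pf) * dt_s for fr, ei, pf in trace)


# Hypothetical two-second trace with a 50% after-treatment pass fraction.
sample_trace = [(2.0, 0.03, 0.5), (4.0, 0.03, 0.5)]
total = trace_emissions_g(sample_trace)
print(f"{total:.3f} g")
```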

3.3.2 MEASURE

       The Mobile Emissions Assessment System for Urban and Regional Evaluation
(MEASURE) (Bachman et al. 2000) model was developed by the Georgia Institute of Technology in
the late 1990s.  The MEASURE model was developed within a geographic information system (GIS)
and employs modal emission rates, varying emissions according to vehicle technologies and
modal operation (cruise, acceleration, deceleration, idle).  The model emission rate database
consists of more than 13,000 laboratory tests conducted by the EPA and CARB using standard-
ized test cycle conditions and alternative cycles (Bachman 1998).  The aggregate modal model
within MEASURE employs emission rates based on theoretical engine-emissions relationships.
The relationships are dependent on both modal and vehicle technology variables, and they are
"aggregate" in the sense that they rely on bag data to derive their modal activities (Washington
et al. 1997a).  Emission rates were statistically derived  from the emission rate data as a function
of operating mode power demand surrogates.  The model uses statistical techniques to predict
emission rates using a process that utilizes the best aspects of hierarchical tree-based regression
(HTBR) and ordinary least squares regression (OLS) (Breiman et al. 1984). HTBR is used to
reduce the number of predictor variables to a manageable number, and to identify useful interac-
tions among the variables; then OLS regression techniques are applied until a satisfactory model
is obtained (Fomunung et al. 2000). Vehicle activity variables include average speed, accel-
eration rates, deceleration rates, idle time, and surrogates for power demand.  The MEASURE
model for light-duty vehicles was completed in 2000.
       Because it was developed on a GIS platform, MEASURE provides the following benefits
(Bachman et al. 2000): 1) manages topographical parameters that affect emissions;
2) calculates emissions from vehicle modal activities; 3) allows a 'layered' approach to
individual vehicle activity estimation; and 4) aggregates emission estimates into grid cells for use in
photochemical air quality models.

3.3.3 MOVES

       To keep pace with new analysis needs, modeling approaches, and data, the U.S. EPA's
Office of Transportation and Air Quality (OTAQ) is developing a modeling system termed
MOVES (Koupal et al. 2004, U.S. EPA 2001a). This new system will estimate emissions for
on-road and non-road sources, cover a broad range of pollutants, and allow multiple scale analysis,
from fine-scale analysis to national inventory estimation. In the future, MOVES will serve as
the replacement for MOBILE6 and NONROAD (U.S. EPA 2001a). This project was previously
known as the New Generation Mobile Source Emissions Model (NGM) (U.S. EPA 2001a).
       The current plan for MOVES is to use vehicle specific power (VSP) as a variable on
which emission rates can be based (Koupal et al. 2002).  The VSP approach to emissions
characterization was developed by Jimenez-Palacios (Jimenez-Palacios 1999). VSP is a function of
speed, acceleration, road grade, etc., as shown in Equation 3-3:
   VSP = v \times [a \times (1 + \varepsilon) + g \times grade + g \times C_R] + \frac{0.5 \rho C_D A v^3}{m}   (Equation 3-3)

          where:   v:             vehicle speed (assuming no headwind) (m/s)
                   a:             vehicle acceleration (m/s2)
                   \varepsilon:   mass factor accounting for the rotational masses (~0.1) - constant
                   g:             acceleration due to gravity (m/s2)
                   grade:         road grade (ratio of rise to run)
                   C_R:           rolling resistance (~0.0135)
                   \rho:          air density (1.2)
                   C_D:           aerodynamic drag coefficient (dimensionless)
                   A:             the frontal area (m2)
                   m:             vehicle mass (metric tons)
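
       Equation 3-3 can be evaluated directly, as in the sketch below. Several choices here are
assumptions for illustration: g is taken as 9.81 m/s2, air density as 1.2 kg/m3, and vehicle mass
is entered in kilograms so that every term comes out in W/kg, which is numerically equal to
kW per metric ton. The vehicle parameters in the example are hypothetical and do not describe
any tested bus.

```python
G = 9.81  # m/s2, assumed value for the acceleration due to gravity


def vsp_kw_per_tonne(v, a, grade, mass_kg, c_d, frontal_area_m2,
                     eps=0.1, c_r=0.0135, rho=1.2):
    """Vehicle specific power following Equation 3-3.

    v in m/s, a in m/s2, grade as rise over run. Mass is taken in kg
    (an assumption of this sketch) so the result is W/kg = kW/metric ton.
    """
    if mass_kg <= 0:
        raise ValueError("mass_kg must be positive")
    inertia_grade_rolling = v * (a * (1.0 + eps) + G * grade + G * c_r)
    aerodynamic = 0.5 * rho * c_d * frontal_area_m2 * v ** 3 / mass_kg
    return inertia_grade_rolling + aerodynamic


# Hypothetical vehicle: 15,000 kg, C_D = 0.6, 7.0 m2 frontal area,
# accelerating at 0.5 m/s2 from 10 m/s on level road.
vsp = vsp_kw_per_tonne(v=10.0, a=0.5, grade=0.0, mass_kg=15_000.0,
                       c_d=0.6, frontal_area_m2=7.0)
print(f"VSP = {vsp:.3f} kW/metric ton")
```

At zero speed every term vanishes, so idle operation always has a VSP of zero under this
formulation.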

       The basic concept of MOVES starts with the characterization of vehicle activity and the
development of relationships between characterized vehicle activity and energy consumption,
and between energy consumption and vehicle emission (Nam 2003). The U.S. EPA established a
modal binning approach, developed using VSP, to characterize the relationship between vehicle
activity and energy consumption. Originally, a total of 14 modal bins were developed based on
different VSP ranges (U.S. EPA 2001a).  This approach was revised in two different ways. U.S.
EPA refined the VSP binning approach by associating second-by-second speed, engine
rpm, and acceleration rates; the original 14 VSP bins were combined with
five different speed operating modes and redefined as a total of 37 VSP bins
(Koupal et al. 2004).  Researchers at North Carolina State University (NCSU) divided each bin
into four strata representing two engine sizes and two odometer reading categories; this
approach was referred to as the "56-bin" approach (U.S. EPA 2002b).
       Another important conceptual model for MOVES was developed by NCSU in 2002 (Frey
et al. 2002). Dr. Frey summarized the conceptual analytical methodology in the report "Recom-
mended Strategy for On-Board Emission Data Analysis and Collection for the New Generation
Model" (Frey et al. 2002). This method uses a power demand estimate (P) as the variable on which
emission rates are based (Frey et al. 2002), as shown in Equation 3-4.
                            $P = v \times a$                     (Equation 3-4)

            where:  P : power demand (mph^2/sec)
                    v : vehicle speed (mph)
                    a : vehicle acceleration (mph/s)
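
As a minimal sketch of Equation 3-4, the surrogate power demand can be computed from a second-by-second speed trace. The function name and the backward-difference estimate of acceleration are assumptions made for illustration; they are not details taken from Frey et al. (2002).

```python
def power_demand(speeds_mph):
    """Return P = v * a (mph^2/sec) for each sample of a 1 Hz speed trace.

    Acceleration is approximated by a backward difference (an assumption);
    the first sample has no predecessor, so its acceleration is taken as 0.
    """
    result = []
    for i, v in enumerate(speeds_mph):
        # At 1 Hz the speed difference between samples is already in mph/s.
        a = 0.0 if i == 0 else speeds_mph[i] - speeds_mph[i - 1]
        result.append(v * a)
    return result


# Illustrative trace only (not data from the report).
trace = [0.0, 2.0, 5.0, 5.0, 3.0]
print(power_demand(trace))
```

Negative values correspond to deceleration, which is why a modal model would typically treat them separately from positive power demand.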
       This method uses on-board emissions data where data are collected under real-world
conditions to develop a modal emission model which can estimate emissions at different scales
such as microscale, mesoscale, and macroscale. The philosophy is similar to MEASURE (Fomu-
nung 2000), which first segregated the data into four modes based on suitable modal definitions,
then developed an OLS regression model for each mode using explanatory variables selected by
HTBR techniques. These explanatory variables include model year, humidity, temperature, alti-
tude, grade, pressure, and power. Second and third powers of speed and acceleration were also
included in the regression analysis.

3.3.4 HDDV-MEM

       Researchers at the Georgia Institute of Technology have developed a beta version of
HDDV-MEM, which is based on vehicle technology groups, engine emission characteristics, and
vehicle modal activity (Guensler et al. 2005). HDDV-MEM first predicts second-by-second
engine power demand as a function of on-road vehicle operating conditions and then applies
brake-specific emission rates to these activity predictions. HDDV-MEM consists of three
modules: a vehicle activity module (with vehicle activity tracked by vehicle technology group),
an engine power module, and an emission rate module. The model framework is illustrated in
Figure 3-4.
[Figure: model framework diagram. Legible block labels include Operating Environment,
Freight/Passenger Loads, Onroad Operations, and Engine Power Functions.]
 Figure 3-4 A Framework of Heavy-Duty Diesel Vehicle Modal Emission Model (Guensler et al. 2005)

3.3.4.1 Model Development Approaches

       The HDDV-MEM modeling framework is designed for implementation on transportation
infrastructure on a link-by-link basis. While the modeling routines are amenable to implementation
on a vehicle-by-vehicle basis, the large number of vehicles operating on infrastructure links
precludes practical application of the model in this manner. The model framework therefore
capitalizes upon experience gained in developing the MEASURE modeling framework, in which
vehicle technology groups were employed. A new heavy-duty vehicle visual classification scheme,
an EPA and Federal Highway Administration (FHWA) hybrid scheme developed by Yoon et al.
(2004b), classifies vehicle technology groups by engine horsepower rating, vehicle GVWR, vehicle
configuration, and vehicle travel characteristics (Yoon 2005c). Whereas the MEASURE model
employs load surrogates to implement a light-duty modal modeling regime, this new modeling
framework directly predicts heavy-duty vehicle operating loads and uses these load predictions
in the emission prediction process. An engine power module is designed for this task.
       Emission rates are first established for various heavy-duty technology groups (engine
and vehicle family, displacement, certification group, drivetrain, fuel delivery system, emission
control system, etc.) based upon statistical analysis of standard engine dynamometer certifica-
tion data, or on-road emission rate data when available (Wolf et al. 1998; Fomunung et al. 2000).
The following subsections discuss the three main modules of HDDV-MEM.
3.3.4.2 Vehicle Activity Module
       The vehicle activity module provides hourly vehicle volumes for each vehicle technol-
ogy group on each transportation link in the modeled transportation system. The annual average
daily traffic (AADT) estimate for each road link is processed to yield vehicle-hours of operation
per hour for each technology group (using truck percentages, VMT fraction by vehicle technol-
ogy group, diesel fraction, hourly volume apportionment of daily travel, link length, and average
vehicle speed) (Guensler et al. 2005; Yoon 2005c), as shown in Equation 3-5.

         $VA_{v,h,s,f} = (AADT_s \times (NL_s / TNL_s) \times HVF_{v,h} \times VF_v \times DF_v) \times (SL_s / AS_v)$   (Equation 3-5)

        where:   VA:        the estimated vehicle activity (veh-hr/hr)
                 v:         the vehicle technology group
                 h:         the hour of day
                 s:         the transportation link
                 f:         the facility type for the link
                 AADT_s:    the annual average daily traffic for the link (number of vehicles)
                 NL_s:      the number of lanes in the specific link direction
                 TNL_s:     the total number of lanes on the link
                 HVF_{v,h}: the hourly vehicle fraction
                 VF_v:      the VMT fraction for each vehicle technology group
                 DF_v:      the diesel vehicle fraction for each technology group
                 SL_s:      the link length (miles)
                 AS_v:      the link average speed of the technology group (mph)
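
The vehicle activity calculation in Equation 3-5 can be sketched as follows. This is a minimal illustration that reads the equation as an hourly diesel vehicle volume multiplied by the link travel time; the function and parameter names, and the input values, are hypothetical and do not come from HDDV-MEM itself.

```python
def vehicle_activity(aadt, lanes_dir, lanes_total, hourly_fraction,
                     vmt_fraction, diesel_fraction, link_length_mi, avg_speed_mph):
    """Vehicle-hours of operation per hour for one technology group on one link."""
    # Hourly volume of this technology group in the link direction (vehicles/hour).
    hourly_volume = (aadt * (lanes_dir / lanes_total) * hourly_fraction
                     * vmt_fraction * diesel_fraction)
    # Time each vehicle spends on the link (hours).
    travel_time_hr = link_length_mi / avg_speed_mph
    return hourly_volume * travel_time_hr


# Illustrative inputs only: 40,000 AADT, 2 of 4 lanes, 8% of daily travel in
# this hour, 5% VMT share, 90% diesel, 1.5-mile link at 45 mph.
print(vehicle_activity(40000, 2, 4, 0.08, 0.05, 0.9, 1.5, 45.0))
```

Splitting the product into a volume term and a travel-time term makes the units explicit: vehicles per hour times hours per vehicle gives vehicle-hours per hour.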
       To estimate on-road running emissions from each link, two sets of calculations are
performed. First, on-road vehicle activity (vehicle-hr) for each hour is multiplied by the engine
power demand for observed link operations (positive tractive power demand plus auxiliary power
demand) and then by baseline emission rates (g/bhp-hr). These calculations are processed
separately for each speed/acceleration matrix cell (Yoon et al. 2005b). Second, emissions from
motoring/idling activity are calculated by determining the vehicle-hours of motoring/idling
activity on each link for each hour and multiplying them by the baseline idle emission rate (g/hr).
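
The two sets of calculations described above can be sketched as a single function. This is an illustration under stated assumptions: the data layout (a list of per-cell vehicle-hours and power demands) and all names and values are hypothetical.

```python
def link_hourly_emissions_g(cells, running_rate_g_per_bhp_hr,
                            idle_vehicle_hours, idle_rate_g_per_hr):
    """Grams emitted on one link in one hour by one technology group.

    cells: iterable of (vehicle_hours, power_bhp) pairs, one pair per
           speed/acceleration matrix cell with positive power demand.
    """
    # Running emissions: veh-hr x bhp x g/bhp-hr = g, summed over cells.
    running = sum(veh_hr * bhp * running_rate_g_per_bhp_hr
                  for veh_hr, bhp in cells)
    # Motoring/idling emissions: veh-hr x g/hr = g.
    idle = idle_vehicle_hours * idle_rate_g_per_hr
    return running + idle


# Illustrative inputs only (not values from the report).
cells = [(1.0, 100.0), (0.5, 200.0)]
print(link_hourly_emissions_g(cells, 4.0, 0.25, 50.0))
```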
3.3.4.3 Engine Power Module

       Internal combustion engines translate linear piston work (force through a distance) to a
crankshaft, rotating the crankshaft and creating engine output torque (work performed in angular
rotation). The crankshaft rotation speed (engine speed in revolutions per minute) is a function
of engine combustion and physical design parameters (mean effective cylinder pressure, stroke
length, connecting rod angle, etc.). The torque available at the crankshaft (engine output shaft)
is less than the torque generated by the pistons, in that there are torque losses inside the engine
associated with operating a variety of internal engine components. Torque is transferred from the
engine output shaft to the driveshaft via the transmission (sometimes through a torque-converter,
i.e.,  fluid coupling) and through a series of gears that allows the  drive shaft to rotate at differ-
ent speeds relative to engine crankshaft speed. The drive shaft rotation is then transferred to the
drive axle via the rear differential. The ring and pinion gears in the rear differential translate the
rotation of the drive shaft by 90 degrees from the drive  shaft running along the vehicle to the
drive axle that runs across the vehicle. Torque available at the drive axle is now delivered direct-
ly to the drive wheels. This process generates the tractive force used to overcome road friction,
wind resistance, road grade (gravity), and other resistive forces, allowing the vehicle to acceler-
ate on the roadway. Figure 3-5 illustrates the primary components of concern.
[Figure: drivetrain diagram with labeled components: engine, bellhousing (clutch or torque
converter), transmission, driveshaft, differential, and axle shafts]
                 Figure 3-5 Primary Elements in the Drivetrain (Gillespie 1992)
       The vehicle drivetrain (engine, torque converter, transmission, drive shaft, rear differen-
tial, axles, and wheels) is designed as a system to convert engine torque into useful tractive force
at the wheel-to-pavement interface. When the tractive force is greater than the sum of forces
acting against the vehicle, the vehicle accelerates in the direction of travel. Given that on-road
speed/acceleration patterns for HDDVs can be observed (or empirically modeled), the modal
modeling approach works backwards from observed speed and acceleration to estimate the trac-
tive force (and power) that was available at the wheels to meet the observed conditions.  Then,
working backwards from tractive force, the model accounts for additional power losses that
occurred between the engine and the wheels to predict the total brake-horsepower output of the
engine.  Force components that reduce available wheel torque and tractive force include:

      •   Aerodynamic drag, which depends on the frontal area, the drag coefficient, and the
          square of the vehicle speed;

      •   Tire rolling resistance, which is determined by the coefficient of rolling resistance,
          vehicle mass, and road grade (where the coefficient of rolling resistance is a function
          of tire construction and size; tire  pressure; axle geometry, i.e., caster and camber; and
          whether the wheels are driven or towed);

      •   Grade load, which is determined by the roadway grade and vehicle mass; and

      •   Inertial load, which is determined by the vehicle's mass and acceleration.
The tractive force required at the interface between the tires and the road to overcome these
resistive forces and provide vehicle acceleration is given by Equation 3-6 (Gillespie 1992):

             $F_T = F_D + F_R + F_W + F_I + ma$   (Equation 3-6)

        where:  FT:  the tractive force available at the wheels (lbf)
                FD:  the force necessary to overcome aerodynamic drag (lbf)
                FR:  the force required to overcome tire rolling resistance (lbf)
                FW:  the force required to overcome gravitational force (lbf)
                FI:  the force required to overcome inertial loss (lbf)
                m:   the vehicle mass (lbm)
                a:   the vehicle acceleration (ft/sec2)

      Load prediction models could employ a wide variety of aerodynamic drag (Wolf-Hein-
rich 1998) and rolling resistance functional  forms, some of which may be more appropriate for
certain vehicle designs and at certain vehicle speeds.  Note that vehicle mass is a critical param-
eter that must be included in the load-based modeling approach.  Therefore, estimates of gross
vehicle weight must be included in any transit (vehicle weight plus passenger loading) or heavy-
duty truck (vehicle weight plus cargo payload) application. The following subsections describe
each force in Equation 3-6, taken from Yoon et al. (Yoon et al. 2005a).

Aerodynamic Drag Force

       As a vehicle moves forward through the atmosphere, drag forces are created at the in-
terface of the front of the vehicle and by the vacuum generated at the tail of the vehicle.  The
flow of the air around the vehicle creates a very complex set of forces providing both resistance
to forward motion and vehicle lift. The net aerodynamic drag force is a function of air density,
aerodynamic drag coefficient, vehicle frontal area, and effective vehicle velocity, as shown in
Equation 3-7 (Yoon et al. 2005a).
                     $F_D = \frac{\rho}{2g} \times C_d \times A_f \times V^2$        (Equation 3-7)
           where:  FD:   aerodynamic drag force
                   ρ :   the air density (lb/ft3)
                   g :   the acceleration of gravity (32.2 ft/sec2)
                   Cd :  the aerodynamic drag coefficient
                   Af:   the vehicle frontal area (ft2)
                   V :   the effective vehicle velocity (ft/sec)
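
The aerodynamic drag term can be sketched as below. The sketch assumes that Equation 3-7 has the form F_D = (ρ / 2g) × Cd × Af × V², which is consistent with the variables defined above (air density in lb/ft3 divided by g to obtain a mass density). The function name and the input values are hypothetical.

```python
G_FT_S2 = 32.2  # acceleration of gravity used in the text, ft/sec^2


def aerodynamic_drag_lbf(air_density_lb_ft3, drag_coeff,
                         frontal_area_ft2, velocity_ft_s):
    """Aerodynamic drag force in lbf, assuming F_D = (rho / 2g) * Cd * Af * V^2."""
    return ((air_density_lb_ft3 / (2.0 * G_FT_S2))
            * drag_coeff * frontal_area_ft2 * velocity_ft_s ** 2)


# Illustrative inputs only (not values from the report).
force = aerodynamic_drag_lbf(0.0765, 0.6, 80.0, 60.0)
print(round(force, 1))
```

Because the force grows with the square of velocity, doubling the speed quadruples the drag, which is why this term matters most at highway speeds.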

Rolling Resistance Force (FR)

       Rolling resistance force is the sum of the forces required to overcome the combined fric-
tion resistance at the tires. Tires deform at their contact point with the ground as they roll along
the roadway surface. Rolling resistance is  caused by contact friction, the tires' resistance to
deformation, aerodynamic drag at the tire, etc. The force required to overcome rolling resistance
can be expressed with rolling resistance coefficient, vehicle weight, and road grade, as shown in
Equation 3-8 (Yoon et al. 2005a).
                       $F_R = C_r \times m \times g \times \cos(\theta)$             (Equation 3-8)
        where:    FR:  force required to overcome rolling resistance
                  Cr:  the rolling resistance coefficient
                  θ :  the road grade (degrees)
                  m:   vehicle mass in metric tons
                  g:   acceleration due to gravity
Gravitational Weight Force (Fw)

       The gravitational force components account for the effect of gravity on vehicle weight
when the vehicle is operating on a grade. The grade angle is positive on uphill grades (generating
a positive resistance) and negative on downgrades (creating a negative resistance), as shown
in Equation 3-9 (Yoon et al. 2005a).
                            $F_W = m \times g \times \sin(\theta)$           (Equation 3-9)
                 where:    FW:    gravitational weight force
                           m:     vehicle mass in metric tons
                           g:     acceleration due to gravity
                           θ :    the road grade (degrees)
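
Equations 3-8 and 3-9 can be sketched together, since both resolve the vehicle weight along the grade angle. The sketch leaves the unit system to the caller: mass and g must be supplied in mutually consistent units (SI is used in the example, which is an assumption; the text mixes metric tons and lbf). Function names and input values are hypothetical.

```python
import math


def rolling_resistance(c_r, mass, g, grade_deg):
    """F_R = Cr * m * g * cos(theta); theta is the grade angle in degrees."""
    return c_r * mass * g * math.cos(math.radians(grade_deg))


def grade_force(mass, g, grade_deg):
    """F_W = m * g * sin(theta); positive uphill, negative downhill."""
    return mass * g * math.sin(math.radians(grade_deg))


# Illustrative inputs only: a 12,000 kg bus on a 2-degree upgrade (SI units,
# so both results are in newtons).
print(rolling_resistance(0.0135, 12000.0, 9.81, 2.0))
print(grade_force(12000.0, 9.81, 2.0))
```

On level ground the grade force vanishes and the rolling resistance reduces to the coefficient times the vehicle weight.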
Drivetrain Inertial Loss (FI)

       The engine, transmission, drive shaft, axles and wheels are all in rotation. The rotational
speed of each component depends upon the transmission gear ratio, the final drive ratio, and the
location of the component in the drive train (i.e., the total gear ratio between each component
and the wheels).  The rotational moment of inertia of the various drivetrain components consti-
tutes a resistance to change in motion.  The torque delivered by each rotating component to the
next component in the power chain (engine to clutch/torque converter, clutch/torque converter
to transmission, transmission to drive shaft, drive shaft to axle, axle to wheel) is reduced by the
amount necessary to increase angular rotation of the spinning mass during vehicle acceleration.
Given the torque loss at each component, the reduction in motive force available at the wheels
due to inertial losses along the drivetrain can be modeled (Wolf-Heinrich  1998).  This model
term is most significant under low speed acceleration conditions, such as vehicle operation in
truck and rail yards where vehicles are lugging heavy loads over short distances.  However, as
will be discussed later, significant new data will be required to incorporate the inertial loss effects
into modal models. The inertial loss is expressed in Equation 3-10 (Yoon et al. 2005a).

        $F_I = \frac{a \times I_{EFF}}{r^2} = \frac{a \times [I_W + (G_d^2 \times I_D) + (G_t^2 \times G_d^2) \times (I_T + I_E)]}{r^2}$   (Equation 3-10)

        where:  a :      the acceleration in the direction of vehicle motion (ft/sec2)
                IEFF :   the effective moment of inertia (ft-lbf-sec2)
                IW :     the rotational moment of inertia of the wheels and axles (ft-lbf-sec2)
                ID :     the rotational moment of inertia of the drive shaft (ft-lbf-sec2)
                IT :     the rotational moment of inertia of the transmission (ft-lbf-sec2)
                IE :     the rotational moment of inertia of the engine (ft-lbf-sec2)
                Gt :     the gear ratio at the engine transmission
                Gd :     the gear ratio in the differential
                r  :     wheel radius (ft)
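
The inertial loss term can be sketched as follows. The sketch assumes the reading of Equation 3-10 in which each component inertia is referred to the wheels by the square of the gear ratio between that component and the wheels: the drive shaft by Gd², and the transmission and engine by Gt² × Gd². Function names and input values are hypothetical.

```python
def effective_inertia(i_w, i_d, i_t, i_e, g_t, g_d):
    """Effective moment of inertia referred to the wheels (assumed form)."""
    return i_w + (g_d ** 2) * i_d + (g_t ** 2) * (g_d ** 2) * (i_t + i_e)


def inertial_force(accel, i_eff, wheel_radius):
    """F_I = a * I_EFF / r^2."""
    return accel * i_eff / wheel_radius ** 2


# Illustrative inputs only (not values from the report).
i_eff = effective_inertia(i_w=10.0, i_d=1.0, i_t=0.5, i_e=2.0, g_t=3.0, g_d=4.0)
print(i_eff, inertial_force(2.0, i_eff, 2.0))
```

The squared gear ratios explain why this term is largest in low gears: a high overall ratio multiplies the engine and transmission inertia seen at the wheels.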
Power Demand
       Using the equations outlined above, the total engine power demand, which is the combi-
nation of tractive power and auxiliary power demands, can be expressed in Equation 3-11 (Yoon
et al. 2005a):
                      $P = [(\frac{V}{550}) \times (F_D + F_R + F_W + F_I + ma)] + AP$  (Equation 3-11)

        where:  P:      total engine power demand
                V :     the vehicle speed (ft/s)
                FD:     the force necessary to overcome aerodynamic drag (lbf)
                FR:     the force required to overcome tire rolling resistance (lbf)
                FW:     the force required to overcome gravitational force (lbf)
                FI:     the force required to overcome inertial loss (lbf)
                m:      the vehicle mass (lbm)
                a:      the vehicle acceleration (ft/sec2)
                AP :    the auxiliary power demand (bhp)
                550 :   the conversion factor to bhp
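
A minimal sketch of Equation 3-11 follows. One unit detail is an assumption of this sketch rather than something stated in the text: with mass in lbm and acceleration in ft/sec2, the m × a term is divided by 32.174 so that it is expressed in lbf like the other forces. The function name and input values are hypothetical.

```python
G_C = 32.174  # lbm-ft/(lbf-sec^2); converts lbm x ft/sec^2 to lbf (assumed handling)


def total_engine_power_bhp(v_ft_s, f_drag, f_roll, f_grade, f_inertial,
                           mass_lbm, accel_ft_s2, aux_bhp):
    """Total engine power demand: tractive power (bhp) plus auxiliary power (bhp)."""
    ma_lbf = mass_lbm * accel_ft_s2 / G_C
    tractive_lbf = f_drag + f_roll + f_grade + f_inertial + ma_lbf
    # 550 ft-lbf/sec per horsepower converts force x speed to bhp.
    return (v_ft_s / 550.0) * tractive_lbf + aux_bhp


# Illustrative inputs only (not values from the report).
print(total_engine_power_bhp(55.0, 200.0, 300.0, 0.0, 0.0, 32174.0, 1.0, 10.0))
```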

3.3.4.4 Emission Rate Module

       The emission rate module provides work-related emission rates (g/bhp-hr) and idle emis-
sion rates (g/hr) for each technology group.  The basic application of the HDDV-MEM incorpo-
rates a simple emission rate modeling approach.  The predicted engine power demand (bhp) for
each second of vehicle operation is multiplied by emission rates in gram/bhp-sec for a given bhp
load.  Technology groups (i.e., vehicles that perform similarly on the certification tests) are estab-
lished based upon the engine and control system characteristics and each technology group is as-
signed a constant g/bhp-sec emission rate based upon regression tree and other statistical analysis
of certification data. Under the assumption that the testing cycles represent the typical modal
activities of on-road vehicles, these emission rates are applied to on-road activity data.
Given the large repository  of certification data, detailed statistical analysis of the certification
test results can be used to obtain applicable emission rates for these statistically derived vehicle
technology groups.  The data required for analysis must come from chassis dynamometer (the
engine remains in the vehicle and the vehicle is tested on a heavy-duty treadmill) and on-road
test programs in which second-by-second grams/second emission rate data have been collected
concurrently with axle-hp loads.
       At present, HDDV-MEM accepts EPA's baseline running emission rate data as
work-related emission rates and EMFAC2002 idling emission rate test data as idle emission
rates. Diesel vehicle registration fractions and annual mileage accumulation rates are employed
to develop calendar year emission rates for each technology group. In the future, as more
refined testing data become available, a constant emission rate need not be used: linear,
polynomial, or generalized relationships can be established between gram/second emission rate
and tractive horsepower (axle horsepower) and other variables. Sufficient testing data are
required to establish statistically significant samples for each technology group.

3.3.4.5 Emission Outputs

       HDDV-MEM outputs link-specific emissions in grams per hour (g/hr) for VOCs, CO,
NOx, and PM for each vehicle type. Toxic air contaminant emission rates (benzene, 1,3-butadiene,
formaldehyde, acetaldehyde, and acrolein) are also estimated in grams per hour for each vehicle
type using the MOBILE6.2-modeled ratios of air toxics to VOC for each calendar year. HDDV-MEM
provides not only hourly emissions but also aggregated total daily emissions (in accordance
with input command options). The output files, which provide link-specific hourly emissions,
can be joined directly with roadway network features in a GIS environment for interactive air
quality analysis at various spatial scales, i.e., national, regional, and local (Guensler et al.
2005; Yoon 2005c).
                                     CHAPTER 4
   4. EMISSION DATASET DESCRIPTION AND POST-PROCESSING PROCEDURE
       Using second-by-second data collected from on-road vehicles (Brown et al. 2001, Ens-
field 2002), the research effort reported here developed models to predict emission rates as a
function of on-road operating conditions that affect vehicle emissions. Such models should be
robust and ensure that assumptions about the underlying distribution of the data are verified
and that assumptions associated with applicable statistical methods are not violated. Due to
the general lack of data available for development of heavy-duty vehicle modal emission rate
models, this study focuses on development of an analytical methodology that is repeatable with
different datasets collected across space and time. Two second-by-second datasets are available
in which emission rates and the applicable load and vehicle activity data were collected in
parallel (Brown et al. 2001, Ensfield 2002). The first is a transit bus dataset collected on
diesel transit buses operated by the Ann Arbor Transit Authority (AATA) in 2001 (Ensfield 2002);
the second is a heavy HDV (HDV8B) dataset prepared by the National Risk Management
Research Laboratory (NRMRL) in 2001 (Brown et al. 2001). Each is summarized in the following
sections.

                                4.1 Transit Bus Dataset

       The transit bus emissions dataset was prepared by Sensors, Inc. (Ensfield 2002). Sensors,
Inc. has supplied gas analyzers and portable emissions testing systems worldwide for over three
decades. Its products, SEMTECH-G for gasoline-powered vehicles and SEMTECH-D for
diesel-powered vehicles, are commercially available for on-vehicle emission test applications. In
October 2001, Sensors, Inc. conducted real-world, on-road emissions measurements of 15
heavy-duty transit buses for U.S. EPA (Ensfield 2002). The buses were provided by the AATA, and
all were New Flyer models with Detroit Diesel Series 50 engines. Table 4-1 summarizes
the buses tested for U.S. EPA.
Table 4-1 Buses Tested for U.S. EPA (Ensfield 2002)

Bus #  Bus ID  Model Year  Odometer  Engine Series        Displacement (liters)  Peak Torque (lb-ft)  Test Date
1      BUS360  1995        270476    SERIES 50 8047 GK40  8.5                    890                  10/25/2001
2      BUS361  1995        280484    SERIES 50 8047 GK38  8.5                    890                  10/25/2001
3      BUS363  1995        283708    SERIES 50 8047 GK37  8.5                    890                  10/24/2001
4      BUS364  1995        247379    SERIES 50 8047 GK42  8.5                    890                  10/24/2001
5      BUS372  1995        216278    SERIES 50 8047 GK41  8.5                    890                  10/26/2001
6      BUS375  1996        211438    SERIES 50 8047 GK39  8.5                    890                  10/25/2001
7      BUS377  1996        252253    SERIES 50 8047 GK36  8.5                    890                  10/24/2001
8      BUS379  1996        260594    SERIES 50 8047 GK35  8.5                    890                  10/23/2001
9      BUS380  1996        223471    SERIES 50 8047 GK28  8.5                    890                  10/23/2001
10     BUS381  1996        200459    SERIES 50 8047 GK29  8.5                    890                  10/22/2001
11     BUS382  1996        216502    SERIES 50 8047 GK30  8.5                    890                  10/17/2001
12     BUS383  1996        199188    SERIES 50 8047 GK31  8.5                    890                  10/19/2001
13     BUS384  1996        222245    SERIES 50 8047 GK32  8.5                    890                  10/17/2001
14     BUS385  1996        209470    SERIES 50 8047 GK33  8.5                    890                  10/18/2001
15     BUS386  1996        228770    SERIES 50 8047 GK34  8.5                    890                  10/19/2001

4.1.1  Data Collection Method

       A total of 15 files were provided for the purpose of model development (Ensfield 2002).
Each file represents data collected from a different transit bus. Five of these buses were 1995
model year and the rest were 1996 model year. All of the bus test periods lasted approximately
two hours.  The buses operated along standard Ann Arbor bus routes and stopped at all regular
stops although the buses did not board or discharge any passengers. The routes were mostly
different for each test, and were selected for a wide variety of driving conditions. All of the bus
routes for the test are shown in Figure 4-1.
                 Figure 4-1 Bus Routes Tested for U.S. EPA (Ensfield 2002).
       Sensors, Inc. engineers performed the instrument setup and data collection for all the
buses. The test equipment, a SEMTECH-D analyzer, is shown in Figure 4-2. Because engine
computer vehicle interface (SAE J1708) data were collected at 10 Hz, Sensors, Inc. engineers
manually started and stopped data collection at approximately 30-minute intervals to keep file
sizes manageable. A total of four trip files were generated per bus, and zero drift was checked
between data collections. The four files for each bus were then combined into one file after
post-processing, so the time record for each bus is sometimes not continuous. To simplify the
derivation of other variables, such as acceleration, and to keep the data manageable, the data
for each bus were separated into trips based on continuous time. After this processing, there
were 62 "trips" in the transit bus database.
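
Separating a combined file into trips based on continuous time can be sketched as below. This is a minimal illustration, assuming 1 Hz records and treating any gap longer than one second as a trip boundary; the function names, the gap threshold, and the sample values are hypothetical. Acceleration is then derived within each trip only, so that no difference is taken across a gap.

```python
def split_into_trips(timestamps_s, max_gap_s=1):
    """Group sample indices into trips; a gap above max_gap_s starts a new trip."""
    trips, current = [], []
    for i, t in enumerate(timestamps_s):
        if current and t - timestamps_s[current[-1]] > max_gap_s:
            trips.append(current)
            current = []
        current.append(i)
    if current:
        trips.append(current)
    return trips


def accelerations(speeds_mph, trip):
    """Backward-difference acceleration (mph/s) within one trip of 1 Hz samples.

    The first sample of a trip has no predecessor, so its value is None.
    """
    acc = [None]
    for prev, cur in zip(trip, trip[1:]):
        acc.append(speeds_mph[cur] - speeds_mph[prev])
    return acc


# Illustrative data only: a gap between t = 2 s and t = 10 s splits the record.
times = [0, 1, 2, 10, 11]
speeds = [0.0, 3.0, 5.0, 20.0, 22.0]
for trip in split_into_trips(times):
    print(trip, accelerations(speeds, trip))
```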
                 Figure 4-2 SEMTECH-D in Back of Bus (Ensfield 2002)
4.1.2 Transit Bus Data Parameters

      Each of the 15 data files shares the same format. The data fields included in each file are
summarized in Table 4-2.
Table 4-2 Transit Bus Parameters Given by the U.S. EPA (Ensfield 2002)

Category                     Parameters
Test Information             Date; Time
Vehicle Characteristics      License number; Engine size; Instrument configuration number
Roadway Characteristics      GPS Latitude (degree); GPS Longitude (degree); GPS Altitude (feet); Grade (%)
Onroad Load Parameters       Vehicle speed (mph); Engine speed (rpm); Torque (lb-ft); Engine power (bhp)
Engine Operating Parameters  Engine load (%); Throttle position (0 - 100%); Fuel volumetric flow rate
                             (gal/s); Fuel specific gravity; Fuel mass flow rate (g/s); Calculated
                             instantaneous fuel economy (mpg); Engine oil temperature (deg F); Engine
                             oil pressure (kPa); Engine warning lamp (Binary); Engine coolant
                             temperature (deg F); Barometric pressure reported from ECM (kPa);
                             Calculated exhaust flow rate (SCFM)
Environment Conditions       Ambient temperature (deg C); Ambient pressure (mbar); Ambient relative
                             humidity (%); Ambient absolute humidity (grains/lb air)
Vehicle Emission             HC, CO, NOx, CO2 emission (in ppm, g/sec, g/kg-fuel, g/bhp-hr units)
4.1.3  Sensors, Inc. Data Processing Procedure

       It is helpful to understand how Sensors, Inc. processed the dataset after data collection.
This information is important for data quality assurance and quality control. This section is
adapted primarily from the Sensors, Inc. field data collection report (Ensfield 2002).
       Data Synchronization: According to the Sensors, Inc. report, the analytical instruments,
vehicle interface, and global positioning system (GPS) equipment reported data to the SEMTECH
data logger individually, asynchronously, and at differing rates, but with a timestamp at
millisecond precision. The first step of the post-processing procedure is to eliminate the extra
data by interpolating and synchronizing all the data to 1 Hz. With all the raw data synchronized
to the same data rate, the data are then time-aligned so that engine data correspond to emissions
data in real time.
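
The synchronization step can be sketched as a resampling of irregularly timestamped samples onto whole seconds. Linear interpolation is an assumption here; the report states only that the data were interpolated to 1 Hz, not which method the SEMTECH post-processor uses. The function name and sample values are hypothetical.

```python
import math


def resample_to_1hz(times_s, values):
    """Linearly interpolate irregular samples onto whole-second timestamps.

    times_s must be strictly increasing and contain at least two samples.
    Returns a list of (second, value) pairs covering the sampled interval.
    """
    start = math.ceil(times_s[0])
    end = math.floor(times_s[-1])
    out = []
    j = 0
    for t in range(start, end + 1):
        # Advance to the sample pair that brackets the target second.
        while times_s[j + 1] < t:
            j += 1
        t0, t1 = times_s[j], times_s[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append((t, values[j] + w * (values[j + 1] - values[j])))
    return out


# Illustrative data only: a signal sampled at uneven intervals.
print(resample_to_1hz([0.0, 0.4, 1.2, 2.0], [0.0, 4.0, 12.0, 20.0]))
```

Once every channel has been resampled onto the same whole-second grid, the channels can be shifted relative to one another for time alignment.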
       Mass Emissions Calculations: Mass emissions (grams/second) are calculated by the fuel
flow method. With access to real-time, second-by-second fuel flow rates, a value for transient
mass emissions is computed as shown in Equation 4-1. Using NO as an example, NO mass
emissions are calculated on a second-by-second basis (Ensfield 2002).
                     $NO_{g/s} = NO_{fs} \times Fuelflow$                 (Equation 4-1)
              where  NO_{g/s} :  NO emissions (grams/second)
                     NO_{fs} :   NO emission rate (grams of NO per gram of fuel)
                     Fuelflow :  flow of fuel per unit time (grams per second).

       Fuel specific emissions are the ratios of the mass of each pollutant to the fuel in the
combusted air/fuel mixture.  The mass fuel flow rate is converted from fuel volumetric flow rate
using fuel specific gravity.
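
The fuel flow method can be sketched in two steps: convert volumetric fuel flow to mass flow using specific gravity, then multiply by the fuel-specific emission rate. The reference mass of a gallon of water (about 3785.41 g, taking 1 g/mL) is an assumption of this sketch, as are the function names and input values.

```python
WATER_G_PER_GAL = 3785.41  # approx. grams of water per US gallon (assumed reference)


def fuel_mass_flow_g_s(fuel_gal_s, specific_gravity):
    """Convert volumetric fuel flow (gal/s) to mass flow (g/s)."""
    return fuel_gal_s * specific_gravity * WATER_G_PER_GAL


def mass_emission_g_s(fuel_specific_g_per_g, fuel_flow_g_s):
    """Pollutant g/s = (g pollutant per g fuel) x (g fuel per second)."""
    return fuel_specific_g_per_g * fuel_flow_g_s


# Illustrative inputs only (not values from the report).
flow = fuel_mass_flow_g_s(0.001, 0.85)
print(flow, mass_emission_g_s(0.02, flow))
```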
       Brake Specific Emissions Calculations: Engine torque is first computed by applying the
engine load parameter, which represents the ratio between current engine torque and maximum
engine torque, to the engine lug curve (maximum torque curve). Engine horsepower is then con-
verted from engine torque using engine speed data.  Work (bhp-hr) is computed for each second
of the test, and brake specific emissions are reported as the sum of the grams of pollutant emitted
over the desired interval (one second) divided by the total work.
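
The brake-specific calculation can be sketched as follows. The sketch assumes that the maximum torque at the current engine speed has already been read from the lug curve and is passed in by the caller; it uses the standard relation hp = torque (lb-ft) × rpm / 5252. Function names and input values are hypothetical.

```python
def power_from_load_bhp(load_pct, max_torque_lbft, rpm):
    """Engine power from percent load, lug-curve maximum torque, and engine speed."""
    torque = (load_pct / 100.0) * max_torque_lbft  # current torque, lb-ft
    return torque * rpm / 5252.0                   # horsepower


def brake_specific_g_per_bhp_hr(mass_g_per_s, power_bhp):
    """Sum of grams emitted divided by total work over a run of 1 Hz samples."""
    total_mass = sum(mass_g_per_s)                   # each sample covers one second
    total_work = sum(p / 3600.0 for p in power_bhp)  # bhp-hr
    return total_mass / total_work                   # undefined if total work is zero


# Illustrative inputs only (not values from the report).
print(power_from_load_bhp(50.0, 890.0, 1313.0))
print(brake_specific_g_per_bhp_hr([0.1, 0.2], [180.0, 180.0]))
```

Intervals with little or no positive work (idling, motoring) make the ratio unstable, which is one reason idle emissions are usually expressed in g/hr instead.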
       Vehicle Speed Validation: Vehicle speed is a critical parameter that influences the de-
rived parameters, acceleration and emission rates. It is important for researchers to understand
the method of measurement and data accuracy. Sensors, Inc. measured vehicle speed using two
methods: vehicle Electronic Control Module (ECM) and Global Positioning System (GPS).
Figure 4-3 shows the GPS vs. ECM comparison for Bus 380. According to the Sensors, Inc. report,
regression analysis shows that the ECM data are around 10% higher than the GPS data (Ensfield
2002). Sensors, Inc. researchers believe that this comparison suggests that GPS data may
be more reliable for on-road testing. Buses of model year 1995 were equipped with an
earlier-version ECM that did not provide vehicle speed, so GPS velocity data were used in place
of the ECM data. Buses of model year 1996 were equipped with the current-version ECM, which
provides vehicle speed; for these buses, vehicle speed was reported after validation with the
GPS data. GPS data were within 1% accuracy based upon analysis of 10 miles of data (Ensfield 2002).
[Figure: time-series plot titled "GPS vs ECM Vehicle Speed Comparison, Bus 1, Trip 1";
series: VEH SPEED (mph) and GPS SPEED (mph); x-axis: Elapsed Time (sec)]
               Figure 4-3 Bus 380 GPS vs. ECM Vehicle Speed (Ensfield 2002)
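
A comparison of the two speed signals can be sketched with a least-squares line through the origin, whose slope gives the average ratio of ECM speed to GPS speed. The report does not state the form of the regression that Sensors, Inc. used, so the through-origin fit, the function name, and the sample values are all assumptions.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (no intercept)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / sxx


# Illustrative data only: ECM readings 10% above the GPS readings.
gps_mph = [10.0, 20.0, 30.0]
ecm_mph = [11.0, 22.0, 33.0]
print(slope_through_origin(gps_mph, ecm_mph))
```

A slope near 1.10 would correspond to the roughly 10% difference described in the text.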
4.1.4  Data Quality Assurance/Quality Check

       After reviewing the manner in which Sensors, Inc. processed the reported data set,
the data set for each bus was screened to check for errors or possible problems. Possible sources
of error associated with data collection should be considered before undertaking data analysis
for the development of a model. The types of errors checked are listed below.
       Loss of Data: Emission data are missing for some buses. For example, bus 382 had missing
HC data for 343 seconds. Buses 361, 377 and 384 have similar problems. There are several
possible reasons for loss of data: communication between instruments might be lost, or a
particular vehicle may have failed to report a particular variable. These records were removed
from the test database and not employed in development of HC models because the missing
instantaneous emission values would otherwise be recorded as zero, introducing significant bias
into the results.  Similarly, calculated fuel economy data are missing for some buses.
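
       The bias described above can be illustrated with a minimal sketch. It assumes that a lost
record is represented as None; the function name and the sample values are hypothetical and
are not taken from the Sensors, Inc. data set.

```python
from statistics import mean
from typing import List, Optional


def mean_excluding_missing(hc_g_per_s: List[Optional[float]]) -> float:
    """Average HC rate over the seconds that actually have a measurement."""
    valid = [v for v in hc_g_per_s if v is not None]
    return mean(valid)


# Hypothetical second-by-second HC rates (g/s); None marks a lost record.
hc = [0.002, None, 0.004, None]

# Coding the lost records as zero pulls the average down.
biased = mean(0.0 if v is None else v for v in hc)
# Dropping the lost records keeps the average of the measured seconds.
unbiased = mean_excluding_missing(hc)
```

       In this illustration the zero-coded average is half of the average computed from the
measured seconds alone, which is why such records are dropped rather than kept as zeros.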
       Erroneous ECM Data: There were some cases where certain engine parameters were well
outside physical limits, and these erroneous ECM data were filtered out with pre-defined filter
limits.  The following filter limits (Ensfield 2002) were imposed on the rate of change of RPM,
fuel flow, and vehicle speed data:
       •       Rate of change limit for RPM = 10,000 (RPM)/sec
       •       Rate of change limit for Fuel flow = 0.003 (gal/sec)/sec
       •       Rate of change limit for Vehicle speed = 21 (mph)/sec
       According to the Sensors, Inc. report, these filters remove the data outside the defined
limits. The SEMTECH post-processor automatically interpolates between the remaining data and
produces results at 1 Hz as before (Ensfield 2002).  Because this procedure was completed by
manually plotting the ECM parameters and computed mass results, all the buses' data were screened
again to check for any remaining data spikes for data quality assurance purposes. No remaining
errors of this kind were identified.  However, the modeler should keep in mind that the filtered
data could still be in error, because engine acceleration or deceleration that was flagged as
"unreasonable" and removed may in fact have been within reasonable absolute limits.
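
       The filtering and interpolation steps above can be sketched as follows. This is an
illustration only, not the SEMTECH post-processor algorithm, which is not documented here. It
assumes 1 Hz data, a valid first sample, a comparison against the last retained sample, and
linear interpolation; the function and dictionary names are hypothetical.

```python
from typing import List, Optional

# Rate-of-change limits quoted in the text (per second).
RATE_LIMITS = {"rpm": 10_000.0, "fuel_gal_per_s": 0.003, "speed_mph": 21.0}


def filter_and_interpolate(values: List[float], limit: float) -> List[float]:
    """Drop samples whose rate of change from the last kept sample exceeds
    `limit` per second, then fill the gaps by linear interpolation."""
    kept: List[Optional[float]] = [values[0]]  # assumption: first sample is valid
    last_idx = 0
    for i in range(1, len(values)):
        rate = abs(values[i] - values[last_idx]) / (i - last_idx)
        if rate > limit:
            kept.append(None)          # outside the limit: remove
        else:
            kept.append(values[i])
            last_idx = i

    out = list(kept)
    valid = [i for i, v in enumerate(kept) if v is not None]
    for a, b in zip(valid, valid[1:]):
        for j in range(a + 1, b):
            frac = (j - a) / (b - a)
            out[j] = kept[a] + frac * (kept[b] - kept[a])
    # A trailing gap has no right-hand neighbour; hold the last valid value.
    for j in range(valid[-1] + 1, len(out)):
        out[j] = kept[valid[-1]]
    return out


# Hypothetical speed trace (mph) with one spike at the third second.
cleaned = filter_and_interpolate([10, 12, 60, 14, 15], RATE_LIMITS["speed_mph"])
```

       Comparing each sample with the last retained sample, rather than with its immediate
predecessor, keeps a single spike from also disqualifying the valid sample that follows it.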
       GPS Dropouts: There were a few instances when the GPS lost communication with the
satellite for unknown reasons, and these erroneous GPS data were removed manually (Ensfield
2002).  To guarantee data quality, the modeler screened all GPS data again to check for any
remaining erroneous cases. The screening of erroneous GPS data is based on the consistency
between GPS data and engine parameters. The secondary screening identified that bus
360 data still contained some erroneous GPS data. The questionable area covers the first
434 seconds of the trip (see Figure 4-4).  The corresponding GPS data are shown in red in the
left figure.  The right figure illustrates the time series plot for the checked area. Although
GPS signals are reported as fixed positions in the left figure and vehicle speed data are
reported as zero in the right figure, engine speed and engine power in the right figure show
that bus 360 did move during that period.  This error might be due to GPS dropouts.

-------
        Figure 4-4 Example Check for Erroneous GPS Data for Bus 360 (Ensfield 2002)
       Due to GPS dropouts, the GPS signals were reported as fixed positions. At the
same time, the vehicle speed might be reported as zero while other ECM data, such as engine
speed and engine power, would show that the bus did move during that period.  If the modeler
fails to screen and remove such data, these data will be classified as idle mode and will
produce erroneous analysis results for idle mode. The modeler screened all buses manually
and found that six buses had such problems (buses 360, 361, 363, 364, 375 and 377).  Usually,
this type of error was prevalent at the beginning of the bus trip.  All erroneous data were
removed manually. The correction of the database to remove these erroneous data is critical to
model development (initial models associated with development of idle and load-based emission
rates were problematic until this database error was identified and corrected by the author).
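
       The screening in this study was manual. The sketch below shows how an automated first
pass could flag candidate records for manual review, based on the inconsistency described
above (zero reported speed while engine power indicates motion). The function name and the
50 bhp cutoff are hypothetical and are not taken from the study.

```python
from typing import List


def flag_suspect_zero_speed(
    speed_mph: List[float],
    engine_power_bhp: List[float],
    power_threshold_bhp: float = 50.0,  # hypothetical cutoff, not from the report
) -> List[bool]:
    """Mark seconds where speed is zero but engine power suggests motion."""
    return [
        speed == 0 and power > power_threshold_bhp
        for speed, power in zip(speed_mph, engine_power_bhp)
    ]


# Hypothetical 1 Hz samples: the second and third seconds look inconsistent.
speed = [0.0, 0.0, 0.0, 12.0]
power = [8.0, 120.0, 95.0, 110.0]
flags = flag_suspect_zero_speed(speed, power)
```

       Flagged seconds would still need to be reviewed against the GPS position trace before
removal, since high engine power at zero speed can also occur briefly at the start of an
acceleration.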
       Synchronization Errors: Data were checked for synchronization errors. An example
plot of such a check is presented in Figure 4-5, where part of the trip for Bus 360 is used. The
selected area covers about 200 seconds. The corresponding GPS data are shown as the green/red
part in the left figure. The figure on the right illustrates the time series plot for the area
checked. The speed for the red points in both figures is 0 mph. Although NOx correlates well
with engine load and engine speed, vehicle speed does not correlate well with the engine data
and NOx emissions data. Bus 360 was equipped with an earlier-version ECM that did not provide
vehicle speed, so GPS velocity data were used in place of the ECM data. According to the
Sensors, Inc. report, data synchronization was performed only between emissions data and engine
data, not between vehicle speed and emissions data (Ensfield 2002).

-------
              Figure 4-5 Example Check for Synchronization Errors for Bus 360
       All bus data were checked for this type of error, and such errors were identified in all of
the test data for six buses (buses 360, 361, 363, 364, 375, 377). Coincidentally, these six buses
also had GPS dropout problems. According to Frey and Zheng (2001), small errors in
synchronization do not substantially affect the estimate of total trip emissions, but such
deviations will influence estimates in micro-scale analysis. To choose the delay time used to
correct the GPS data and vehicle speed data, the author compared the impacts of using a
2-second, 3-second, and 4-second delay. Figure 4-6 illustrates histograms of engine power for
zero speed data based on the three proposed time delay options. A 3-second delay was chosen
because the engine power distribution for zero speed data is more reasonable with that delay.
Compared with the 2-second delay, the zero speed data contain fewer data points with higher
engine power (>150 brake horsepower) for the 3-second delay. Meanwhile, the zero speed data
contain more data points with lower engine power (<20 brake horsepower) for the 3-second delay
than for the 4-second delay.
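
       The delay comparison above can be sketched as follows. The direction of the shift is an
assumption (speed is taken to lag the engine data), and the function names are hypothetical;
the 150 bhp and 20 bhp cutoffs are the ones quoted in the text.

```python
from typing import Dict, List


def zero_speed_power(
    speed_mph: List[float], power_bhp: List[float], delay_s: int
) -> List[float]:
    """Engine power at the seconds where the delay-corrected speed is zero.

    Assumes speed lags the engine data, so speed[t + delay_s] is paired
    with power[t]; reverse the shift if the lag runs the other way.
    """
    n = len(speed_mph) - delay_s
    return [power_bhp[t] for t in range(n) if speed_mph[t + delay_s] == 0]


def summarize(powers: List[float], high: float = 150.0, low: float = 20.0) -> Dict[str, float]:
    """Count the zero-speed points and the shares above `high` and below `low` bhp."""
    total = len(powers)
    return {
        "n": total,
        "share_high": sum(p > high for p in powers) / total if total else 0.0,
        "share_low": sum(p < low for p in powers) / total if total else 0.0,
    }


def compare_delays(speed_mph: List[float], power_bhp: List[float]) -> Dict[int, Dict[str, float]]:
    """Summaries for the three candidate delays considered in the text."""
    return {d: summarize(zero_speed_power(speed_mph, power_bhp, d)) for d in (2, 3, 4)}
```

       A delay that leaves few high-power points and many low-power points in the zero-speed
subset is the more plausible one, which mirrors the reasoning used to select the 3-second delay.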

-------
 [Figure panels: histograms of engine power for zero speed data with a 2-second delay, a 3-second
  delay, and a 4-second delay]
 Figure 4-6 Histograms of Engine Power for Zero Speed Data Based on Three Different Time Delays
       Road Grade Validation: According to the Sensors, Inc. report, the GPS data were used for
grade calculation. Combining the velocity at time t with the difference in altitude between time
t and t-1 second, the instantaneous grade is computed as shown in Equation 4-2 (Ensfield 2002).

       \[ \mathrm{grade}_t = \frac{\mathrm{altitude}_t - \mathrm{altitude}_{t-1}}{\mathrm{velocity}_t} \]        (Equation 4-2)

               where  grade_t:     road grade at time t
                      t:           time, t or t-1 second
                      velocity_t:  vehicle speed in feet per second at time t
                      altitude_t:  altitude in feet at time t or t-1
        The calculation can generate significant errors given the uncertainty in the GPS
position, particularly at low speeds, where the distance traveled over the one-second interval
is small (Ensfield 2002).  In the real world, the maximum recommended grade for use
in road design depends upon the type of facility, the terrain on which it is built, and the design
speed. Figure 4-7 is reproduced from Traffic Engineering (Roess et al. 2004) to present a
general overview of usual practice.  Roess et al. (2004) indicated that these criteria represent a
balance between the operating comfort of motorists and passengers and the practical constraints
of design and construction in more severe terrains.

               [Figure: maximum grade versus design speed for level, rolling, and mountainous terrain]
              Figure 4-7 General Criteria for Maximum Grades (Roess et al. 2004)
        The modeler screened the grade data in the database and found that 0.42% of the data
 exceed a grade of 10%.  Meanwhile, 2% of the road grade data show a rate of change greater than
 5%. This means that some road grade data are dubious or erroneous. Following the Sensors, Inc.
 recommendations, road grade data were used only as a reference and were not used directly in
 model development.
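
        The grade calculation in Equation 4-2 and the screening described above can be sketched
 as follows. The sketch assumes 1 Hz data, treats the first sample and zero-velocity samples
 as undefined, and applies the limit to the magnitude of the grade; the function names and
 sample values are hypothetical.

```python
import math
from typing import List


def instantaneous_grade(altitude_ft: List[float], velocity_ft_s: List[float]) -> List[float]:
    """Grade (as a fraction) per Equation 4-2 for 1 Hz data.

    The first sample and any zero-velocity sample have no defined grade
    and are returned as NaN.
    """
    grade = [math.nan]
    for t in range(1, len(altitude_ft)):
        v = velocity_ft_s[t]
        rise = altitude_ft[t] - altitude_ft[t - 1]
        grade.append(rise / v if v > 0 else math.nan)
    return grade


def share_exceeding(values: List[float], limit: float) -> float:
    """Fraction of defined (non-NaN) values whose magnitude exceeds `limit`."""
    defined = [v for v in values if not math.isnan(v)]
    return sum(abs(v) > limit for v in defined) / len(defined) if defined else 0.0


# Hypothetical 1 Hz samples, for illustration only.
altitude = [100.0, 101.0, 101.0, 103.0]   # feet
velocity = [20.0, 20.0, 0.0, 10.0]        # feet per second
grade = instantaneous_grade(altitude, velocity)
steep_share = share_exceeding(grade, 0.10)  # share of grades beyond +/-10%
```

        The low-speed sensitivity noted in the text is visible in the last sample: the same
 2-foot altitude difference yields a 20% grade at 10 ft/s but would yield only 5% at 40 ft/s.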

 4.1.5 Database Formation

        The data dictionaries of the source files were reviewed for parameter content.  Not all
 variables reported will be included in explanatory analysis. A standard file structure was
 designed to accommodate the available format. Emissions rate data with units of grams/second
 were selected to develop the proposed emission rate model. Because volumetric fuel rate, fuel
specific gravity, and fuel mass flow rate are used to calculate mass emissions (g/s), these
variables will be excluded from further analysis.  Similarly, because percent engine load, engine
torque, and engine speed are used to calculate engine power (brake horsepower), only engine power
(bhp) is selected to represent power-related variables. Exhaust flow rate is excluded because it is
back-computed from the mass emissions generated with the fuel flow method. Fuel economy is
excluded because it is a 30-second moving average, computed for a test period by summing
the fuel consumed and dividing by the distance traveled. Because GPS data were used for
grade calculation and road grade data would only be used as reference, a dummy variable was
created to represent different road grade ranges.
       At the same time, variables that might be helpful in explaining variability in vehicle emis-
sions were included in the proposed file structure although they were not provided in the original
dataset. These variables include model year, odometer reading, and acceleration. Acceleration
data were derived from speed data using the central difference method.  Table 4-3 summarizes the
parameter list for explanatory analysis.
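
       The central difference calculation mentioned above can be sketched as follows. The
treatment of the first and last samples (one-sided differences) is an assumption, since the
report does not state how the endpoints were handled; the function name is hypothetical.

```python
from typing import List


def central_difference_accel(speed_mph: List[float], dt_s: float = 1.0) -> List[float]:
    """Acceleration (mph/s) from a speed trace sampled every `dt_s` seconds.

    Interior points use (v[t+1] - v[t-1]) / (2 * dt). The endpoints fall
    back to one-sided differences (an assumption, not from the report).
    """
    n = len(speed_mph)
    if n < 2:
        return [0.0] * n
    accel = [(speed_mph[1] - speed_mph[0]) / dt_s]
    for t in range(1, n - 1):
        accel.append((speed_mph[t + 1] - speed_mph[t - 1]) / (2 * dt_s))
    accel.append((speed_mph[-1] - speed_mph[-2]) / dt_s)
    return accel


# Hypothetical 1 Hz speed trace (mph).
accel = central_difference_accel([0.0, 2.0, 6.0, 12.0])
```

       Compared with a simple backward difference, the central difference averages the change
on both sides of each second, which dampens the effect of noise in a single speed reading.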

Table 4-3 List of Parameters Used in Explanatory Analysis for Transit Bus
Category                       Parameters
Test Information               Date; Time
Vehicle Characteristics        License number; Model year; Odometer reading; Engine size;
                               Instrument configuration number
Roadway Characteristics        Dummy variable for road grade range
Onroad Load Parameters         Engine power (bhp); Vehicle speed (mph); Acceleration (mph/s)
Engine Operating Parameters    Throttle position (0 - 100%); Engine oil temperature (deg F);
                               Engine oil pressure (kPa); Engine warning lamp (Binary);
                               Engine coolant temperature (deg F); Barometric pressure
                               reported from ECM (kPa)
Environmental Conditions       Ambient temperature (deg C); Ambient pressure (mbar); Ambient
                               relative humidity (%); Ambient absolute humidity (grains/lb air)
Vehicle Emissions              HC, CO, NOx emission (in g/sec)

4.1.6  Data Summary

       After the post-processing procedure was completed, the summary of the emissions and
activity data as well as environmental and roadway characteristics is given in Table 4-4.

-------
Table 4-4 Summary of Transit Bus Database
                  Vehicle Operation      Emission Data                    Environmental Characteristics
Bus     Seconds   Average   Average      Average    Average    Average    Average       Average       Average
ID      of Data   Speed     Engine       CO         NOx        HC         Ambient       Ambient       Humidity
                  (mph)     Power (bhp)  (g/s)      (g/s)      (g/s)      Temp. (deg C) Press. (mbar) (grains/lb air)
360     7606      11.116    71.952       0.029652   0.11049    0.001838   20.358        977.16        24.512
361     5153      25.804    87.536       0.018965   0.1484     0.001304   16.666        971.08        26.745
363     7623      14.626    65.822       0.022419   0.066047   0.000239   25.623        965.69        88.396
364     5284      19.046    79.599       0.020627   0.12341    0.003492   20.358        985.58        33.227
372     5275      21.45     72.395       0.016582   0.087625   0.002371   21.375        982.05        32.494
375     7323      16.814    86.307       0.031844   0.13697    0.001377   17.5          977.52        24.394
377     7805      12.518    78.121       0.028571   0.074597   0.000557   26.012        973.08        70.653
379     7880      15.118    84.82        0.030731   0.10658    0.001807   23.788        974.27        70.818
380     8006      13.035    72.987       0.052504   0.10393    0.001073   23.648        973.22        67.525
381     7282      16.335    65.724       0.034294   0.090166   0.000609   22.465        987.82        46.016
382     3136      19.947    85.224       0.052822   0.14089    0.00132    21.746        994.71        27.868
383     7943      18.253    67.249       0.026207   0.11873    0.001803   21.282        983.55        44.646
384     8453      18.262    64.199       0.036183   0.10457    0.00137    18.17         992.7         22.494
385     8423      16.559    62.512       0.023527   0.095998   0.001693   21.842        991.34        29.766
386     10339     17.319    62.979       0.047062   0.10635    0.00147    20.389        985.65        37.239

-------
                             4.2 Heavy-duty Vehicle Dataset

       The heavy-duty vehicle emission dataset was prepared by the U.S. EPA National Risk
Management Research Laboratory (NRMRL) (U.S. EPA 2001b).  EPA's Onroad Diesel Emissions
Characterization (ODEC) facility has been collecting real-world gaseous emissions data for
many years (U.S. EPA 2001c).  The on-road facility used a 1990 Kenworth T800 tractor-trailer
as its test vehicle to collect this database. When this truck was purchased, it had already
logged over 900,000 miles and was due for an overhaul of its Detroit Diesel Series 60 engine.
The vehicle was tested both before and after the overhaul.  NRMRL collected
the test data for U.S. EPA from 1999 to 2000 and included all the results and findings in a report
titled "Heavy Duty Diesel Fine Particulate Matter Emissions: Development and Application of
On-Road Measurement Capabilities" (U.S. EPA 2001c).

4.2.1 Data Collection Method

       The general capabilities of the ODEC facility are shown in Figure 4-8. The facility is designed
to collect data while traveling along the public roadways using a 1990 Kenworth T800 tractor-trailer.
This  truck was tested using two types of tests. During 'parametric' testing, the truck systematically fol-
lows a test matrix representing the full range of load, grade,  speed and acceleration conditions. During
'highway' testing, the truck travels along an interstate highway with no specific agenda other than cover-
ing the distance safely and efficiently; speed and acceleration vary randomly with grade, speed limit, and
traffic effects. Tables 4-5 and 4-6 summarize the tests finished by NRMRL for U.S. EPA.
        [Figure: ODEC facility measurements feeding a computerized data acquisition system.
         Stack measurements: opacity, temperature, velocity head, static pressure.
         Engine measurements: intake, exhaust, coolant and oil temperatures; speed (RPM).
         Drive shaft measurements: torque; speed (RPM).
         Operational measurements: speed (km/h); front-to-rear G-force.
         Exhaust sample measurements: O2 (%), CO2 (%), CO (%), CO (ppm), NOx (ppm), THC (ppm).]
        Figure 4-8 Onroad Diesel Emissions Characterization Facility (U.S. EPA2001c)

-------
Table 4-5 Onroad Tests Conducted with Pre-Rebuild Engine
Test ID   Load       Grade(s)   Comments
          (lb GCW)   (%)
3FOOV     79280      Zero       Constant Speed Testing
3FOOC     79280      Zero       Coast Down & Acceleration
3FOOA     79280      Zero       Governed Acceleration & Short-shift Acceleration
3HOOV     61060      Zero       Constant Speed Testing
3HOOC     61060      Zero       Coast Down & Acceleration
3HOOA     61060      Zero       Governed Acceleration & Short-shift Acceleration
3EOOV     42840      Zero       Constant Speed Testing
3EOOC     42840      Zero       Coast Down & Acceleration
3EOOA     42840      Zero       Governed Acceleration & Short-shift Acceleration
3FOGA     79280      Zero       Governed Acceleration
3FOSA     79280      Zero       Short-shift Acceleration
3FOV      79280      Zero       Constant Speed Testing
3HOGA     61060      Zero       Governed Acceleration
3HOSA     61060      Zero       Short-shift Acceleration
3HOV      61060      Zero       Constant Speed Testing
3EOGA     42840      Zero       Governed Acceleration
3EOSA     42840      Zero       Short-shift Acceleration
3EOV      42840      Zero       Constant Speed Testing
3F3&6     79280      3.1, 6.0   Uphill Grade Tests
3H3&6     61060      3.1, 6.0   Uphill Grade Tests
3E3&6     42840      3.1, 6.0   Uphill Grade Tests
3F-SEQ    79280      Zero       Dyno Sequence Simulations
3DRI      79280      Various    Open Highway Tests - Tunnel
3FIL      61060      Various    Open Highway Tests - Filters
3DIOX*    61060      Various    Open Highway Tests - Dioxin
*Note: These tests are not available.

-------
Table 4-6 Onroad Tests Conducted with Post-Rebuild Engine
Test ID   Load       Grade(s)   Comments
          (lb GCW)   (%)
5FOV      74000      Zero       Constant Speed Testing
5FOC*     74000      Zero       Coast Down & Acceleration
5FOA*     74000      Zero       Governed Acceleration & Short-shift Acceleration
5HOV      61440      Zero       Constant Speed Testing
5HOC*     61440      Zero       Coast Down & Acceleration
5HOA*     61440      Zero       Governed Acceleration & Short-shift Acceleration
5EOV      42600      Zero       Constant Speed Testing
5EOC*     42600      Zero       Coast Down & Acceleration
5EOA*     42600      Zero       Governed Acceleration & Short-shift Acceleration
5F3&6     74000      3.1, 6.0   Uphill Grade Tests
5H3&6     61440      3.1, 6.0   Uphill Grade Tests
5E3&6     42600      3.1, 6.0   Uphill Grade Tests
5F-SEQ*   74000      Zero       Dyno Sequence Simulations
5Plume    61440      Various    Open Highway Tests - Plume
5NOxB*    61440      Various    Open Highway Tests - Burst
5DIOX*    61440      Various    Open Highway Tests - Dioxin
*Note: These test results are not available.

4.2.2  Heavy-duty Vehicle Data Parameters

       A total of 42 files were collected for the pre-rebuild engine and a total of 38 files were
collected for the post-rebuild engine. Each file represents data collected for a different engine
and test. Preliminary analysis of individual files indicated that the format was the same for all
available files. The data fields included in each file are summarized in Table 4-7 below.

-------
Table 4-7 List of Parameters Given in Heavy-duty Vehicle Dataset Provided by U.S. EPA
Category                       Parameters
Test Information               Date; Time
Vehicle Characteristics        Vehicle make/model; Model year; Engine type; Engine rating;
                               Vehicle maintenance history
Onroad Load Parameters         Truck load weight (lb); Vehicle speed (mph); Measured engine
                               power (bhp)
Engine Operating Parameters    Engine speed (RPM); Shaft volts; Torque volts; Fuel H/C ratio;
                               Fuel factor; Engine intake air temperature (deg F); Engine
                               exhaust air temperature (deg F); Engine coolant temperature
                               (deg F); Engine oil temperature (deg F)
Environment Conditions         Barometric pressure (inches Hg); Ambient humidity (%)
Vehicle Emissions              CO, NOx, and HC emission (in ppm, g/hr, g/kg fuel and g/hp-hr
                               units)

4.2.3 Data Quality Assurance/Quality Control Check

       Although a total of 80 tests were completed for that project, preliminary screening found
that some test files were missing from the data DVD provided by U.S. EPA to the researchers.
The missing test files include 3DIOX, 5EOC, 5HOC, 5FOC, 5F-SEQ, 5NOxB, and 5DIOX.
For quality assurance purposes, the available data files were screened to check for errors or
possible problems.  Possible sources of error in data collection should be considered before
developing the model. The types of errors checked are listed below.
       Loss of Data: Measured horsepower (engine power) and emission data were missing
for some tests. Tests 3F-SEQ, 3FIL1, 3FIL2, and 3FIL3 had no measured horsepower data for
the entire test. These test files could not be included in emission model development.  In
addition, tests 3EOOA, 3EOOC, 3EOOV, 3FOGA, 3FOSA, 3FOV, 3HOSA, 3FIL4, 3FIL5, 3FIL7, 3FIL8,
3FIL9, 3FIL10, and 5HOV had no HC emission data. These tests were therefore excluded from
HC emission model development. Test 3HOSA also had no CO emission data and
was therefore excluded from CO emission model development.
       Duplicated Records: A notable issue was duplicate records with different emission values
for the same time stamp in some test files.  After communicating with Mr. Brown, who prepared
this dataset for EPA, the reason was identified: the data were recorded at rates as high as
10 Hz to improve the resolution of the data. To remain consistent with other test files, these
data were post-processed as one data point for each second.
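
       The reduction of higher-rate records to one data point per second can be sketched as
follows. The report does not state how the sub-second readings were combined; averaging is
assumed here for illustration, and the function name and record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def collapse_to_one_hz(records: List[Tuple[int, float]]) -> Dict[int, float]:
    """Average all readings that share the same whole-second time stamp.

    Each record is (second, value). Averaging is an assumption; the
    original post-processing rule is not documented in the report.
    """
    by_second = defaultdict(list)
    for second, value in records:
        by_second[second].append(value)
    return {second: mean(values) for second, values in sorted(by_second.items())}


# Hypothetical readings: two values recorded within second 0, one in second 1.
one_hz = collapse_to_one_hz([(0, 1.0), (0, 3.0), (1, 5.0)])
```

       Whatever rule is used, applying it uniformly across all test files keeps the
second-by-second records comparable between tests.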
       Erroneous Load Data: The "measured horsepower" field is engine power data calculated
from measurement of the drive shaft torque and rotational speed.  Results from the literature
review show that engine power is a major explanatory variable, so erroneous load data are a
particular concern. This variable was therefore screened to check for errors or possible
problems.  An example of a check of measured horsepower is given in Figure 4-9.  The observed
relationship between measured horsepower and engine speed is, to some extent, a relationship
between vehicle speed and engine speed, which is described in "Fundamentals of Vehicle Dynamics"
(Gillespie 1992). At a given gear ratio, the relationship between engine speed and road speed
is to some extent linear. The geometric progression in the left figure reflects the choices made
in selection of transmission gear ratios. The right figure shows a problematic linear
relationship between measured horsepower and vehicle speed.  Essentially, the right figure
appears to show no gear changes as vehicle speed increases, indicating that measured horsepower
has been calculated incorrectly for this test.  Such problems exist in the series of tests 3DRI
and in test 5Plume. These test files were removed from emission model development.
               [Figure: Test 3DRI2-2, Open Highway Tests; axis: Measured Horsepower (bhp)]
       Figure 4-9 Example Check for Erroneous Measured Horsepower for Test 3DRI2-2
       Vehicle Speed Validation: The author reviewed NRMRL's report (U.S. EPA 2001c)
related to vehicle speed validation.  Vehicle speed data were measured with a Datron LSI optical
speed sensor. The product literature specifies an accuracy of +/- 0.2% and a reproducibility
of +/- 0.1% over the measurement range of 0.5 to 400 kph. Figure 4-10 from NRMRL's report
correlates the speed measurement to a drive shaft speed sensor that was scaled using a National
Institute of Standards and Technology (NIST)-traceable frequency source.  The outliers at
low speed indicate when the truck was turning (the trailer and the trailer-mounted speed sensor
travel less distance than the tractor does during turns). Notwithstanding these points, the
correlation is a good indication of speed measurement precision.
                  [Figure: scatter plot; x-axis: Drive Shaft Speed (rpm)]
                  Figure 4-10 Vehicle Speed Correlation (U.S. EPA2001c)
       At the same time, NRMRL provided Figure 4-11 (U.S. EPA 2001c) to show the precision
for four ranges of vehicle speed, along with similar estimates of accuracy.  This figure will help
researchers deal with speed measurement noise in the future.
        [Figure: precision (correlation error) and accuracy estimate by speed range (mph):
         10-30, 30-45, 45-60, above 60]
        Figure 4-11 Vehicle Speed Error for Different Speed Ranges (U.S. EPA 2001c)

-------
4.2.4  Database Formation

       The data dictionaries of the source files were reviewed for parameter content (Table 4-8).
Not all variables reported are included in explanatory analysis. A standard file structure was
designed to accommodate the available format.  Emissions data with units of gram/second were
selected to develop the proposed emission model. All variables used to calculate mass emissions
were excluded in further analysis.  Similarly, because the "measured horsepower" field is calcu-
lated from measurements of drive shaft torque and rotational speed, only "measured horsepower"
is used to represent power related variables. At the same time, variables like acceleration that
might be helpful in explaining variability in vehicle emissions were included in the proposed file
structure although they were not provided in the original dataset. Acceleration data were derived
from speed data using the central difference method.

Table 4-8 List of Parameters Used in Explanatory Analysis for HDDV
Category                       Parameters
Test Information               Date; Time
Vehicle Characteristics        Vehicle make/model; Model year; Engine type; Engine rating;
                               Vehicle maintenance history
Onroad Load Parameters         Truck load weight (lb); Vehicle speed (mph); Acceleration
                               (mph/s); Measured engine power (bhp)
Engine Operating Parameters    Engine intake air temperature (deg F); Engine exhaust air
                               temperature (deg F); Engine coolant temperature (deg F);
                               Engine oil temperature (deg F)
Environment Conditions         Barometric pressure (Hg); Ambient moisture (%)
Vehicle Emissions              CO, NOx, and HC emission (in g/s units)

4.2.5  Data Summary

       After the post-processing procedure was completed, a summary of the emissions and
activity data as well as environmental and roadway characteristics is given in Table 4-9.

-------
Table 4-9 Summary of Heavy-Duty Vehicle Data (U.S. EPA 2001c)
                   Vehicle Operation   Emission Data                    Environment Characteristics
Test ID  Seconds   Average   Average   Average    Average    Average    Barometric   Ambient
         of Data   Speed     Engine    CO         NOx        HC         Pressure     Moisture
                   (mph)     Power     (g/s)      (g/s)      (g/s)      (Hg)
                             (bhp)
3FOOV    4430      43.55     163.10    0.11633    0.27983    0.001442   28.273       1.6874
3FOOC    7991      36.49     323.79    0.08200    0.19566    0.001166   28.272       1.6874
3FOOA    1904      43.55     475.12    0.17476    0.34262    0.001471   28.272       1.6874
3HOOV    3718      43.66     130.99    0.08386    0.22701    0.001429   28.273       1.6874
3HOOC    7593      39.43     112.50    0.07456    0.17866    0.001414   28.272       1.6874
3HOOA    1959      48.04     218.50    0.20521    0.32078    0.001751   30.423       1.3573
3EOOV    3863      41.41     123.42    0.10896    0.21157    NA         28.273       1.6874
3EOOC    7962      39.31     104.95    0.07489    0.14908    NA         28.272       1.6874
3EOOA    1810      50.15     197.07    0.22324    0.26108    NA         30.137       1.9020
3FOGA    577       35.93     302.14    0.23114    0.41269    NA         29.995       0.4685
3FOSA    792       36.26     287.45    0.25140    0.37947    NA         29.995       0.4685
3FOV     3635      41.65     152.23    0.14879    0.28413    NA         29.995       0.4685
3HOGA    594       33.81     253.63    0.30036    0.48494    0.002159   29.690       1.6059
3HOSA    707       34.27     223.73    NA         0.32498    NA         29.690       1.6059
3HOV     3331      41.53     143.38    0.08892    0.27712    0.002436   28.020       0.4742
3EOGA    421       32.91     233.93    0.37978    0.30728    0.000589   29.976       0.5812
3EOSA    571       31.99     180.73    0.23652    0.33325    0.003042   29.976       0.5812
3EOV     3395      42.64     103.63    0.08879    0.25745    0.002805   29.976       0.5812
3F3&6    8629      36.59     131.00    0.14409    0.31374    0.001426   28.282       1.2520
3H3&6    10573     43.13     107.06    0.16769    0.27507    0.001753   28.273       1.6874
3E3&6    9825      44.74     121.69    0.16617    0.23913    0.001839   28.250       1.5716
3FIL4    12456     66.54     152.91    0.06994    0.29925    NA         29.238       0.3886
3FIL5    13738     58.76     129.99    0.06354    0.22315    NA         29.238       0.3886
3FIL6    6415      66.94     130.11    0.06273    0.20833    0.001409   29.238       0.3886
3FIL7    10678     62.76     164.82    0.07042    0.28353    NA         29.854       0.1480
3FIL8    12248     64.70     147.26    0.06688    0.26035    NA         29.773       0.1484
3FIL9    11956     65.62     153.44    0.06551    0.20905    NA         29.418       0.1502
3FIL10   12367     63.71     167.73    0.07481    0.35788    NA         30.132       0.1466
5FOV     4895      32.87     96.09     0.10716    0.23558    0.002828   30.101       0.5761
5HOV     4091      42.36     126.14    0.12564    0.30933    NA         30.179       0.6091
5EOV     4407      42.60     105.84    0.10681    0.29045    0.002894   30.278       0.8601
5F3&6a   6971      36.24     147.99    0.13716    0.31607    0.003111   28.004       0.9070
5F3&6b   5058      38.69     133.54    0.14044    0.30661    0.001924   28.009       0.8862
5H3&6a   6919      39.74     133.01    0.12723    0.28763    0.002397   28.024       0.8138
5H3&6b   6951      39.44     148.26    0.15400    0.32910    0.002807   28.014       1.2149
5E3&6    10807     46.01     124.07    0.13981    0.27674    0.002827   28.024       1.0131

-------
                                     CHAPTER 5
                         5. METHODOLOGICAL APPROACH
       The following chapter lays the theoretical foundation of the conceptual framework of
model development. This chapter outlines the statistical methods, addresses issues that arise in
statistical modeling, and presents the solutions that are employed to address these issues.  This
chapter will serve as a guide or "road map" for the underlying methodology of the model  devel-
opment process.

                           5.1 Modeling Goal and Objectives

       The goal of this  research is to provide emission rate models that fill the gap between the
existing models and ideal models for predicting emissions of NOx, CO, and HC from heavy-duty
diesel vehicles.  Problems in existing models, like EPA's MOBILE series and CARB's EMFAC
series of models, have been highlighted in previous chapters. U.S. EPA is currently developing a
new set of modeling tools for the estimation of emissions produced by on-road and off-road mo-
bile sources. MOVES,  a new model under development by EPA's OTAQ, is a modeling system
designed to better predict emissions from on-road operations. The philosophy behind MOVES
is the development of a  model that is as directly data-driven as possible, meaning that emission
rates are developed from second-by-second or binned data.
       Using second-by-second data collected from on-road vehicles, this research effort will
develop models that predict emissions as a function of on-road variables known to affect vehicle
emissions.  The model should be robust and ensure that assumptions about the underlying distri-
bution of the data are verified and the properties of parameter estimates are not violated. With
limited available data, this study focuses on development of an analytical methodology that is
repeatable with a different data set from across space and across time.  As more data become
available, the proposed  model will need to be re-estimated to ensure that the model is transfer-
able across additional HDV engine types, operating conditions, environmental conditions, and
even perhaps geographical regions.

-------
                                 5.2 Statistical Method

       The purpose of statistical modeling was to determine which explanatory variables sig-
nificantly influence vehicle emissions so that the data can be stratified by those variables and a
corresponding regression relationship can be developed. For many statistical problems there are
several possible solutions.  In comparing the means of two small groups, for instance, we could
use a t test, a t test with a transformation, a Mann-Whitney U test, or one of several others. The
choice of method depends on the plausibility of normal assumptions, the importance of obtaining
a confidence interval, the ease of calculation, etc.
       Parametric or non-parametric approaches to evaluation can be applied. Parametric meth-
ods are used when the distribution is either known with certainty or can be guessed with a certain
degree of certainty.  These methods are meaningful only for continuous data which are sampled
from a population with an underlying normal distribution or whose distribution can be rendered
normal by mathematical transformation. Analysts must be careful to ensure that significant er-
rors are not introduced when assumptions are not met. In contrast, nonparametric methods make
no assumptions about the distribution of the data or about the functional form of the regression
equation. Nonparametric methods are especially useful in situations where the assumptions
required by parametric methods are in question. Brief overviews and underlying theories of the
statistical methods that might be used in this research are presented in the following sections.

5.2.1 Parametric Methods

5.2.1.1 The t-Test

       Student's t-test is one of the most commonly used techniques for testing whether the
means of two groups are statistically different from each other.  This test tries to determine
whether the measured difference between  two groups is large enough to reject the null hypothesis
or whether such differences are just due to "chance".  The formula for the t-test (Equation 5-1) is
a ratio. The numerator of the ratio is just the difference between the two means or averages.  The
denominator is a measure of the variability or dispersion of the data.
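       \[
       t = \frac{\bar{x}_1 - \bar{x}_2}
                {\sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}
                       \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}
       \]                                                           (Equation 5-1)
       where $\bar{x}_1$ and $\bar{x}_2$ are the sample means, $s_1^2$ and $s_2^2$ are the sample
variances, $n_1$ and $n_2$ are the sample sizes, and t is a Student t quantile with
$n_1 + n_2 - 2$ degrees of freedom.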
       Usually a significance level of 0.05 (or equivalently, 5%) is employed in statistical
analyses. The significance level of a statistical hypothesis test is a fixed probability of wrongly
rejecting the null hypothesis H0, if it is in fact true. Another index is the p-value, which is the
probability of getting a value of the test statistic as extreme as or more extreme than that
observed by chance alone, if the null hypothesis H0 is true.  The p-value is compared with the
chosen significance level of the test and, if it is smaller, the result is significant. That is, if the
null hypothesis were to be rejected at the 5% significance level, this would be reported as
"p < 0.05".
       The assumptions of the t-test include:  1) the populations are normally distributed;
2) variances in the two populations are equal; and 3) the populations are independent. The
results of the analysis may be incorrect or misleading when assumptions are violated.  For
example, if the assumption of independence for the sample values is violated, then the
two-sample t-test is simply not appropriate. If the assumption of normality is violated or
outliers are present, the two-sample t-test may not be the most powerful available test, which
could mean the difference between detecting a true difference or not.  In such cases, a
nonparametric test or a transformation of the data may yield a more powerful test.
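
       The calculation described above can be illustrated with a short sketch. The code below
assumes the equal-variance (pooled) form of the two-sample t statistic, which is consistent with
the n1 + n2 - 2 degrees of freedom stated in the text. The function name and the sample values are
hypothetical and are not taken from the study data.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_1, sample_2):
    """Return (t, degrees_of_freedom) for the equal-variance two-sample t-test."""
    n1, n2 = len(sample_1), len(sample_2)
    # statistics.variance uses the (n - 1) divisor, i.e. the sample variance.
    s1_sq, s2_sq = variance(sample_1), variance(sample_2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean(sample_1) - mean(sample_2)) / std_err, df


# Hypothetical emission rates for two groups of observations.
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 2.0, 3.0]
t_stat, dof = pooled_t_statistic(group_a, group_b)
print(f"t = {t_stat:.4f} with {dof} degrees of freedom")
```

       For these illustrative values the statistic is about 1.55 with 4 degrees of freedom, which
is below the two-sided 5% critical value of 2.776, so the null hypothesis of equal means would not
be rejected.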

5.2.1.2 Ordinary Least Squares Regression

       Regression analysis is a statistical methodology that utilizes the relation between two
or more quantitative variables so that one variable can be predicted from the other, or others
(Neter et al.  1996). There are many different kinds of regression models, like the linear regres-
sion model, exponential regression model,  logistic regression model, and so on.  Among them,
linear regression is a commonly used and easily understood statistical method.  Linear regression
explores relationships that can be described by straight lines or their generalization to many di-
mensions. Regression allows a single response variable to be described by one or more predictor
variables.
       Ordinary least squares (OLS) regression is a common statistical technique for quantifying
the relationship between a continuous dependent variable and one or more independent variables
(Neter et al.  1996). The independent variables may be either continuous or discrete. Neter et al.
(1996) provide the basic OLS regression equation for a single-variable regression model as
shown in Equation 5-2:
       \[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]                 (Equation 5-2)

         where:
                 $Y_i$                =    value of the response variable in the ith trial
                 $\beta_0$, $\beta_1$ =    regression parameters
                 $X_i$                =    value of the predictor variable in the ith trial
                 $\varepsilon_i$      =    random error term with mean $E\{\varepsilon_i\} = 0$ and
                                           variance $\sigma^2\{\varepsilon_i\} = \sigma^2$;
                                           $\varepsilon_i$ and $\varepsilon_j$ are uncorrelated so
                                           that their covariance is zero.

       The parameters of the OLS regression equation, $\beta_0$ and $\beta_1$, are found by the
least squares method, which requires that the sum of squares of errors be minimized.  The
Gauss-Markov theorem (Neter et al.  1996) states that, among all unbiased estimators that are
linear combinations of the $Y_i$, the OLS estimators of regression coefficients have the smallest
variance; i.e., they are the best linear unbiased estimators. The Gauss-Markov theorem does not
tell one to use least squares all the time, but it strongly suggests use of least squares (Neter
et al. 1996).
       In linear regression, there are key assumptions that must be met, including:

       •   The $Y_i$ are independent normal random variables;

       •   The expected value of the error terms $\varepsilon_i$ is zero;

       •   The error terms $\varepsilon_i$ are assumed to have constant variance $\sigma^2$;

       •   The error terms $\varepsilon_i$ are assumed normally distributed;

       •   The error terms $\varepsilon_i$ are assumed to be uncorrelated so that their covariance
           is zero; and

       •   The error terms $\varepsilon_i$ are independent of the explanatory variable.

       If the above assumptions are violated the regression equation may yield biased results
(Neter et al.  1996). For example, if the explanatory variable is not independent of the error term,
larger sample sizes do not lead to lower standard errors for the parameters, and the parameter
estimates (slope, etc.) are biased.  If the error terms are not normally distributed, the error
distribution may have fat tails, and intervals based on the normal distribution may then
understate the true 95% confidence intervals.
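
       As an illustration of the least squares method for the single-predictor model, the sketch
below computes the intercept and slope estimates from the usual closed-form expressions. The
variable names and data values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (b0, b1) minimizing the sum of squared errors of y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope estimate
    b0 = y_bar - b1 * x_bar   # intercept estimate
    return b0, b1


# Hypothetical data: engine power versus emission rate.
power = [0.0, 50.0, 100.0, 150.0]
emission_rate = [0.02, 0.11, 0.22, 0.29]
b0, b1 = fit_simple_ols(power, emission_rate)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(power, emission_rate)]
print(f"b0 = {b0:.4f}, b1 = {b1:.5f}")
```

       With an intercept in the model the residuals sum to zero, which provides a quick check on
the arithmetic before the residuals are examined for non-constant variance or non-normality.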

5.2.1.3 Robust Regression

       OLS models generally rely on the normality assumption and are often fitted by means of
the least squares estimators. However, these estimation techniques are sensitive to departures
from this underlying assumption, a weakness that can lead to erroneous interpretations (Copt and
Heritier 2006). Robust regression procedures dampen the influence of outlying cases, as compared
to OLS estimation, in an effort to provide a better fit for the majority of cases. Robust
regression procedures are useful when a known, smooth regression function is to be fitted to data
that are "noisy", with a number of outlying cases, so that the assumption of a normal
distribution for the error terms is not appropriate (Neter et al.  1996). MM-estimators, a class
of robust regression estimators, are designed to be both highly robust against outliers and
highly efficient.

5.2.2 Nonparametric Methods

       Nonparametric methods have several advantages compared with parametric methods.
Nonparametric methods require no or very limited assumptions to be made about the format
of the data, and they may therefore be preferable when the assumptions required for paramet-
ric methods are not valid (Whitley and Ball 2002). Nonparametric methods can be useful for
dealing with unexpected, outlying observations that might be problematic with a parametric
approach.  Nonparametric methods are intuitive and are simple to carry out by hand, for small
samples at least.
       However, nonparametric methods may lack power as compared with more traditional
approaches (Siegel 1988). This lack of power is a particular concern if the sample size is small
or if the assumptions for the corresponding parametric method hold true (e.g., normality of the
data). Nonparametric methods are geared toward hypothesis testing rather than estimation of ef-
fects.  It is often possible to obtain nonparametric estimates and associated confidence intervals,
but this process is not generally straightforward. In addition, appropriate computer software for
nonparametric methods can be limited, although the situation is improving.

5.2.2.1 Chi-Square Test

       The chi-square test (Koehler and Larnz 1980), the best-known goodness-of-fit test, assumes
that the observations are independent and that the sample size is reasonably large.  This method
can be used to test whether a sample fits a known distribution, or whether two unknown
distributions from different samples are the same. The test can detect major departures from a
logistic response function, but is not sensitive to small departures from a logistic response
function. The test assumptions are that the sample is random and that the measurement scale is
at least ordinal (Conover 1980; Neter et al. 1996).
       Pearson's chi-square goodness of fit test statistic is shown in Equation 5-3 (StatsDirect
2005):
       \[ \chi^2 = \sum_{j=1}^{c} \frac{(O_j - E_j)^2}{E_j} \]            (Equation 5-3)
       where $O_j$ are the observed counts, $E_j$ are the corresponding expected counts, and c is
the number of classes for which counts/frequencies are being analyzed.
       The test statistic is distributed approximately as a chi-square random variable with c-1
degrees of freedom. The test has relatively low power (chance of detecting a real effect) with
all but large numbers or big deviations from the null hypothesis (all classes contain observations
that could have been in those classes by chance).
       The handling of small expected frequencies is controversial. Koehler and Larnz asserted
that the chi-square approximation is adequate provided all of the following are true: total  of ob-
served counts (N) > 10; number of classes (c) > 3; all expected values > 0.25 (Koehler and Larnz
1980).
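
       The Pearson statistic can be computed directly from observed and expected class counts, as
sketched below. The function name and the counts are hypothetical; the degrees of freedom follow
the c - 1 rule given in the text.

```python
def pearson_chi_square(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for c classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical counts in four classes, tested against equal expected frequencies.
observed_counts = [18, 22, 30, 30]
expected_counts = [25.0, 25.0, 25.0, 25.0]
chi_sq, dof = pearson_chi_square(observed_counts, expected_counts)
print(f"chi-square = {chi_sq:.2f} with {dof} degrees of freedom")
```

       For these illustrative counts the statistic is 4.32 with 3 degrees of freedom, below the 5%
critical value of 7.815, so the hypothesized distribution would not be rejected.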

5.2.2.2 Kolmogorov-Smirnov Two-Sample Test

       The Kolmogorov-Smirnov (K/S) two-sample test (Chakravart and Roy 1967) compares
the empirical distribution functions of two samples, $E_1$ and $E_2$. The Kolmogorov-Smirnov test
is a nonparametric test, which can be used to test whether two or more samples are governed by
the same distribution by comparing their empirical distribution functions.
       The Kolmogorov-Smirnov two-sample test statistic can be defined as shown in Equation
5-4 (Chakravart and Roy 1967):
       \[ D = \max_{i} \left| E_1(i) - E_2(i) \right| \]                 (Equation 5-4)
       where $E_1$ and $E_2$ are the empirical distribution functions for the two samples.
       The Kolmogorov-Smirnov (K/S) two-sample test provides an improved methodology
over the chi-square test since data do not have to be assigned arbitrarily to bins.  Further, it
is a nonparametric test, so a distribution does not have to be assumed. However, the main
disadvantage of the K/S test is similar to that of the chi-square test: the number of separate
tests that would have to be conducted to cover all possible combinations of variables in the
data sets is so large, by orders of magnitude, that it is logistically infeasible (Hallmark 1999).
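
       A minimal sketch of the two-sample statistic is given below. It assumes the usual
definition of the statistic as the largest absolute difference between the two empirical
distribution functions, evaluated at every observed value. The function name and the sample
values are hypothetical.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_1, sample_2):
    """Return D, the largest gap between the two empirical distribution functions."""
    s1, s2 = sorted(sample_1), sorted(sample_2)
    n1, n2 = len(s1), len(s2)
    d = 0.0
    # The empirical distribution functions only change at observed values.
    for value in s1 + s2:
        ecdf_1 = bisect_right(s1, value) / n1  # share of sample 1 <= value
        ecdf_2 = bisect_right(s2, value) / n2  # share of sample 2 <= value
        d = max(d, abs(ecdf_1 - ecdf_2))
    return d


# Hypothetical samples with partially overlapping ranges.
sample_a = [1.0, 2.0, 3.0, 4.0]
sample_b = [3.5, 4.5, 5.5, 6.5]
d_stat = ks_two_sample_statistic(sample_a, sample_b)
print(f"D = {d_stat:.2f}")
```

       No binning of the data is required, which is the advantage over the chi-square test noted
above.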

5.2.2.3 Wilcoxon Mann-Whitney Test

       The Wilcoxon Mann-Whitney Test (Easton and McColl 2005) is one of the most power-
ful of the nonparametric tests for comparing two populations.  This test is used to test the null hy-
pothesis that two populations have identical distribution functions against the alternative hypoth-
esis that the two distribution functions differ only with respect to location (median), if at all.
       The Wilcoxon Mann-Whitney test does not require the assumption that the differences
between the two samples are normally distributed. In many applications, the Wilcoxon
Mann-Whitney test is used in place of the two-sample t-test when the normality assumption is
questionable.  This test can also be applied when the observations in a sample of data are ranks,
that is, ordinal data rather than direct measurements.
       The Mann-Whitney U statistic is defined as shown in Equation 5-5 (StatsDirect 2005):
       \[ U = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - \sum_{i=n_1+1}^{n_1+n_2} R_i \]     (Equation 5-5)
       where samples of size $n_1$ and $n_2$ are pooled and $R_i$ are the ranks.
       U can be resolved as the number of times observations in one sample precede observa-
tions in the other sample in the ranking.  Wilcoxon rank sum, Kendall's S and the Mann-Whitney
U test are exactly equivalent tests. In the presence of ties the Mann-Whitney test is also equiva-
lent to  a chi-square test for trend.
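
       The rank-based calculation can be sketched as follows. The code assumes the common
rank-sum form U = n1*n2 + n2*(n2 + 1)/2 - R2, where R2 is the sum of the pooled ranks of the
second sample, and it assigns midranks to tied values. The function name and data are
hypothetical.

```python
def mann_whitney_u(sample_1, sample_2):
    """Return U = n1*n2 + n2*(n2 + 1)/2 - R2, using midranks for tied values."""
    pooled = sorted(
        [(v, 1) for v in sample_1] + [(v, 2) for v in sample_2],
        key=lambda pair: pair[0],
    )
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(sample_1), len(sample_2)
    rank_sum_2 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 2)
    return n1 * n2 + n2 * (n2 + 1) / 2.0 - rank_sum_2


# Hypothetical samples without ties.
u_stat = mann_whitney_u([1.1, 2.3, 2.9], [2.0, 3.5, 4.1, 5.0])
print(f"U = {u_stat}")
```

       In this example U equals 2, which matches the count of pairs in which an observation from
the second sample precedes an observation from the first sample in the ranking.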

5.2.2.4 Analysis of Variance (ANOVA)

       ANOVA (Analysis of Variance) (Neter et al. 1996), sometimes called an F test, is closely
related to the t test. The major difference is that, where the t test measures the difference
between the means of two groups, an ANOVA tests the difference between the means of two
or more groups. ANOVA modeling does not require any assumptions about the nature of the
statistical relation between the response and explanatory variables, nor does it require that the
explanatory variables be quantitative.
       The one-way, or single-factor, ANOVA compares several groups of observations, all of
which are independent, but each group of observations may have a different mean. A test of
great importance is whether or not all the means are equal. The advantage of using ANOVA
rather than multiple t-tests is that it reduces the probability of a type I error (making
multiple comparisons increases the likelihood of finding something by chance). One potential
drawback to an ANOVA is that it can only tell that there is a significant difference between
groups, not which groups are significantly different from each other.  The breakdowns of the
total sum of squares and degrees of freedom, together with the resulting mean squares, are
presented in an ANOVA table such as Table 5-1.

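Table 5-1 ANOVA Table for Single-Factor Study (Neter et al. 1996)

Source of Variation       | Sum of Squares (SS)                               | Degrees of Freedom (df) | Mean Square (MS)      | Expected Mean Square E{MS}
Between treatments        | SSTR = \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 | r - 1                   | MSTR = SSTR / (r - 1) | \sigma^2 + \sum_i n_i (\mu_i - \mu_.)^2 / (r - 1)
Error (within treatments) | SSE = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2     | n_T - r                 | MSE = SSE / (n_T - r) | \sigma^2
Total                     | SSTO = \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2    | n_T - 1                 |                       |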

       A factorial ANOVA can examine data that are classified on multiple independent vari-
ables. A factorial ANOVA can show whether there are significant main effects of the indepen-
dent variables and whether there are significant interaction effects between independent variables
in a set of data.  Interaction effects occur when the impact of one independent variable depends
on the level of the second independent variable (Neter et al.  1996).  Computation can be
performed with standard statistical software such as SAS®.
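
       For the single-factor case, the sums of squares, mean squares, and F ratio of the ANOVA
table can be computed as sketched below. The function name and the three groups of values are
hypothetical and are used only to show the decomposition.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SSTR, SSE, MSTR, MSE, F) for a single-factor study."""
    all_obs = [y for group in groups for y in group]
    grand_mean = mean(all_obs)
    r, n_total = len(groups), len(all_obs)
    # Between-treatment sum of squares: group means around the grand mean.
    sstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error sum of squares: observations around their own group mean.
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)
    mstr = sstr / (r - 1)
    mse = sse / (n_total - r)
    return sstr, sse, mstr, mse, mstr / mse


# Hypothetical observations for three treatment groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
sstr, sse, mstr, mse, f_ratio = one_way_anova(groups)
print(f"SSTR = {sstr:.1f}, SSE = {sse:.1f}, F = {f_ratio:.1f}")
```

       In this example SSTR and SSE sum to the total sum of squares of 32, and the F ratio is
compared against an F distribution with r - 1 and nT - r degrees of freedom.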

5.2.2.5 HTBR

       HTBR (Breiman et al. 1984) is a forward step-wise variable selection method, similar
to forward stepwise regression. This method is also known as Classification and Regression
Tree (CART) analysis.  This technique generates a "tree" structure by dividing the sample data
recursively into a number of groups. The groups are selected to maximize some measure of
difference in the response variable in the resulting groups. As summarized by Washington et al.
(1997a), this method is based upon iteratively asking and answering the following questions:
(1) which variable of all of the variables 'offered' in the model should be selected to produce
the maximum reduction in variability of the response? and (2) which value of the selected
variable (discrete or continuous) results in the maximum reduction in variability of the
response? The HTBR terminology is similar to that of a tree; there are branches, branch
splits or internal nodes, and leaves or terminal nodes (Washington et al. 1997a).
       To explain the method in mathematical terms, the definitions presented by Washington
et al. (1997a) are used. The first step is to define the deviance at a node.  A node
represents a data set containing L observations. The deviance, $D_a$, can be estimated as shown
in Equation 5-6:
       \[ D_a = \sum_{l=1}^{L} (y_{l,a} - \bar{x}_a)^2 \]                (Equation 5-6)
        where
                $D_a$        =  total deviance at node a, or the sum of squared error (SSE) at
                                the node
                $y_{l,a}$    =  lth observation of dependent variable y at node a
                $\bar{x}_a$  =  estimated mean of L observations in node a
       Next, the algorithm seeks to split the observations at node a on a value of an independent
variable, $X_i$, into two branches and corresponding nodes b and c, containing M and N,
respectively, of the original L observations (M + N = L) of the variable $X_i$. The deviance
reduction function evaluated over all possible Xs then can be defined as shown in Equations 5-7
through 5-9:
       \[ \Delta_{(\text{all } X)} = D_a - D_b - D_c \]                  (Equation 5-7)
       \[ D_b = \sum_{m=1}^{M} (y_{m,b} - \bar{x}_b)^2 \]                (Equation 5-8)
       \[ D_c = \sum_{n=1}^{N} (y_{n,c} - \bar{x}_c)^2 \]                (Equation 5-9)
        where
                $\Delta_{(\text{all } X)}$  =  the total deviance reduction function evaluated
                                               over the domain of all Xs
                $D_b$        =  total deviance at node b
                $D_c$        =  total deviance at node c
                $y_{m,b}$    =  mth observation on dependent variable y in node b
                $y_{n,c}$    =  nth observation on dependent variable y in node c
                $\bar{x}_b$  =  estimated mean of M observations in node b
                $\bar{x}_c$  =  estimated mean of N observations in node c

       The variable $X_k$ and its optimum split value are sought so that the reduction in
deviance is maximized, or more formally when (as shown in Equation 5-10):

       \[ \Delta_{(\text{all } X)} = \sum_{l=1}^{L} (y_{l,a} - \bar{x}_a)^2
            - \sum_{m=1}^{M} (y_{m,b} - \bar{x}_b)^2
            - \sum_{n=1}^{N} (y_{n,c} - \bar{x}_c)^2 = \max \]           (Equation 5-10)
        where
                $\Delta_{(\text{all } X)}$  =  the total deviance reduction function evaluated
                                               over the domain of all Xs
                $y_{l,a}$    =  lth observation of dependent variable y at node a
                $\bar{x}_a$  =  estimated mean of L observations in node a
                $y_{m,b}$    =  mth observation on dependent variable y in node b
                $y_{n,c}$    =  nth observation on dependent variable y in node c
                $\bar{x}_b$  =  estimated mean of M observations in node b
                $\bar{x}_c$  =  estimated mean of N observations in node c
       The maximum reduction occurs at a specific split value of the independent variable $X_k$.
When the data are split at this point, the remaining samples have a much smaller variance than
the original data set. Thus, the reduction  in node a deviance is greatest when the deviances at
nodes b and c are smallest. Numerical search procedures are employed to maximize Equation
5-10 by varying the selection of variables used as a basis for a split and the value to use for each
variable at a split.
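
       The numerical search for a single predictor can be sketched as an exhaustive scan over
candidate split values. The code below assumes that candidate thresholds are placed midway
between adjacent distinct predictor values, a common convention in regression tree software that
is not stated in the source. The function names and data are hypothetical.

```python
from statistics import mean


def deviance(values):
    """Sum of squared deviations from the node mean (zero for an empty node)."""
    if not values:
        return 0.0
    node_mean = mean(values)
    return sum((v - node_mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, reduction) maximizing D_a - D_b - D_c for one predictor."""
    d_a = deviance(y)
    best_threshold, best_reduction = None, 0.0
    unique_x = sorted(set(x))
    for lo, hi in zip(unique_x, unique_x[1:]):
        threshold = (lo + hi) / 2.0  # assumed midpoint convention
        node_b = [yi for xi, yi in zip(x, y) if xi < threshold]
        node_c = [yi for xi, yi in zip(x, y) if xi >= threshold]
        reduction = d_a - deviance(node_b) - deviance(node_c)
        if reduction > best_reduction:
            best_threshold, best_reduction = threshold, reduction
    return best_threshold, best_reduction


# Hypothetical predictor (speed) and response (emission rate) values.
speed = [0.0, 5.0, 10.0, 30.0, 35.0, 40.0]
response = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
threshold, reduction = best_split(speed, response)
print(f"best split at {threshold}, deviance reduction {reduction}")
```

       A full tree repeats this search over every offered predictor at every node and keeps the
variable and split value with the largest reduction, subject to the stopping criteria.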
       In growing a regression tree, the binary partitioning algorithm recursively splits the data
in each node until the node is homogeneous or the node contains too few observations.  If left
unconstrained, a regression tree model can "grow" until it results in a complex model with a
single observation at each terminal node that explains all the deviance. However, for application
purposes, it is desirable to create criteria to balance the model's ability to explain the maximum
amount of deviance with a simpler model that is easy to interpret and apply. Some software,
such as S-Plus™, allows the user to select such criteria.  The software allows the user to interact
with the data in the following manner to select variables and help simplify the final model:

       •  Response variable: the response variable is selected by the user from a list of fields
         from the data set;

       •  Predictor variables: one or more independent variables can be selected by the user
         from a list of fields associated with the dataset;

       •  Minimum number of observations allowed at a single split: sets the minimum number
         of observations that must be present before a split is allowed (default is 5);

       •  Minimum node size: sets the allowed sample size at each node (default is 10); and

       •  Minimum node deviance: the deviance allowed at each node (default is 0.01).

       However, a shortcoming of HTBR, unlike OLS regression, is the absence of formal
measures of model fit, such as t-statistics, the F-ratio, and R-squared, to name a few.  Thus,
the HTBR model is used to guide the development of an OLS regression model, rather than as
a model in its own right. Similar uses of HTBR techniques have been developed and applied in
previous research papers (Washington et al. 1997a; Washington et al.  1997b; Fomunung et al.
1999; Frey et al. 2002).

                                5.3  Modeling Approach

       The model development process starts by using HTBR both as a data reduction tool
and for identifying potential interactions among the variables. OLS regression or robust
regression is then used with the identified variables to estimate a preliminary "final" model.
After that, the model is checked for compliance with normality assumptions and goodness of fit.
       Several diagnostic tools are available to perform these checks.  Once a preliminary
"final" model is obtained, regression coefficients are examined using their t-statistics and
correlation coefficients to determine which variables should be removed or retained in the model
for further analysis. However, this procedure can lead to the removal of potentially important
intercorrelated explanatory variables. In fact, variable agreement with underlying scientific
principles of combustion, pollutant formation and emission controls (cause-effect relationships)
should be the basis for the ultimate decisions regarding variable selection.  Thus, a t-statistic
may indicate that a parameter is insignificant (at level of significance = 0.05), while theory
indicates that such a parameter should be retained in the model for further analysis.  This type
of error is usually referred to as a type II error (Fomunung 2000).
       F-statistics and the adjusted coefficient of multiple determination, adjusted R², are used
to determine the effect size of the parameters.  Adding more explanatory variables to the
regression model can only increase R² and never reduce it, because SSE can never become larger
with more X variables and the total sum of squares (SSTO) is always the same for a given set of
responses. The adjusted coefficient of multiple determination adjusts R² by dividing each sum of
squares by its associated degrees of freedom. The F-test is used to test whether the parameter
can be dropped even if the t-statistic is appropriate.
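
       The effect of the degrees-of-freedom adjustment can be shown with a small numerical
sketch. The code assumes the usual definition in which SSE is divided by n - p and SSTO by n - 1,
where p is the number of estimated parameters including the intercept. The function names and
the sums of squares are hypothetical.

```python
def r_squared(sse, ssto):
    """Coefficient of multiple determination."""
    return 1.0 - sse / ssto


def adjusted_r_squared(sse, ssto, n_obs, n_params):
    """Adjusted R-squared: each sum of squares divided by its degrees of freedom."""
    return 1.0 - (sse / (n_obs - n_params)) / (ssto / (n_obs - 1))


# Hypothetical fits to the same 20 responses (SSTO = 100).
ssto, n_obs = 100.0, 20
r2_a = r_squared(40.0, ssto)                       # model A: 2 parameters
adj_a = adjusted_r_squared(40.0, ssto, n_obs, 2)
r2_b = r_squared(39.5, ssto)                       # model B: one extra predictor
adj_b = adjusted_r_squared(39.5, ssto, n_obs, 3)
print(f"A: R2 = {r2_a:.3f}, adjusted = {adj_a:.3f}")
print(f"B: R2 = {r2_b:.3f}, adjusted = {adj_b:.3f}")
```

       In this example the extra predictor raises R² slightly but lowers the adjusted value,
which indicates that the small reduction in SSE does not justify the lost degree of freedom.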
       In multiple regression analysis, the predictor or explanatory variables tend to be corre-
lated among themselves and with other variables related to the response variable but not included
in the model, a condition known as multicollinearity. The effects of multicollinearity are many
and can be severe. Neter et al. (1996) have documented a few of these: when multicollinearity
exists the interpretation of
partial slope coefficients becomes meaningless; multicollinearity can lead  to estimated regression
coefficients that vary widely from one sample to another; and there may be several  regression
functions that provide equally good fits to the data, making the effects of individual predictor
variables difficult to assess.
       Some informal diagnostic tools have been suggested to detect this problem.  A frequently
used technique is to calculate a simple correlation coefficient between the predictor variables to
detect the presence of inter-correlation among independent variables.  Large correlation values
are an indication that multicollinearity may exist. Large changes in the estimated regression
coefficients when a predictor variable is added or deleted are also an indication.  Finally,
multicollinearity may be a problem if estimated regression coefficients are calculated with an
algebraic sign that is the opposite of that expected from theoretical considerations or prior
experience (i.e., the beta coefficient is compensating for the beta coefficient of a correlated
explanatory variable).
       A formal method of detecting this problem is the variance inflation factor (VIF), which is
a measure of how much the variances of the estimated regression coefficients are inflated as
compared to when the predictor variables are not linearly related (Neter et al. 1996).  This
method is widely used because it can provide quantitative measurements of the impact of
multicollinearity. The largest VIF value among all Xs is used to assess the severity of
multicollinearity.  As a rule of thumb, a VIF in excess of 10 is frequently used as an indication
that multicollinearity is severe.
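
       For the special case of two predictor variables, the VIF reduces to 1 / (1 - r²), where r
is the simple correlation between the two predictors. The sketch below illustrates this case
only; with more predictors, r² is replaced by the coefficient of determination from regressing
each predictor on all of the others. The function names and data are hypothetical.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Simple (Pearson) correlation coefficient between two variables."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical, moderately correlated predictors.
predictor_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
predictor_2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_predictors(predictor_1, predictor_2)
print(f"r = {pearson_r(predictor_1, predictor_2):.2f}, VIF = {vif:.2f}")
```

       A correlation of 0.8 gives a VIF of about 2.8 in this example, well below the threshold of
10 quoted above; a VIF of 10 corresponds to a correlation of roughly 0.95 in the two-predictor
case.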
       Diagnostic plots are examined to verify normality and homoscedasticity (i.e., homogeneity
of variance) assumptions as well as the goodness of fit.  Because of the difficulty in assessing
normality, it is usually recommended that non-constancy of error variance should be investigated
first (Neter et al. 1996).  The plots used to identify any patterns in the residuals are considered
informal diagnostic tools and include plots of the residuals versus the fitted values and plots
of the square root of absolute residuals versus the fitted values. The normality of the residuals
can be studied from histograms, box plots, and normal probability plots of the residuals.  In
addition, comparisons can be made of observed frequencies with expected frequencies if normality
exists. Usually, heteroscedasticity and/or inappropriate regression functions may induce a
departure from normality.  When OLS is applied to heteroscedastic models, the estimated variance
is a biased estimator of the true variance. OLS either overestimates or underestimates the true
variance, and in general it is not possible to determine the nature of the bias.  The variances,
and the standard errors, may therefore be either understated or overstated.

                                   5.4 Model Validation

       Model validity refers to the stability and reasonableness of the regression coefficients,
the  plausibility and usability of the regression function, and the ability to generalize inferences
drawn from the regression function. Validation is a useful and necessary part of the model-build-
ing process (Neter et al.  1996).
       Two basic ways of validating a regression model are internal and external.  Internal
validation consists of model checking for plausibility of signs and magnitudes of estimated  coef-
ficients, agreement with earlier empirical results and theory, and model diagnostic checks such as
distribution of error terms, normality of error terms, etc. Internal validation will be performed as
part of the model estimation procedure.
       External validation checks the model and its predictive ability against newly collected
data, such as data from another location or time, or against a holdout sample. Because there are
only 15 buses/engines in the data set, it is not practical to split the data set and hold out a
sample for validation purposes; splitting the data set would influence the regression estimators.
However, suggestions and procedures for external validation will be provided.

-------
                                     CHAPTER 6
     6. DATA SET SELECTION AND ANALYSIS OF EXPLANATORY VARIABLES
                       6.1 Data Set Used for Model Development

       Development of a modal model designed to predict emissions on a second-by-second basis
as a function of engine load requires the availability of appropriate emission test data. Modal
modeling requires second-by-second vehicle emissions data, collected in parallel with
corresponding revealed engine load data. In 2004, only two data sets could be identified for use
in this modeling effort. U.S. EPA provided two major HDV activity and emission databases to
develop the emission rate model (Ensfield 2002; U.S. EPA 2001b).  One database is a transit bus
database, which includes emissions data collected on diesel transit buses operated by the AATA in
2001; the other is a heavy HDV (HDV8B) database prepared by NRMRL in 2001. The transit database
consists of data collected from 15 buses with the same type of engine, while the HDV8B database
consists of only one truck engine tested extensively on-road under pre-rebuild and post-rebuild
engine conditions.  To decide whether it is suitable to combine these two data sets or treat them
individually, two dummy variables were added to the databases to describe vehicle types. For the
first dummy variable, named "bus", 1 was assigned for transit bus, and 0 for others. For the
second dummy variable, 1 was assigned for HDDV with pre-rebuild engine, and 0 for others. HTBR
was applied to all data sets to examine whether transit buses behave differently from HDDVs or
not.  The regression trees and results for NOx, CO, and HC emission rates are given in
Figures 6-1 to 6-3.

   Figure 6-1 HTBR Regression Tree Result for NOx Emission Rate for All Data Sets

   Figure 6-2 HTBR Regression Tree Result for CO Emission Rate for All Data Sets

   Figure 6-3 HTBR Regression Tree Result for HC Emission Rate for All Data Sets
       The dummy variable for bus is selected as the first split in all three trees above.
Therefore, transit buses and HDDVs should be treated separately.  Since there are 15 engines in
the transit bus data set and one engine (pre-rebuild and post-rebuild for the same engine) in the
HDDV data set, the transit bus data set should be used for the final version of the conceptual
model development.

                  6.2 Representative Ability of the Transit Bus Data Set

       The transit bus data set was collected by Sensors, Inc. in October 2001 (Ensfield 2002).
The buses tested came from the AATA and included 15 New Flyer models with Detroit Diesel Series
50 engines. All of the buses were of model years 1995 and 1996. Each bus test lasted
approximately 2 hours. The buses operated on standard AATA bus routes and stopped at all regular
stops, although the buses did not board or discharge any passengers (Ensfield 2002). The routes
were mostly different for each test, and were selected for a wide variety of driving
conditions (see Figure 4-1).
       Figure 6-4 shows the speed-acceleration matrix developed with second-by-second data. The
matrix has two high-frequency speed/acceleration peaks. One is the bin with speed below 2.5 mph
and acceleration between -0.25 and 0.25 mph/s, which contains 26.11% of the observations. The
other is a group of adjacent bins covering speeds from 22.5 to 47.5 mph and accelerations from
-0.75 to 0.75 mph/s.
                     Figure 6-4 Transit Bus Speed-Acceleration Matrix
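
       A speed-acceleration matrix of this kind can be sketched in a few lines of Python. The bin
widths (5 mph for speed, 0.5 mph/s for acceleration) are assumptions inferred from the bin
boundaries quoted above, acceleration is approximated as the difference between consecutive 1 Hz
speed readings, and the speed trace is made up for illustration.

```python
from collections import Counter


def bin_centre(value, width):
    """Centre of the bin containing value; bins are centred on multiples of width."""
    return width * ((value + width / 2) // width)


def speed_accel_matrix(speeds_mph):
    """Share of observations in each (speed bin, acceleration bin) cell."""
    # Acceleration (mph/s) from consecutive second-by-second speed readings.
    accels = [b - a for a, b in zip(speeds_mph, speeds_mph[1:])]
    cells = Counter(
        (bin_centre(v, 5.0), bin_centre(a, 0.5))
        for v, a in zip(speeds_mph[1:], accels)
    )
    n = len(accels)
    return {cell: count / n for cell, count in cells.items()}


# Hypothetical second-by-second speed trace (mph): stop, accelerate, cruise, brake.
trace = [0, 0, 0, 0, 2, 5, 9, 14, 20, 25, 28, 30, 30, 30, 30, 29, 25, 18, 10, 3, 0, 0]
matrix = speed_accel_matrix(trace)

# Share of time in the idle cell: speed < 2.5 mph, acceleration within +/-0.25 mph/s.
print(round(matrix[(0, 0)], 4))
```

       With real data, the cell centred on zero speed and zero acceleration corresponds to the
first peak described above, and the cells centred on 25 to 45 mph with accelerations near zero
correspond to the second.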
       Georgia Institute of Technology researchers collected more than 6.5 million seconds of
transit bus speed and position data using Georgia Tech Trip Data Collectors (an onboard computer
with GPS receiver, data storage, and wireless communication device) installed on two Metropolitan
Atlanta Rapid Transit Authority (MARTA) buses in 2004 (Yoon et al. 2005b). With the second-by-second
data, the research team developed transit bus speed/acceleration matrices for each combination of
roadway facility type (arterial or local road) and time range (morning, midday, afternoon, night).
Two high-frequency acceleration/deceleration peaks were also found in each matrix. This finding is
consistent with the AATA data set, indicating at least that the on-road operations of the buses in
Ann Arbor are similar to operations in the Atlanta region.
       This data set was also collected under a wide variety of environmental conditions. The
temperature ranged from 10 °C to 30 °C, the relative humidity ranged from 15% to 65%, and the
barometric pressure ranged from 960 mbar to 1000 mbar (Figure 6-5). The data set can therefore
be used to examine the impact of environmental conditions on emissions.

                          Figure 6-5 Test Environmental Conditions (by bus number)

       The transit buses tested were provided by the AATA, and all of them are New Flyer models
with Detroit Diesel Series 50 engines. Since these buses used consistent engine technologies
(i.e., fuel injection type, catalytic converter type, transmission type, and so on), the ability
of the estimated emission models to incorporate the effect of other vehicle technologies is
limited. A second limitation is that the effect of emission control technology deterioration on
emission levels cannot be assessed, since the buses were only 5 or 6 years old when tested.
                             6.3 Variability in Emissions Data
6.3.1 Inter-bus Variability
       Data are presented to illustrate the variability in observed data. Inter-bus variability is
illustrated using the median and mean of the NOx, CO, and HC emission rates for each bus in
Figures 6-6 to 6-8. The difference between the median and the mean is an indicator of skewness
in the distribution of emission rates.
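
       As a brief illustration of this indicator, the Python sketch below computes the mean, the
median, and their difference for each bus. The bus labels and emission rates are hypothetical; a
mean well above the median points to a long upper tail (right skew).

```python
from statistics import mean, median


def mean_median_gap(rates):
    """Return (mean, median, mean - median) for a list of emission rates."""
    m, med = mean(rates), median(rates)
    return m, med, m - med


# Hypothetical second-by-second emission rates (g/s) for two buses.
rates_by_bus = {
    "bus A": [0.01, 0.02, 0.02, 0.03, 0.05, 0.40],  # a few very high readings
    "bus B": [0.04, 0.05, 0.05, 0.06, 0.06, 0.07],  # roughly symmetric
}

summary = {bus: mean_median_gap(rates) for bus, rates in rates_by_bus.items()}
for bus, (m, med, gap) in summary.items():
    print(f"{bus}: mean={m:.4f} median={med:.4f} gap={gap:+.4f}")
```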

             Figure 6-6 Median and Mean of NOx Emission Rates by Bus (x-axis: Bus No)

             Figure 6-7 Median and Mean of CO Emission Rates by Bus (x-axis: Bus No)

             Figure 6-8 Median and Mean of HC Emission Rates by Bus (x-axis: Bus No)

       The purpose of inter-bus variability analysis was to characterize the range of variability in
vehicle average emissions among all of the buses, to determine whether the data set is relatively
homogeneous. Although there are some clusters among the buses as suggested from Figures 6-6
to 6-8 and some skewness in the distribution as suggested by upper tails in Figure 6-9, it is not
obvious that this data set lacks homogeneity and should be separated into different groups. Thus,
this data set is treated as a single group for purposes of analysis and model development.

 Figure 6-9 Empirical Cumulative Distribution Function Based on Bus Based Median Emission
            Rates for Transit Buses (three panels, each titled "Empirical CDF")

6.3.2 Descriptive Statistics for Emissions Data

       Applicable numerical summary statistics, such as variable means and standard deviations,
are presented in Table 6-1. Relatively simple graphics such as histograms and boxplots describ-
ing variable distributions are presented in Figures 6-10 to 6-12. It may also be necessary to as-
sess whether the individual variables are normally distributed prior to any further analysis using
parametric methods that are based upon this assumption.
Table 6-1 Basic Summary Statistics for Emissions Rate Data for Transit Bus
*** Summary Statistics for data in: transitbus.data ***

                NOx (g/s)        CO (g/s)         HC (g/s)
Min:         0.000000e+000   0.000000e+000   0.000000e+000
1st Qu.:     2.195000e-002   3.030000e-003   4.200000e-004
Mean:        1.052101e-001   3.183675e-002   1.438709e-003
Median:      5.058000e-002   7.540000e-003   9.300000e-004
3rd Qu.:     1.731100e-001   2.197000e-002   1.840000e-003
Max:         2.427900e+000   3.057700e+000   6.679000e-002
Total N:     1.075350e+005   1.075350e+005   1.075350e+005
NA's:        0.000000e+000   0.000000e+000   0.000000e+000
Std Dev.:    1.162344e-001   8.479305e-002   1.956353e-003

          Figure 6-10 Histogram, Boxplot, and Probability Plot of NOx Emission Rate
          (axes: NOx Emission Rate (g/s); Quantiles of Standard Normal)

          Figure 6-11 Histogram, Boxplot, and Probability Plot of CO Emission Rate
          (axes: CO Emission Rate (g/s); Quantiles of Standard Normal)

          Figure 6-12 Histogram, Boxplot, and Probability Plot of HC Emission Rate
          (axes: HC Emission Rate (g/s); Quantiles of Standard Normal)

       Further analysis indicated that the emission data contain some zero values, which can arise
for several reasons. Missing data caused by loss of communication between instruments or by failure
of a particular vehicle were recorded as zero in the data set; those zero values were already
identified in the data post-processing procedure described in Chapter 4. Zero values might also
have occurred when the reference air contained significant amounts of a pollutant, so that the
instrument systematically reported negative emission values. Sensors, Inc. suggested that negative
data should be set to zero. These values were thus artificially recorded as zero rather than
observed as zero by the test equipment. The zero values would create truncation issues in the
model, since the Sensors, Inc. transit bus data set contained only valid positive emission data.
Truncation usually arises when a random variable is not observable over its entire range. It
cannot be treated as a missing-data problem in which the missing observations are random. In
statistics, analysis can be limited to data that meet certain criteria, or to a data distribution
from which values above or below a certain point have been eliminated (or cannot occur). A program
was written in MATLAB® to check for the presence of zero emission estimates in the data set. Zero
values made up 1.45% of the NOx observations, 1.65% of the CO observations, and 3.84% of the HC
observations. Since negative emission values were not observable for the transit bus data set,
further analysis focuses on truncated data sets with valid positive emission data only.
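
       The zero-value shares can be cross-checked against the "Total N" entries in Tables 6-1 and
6-2. The Python sketch below is not the MATLAB® program mentioned above; the function names are
hypothetical, and the counts are read from the two tables (107,535 observations before
truncation; 105,976, 105,765, and 103,405 valid observations for NOx, CO, and HC afterwards).

```python
TOTAL_N = 107535  # "Total N" in Table 6-1 (all observations)
VALID_N = {"NOx": 105976, "CO": 105765, "HC": 103405}  # "Total N" in Table 6-2


def zero_share(total_n, valid_n):
    """Percentage of observations removed because they were recorded as zero."""
    return 100.0 * (total_n - valid_n) / total_n


def truncate_positive(rates):
    """Keep only strictly positive emission rates (the truncated data set)."""
    return [r for r in rates if r > 0]


shares = {pollutant: zero_share(TOTAL_N, n) for pollutant, n in VALID_N.items()}
for pollutant, share in shares.items():
    print(f"{pollutant}: {share:.2f}% zero values")
```

       The computed shares reproduce the 1.45%, 1.65%, and 3.84% quoted above.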
       Numerical summary statistics, such as variable means and standard deviations, for the
truncated emission data are presented in Table 6-2, and relatively simple graphics such as
histograms and boxplots describing the variable distributions are presented in Figures 6-13 to
6-15. The mean of the truncated NOx emission data increases by 1.26%, the mean of the truncated
CO emission data increases by 1.23%, and the mean of the truncated HC emission data increases by
0.99%, compared with the means of the original data set.
Table 6-2 Basic Summary Statistics for Truncated Emissions Rate Data

                NOx (g/s)        CO (g/s)         HC (g/s)
Min:         1.000000e-005   1.000000e-005   1.000000e-005
1st Qu.:     2.256000e-002   3.190000e-003   4.700000e-004
Mean:        1.067578e-001   3.236955e-002   1.496171e-003
Median:      5.243500e-002   7.770000e-003   9.900000e-004
3rd Qu.:     1.749625e-001   2.246000e-002   1.880000e-003
Max:         2.427900e+000   3.057700e+000   6.679000e-002
Total N:     1.059760e+005   1.057650e+005   1.034050e+005
NA's:        0.000000e+000   0.000000e+000   0.000000e+000
Std Dev.:    1.163785e-001   8.539871e-002   1.973375e-003

     Figure 6-13 Histogram, Boxplot, and Probability Plot of Truncated NOx Emission Rate
     (axes: Truncated NOx Emission Rate (g/s); Quantiles of Standard Normal)

     Figure 6-14 Histogram, Boxplot, and Probability Plot of Truncated CO Emission Rate
     (axes: Truncated CO Emission Rate (g/s); Quantiles of Standard Normal)

     Figure 6-15 Histogram, Boxplot, and Probability Plot of Truncated HC Emission Rate
     (axes: Truncated HC Emission Rate (g/s); Quantiles of Standard Normal)

       The boxplots for the truncated emission data show some obvious outliers in the measured
emissions of all three pollutants, and the histograms suggest a high degree of non-normality,
which is also indicated in the probability plots. There is thus a need to transform the response
variable to correct for this condition. Transformations are used to present data on a different
scale. In modeling and statistical applications, transformations are often used to improve the
compatibility of the data with assumptions underlying a modeling process, to linearize the
relation between two variables whose relationship is non-linear, or to modify the range of values
of a variable (Washington et al. 2003).
6.3.3 Transformation for Emissions Data
       Although evidence in the literature suggests that a logarithmic transformation is most
suitable for modeling motor vehicle emissions (Washington 1994; Ramamurthy et al. 1998;
Fomunung 2000; Frey et al. 2002), this transformation needs to be verified through the Box-Cox
procedure. The Box-Cox function in MATLAB® can automatically identify a transformation of the
emission data from the family of power transformations, with the power parameter (lambda) ranging
from -1.0 to 1.0. The lambdas chosen by the Box-Cox procedure are 0.22875 for truncated NOx,
-0.0648 for truncated CO, and 0.14631 for truncated HC.
       The Box-Cox procedure is used only as a guide for selecting a transformation, so overly
precise results are not needed (Neter et al. 1996). It is often reasonable to use a nearby lambda
value with the power transformation. The lambda values used for the transformations are 1/4 for
truncated NOx, 0 for truncated CO, and 0 for truncated HC. Histograms, boxplots, and normal
probability plots describing the transformed variable distributions are presented in Figures 6-16
to 6-18, where a great improvement is noted.
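
       The selection of lambda can be sketched as a grid search. The Python code below is an
illustration only and is not the MATLAB® routine used in this study. It assumes the standard
Box-Cox form, (y^lambda - 1)/lambda for nonzero lambda and ln(y) for lambda = 0, and selects the
lambda that maximizes the usual profile log-likelihood over the range -1.0 to 1.0. The function
names and the sample data are hypothetical.

```python
import math


def box_cox(y, lam):
    """Box-Cox power transformation of a single positive value."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def profile_log_likelihood(values, lam):
    """Profile log-likelihood of lambda under a normal model for the transformed data."""
    n = len(values)
    z = [box_cox(y, lam) for y in values]
    z_bar = sum(z) / n
    variance = sum((v - z_bar) ** 2 for v in z) / n
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(y) for y in values)


def best_lambda(values, lo=-1.0, hi=1.0, steps=200):
    """Grid search for the lambda with the highest profile log-likelihood."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


# Hypothetical positive data whose logarithms are evenly spaced around zero.
sample = [math.exp(x / 2) for x in range(-4, 5)]
chosen = best_lambda(sample)
print(chosen)

# Applying rounded lambdas such as those adopted in the text (1/4 and 0).
transformed_quarter = [box_cox(y, 0.25) for y in sample]
transformed_log = [box_cox(y, 0) for y in sample]
```

       For this symmetric made-up sample the search returns a lambda of 0, the logarithmic
transformation. With real emission data the optimum generally falls between grid points of
convenient value, which is why a nearby round value is adopted.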

Figure 6-16 Histogram, Boxplot, and Probability Plot of Truncated Transformed NOx Emission Rate

Figure 6-17 Histogram, Boxplot, and Probability Plot of Truncated Transformed CO Emission Rate

Figure 6-18 Histogram, Boxplot, and Probability Plot of Truncated Transformed HC Emission Rate

       Although transformations can improve compliance with a specific modeling assumption, such
as linearity or normality, they can often result in the violation of others. Thus, transformations
must be used in an iterative fashion, with continued checking of the other modeling assumptions
as transformations are made. Washington et al. (2003) suggest that comparisons between statistical
models should always be made on the original, untransformed scale of Y, and that this applies
equally to goodness-of-fit statistics and model validation exercises.
6.3.4 Identification of High Emitters
       From a modeling viewpoint, it is important to accurately predict the number of 'high
emitter' vehicles in the fleet (older technology, poorly maintained, or tampered vehicles that emit
significantly elevated emissions relative to the fleet average under all operating conditions) and
the fraction of activities that yield high emissions for normal emitting vehicles.  Historic practic-
es to identify  'high emitters' in a data set have relied on judgment to set cut points that are often
indefensible from a statistical, and sometimes even practical, perspective. U.S. EPA uses five
times the prevailing emission standards as the cut point across all pollutants (U.S. EPA 1993),
while CARB has defined different emission regimes ranging from normal to super emitters and
used different criteria for each regime (CARB 1991; Carlock 1994) (see Table 6-3).

       Table 6-3 CARB Emission Regime Definition (Carlock 1994)

Emitter Status    NOx                CO                  HC
Normal            < 1 standard       < 1 standard        < 1 standard
Moderate          1 to 2 standard    1 to 2 standard     1 to 2 standard
High              2 to 3 standard    2 to 6 standard     2 to 4 standard
Very High         3 to 4 standard    6 to 10 standard    5 to 9 standard
Super             > 4 standard       > 10 standard       > 9 standard

       In contrast, the methodology employed in MEASURE database development at Georgia
Tech is statistically based. Wolf et al. used regression tree techniques to classify vehicles into
classes that behave similarly, exhibit similar technology characteristics, and exhibit similar mean
emission rates under standardized testing conditions (Wolf et al. 1998). The cut points within
each technology class are then defined on the basis of pre-selected percentiles of a normal distri-
bution of the emission rates for each pollutant.  The analysis by Wolf et al. specified a cut point
of 97.73 percent (that is, mean + 2 standard deviations), which implies that approximately 2.27
percent of the vehicles in each technology class are high emitters.
       For this research, although inter-bus variability exists in the data set, the 15 buses
should be treated as one technology class because they share the same fuel injection type,
catalytic converter type, and transmission type, and their model years and odometer readings are
similar. As in the approach of Wolf et al., the emissions value located two standard deviations
above the mean of the normalized emissions distribution is used as the cut point to distinguish
between normal and high emission points. Theoretically, this method will consistently identify
approximately 2.27 percent of the data as high emission points, which means that 97.73 percent of
the population should fall into the normal status. Analysis results showed that 0.33 percent of
NOx emissions, 3.76 percent of CO emissions, and 1.37 percent of HC emissions were identified as
high emission points. The distribution of those high emission points across the buses is shown
in Table 6-4.
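
       The cut point rule can be illustrated with a short Python sketch. It assumes that the
emission rates have already been transformed toward normality, and it uses the population
standard deviation, which is a choice made here for illustration; the function names and the data
are hypothetical. The sketch also confirms the theoretical 97.73 percent figure from the standard
normal distribution.

```python
import math
from statistics import mean, pstdev


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def high_emission_share(transformed_rates, k=2.0):
    """Return (cut point, share of points above it) for cut = mean + k * std dev."""
    cut = mean(transformed_rates) + k * pstdev(transformed_rates)
    flagged = sum(1 for r in transformed_rates if r > cut)
    return cut, flagged / len(transformed_rates)


# Share of a normal population below mean + 2 standard deviations.
print(round(100 * normal_cdf(2.0), 2))

# Hypothetical transformed emission rates with one unusually high point.
transformed = [-1.0, -0.5, 0.0, 0.5, 1.0] * 3 + [-1.0, -0.5, 0.0, 0.5, 3.0]
cut_point, share = high_emission_share(transformed)
print(round(cut_point, 3), share)
```

       When the transformed distribution is not exactly normal, the observed share above the cut
point departs from 2.27 percent, as it does for the three pollutants in this data set.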

     Table 6-4 Percent of High Emission Points by Bus

             NOx       CO        HC
bus 360     0.02%     2.80%     5.06%
bus 361     0.32%     1.08%     0.25%
bus 363     0.06%     3.10%     0.00%
bus 364     0.04%     0.87%     7.38%
bus 372     0.00%     0.13%     1.96%
bus 375     0.69%     3.16%     0.27%
bus 377     0.00%     4.44%     0.00%
bus 379     0.67%     2.85%     1.17%
bus 380     0.52%     7.67%     0.69%
bus 381     0.10%     4.76%     0.14%
bus 382     1.14%     8.12%     0.36%
bus 383     0.88%     3.44%     1.82%
bus 384     0.50%     5.10%     1.33%
bus 385     0.55%     2.10%     0.60%
bus 386     0.20%     6.63%     0.57%
Total       0.36%     3.81%     1.38%

       For each individual bus, the highest proportion is 1.14 percent for bus 382 for NOx
emissions, 8.12 percent for bus 380 for CO emissions, and 7.38 percent for bus 364 for HC
emissions. Table 6-4 provides no evidence that the data set contains any "high emitters" (older
technology, poorly maintained, or tampered vehicles). This conclusion makes sense, since all buses
were only 5 or 6 years old when tested. The analysis also indicated that a small fraction of each
bus's observed activity exhibited disproportionately high emissions. Such activities reported in
the literature include hard accelerations at low speeds, moderate accelerations at high speeds,
and equivalent accelerations against gravity (Fomunung 2000). Given that high emission points
make up only 0.33 percent of the data set for NOx, 3.76 percent for CO, and 1.37 percent for HC,
it is not necessary to develop separate models for normal emissions and high emissions. Based on
this analysis, the 15 buses are treated as one technology class, since no high emitters were
identified.

                           6.4 Potential Explanatory Variables

       The literature identifies four main groups of parameters that affect vehicle emissions
(Guensler 1993; Clark et al. 2002). These groups are: 1) vehicle characteristics, including
vehicle type, make, model year, engine type, transmission type, frontal area, drag coefficient,
rolling resistance, vehicle maintenance history, etc.; 2) roadway characteristics, including road
grade and possibly pavement surface roughness, etc.; 3) on-road load parameters, such as the
on-road driving trace (second-by-second) or speed/acceleration profile, vehicle payload, on-road
operating modes, driver behavior, etc.; and 4) environmental conditions, including humidity,
ambient temperature, and ambient pressure (Feng et al. 2005; Guensler et al. 2005).
       In general, emissions from HDDVs are more likely to be a function of brake-horsepower
load on the engine (especially for NOx) than emissions from light-duty gasoline vehicles, because
instantaneous emission levels of diesel engines are highly correlated with the instantaneous
work output of the engine (Ramamurthy et al. 1999; Feng et al. 2005). In particular, the higher
the engine load, the higher the NOx emissions. The emissions modeling framework (from which most
of the items below are derived) is outlined in the Regional Applied Research Effort (RARE) report
(Guensler et al. 2006). The goal of that modeling regime was to predict on-road load and then
apply appropriate emission rates to the load. Most of the items outlined below are related to the
amount of engine load that a vehicle will experience. Although each of the variables below is
important, the values are not always available in on-road testing data (future data collection
efforts should ensure that they are all recorded). However, the engine load in the AATA database
could be used in emission rate model development for this research. In addition, some factors,
such as temperature and humidity, may affect emission rates independently of load, or perhaps
through interaction with load. The model should incorporate such variables.

6.4.1 Vehicle Characteristics

       Factors related to vehicle characteristics influencing heavy-duty diesel vehicle emissions
which are summarized in the literature include vehicle class (i.e., weight,  engine size, horsepow-
er rating), model year, vehicle mileage, emission control system (i.e., engine exhaust aftertreat-
ment system),  transmission type, inspection and maintenance history, etc. (Guensler 1993; Clark
et al. 2002).
       The effect of vehicle class on emissions is significant.  Five main factors that cause a
vehicle to demand engine power are vehicle speed, vehicle acceleration, drive train inertial ac-
celeration, vehicle weight, and road grade.  As the required  power and work performed by the
vehicle increase, the amount of fuel burned to produce that power also increases, and the appli-
cable emission rates also generally  increase.  Thus, emissions vary as a function of vehicle class
and vehicle configuration.  The higher truck classes with larger engines are heavier and, thus,
typically produce more emissions.  Vehicle configurations with large frontal areas and high drag
coefficients will yield higher emissions when operated at higher speeds and/or accelerated at
higher rates.
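
       The dependence of power demand on speed, acceleration, weight, grade, frontal area, and
drag coefficient can be illustrated with the standard road-load relationship from vehicle
dynamics. The Python sketch below is not the load model developed in this report: it works in SI
units, omits drive train rotational inertia and accessory loads, and uses hypothetical parameter
values (vehicle mass, rolling resistance coefficient, drag coefficient, frontal area) chosen only
for illustration.

```python
import math


def road_load_power_kw(speed_mps, accel_mps2, mass_kg, grade=0.0,
                       c_rr=0.008, c_d=0.6, frontal_area_m2=7.5,
                       air_density=1.2, g=9.81):
    """Tractive power (kW) needed at the wheels for a given operating point."""
    theta = math.atan(grade)                            # grade given as rise over run
    inertial = mass_kg * accel_mps2                     # force to accelerate the vehicle
    grade_force = mass_kg * g * math.sin(theta)         # force to climb the grade
    rolling = c_rr * mass_kg * g * math.cos(theta)      # rolling resistance
    drag = 0.5 * air_density * c_d * frontal_area_m2 * speed_mps ** 2  # aerodynamic drag
    return (inertial + grade_force + rolling + drag) * speed_mps / 1000.0


# Hypothetical 14,000 kg bus at 13.4 m/s (about 30 mph).
cruise = road_load_power_kw(13.4, 0.0, 14000)
accelerating = road_load_power_kw(13.4, 0.5, 14000)
uphill = road_load_power_kw(13.4, 0.0, 14000, grade=0.03)
print(round(cruise, 1), round(accelerating, 1), round(uphill, 1))
```

       Under these assumptions, a modest acceleration or a 3 percent grade raises the power
demand well above the steady cruise value, which is consistent with the load-based reasoning
above.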
      The concept of vehicle technology groups is to identify and track subsets of vehicles that
have similar on-road load responses and similar laboratory emission rate performance. The basic
premise is that vehicles in the same heavy-duty vehicle class, employing similar drive train sys-
tems, and of the same size and shape have similar load relationships.  There is also an important
practical consideration in establishing vehicle technology groups. Researchers need to be able to
identify these vehicles in the field during traffic counting exercises.
      The starting point for technology group criteria is a visual classification scheme. Yoon et
al. (2004a) developed a new HDV visual classification scheme, called the X-scheme, based on the
number of axles and gross vehicle weight ratings (GVWR), as a hybrid between the FHWA truck and
U.S. EPA HDV classification schemes. With field-observed HDV volumes, emission rates estimated
using the X-scheme were 34.4% and 32.5% higher for NOx and PM, respectively, than those estimated
using the standard U.S. EPA guidance (U.S. EPA 2004c). The X-scheme reflects vehicle composition
in the field more realistically than does the standard U.S. EPA guidance (U.S. EPA 2004c), which
shifted heavy-HDV volumes into light- or medium-HDV volumes 21% more frequently than the X-scheme.
Figure 6-19 shows the X-scheme classes and their typical vehicle configurations (Yoon et al.
2004a).

               Figure 6-19 The X Classes and Typical Vehicle Configurations

      Vehicle age and model year effects are accounted for because some vehicle models have
much lower average emissions. Researchers from West Virginia University reported that most
regulated emissions from engines produced by Detroit Diesel Corporation have declined over the
years, and that the expected trend of decreasing emission levels with engine model year is clear
and consistent for PM, HC, CO, and NOx, starting with the 1990 models (Prucz et al. 2001).
Information on vehicle age can be obtained from a registration database using vehicle
identification numbers and truck manufacturer records.  The registration database can be sorted
by calendar year and show vehicles registered in the given year by model year. However, given
the differences noted between field-observation fleet composition and registration data in the
light-duty fleet (Granell et al. 2002), significant additional research efforts designed to model the
on-road subfleet composition (classifications and model year distributions) are even more war-
ranted for HDVs. It is also important to keep in mind that heavy-duty engines accumulate miles
of travel very rapidly and that engine rebuilding is a common practice. Hence, the age of the
vehicle does not necessarily equal the age of the engine. Previous field work in Atlanta indicates
that on-road surveys provide better information on fleet composition (Ahanotu 1999). To refine
the model, appropriate data sets that include detailed information on engine type, transmission
type, etc. will be needed to appropriately subdivide the observed on-road groups and continue to
develop respective emission rates.  The data collection challenge in this area is daunting, but it is
worthwhile to perform once to provide a library of information that can be used in a large num-
ber of modeling applications.
       Vehicle weight is critical to the engine power that must be supplied to produce the
tractive force needed to overcome inertial and drag forces, and it therefore influences vehicle
emissions. NOx emissions increase as vehicle weight increases, and this relationship does not
vary much from vehicle to vehicle (Gajendran and Clark 2003). The effects of vehicle age, engine
horsepower ratings, transmission type, and engine exhaust aftertreatment were also investigated
in other literature (Clark et al. 2002; Feng et al. 2005).
       The vast majority of heavy-duty vehicles are normal  emitters, but a small percentage of
vehicles are high-emitters under every operating condition, typically because they have been
tampered with or they are malfunctioning (i.e.,  defective or mal-maintained engine sensors or
actuators). As the vehicle ages,  general engine  wear and tear will increase emission rates mod-
erately due to normal degradation  of emission controls of properly functioning vehicles. On the
other hand, as vehicles age, the probability increases that some of the vehicles will malfunction
and produce significantly higher emissions (i.e., become high-emitters). Probability functions
that classify vehicles within specific model years (and later, within specific statistically-derived
vehicle technology groups) are currently being  developed through the assessment of certification
testing and various roadside emissions tests. Obtaining additional detailed sources of data for
developing failure models appears to be warranted.
       After engine horsepower at the output shaft has been reduced by power losses associated
with fluid pressures, operation of air conditioning, and other accessory loads, there is still an
additional and significant drop in available power before it reaches the wheels. Power is
required to overcome mechanical friction within the transmission and differential, internal
working resistance in hydraulic couplings, and friction of the vehicle weight on axle bearings.
The combined effect of these components is parameterized as drive train efficiency. However,
the more difficult and more significant component of power loss in the drive train is associated
with the inertial resistance of drive train components to rotational acceleration (Gillespie
1992).
       A heavy-duty truck drive train is significantly more massive than its light-duty counter-
part. The net effect of drive train inertial losses when operating in higher gears on the freeway
may not be significant enough to be included in the model (relative to the other load-related com-
ponents in the model for these heavy vehicles). However, recent studies appear to indicate very
high truck emission rates (gram/second) in "creep mode" stop and start driving activities noted
in ports and rail yards. Thus, high inertial loads for low gear, low speed, and acceleration opera-
tions may contribute significantly to emissions from mobile sources in freight transfer yards and
therefore should not be ignored (Guensler et al. 2006).
       The inertial losses are a function of a wide variety of physical drive train characteristics
(transmission and differential types, component mass, etc.) and on-road operating conditions. To
refine the use of inertial losses in the modal model, new drive train tests will be designed to
evaluate the inertial losses for various engine, drive shaft, differential, axle, and wheel
combinations and to establish generalized drive train technology classes. Gear selection
probability matrices for each drive train technology class, together with gear and final drive
ratio data, can then be provided in lookup tables for model implementation, in place of the
inertial assumptions currently employed. However, the data needed to develop such lookup tables
are currently lacking.

6.4.2 Roadway Characteristics

       The three basic geometric elements of a roadway are  the horizontal alignment, the cross-
slope or amount of super-elevation and the longitudinal profile or grade.  Among them, road
grade has been shown to have significant impact on engine load and vehicle emissions (Guensler
1993). Other roadway characteristics, such as lane width, are also noted to have a significant
impact on the speed-acceleration profiles of heavy-duty vehicles and can therefore affect engine
load (Grant et al. 1996).
6.4.3 Onroad Load Parameters

       Onroad load parameters include the on-road driving trace (second-by-second) or
speed/acceleration profile, engine load, on-road operating modes (i.e., idling, motoring,
acceleration, deceleration, and cruise), driver behavior, and so on. Vehicle speed and
acceleration are integral components in the estimation of vehicle road load, and therefore engine
load. Previous studies indicated that increased engine power requirements could increase NOx
emissions (Ramamurthy and Clark 1999; Feng et al. 2005). Clark et al. reported that vehicle
applications and duty cycles can affect the emissions produced (Clark et al. 2002). That study
found that, over a typical day of use, a vehicle that stops and then accelerates more often might
produce higher distance-specific emissions, provided all else is held constant.
       Passenger and freight payloads together with the vehicle tare weight contribute to the
demand for power that must be supplied to produce the tractive force needed to overcome inertial
and drag forces.  Passenger loading functions for transit operations can be obtained through anal-
ysis of fare data or on-board passenger count programs. On the heavy-duty truck side, on-road
freight weight distributions by vehicle class can be derived from roadside weigh station studies.
Ahanotu conducted detailed weigh-in-motion studies in Atlanta and found that reasonable load
distributions by truck class and time of day could be applied in such a modal modeling approach
(Ahanotu 1999). Although additional field studies are warranted to examine the validity of the
Atlanta results over time and the transferability of findings in Atlanta to other metropolitan areas
(especially considering the potential variability in commodity transport, such  as agricultural
goods, that may occur in other areas), the modeling methodology seems appropriate.

6.4.4 Environmental Conditions

       Environmental conditions under which the vehicle is operated include humidity, ambient
temperature, and ambient pressure. U.S.  EPA is currently conducting studies to find the effect of
ambient conditions on HDDV emissions  (NRC 2000). The current MOBILE6.2 model includes
correction factors to account for the impact of environmental conditions on vehicle emission
rates.  Given the lack of compelling additional data available for analysis, it may be necessary
to ignore the effects of these environmental parameters (altitude, temperature, and humidity) or
simply incorporate the existing MOBILE6.2 correction factors. Preliminary analyses of the data
and methods used to derive the MOBILE6.2 environmental correction factors indicate that the
embedded equations in MOBILE6.2 probably need to be revisited.

6.4.5 Summary

       It is impossible for the modeler to include all explanatory variables identified in the literature
review, because the explanatory variables available for model development
and model validation are only a subset of the potential explanatory variables identified above.
Therefore, the conceptual model will include only the available variables and derived variables in
the data set provided.

                          6.5 Selection of Explanatory Variables

       As mentioned earlier, available explanatory variables for transit buses are only a subset of
potential explanatory variables identified.  In brief, available explanatory variables can be sum-
marized as:

       •   Test information: date, time;

       •   Vehicle characteristics: license number, model year, odometer reading, engine size,
          instrument configuration number;

       •  Roadway characteristics: road grade (%);

       •   Onroad load parameters: engine power (bhp), vehicle speed (mph), acceleration
          (mph/s);

       •  Engine operating parameters: throttle position (0 - 100%),  engine oil temperature
          (deg F), engine oil pressure (kPa), engine warning lamp (Binary), engine coolant tem-
          perature (deg F),  barometric pressure reported from ECM (kPa);

       •  Environmental conditions: ambient temperature (deg C), ambient pressure (mbar),
          ambient relative humidity  (%), ambient absolute humidity (grains/lb air).

       The most important  question related to engine power is how to simulate engine power in
the real world for application purposes. Georgia Institute of Technology researchers developed
a transit bus engine power demand simulator (TB-EPDS), which estimates transit bus power
demand for given speed, acceleration, and road grade conditions (Yoon et al. 2005a; Yoon et al.
2005b). Speed-acceleration-road grade matrices were developed from speed and location data
obtained using a Georgia Tech Trip Data Collector. The researchers conclude that speed-accel-
eration-road grade matrices at the link level or the route level are both acceptable for regional
inventory development.  However, for micro-scale air quality impact analysis, link-based matrices
should be employed (Yoon et al. 2005a). Although significant uncertainties still exist for
inertial loss, which is significant at low speeds, and for motoring mode with negative engine power,
this research showed that using engine power as load data is possible for application purposes.
Thus we concluded that engine power could be used as load data in the estimated emission models.
       The relationships between explanatory variables were investigated using S-Plus®. Three
variables (engine size, instrument configuration number, and engine warning lamp) were excluded
because each has only a single value for all records. There are 14 explanatory
variables included in correlation analysis. The correlation matrix is shown in Table 6-5.

Table 6-5 Correlation Matrix for Transit Bus Data Set
                         *** Correlations for data in:  transitbus.data ***
                         model.year      odometer   temperature          baro        SCB.RH
model.year              1.000000000  -0.655273106   0.047048515   0.394378106   0.068411842
odometer               -0.655273106   1.000000000   0.186771499  -0.704310642   0.343814465
temperature             0.047048515   0.186771499   1.000000000  -0.326938545   0.488214011
baro                    0.394378106  -0.704310642  -0.326938545   1.000000000  -0.632480147
SCB.RH                  0.068411842   0.343814465   0.488214011  -0.632480147   1.000000000
humid                   0.030997734   0.390261480   0.751260451  -0.649522446   0.931879078
grade                  -0.004241021   0.000527370  -0.005590441   0.002384338  -0.006075112
vehicle.speed          -0.014916204  -0.062908098  -0.225478003   0.054918347  -0.034502697
throttle.position      -0.001868240   0.009346571  -0.091132660  -0.014470281   0.013423574
oil.temperature         0.051759069  -0.011881827   0.042676227  -0.026744091   0.096018579
oil.pressure            0.050521339  -0.098442472  -0.073256993   0.034212231  -0.049852837
coolant.temperature     0.206727241  -0.117710067   0.077114798   0.045844706   0.200555988
eng.bar.press           0.137781076  -0.248876183  -0.260525088   0.371021489  -0.366382927
engine.power           -0.006066455   0.021283229  -0.059512654  -0.035718725   0.025743642
acc                              --            --            --            --   0.000040371

                              humid         grade  vehicle.speed  throttle.position  oil.temperature
model.year              0.030997734  -0.004241021   -0.014916204       -0.001868240      0.051759069
odometer                0.390261480   0.000527370   -0.062908098        0.009346571     -0.011881827
temperature             0.751260451  -0.005590441   -0.225478003       -0.091132660      0.042676227
baro                   -0.649522446   0.002384338    0.054918347       -0.014470281     -0.026744091
SCB.RH                  0.931879078  -0.006075112   -0.034502697        0.013423574      0.096018579
humid                   1.000000000  -0.006411009   -0.117870984       -0.024720165      0.087317807
grade                  -0.006411009   1.000000000    0.000896568        0.020186507     -0.007116669
vehicle.speed          -0.117870984   0.000896568    1.000000000        0.387705398      0.018641433
throttle.position      -0.024720165   0.020186507    0.387705398        1.000000000      0.012077329
oil.temperature         0.087317807  -0.007116669    0.018641433        0.012077329      1.000000000
oil.pressure           -0.077649741   0.009836954    0.567493814        0.681336402     -0.117896787
coolant.temperature     0.171558840  -0.014531524    0.072998199        0.059605193      0.335667341
eng.bar.press          -0.373540032   0.002132063    0.143270319        0.102861968      0.059886972
engine.power           -0.003279122   0.021662091    0.303209657        0.959310116      0.007171781
acc                     0.003340728   0.012930076    0.000224126        0.660747116     -0.004185245

                       oil.pressure  coolant.temperature  eng.bar.press  engine.power
model.year              0.050521339          0.206727241    0.137781076  -0.006066455
odometer               -0.098442472         -0.117710067   -0.248876183   0.021283229
temperature            -0.073256993          0.077114798   -0.260525088  -0.059512654
baro                    0.034212231          0.045844706    0.371021489  -0.035718725
SCB.RH                 -0.049852837          0.200555988   -0.366382927   0.025743642
humid                  -0.077649741          0.171558840   -0.373540032  -0.003279122
grade                   0.009836954         -0.014531524    0.002132063   0.021662091
vehicle.speed           0.567493814          0.072998199    0.143270319   0.303209657
throttle.position       0.681336402          0.059605193    0.102861968   0.959310116
oil.temperature        -0.117896787          0.335667341    0.059886972   0.007171781
oil.pressure            1.000000000         -0.298083257    0.022549030   0.656609695
coolant.temperature    -0.298083257          1.000000000    0.284506753   0.050584845
eng.bar.press           0.022549030          0.284506753    1.000000000   0.089702976
engine.power            0.656609695          0.050584845    0.089702976   1.000000000
acc                     0.465493435                   --             --            --

Note: "--" marks an entry that is not legible in the available copy. The acc (acceleration) column,
which mirrors the acc row, is omitted.
       All variable pairs with correlation coefficients greater than 0.5 in absolute value were
scrutinized and subjected to further analysis, which helped in paring down the number of variables.
The values in the correlation matrix show that two pairs are highly correlated (coefficients higher
than 0.90): throttle position and engine power, and ambient relative humidity and ambient absolute
humidity. The following pairs are moderately correlated (absolute coefficients higher than 0.50):
model year and odometer, odometer and barometric pressure, barometric pressure and ambient relative
humidity, barometric pressure and ambient absolute humidity, ambient absolute humidity and
temperature, oil pressure and throttle position, oil pressure and vehicle speed, oil pressure and
engine power, throttle position and acceleration, and engine power and acceleration.
Other pairs of variables have only slight correlations.
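The screening of correlated pairs described above can be illustrated with a short sketch. The function name, the threshold argument, and the four-variable excerpt (values rounded from Table 6-5) are choices made here for illustration only; the original analysis was carried out in S-Plus.

```python
def correlated_pairs(names, corr, threshold=0.5):
    """Return (name_i, name_j, r) for every pair whose |r| exceeds the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):   # upper triangle only: matrix is symmetric
            if abs(corr[i][j]) > threshold:
                pairs.append((names[i], names[j], corr[i][j]))
    # Strongest relationships first, regardless of sign.
    return sorted(pairs, key=lambda p: -abs(p[2]))

# Four-variable excerpt of Table 6-5, rounded to four decimals.
names = ["vehicle.speed", "throttle.position", "oil.pressure", "engine.power"]
corr = [
    [1.0000, 0.3877, 0.5675, 0.3032],
    [0.3877, 1.0000, 0.6813, 0.9593],
    [0.5675, 0.6813, 1.0000, 0.6566],
    [0.3032, 0.9593, 0.6566, 1.0000],
]

pairs = correlated_pairs(names, corr)
for a, b, r in pairs:
    print(f"{a:18s} {b:18s} {r:+.4f}")
```

Using the absolute value matters because several of the moderately correlated pairs in Table 6-5, such as model year and odometer, have negative coefficients.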
       The relationship between throttle position and engine power is shown in Figure 6-20.
Since engine power is derived from percent engine load, engine torque, and engine speed, and
previous studies indicated that increased engine power requirements could result in the increase
in NOx emissions (Ramamurthy and Clark 1999; Feng et al. 2005), the author retained engine
power in the database.
       [Figure 6-20: scatter plot of engine power against Throttle Position (0-100%) with a fitted
       regression line. The R Square value is about 0.92; the fitted equation is not legible in
       this copy.]

           Figure 6-20 Throttle Position vs. Engine Power for Transit Bus Data Set
       Ambient relative humidity and ambient absolute humidity provide the same informa-
tion in two different ways, and either is enough to consider the influence of ambient humidity on
emissions. The author retained ambient relative humidity in the database.
Three other findings related to the correlation matrix are:

1.    All environmental characteristics, like temperature, humidity, and barometric pres-
     sure,  are moderately correlated with each other (Figure 6-21), which indicates mod-
     elers  should consider such relationships when developing environmental factors.

2.    Engine power is correlated with not only on-road load parameters such as vehicle
     speed, acceleration, and road grade, but also engine operating parameters such as
     throttle position and engine oil pressure. Engine power in this data set is derived
     from  measured engine speed, engine torque and percent engine load. On the other
     hand, engine power could be derived theoretically from vehicle speed, accelera-
     tion and road grade using an engine power demand equation. So, engine power
     can connect on-road modal activity with engine operating conditions at this level.
     This fact strengthens the case for introducing engine power into a conceptual
     emissions model and for improving the ability to simulate engine power for regional
     inventory development.

3.    Engine operating parameters, like throttle position (0 - 100%), engine oil pres-
     sure (kPa), engine oil temperature (deg F), engine coolant temperature (deg F), and
     barometric pressure reported from ECM (kPa), are highly or moderately related
     to on-road operating parameters. For example, engine power and throttle position
     are highly correlated, while oil pressure and vehicle speed, oil pressure and en-
     gine power, throttle position and acceleration are moderately correlated. Although
     engine operating parameters may have power to explain the  variability of emis-
     sion data, it is difficult to obtain such data in the real world for modeling purposes.
     These four variables are retained for further analysis of their relationships with
     emissions. Although these four variables will be excluded from the emission model
     at this time, analysis of these potential relationships may indicate a need for further
     research in this area.
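Finding 2 notes that engine power could be derived theoretically from vehicle speed, acceleration, and road grade. The sketch below uses a generic textbook road-load balance to show the idea; it is not the TB-EPDS formulation, and the mass, rolling resistance, and drag values are hypothetical placeholders. It returns tractive power at the wheels and ignores drivetrain losses, accessory loads, and rotational inertia.

```python
import math

G = 9.81              # gravitational acceleration, m/s^2
MPH_TO_MS = 0.44704   # metres per second in one mile per hour
W_PER_HP = 745.7      # watts per horsepower

def power_demand_hp(speed_mph, accel_mph_s, grade_pct,
                    mass_kg=15000.0,   # hypothetical loaded bus mass
                    c_rr=0.008,        # hypothetical rolling resistance coefficient
                    cd_a=6.0,          # hypothetical drag coefficient x frontal area, m^2
                    rho_air=1.2):      # air density, kg/m^3
    """Tractive power at the wheels (hp) from a generic road-load balance."""
    v = speed_mph * MPH_TO_MS
    a = accel_mph_s * MPH_TO_MS
    theta = math.atan(grade_pct / 100.0)
    force = (mass_kg * a                              # inertial force
             + mass_kg * G * math.sin(theta)          # grade force
             + mass_kg * G * c_rr * math.cos(theta)   # rolling resistance
             + 0.5 * rho_air * cd_a * v * v)          # aerodynamic drag
    return force * v / W_PER_HP

cruise = power_demand_hp(30.0, 0.0, 0.0)
accelerating = power_demand_hp(30.0, 1.0, 0.0)
braking = power_demand_hp(30.0, -2.0, 0.0)
print(round(cruise, 1), round(accelerating, 1), round(braking, 1))
```

A negative result corresponds to the motoring condition mentioned earlier, where the wheels drive the engine rather than the reverse.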

       [Figure 6-21: scatter plot panels of the environmental parameters. Legible axis labels
       include Barometric Pressure (mbar) and Ambient Relative Humidity (%).]
                   Figure 6-21  Scatter plots for environmental parameters

                                     CHAPTER 7
                7. MODAL ACTIVITY DEFINITIONS DEVELOPMENT
                   7.1  Overview of Current Modal Activity Definitions

       Current research suggests that vehicle emission rates are highly correlated with modal
vehicle activity. Modal activity is a vehicle activity characterized by cruise, idle, acceleration
or deceleration operation. Consequently, a modal approach to transportation-related air quality
modeling is becoming widely accepted as more accurate in making realistic estimates of mobile
source contribution to local and regional air quality. Research at Georgia Tech has clearly identi-
fied that modal operation is a better indicator of emission rates than average speed (Bachman
1998). The analysis of emissions with respect to driving modes, also referred to as modal
emissions analysis, has been carried out in several recent studies (Barth et al. 1996; Bachman 1998;
Fomunung et al. 1999; Frey et al. 2002; Nam 2003; Barth et al. 2004). These studies indicated that
driving modes might explain a significant portion of the variability in emission data.
Usually, driving can be divided into four modes: acceleration, deceleration, cruise, and idle.
However, driving mode definitions in the literature were somewhat arbitrary. To define the driving
modes or choose more reasonable definitions for the proposed modal emissions model, current driving
mode definitions used in different modal emission models need to be investigated first.
      MEASURE's Definitions
       Researchers at Georgia Tech developed the MEASURE model in 1998 (Guensler et al.
1998). This model was developed from more than 13,000 laboratory tests conducted by the
EPA and CARB using standardized test cycle conditions and alternative cycles (Bachman 1998).
Modal activity variables were introduced into the MEASURE model as follows: acceleration
(mph/sec), deceleration (mph/sec), cruise (mph), and percent of time in idle. In addition, two
surrogate variables were developed: inertial power surrogate (IPS) (mph²/s), defined
as acceleration times velocity, and drag power surrogate (DPS) (mph³/s), defined as
acceleration times velocity squared. Within each mode, several 'cut points', or threshold values,
were specified and used to create several categories. In total, six threshold values were defined
for acceleration, three for deceleration, five for cruise modes, seven for IPS, and seven for DPS.
Modal activity surrogate variables were added as the percent of cycle time spent in specified
operating conditions (Fomunung et al. 1999).
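The two surrogate variables and the "percent of cycle time" form described above can be sketched as follows. This is an illustration only: the speed trace is hypothetical, acceleration is taken as a simple one-second backward difference (an assumption, since the source does not state how MEASURE differenced speed), and the cut point of 2 mph/sec is just one of the thresholds named in the text.

```python
def modal_surrogates(speeds_mph):
    """Return per-second (acceleration, IPS, DPS) tuples from a 1 Hz speed trace."""
    rows = []
    for prev, cur in zip(speeds_mph, speeds_mph[1:]):
        acc = cur - prev        # mph/s for one-second spacing (backward difference)
        ips = acc * cur         # inertial power surrogate, mph^2/s
        dps = acc * cur ** 2    # drag power surrogate, mph^3/s
        rows.append((acc, ips, dps))
    return rows

def percent_time_above(values, cut_point):
    """Percent of seconds in which a value exceeds a cut point."""
    return 100.0 * sum(v > cut_point for v in values) / len(values)

trace = [0.0, 2.0, 5.0, 9.0, 12.0, 12.0, 11.0, 8.0]   # hypothetical speeds, mph
rows = modal_surrogates(trace)
share_acc_over_2 = percent_time_above([r[0] for r in rows], 2.0)
print(rows[1], round(share_acc_over_2, 1))
```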
       NCSU's Definitions
       Dr. Frey at NCSU defined four modes of operation (idle, acceleration, deceleration, and
cruise) for U.S. EPA's MOVES model in 2001 (Frey and Zheng 2001; Frey et al. 2002). The
following description is quoted directly from his report (Frey et al. 2002).
               Idle is defined as based upon zero speed and zero acceleration. The
       acceleration mode includes several considerations.  First, the vehicle must be
       moving and increasing in speed. Therefore, speed must be greater than zero and
       the acceleration must be greater than zero.  However, vehicle speed can vary
       slightly during events that would typically be judged as cruising.  Therefore,
       in most instances, the acceleration mode is based upon a minimum accelera-
       tion of 2 mph/sec.  However, in  some cases, a vehicle may accelerate slowly.
       Therefore, if the vehicle has had a sustained acceleration rate averaging at least
       1  mph/sec for at least three seconds or more, that is also considered accelera-
       tion.  Deceleration is defined in a similar manner as acceleration, except that the
       criteria for deceleration are based upon negative acceleration rates.  All other
       events not classified as idle, acceleration, or deceleration, are classified as cruis-
       ing. Thus, cruising is approximately steady speed driving but some drifting of
       speed is allowed.
       Physical Emission Rate Estimator's (PERE 's) Definitions
       Dr. Nam developed his definitions when he introduced his Physical Emission Rate Estimator
(PERE) model in 2003 (Nam 2003).  Idle is defined as speed less than 2 mph. Acceleration
mode is based on an acceleration rate greater than 1 mph/sec, whereas deceleration mode is based
on an acceleration rate less than -0.2 mph/sec. Other events are classified as cruise mode, for
which the acceleration range is between -0.2 mph/sec and 1 mph/sec. Nam also noted in his report
that the definition of cruise (based only on acceleration) will change depending on the speed in
future studies.
       Summary
       Current driving mode definitions related to modal emission models are all significantly
different from each other. NCSU used one absolute critical value, 2 mph/sec, for acceleration
and deceleration mode. However, PERE chose two different critical values, 1 mph/sec and -0.2
mph/sec, for acceleration and deceleration mode individually. The critical values, 2 mph/sec, 1
mph/sec, or 0.2 mph/sec, were chosen somewhat arbitrarily. MEASURE used several thresh-
old values to add modal activity surrogate variables.  Table 7-1 summarizes these modal activity
definitions.

Table 7-1 Comparison of Modal Activity Definition
Mode           MEASURE                     NCSU                      PERE
Idle           Speed=0, Acc=0              Speed=0, Acc=0            Speed<2
Acceleration   Acc>6, Acc>5, Acc>4,        Acc>2 or Acc>1 for        Acc>1
               Acc>3, Acc>2, Acc>1         three seconds
Deceleration   Acc<-3, Acc<-2, Acc<-1      Acc<-2 or Acc<-1 for      Acc<-0.2
                                           three seconds
Cruise         Speed>70, Speed>60,         Other events              -0.2
               Speed>50, Speed>40,
               Speed>30
-------
dant on the available speed/acceleration data and data quality. For example, a lack of zero speed
records does not mean that there is no idle activity in the data set.
       The initial proposed modal activity definitions were defined as follows:

       •   Idle is defined by speeds less than 2.5 mph and absolute acceleration less
          than 0.5 mph/sec.

       •   Acceleration mode is based upon a minimum acceleration of 0.5 mph/sec.

       •   Deceleration is defined in a manner similar to acceleration, except that the criteria for
          deceleration are based upon negative acceleration rates.

       •   All other events not classified as idle, acceleration, or deceleration,  are classified as
          cruise.

       At the same time, several different critical values were chosen to examine the reasonable-
ness of the proposed criteria.  Four different mode definitions using different critical values are
shown in Table 7-2.

Table 7-2 Four Different Mode Definitions and Modal Variables
               Idle                            Acceleration   Deceleration   Cruise
Definition 1   Speed < 2.5 & abs(acc) < 0.5    Acc > 0.5      Acc < -0.5     Other
Definition 2   Speed < 2.5 & abs(acc) < 1      Acc > 1        Acc < -1       Other
Definition 3   Speed < 2.5 & abs(acc) < 1.5    Acc > 1.5      Acc < -1.5     Other
Definition 4   Speed < 2.5 & abs(acc) < 2      Acc > 2        Acc < -2       Other
Note: Unit for speed is mph, unit for acceleration is mph/sec.
       A program was written in MATLAB™ to determine the driving mode for each record of the
second-by-second data and to estimate the average emission rate for each driving mode under
each of the modal activity definitions in Table 7-2. Figures 7-1 to 7-3 present a comparison of
average modal emission rates for different pollutants (NOx, CO, and HC).
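The mode assignment and averaging step described above can be sketched in a few lines. This is not the authors' MATLAB program; it is a minimal Python illustration of the Table 7-2 logic, with hypothetical function names and made-up records (emission rates in arbitrary units). Records that sit exactly on a threshold fall into cruise, following the "Other" column of the table.

```python
from collections import defaultdict

def driving_mode(speed_mph, acc_mph_s, acc_cut=0.5, idle_speed=2.5):
    """Assign one second of data to a mode using a Table 7-2 style definition."""
    if speed_mph < idle_speed and abs(acc_mph_s) < acc_cut:
        return "idle"
    if acc_mph_s > acc_cut:
        return "acceleration"
    if acc_mph_s < -acc_cut:
        return "deceleration"
    return "cruise"

def modal_average_rates(records, acc_cut=0.5):
    """Average emission rate per mode; records are (speed, acc, rate) tuples."""
    totals = defaultdict(lambda: [0.0, 0])
    for speed, acc, rate in records:
        bucket = totals[driving_mode(speed, acc, acc_cut)]
        bucket[0] += rate
        bucket[1] += 1
    return {mode: total / count for mode, (total, count) in totals.items()}

# Hypothetical records: (speed in mph, acceleration in mph/s, emission rate)
records = [(0.0, 0.0, 0.03), (1.0, 0.2, 0.05), (10.0, 1.2, 0.25),
           (20.0, 0.8, 0.21), (25.0, 0.1, 0.14), (22.0, -1.6, 0.02)]

for cut in (0.5, 1.0, 1.5, 2.0):   # critical values of Definitions 1 to 4
    print(cut, modal_average_rates(records, cut))
```

Raising the critical value moves records out of the acceleration and deceleration modes and into cruise, which is the shift in record counts seen across the four definitions.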

Figure 7-1 Average NOx Modal Emission Rates for Different Activity Definitions

Figure 7-2 Average CO Modal Emission Rates for Different Activity Definitions

Figure 7-3 Average HC Modal Emission Rates for Different Activity Definitions
       These four modal activity definitions show a consistent pattern. The
average emissions during the acceleration mode are significantly higher than in any other driving
mode for all of the pollutants.  The average emission rate during deceleration mode is the lowest
of the four modes for NOx and CO emissions, while the average emission rate during idle mode is
the lowest of the four modes for HC emissions. The average cruising emission rate is typically
higher than the average idling and decelerating emission rates, except for CO emissions in
definitions 3 and 4.
       To assess whether the average modal emission rates are statistically significantly different
from each other, two-sample tests were performed for each pair of modes. Lilliefors tests for
goodness of fit to a normal distribution were first applied to each mode under each modal activity
definition. The results show that all of them reject the null hypothesis of a normal distribution
at the 5% level. A Kolmogorov-Smirnov two-sample test was therefore used in place of the t-test
because the assumption of normality was questionable. The Kolmogorov-Smirnov two-sample
test is a test of the null hypothesis that two independent samples have been drawn from the same
population (or from populations with the same distribution). The test uses the maximal difference
between the cumulative frequency distributions of the two samples as the test statistic. Results of
the Kolmogorov-Smirnov two-sample tests are presented in Table 7-3 in terms of p-values, where
"Acc" represents acceleration mode and "Dec" represents deceleration mode. Cases where
the p-value is less than 0.05 indicate that the distributions are different at the 5% level.  All
p-values for the 72 possible pairwise comparisons are lower than 0.05, indicating that the
distributions for these pairs are statistically different from each other.
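The test statistic described above, the maximal difference between two cumulative frequency distributions, can be computed directly. The sketch below is a plain-Python illustration with a hypothetical function name; it returns only the statistic. The p-values reported in Table 7-3 would normally come from a statistics package (SciPy's `scipy.stats.ks_2samp` is one such implementation).

```python
def ks_two_sample_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d_max = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both CDFs past every observation equal to x (handles ties).
        while i < n and a[i] == x:
            i += 1
        while j < m and b[j] == x:
            j += 1
        d_max = max(d_max, abs(i / n - j / m))
    return d_max

# Hypothetical emission rates for two modes
idle_rates = [0.02, 0.03, 0.03, 0.04]
accel_rates = [0.18, 0.22, 0.25, 0.31]
print(ks_two_sample_statistic(idle_rates, accel_rates))
```

A statistic near 1 means the two samples barely overlap; a statistic near 0 means their distributions are nearly indistinguishable.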

Table 7-3 Results for Pairwise Comparison for Modal Average Estimates In Terms of P-value

               Pollutant   Idle-Acc   Idle-Dec   Idle-Cruise   Acc-Dec   Acc-Cruise   Dec-Cruise
Definition 1   NOx         0          0          0             0         0            0
               CO          0          0          0             0         0            0
               HC          0          0          0             0         0            0
Definition 2   NOx         0          0          0             0         0            0
               CO          0          0          0             0         0            0
               HC          0          0          0             0         0            0
Definition 3   NOx         0          0          0             0         0            0
               CO          0          0          0             0         0            0
               HC          0          0          0             0         0            0
Definition 4   NOx         0          0          0             0         0            0
               CO          0          0          0             0         0            0
               HC          0          0          0             0         0            0
       The modal emission analysis results suggest that all four mode definitions proposed in
Table 7-2 appear reasonable. These modal definitions allow some explanation of differences in
emissions based upon driving mode, as revealed by the fact that the modal emission distributions
differ from each other. A further step is taken here to identify which mode definition is the most
appropriate by using the HTBR technique. For each definition, three
dummy variables are added to represent idle, acceleration, and deceleration modes. The regression
trees developed between emission data and these three dummy variables for each definition
are shown in Figures 7-4 to 7-6.  The sensitivity test results based on these regression trees
for NOx, CO, and HC are summarized in Table 7-4.

Figure 7-4 HTBR Regression Tree Result for NOx Emission Rate
[Panels: (a) Definition 1, (b) Definition 2, (c) Definition 3, (d) Definition 4]

Figure 7-5 HTBR Regression Tree Result for CO Emission Rate
[Panels: (a) Definition 1, (b) Definition 2, (c) Definition 3, (d) Definition 4]

Figure 7-6 HTBR Regression Tree Result for HC Emission Rate
[Panels: (a) Definition 1, (b) Definition 2, (c) Definition 3, (d) Definition 4]

Table 7-4 Sensitivity Test Results for Four Mode Definitions
Pollutant  Definition    Mode           Number   Deviance   Mean ER     Residual Mean Deviance
NOx        All modes                    105976   1435.00    0.10680
           Definition 1  Idle            29541     11.04    0.03235     0.006967 = 738.3 / 106000
                         Acceleration    25931    320.90    0.22480
                         Deceleration    22242     41.32    0.02671
                         Cruise          28262    365.10    0.13930
           Definition 2  Idle            31064     16.05    0.03342     0.007658 = 811.5 / 106000
                         Acceleration    18894    206.50    0.23110
                         Deceleration    16644     21.14    0.02214
                         Cruise          39374    567.80    0.14070
           Definition 3  Idle            32010     23.07    0.03470     0.00856 = 907.1 / 106000
                         Acceleration    13417    130.50    0.2297
                         Deceleration    12768     14.27    0.02065
                         Cruise          47781    739.30    0.14350
           Definition 4  Idle            32717    30.240    0.03583     0.009397 = 995.8 / 106000
                         Acceleration     8719    77.150    0.22600
                         Deceleration     9452     9.191    0.02015
                         Cruise          55088   879.200    0.14490

CO         All modes                    105765   771.300    0.032370
           Definition 1  Idle            29287     2.166    0.005590    0.005795 = 612.9 / 105800
                         Acceleration    25866   559.400    0.099740
                         Deceleration    22456     3.903    0.006564
                         Cruise          28156    47.380    0.018910
           Definition 2  Idle            30764     4.185    0.005944    0.005486283 = 580.2 / 105800
                         Acceleration    18864   484.900    0.122400
                         Deceleration    16919     2.410    0.005803
                         Cruise          39218    88.710    0.021250
           Definition 3  Idle            31691     9.131    0.006610    0.005293 = 559.8 / 105800
                         Acceleration    13402   410.100    0.147600
                         Deceleration    13035     1.861    0.005454
                         Cruise          47637   138.700    0.024440
           Definition 4  Idle            32375   15.5200    0.007365    0.005239 = 554 / 105800
                         Acceleration     8712  339.1000    0.179700
                         Deceleration     9681    0.7047    0.005049
                         Cruise          54997  198.7000    0.028560

HC         All modes                    103405   0.40270    0.0014960
           Definition 1  Idle            28780   0.09337    0.0009217   3.648e-006 = 0.3772 / 103400
                         Acceleration    25122   0.09143    0.0022310
                         Deceleration    22287   0.07644    0.0012180
                         Cruise          27216   0.11600    0.0016530
           Definition 2  Idle            30250   0.09492    0.0009176   3.629e-006 = 0.3752 / 103400
                         Acceleration    18330   0.06668    0.0023860
                         Deceleration    16805   0.05355    0.0011790
                         Cruise          38020   0.16010    0.0016680
           Definition 3  Idle            31157   0.09651    0.0009258   3.636e-006 = 0.376 / 103400
                         Acceleration    12999   0.04355    0.0025110
                         Deceleration    12970   0.04256    0.0011600
                         Cruise          46279   0.19330    0.0016890
           Definition 4  Idle            31849   0.09835    0.0009364   3.656e-006 = 0.378 / 103400
                         Acceleration     8443   0.02944    0.0026390
                         Deceleration     9613   0.03257    0.0011470
                         Cruise          53500   0.21760    0.0017120

                                     7.3 Conclusions

       Comparison of modal average estimates shows that the average modal emission rates are
statistically different from each other for three different pollutants. HTBR regression tree results
demonstrate that all four definitions can work well to divide the database.  Comparisons of re-
sidual mean deviance indicate that definition 1 has the smallest residual mean deviance for NOx
(definition 4 for CO and definition 2 for HC). However, differences were  small. At this time, it
is difficult to choose one definition for three pollutants based just on sensitivity analysis results
in this chapter.  The analysis results in this section indicate that driving mode definition could
not be transferred directly from one research study to another research study. A better approach
would be to test several  different critical values and obtain the most suitable definition instead of
testing only one definition developed from other research.  For this research, more analysis will
be performed in the chapters that follow to develop the most suitable driving mode definitions.
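The residual mean deviance figures compared above can be reproduced from the per-mode deviances in Table 7-4. The sketch below assumes the usual regression-tree conventions: the deviance of a leaf is the sum of squared deviations from the leaf mean, and the residual mean deviance divides the total leaf deviance by the number of records minus the number of leaves. The table prints that denominator rounded (for example, 106000). The function name and the toy data are hypothetical.

```python
def residual_mean_deviance(groups):
    """groups maps a mode name to its list of emission rates (one tree leaf per mode)."""
    total_deviance = 0.0
    n_records = 0
    for rates in groups.values():
        mean = sum(rates) / len(rates)
        total_deviance += sum((r - mean) ** 2 for r in rates)   # leaf deviance
        n_records += len(rates)
    return total_deviance / (n_records - len(groups))

# Toy example: two leaves, five records.
toy = {"idle": [1.0, 3.0], "cruise": [2.0, 2.0, 2.0]}
print(residual_mean_deviance(toy))

# Check against Table 7-4, NOx, Definition 1, using the tabulated leaf deviances.
leaf_deviances = [11.04, 320.90, 41.32, 365.10]
n_records, n_leaves = 105976, 4
rmd_nox_def1 = sum(leaf_deviances) / (n_records - n_leaves)
print(round(rmd_nox_def1, 6))   # compare with 0.006967 in the table
```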

                                      CHAPTER 8
                            8. IDLE MODE DEVELOPMENT

       In Chapter 7, the concept of driving modes was introduced and several sensitivity tests
(comparison of modal average estimates, comparison of HTBR regression tree results, and com-
parison of residual mean deviance) were performed for four different mode definitions.  Based
on sensitivity analysis results, it is difficult to choose one definition for three pollutants at this
moment. More analysis will be performed next to develop the most suitable driving mode defini-
tion. This chapter will focus on developing the suitable definition for idle mode.
       Theoretically, idle mode is usually defined as zero speed and zero acceleration. In real
world data collection efforts, this definition must be refined due to the presence of speed mea-
surement error. In this research, idle mode will be defined by speed and acceleration.  The criti-
cal value could not be deduced directly from previous research. It is better to test several critical
values statistically and identify the most suitable idle definition.

                        8.1 Critical Value for Speed  in Idle Mode

       Three critical values were tested to get the appropriate critical value for speed  in defin-
ing idle activity.  Figures 8-1 to 8-3 illustrate engine power vs. emission rates for three pollutants
for three critical speed values: 1 mph, 2.5  mph, and 5 mph. Figure 8-4 compares engine power
distributions for these three critical values. Because engine power distributions for the three
pollutants exhibit similar patterns, only NOx emissions are shown in Figure 8-4.  Tables 8-1 and 8-2
provide the engine power distribution for these three critical values in two ways: by number and
percentage.
                                           8-1

-------
                                                                        Speed <= t mph
                                                            _. 1.5
                                                            a
                                                            S
                200  250  300      -0   50   ,00  150  200  260
        En»mP«nr(bh|0                     EnjiM Pownr (hhp)
                                0  50  100  160  JOO
                                       Engim Power (blip)
 Figure 8-1 Engine Power vs. NO  Emission Rate for Three Critical Values
                                                                       Speed <= 1 mph
0  50  100  150  200  25C  300
       Engine Power (bhp)
0   50  100  150  200  250  300
       Enjms Power fbhp)
0  50  100  160  200  S50  30G
       Engine Power (bhp]
Figure 8-2 Engine Power vs.  CO Emission Rate for Three Critical Values
                                        8-2

-------
 [Figure: three panels (Speed <= 5 mph, Speed <= 2.5 mph, Speed <= 1 mph); x-axis: Engine Power (bhp)]
 Figure 8-3 Engine Power vs. HC Emission Rate for Three Critical Values

 [Figure: three panels, one per critical speed value; x-axis: Engine Power (bhp)]
 Figure 8-4 Engine Power Distribution for Three Critical Values based on NOx Emissions

-------
Table 8-1 Engine Power Distribution for Three Critical Values for Three Pollutants

                                Engine Power (brake horsepower (bhp))
Speed       Pollutant   [0, 20)   [20, 30)   [30, 40)   [40, 50)    > 50    Total
< 5 mph     NOx           31631       2272       1323        152    2348    37726
            CO            31258       2269       1316        149    2342    37334
            HC            30737       2264       1321        147    2284    36753
< 2.5 mph   NOx           29222       2098       1196         83    1143    33742
            CO            28880       2096       1189         81    1139    33385
            HC            28373       2093       1194         80    1106    32846
< 1 mph     NOx           27516       2011       1100         51     700    31378
            CO            27217       2010       1093         51     699    31070
            HC            26713       2007       1099         48     680    30547

Table 8-2 Percentage of Engine Power Distribution for Three Critical Values for Three Pollutants

                                Engine Power (brake horsepower (bhp))
Speed       Pollutant   [0, 20)   [20, 30)   [30, 40)   [40, 50)    > 50    Total
< 5 mph     NOx          83.84%      6.02%      3.51%      0.40%   6.22%     100%
            CO           83.73%      6.08%      3.52%      0.40%   6.27%     100%
            HC           83.63%      6.16%      3.59%      0.40%   6.21%     100%
< 2.5 mph   NOx          86.60%      6.22%      3.54%      0.25%   3.39%     100%
            CO           86.51%      6.28%      3.56%      0.24%   3.41%     100%
            HC           86.38%      6.37%      3.64%      0.24%   3.37%     100%
< 1 mph     NOx          87.69%      6.41%      3.51%      0.16%   2.23%     100%
            CO           87.60%      6.47%      3.52%      0.16%   2.25%     100%
            HC           87.45%      6.57%      3.60%      0.16%   2.23%     100%

       Based on the analysis above, a critical value of 5 mph includes more data points with
higher engine power (>50 bhp) than 2.5 mph and 1 mph (about 6% of data points, compared with
about 3% and 2%; see Table 8-2).  However, there is no large difference in the engine power
distributions between 2.5 mph and 1 mph.  These two critical values for speed will be tested
further with different acceleration values in the next section.  The results will be used to
make a final decision with regard to the idle mode definition.
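
The tabulation behind Tables 8-1 and 8-2 can be sketched as follows. This is a minimal illustration, not the code used in the study: the record layout (speed, engine power) and the names `power_distribution`, `BIN_EDGES`, and `BIN_LABELS` are assumptions, and treating exactly 50 bhp as part of the top bin is also an assumption, since the table heading reads "> 50".

```python
from bisect import bisect_right

# Bin edges in bhp, following the column headings of Tables 8-1 and 8-2.
# Assumption: a value of exactly 50 bhp is placed in the top bin.
BIN_EDGES = [20, 30, 40, 50]
BIN_LABELS = ["[0, 20)", "[20, 30)", "[30, 40)", "[40, 50)", ">= 50"]


def power_distribution(records, speed_limit_mph):
    """Count records below a critical speed in each engine-power bin.

    records: iterable of (speed_mph, engine_power_bhp) tuples (hypothetical layout).
    Returns (counts, percentages), two dicts keyed by bin label.
    """
    counts = {label: 0 for label in BIN_LABELS}
    for speed, power in records:
        if speed < speed_limit_mph:
            # bisect_right makes each bin closed on the left, open on the right.
            counts[BIN_LABELS[bisect_right(BIN_EDGES, power)]] += 1
    total = sum(counts.values())
    percentages = {
        label: (100.0 * n / total if total else 0.0) for label, n in counts.items()
    }
    return counts, percentages
```

Calling the function once per critical value (5, 2.5, and 1 mph) and per pollutant data set would give one row of counts and one row of percentages for each combination.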

                     8.2 Critical Value for Acceleration  in Idle Mode

       After setting the critical value for speed, the next step is to determine a critical value for
acceleration. In total, four options were tested.

       •  Option 1: speed < 2.5 mph and absolute acceleration < 2 mph/s

       •  Option 2: speed < 2.5 mph and absolute acceleration < 1 mph/s

       •  Option 3: speed < 1 mph and absolute acceleration < 2 mph/s

       •  Option 4: speed < 1 mph and absolute acceleration < 1 mph/s
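
The four candidate definitions above can be expressed as a small lookup of thresholds. The sketch below is illustrative only; the names `IDLE_OPTIONS`, `is_idle`, and `count_idle` and the (speed, acceleration) record layout are assumptions, not part of the study.

```python
# Candidate idle definitions: option -> (speed limit in mph, |acceleration| limit in mph/s).
IDLE_OPTIONS = {
    1: (2.5, 2.0),
    2: (2.5, 1.0),
    3: (1.0, 2.0),
    4: (1.0, 1.0),
}


def is_idle(speed_mph, accel_mph_s, option):
    """Return True if a one-second record satisfies the given idle option."""
    max_speed, max_abs_accel = IDLE_OPTIONS[option]
    return speed_mph < max_speed and abs(accel_mph_s) < max_abs_accel


def count_idle(records, option):
    """Count records classified as idle; records are (speed_mph, accel_mph_s) tuples."""
    return sum(1 for speed, accel in records if is_idle(speed, accel, option))
```

Because option 4 is the strictest definition and option 1 the loosest, any record accepted under option 4 is also accepted under the other three.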

       Using the same method as outlined in the previous section, Figures 8-5 to 8-7 illustrate
engine power vs. emission rates for the three pollutants for the four options above. Figure 8-8
compares the engine power distribution for data falling into these four options. Because the
engine power distributions for the three pollutants exhibit a similar pattern, only NOx emissions
are shown in Figure 8-8. Tables 8-3 and 8-4 provide the engine power distribution for the four
options in two ways: by number and percentage.
               [Figure: four panels (Option 1 to Option 4); x-axis: Engine Power (bhp)]
               Figure 8-5 Engine Power vs. NOx Emission Rate for Four Options

-------
       [Figure: four panels (Option 1 to Option 4); x-axis: Engine Power (bhp)]
       Figure 8-6 Engine Power vs. CO Emission Rate for Four Options

       [Figure: four panels (Option 1 to Option 4); x-axis: Engine Power (bhp)]
       Figure 8-7 Engine Power vs. HC Emission Rate for Four Options

-------
      [Figure: four histogram panels (Option 1 to Option 4); x-axis: Engine Power]
      Figure 8-8 Engine Power Distribution for Four Options based on NOx Emission Rates

Table 8-3 Engine Power Distribution for Four Options for Three Pollutants

                                Engine Power (brake horsepower (bhp))
Option      Pollutant   [0, 20)   [20, 30)   [30, 40)   [40, 50)    > 50    Total
Option 1    NOx           28694       2075       1177         78     693    32717
            CO            28366       2073       1170         76     690    32375
            HC            27855       2070       1175         75     674    31849
Option 2    NOx           27571       2030       1120         53     290    31064
            CO            27284       2028       1114         51     287    30764
            HC            26771       2026       1119         51     283    30250
Option 3    NOx           27367       1999       1091         50     527    31034
            CO            27071       1998       1084         50     526    30729
            HC            26569       1995       1090         47     512    30213
Option 4    NOx           26719       1969       1057         34     205    29984
            CO            26446       1968       1051         34     204    29703
            HC            25944       1966       1056         32     198    29196

-------
Table 8-4 Percentage of Engine Power Distribution for Four Options for Three Pollutants

                                Engine Power (brake horsepower (bhp))
Option      Pollutant   [0, 20)   [20, 30)   [30, 40)   [40, 50)    > 50     Total
Option 1    NOx          87.70%      6.34%      3.60%      0.24%   2.12%   100.00%
            CO           87.62%      6.40%      3.61%      0.23%   2.13%   100.00%
            HC           87.46%      6.50%      3.69%      0.24%   2.12%   100.00%
Option 2    NOx          88.76%      6.53%      3.61%      0.17%   0.93%   100.00%
            CO           88.69%      6.59%      3.62%      0.17%   0.93%   100.00%
            HC           88.50%      6.70%      3.70%      0.17%   0.94%   100.00%
Option 3    NOx          88.18%      6.44%      3.52%      0.16%   1.70%   100.00%
            CO           88.10%      6.50%      3.53%      0.16%   1.71%   100.00%
            HC           87.94%      6.60%      3.61%      0.16%   1.69%   100.00%
Option 4    NOx          89.11%      6.57%      3.53%      0.11%   0.68%   100.00%
            CO           89.03%      6.63%      3.54%      0.11%   0.69%   100.00%
            HC           88.86%      6.73%      3.62%      0.11%   0.68%   100.00%

       Based on the above analysis, data falling into option 2 and option 4 contain fewer data
points with higher engine power (>50 bhp) than data falling into option 1 and option 3. However,
no large difference is observed between the engine power distributions for option 2 and
option 4. Based upon these results, the idle mode is defined as speed < 2.5 mph and absolute
acceleration < 1 mph/s.

                   8.3 Emission Rate Distribution by Bus in Idle Mode

       After defining "speed < 2.5 mph and absolute acceleration < 1 mph/s" as idle mode, emission
rate histograms for each of the three pollutants for idle operations are presented in Figure
8-9. Figure 8-9 shows significant skewness for all three pollutants for idle mode.  Inter-bus
response variability for idle mode operations is illustrated in Figures 8-10 to 8-12 using the
median and mean of NOx, CO, and HC emission rates. Table 8-5 presents the same information in
tabular form. The difference between median and mean is also an indicator of skewness.
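
A per-bus summary of the kind reported in Table 8-5 can be sketched as below. The function name `summarize_by_bus` and the (bus ID, emission rate) record layout are hypothetical; the gap between mean and median is included because the text uses it as an informal skewness indicator.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_by_bus(records):
    """Compute median, mean, and their difference for each bus.

    records: iterable of (bus_id, emission_rate_g_per_s) tuples (hypothetical layout).
    A mean well above the median suggests a right-skewed distribution.
    """
    rates_by_bus = defaultdict(list)
    for bus_id, rate in records:
        rates_by_bus[bus_id].append(rate)
    summary = {}
    for bus_id, rates in rates_by_bus.items():
        med, avg = median(rates), mean(rates)
        summary[bus_id] = {"median": med, "mean": avg, "mean_minus_median": avg - med}
    return summary
```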

-------
           [Figure: three histograms; x-axes: NOx Emission Rate (g/s), CO Emission Rate (g/s), HC Emission Rate (g/s)]
           Figure 8-9 Histograms of Three Pollutants for Idle Mode

 [Figure: median and mean panels; x-axis: Bus No]
 Figure 8-10 Median and Mean of NOx Emission Rates in Idle Mode by Bus

-------
[Figure: median and mean panels; x-axis: Bus No]
Figure 8-11 Median and Mean of CO Emission Rates in Idle Mode by Bus

[Figure: median and mean panels; x-axis: Bus No]
Figure 8-12 Median and Mean of HC Emission Rates in Idle Mode by Bus

-------
Table 8-5 Median and Mean of Three Pollutants in Idle Mode by Bus (g/s)

                   NOx                     CO                      HC
Bus ID      Median      Mean        Median      Mean        Median     Mean
Bus 360     0.071020    0.059444    0.004830    0.009145    0.00072    0.002441
Bus 361     0.020455    0.020216    0.005740    0.008895    0.00063    0.000865
Bus 363     0.022555    0.032140    0.000670    0.005408    0.00007    0.000385
Bus 364     0.025050    0.026480    0.003110    0.003601    0.00071    0.000927
Bus 372     0.055210    0.054766    0.013150    0.011739    0.00220    0.002272
Bus 375     0.028880    0.035050    0.005390    0.013385    0.00076    0.001311
Bus 377     0.023370    0.025393    0.000960    0.001572    0.00019    0.000219
Bus 379     0.033210    0.038500    0.006730    0.011425    0.00085    0.001531
Bus 380     0.026200    0.027371    0.000930    0.001218    0.00024    0.000298
Bus 381     0.027115    0.028768    0.001915    0.004044    0.00020    0.000228
Bus 382     0.027605    0.036734    0.002980    0.009836    0.00034    0.000624
Bus 383     0.027790    0.027520    0.002290    0.002736    0.00065    0.000950
Bus 384     0.024210    0.026982    0.001205    0.003428    0.00043    0.000498
Bus 385     0.023750    0.024339    0.002590    0.005782    0.00043    0.000453
Bus 386     0.032140    0.030031    0.004860    0.006155    0.00055    0.000579

       Figures 8-10 to 8-12 and Table 8-5 illustrate that bus 372 has the largest median and
the second largest mean for CO and HC emissions, and the second largest median and the second
largest mean for NOx emissions.  The activity of bus 372 in terms of distribution of engine
power by bus was compared to that of other buses in an effort to identify why the emission rates
were significantly higher than for other buses.  Table 8-6 and Figure 8-13 show that bus 372 has
higher min (2nd), 1st quartile (2nd), median (1st), and 3rd quartile (2nd) engine power compared
to the other 14 buses.  Engine power in idle mode may include cooling fan, air compressor, air
conditioner, and alternator loads (Clark et al. 2005). Considering test buses and engines are simi-
lar in many ways, this difference might be caused by variability across the engines, or may be
associated with unrecorded air conditioner use. In  analyzing the database, the modeler could not
identify a  contribution of air conditioner to engine power in idle mode. So, model development
will include these data but readers should be cautioned that the noted variability is an indication
that significant numbers of vehicles may need to be tested in the future if such inter-engine dif-
ferences are significant in the fleet.  In addition, the role of air conditioning usage on engine load
in transit buses warrants additional future research.

-------
             [Figure: histogram panels, one per bus (Bus 360 to Bus 386); x-axis: engine power]
             Figure 8-13 Histograms of Engine Power in Idle Mode by Bus

-------
Table 8-6 Engine Power Distribution in Idle Mode by Bus

                         Engine Power (bhp)
Bus ID       Min    1st Quartile    Median    3rd Quartile       Max
Bus 360     3.92           15.36     18.70           19.83    135.43
Bus 361     0               5.35     12.52           13.83     89.47
Bus 363     0              13.10     13.34           15.16    152.94
Bus 364     0              13.18     13.85           14.99    154.51
Bus 372     0              26.44     31.84           33.10     79.08
Bus 375     0              12.52     13.81           18.08    167.72
Bus 377     0               8.50      9.17            9.85    166.86
Bus 379     0              15.86     17.15           19.42    126.64
Bus 380     2.67            7.85      8.49            9.17    100.99
Bus 381     0               8.70     10.49           11.17    148.28
Bus 382     0               7.35      8.52           13.89     99.04
Bus 383     0               7.16     10.03           12.50     91.86
Bus 384     0               6.01      7.34            8.51    117.39
Bus 385     0               4.53      7.19            8.51    139.05
Bus 386     4.68            9.18     13.33           14.46    105.44

                                    8.4 Discussions

8.4.1 High HC Emissions

       Figure 8-7 shows that there are some high HC emissions in idle mode.  Based on the idle
definition of "speed < 2.5 mph and absolute acceleration < 1 mph/s", 388 of the 30250 HC data
points in idle mode (1.28%) are high emissions.  These high emissions were noted in the HC
emissions data, not in NOx and CO.  All high HC emissions have been coded as high-idle to
determine if they are related to any other parameters.  Tree analysis could be used for this
screening analysis.  After screening engine speed, engine power, engine oil temperature, engine
oil pressure, engine coolant temperature, ECM pressure, and other parameters, no specific
operating parameters related to these high-idle emissions were identified.
       On the other hand, regression tree analysis results by bus and trip are presented in Figure
8-14.  The left figure shows that these high HC emissions occurred in buses 360 and 372, while the
right figure shows that they occurred in bus 360 trip 4 and bus 372 trip 1.  Even for HC
emissions, Figure 8-14 shows that these high emissions are not a common situation in idle mode.
There are 1529 idle segments in total for the 15 buses, but most of these high HC emissions came
from just three idle segments: bus 360 trip 4 idle segment 1 (130 seconds), bus 360 trip 4 idle
segment 38 (516 seconds), and bus 372 trip 1 idle segment 1 (500 seconds). More specifically,
bus 360 trip 4 idle segment 1 contains 102 high HC emissions, bus 360 trip 4 idle segment 38
contains 264, and bus 372 trip 1 idle segment 1 contains 13. Figures 8-15 to 8-17 present time
series plots of HC for these three idle segments, together with vehicle speed, engine speed,
engine power, engine oil temperature, engine oil pressure, engine coolant temperature, and ECM
pressure.  These figures do not include NOx and CO because NOx and CO do not show the patterns
seen for HC in these three idle segments.  The three idle segments contain 379 high HC emissions
in total, so about 98% of the 388 high emissions came from only three idle segments.  Excluding
these three idle segments on the basis of current information is difficult. The modeler prefers
to keep these data since these outliers might reflect variability in the real world. However,
future data collection efforts should seek to identify the causes of such events.
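
Attributing high emission events to individual idle segments can be sketched as follows. The report does not spell out how idle segments were delimited, so this sketch assumes a segment is a maximal run of consecutive one-second records flagged as idle; the function names and the flag-list inputs are hypothetical.

```python
from itertools import groupby


def idle_segments(idle_flags):
    """Return (start_index, length) for each maximal run of True flags.

    Assumption: an idle segment is a run of consecutive idle seconds.
    """
    segments = []
    index = 0
    for flag, run in groupby(idle_flags):
        length = sum(1 for _ in run)
        if flag:
            segments.append((index, length))
        index += length
    return segments


def high_counts_by_segment(idle_flags, high_flags):
    """Count records flagged as high emissions inside each idle segment."""
    return [
        sum(1 for is_high in high_flags[start:start + length] if is_high)
        for start, length in idle_segments(idle_flags)
    ]
```

With the counts reported above, the three segments account for 102 + 264 + 13 = 379 of the 388 high HC records, or roughly 97.7%.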
        [Figure: two regression tree panels, "By Bus" and "By Trip"; leaf labels include Bus 372 trip 1 and Bus 360 trip 4]
        Figure 8-14 Tree Analysis Results for High HC Emission Rates by Bus and Trip

        Figure 8-15 Time Series Plot for Bus 360 Trip 4 Idle Segment 1 (130 Seconds)

-------
        Figure 8-16 Time Series Plot for Bus 360 Trip 4 Idle Segment 38 (516 Seconds)

        Figure 8-17 Time Series Plot for Bus 372 Trip 1 Idle Segment 1 (500 Seconds)

8.4.2 High Engine Operating Parameters

       Figure 8-15 shows that engine speed once jumped to about 2000 rpm during bus 360 trip
4 idle segment 1, while corresponding engine power and engine oil pressure jumped, too. This
jump lasted only 9 seconds. There are several reasons which might be responsible for this jump.
Possibly bus 360 moved slowly from one location to another location while the GPS failed to
detect the movement.  Other explanations might be that the engine experienced a computer or
sensor problem. This kind of jump, higher engine speeds (about 2000 rpm) accompanied by
higher engine power and engine oil pressure in idle mode, did occur in the real world. The jump
shown in Figure 8-16 was not such an occurrence since engine speed was only about 1000 rpm
during that jump. After screening the whole dataset, another example of a jump is shown in Figure
8-18. The jump in bus 383 trip 1 idle segment 12 lasts 28 seconds. Since there are only two
observations of such jumps in the whole database, there are not enough data to assess whether
they constitute a new mode.  These observations might indicate that one should pay attention
to slow movement during an idle segment. Although these two idle segments show some unusual
activity, the modeler retains them to avoid biasing the results.
       Figure 8-18 Time Series Plot for Bus 383 Trip 1 Idle Segment 12 (1258 Seconds)

                           8.5 Idle Emission Rates Estimation

       Based on the definition of "speed < 2.5 mph and absolute acceleration < 1 mph/s", about 30%
of available data are classified as idle mode. Usually, modelers estimate the idle emission rate
by averaging all emission rates in idle mode. Although there are some data points with higher
engine power (> 50 bhp) in idle mode, about 90% of data in idle mode exhibit engine power between
0 and 20 bhp. Detailed analysis of all idle segments using time series plots suggests that some
data may be incorrectly classified as idle mode, but no anomalies were noted.  To avoid
introducing any significant bias, a single idle emission rate is developed for each pollutant.
When all data are pooled and treated as a whole, the mean and confidence interval can reflect
the distribution of emission rates in the real world. Table 8-7 provides idle mode statistical
analysis results for NOx, CO, and HC.

-------
Table 8-7 Idle Mode Statistical Analysis Results for NOx, CO, and HC

                   NOx          CO           HC
minimum            0.00121      0.00002      0.00001
1st Quartile       0.02201      0.00120      0.00026
mean               0.03342      0.00594      0.00092
median             0.02670      0.00293      0.00051
3rd Quartile       0.03549      0.00554      0.00079
maximum            0.40259      0.48118      0.05232
skewness           4.45050      13.1840      11.6100
Total Number       31064        30764        30250

       Due to the non-normality of emission rates, the median value (the value that divides
observations into an upper and lower half) and the inter-quartile range (the range of values that
includes the middle 50% of the observations) are the most appropriate for describing the
distribution. The mean and skewness for the original data are presented in Table 8-7 as well.
Although transformations for the three pollutants were already discussed for the whole data set
in Chapter 6, the lambdas chosen by the Box-Cox procedure for the whole data set and for idle
mode are different.  The lambdas for the whole data set are 0.22875 for NOx, -0.0648 for CO, and
0.14631 for HC, while the lambdas for idle mode are -0.19619 for NOx, -0.0625 for CO, and
0.002875 for HC.  At the same time, using a transformation to estimate the mean and construct
confidence intervals will create other problems.  Therefore the modeler uses the bootstrap,
another general class of methods, to obtain the estimates and construct confidence intervals.
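
One difficulty with estimating a mean on a transformed scale can be illustrated with the standard Box-Cox transform. The sketch below is an illustration under assumptions, not the procedure used in the study: the function names are hypothetical, and the point it shows (averaging on the transformed scale and transforming back does not generally return the arithmetic mean) is only one of the possible problems the text alludes to.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)


def back_transformed_mean(values, lam):
    """Average on the transformed scale, then transform back.

    For skewed data this generally differs from the arithmetic mean.
    """
    transformed = [box_cox(v, lam) for v in values]
    return inverse_box_cox(sum(transformed) / len(transformed), lam)
```

For example, with lam = 0 the back-transformed mean of 1.0 and 4.0 is their geometric mean (about 2.0), not the arithmetic mean of 2.5.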
       The bootstrap is a procedure that involves choosing random samples with replacement
from a data set and analyzing each sample the same way (Li 2004).  To obtain the 95% confidence
interval, the simple method is to take the 2.5% and 97.5% percentiles of the P replications
T1, T2, ..., TP as the lower and upper bounds, respectively. The bootstrap function in this
study resamples the emission data 1000 times and computes the mean and the 2.5% and 97.5%
percentiles of each sample.  Results are presented in Figure 8-20 and Table 8-8.
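
The percentile bootstrap described above can be sketched as follows. This is a minimal illustration under assumptions, not the study's implementation: the function names, the fixed random seed, the linear-interpolation percentile rule, and the use of the average of the replicates as the point estimate are all choices made here for illustration.

```python
import random
from statistics import mean


def percentile(values, fraction):
    """Percentile by linear interpolation; fraction is between 0 and 1."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def bootstrap(values, statistic, replications=1000, seed=0):
    """Resample with replacement and summarize a statistic.

    Returns (estimate, lower, upper), where estimate is the average of the
    replicate statistics and (lower, upper) are their 2.5% and 97.5% percentiles.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    replicates = [statistic(rng.choices(values, k=n)) for _ in range(replications)]
    return mean(replicates), percentile(replicates, 0.025), percentile(replicates, 0.975)
```

Passing `mean` as the statistic gives an interval for the mean emission rate; passing, for example, `lambda s: percentile(s, 0.975)` gives the corresponding summary for the 97.5% percentile, which matches the three statistics named in the text.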
           [Figure: schematic with three stages labeled Original data, Resampling, and Bootstrap Statistic]
           Figure 8-19 Graphical Illustration of Bootstrap (Adopted from Li 2004)

-------
               [Figure: histogram panels; x-axes include Mean of NOx Emission Rate (g/s) and Mean of CO Emission Rate (g/s)]
               Figure 8-20 Bootstrap Results for Idle Emission Rate Estimation

Table 8-8 Idle Emission Rates Estimation and 95% Confidence Intervals Based on Bootstrap

                              Average                       2.5% Percentile                 97.5% Percentile
NOx   Estimation              0.033415                      0.010754                        0.083266
      Confidence Interval     (0.033162, 0.033669)          (0.010509, 0.010998)            (0.082279, 0.084252)
CO    Estimation              0.0059439                     0.00036116                      0.028429
      Confidence Interval     (0.0058184, 0.0060693)        (0.00034446, 0.00037775)        (0.028083, 0.028775)
HC    Estimation              0.00091777                    0.000059167                     0.0037260
      Confidence Interval     (0.00089742, 0.00093811)      (0.000047572, 0.000070763)      (0.0036412, 0.0038108)

       Based on Table 8-8, the modeler recommends idle emission rates of 0.033415 g/s for NOx
with 95% confidence interval (0.010754, 0.083266), 0.0059439 g/s for CO with 95% confidence
interval (0.00036116, 0.028429), and 0.00091777 g/s for HC with 95% confidence interval
(0.000059167, 0.0037260).

                      8.6 Conclusions and Further Considerations

       In this research, idle mode is defined as "speed < 2.5 mph and absolute acceleration < 1
mph/s". However, critical values reported in other research could not be adopted directly in
this research.  It is more appropriate to test several critical values and select the most
suitable one than to test only one value developed in other research.
       Inter-bus variability analysis results indicate that bus 372 has the largest or second
largest median and mean for NOx, CO, and HC emissions (Table 8-5). Meanwhile, bus 372 has higher
minimum (2nd), 1st Quartile (2nd), median (1st), and 3rd Quartile (2nd) engine power by
comparison to the other 14 buses.  Since test
buses and engines are similar in most ways, this difference might be caused by variability of
the engines or air conditioner usage.  However, the contribution of the air conditioner to engine
power in idle mode could not be identified in the database.  Future research regarding the role of
the air conditioner on engine power and emission rates in idle mode may be able to detect a dif-
ference.
       Although some trips or some buses have higher mean and standard deviation than others,
this kind of variability will decrease when all data in idle mode are  treated as a whole. On the
other hand, some elevated emissions events may simply reflect real world  variability. Without
additional evidence, modelers should treat all data as a whole instead of removing outliers and
potentially biasing results.
       There are two observations of an emissions jump that appears to be unrelated to engine
speed, engine power, and  engine oil temperature,  in a single idle segment.  The modeler first as-
sumed that the bus moved too slowly from one location to another location for the GPS/ECM to
detect the movement. Other explanations might be an engine computer problem or sensor prob-
lem. These two jumps might be  evidence to support further research on slow movements during
idle segments.
       In summary, the modeler recommends idle emission rates of 0.033415 g/s for NOx with
95% confidence interval (0.010754, 0.083266), 0.0059439 g/s for CO with 95% confidence interval
(0.00036116, 0.028429), and 0.00091777 g/s for HC with 95% confidence interval (0.000059167,
0.0037260).

-------
                                      CHAPTER 9
                     9. DECELERATION MODE DEVELOPMENT
       Chapter 7 introduced the concept of driving mode into the study and several sensitivity
tests were performed for four different definitions, including comparison of modal average emis-
sion rate estimates, HTBR regression tree results, and residual mean deviance. After developing
the idle mode definition and emission rate in Chapter 8, the next task is dividing the rest of the
vehicle activity data into driving mode (deceleration, acceleration and cruise) for further analy-
sis. The deceleration mode is examined first.

              9.1  Critical Value for Deceleration Rates in Deceleration Mode

       The first task related to analysis of emission rates in the deceleration mode is  identify-
ing critical values for deceleration.  The literature indicates that critical values of -1 mph/s and -2
mph/s should be examined. Because the critical value of "acceleration < -1 mph/s" also includes
all data that conform with a critical value of "acceleration < -2 mph/s", comparison of data that
fall between these two potential cut points is first performed.  In summary, these three decelera-
tion bins for analysis include:
       •      Option 1: acceleration < -2 mph/s
       •      Option 2: acceleration > -2 mph/s & acceleration < -1 mph/s
       •      Option 3: acceleration > -1 mph/s & acceleration < 0 mph/s
       If the critical value is set as -1 mph/s for deceleration mode, data falling into option 1 and
option 2 will be classified as deceleration mode while data falling into option 3 will be classified
as cruise mode. If the critical value is set as -2 mph/s for deceleration mode, data falling into op-
tion 1 will be classified as deceleration mode while data falling into option 2 and option 3 will be
classified as cruise mode.
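
The relationship between the three deceleration bins and the two candidate critical values can be sketched as below. The function names are hypothetical, and the handling of accelerations exactly equal to -2 or -1 mph/s is an assumption, since the bin definitions above use strict inequalities on both sides.

```python
def deceleration_option(accel_mph_s):
    """Assign a record to deceleration bin 1, 2, or 3, or None if not decelerating.

    Assumption: boundary values (-2 and -1 mph/s) fall into the milder bin.
    """
    if accel_mph_s < -2.0:
        return 1
    if accel_mph_s < -1.0:
        return 2
    if accel_mph_s < 0.0:
        return 3
    return None


def is_deceleration_mode(accel_mph_s, critical_value_mph_s):
    """Return True if the record counts as deceleration mode under a critical value."""
    return accel_mph_s < critical_value_mph_s
```

Under a critical value of -1 mph/s, records in bins 1 and 2 count as deceleration mode; under -2 mph/s, only bin 1 does.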

-------
       Figure 9-1 illustrates engine power distribution for these three options. Figures 9-2 to 9-4
compare engine power vs. emission rate for three pollutants for three options.  Tables 9-1 and 9-2
provide the distribution for these three options in two ways: by number and percentage.

Table 9-1 Engine Power Distribution for Three Options for Three Pollutants

                                  Engine Power (brake horsepower (bhp))
Deceleration   Pollutant   [0, 20)   [20, 30)   [30, 40)   [40, 50)    > 50    Total
Option 1       NOx            9322         94         16          5      15     9452
               CO             9558         89         15          4      15     9681
               HC             9483         94         16          5      15     9613
Option 2       NOx            6748        127        101         42     174     7192
               CO             6800        126         99         42     171     7238
               HC             6754        125         99         42     172     7192
Option 3       NOx            6806        950       1062        562    4353    13733
               CO             6782        949       1061        558    4326    13676
               HC             6705        921       1044        541    4212    13423

Table 9-2 Percentage of Engine Power Distribution for Three Options for Three Pollutants

                                  Engine Power (brake horsepower (bhp))
Deceleration   Pollutant   [0, 20)   [20, 30)   [30, 40)   [40, 50)    > 50    Total
Option 1       NOx           98.6%       1.0%       0.2%       0.1%    0.2%   100.0%
               CO            98.7%       0.9%       0.2%       0.0%    0.2%   100.0%
               HC            98.6%       1.0%       0.2%       0.1%    0.2%   100.0%
Option 2       NOx           93.8%       1.8%       1.4%       0.6%    2.4%   100.0%
               CO            93.9%       1.7%       1.4%       0.6%    2.4%   100.0%
               HC            93.9%       1.7%       1.4%       0.6%    2.4%   100.0%
Option 3       NOx           49.6%       6.9%       7.7%       4.1%   31.7%   100.0%
               CO            49.6%       6.9%       7.8%       4.1%   31.6%   100.0%
               HC            50.0%       6.9%       7.8%       4.0%   31.4%   100.0%

-------
              [Figure: three histogram panels (Option 1 to Option 3); x-axis: Engine Power (bhp)]
              Figure 9-1 Engine Power Distribution for Three Options

        [Figure: three panels (Option 1 to Option 3); x-axis: Engine Power (bhp)]
        Figure 9-2 Engine Power vs. NOx Emission Rate for Three Options

-------
      [Figure: three panels (Option 1 to Option 3); x-axis: Engine Power (bhp)]
      Figure 9-3 Engine Power vs. CO Emission Rate for Three Options

      [Figure: three panels (Option 1 to Option 3); x-axis: Engine Power (bhp)]
      Figure 9-4 Engine Power vs. HC Emission Rate for Three Options

-------
       In the above figures and tables, there is little difference between the engine power
distributions for data falling into option 1 and option 2, while the power distribution for
option 3 is clearly different from both.  Tables 9-1 and 9-2 show that engine power is
concentrated in the lower engine power regime (< 20 bhp) for options 1 and 2 (about 99% and 94%
of data points, respectively), compared with about 50% for option 3.  Treating options 1 and 2
together as deceleration mode therefore better reflects the power demand of the vehicle in the
real world. Hence, the critical value is set to -1 mph/s for deceleration mode.

                          9.2 Analysis of Deceleration Mode Data

9.2.1  Emission Rate Distribution by Bus in Deceleration Mode

       After defining vehicle activity data with "acceleration < -1 mph/s" as deceleration mode,
emission rate histograms for each of the three pollutants for deceleration operations are presented
in Figure 9-5.  Figure 9-5 shows significant skewness for all three pollutants for deceleration
mode. Inter-bus emission rate variability is illustrated by plotting median and mean NOx, CO,
and HC emission rates in deceleration mode for each bus in Figures 9-6 to 9-8 and Table 9-3.
The difference between median and mean is also an indicator of skewness.
               [Figure: three histograms; x-axes: NOx Emission Rate (g/s), CO Emission Rate (g/s), HC Emission Rate (g/s)]
               Figure 9-5 Histograms of Three Pollutants for Deceleration Mode

-------
[Figure: median and mean panels; x-axis: Bus No]
Figure 9-6 Median and Mean of NOx Emission Rates in Deceleration Mode by Bus

[Figure: median and mean panels; x-axis: Bus No]
Figure 9-7 Median and Mean of CO Emission Rates in Deceleration Mode by Bus
                                      9-6

-------
       Figure 9-8 Median and Mean of HC Emission Rates in Deceleration Mode by Bus
Table 9-3 Median and Mean for NOx, CO, and HC in Deceleration Mode by Bus (g/s)
                    NOx                    CO                     HC
   Bus ID     Median     Mean        Median     Mean        Median     Mean
   Bus 360    0.00325    0.01998     0.00502    0.00814     0.00040    0.00097
   Bus 361    0.00624    0.02206     0.00384    0.00535     0.00079    0.00095
   Bus 363    0.00483    0.01952     0.00446    0.00486     0.00004    0.00008
   Bus 364    0.00324    0.01255     0.00474    0.00586     0.00551    0.00613
   Bus 372    0.00437    0.01924     0.00578    0.00803     0.00161    0.00229
   Bus 375    0.00499    0.01997     0.00410    0.00567     0.00066    0.00085
   Bus 377    0.00414    0.01940     0.00317    0.00630     0.00034    0.00040
   Bus 379    0.02664    0.03457     0.00397    0.00522     0.00078    0.00103
   Bus 380    0.00525    0.01914     0.00359    0.00716     0.00060    0.00072
   Bus 381    0.01666    0.02420     0.00369    0.00452     0.00034    0.00038
   Bus 382    0.01214    0.03541     0.00450    0.00564     0.00073    0.00083
   Bus 383    0.00741    0.02385     0.00322    0.00452     0.00128    0.00172
   Bus 384    0.00828    0.02869     0.00259    0.00411     0.00113    0.00127
   Bus 385    0.02066    0.02118     0.00377    0.00585     0.00088    0.00086
   Bus 386    0.00341    0.01786     0.00406    0.00583     0.00091    0.00120
                                         9-7

-------
       Figures 9-6 to 9-8 and Table 9-3 illustrate that bus 379 has the largest median and the sec-
ond largest mean for NOx emissions, bus 372 has the largest median and the second largest mean
for CO emissions, while bus 364 has the largest median and mean for HC emissions. At the
same time, bus 382 has the largest mean for NOx emissions, and bus 360 has the largest mean for
CO emissions. The above figures and table demonstrate that although variability exists among
buses, it is difficult to determine which, if any, bus is a high emitter (i.e., a bus that exhibits ex-
tremely high emission rates under all operating conditions, which also may exhibit significantly
different emissions responses to operating activity than normal emitters).
       A small number of very high HC emissions events were also noted in deceleration mode.
Based on the definition "acceleration < -1 mph/s", 242 of 16237 data points (1.49%) in
deceleration mode for HC are high emissions events. This happened only for HC; it did not occur
for NOx or CO. All high HC emissions events were coded to determine whether they are related to
any other parameters. Tree analysis could be used for this screening analysis. After screening
engine speed, engine power, engine oil temperature, engine oil pressure, engine coolant
temperature, ECM pressure, and other parameters, no operating parameters appeared to be
correlated with these high emissions events.
       High HC emissions distribution by bus and trip are presented in Table 9-4.  Unlike idle
mode, where high HC emissions occurred mainly in three idle segments (bus 360, trip 4, idle
segment 1; bus 360, trip 4, idle segment 38; and bus 372, trip 1, idle segment 1), high HC
emissions in deceleration mode are dispersed among seven different buses and 18 different trips.
Although there is not enough evidence to suggest that a specific bus is a "high emitter", bus 364
is worthy of additional attention. There are 5284 data points for bus 364 and, among them, 887
are classified as deceleration mode. There are 408 high HC emissions data points for bus 364
overall, 193 of which fall in deceleration mode. The percentage of high HC emissions for bus 364
is 7.72% (408/5284), while the percentage of high HC emissions for bus 364 in deceleration mode
is about 22% (193/887). Given the limited available data, no conclusion could be drawn about high
HC emissions in deceleration mode.  These potential outliers may simply reflect real-world
emissions variability for these engines.
       Emission rate behavior as a function of operating mode and power for high-emitting ve-
hicles may differ significantly from normal-emitting vehicles.  Since no high-emitting vehicle is
identified in the AATA data set, it is impossible for the modeler to examine such a difference.  To
ensure that models are applicable to normal and high-emitters in the fleet, models have to have
both normal and high-emitters available in the analytical data set. Thus it is important to identify
high-emitting vehicles and bring them in for testing.
                                          9-8

-------
Table 9-4 High HC Emissions Distribution by Bus and Trip for Deceleration Mode
   Bus ID             Number of High HC Events
   Bus 360            11
   Bus 361            1
   Bus 364            193
   Bus 372            19
   Bus 383            11
   Bus 384            1
   Bus 386            6

   Bus and Trip       Number of High HC Events
   Bus 360, trip 3    3
   Bus 360, trip 4    8
   Bus 361, trip 5    1
   Bus 364, trip 1    46
   Bus 364, trip 2    61
   Bus 364, trip 3    86
   Bus 372, trip 1    6
   Bus 372, trip 2    4
   Bus 372, trip 3    3
   Bus 372, trip 4    6
   Bus 383, trip 1    3
   Bus 383, trip 2    3
   Bus 383, trip 3    2
   Bus 383, trip 4    3
   Bus 384, trip 3    1
   Bus 386, trip 1    1
   Bus 386, trip 2    2
   Bus 386, trip 4    3
9.2.2  Engine Power Distribution by Bus in Deceleration Mode

       Engine power distribution by bus is shown in Figure 9-9 and Table 9-5. When the bus is
decelerating, the engine typically absorbs energy, yielding low or even negative engine power.
Table 9-5 reflects this characteristic of deceleration mode. According to the Sensors, Inc.
report (Ensfield 2001), negative engine power is recorded as zero power in the data, which
explains the large number of zero power values in the deceleration mode. The emission rates
under negative engine power conditions may be significantly different from those under positive
engine power.  Further analysis will examine this question. Moreover, bus 372 has the greatest
3rd Quartile engine power in deceleration mode, consistent with the finding in idle mode.
                                          9-9

-------
Table 9-5 Engine Power Distributions in Deceleration Mode by Bus (bhp)
   Bus No     Minimum    1st Quartile    Median    3rd Quartile    Maximum
   Bus 360    0          0               0         3.88            275.40
   Bus 361    0          0               0         5.16            173.10
   Bus 363    0          0               0         6.70            274.90
   Bus 364    0          0               0         0               254.30
   Bus 372    0          0               0         20.41           112.00
   Bus 375    0          0               0         5.84            274.90
   Bus 377    0          0               0         3.33            275.10
   Bus 379    0          0               0         11.77           164.90
   Bus 380    0          0               0         5.19            29.40
   Bus 381    0          0               0         7.19            121.15
   Bus 382    0          0               0         5.84            20.75
   Bus 383    0          0               0         8.51            94.65
   Bus 384    0          0               0         5.86            162.37
   Bus 385    0          0               0         6.00            102.59
   Bus 386    0          0               0         7.18            42.20
                                         9-10

-------
       [Figure 9-9 panels: one engine power histogram per bus; x-axis engine power, 0 to 300 bhp]
       Figure 9-9 Histograms of Engine Power in Deceleration Mode by Bus
                                                 9-11

-------
       Based on definitions of "acceleration < -1 mph/s", about 1% of data points with high
engine power (>50 bhp) fall in deceleration mode (Table 9-1). Figure 9-10 illustrates plots of
engine power vs. vehicle speed, engine power vs. engine speed, and vehicle speed vs. engine
speed. Figure 9-10 shows that higher engine power always occurred with higher vehicle speed
and higher engine speed. These data points with higher engine power likely reflect the variabil-
ity of the real world and are all retained in the data set and mode definition to avoid potentially
biasing results.
       [Figure 9-10 panels: deceleration mode data points with higher engine power (> 50 bhp)]
   Figure 9-10 Engine Power vs. Vehicle Speed, Engine Power vs. Engine Speed, and Vehicle
                                  Speed vs. Engine Speed
                          9.3 The Deceleration Motoring Mode

       Bus engines absorb energy during the deceleration mode, resulting in low or negative en-
gine power.  According to the Sensors, Inc. report (Ensfield 2001), such negative power was re-
corded as zero power. The emissions under these negative engine power conditions may be sig-
nificantly different from those under positive engine power conditions, and therefore may need to
be included in the modeling regime as a separate mode of operation.  To examine this possibility,
deceleration mode data were split into two mode bins for analysis. The first bin includes all data
points with zero engine power in deceleration mode, termed 'deceleration motoring mode.'  The
remaining data in the deceleration mode, which exhibit positive engine power, are classified as
deceleration non-motoring mode. The analysis will begin as a comparison of histograms of three
pollutants between deceleration motoring mode and deceleration non-motoring mode (Figure
9-11). Table 9-6 compares the mean, median, and skewness of emission distributions between
these two modes for the three pollutants.  The statistical results for all deceleration data are also
presented as a reference. Figure 9-11 and Table 9-6 show that lower emission rates are more
prevalent in the deceleration motoring mode than in the deceleration non-motoring mode. Skew-
ness of emission distributions for deceleration motoring mode is also smaller.
 Figure 9-11 Histograms for Three Pollutants in Deceleration Motoring Mode (a) and
                         Deceleration Non-Motoring Mode (b)
       To test the differences between deceleration motoring mode and deceleration non-motoring
mode, a Kolmogorov-Smirnov two-sample test was chosen rather than a standard t-test, because the
normal distribution assumption was questionable. The Kolmogorov-Smirnov two-sample test is a
test of the null hypothesis that two independent samples have been drawn from the same
population (or from populations with the same distribution). The test uses the maximal
difference between the cumulative frequency distributions of the two samples as the test
statistic. Results of the Kolmogorov-Smirnov two-sample tests demonstrate that the differences
in emission rates under deceleration motoring mode and deceleration non-motoring mode are
statistically significant.
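
The test statistic described above can be sketched in a few lines. The example below is an illustration only: the function name `ks_two_sample_statistic` and the two small samples are hypothetical, and the sketch computes only the statistic D (the maximal gap between the two empirical cumulative distributions), not the p-value used to judge significance.

```python
from bisect import bisect_right

def ks_two_sample_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d_max = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the difference at every observed data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max

# Hypothetical emission rates (g/s) for the two sub-modes.
motoring = [0.001, 0.002, 0.003, 0.004, 0.008]
non_motoring = [0.020, 0.030, 0.035, 0.050, 0.060]
d = ks_two_sample_statistic(motoring, non_motoring)
print(f"D = {d:.2f}")
```

Because the two hypothetical samples do not overlap, D reaches its maximum value of 1; identical samples would give D = 0. In practice a statistics library routine (for example `scipy.stats.ks_2samp`) would normally be used to obtain both the statistic and its p-value.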
                                          9-13

-------
Table 9-6 Comparison of Emission Distributions between Deceleration Mode and Two Sub-Modes
(Deceleration Motoring Mode and Deceleration Non-Motoring Mode)
                     NOx          CO           HC
Deceleration Mode
   Number            16644        16919        16805
   Minimum           0.00001      0.00001      0.00001
   1st Quartile      0.00182      0.00249      0.00039
   Median            0.00611      0.00398      0.00068
   3rd Quartile      0.03155      0.00605      0.00120
   Maximum           1.30640      0.85208      0.04200
   Mean              0.02215      0.00580      0.00118
   Skewness          6.02890      30.6459      5.76530
Sub-mode 1: Deceleration Motoring Mode
   Number            10925        11304        11240
   Minimum           0.00001      0.00001      0.00001
   1st Quartile      0.00124      0.00269      0.00041
   Median            0.00272      0.00401      0.00067
   3rd Quartile      0.00816      0.00567      0.00110
   Maximum           0.14930      0.20366      0.01425
   Mean              0.00978      0.00528      0.00111
   Skewness          3.08780      12.27120     3.92760
Sub-mode 2: Deceleration Non-Motoring Mode
   Number            5719         5615         5565
   Minimum           0.00002      0.00003      0.00001
   1st Quartile      0.01973      0.00204      0.00034
   Median            0.03431      0.00384      0.00069
   3rd Quartile      0.05658      0.00741      0.00150
   Maximum           1.30640      0.85208      0.04200
   Mean              0.04576      0.00685      0.00131
   Skewness          5.7018       26.8539      6.8026
                                         9-14

-------
                       9.4 Deceleration Emission Rate Estimations

       Using the "acceleration < -1 mph/s" cutpoint, about 16% of total data collected are clas-
sified in the deceleration mode.  While deceleration emission rates could simply be estimated
directly by averaging all deceleration mode emission rates, the emission rate distribution is non-
normal. Because lambdas identified by the Box-Cox procedure for the whole dataset and decel-
eration mode subsets are different, and because using a transformation to estimate the mean and
construct confidence intervals will create other problems, the bootstrap (another class of general
methods) was used for estimation of the mean and for construction of confidence intervals.  The
bootstrap function in this study resampled the emission rate data 1000 times and computed the
mean, 2.5%, and 97.5% percentile of each sample.
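
The resampling procedure can be sketched as follows. This is one possible reading of the description above, not the study's actual bootstrap function: each resample draws the same number of observations with replacement and records its mean, 2.5th percentile, and 97.5th percentile, and each of those three statistics is then summarized across resamples by its average and a percentile interval. The names `percentile` and `bootstrap_summary`, the fixed seed, and the sample data are all hypothetical.

```python
import random
from statistics import mean

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def bootstrap_summary(rates, n_resamples=1000, seed=0):
    """Map each statistic to (estimate, interval lower, interval upper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(rates)
    stats = {"mean": [], "p2.5": [], "p97.5": []}
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original data.
        resample = sorted(rng.choices(rates, k=n))
        stats["mean"].append(mean(resample))
        stats["p2.5"].append(percentile(resample, 2.5))
        stats["p97.5"].append(percentile(resample, 97.5))
    summary = {}
    for name, values in stats.items():
        values.sort()
        summary[name] = (mean(values), percentile(values, 2.5), percentile(values, 97.5))
    return summary

# Hypothetical skewed emission rates (g/s), repeated to give 40 observations.
example_rates = [0.002, 0.003, 0.003, 0.004, 0.005, 0.006, 0.040, 0.120] * 5
summary = bootstrap_summary(example_rates)
for name, (est, lower, upper) in summary.items():
    print(f"{name}: {est:.5f} ({lower:.5f} to {upper:.5f})")
```

No transformation of the data is needed, which is the reason given above for preferring the bootstrap over a Box-Cox approach when the distribution is non-normal.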
       The results of the bootstrap analyses indicate that splitting the deceleration mode into
deceleration motoring mode and deceleration non-motoring mode using the zero engine power
criterion is warranted. The bootstrap distributions of mean emission rates for deceleration mode,
deceleration motoring mode, and deceleration non-motoring mode are presented in Figures 9-12
to 9-14 and Table 9-7. To illustrate the difference in emission rate estimation between
deceleration motoring mode and deceleration non-motoring mode, Figure 9-15 presents bootstrap
means and confidence intervals for the emission rates of all three pollutants.  For reference
purposes, deceleration mode emission rate estimations are also presented.  Table 9-7 and Figure
9-15 show that the average emission rate for the deceleration motoring mode is much lower than
that for deceleration non-motoring mode for all pollutants, especially for NOx.
                                          9-15

-------
Figure 9-12 Bootstrap Results for NOx Emission Rate Estimation in Deceleration Mode
 Figure 9-13 Bootstrap Results for CO Emission Rate Estimation in Deceleration Mode
 Figure 9-14 Bootstrap Results for HC Emission Rate Estimation in Deceleration Mode
   Figure 9-15 Emission Rate Estimation Based on Bootstrap for Deceleration Mode
                                                     9-17

-------
Table 9-7 Emission Rate Estimation and 95% Confidence Intervals Based on Bootstrap for
Deceleration Mode (g/s)
                                 Average               2.5% Percentile       97.5% Percentile
Deceleration Mode
   NOx   Estimation              0.02215               0.00024               0.10919
         Confidence Interval     0.02161 to 0.02268    0.00022 to 0.00027    0.10427 to 0.11411
   CO    Estimation              0.00580               0.00055               0.02191
         Confidence Interval     0.00562 to 0.00598    0.00051 to 0.00059    0.02067 to 0.02314
   HC    Estimation              0.00118               0.00004               0.00652
         Confidence Interval     0.00115 to 0.00121    0.00004 to 0.00004    0.00626 to 0.00679
Deceleration Motoring Mode
   NOx   Estimation              0.00978               0.00017               0.06540
         Confidence Interval     0.00945 to 0.01010    0.00015 to 0.00019    0.06306 to 0.06774
   CO    Estimation              0.00529               0.00072               0.01743
         Confidence Interval     0.00514 to 0.00543    0.00068 to 0.00075    0.01635 to 0.01850
   HC    Estimation              0.00111               0.00004               0.00652
         Confidence Interval     0.00109 to 0.00114    0.00004 to 0.00004    0.00621 to 0.00683
Deceleration Non-Motoring Mode
   NOx   Estimation              0.04578               0.00173               0.17187
         Confidence Interval     0.04457 to 0.04698    0.00152 to 0.00195    0.16343 to 0.18031
   CO    Estimation              0.00686               0.00037               0.02846
         Confidence Interval     0.00643 to 0.00728    0.00033 to 0.00040    0.02587 to 0.03104
   HC    Estimation              0.00131               0.00004               0.00650
         Confidence Interval     0.00125 to 0.00137    0.00003 to 0.00005    0.00594 to 0.00706
                                        9-18

-------
       Based on Table 9-7, the deceleration emission rate for NOx is set as 0.02215 g/s with
95% confidence interval (0.00024 to 0.10919), CO as 0.00580 g/s with 95% confidence interval
(0.00055 to 0.02191), and HC as 0.00118 g/s with 95% confidence interval (0.00004 to 0.00652).
The deceleration motoring emission rate for NOx is set as 0.00978 g/s with 95% confidence
interval (0.00017 to 0.06540), CO as 0.00529 g/s with 95% confidence interval (0.00072 to
0.01743), and HC as 0.00111 g/s with 95% confidence interval (0.00004 to 0.00652). The
deceleration non-motoring mode emission rate for NOx is set as 0.04578 g/s with 95% confidence
interval (0.00173 to 0.17187), CO as 0.00686 g/s with 95% confidence interval (0.00037 to
0.02846), and HC as 0.00131 g/s with 95% confidence interval (0.00004 to 0.00650).

                      9.5 Conclusions and Further Considerations

       In this research, deceleration mode is defined as "acceleration < -1 mph/s".  However,
the emissions under negative engine power are different from those under positive engine power.
Hence, the deceleration mode is split into deceleration motoring mode and deceleration
non-motoring mode based on engine power.
       Inter-bus  variability analysis indicates that bus 372 has the largest 3rd Quartile value for
engine power among 15 buses in deceleration mode,  consistent with the finding in idle mode.  At
the same time, inter-bus variability analysis results show that bus 379 has the largest median and
the second largest mean for NOx emissions, bus 372 has the largest median and the second larg-
est mean for CO  emissions, while bus 364 has the largest median and mean for HC emissions.
But it is difficult  to conclude that these buses should be classified as high emitters or that there
are any special modes that should be modeled separately as high-emitting modes.
       Some high HC emissions events are noted in deceleration mode. After screening engine
speed, engine power, engine oil temperature, engine oil pressure, engine coolant temperature,
ECM pressure, and other parameters, these operating parameters could not be linked to these
high emissions occurrences. Additional causal variables may be in play that are not included in
the data available for analysis.
       Based on definitions of "acceleration < -1 mph/s", about 1% of data points exhibit some-
what unusually high engine power (> 50 bhp) in deceleration mode. Analysis shows that higher
engine power always happened with higher vehicle speed and higher engine speed. These high-
er-power data points likely reflect the variability in real world power demand (perhaps associated
with operations on grade,  which could not be identified in the database). All of these data were
retained in the model to avoid potentially biasing the results.
                                          9-19

-------
       In summary, the deceleration non-motoring mode emission rate for NOx is set as 0.04578
g/s, CO as 0.00686 g/s, and HC as 0.00131 g/s. The deceleration motoring emission rate for NOx
is set as 0.00978 g/s, CO as 0.00529 g/s, and HC as 0.00111 g/s.  Emission rate estimation for the
deceleration motoring mode is significantly lower than the deceleration non-motoring mode for
all three pollutants, especially for NOx.
                                          9-20

-------
                                      CHAPTER 10
                     10. ACCELERATION MODE DEVELOPMENT
       After developing the idle mode definition and emission rate in Chapter 8 and deceleration
mode definitions and emission rates in Chapter 9, the next task is to divide the rest of the data
into acceleration and cruise mode.  This chapter examines the definition of acceleration activity
and emission rates for acceleration activity.

                 10.1  Critical Value for Acceleration in Acceleration Mode

       The first task related to analysis of emission rates in the acceleration mode is identifying
a critical value for acceleration.  Two values were tested: 1 mph/s and 2 mph/s.  Since the critical
value of "acceleration > 1 mph/s" will include all data under the critical value of "acceleration
> 2 mph/s",  comparison of data falling between these two potential cut points is conducted first.
Once selected, the chosen critical value will be used to divide the data into acceleration mode
and cruise mode.  Thus "acceleration > 0 mph/s and acceleration < 1 mph/s" will be another
option.  As in the analysis for deceleration mode, the three options are:

       •   Option 1: acceleration > 2  mph/s

       •   Option 2: acceleration > 1  mph/s and acceleration < 2 mph/s

       •   Option 3: acceleration > 0  mph/s and acceleration < 1 mph/s

       Figure  10-1 illustrates engine  power distribution for these three options.  Figures  10-2 to
10-4 compare engine power vs.  emission rate for three pollutants for three options.  Tables  10-1
and 10-2 provide the distribution for these three options in two ways: by number and percentage.
                                          10-1

-------
       Figure 10-1 Engine Power Distribution for Three Options
Figure 10-2 Engine Power vs. NOx Emission Rate (g/s) for Three Options
                               10-2

-------
  Figure 10-3 Engine Power vs. CO Emission Rate (g/s) for Three Options
  Figure 10-4 Engine Power vs. HC Emission Rate (g/s) for Three Options
-------
Table 10-1 Engine Power Distribution for Three Options for Three Pollutants
                                     Engine Power (brake horsepower (bhp))
Acceleration   Pollutants    0-50     50-100    100-150    150-200    > 200     Total
Option 1       NOx           322      446       852        1229       5870      8719
               CO            319      444       851        1228       5870      8712
               HC            318      440       833        1203       5649      8443
Option 2       NOx           613      865       1358       1324       6015      10175
               CO            606      858       1355       1321       6012      10152
               HC            605      843       1328       1287       5824      9887
Option 3       NOx           3208     4130      4378       2490       3205      17411
               CO            3190     4105      4362       2487       3185      17329
               HC            3104     3972      4195       2408       3131      16810
Table 10-2 Percentage of Engine Power Distribution for Three Options for Three Pollutants
                                     Engine Power (brake horsepower (bhp))
Acceleration   Pollutants    0-50     50-100    100-150    150-200    > 200     Total
Option 1       NOx           3.7%     5.1%      9.8%       14.1%      67.3%     100.0%
               CO            3.7%     5.1%      9.8%       14.1%      67.4%     100.0%
               HC            3.8%     5.2%      9.9%       14.2%      66.9%     100.0%
Option 2       NOx           6.0%     8.5%      13.3%      13.0%      59.1%     100.0%
               CO            6.0%     8.5%      13.3%      13.0%      59.2%     100.0%
               HC            6.1%     8.5%      13.4%      13.0%      58.9%     100.0%
Option 3       NOx           18.4%    23.7%     25.1%      14.3%      18.4%     100.0%
               CO            18.4%    23.7%     25.2%      14.4%      18.4%     100.0%
               HC            18.5%    23.6%     25.0%      14.3%      18.6%     100.0%
       If the critical value is set as 1 mph/s for acceleration mode, data falling into option 1
and option 2 will be classified as acceleration mode while data falling into option 3 will be
classified as cruise mode. If the critical value is set as 2 mph/s for acceleration mode, data
falling into option 1 will be classified as acceleration mode while data falling into option 2
and option 3 will be classified as cruise mode. There is little difference in the engine power
distributions for data falling into option 1 and option 2, while the power distribution for
option 3 is clearly different from option 1 and option 2 in the above figures and tables.
Tables 10-1 and 10-2 show that the engine power is more concentrated in the higher engine power
regime (> 200 bhp) for data in acceleration mode.  Tables 10-1 and 10-2 better reflect the power
demand of the vehicle in the real world in acceleration mode. Hence, the critical value is set
as 1 mph/s for acceleration mode.
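
As a worked check of how the percentages in Table 10-2 follow from the counts in Table 10-1, the sketch below converts the Option 1 NOx counts into shares of the option total. The function name `percent_distribution` and the bin labels are illustrative choices; the counts are those listed in Table 10-1.

```python
# Engine power bins (bhp) used in Tables 10-1 and 10-2.
BIN_LABELS = ["0-50", "50-100", "100-150", "150-200", ">200"]

def percent_distribution(counts):
    """Convert per-bin counts into percentages of the total, to one decimal."""
    total = sum(counts)
    return [round(100.0 * c / total, 1) for c in counts]

# NOx counts for Option 1 (acceleration > 2 mph/s), from Table 10-1.
option1_nox_counts = [322, 446, 852, 1229, 5870]
option1_nox_percent = percent_distribution(option1_nox_counts)
for label, pct in zip(BIN_LABELS, option1_nox_percent):
    print(f"{label:>8} bhp: {pct:.1f}%")
```

The same calculation applied to the Option 3 counts gives a much flatter distribution, which is the contrast the paragraph above relies on when choosing 1 mph/s as the critical value.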
                                           10-4

-------
       After defining "acceleration > 1 mph/s" as acceleration mode, cruise mode data will
consist of all of the remaining data in the database (i.e., data not previously classified into idle,
deceleration, and now acceleration). Unlike idle and deceleration mode, there is a general rela-
tionship between engine power and emission rate for acceleration mode and cruise mode.  Even
though the engine power distribution for acceleration mode is different from that of cruise mode
(Table 10-3), these two modes share a relationship between engine power and emission rate (Fig-
ure 10-5), although there are potentially some significant differences noted in the HC chart.
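
The mode definitions developed so far can be summarized as a simple classification rule. The sketch below is a hedged illustration rather than the study's code: the function name `classify_mode` is hypothetical, and the `is_idle` argument is a placeholder for the Chapter 8 idle definition, which is not restated here. The order of the checks follows the text (idle first, then deceleration, then acceleration, with cruise as the remainder).

```python
def classify_mode(acceleration_mph_s, engine_power_bhp, is_idle):
    """Assign one second of bus activity to an operating mode.

    `is_idle` is a hypothetical flag standing in for the idle mode
    definition from Chapter 8.
    """
    if is_idle:
        return "idle"
    if acceleration_mph_s < -1.0:
        # Negative engine power is recorded as zero in the data set, so zero
        # power during deceleration marks the motoring sub-mode.
        if engine_power_bhp == 0:
            return "deceleration motoring"
        return "deceleration non-motoring"
    if acceleration_mph_s > 1.0:
        return "acceleration"
    # Everything not classified above is treated as cruise.
    return "cruise"

print(classify_mode(-1.5, 0.0, False))   # deceleration motoring
print(classify_mode(-1.5, 12.0, False))  # deceleration non-motoring
print(classify_mode(1.8, 210.0, False))  # acceleration
print(classify_mode(0.4, 90.0, False))   # cruise
```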

Table 10-3   Engine Power Distribution for Acceleration Mode and Cruise Mode
                                             Engine Power Distribution (bhp)
                     Pollutants    0-50      50-100    100-150    150-200    > 200     Total
Acceleration mode
   Number            NOx           935       1311      2210       2553       11885     18894
                     CO            925       1302      2206       2549       11882     18864
                     HC            923       1283      2161       2490       11473     18330
   Percentage        NOx           4.95%     6.94%     11.70%     13.51%     62.90%    100.00%
                     CO            4.90%     6.90%     11.69%     13.51%     62.99%    100.00%
                     HC            5.04%     7.00%     11.79%     13.58%     62.59%    100.00%
Cruise mode
   Number            NOx           15885     8988      7173       3536       3792      39374
                     CO            15834     8940      7145       3529       3770      39218
                     HC            15481     8600      6830       3394       3715      38020
   Percentage        NOx           40.34%    22.83%    18.22%     8.98%      9.63%     100.00%
                     CO            40.37%    22.80%    18.22%     9.00%      9.61%     100.00%
                     HC            40.72%    22.62%    17.96%     8.93%      9.77%     100.00%
                                           10-5

-------
     Figure 10-5 Engine Power vs. Emission Rate for Acceleration Mode and Cruise Mode
       The relationships between emission rate and power for acceleration mode data will be ex-
plored in this chapter, while the relationships between emission rate and power for cruise mode
data will be explored in the next chapter.

                         10.2 Analysis of Acceleration Mode Data

10.2.1 Emission Rate Distribution by Bus in Acceleration Mode

       After defining vehicle activity data with "acceleration > 1 mph/s" as acceleration mode,
emission rate histograms for each of the three pollutants for acceleration operations are presented
in Figure 10-6.  Figure 10-6 shows significant skewness for all three pollutants for acceleration
mode. A small number of very high HC emissions events are also noted in acceleration mode.
After screening engine speed, engine power, engine oil temperature, engine oil pressure, engine
coolant temperature, ECM pressure, and other parameters, no operating parameters appeared to be
correlated with the high emissions events.
                                          10-6

-------
       [Figure 10-6 panels: NOx Emission Rate (g/s), CO Emission Rate (g/s), HC Emission Rate (g/s)]
              Figure 10-6 Histograms of Three Pollutants for Acceleration Mode
       Inter-bus response variability for acceleration mode operations is illustrated in Figures
10-7 to 10-9 using median and mean of NOx, CO, and HC emission rates. Table 10-4 presents
the same information in tabular form.  The difference between median and mean is also an indi-
cator of skewness.

Table 10-4 Median and Mean of Three Pollutants in Acceleration Mode by Bus (g/s)
                    NOx                    CO                     HC
   Bus ID     Median     Mean        Median     Mean        Median     Mean
   Bus 360    0.27729    0.25957     0.06527    0.09217     0.00159    0.00182
   Bus 361    0.30170    0.28125     0.05177    0.08001     0.00184    0.00228
   Bus 363    0.14459    0.14058     0.03836    0.09012     0.00022    0.00039
   Bus 364    0.28948    0.26033     0.03501    0.05650     0.00306    0.00363
   Bus 372    0.17834    0.18627     0.02980    0.03475     0.00250    0.00279
   Bus 375    0.31092    0.28991     0.05929    0.08619     0.00143    0.00176
   Bus 377    0.17827    0.17335     0.04755    0.09612     0.00104    0.00112
   Bus 379    0.17788    0.20883     0.08430    0.10346     0.00222    0.00276
   Bus 380    0.26410    0.26620     0.08238    0.19149     0.00210    0.00253
   Bus 381    0.18011    0.19806     0.07856    0.12646     0.00095    0.00106
   Bus 382    0.28966    0.29152     0.09234    0.18179     0.00263    0.00272
   Bus 383    0.24419    0.26739     0.05355    0.13112     0.00308    0.00368
   Bus 384    0.18775    0.22139     0.07111    0.17389     0.00401    0.00429
   Bus 385    0.17783    0.21706     0.05141    0.07893     0.00361    0.00384
   Bus 386    0.22674    0.24673     0.10412    0.23806     0.00272    0.00282
   Figure 10-7 Median and Mean of NOx Emission Rates in Acceleration Mode by Bus
                                   10-8

-------
Figure 10-8 Median and Mean of CO Emission Rates in Acceleration Mode by Bus
Figure 10-9 Median and Mean of HC Emission Rates in Acceleration Mode by Bus
                                    10-9

-------
       Figures 10-7 to 10-9 and Table 10-4 illustrate that NOx emissions are more consistent across
buses than CO and HC emissions. Across the 15 buses, Bus 386 has the largest median and mean CO
emissions, Bus 384 has the largest median and mean HC emissions, and Bus 363 has the smallest
median and mean HC emissions. Although variability exists across buses, it is difficult to conclude
that there are any true "high emitters." That is, the emissions from these buses are not consistently
more than one or two standard deviations from the mean under normal operating conditions.
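
       The use of the mean-median difference as a skewness indicator can be illustrated with a
short Python sketch. It uses the values reported in Table 10-4 for three of the buses. The
helper name skew_gap and the choice to scale the difference by the median are assumptions made
for illustration only; they are not part of the original analysis.

```python
# (median, mean) pairs copied from Table 10-4 for three of the 15 buses
table_10_4 = {
    "Bus 360": {"NOx": (0.27729, 0.25957), "CO": (0.06527, 0.09217), "HC": (0.00159, 0.00182)},
    "Bus 363": {"NOx": (0.14459, 0.14058), "CO": (0.03836, 0.09012), "HC": (0.00022, 0.00039)},
    "Bus 386": {"NOx": (0.22674, 0.24673), "CO": (0.10412, 0.23806), "HC": (0.00272, 0.00282)},
}


def skew_gap(median, mean):
    """Hypothetical helper: (mean - median) / median.

    A positive value means the mean sits above the median, which is what a
    right-skewed distribution produces; a negative value suggests left skew.
    """
    return (mean - median) / median


for bus, pollutants in table_10_4.items():
    gaps = {name: round(skew_gap(*pair), 2) for name, pair in pollutants.items()}
    print(bus, gaps)
```

For these three buses the CO gap is positive in every case, while the sign of the NOx gap
varies from bus to bus.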

10.2.2  Engine Power Distribution by Bus in Acceleration Mode

       Engine power distribution in acceleration mode by bus is shown in Figure 10-10 and
Table 10-5. When a bus is accelerating, the engine is required to produce more power, and
Figure 10-10 and Table 10-5 reflect this characteristic. The distribution of engine power in
acceleration mode differs significantly from the distributions in deceleration mode and idle
mode. Bus 372 has the largest minimum engine power in acceleration mode, consistent with the
findings for idle mode and deceleration mode. The maximum power values for each bus match
well with the manufacturer's engine power rating. Although the engine power distribution varies
across buses, it is difficult to attribute that variability to individual buses, bus routes, or
other factors. The relationship between power and emissions appears consistent across the buses
for acceleration mode.

Table 10-5 Engine Power Distribution in Acceleration Mode by Bus
Bus ID     Number    Min      1st Quartile    Median    3rd Quartile    Max       Mean
Bus 360    1507      0        162.96          255.57    275.05          275.59    212.04
Bus 361    545       7.16     131.96          199.58    261.51          275.54    184.46
Bus 363    1287      0        111.52          200.39    267.06          275.59    180.03
Bus 364    931       0        142.82          228.25    270.01          275.56    197.27
Bus 372    728       34.42    145.57          213.51    264.70          275.56    199.81
Bus 375    1599      0        140.92          259.45    275.13          275.57    205.56
Bus 377    1751      3.35     166.25          256.89    275.08          275.60    212.09
Bus 379    1427      0        204.15          264.54    275.18          275.58    233.71
Bus 380    1823      0        202.69          262.11    275.15          275.54    228.55
Bus 381    1362      0        139.86          220.00    272.21          275.60    199.20
Bus 382    691       0        173.36          250.90    275.05          275.58    218.82
Bus 383    1043      0        161.16          250.37    275.08          275.59    213.70
Bus 384    1292      0        144.10          213.87    269.50          275.60    198.80
Bus 385    1377      0        143.51          226.37    274.99          275.55    201.67
Bus 386    1532      13.81    164.27          244.80    275.06          275.60    215.95

-------
       The engine power distribution also shows that about 0.19% (36/18895) of data points have
zero load in acceleration mode. Of the 36 data points with zero indicated engine load,
about 92% (33/36) occurred on roads reported to have zero or negative grade. Due to the
inaccuracy of road grade values, it was not possible to simulate engine power in this research.
However, in the real world, linear acceleration with zero load can happen on downhill stretches.
Applying load-based emission rates to predicted engine load will allow grade to be taken into
account in the overall modeling framework. Because only 36 data points with zero load were
included in the acceleration data, it was unnecessary to develop a sub-model for them. Such
zero loads in acceleration mode nevertheless reflect the variability of acceleration data in the
real world.

           Figure 10-10 Histograms of Engine Power in Acceleration Mode by Bus

-------
                        10.3  Model Development and Refinement

10.3.1 HTBR Tree Model Development

       The potential explanatory variables included in the emission rate model development ef-
fort include:
       •  Vehicle characteristics: model year, odometer reading, bus ID (14 dummy variables);
       •  Roadway characteristics: dummy variable for road grade;
       •  Onroad load parameters: engine power (bhp), vehicle speed (mph), acceleration
         (mph/s);
       •  Engine operating parameters: engine oil temperature (deg F), engine oil pressure
         (kPa), engine coolant temperature (deg F), barometric pressure reported from ECM
         (kPa);
       •  Environmental conditions: ambient temperature (deg C), ambient pressure (mbar),
         ambient relative humidity (%).

       The HTBR technique is used first to identify potentially significant explanatory variables;
this analysis provides the starting point for conceptual model development. The HTBR model
is used to guide the development of an OLS regression model rather than as a model in its own
right. HTBR serves as a data reduction tool and as a means of identifying potential interactions
among the variables. OLS regression is then applied to the identified variables to estimate a
preliminary "final" model.
       These 27 variables were first offered to the tree model. To arrive at the "best" model,
various regression tree models were created.  The initial model was created by allowing the tree
to grow unconstrained for the first cut.  Once  an initial model was created, the supervised tech-
nique in S-PLUS was used to simplify  the model by removing the lower branches of the tree that
explained the least deviance.  For application purposes, the resulting tree was examined to ensure
that the model's predictive ability was  not compromised by allowing the overall amount of devi-
ance to increase significantly.
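
       The split-selection step at the heart of a regression tree can be sketched in a few lines
of Python. The sketch below is not the S-PLUS implementation used in this research; it only
illustrates how one binary split is chosen by maximizing the reduction in deviance (the sum
of squared deviations from the node mean). The function names, the min_leaf parameter, and the
toy records are hypothetical.

```python
def deviance(values):
    """Sum of squared deviations from the mean of a node."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, response, candidates, min_leaf=1):
    """Return (variable, cut point, deviance reduction) for the best single split."""
    parent = deviance([r[response] for r in rows])
    best = (None, None, 0.0)
    for var in candidates:
        levels = sorted({r[var] for r in rows})
        # Candidate cut points sit midway between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            cut = (lo + hi) / 2
            left = [r[response] for r in rows if r[var] < cut]
            right = [r[response] for r in rows if r[var] >= cut]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - deviance(left) - deviance(right)
            if gain > best[2]:
                best = (var, cut, gain)
    return best


# Hypothetical toy records, not data from this study.
rows = [
    {"engine_power": 20, "vehicle_speed": 5, "y": 0.25},
    {"engine_power": 40, "vehicle_speed": 30, "y": 0.27},
    {"engine_power": 150, "vehicle_speed": 10, "y": 0.46},
    {"engine_power": 200, "vehicle_speed": 35, "y": 0.50},
    {"engine_power": 260, "vehicle_speed": 20, "y": 0.52},
]
var, cut, gain = best_split(rows, "y", ["engine_power", "vehicle_speed"])
print(var, cut, round(gain, 4))
```

Growing a full tree repeats this search within each resulting branch until a stopping rule
is met; trimming then removes the branches that contribute the least deviance reduction.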
       The 27 variables include continuous, categorical, and dummy variables. Dummy variables
for buses can be used to represent variability across buses. As in the analysis in Chapter
6, these 15 buses can be treated as a single group for purposes of analysis and model
development. The HTBR technique can examine the potential additional influence of road grade
(i.e., above and beyond its contribution to power demand) using a dummy variable to represent a
grade category (the final model does not include this dummy variable due to the inaccuracy of
road grade values). Analysis results in Chapter 6 indicate that all environmental characteristics,
such as temperature, humidity, and barometric pressure, are moderately correlated with each other.
Engine operating parameters, such as engine oil pressure, engine oil temperature, engine coolant
temperature, and barometric pressure reported from the ECM, are highly or moderately correlated
with on-road operating parameters such as engine power, vehicle speed, and acceleration. The
modeler should be aware of such correlations among explanatory variables.
       Although evidence in the literature suggests that a logarithmic transformation is most
suitable for modeling motor vehicle emissions (Washington 1994; Ramamurthy et al. 1998;
Fomunung 2000; Frey et al. 2002), this transformation needs to be verified through the Box-Cox
procedure. The Box-Cox function in MATLAB™ can automatically identify a transformation
from the family of power transformations on emission data, with lambda ranging from -1.0 to 1.0.
The lambdas chosen by the Box-Cox procedure for acceleration mode are 0.683 for NOx, 0.094438
for CO, and 0.31919 for HC. The Box-Cox procedure is used only to provide a guide for selecting
a transformation, so overly precise results are not needed (Neter et al. 1996). It is often
reasonable to use a nearby lambda value that is easier to understand for the power transformation.
Although the lambdas chosen by the Box-Cox procedure differ between acceleration and cruise
mode, the nearby lambda values are the same for these two modes. In summary, the lambda values
used for transformations in acceleration mode are 1/2 for NOx, 0 for CO (indicating a log
transformation), and 1/4 for HC. Figures 10-11 to 10-13 present histograms, boxplots, and
probability plots of truncated emission rates in acceleration mode for NOx, CO, and HC, while
Figures 10-14 to 10-16 present the same plots for the truncated transformed emission rates,
where a great improvement is noted.
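
       The rounded transformations can be written out as a small Python sketch. It assumes a
simple power transform (rate raised to lambda) with lambda = 0 taken as the logarithm. The
function names are hypothetical, and the base of the logarithm is an assumption: the natural
log is used here, and any other base would differ only by a constant factor.

```python
import math

# Rounded lambda values adopted for acceleration mode.
LAMBDA = {"NOx": 0.5, "CO": 0.0, "HC": 0.25}


def transform(rate, lam):
    """Power transform of a positive emission rate; lam == 0 means log."""
    if rate <= 0:
        raise ValueError("emission rate must be positive")
    # Assumption: natural log is used for the lambda = 0 case.
    return math.log(rate) if lam == 0 else rate ** lam


def back_transform(value, lam):
    """Invert transform(); this ignores any retransformation bias."""
    return math.exp(value) if lam == 0 else value ** (1.0 / lam)


# Illustrative rates in g/s (hypothetical values, not study data).
example = {"NOx": 0.25, "CO": 0.08, "HC": 0.0016}
for pollutant, rate in example.items():
    t = transform(rate, LAMBDA[pollutant])
    print(pollutant, round(t, 4), round(back_transform(t, LAMBDA[pollutant]), 4))
```

A model fitted on the transformed scale returns predictions on that scale, so a
back-transformation of this kind is needed before the results are read as emission rates.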

Figure 10-11 Histogram, Boxplot, and Probability Plot of Truncated NOx Emission Rate in Acceleration Mode

Figure 10-12 Histogram, Boxplot, and Probability Plot of Truncated CO Emission Rate in Acceleration Mode

Figure 10-13 Histogram, Boxplot, and Probability Plot of Truncated HC Emission Rate in Acceleration Mode

Figure 10-14 Histogram, Boxplot, and Probability Plot of Truncated Transformed NOx Emission
                                 Rate in Acceleration Mode

Figure 10-15 Histogram, Boxplot, and Probability Plot of Truncated Transformed CO Emission
                                Rate in Acceleration Mode

Figure 10-16 Histogram, Boxplot, and Probability Plot of Truncated Transformed HC Emission
                                Rate in Acceleration Mode

10.3.1.1 NOx HTBR Tree Model Development

       Figure 10-17 illustrates the initial tree model used for the truncated transformed NOx
emission rate in acceleration mode. Results for the initial model are given in Table 10-6. The tree
grew into a complex model, with a considerable number of branches and 36 terminal nodes.
Figure 10-18 illustrates the amount of deviance explained as a function of the number of terminal
nodes.

-------
Figure 10-17 Original Untrimmed Regression Tree Model for Truncated Transformed NOx Emission
                               Rate in Acceleration Mode

Table 10-6 Original Untrimmed Regression Tree Results for Truncated Transformed NOx Emission
Rate in Acceleration Mode
 Regression tree:
 tree(formula = NOx.50 ~ model.year + odometer + temperature + baro + humidity +
        vehicle.speed + oil.temperture + oil.press + cool.temperature +
        eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
        bus364 + bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
        bus383 + bus384 + bus385 + dummy.grade, data = busdata!0242006.1.3,
        na.action = na.exclude, mincut = 400, minsize = 800, mindev = 0.01)
 Variables actually used in tree construction:
  [1] "engine.power"   "vehicle.speed"  "temperature"    "baro"
  [5] "bus375"         "humidity"       "oil.press"      "odometer"
  [9] "eng.bar.press"  "bus379"         "model.year"     "oil.temperture"
 Number of terminal nodes:  36
 Residual mean deviance:  0.005538 = 104.4 / 18860
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
  -3.769e-001 -4.176e-002 -4.298e-003  3.661e-017  3.957e-002  i.965e-001
       For model application purposes, it is desirable to select a final model specification that
balances the model's ability to explain the maximum amount of deviance with a simpler model
that is easy to interpret and apply. Figure 10-18 indicates that the reduction in deviance from
adding nodes beyond four, although potentially statistically significant, is very small. A
simplified tree model was derived that ends in 4 terminal nodes, compared with the 36 terminal
nodes in the initial model. The residual deviance increased only from 104.4 to 151.2 (residual
mean deviance from 0.005538 to 0.008002), yielding a much more efficient model. Results are
shown in Table 10-7 and Figure 10-19. An NOx acceleration emission rate model will be developed
based upon these results.
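
       The diminishing return from additional nodes can be checked with simple arithmetic on
the reported deviances. The Python sketch below uses the root deviance (247.20) reported in
Table 10-7 together with the residual deviances of the 4-node and 36-node trees. The helper
name gain_per_split is hypothetical, and averaging over splits is only one convenient way to
summarize the curve in Figure 10-18.

```python
root_deviance = 247.20      # deviance at the root node (Table 10-7)
trimmed_deviance = 151.2    # residual deviance with 4 terminal nodes
untrimmed_deviance = 104.4  # residual deviance with 36 terminal nodes


def gain_per_split(dev_before, dev_after, nodes_before, nodes_after):
    """Average deviance reduction per added split.

    Each binary split adds exactly one terminal node, so the number of
    splits equals the change in the terminal-node count.
    """
    return (dev_before - dev_after) / (nodes_after - nodes_before)


first_three = gain_per_split(root_deviance, trimmed_deviance, 1, 4)
remaining = gain_per_split(trimmed_deviance, untrimmed_deviance, 4, 36)

print(f"first 3 splits: {first_three:.2f} per split")
print(f"next 32 splits: {remaining:.2f} per split")
```

On these figures, each of the first three splits removes about 32 units of deviance on
average, while each of the remaining 32 splits removes about 1.5.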

-------
 Figure 10-18 Reduction in Deviation with the Addition of Nodes of Regression Tree for Truncated
                  Transformed NOx Emission Rate in Acceleration Mode

Figure 10-19 Trimmed Regression Tree Model for Truncated Transformed NOx Emission Rate in
                                     Acceleration Mode

-------
Table 10-7 Trimmed Regression Tree Results for Truncated Transformed NOx Emission Rate in
Acceleration Mode
 Regression tree:
 snip.tree(tree = tree(formula = NOx.50 ~ model.year + odometer + temperature  +
        baro + humidity + vehicle.speed + oil.temperture + oil.press +
        cool.temperature + eng.bar.press + engine.power + acceleration +
        bus360 + bus361 + bus363  + bus364 + bus372 + bus375 + bus377 + bus379 +
        bus380 + bus381 + bus382  + bus383 + bus384 + bus385 + dummy.grade,
        data = busdata!0242006.1.3,  na.action = na.exclude, mincut = 400,
        minsize = 800,  mindev = 0.01),  nodes = c(13.,  7., 12., 2.))
 Variables actually used in tree construction:
 [1]  "engine.power"  "vehicle.speed" "temperature"
 Number of terminal nodes:  4
 Residual mean deviance:  0.008002 = 151.2  / 18890
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean     3rd  Qu.        Max.
  -4.265e-001 -5.813e-002 -7.517e-004   8.861e-016  5.810e-002  8.710e-001
 node),  split, n, deviance, yval
       * denotes terminal node

  1)  root 18894 247.20 0.4669
    2) engine.power<72.3 1397  13.67 0.2581 *
    3) engine.power>72.3 17497 167.70  0.4836
      6) vehicle.speed<25.95  13777 121.40 0.4662
       12) temperature<20.5 4902  42.44 0.5034  *
       13) temperature>20.5 8875  68.45 0.4456  *
      7) vehicle.speed>25.95  3720  26.60 0.5482 *
       This tree model suggests that engine power is the most important explanatory variable for
NOx emissions. This result is consistent with previous research that verified the important
effect of engine power on NOx emissions (Ramamurthy et al. 1998; Clark et al. 2002; Barth
et al. 2004). Analysis in the previous chapter also indicates that engine power is correlated
not only with on-road load parameters such as vehicle speed, acceleration, and grade, but also
with engine operating parameters such as throttle position and engine oil pressure. In addition,
engine power in this research is derived from engine speed, engine torque, and percent engine
load. Engine power can therefore link on-road modal activity to engine operating conditions.
This strengthens the case for introducing engine power into the conceptual model and for
improving the ability to simulate engine power for regional inventory development.
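
       The trimmed tree in Table 10-7 can be read as a set of nested rules. The Python sketch
below encodes those rules, using the cut points and terminal-node values from the table. Several
points are assumptions: the function names are hypothetical, observations exactly at a cut point
are sent to the right-hand branch, the response is taken to be the square root of the NOx rate
(lambda = 1/2), and squaring the node value is a naive back-transformation that ignores any
retransformation bias. Units follow the variable list in Section 10.3.1 (bhp, mph, deg C).

```python
def nox_sqrt_rate(engine_power, vehicle_speed, temperature):
    """Terminal-node value of the trimmed NOx tree (transformed scale)."""
    if engine_power < 72.3:        # node 2
        return 0.2581
    if vehicle_speed >= 25.95:     # node 7
        return 0.5482
    if temperature < 20.5:         # node 12
        return 0.5034
    return 0.4456                  # node 13


def nox_rate(engine_power, vehicle_speed, temperature):
    """Naive back-transformation: square the node value."""
    return nox_sqrt_rate(engine_power, vehicle_speed, temperature) ** 2


# Example: 200 bhp, 30 mph, 25 deg C falls in node 7.
print(round(nox_rate(200.0, 30.0, 25.0), 4))
```

The tree is used here only to identify candidate variables; the rules above are not the
final emission rate model.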

       HTBR results suggest that temperature may be an important predictive variable for NOx
emissions under certain conditions. Temperature effects may need to be integrated into new
models in the form of a temperature correction factor, but adequate data are not yet available for
this purpose. For the time being, temperature is removed from consideration in further linear
regression model development. The effect is probably significant, however, and should be examined
when more comprehensive emission rate data collected under a wider variety of temperature
conditions are available for analysis.

-------
10.3.1.2 CO HTBR Tree Model Development

       Figure 10-20 illustrates the initial tree model used for the truncated transformed CO emission
rate in acceleration mode. Results from the initial model are given in Table 10-8. The tree grew
into a complex model with a considerable number of branches and 33 terminal nodes. Figure 10-21
illustrates the amount of deviance explained as a function of the number of terminal nodes.
 Figure 10-20 Original Untrimmed Regression Tree Model for Truncated Transformed CO Emission
                               Rate in Acceleration Mode

Table 10-8 Original Untrimmed Regression Tree Results for Truncated Transformed CO Emis-
sion Rate in Acceleration Mode
 Regression tree:
 tree(formula = log.CO ~ model.year + odometer + temperature + baro  +  humidity +
        vehicle.speed + oil.temperture + oil.press + cool.temperature +
        eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
        bus364  +  bus372 + bus375  + bus377 + bus379 + bus380 + bus381 + bus382  +
        bus383  +  bus384 + bus385  + dummy.grade,  data = busdata!0242006.1.3,
        na.action = na.exclude, mincut = 400, minsize = 800, mindev = 0.01)
 Variables actually used in tree construction:
 [1]  "engine.power"  "humidity"      "vehicle.speed" "acceleration"
 [5]  "odometer"      "model.year"    "baro"          "eng.bar.press"
 Number of terminal nodes:  33
 Residual mean deviance:  0.1184 = 2229 / 18830
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
  -2.552e+000 -2.001e-001 -1.285e-002  3.025e-017  1.981e-001  1.653e+000
       For model application purposes, it is desirable to select a final model specification that
balances the model's ability to explain the maximum amount of deviance with a simpler model
that is easy to interpret and apply. Figure 10-21 indicates that the reduction in deviance from
adding nodes beyond four, although potentially statistically significant, is very small. A
simplified tree model was derived that ends in four terminal nodes, compared with the 33 terminal
nodes in the initial model. The residual deviance increased only from 2229 to 3093 (residual mean
deviance from 0.1184 to 0.164), yielding a much cleaner model than the initial one. Results are
shown in Table 10-9 and Figure 10-22. The CO acceleration emission rate model will be developed
based upon these results.
  Figure 10-21 Reduction in Deviation with the Addition of Nodes of Regression Tree for Truncated
                   Transformed CO Emission Rate in Acceleration Mode

Figure 10-22 Trimmed Regression Tree Model for Truncated Transformed CO Emission Rate in
                                     Acceleration Mode
-------
Table 10-9 Trimmed Regression Tree Results for Truncated Transformed CO Emission Rate in
Acceleration Mode
 Regression tree:
 snip.tree(tree = tree(formula = log.CO ~ model.year + odometer + temperature +
        baro  +  humidity  +  vehicle.speed + oil.temperture  + oil.press +
        cool.temperature + eng.bar.press + engine.power + acceleration +
        bus360  + bus361  +  bus363  + bus364 + bus372  + bus375 + bus377 + bus379 +
        bus380  + bus381  +  bus382  + bus383 + bus384  + bus385 + dummy.grade,
        data  =  busdata!0242006.1.3,  na.action  = na.exclude, mincut = 400,
        minsize =  800, mindev =  0.01),  nodes = c(12.,  7.,  2.,  13.))
 Variables actually used in tree construction:
 [1]  "engine.power"  "vehicle.speed"
 Number of terminal nodes:  4
 Residual mean deviance:  0.164  = 3093 / 18860
 Distribution of residuals:
         Min.     1st Qu.       Median        Mean     3rd Qu.        Max.
  -3.019e+000 -2.450e-001  -1.062e-002 -9.774e-017  2.430e-001  1.735e+000
 node),  split,  n,  deviance, yval
       * denotes  terminal  node

  1)  root 18864 5309.0 -1.1990
    2)  engine.power<82.625 1624   560.0 -1.9810 *
    3)  engine.power>82.625 17240 3662.0 -1.1250
      6) vehicle.speed<19.05 9752 1994.0 -0.9339
       12) engine.power<152.965  2335  522.6 -1.2510 *
       13) engine.power>152.965  7417 1163.0 -0.8342 *
      7) vehicle.speed>19.05 7488  847.2 -1.3740  *
       This tree model suggests that engine power is the most important explanatory variable
for CO emissions, consistent with the result for NOx emissions. This tree will be used as a
reference for linear regression model development.
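
       The node listing in Table 10-9 is internally consistent, which can be verified with a
short Python sketch: the terminal-node counts should add up to the root count, the terminal-node
deviances should add up to the residual deviance, and the count-weighted mean of the terminal
values should reproduce the root value. The variable names below are hypothetical; the numbers
are copied from Table 10-9.

```python
# node id -> (n, deviance, yval) for the terminal nodes in Table 10-9
leaves = {
    2: (1624, 560.0, -1.9810),
    12: (2335, 522.6, -1.2510),
    13: (7417, 1163.0, -0.8342),
    7: (7488, 847.2, -1.3740),
}
root_n, root_deviance, root_yval = 18864, 5309.0, -1.1990

total_n = sum(n for n, _, _ in leaves.values())
residual_deviance = sum(dev for _, dev, _ in leaves.values())
weighted_mean = sum(n * y for n, _, y in leaves.values()) / total_n

print(total_n == root_n)
print(round(residual_deviance, 1))   # compare with the reported 3093
print(round(weighted_mean, 4))       # compare with the root yval
```

The same three checks can be applied to any of the trimmed trees reported in this chapter.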

10.3.1.3 HC HTBR Tree Model Development

       Figure 10-23 illustrates the initial tree model used for the truncated transformed HC emis-
sion rate in acceleration mode. Results for the initial model are given in Table 10-10. The tree
grew into a complex model with a considerable number of branches and 30 terminal nodes.

-------
 Figure 10-23 Original Untrimmed Regression Tree Model for Truncated Transformed HC Emission
                               Rate in Acceleration Mode

Table 10-10 Original Untrimmed Regression Tree Results for Truncated Transformed HC Emis-
sion Rate in Acceleration Mode
 Regression tree:
 tree(formula = HC.25 ~ model.year + odometer + temperature + baro + humidity  +
        vehicle.speed + oil.temperture + oil.press + cool.temperature +
        eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
        bus364  +  bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
        bus383  +  bus384 + bus385 + dummy.grade,  data = busdata!0242006.1.3,
        na.action = na.exclude,  mincut = 400,  minsize = 800, mindev = 0.01)
 Variables actually used in tree construction:
  [1] "odometer"       "bus377"          "bus381"         "baro"
  [5] "engine.power"   "humidity"        "vehicle.speed"  "oil.press"
  [9] "bus375"         "oil.temperture"  "acceleration"   "bus384"
 [13] "bus364"         "model.year"
 Number of terminal nodes:  31
 Residual mean deviance:  0.0005694 = 10.42 / 18300
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
  -1.004e-001 -1.347e-002 -2.222e-003  1.386e-016  1.091e-002  2.755e-001
       Figure 10-23 and Table 10-10 suggest that the tree analysis of HC emission rates identified
a number of buses that appear to exhibit significantly different emission rates under all load
conditions than the other buses (i.e., some of the bus dummy variables appeared significant in
the initial tree splits). Two bus dummy variables split the data pool at the top levels of the HC
tree model. In addition, the first cut point, "odometer > 282096", could be directly replaced by
"bus 363 > 0.5", because only Bus 363 has an odometer reading larger than 282096. In effect,
three bus dummy variables split the first three levels of the HC tree model.
Although higher emissions were noted for all three pollutants for some of the 15 buses, the
division was even more obvious for HC emissions (see Figure 10-9 and Table 10-4), consistent with
the findings in idle and deceleration mode. Although it is tempting to develop different emission
rates for these buses to reduce emission rate deviation in the sample pool, it is difficult to
justify doing so. Unless there is an obvious reason to classify these three buses as high emitters
(i.e., significantly higher than normal emitting vehicles, perhaps by as much as a few standard
deviations from the mean), and unless there are enough data to develop separate emission rate
models for high emitters, one cannot justify removing the data from the data set. Until data exist
to justify treating these buses as high emitters, the bus dummy variables for individual buses are
removed from the analyses and all 15 buses are treated as part of the whole data set.
       Another tree model was generated excluding the bus dummy variables, model year, and
odometer. This new tree model is illustrated in Figure 10-25 and Table 10-11. The tree model was
then trimmed for application purposes, as was done for the NOx and CO models.
 Figure 10-24 Reduction in Deviation with the Addition of Nodes of Regression Tree for Truncated
                   Transformed HC Emission Rate in Acceleration Mode

 Figure 10-25 Trimmed Regression Tree Model for Truncated Transformed HC in Acceleration
                                          Mode
Table 10-11 Trimmed Regression Tree Results for Truncated Transformed HC in Acceleration Mode
Regression tree:
snip.tree(tree = tree(formula = HC.25 ~ temperature + baro + humidity +
       vehicle.speed + oil.temperture + oil.press + cool.temperature +
       eng.bar.press + engine.power + acceleration + dummy.grade, data =
       busdata!0242006.1.3, na.action = na.exclude, mincut = 400, minsize =
       800, mindev = 0.01), nodes = c(2., 6., 15., 14.))
Variables actually used in tree construction:
[1] "baro"         "engine.power"
Number of terminal nodes:  4
Residual mean deviance:  0.001018 = 18.65 / 18330
Distribution of residuals:
        Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
 -9.502e-002 -2.174e-002 -2.213e-003  9.390e-016  1.844e-002  3.100e-001
node),  split, n, deviance, yval
      * denotes terminal node

 1) root 18330 30.840  0.2099
   2) baro<969.5 1189  1.239 0.1286  *
   3) baro>969.5 17141 21.210 0.2155
     6) engine.power<56.24 850  1.069  0.1682  *
     7) engine.power>56.24 16291 18.140  0.2180
      14) baro<989.5 13717 13.970 0.2134 *
      15) baro>989.5 2574  2.372 0.2423  *
       The new tree model suggests that barometric pressure is the most important explanatory
variable for HC emission rates. However, this finding is undermined by the fact that, among
the 1189 data points (baro < 969.5) in the first left branch, 1187 belong to Bus 363.
Although this dataset was collected under a wide variety of environmental conditions, the range
of barometric pressures was limited for each individual bus tested. As reported earlier, Bus 363
exhibited significantly lower HC emissions than the other buses (see Figure 10-9); the reason is
not clear at this time. To develop a reasonable tree model given the limited data collected, the
environmental parameters are excluded from the model until a greater distribution of environmental
conditions can be represented in a test data set. With data collected from a more comprehensive
testing program, environmental variables can be integrated into the model directly, or
perhaps correction factors for the emission rates can be developed. The secondary trimmed tree
is presented in Figure 10-26 and Table 10-12.
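
       The kind of check that exposed the barometric pressure split can be sketched in Python:
for the observations on one side of a cut point, compute the share that comes from the single
most common bus. The function name and the toy records below are hypothetical; only the counts
1187 and 1189 come from the text above.

```python
from collections import Counter


def dominant_group_share(rows, split_var, cut, group_var):
    """Most common group below the cut point and its share of those rows."""
    below = [r[group_var] for r in rows if r[split_var] < cut]
    if not below:
        return None, 0.0
    group, count = Counter(below).most_common(1)[0]
    return group, count / len(below)


# Hypothetical toy records, not data from this study.
rows = (
    [{"bus": 363, "baro": 965.0}] * 8
    + [{"bus": 372, "baro": 968.0}] * 1
    + [{"bus": 372, "baro": 985.0}] * 6
    + [{"bus": 380, "baro": 992.0}] * 5
)
group, share = dominant_group_share(rows, "baro", 969.5, "bus")
print(group, round(share, 3))

# Share reported in the text for the real split: 1187 of 1189 points.
reported_share = 1187 / 1189
print(round(reported_share, 4))
```

A share close to 1 indicates that the split variable is acting as a stand-in for one bus
rather than capturing a general effect.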
 Figure 10-26 Secondary Trimmed Regression Tree Model for Truncated Transformed HC Emission
                               Rate in Acceleration Mode

-------
Table 10-12 Secondary Trimmed Regression Tree Results for Truncated Transformed HC Emis-
sion Rate in Acceleration Mode
 Regression tree:
 snip.tree(tree = tree(formula = HC.25 ~ engine.power + vehicle.speed +
        acceleration + oil.temperture + oil.press + cool.temperature +
        eng.bar.press,  data = busdata!0242006.1.3,  na.action = na.exclude,
        mincut  = 400,  minsize = 800,  mindev = 0.1), nodes  = c(7.,  13.,  12.
 Variables actually used in tree construction:
 [1]  "engine.power"  "oil.press"     "eng.bar.press"
 Number of terminal nodes:   4
 Residual mean deviance:  0.00136 = 24.92 / 18330
 Distribution of residuals:
         Min.      1st Qu.      Median        Mean     3rd Qu.        Max.
  -1.178e-001 -2.378e-002  6.119e-004 -4.275e-017  2.231e-002   3.223e-001
 node),  split,  n,  deviance, yval
       * denotes terminal node

  1)  root 18330 30.840 0.2099
    2)  engine.power<54.555  988  1.779 0.1559 *
    3)  engine.power>54.555  17342 26.020 0.2130
      6) oil.press<427.75 12457 18.610 0.2076
       12) eng.bar.press<100.249 4989  9.241 0.1937 *
       13) eng.bar.press>100.249 7468  7.763 0.2169 *
      7) oil.press>427.75 4885  6.136 0.2266 *
       This tree model suggests that engine power is the most important explanatory variable
for HC emissions, consistent with the analysis of NOx and CO emission rates. HTBR results also
suggest that oil pressure and engine barometric pressure may be important predictive variables
for HC emissions under certain conditions. After excluding engine barometric pressure and oil
pressure from the tree model, leaving engine power only, the residual deviance increased
slightly from 24.92 to 27.34.  While engine operating parameters such as oil pressure and engine
barometric pressure may affect emissions, such variables are not easy to include in real-world
models. The final HTBR tree for HC emissions is shown in Figure 10-27 and Table 10-13. An
HC acceleration emission rate model will be developed based upon these results.
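
The secondary trimmed tree in Table 10-12 can be read as a simple lookup rule. The following is a minimal sketch, assuming the split values and terminal-node means printed in that table; the function name and the handling of values that fall exactly on a split are illustrative assumptions, and the returned value stays on the transformed (HC.25) scale.

```python
def predict_hc_transformed(engine_power, oil_press, eng_bar_press):
    """Return the terminal-node mean of the transformed HC rate (Table 10-12).

    Assumption: a value exactly equal to a split point goes to the right branch.
    """
    if engine_power < 54.555:
        return 0.1559            # node 2 (terminal)
    if oil_press >= 427.75:
        return 0.2266            # node 7 (terminal)
    if eng_bar_press < 100.249:
        return 0.1937            # node 12 (terminal)
    return 0.2169                # node 13 (terminal)


# Deviances of the four terminal nodes as printed in Table 10-12.
terminal_deviance = [1.779, 9.241, 7.763, 6.136]

# Their sum is the residual deviance of the tree (reported as 24.92).
residual_deviance = sum(terminal_deviance)
```

Summing the terminal-node deviances reproduces the 24.92 quoted in the text, which is the figure that rises to 27.34 when only engine power is retained.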

-------
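 [Figure 10-27 and Table 10-13: final HTBR tree for HC emissions with engine power as the only split variable. Recoverable content: a split at engine.power = 14.825 with a terminal node of n = 550, deviance 0.8171, yval 0.1717, and the node rows below.]
   3)  engine.power>54.555 17342 26.0200 0.2130
     6)  engine.power<98.385  1177  1.8580 0.2022 *
     7)  engine.power>98.385 16165 24.0100 0.2137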

-------
10.3.2 OLS Model Development and Refinement

       Once a manageable number of modal variables have been identified through regression
tree analysis, the modeling process moves into the phase where ordinary least squares techniques
are used to obtain a final model.  The research objective here is to identify the extent to which
the identified factors influence emission rates in acceleration mode. Modelers rely on previous
research, a priori knowledge, educated guesses, and stepwise regression procedures to identify
acceptable functional forms, to determine important interactions, and to derive statistically and
theoretically defensible models.  The final model will be our best understanding about the func-
tional relationship between independent variables and dependent variables.

10.3.2.1 NOx Emission Rate Model Development for Acceleration Mode

       Based on previous analysis, truncated transformed NOx will serve as the dependent
variable.  However, modelers should keep in mind that comparisons of statistical model
performance should always be made on the original untransformed scale of Y.
HTBR tree model results suggest that engine power is the best variable to begin with.  A linear
regression model with engine power is developed first, followed by a combined engine power and
vehicle speed model.

10.3.2.1.1 Linear Regression Model with Engine Power

       Let's select engine power to begin with, and estimate the model:
               $Y = \beta_0 + \beta_1\,\text{engine.power} + \text{Error}$                        (1.1)
       The regression run yields the results shown in Table 10-14.

-------
Table 10-14 Regression Result for NOx Model 1.1
 Call: lm(formula = NOx.50 ~ engine.power, data = busdata10242006.1.3,  na.action =
 na.exclude)
 Residuals:
      Min       1Q   Median      3Q    Max
  -0.4093 -0.08133 0.005414 0.07084 0.9344

 Coefficients:
                 Value Std. Error  t value Pr(>|t|)
  (Intercept)   0.3054   0.0021   147.9391   0.0000
 engine.power   0.0008   0.0000    83.3557   0.0000

 Residual standard error: 0.09781 on 18892 degrees of freedom
 Multiple R-Squared: 0.2689
 F-statistic:  6948 on 1 and 18892 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
              (Intercept)
 engine.power -0.9387

 Analysis of Variance Table

 Response: NOx.50

 Terms added sequentially (first to last)
                 Df Sum of Sq  Mean Sq  F Value Pr(F)
 engine.power     1   66.4763 66.47630 6948.175     0
    Residuals 18892  180.7482  0.00957
       These results suggest that engine power explains about 27% of the variance in truncated
transformed NOx. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is statistically
significant. To evaluate the model, residual normality is checked by examining the quantile-quantile (QQ)
plot, and constancy of variance is checked by examining residuals vs. fitted values.
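
The summary statistics quoted for Model 1.1 follow directly from the analysis of variance rows in Table 10-14. The sketch below recomputes them from the printed sums of squares; the variable names are illustrative and only the four input numbers come from the table.

```python
# Sums of squares and degrees of freedom as printed in Table 10-14.
ss_regression = 66.4763
ss_residual = 180.7482
df_regression = 1
df_residual = 18892

# Share of total variation explained by engine power.
r_squared = ss_regression / (ss_regression + ss_residual)

# Mean squared error and the F ratio for testing beta_1 = 0.
mean_sq_residual = ss_residual / df_residual
f_statistic = (ss_regression / df_regression) / mean_sq_residual

# The residual standard error is the square root of the mean squared error.
residual_std_error = mean_sq_residual ** 0.5
```

These values agree with the reported R-Squared of 0.2689, F value of 6948.175, and residual standard error of 0.09781 to within the rounding of the printed table.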

-------
 [Figure 10-28 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
               Figure 10-28  QQ and Residual vs. Fitted Plot for NOx Model 1.1
       The residual plot in Figure 10-28 shows a slight departure from linear regression
assumptions, indicating a need to explore a curvilinear regression function.  Since the variability at the
different X levels appears to be fairly constant, a transformation on X is considered. A
transformation is considered first to avoid the multicollinearity brought about by adding a
second-order term in X. Based on the prototype plot in Figure 10-28, the square root and
logarithmic transformations are tested.  Scatter plots and residual plots based on each transformation
are then prepared and analyzed to determine which transformation is most effective.
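
The multicollinearity concern can be illustrated numerically: over a positive range, X and its square are very highly correlated, so a model containing both has poorly separated coefficients. The sketch below is an illustration only; the 0 to 300 power grid is a hypothetical range chosen for the example, not a range taken from the bus data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical, evenly spaced engine power values (not the test data).
power = [float(p) for p in range(0, 301)]

# Correlation between the predictor and its second-order term.
corr_power_vs_square = pearson(power, [p ** 2 for p in power])
```

A correlation this close to one between the first-order and second-order terms is what a single transformed predictor avoids.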
                  $Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \text{Error}$                    (1.2)
                  $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \text{Error}$              (1.3)
       The result for Model 1.2 is shown in Table 10-15 and Figure 10-29, while the result
for Model 1.3 is shown in Table 10-16 and Figure 10-30.

-------
Table 10-15 Regression Result for NOx Model 1.2
 Call:  lm(formula = NOx.50 ~ engine.power^(1/2),  data = busdata10242006.1.3,
 na.action = na.exclude)
 Residuals:
      Min       1Q   Median      3Q    Max
  -0.4106 -0.07981 0.004093 0.06858 0.9248

 Coefficients:
                          Value Std. Error  t value Pr(>|t|)
            (Intercept)  0.1912   0.0030    63.2141   0.0000
  I(engine.power^(1/2))  0.0196   0.0002    93.5953   0.0000

 Residual standard error: 0.09455 on 18892 degrees of freedom
 Multiple R-Squared: 0.3168
 F-statistic:  8760 on 1 and 18892 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                        (Intercept)
  I(engine.power^(1/2)) -0.9738

 Analysis of Variance Table

 Response: NOx.50

 Terms added sequentially (first to last)
                           Df Sum of Sq  Mean Sq  F Value Pr(F)
  I(engine.power^(1/2))     1   78.3199 78.31986 8760.082     0
              Residuals 18892  168.9047  0.00894
 [Figure 10-29 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
              Figure 10-29 QQ and Residual vs. Fitted Plot for NOx Model 1.2

-------
Table 10-16 Regression Result for NOx Model 1.3
 *** Linear Model ***

 Call:  lm(formula = NOx.50 ~ log10(engine.power + 1),  data = busdata10242006.1.3,
 na.action = na.exclude)
 Residuals:
      Min       1Q   Median      3Q    Max
  -0.4109 -0.07485 0.001841 0.06716 0.9119

 Coefficients:
                            Value Std. Error   t value  Pr(>|t|)
             (Intercept)  -0.0514   0.0052     -9.7873    0.0000
 log10(engine.power + 1)   0.2291   0.0023     99.6000    0.0000

 Residual standard error: 0.09263 on 18892 degrees of freedom
 Multiple R-Squared: 0.3443
 F-statistic:  9920 on 1 and 18892 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                          (Intercept)
 log10(engine.power + 1) -0.9917

 Analysis of Variance Table

 Response: NOx.50

 Terms added sequentially (first to last)
                            Df Sum of Sq  Mean Sq   F Value Pr(F)
 log10(engine.power + 1)     1   85.1206  85.12056  9920.161     0
               Residuals 18892  162.1040   0.00858
 [Figure 10-30 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
              Figure 10-30 QQ and Residual vs. Fitted Plot for NOx Model 1.3

-------
       The results suggest that using square root transformed engine power increases the
amount of variance explained in truncated transformed NOx from about 27% (Model 1.1) to
about 32% (Model 1.2), while using log transformed engine power increases it to about 34%
(Model 1.3).
       Model 1.3 therefore improves R² more than Model 1.2 does. The residuals scatter plot for
Model 1.3 (Figure 10-30) shows a more reasonably linear relationship than that for Model 1.2
(Figure 10-29).  Figure 10-30 also shows that Model 1.3 does a better job of improving the
pattern of variance. The QQ plot shows general normality, with exceptions arising in the tails.
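
The comparison of candidate transformations can be sketched as fitting the same simple regression to each transformed predictor and ranking the fits by R². The example below is a minimal illustration on synthetic, noise-free data generated from a logarithmic curve; the data, the coefficient values 0.05 and 0.2, and all names are assumptions made for the example and are not the bus data set.

```python
import math


def fit_simple_ols(xs, ys):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return b0, b1, 1.0 - sse / sst


# Synthetic illustration only: a response that is exactly logarithmic in power.
power = [float(p) for p in range(1, 301)]
response = [0.05 + 0.2 * math.log10(p + 1.0) for p in power]

# The three functional forms compared in Models 1.1, 1.2 and 1.3.
candidates = {
    "linear": power,
    "sqrt": [math.sqrt(p) for p in power],
    "log10": [math.log10(p + 1.0) for p in power],
}
r2_by_form = {name: fit_simple_ols(x, response)[2] for name, x in candidates.items()}
best_form = max(r2_by_form, key=r2_by_form.get)
```

On real data the residual plots should be inspected as well, since R² alone does not reveal curvature or non-constant variance.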

10.3.2.1.2 Linear Regression Model with Engine Power and Vehicle Speed

       HTBR tree model results also suggest that vehicle speed may be an important predictive
variable for emissions under certain conditions.  After developing a linear regression model with
engine power, adding vehicle speed might improve the model predictive ability. The new model
is proposed as:

         $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{vehicle.speed} + \text{Error}$           (1.4)
       The result for Model 1.4 is shown in Table 10-17 and Figure 10-31.

-------
Table 10-17 Regression Result for NOx Model 1.4
 Call:  lm(formula = NOx.50 ~ log10(engine.power + 1)  + vehicle.speed, data =
       busdata10242006.1.3,  na.action  =  na.exclude)
 Residuals:
      Min       1Q   Median      3Q    Max
  -0.4133 -0.07416 0.004219 0.06303 0.9019

 Coefficients:
                            Value Std. Error  t value Pr(>|t|)
             (Intercept)  -0.0195   0.0053    -3.6693   0.0002
 log10(engine.power + 1)   0.2007   0.0025    79.3288   0.0000
           vehicle.speed   0.0019   0.0001    25.1554   0.0000

 Residual standard error:  0.09112 on 18891 degrees of freedom
 Multiple R-Squared:  0.3656
 F-statistic:  5442 on 2 and 18891 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                         (Intercept)  log10(engine.power + 1)
 log10(engine.power + 1)  -0.9681
           vehicle.speed   0.2383     -0.4470

 Analysis of Variance Table

 Response:  NOx.50

 Terms added sequentially (first to last)
                            Df Sum of Sq  Mean Sq  F Value Pr(F)
 log10(engine.power + 1)     1   85.1206 85.12056 10251.92     0
           vehicle.speed     1    5.2540  5.25404   632.80     0
               Residuals 18891  156.8499  0.00830

-------
 [Figure 10-31 panels: (a) Residual vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
              Figure 10-31 QQ and Residual vs. Fitted Plot for NOx Model 1.4
       The results suggest that by using vehicle speed and transformed engine power, the model
increases the amount of variance explained in truncated transformed NOx from about 34%
(Model 1.3) to about 37% (Model 1.4). The residuals scatter plot for Model 1.4 (Figure 10-31)
shows a more reasonably linear relationship.  Figure 10-31 also shows that Model 1.4 does a
better job of improving the pattern of variance. The QQ plot shows general normality, with
deviation at the tails.
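
The contribution of vehicle speed can be checked with an extra sum of squares (partial F) calculation, using the residual sums of squares printed for Model 1.3 and Model 1.4. This is a sketch with illustrative variable names; the input numbers are those of Tables 10-16 and 10-17.

```python
# Residual sums of squares from Table 10-16 (reduced) and Table 10-17 (full).
sse_reduced = 162.1040     # log10(engine.power + 1) only
sse_full = 156.8499        # log10(engine.power + 1) + vehicle.speed
df_full = 18891            # residual degrees of freedom of the full model
extra_terms = 1            # one added coefficient (vehicle.speed)

# Reduction in unexplained variation attributable to vehicle speed.
extra_ss = sse_reduced - sse_full

# Partial F statistic for adding vehicle speed to the power-only model.
partial_f = (extra_ss / extra_terms) / (sse_full / df_full)

# R-squared of the full model from the same sums of squares.
ss_total = 85.1206 + 5.2540 + 156.8499
r_squared_full = 1.0 - sse_full / ss_total
```

The partial F reproduces the 632.80 listed for vehicle.speed in the sequential analysis of variance, and the R² matches the reported 0.3656.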

10.3.2.1.3 Linear Regression Model with Dummy Variables

       Figure 10-19 suggests that the relationship between NOx and engine power may be
somewhat different across the engine power ranges identified in the tree analysis.  That is, there
may be higher or lower NOx emissions in different engine power operating ranges. One dummy
variable is created to represent different engine power ranges identified in Figure 10-19 for use in
linear regression analysis as illustrated below:
                          Engine power (bhp)   dummy 1
                                < 72.30            1
                                > 72.30            0

-------
       This dummy variable and the interaction between dummy variable and engine power are
then tested to determine whether the use of the variables and interactions can help improve the
model:
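         $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{vehicle.speed} + \beta_3\,\text{dummy1}$
         $\quad + \beta_4\,\text{dummy1} \cdot \log_{10}(\text{engine.power}+1) + \beta_5\,\text{dummy1} \cdot \text{vehicle.speed} + \text{Error}$      (1.5)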
       The result for Model 1.5 is shown in Table 10-18 and Figure 10-32.
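
The dummy variable and its two interaction terms can be built for each observation as sketched below. The function names are illustrative, and assigning a power of exactly 72.30 bhp to the upper range is an assumption, since the coding table leaves the boundary case unspecified.

```python
import math

POWER_CUTPOINT_BHP = 72.30   # cutpoint from the dummy variable coding table


def dummy1(engine_power):
    """1 for the low engine power range, 0 otherwise (boundary case assumed 0)."""
    return 1 if engine_power < POWER_CUTPOINT_BHP else 0


def model_1_5_terms(engine_power, vehicle_speed):
    """Regressor values for Model 1.5, in the order of its coefficients."""
    d = dummy1(engine_power)
    log_power = math.log10(engine_power + 1.0)
    return [
        1.0,                 # intercept
        log_power,           # log10(engine.power + 1)
        vehicle_speed,       # vehicle.speed
        d,                   # dummy1
        d * log_power,       # dummy1 : log10(engine.power + 1)
        d * vehicle_speed,   # dummy1 : vehicle.speed
    ]
```

For an observation above the cutpoint the last three terms are all zero, so the model reduces to the form of Model 1.4 in that range.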
Table 10-18 Regression Result for NOx Model 1.5
 Call:  lm(formula = NOx.50 ~ log10(engine.power + 1)  + vehicle.speed + dummy1
 log10(engine.power + 1) + dummy1:vehicle.speed,  data = busdata10242006.1.3,
        na.action = na.exclude)
 Residuals:
     Min       1Q   Median      3Q    Max
  -0.4124 -0.07157 0.003012 0.06319 0.8924

 Coefficients:
                                   Value Std. Error   t value Pr(>|t|)
                    (Intercept)   0.1439   0.0115     12.4979   0.0000
        log10(engine.power + 1)   0.1281   0.0052     24.8261   0.0000
                  vehicle.speed   0.0023   0.0001     28.9191   0.0000
                         dummy1  -0.1492   0.0148    -10.0783   0.0000
 dummy1:log10(engine.power + 1)   0.0609   0.0081      7.4995   0.0000
           dummy1:vehicle.speed  -0.0035   0.0003    -10.4883   0.0000

 Residual standard error:  0.09022 on 18888 degrees of freedom
 Multiple R-Squared:  0.3781
 F-statistic:  2297 on 5 and 18888 degrees of freedom, the p-value is 0

 Analysis of Variance Table

 Response: NOx.50
 Terms added sequentially (first to last)
                                   Df Sum of Sq  Mean Sq  F Value         Pr(F)
        log10(engine.power + 1)     1   85.1206 85.12056 10456.89 0.000000e+000
                  vehicle.speed     1    5.2540  5.25404   645.45 0.000000e+000
                         dummy1     1    1.9017  1.90166   233.62 0.000000e+000
 dummy1:log10(engine.power + 1)     1    0.3018  0.30180    37.08 1.158203e-009
           dummy1:vehicle.speed     1    0.8955  0.89546   110.01 0.000000e+000
                      Residuals 18888  153.7510  0.00814

-------
 [Figure 10-32 panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
              Figure 10-32 QQ and Residual vs. Fitted Plot for NOx Model 1.5
       The results suggest that by using the dummy variable and its interactions with transformed
engine power and vehicle speed, the model slightly increases the amount of variance explained
in truncated transformed NOx, from about 37% (Model 1.4) to about 38% (Model 1.5).
       The residuals scatter plot for Model 1.5 (Figure 10-32) shows a slightly more linear
relationship. Figure 10-32 also shows that Model 1.5 may do a slightly better job of improving
the pattern of variance. The QQ plot shows general normality, with exceptions arising in the
tails. However, it is important to note that the model improvement, in terms of the amount of
variance explained by the model, is marginal at best.
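
Because dummy1 takes only the values 0 and 1, Model 1.5 is equivalent to two separate regression functions, one for each engine power range. The sketch below collapses the coefficient estimates printed in Table 10-18 into those two sets; the dictionary keys and function name are illustrative.

```python
# Coefficient estimates as printed in Table 10-18.
coef = {
    "intercept": 0.1439,
    "log_power": 0.1281,
    "speed": 0.0023,
    "dummy1": -0.1492,
    "dummy1_log_power": 0.0609,
    "dummy1_speed": -0.0035,
}


def regime_coefficients(d):
    """Return (intercept, log-power slope, speed slope) for dummy1 = d (0 or 1)."""
    return (
        coef["intercept"] + d * coef["dummy1"],
        coef["log_power"] + d * coef["dummy1_log_power"],
        coef["speed"] + d * coef["dummy1_speed"],
    )


high_power_regime = regime_coefficients(0)   # engine power above 72.30 bhp
low_power_regime = regime_coefficients(1)    # engine power below 72.30 bhp
```

In the low power range the collapsed coefficients are approximately -0.0053, 0.1890 and -0.0012, so both the intercept and the sign of the speed effect differ between the two ranges.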

10.3.2.1.4 Model Discussions

       The performance of alternative models can be evaluated by comparing model predictions
with actual observations of emission rates, in terms of precision and accuracy (Neter et al.
1996).  The R² value is an indication of precision. Usually, higher R² values imply a higher
degree of precision and less unexplained variability in model predictions than lower R² values.
The slope of the trend line for the observed versus predicted values is an indication of
accuracy. A slope of one indicates an accurate prediction, in that the prediction of the model
corresponds to an observation. Since the R² and slope are derived by comparing model
predictions and actual observations for emission rates, these numbers will be different from
those reported in the output of the linear regression models.
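
The precision and accuracy measures described above can be sketched as a regression of observed values on predicted values. The direction of that regression (observed on the vertical axis) and the example numbers are assumptions made for illustration; the values are hypothetical and are not taken from the test data.

```python
def observed_vs_predicted(observed, predicted):
    """Fit observed = a + b * predicted and return (slope b, R-squared)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    s_pp = sum((p - mean_pred) ** 2 for p in predicted)
    s_oo = sum((o - mean_obs) ** 2 for o in observed)
    s_po = sum((p - mean_pred) * (o - mean_obs)
               for p, o in zip(predicted, observed))
    slope = s_po / s_pp
    r_squared = (s_po ** 2) / (s_pp * s_oo)
    return slope, r_squared


# Hypothetical emission rates on the untransformed scale.
observed = [0.10, 0.21, 0.29, 0.40]
predicted = [0.10, 0.20, 0.30, 0.40]
slope, r_squared = observed_vs_predicted(observed, predicted)
```

A slope near one with a high R² indicates predictions that are both accurate and precise; predictions from a model fitted on a transformed response should be back-transformed before this comparison.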
       The models' predictive ability is also evaluated using the root mean square error (RMSE)
and the mean prediction error (MPE) (Neter et al. 1996).  The RMSE is a measure of prediction
error. When comparing two models, the model with a smaller RMSE is a better predictor of
the observed phenomenon. Ideally, mean prediction error is close to zero. RMSE and MPE are
calculated as follows:
                                                            Equation (10-1)
                                                            Equation (10-1)
where:
   RMSE:        root mean square error
   n:           number of observations
   y_i:         observation y
   \hat{y}_i:   mean of observation y
   MPE:         mean predictive error
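
The two error measures can be sketched as follows, using their conventional definitions. The divisor n and the sign convention (observed minus predicted) are assumptions, and the example values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error: square root of the mean squared prediction error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mpe(observed, predicted):
    """Mean prediction error, taken here as the mean of (observed - predicted)."""
    n = len(observed)
    return sum(o - p for o, p in zip(observed, predicted)) / n


# Hypothetical values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [0.5, 2.0, 2.5, 4.0]
example_rmse = rmse(observed, predicted)
example_mpe = mpe(observed, predicted)
```

Under this sign convention a positive MPE means the model under-predicts on average, while RMSE measures the typical size of the error regardless of sign.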
       Previous sections describe the model development process from one model to the next.
To test whether the linear regression with power was a beneficial addition to the regression
tree model, the mean ERs at HTBR end nodes (single value) are compared to the predictions
from the linear regression function with engine power. The results of the performance
evaluation are shown in Table 10-19. The improvement in R² associated with moving toward a
linear function of engine power is large (from 0.00026 to 0.190). Hence, the use of the linear
regression function will provide a significant improvement in spatial and temporal model
prediction capability.  But this linear regression function might still be improved.  Since the
R² and slope in Table 10-19 are derived by comparing model predictions and actual observations
for emission rates (untransformed y), these numbers are different from those obtained from
linear regression models.

-------
       Two transforms of engine power were tested: square root transformation and log trans-
formation.  The results of the performance evaluation are shown in Table 10-19. Results suggest
that linear regression function with log transformation performs slightly better.
       The addition of vehicle speed was also tested.  The results of the performance evaluation
are shown in Table 10-19. Analysis results suggest that a linear regression function for engine
power and vehicle speed also performs slightly better.
       Since the regression tree modeling exercise indicated that a number of power cutpoints
may play a role in the emissions process, an additional modeling run was performed. The results
of the performance evaluation are also shown in Table 10-19. Analysis results suggest that a
linear regression function with dummy variables performs slightly better than the model without
the power cutpoints.

Table 10-19 Comparative Performance Evaluation of NOx Emission Rate Models
                                              Coefficient of     Slope
 Model                                        determination      (β1)     RMSE      MPE
 Mean ERs                                     0.00026            1.000    0.10455   0.00001
 Linear Regression (Power)                    0.190              0.838    0.09463   0.00428
 Linear Regression (Power^0.5)                0.215              0.901    0.09321   0.00898
 Linear Regression (log(Power))               0.236              1.012    0.09178   0.00872
 Linear Regression (log(Power)+Speed)         0.268              1.001    0.08982   0.00837
 Linear Regression (log(Power)+Speed+Dummy)   0.280              1.036    0.08912   0.00834
       Although the linear regression function with dummy variables works slightly better than
the linear regression function with engine power and vehicle speed, it introduces more
explanatory variables (the dummy variable and its interactions with engine power and vehicle
speed) and increases the complexity of the regression model.  There is only one regression
function for Model 1.4 while there are two regression functions for Model 1.5.  There is also
no obvious reason why the engine may be performing slightly differently within these power
regimes, yielding different regression slopes and intercepts. The fuel injection systems in
these engines may operate slightly differently under low load (near-idle) and high load
conditions. This fuel injection system may be controlled by the engine computer. There may be
a sufficient number of low power cruise operations and high power cruise operations that are
incorrectly classified, and that may be better classified as idle or acceleration events
(perhaps due to GPS speed data errors). In any case, because the model with dummy variables
does not perform appreciably better than the model without the dummy variables, the dummy
variables are not included in the final model selection at this time. These dummy variables
are, however, worth exploring when additional data from other engine technology groups become
available for analysis. Model 1.4 is selected as the preliminary 'final' model.
       The next step in model evaluation is to once again examine the residuals for the improved
model. A principal objective was to verify that the statistical properties of the regression model
conform with a set of properties of least squares estimators. In summary, these properties require
that the error terms be normally distributed, have a mean of zero, and have uniform variance.

Test for Constancy of Error Variance
       A plot of the residuals versus the fitted values is useful in identifying any patterns in the
residuals.  Figure 10-31(a) shows this plot for the NOx model.  Without considering variance due to
high emission points and zero load data, there is no obvious pattern in the residuals across the
fitted values.

Test for Normality of Error Terms
       The first informal test of the normality of error terms is a
quantile-quantile plot of the residuals. Figure 10-31(c) shows the normal quantile plot for the
NOx model. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality.  Under normality, we expect about 68 percent of the residuals
to fall within $\pm\sqrt{MSE}$ and about 90 percent to fall within $\pm 1.645\sqrt{MSE}$. In fact, 72.64% of
residuals fall within the first limits, while 93.79% fall within the second limits. Thus,
the actual frequencies here are reasonably consistent with those expected under normality.  The
heavy tails at both ends are a cause for concern, but are due to the nature of the data set. For
example, even after the transformation, the response variable is not truly normally distributed.
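
The second informal test can be sketched as follows: the expected shares come from the standard normal distribution, and the actual shares are counted from the residuals. The function names and the small residual list are illustrative assumptions; the residuals shown are hypothetical.

```python
import math


def normal_coverage(k):
    """Probability that a standard normal variable lies within +/- k."""
    return math.erf(k / math.sqrt(2.0))


def fraction_within(residuals, mse, k):
    """Share of residuals whose absolute value is at most k * sqrt(MSE)."""
    limit = k * math.sqrt(mse)
    return sum(1 for r in residuals if abs(r) <= limit) / len(residuals)


expected_first = normal_coverage(1.0)      # share expected within sqrt(MSE)
expected_second = normal_coverage(1.645)   # share expected within 1.645 * sqrt(MSE)

# Hypothetical residuals and MSE, for illustration only.
example_share = fraction_within([-0.2, -0.05, 0.0, 0.05, 0.3], mse=0.01, k=1.0)
```

The expected shares evaluate to roughly 68 and 90 percent, which are the benchmarks against which the observed 72.64% and 93.79% are compared.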
       Based on the above analysis, the final NOx emission model for acceleration mode is:

            $NOx = [-0.0195 + 0.201 \log_{10}(\text{engine.power}+1) + 0.0019\,\text{vehicle.speed}]^2$

       Analysis results support the observation that the final NOx emission model is significantly
better at explaining variability without making the model too complex.  Since there is only one
engine type, added complexity may not be valid in terms of transferability.  This model is specific to the
engine classes employed in the transit bus operations.  Different models may need to be
developed for other engine classes and duty cycles.
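
Applying the final NOx model amounts to evaluating the linear predictor and squaring it to undo the square root transformation of the response. The sketch below uses the coefficients of the final equation; the function name is illustrative, engine power is taken to be in bhp as in the dummy variable coding table, and the units of vehicle speed and of the resulting emission rate are assumed to be those of the underlying data set, which this section does not state.

```python
import math


def predict_nox(engine_power, vehicle_speed):
    """Predicted NOx emission rate on the untransformed scale (final Model 1.4)."""
    root = (-0.0195
            + 0.201 * math.log10(engine_power + 1.0)
            + 0.0019 * vehicle_speed)
    # The model was fitted to the square root of NOx, so squaring back-transforms.
    # Caveat: at very low power and speed the linear predictor can be slightly
    # negative, and squaring then returns a small positive value.
    return root ** 2


example_rate = predict_nox(engine_power=99.0, vehicle_speed=20.0)
```

For an engine power of 99 and a vehicle speed of 20 the linear predictor is 0.4205, giving a predicted rate of about 0.177 in the units of the fitted data.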

-------
10.3.2.2 CO Emission Rate Model Development for Acceleration Mode

       Based on previous analysis, truncated transformed CO will serve as the dependent
variable.  However, modelers should keep in mind that comparisons of statistical models should
always be made on the original untransformed scale of Y.  HTBR tree model
results suggest that engine power is the best variable to begin with.
10.3.2.2.1 Linear Regression Model with Engine Power

       Let's select engine power to begin with, and estimate the model:

                      $Y = \beta_0 + \beta_1\,\text{engine.power} + \text{Error}$                       (2.1)

       The regression run yields the results shown in Table 10-20.

Table 10-20 Regression Result for CO Model 2.1
 Call:  lm(formula = log.CO ~ engine.power,  data = busdata10242006.1.3,  na.action =
 na.exclude)
 Residuals:
     Min      1Q   Median     3Q   Max
  -3.151 -0.3515 -0.05231 0.3448 1.453

 Coefficients:
                  Value Std. Error   t value  Pr(>|t|)
  (Intercept)   -1.8549    0.0100  -185.2318    0.0000
 engine.power    0.0031    0.0000    69.7761    0.0000

 Residual standard error: 0.473 on 18862 degrees of freedom
 Multiple R-Squared:  0.2052
 F-statistic:  4869 on 1 and 18862 degrees of freedom,  the p-value is  0

 Correlation of Coefficients:
              (Intercept)
 engine.power -0.939

 Analysis of Variance Table

 Response: log.CO

 Terms added sequentially (first to last)
                 Df Sum of Sq  Mean Sq  F Value Pr(F)
 engine.power     1  1089.300 1089.300 4868.698     0
    Residuals 18862  4220.097    0.224
      The results suggest that engine power explains about 21% of the variance in truncated
transformed CO. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is statistically
significant. To evaluate the model, normality is examined in the QQ plot and constancy of variance
is checked by examining residuals vs. fitted values.

-------
 [Figure 10-33 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
                Figure 10-33 QQ and Residual vs. Fitted Plot for CO Model 2.1
       The residual plot in Figure 10-33 shows a slight departure from linear regression
assumptions, indicating a need to explore a curvilinear regression function. Since the variability at the
different X levels appears to be fairly constant, a transformation on X is considered. A
transformation is considered first to avoid the multicollinearity brought about by adding a
second-order term in X. Based on the prototype plot in Figure 10-33, the square root and
logarithmic transformations were tested. Scatter plots and residual plots based on each transformation
were then prepared and analyzed to determine which transformation is most effective.
                  $Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \text{Error}$                    (2.2)
                  $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \text{Error}$              (2.3)
       The result for Model 2.2 is shown in Table 10-21 and Figure 10-34, while the result for
Model 2.3 is shown in Table 10-22 and Figure 10-35.

-------
Table 10-21 Regression Result for CO Model 2.2
 Call:  lm(formula = log.CO ~ engine.power^(1/2),  data  =  busdata10242006.1.3,
 na.action = na.exclude)
 Residuals:
     Min      1Q  Median     3Q  Max
  -2.798 -0.3492 -0.0529 0.3381 1.52

 Coefficients:
                           Value Std. Error    t value  Pr(>|t|)
            (Intercept)  -2.3146    0.0149  -155.8023     0.0000
  I(engine.power^(1/2))   0.0793    0.0010    77.1161     0.0000

 Residual standard error: 0.4626 on 18862 degrees of freedom
 Multiple R-Squared:  0.2397
 F-statistic:  5947 on 1 and 18862 degrees of freedom,  the p-value is 0

 Correlation of Coefficients:
                        (Intercept)
  I(engine.power^(1/2)) -0.974

 Analysis of Variance Table

 Response: log.CO

 Terms added sequentially (first to last)
                           Df Sum of Sq  Mean Sq   F Value Pr(F)
  I(engine.power^(1/2))     1  1272.706  1272.706 5946.896     0
              Residuals 18862  4036.691     0.214
 [Figure 10-34: only the "Residuals Normal QQ" panel title is recoverable]
               Figure 10-34 QQ and Residual vs. Fitted Plot for CO Model 2.2

-------
Table 10-22 Regression Result for CO Model 2.3
 Call:  lm(formula = log.CO ~ log10(engine.power + 1), data = busdata10242006.1.3,
 na.action = na.exclude)
 Residuals:
     Min      1Q   Median     3Q   Max
  -2.187 -0.3475 -0.05182 0.3313 2.475

 Coefficients:
                             Value Std. Error   t value  Pr(>|t|)
             (Intercept)   -3.2695    0.0261  -125.3639    0.0000
 log10(engine.power + 1)    0.9152    0.0114    80.0560    0.0000

 Residual standard error: 0.4584 on 18862 degrees of freedom
 Multiple R-Squared: 0.2536
 F-statistic:  6409 on 1 and 18862 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                         (Intercept)
 log10(engine.power + 1) -0.9918

 Analysis of Variance Table

 Response: log.CO

 Terms added sequentially (first to last)
                            Df Sum of Sq  Mean Sq  F Value Pr(F)
 log10(engine.power + 1)     1  1346.515 1346.515 6408.966     0
               Residuals 18862  3962.882    0.210
 [Figure 10-35 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
               Figure 10-35 QQ and Residual vs. Fitted Plot for CO Model 2.3

-------
       The results suggest that by using transformed engine power, the model increases the
amount of variance explained in truncated transformed CO from about 21% (Model 2.1) to about
24% with the square root transformation (Model 2.2) and to about 25% with the log
transformation (Model 2.3).

       Model 2.3 therefore improves R² more than Model 2.2 does. The residuals scatter plot for
Model 2.3 (Figure 10-35) shows a more reasonably linear relationship than that for Model 2.2
(Figure 10-34). Figure 10-35 also shows that Model 2.3 does a better job of improving the
pattern of variance. The QQ plot shows general normality, with exceptions arising in the tails.

10.3.2.2.2 Linear Regression Model with Engine Power and Vehicle Speed

       HTBR tree model results also suggest that vehicle speed may be an important predictive
variable for emissions under certain conditions.  After developing a linear regression model with
engine power, adding vehicle speed might improve the model predictive ability. The new model
is proposed as:

          $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{vehicle.speed} + \text{Error}$            (2.4)

       The result for Model 2.4 is shown in Table 10-23 and Figure 10-36.

Table 10-23 Regression Result for CO Model 2.4
 Call:  lm(formula = log.CO ~ log10(engine.power + 1)  + vehicle.speed, data =
        busdata10242006.1.3,  na.action = na.exclude)
 Residuals:
     Min     1Q   Median     3Q   Max
  -2.299 -0.236 -0.02889 0.2281 3.209

 Coefficients:
                             Value Std. Error   t value  Pr(>|t|)
             (Intercept)   -3.7472    0.0225  -166.3169    0.0000
 log10(engine.power + 1)    1.3412    0.0107   125.1282    0.0000
           vehicle.speed   -0.0285    0.0003   -89.0585    0.0000

 Residual standard error:  0.3846 on 18861 degrees of freedom
 Multiple R-Squared:  0.4746
 F-statistic:  8517 on 2 and 18861 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                         (Intercept)  log10(engine.power + 1)
 log10(engine.power + 1) -0.9683
           vehicle.speed  0.2380     -0.4463

 Analysis of Variance Table

 Response: log.CO

 Terms added sequentially (first to last)
                            Df Sum of Sq  Mean Sq  F Value Pr(F)
 log10(engine.power + 1)     1  1346.515 1346.515 9103.577     0
           vehicle.speed     1  1173.140 1173.140 7931.415     0
               Residuals 18861  2789.742    0.148
                                         10-46

-------
[Figure 10-36 panels: (a) Residual vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
               Figure 10-36 QQ and Residual vs. Fitted Plot for CO Model 2.4
       The results suggest that by using vehicle speed together with transformed engine power,
the model increases the amount of variance explained in truncated transformed CO from about
25% to about 47%.
       Model 2.4 substantially improves on the R2 achieved by Model 2.3. The residuals scatter
plot for Model 2.4 (Figure 10-36) shows a reasonably linear relationship. Figure 10-36 also
shows that Model 2.4 does a slightly better job in improving the pattern of variance. The QQ
plot shows general normality, with exceptions arising in the tails.
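
       As a quick arithmetic check, the R2 reported in Table 10-23 can be recovered from the
sequential sums of squares in its analysis of variance table, because R2 is the explained sum
of squares divided by the total sum of squares. The Python sketch below is illustrative only;
the function name is hypothetical and the inputs are the values printed in the table.

```python
def r_squared_from_anova(term_sums_of_squares, residual_sum_of_squares):
    """Return R-squared as explained SS / (explained SS + residual SS)."""
    explained = sum(term_sums_of_squares)
    return explained / (explained + residual_sum_of_squares)


# Sequential sums of squares for Model 2.4 (Table 10-23):
# log10(engine.power + 1), then vehicle.speed, then the residuals.
r2_model_2_4 = r_squared_from_anova([1346.515, 1173.140], 2789.742)
print(round(r2_model_2_4, 4))  # 0.4746, matching the reported Multiple R-Squared
```

Applying the same function to the first term alone (1346.515 explained, with the remaining
sum of squares treated as residual) gives roughly 0.25, which is the "about 25%" quoted for
the model with transformed engine power only.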

10.3.2.2.3 Linear Regression Model with Dummy Variables

       Figure 10-22 suggests that the relationship between CO and engine power may be some-
what different across the engine power ranges identified in the tree analysis.  That is, there may
be higher or lower CO emissions in different engine power operating ranges. One dummy vari-
able is created to represent different engine power ranges identified in Figure 10-22 for use in
linear regression analysis as illustrated below:
                       Engine power (bhp)   Dummy 1
                            < 82.625            1
                            > 82.625            0

-------
       This dummy variable and the interaction between dummy variable and engine power are
then tested to determine whether the use of the variable and interactions can help improve the
model.
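 $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{vehicle.speed} + \beta_3\,\text{dummy1}$
 $\quad + \beta_4\,\text{dummy1}\cdot\log_{10}(\text{engine.power}+1) + \beta_5\,\text{dummy1}\cdot\text{vehicle.speed} + \text{Error}$            (2.5)
       The results for Model 2.5 are shown in Table 10-24 and Figure 10-37.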
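
       To make the role of the dummy variable concrete, the following Python sketch codes
the indicator from the 82.625 bhp cutpoint and evaluates the Model 2.5 linear predictor
using the coefficient values reported in Table 10-24. It is a minimal illustration, not the
analysis code used in this study: the function and constant names are hypothetical, and
assigning an observation exactly at the cutpoint to the upper group is an assumption, since
the table above lists only "<" and ">".

```python
import math

ENGINE_POWER_CUTPOINT_BHP = 82.625  # cutpoint from the tree analysis (Figure 10-22)

# Coefficient values as reported in Table 10-24 for Model 2.5.
B0, B1, B2, B3, B4, B5 = -4.4320, 1.6746, -0.0333, 1.4402, -1.0349, 0.0414


def dummy1(engine_power_bhp):
    """Return 1 below the cutpoint, otherwise 0 (tie handling is assumed)."""
    return 1 if engine_power_bhp < ENGINE_POWER_CUTPOINT_BHP else 0


def predict_log_co_model_2_5(engine_power_bhp, vehicle_speed):
    """Fitted value of the transformed CO response under Model 2.5."""
    d = dummy1(engine_power_bhp)
    log_power = math.log10(engine_power_bhp + 1)
    return (B0 + B1 * log_power + B2 * vehicle_speed
            + B3 * d + B4 * d * log_power + B5 * d * vehicle_speed)
```

In effect the dummy variable and its interactions fit a separate line in the low power
range: below the cutpoint the slope on transformed engine power becomes
1.6746 - 1.0349 = 0.6397, while above it the main-effect slope of 1.6746 applies.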

-------
Table 10-24 Regression Result for CO Model 2.5
 Call:  lm(formula = log.CO ~ log10(engine.power + 1) + vehicle.speed + dummy1 *
        log10(engine.power + 1) + dummy1 * vehicle.speed, data = busdata10242006.1.3,
        na.action = na.exclude)
 Residuals:
     Min     1Q   Median     3Q   Max
  -2.383 -0.233 -0.02602 0.2235 2.124

 Coefficients:
                                   Value Std. Error    t value  Pr(>|t|)
                    (Intercept)  -4.4320    0.0498    -89.0217    0.0000
        log10(engine.power + 1)   1.6746    0.0222     75.4956    0.0000
                  vehicle.speed  -0.0333    0.0003   -102.3796    0.0000
                         dummy1   1.4402    0.0614     23.4537    0.0000
 dummy1:log10(engine.power + 1)  -1.0349    0.0321    -32.2634    0.0000
           dummy1:vehicle.speed   0.0414    0.0013     32.8802    0.0000

 Residual standard error:  0.3655 on 18858 degrees of freedom
 Multiple R-Squared:  0.5255
 F-statistic:  4177 on 5 and 18858 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                                (Intercept)  log10(engine.power + 1)
        log10(engine.power + 1)   -0.9926
                  vehicle.speed    0.3000       -0.4020
                         dummy1   -0.8108        0.8047
 dummy1:log10(engine.power + 1)    0.6864       -0.6915
           dummy1:vehicle.speed   -0.0774        0.1038

                                vehicle.speed   dummy1
                         dummy1   -0.2432
 dummy1:log10(engine.power + 1)    0.2780      -0.9559
           dummy1:vehicle.speed   -0.2581       0.0018

                                dummy1:log10(engine.power + 1)
           dummy1:vehicle.speed   -0.1467

 Analysis of Variance Table

 Response: log.CO

 Terms added sequentially (first to last)
                                   Df Sum of Sq  Mean Sq   F Value Pr(F)
        log10(engine.power + 1)     1  1346.515 1346.515  10079.07     0
                  vehicle.speed     1  1173.140 1173.140   8781.31     0
                         dummy1     1    23.180   23.180    173.51     0
 dummy1:log10(engine.power + 1)     1   102.793  102.793    769.44     0
           dummy1:vehicle.speed     1   144.430  144.430   1081.10     0
                      Residuals 18858  2519.338    0.134

-------
[Figure 10-37 panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
               Figure 10-37 QQ and Residual vs. Fitted Plot for CO Model 2.5
       Adding the dummy variable and its interactions improves R2 from about 0.47 in Model
2.4 to about 0.53 in Model 2.5. The residuals scatter plot for Model 2.5 (Figure 10-37) shows a
slightly more linear relation. Figure 10-37 also suggests that Model 2.5 may improve the pattern
of variance. The QQ plot again shows general normality, with exceptions arising in the tails.
However, it is important to note that the model improvement, in terms of the amount of variance
explained by the model, is not large.
       Three more dummy variables are then created to represent the engine power and
vehicle speed ranges identified in Figure 10-22, as shown below:

 Thresholds                                               Dummy21  Dummy22  Dummy23
 engine.power < 82.625                                       1        0        0
 engine.power [82.625, 152.96] & vehicle.speed < 19.05       0        1        0
 engine.power > 152.96 & vehicle.speed < 19.05               0        0        1
 engine.power > 82.625 & vehicle.speed > 19.05               0        0        0

       These three dummy variables and their interactions with engine power and vehicle
speed are added to improve the model. The model becomes:

 $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{vehicle.speed}$
 $\quad + \beta_3\,\text{dummy21} + \beta_4\,\text{dummy21}\cdot\log_{10}(\text{engine.power}+1) + \beta_5\,\text{dummy21}\cdot\text{vehicle.speed}$
 $\quad + \beta_6\,\text{dummy22} + \beta_7\,\text{dummy22}\cdot\log_{10}(\text{engine.power}+1) + \beta_8\,\text{dummy22}\cdot\text{vehicle.speed}$
 $\quad + \beta_9\,\text{dummy23} + \beta_{10}\,\text{dummy23}\cdot\log_{10}(\text{engine.power}+1) + \beta_{11}\,\text{dummy23}\cdot\text{vehicle.speed} + \text{Error}$
                           (2.6)
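
       The coding of the three indicators can be sketched in Python as follows. This is a
minimal illustration under stated assumptions rather than the code used in this study: the
function and constant names are hypothetical, and the treatment of observations lying exactly
on a cutpoint (for example, vehicle speed equal to 19.05) is assumed, because the threshold
list does not specify it.

```python
POWER_LOW_BHP = 82.625    # lower engine power cutpoint
POWER_HIGH_BHP = 152.96   # upper engine power cutpoint
SPEED_CUTPOINT = 19.05    # vehicle speed cutpoint


def dummy_set_2(engine_power_bhp, vehicle_speed):
    """Return (dummy21, dummy22, dummy23) for a single observation."""
    if engine_power_bhp < POWER_LOW_BHP:
        return (1, 0, 0)
    if vehicle_speed < SPEED_CUTPOINT:
        if engine_power_bhp <= POWER_HIGH_BHP:
            return (0, 1, 0)
        return (0, 0, 1)
    # Reference group: higher engine power at higher vehicle speed.
    return (0, 0, 0)
```

Because the fourth range is coded as all zeros, it serves as the reference group: the main
effect coefficients describe that range, and each dummy coefficient measures a shift
relative to it.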

-------
       The results for Model 2.6 are shown in Table 10-25 and Figure 10-38.
Table 10-25 Regression Result for CO Model 2.6
 *** Linear Model ***
 Call:  lm(formula = log.CO ~ log10(engine.power + 1) + vehicle.speed + dummy21 *
        log10(engine.power + 1) + dummy21 * vehicle.speed + dummy22 * log10(
        engine.power + 1) + dummy22 * vehicle.speed + dummy23 * log10(
        engine.power + 1) + dummy23 * vehicle.speed, data =
        busdata10242006.1.3, na.action = na.exclude)
 Residuals:
     Min      1Q   Median     3Q   Max
  -2.562 -0.2086 -0.02372 0.2012 2.124

 Coefficients:
                                    Value Std. Error   t value  Pr(>|t|)
                     (Intercept)  -3.5895    0.0945   -37.9720    0.0000
         log10(engine.power + 1)   1.1014    0.0389    28.3316    0.0000
                   vehicle.speed  -0.0150    0.0007   -21.0912    0.0000
                         dummy21   0.5978    0.1007     5.9384    0.0000
                         dummy22  -1.4856    0.2216    -6.7035    0.0000
                         dummy23  -2.3863    0.1632   -14.6202    0.0000
 dummy21:log10(engine.power + 1)  -0.4617    0.0448   -10.3020    0.0000
           dummy21:vehicle.speed   0.0231    0.0014    16.8659    0.0000
 dummy22:log10(engine.power + 1)   0.8643    0.1048     8.2494    0.0000
           dummy22:vehicle.speed  -0.0194    0.0016   -12.1421    0.0000
 dummy23:log10(engine.power + 1)   1.3505    0.0701    19.2614    0.0000
           dummy23:vehicle.speed  -0.0387    0.0012   -30.9943    0.0000

 Residual standard error: 0.3517 on 18852 degrees of freedom
 Multiple R-Squared:  0.5609
 F-statistic:  2189 on 11 and 18852 degrees of freedom, the p-value is 0

 Analysis of Variance Table

 Response: log.CO

 Terms added sequentially (first to last)
                                    Df Sum of Sq  Mean Sq   F Value         Pr(F)
         log10(engine.power + 1)     1  1346.515 1346.515  10887.89  0.000000e+000
                   vehicle.speed     1  1173.140 1173.140   9485.98  0.000000e+000
                         dummy21     1    23.180   23.180    187.44  0.000000e+000
                         dummy22     1    67.463   67.463    545.50  0.000000e+000
                         dummy23     1   100.345  100.345    811.39  0.000000e+000
 dummy21:log10(engine.power + 1)     1    35.491   35.491    286.98  0.000000e+000
           dummy21:vehicle.speed     1    93.450   93.450    755.63  0.000000e+000
 dummy22:log10(engine.power + 1)     1     3.681    3.681     29.76  4.942365e-008
           dummy22:vehicle.speed     1     3.564    3.564     28.82  8.032376e-008
 dummy23:log10(engine.power + 1)     1    12.318   12.318     99.61  0.000000e+000
           dummy23:vehicle.speed     1   118.804  118.804    960.65  0.000000e+000
                       Residuals 18852  2331.445    0.124

-------
[Figure 10-38 panels: (a) Residual vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
               Figure 10-38 QQ and Residual vs. Fitted Plot for CO Model 2.6
-------
comparing model predictions and actual observations for emission rates (untransformed y), these
numbers will be different from those obtained from linear regression models.

Table 10-26  Comparative Performance Evaluation of CO Emission Rate Models
                                                    Coefficient of
                                                    determination    Slope     RMSE       MPE
Mean ERs                                               0.00003       1.000    0.16032   -0.00002
Linear Regression (Power)                              0.0462        1.180    0.16516    0.05229
Linear Regression (Power^0.5)                          0.0502        1.227    0.16420    0.05006
Linear Regression (log(Power))                         0.0553        1.534    0.16455    0.05120
Linear Regression (log(Power)+Speed)                   0.392         2.161    0.14252    0.04211
Linear Regression (log(Power)+Speed+Dummy Set 1)       0.406         1.765    0.13632    0.03689
Linear Regression (log(Power)+Speed+Dummy Set 2)       0.437         1.242    0.12565    0.03003
       The improvement in R2 associated with moving toward a linear function of engine power
is significant. Hence, the use of the linear regression function will provide a significant
improvement in spatial and temporal model prediction capability. However, this linear regression
function might still be improved.
       Results suggest that a linear regression function with log transformation performs slightly
better than the others and that the use of dummy variables can further improve model
performance. Although the linear regression function with dummy variables performs slightly better
than the linear regression function with log transformation, the introduction of more explanatory
variables (dummy variables and the interaction with engine power) increases the complexity
of the regression model. As discussed in Section 10.3.2.1.4, there is no compelling reason to
include the dummy variables in the model since: 1) the models with dummy variables are more
complex without significantly improving model performance, and 2) there is no compelling
engineering reason at this time to support the difference in model performance within these specific
power regions. Yet, given the explanatory power of the power cutpoint dummy variables (a 10%
increase in explained variance), additional investigation into why these variables turn out to
be significant is definitely warranted. It may be wise to include such cutpoints in on-road
models for various engine technology groups, and such dummy variables are worth exploring
when additional data from other engine technology groups become available for analysis.
       It can be argued that inclusion of the dummy variables for power is warranted. However,
Model 2.4 is chosen as the preliminary 'final' model based solely upon ease of implementation.
The next step in model evaluation is to once again examine the residuals for the improved model.
A principal objective is to verify that the statistical properties of the regression model conform
to a set of properties of least squares estimators. In summary, these properties require that the
error terms be normally distributed, have a mean of zero, and have uniform variance.

Test for Constancy of Error Variance
       A plot of the residuals versus the fitted values is useful in identifying patterns in the
residuals.  Figure  10-36 plot (a) shows this plot for CO Model 2.4. Without considering variance
due to high emission points and zero load data, there is no obvious pattern in the residuals across
the fitted values.

Test of Normality  of Error Terms
       The first informal test normally used to check the normality of error terms is a
quantile-quantile plot of the residuals. Figure 10-36 plot (c) shows the normal quantile plot for
CO Model 2.4. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality. Under normality, we expect 68 percent of the residuals
to fall between $\pm\sqrt{MSE}$ and about 90 percent to fall between $\pm 1.645\sqrt{MSE}$. In this
case, 87.35% of residuals fall within the first limits, while 92.19% of residuals fall within the
second limits. Thus, the actual frequencies here are reasonably consistent with those expected
under normality. The heavy tails at both ends are a cause for concern, but are due to the nature of
the data set. For example, even after the transformation, the response variable is not truly
normally distributed.
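
       The second informal test can be sketched in Python as shown below. The sketch is
illustrative only: the function names are hypothetical, MSE is taken as the residual sum of
squares divided by the residual degrees of freedom, and the residuals are synthetic normal
draws generated for the example, not the bus data analyzed in this study.

```python
import math
import random


def fraction_within(residuals, half_width):
    """Share of residuals whose absolute value is at most half_width."""
    inside = sum(1 for r in residuals if abs(r) <= half_width)
    return inside / len(residuals)


def normality_frequency_check(residuals, n_coefficients):
    """Return the shares within 1 and 1.645 root-MSE of zero."""
    degrees_of_freedom = len(residuals) - n_coefficients
    mse = sum(r * r for r in residuals) / degrees_of_freedom
    root_mse = math.sqrt(mse)
    return (fraction_within(residuals, root_mse),
            fraction_within(residuals, 1.645 * root_mse))


# Synthetic, normally distributed residuals (NOT the study data).
rng = random.Random(0)
synthetic_residuals = [rng.gauss(0.0, 0.38) for _ in range(20000)]
share_1, share_2 = normality_frequency_check(synthetic_residuals, n_coefficients=3)
print(f"within 1 root-MSE: {share_1:.3f}; within 1.645 root-MSE: {share_2:.3f}")
```

For truly normal residuals the two shares should be close to 0.68 and 0.90. Shares well
above 0.68 for the first limit indicate a distribution that is more peaked in the center and
heavier in the tails than the normal distribution.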
       Based on the above analysis, the final CO emission model for cruise mode is:
            $CO = 10^{[-3.747 + 1.341\log_{10}(\text{engine.power}+1) - 0.0285\,\text{vehicle.speed}]}$
       Analysis results support the observation that the final CO emission model (2.4) is signifi-
cantly better at explaining variability without making the model too complex.  Since there is only
one engine type, complexity may not be valid in terms of transferability.  This model is specific
to the engine classes employed in the transit bus operations.  Different models may need to be
developed for other engine classes and duty cycles.
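
       For implementation purposes, the final CO model can be written as a short function that
evaluates the linear predictor on the log10 scale and then back-transforms it. The Python
sketch below is a minimal illustration; the function and argument names are hypothetical, and
the inputs are assumed to be in the same units as the engine power and vehicle speed data used
to fit Model 2.4.

```python
import math


def co_emission_rate(engine_power_bhp, vehicle_speed):
    """Back-transform the Model 2.4 fit from log10(CO) to CO."""
    log10_co = (-3.747
                + 1.341 * math.log10(engine_power_bhp + 1)
                - 0.0285 * vehicle_speed)
    return 10 ** log10_co
```

The positive coefficient on transformed engine power means that the predicted CO emission
rate rises with engine power at a given speed, while the negative speed coefficient means
that it falls with vehicle speed at a given engine power.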

10.3.2.3 HC Emission Rate Model Development for Acceleration Mode

       Based on previous analysis, truncated transformed HC will serve as the dependent
variable. However, modelers should keep in mind that comparisons between statistical models
should always be made on the original untransformed scale of Y. HTBR tree model results
suggest that engine power is the best variable to begin with.

-------
10.3.2.3.1 Linear Regression with Engine Power

       Let's select engine power to begin with, and estimate the model:


                       $Y = \beta_0 + \beta_1\,\text{engine.power} + \text{Error}$                         (3.1)


       The regression run yields the results shown in Table 10-27 and Figure 10-39.

Table 10-27 Regression Result for HC Model 3.1
 Call:  lm(formula = HC.25 ~ engine.power, data = busdata10242006.1.3, na.action =
 na.exclude)
 Residuals:
      Min       1Q      Median      3Q    Max
  -0.1285 -0.02417 -0.00003173 0.02467 0.2904

 Coefficients:
                 Value Std. Error  t value Pr(>|t|)
  (Intercept)    0.1840   0.0009   216.4203   0.0000
 engine.power    0.0001   0.0000    32.4947   0.0000

 Residual standard error: 0.03989 on 18328 degrees of freedom
 Multiple R-Squared: 0.05447
 F-statistic:  1056 on 1 and 18328 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
              (Intercept)
 engine.power  -0.938

 Analysis of Variance Table

 Response: HC.25

 Terms added sequentially (first to last)
                 Df Sum of Sq  Mean Sq  F Value Pr(F)
 engine.power      1   1.67991 1.679912 1055.908     0
    Residuals  18328  29.15918 0.001591

-------
[Figure 10-39 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
               Figure 10-39 QQ and Residual vs. Fitted Plot for HC Model 3.1
       The results suggest that engine power explains about 5% of the variance in truncated
transformed HC. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is
statistically significant. To evaluate the model, normality is examined in the QQ plot and
constancy of variance is checked by examining residuals vs. fitted values.
       The residual plot in Figure 10-39 shows a slight departure from linear regression assump-
tions indicating a need to explore a curvilinear regression function.  Since the variability at the
different X levels appears to be fairly constant, a transformation on X is considered.  The reason
to consider transformation first is to avoid multicollinearity brought about by adding the second-
order of X. Based on the prototype plot in Figure 10-39, the square root transformation and loga-
rithmic transformation are tested.  Scatter plots and residual plots based on each transformation
should then be prepared and analyzed to determine which transformation is most effective.
                      $Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \text{Error}$                      (3.2)
                   $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \text{Error}$                      (3.3)
       The results for Model 3.2 are shown in Table 10-28 and Figure 10-40, while the results for
Model 3.3 are shown in Table 10-29 and Figure 10-41.

-------
Table 10-28 Regression Result for HC Model 3.2
 Call:  lm(formula = HC.25 ~ engine.power^(1/2), data = busdata10242006.1.3, na.action
 = na.exclude)
 Residuals:
      Min       1Q     Median     3Q    Max
  -0.1173 -0.02389 -0.0002473 0.0244  0.2969

 Coefficients:
                          Value Std. Error  t value Pr(>|t|)
            (Intercept)  0.1625   0.0013   127.4341   0.0000
 I(engine.power^(1/2))   0.0034   0.0001    38.2005   0.0000

 Residual standard error: 0.03948 on 18328 degrees of freedom
 Multiple R-Squared: 0.07375
 F-statistic: 1459 on 1 and 18328 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                        (Intercept)
 I(engine.power^(1/2))  -0.9735

 Analysis of Variance Table

 Response: HC.25

 Terms added sequentially (first to last)
                          Df Sum of Sq  Mean Sq  F Value Pr(F)
 I(engine.power^(1/2))      1   2.27433  2.274333  1459.28     0
             Residuals 18328  28.56475  0.001559
[Figure 10-40 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
               Figure 10-40 QQ and Residual vs. Fitted Plot for HC Model 3.2
-------
Table 10-29 Regression Result for HC Model 3.3
 Call:  lm(formula = HC.25 ~ log10(engine.power + 1), data = busdata10242006.1.3,
 na.action = na.exclude)
 Residuals:
      Min       1Q      Median      3Q    Max
  -0.1186 -0.02345 -0.00007336 0.02386 0.3004

 Coefficients:
                           Value Std. Error t value Pr(>|t|)
             (Intercept)   0.1136  0.0022    50.8911  0.0000
 log10(engine.power + 1)   0.0426  0.0010    43.4726  0.0000

 Residual standard error: 0.03906 on 18328 degrees of freedom
 Multiple R-Squared: 0.09347
 F-statistic:  1890 on 1 and 18328 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                         (Intercept)
 log10(engine.power + 1)  -0.9916

 Analysis of Variance Table

 Response: HC.25

 Terms added sequentially (first to last)
                            Df Sum of Sq  Mean Sq  F Value Pr(F)
 log10(engine.power + 1)      1   2.88268 2.882681 1889.863      0
               Residuals 18328  27.95641 0.001525
[Figure 10-41 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
               Figure 10-41 QQ and Residual vs. Fitted Plot for HC Model 3.3

-------
       The results suggest that by using transformed engine power, the model increases the
amount of variance explained in truncated transformed HC from about 5% to about 9%.
       Model 3.3 improves R2 relative to Model 3.2. The residuals scatter plot for Model 3.3
(Figure 10-41) also shows a more reasonably linear relation than Model 3.2 (Figure 10-40).
Figure 10-41 also shows that Model 3.3 does a better job in improving the pattern of variance.
The QQ plot shows general normality, with exceptions arising in the tails.

10.3.2.3.2 Linear Regression Model with Dummy Variables

       Figure  10-26 suggests that the relationship between HC and engine power may differ
across the engine power ranges.  One dummy variable is created to represent different engine
power ranges identified in Figure 10-26 for use in linear regression analysis as illustrated below:

                       Engine power (bhp)   Dummy 1
                            < 54.555            1
                            > 54.555            0

       This dummy variable and the  interaction between dummy variable and engine power are
then tested to determine whether the use of the variable and interaction can help improve the
model.

 $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{dummy1} + \beta_3\,\text{dummy1}\cdot\log_{10}(\text{engine.power}+1) + \text{Error}$ (3.4)

       The results for Model 3.4 are shown in Table 10-30 and Figure 10-42.

-------
Table 10-30 Regression Result for HC Model 3.4
 Call:  lm(formula = HC.25 ~ log10(engine.power + 1) + dummy1 * log10(engine.power +
 1), data = busdata10242006.1.3, na.action = na.exclude)
 Residuals:
      Min       1Q    Median     3Q   Max
  -0.1278 -0.02305 0.0002278 0.0231 0.314

 Coefficients:
                                   Value Std. Error   t value Pr(>|t|)
                    (Intercept)   0.1734    0.0042    41.4191   0.0000
        log10(engine.power + 1)   0.0171    0.0018     9.4715   0.0000
                         dummy1  -0.0643    0.0062   -10.3151   0.0000
 dummy1:log10(engine.power + 1)   0.0195    0.0039     4.9731   0.0000

 Residual standard error: 0.03873 on 18326 degrees of freedom
 Multiple R-Squared: 0.1084
 F-statistic: 742.8 on 3 and 18326 degrees of freedom, the p-value is 0

 Analysis of Variance Table

 Response: HC.25

 Terms added sequentially (first to last)
                                   Df Sum of Sq  Mean Sq   F Value         Pr(F)
        log10(engine.power + 1)     1   2.88268 2.882681  1921.331 0.000000e+000
                         dummy1     1   0.42377 0.423774   282.449 0.000000e+000
 dummy1:log10(engine.power + 1)     1   0.03711 0.037107    24.732 6.647205e-007
                      Residuals 18326  27.49553 0.001500

-------
[Figure 10-42 panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
               Figure 10-42 QQ and Residual vs. Fitted Plot for HC Model 3.4
       The results suggest that adding the dummy variable and its interaction to transformed
engine power only increases the amount of variance explained in truncated transformed HC from
about 9% to about 11%.
       Model 3.4 slightly improves R2 relative to Model 3.3. The residuals scatter plot for Model
3.4 (Figure 10-42) is not appreciably better, nor does Model 3.4 do a better job in improving the
pattern of variance. The QQ plot still shows general normality, with exceptions arising in the tails.

10.3.2.3.3 Model Discussions

       The previous sections outline the model development process from regression tree
model, to a simple OLS model, to more complex OLS models. To test whether the linear
regression with power was a beneficial addition to the regression tree model, the mean ERs at HTBR
end nodes (single value) were compared to the predictions from the linear regression function
with engine power. The results of the performance evaluation are shown in Table 10-31. The
improvement in R2 associated with moving toward a linear function of engine power is nearly
imperceptible. Hence, the use of the linear regression function will provide almost no
improvement in spatial and temporal model prediction capability. This linear regression
function might still be improved. Since the R2 and slope in Table 10-31 are derived by comparing
model predictions and actual observations for emission rates, these numbers will be different
from those obtained from linear regression models.
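
       A comparison of this kind, made on the untransformed emission rate scale, can be
sketched in Python as follows. The definitions used here are assumptions, since this section
does not state them: R2 is taken as the squared correlation between observed and predicted
emission rates, the slope as the least-squares slope of observed values regressed on predicted
values, RMSE as the root mean square of the prediction errors, and MPE as the mean of
(predicted minus observed). The function name is hypothetical.

```python
import math


def evaluate_predictions(observed, predicted):
    """Return (r2, slope, rmse, mpe) on the untransformed scale.

    All four definitions are assumptions made for this illustration.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    s_oo = sum((o - mean_obs) ** 2 for o in observed)
    s_pp = sum((p - mean_pred) ** 2 for p in predicted)
    s_op = sum((o - mean_obs) * (p - mean_pred)
               for o, p in zip(observed, predicted))
    errors = [p - o for o, p in zip(observed, predicted)]
    r2 = s_op ** 2 / (s_oo * s_pp)        # squared Pearson correlation
    slope = s_op / s_pp                   # observed regressed on predicted
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mpe = sum(errors) / n
    return r2, slope, rmse, mpe
```

In use, predictions from a transformed model would first be back-transformed to emission
rates and then passed to the function together with the observed emission rates, so that all
candidate models are compared on the same scale.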

Table 10-31 Comparative Performance Evaluation of HC Emission Rate Models
                                             Coefficient of
                                             determination
                                                 (R2)         Slope      RMSE         MPE
Mean ERs                                       0.000090       1.000    0.0019072   0.00000022
Linear Regression (Power)                      0.0166         0.979    0.0019879   0.00061206
Linear Regression (Power^0.5)                  0.0214         0.749    0.0019311   0.00040055
Linear Regression (log(Power))                 0.0281         0.864    0.0019249   0.00040884
Linear Regression (log(Power) + Dummy)         0.0367         1.060    0.0019151   0.00040366
       Results suggest that the linear regression function with log transformation performs
slightly better than the others and that the use of dummy variables can further improve model
performance, but again there is almost no perceptible change in terms of explained variance.
Although the linear regression function with log transformation and dummy variables performs
slightly better than linear regression function with log transformation alone, the revised model
introduces additional explanatory variables (dummy variables and the interaction with engine
power) and increases the complexity of regression model without significantly improving the
model. As discussed in Section 10.3.2.1.4, there is no compelling reason to include the dummy
variables in the model, given that: 1) the second model is more complex without significantly
improving model performance, and 2) there is no compelling engineering reason at this time to
support the difference in model performance within these specific power regions. These dummy
variables are, however, worth exploring when additional data from other engine technology
groups become available for analysis.
       Model 3.3 is recommended as the preliminary 'final' model (although one might argue
that using the regression tree results directly would also probably be acceptable). The next step
in model evaluation is to once again examine the residuals for the improved model. A principal
objective was to verify that the statistical properties of the regression model conform to a set of
properties of least squares estimators. In summary, these properties require that the error terms
be normally distributed, have a mean of zero, and have the same variance.

Test for Constancy of Error Variance
       A plot of the residuals versus the fitted values is useful in identifying any patterns in
the residuals. Figure 10-41 plot (b) is residuals vs. fit for HC Model 3.3. Without considering
variance due to high emission points and zero load data, it can be seen that there is no obvious
pattern in the residuals across the fitted values.

-------
Test of Normality of Error Terms
       The first informal test normally used to check the normality of error terms is a
quantile-quantile plot of the residuals. Figure 10-41 plot (d) shows the normal quantile plot for
HC Model 3.3. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality. Under normality, we expect 68 percent of the residuals to
fall between $\pm\sqrt{MSE}$ and about 90 percent to fall between $\pm 1.645\sqrt{MSE}$. In this case,
84.83% of residuals fall within the first limits, while 93.60% of residuals fall within the second
limits. Thus, the actual frequencies here are reasonably consistent with those expected under
normality. The heavy tails at both ends are a cause for concern, but this is due to the nature of
the data set. For example, even after the transformation, the response variable is not truly
normally distributed.
       Based on the above analysis, the final HC emission model for cruise mode is:


                       $HC = [0.114 + 0.0426\log_{10}(\text{engine.power}+1)]^4$
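
       The final HC model can be evaluated in the same way as the CO model, except that the
back-transformation is a fourth power because the fitted response (HC.25 in the regression
output) is the fourth root of HC. The Python sketch below is a minimal illustration; the
function and argument names are hypothetical.

```python
import math


def hc_emission_rate(engine_power_bhp):
    """Back-transform the Model 3.3 fit from HC^(1/4) to HC."""
    hc_quarter_power = 0.114 + 0.0426 * math.log10(engine_power_bhp + 1)
    return hc_quarter_power ** 4
```

Because the coefficient on transformed engine power is small relative to the intercept, the
predicted HC emission rate changes only modestly across the engine power range, which is
consistent with the weak HC relationship noted in the regression results.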


                      10.4 Conclusions  and Further Considerations

       In this research, acceleration mode is defined as "acceleration >1 mph/s". Data not
considered to be in idle, deceleration or acceleration mode will be deemed to be in cruise mode.
Compared to cruise mode activity, the engine power is more concentrated in higher engine power
ranges (> 200 bhp) for acceleration mode activity.
       Inter-bus variability analysis indicated that some of the 15 buses are higher emitters than
others (especially noted for HC emissions). However, none of the buses appears to qualify as a
traditional high-emitter, which would exhibit emission rates of two to three standard deviations
above the mean. Hence, it is difficult to classify any of these 15 buses as high emitters for
modeling purposes. At this moment, these 15 buses are treated as a whole for model development.
Modelers should keep in mind that although no true high-emitters are present in the database,
such vehicles may behave significantly differently from the vehicles tested. Hence, data from
high-emitting vehicles should be collected and examined in future studies.
       Some high HC emissions events are noted in acceleration mode.  After screening engine
speed, engine power, engine oil temperature, engine oil pressure, engine coolant temperature,
ECM pressure, and other parameters, no variables were identified that could be linked to these
high emissions events.  These events may represent natural variability in on-road emissions, or
some other variable (such as grade or an engine variable that is not measured) may be linked to
these events.
       Engine power is selected as the most important variable for all three pollutants based on
HTBR tree models. This finding is consistent with previous research results which verified the
important role of engine power (Ramamurthy et al. 1998; Clark et al. 2002; Barth et al. 2004).
The HC relationship is significant but fairly weak.  Analysis in previous chapters also indicates
that engine power is correlated not only with on-road load parameters such as vehicle speed,
acceleration, and grade, but also potentially with engine operating parameters such as throttle
position and engine oil pressure. In addition, engine power in this research is derived from
engine speed, engine torque, and percent engine load.
       The regression tree models suggest that some other variables, like oil pressure and en-
gine barometric pressure, may also impact the HC emissions.  Further analysis demonstrates that
by using engine power alone one might be able to achieve explanatory ability similar to using
engine power and other variables.  To develop  models that are efficient and easy to implement,
only engine power is used to develop emission models. However, additional investigation into
these variables is warranted as additional detailed data from engine testing become available for
analysis.
       Given the relationships noted between engine indicated HP and emission rates, it is
imperative that data be collected to develop solid relationships in engine power demand models
(estimating power demand as a function of speed/acceleration, grade, vehicle characteristics,
surface roughness, inertial losses, etc.)  for use in regional inventory development and microscale
impact assessment.
       In summary, the modeler recommends the following acceleration emission models:

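$$\text{NOx} = \left[-0.0195 + 0.201\,\log_{10}(\text{engine.power} + 1) + 0.0019\,\text{vehicle.speed}\right]^{2}$$

$$\text{CO} = 10^{\left[-3.747 + 1.341\,\log_{10}(\text{engine.power} + 1) - 0.0285\,\text{vehicle.speed}\right]}$$

$$\text{HC} = \left[0.114 + 0.0426\,\log_{10}(\text{engine.power} + 1)\right]^{4}$$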
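As a worked illustration, the three recommended acceleration models can be evaluated directly. The sketch below is not part of the original analysis. The function names are hypothetical, engine power is assumed to be in bhp and vehicle speed in mph, and the results are assumed to be emission rates in g/s, following the units used elsewhere in this report. The coefficients are those read from the equations above.

```python
import math


def nox_accel(engine_power_bhp, vehicle_speed_mph):
    """NOx acceleration model: square of the linear predictor."""
    root = (-0.0195
            + 0.201 * math.log10(engine_power_bhp + 1)
            + 0.0019 * vehicle_speed_mph)
    return root ** 2


def co_accel(engine_power_bhp, vehicle_speed_mph):
    """CO acceleration model: 10 raised to the linear predictor."""
    exponent = (-3.747
                + 1.341 * math.log10(engine_power_bhp + 1)
                - 0.0285 * vehicle_speed_mph)
    return 10 ** exponent


def hc_accel(engine_power_bhp):
    """HC acceleration model: fourth power of the linear predictor."""
    return (0.114 + 0.0426 * math.log10(engine_power_bhp + 1)) ** 4


if __name__ == "__main__":
    # Hypothetical operating point chosen only for illustration.
    power, speed = 150.0, 20.0
    print(f"NOx: {nox_accel(power, speed):.4f}")
    print(f"CO:  {co_accel(power, speed):.4f}")
    print(f"HC:  {hc_accel(power):.5f}")
```

Note that squaring the NOx predictor hides its sign, so the model should only be applied where the linear predictor is non-negative (it is slightly negative at zero power and zero speed).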

-------
                                     CHAPTER 11
                         11. CRUISE MODE DEVELOPMENT
       After developing the idle mode definition and emission rate in Chapter 8, the deceleration
mode definition and emission rate in Chapter 9, and the acceleration emission model in Chapter 10,
the next task is to develop the cruise mode emission model.

                           11.1 Analysis of Cruise Mode Data

       After dividing the database into idle mode, deceleration mode, and acceleration mode,
cruise mode data will be all of the remaining data in the database (i.e., data not previously clas-
sified into idle, deceleration, and acceleration). Unlike the idle and deceleration modes, there is
a general relationship between engine power and emission rate for acceleration mode and cruise
mode.  The engine power distribution for data collected in the cruise mode is provided in Table
11-1.

Table 11-1 Engine Power Distribution for Cruise Mode
                              Engine Power Distribution (bhp)
            Pollutant     0-50   50-100  100-150  150-200     ≥200      All
Number      NOx          15885     8988     7173     3536     3792    39374
            CO           15834     8940     7145     3529     3770    39218
            HC           15481     8600     6830     3394     3715    38020
Percentage  NOx         40.34%   22.83%   18.22%    8.98%    9.63%  100.00%
            CO          40.37%   22.80%   18.22%    9.00%    9.61%  100.00%
            HC          40.72%   22.62%   17.96%    8.93%    9.77%  100.00%
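The percentages in Table 11-1 follow directly from the counts, which gives a quick consistency check on the table. The sketch below recomputes them. The bin-assignment helper is an assumption about how boundary values are handled (each bin is taken as closed on the left), since the table itself does not say.

```python
BIN_EDGES = [50.0, 100.0, 150.0, 200.0]  # bhp, upper edges of the first four bins
BIN_LABELS = ["0-50", "50-100", "100-150", "150-200", ">=200"]

# Counts per engine power bin, copied from Table 11-1.
COUNTS = {
    "NOx": [15885, 8988, 7173, 3536, 3792],
    "CO": [15834, 8940, 7145, 3529, 3770],
    "HC": [15481, 8600, 6830, 3394, 3715],
}


def bin_index(engine_power_bhp):
    """Return the index of the power bin (assumed left-closed)."""
    for i, edge in enumerate(BIN_EDGES):
        if engine_power_bhp < edge:
            return i
    return len(BIN_EDGES)


def percentages(counts):
    """Convert bin counts to percentages rounded to two decimals."""
    total = sum(counts)
    return [round(100.0 * c / total, 2) for c in counts]


for pollutant, counts in COUNTS.items():
    print(pollutant, sum(counts), percentages(counts))
```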
       Emission rate histograms for each of the three pollutants for cruise operations are pre-
sented in Figure 11-1. Figure 11-1 shows significant skewness for all three pollutants for cruise
mode.  Some high HC emissions events are noted in cruise mode. After screening engine speed,
engine power, engine oil temperature, engine oil pressure, engine coolant temperature, ECM
pressure, and other parameters, no operating parameters appeared to correlate with the high emis-
sions events.
                Figure 11-1 Histograms of Three Pollutants for Cruise Mode
11.1.1 Emission Rate Distribution by Bus in Cruise Mode

       Inter-bus response variability for cruise mode operations is illustrated in Figures 11-2 to
11-4 using median and mean of NOx, CO, and HC emission rates. Table 11-2 presents the same
information in tabular form.  The difference between  median and mean is also an indicator of
skewness.
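The use of the mean-median gap as a skewness indicator can be illustrated with a short sketch. The helper names and the sample values are hypothetical. The last line uses the Bus 380 CO median and mean reported in Table 11-2.

```python
import statistics


def relative_gap(mean, median):
    """Gap between mean and median, as a fraction of the median."""
    return (mean - median) / median


def mean_median_gap(rates):
    """Return (mean, median, relative gap) for a sample of emission rates."""
    mean = statistics.mean(rates)
    median = statistics.median(rates)
    return mean, median, relative_gap(mean, median)


# Hypothetical right-skewed sample: most values are small, one is large.
sample = [0.010, 0.011, 0.012, 0.013, 0.014, 0.020, 0.150]
mean, median, gap = mean_median_gap(sample)
print(f"mean={mean:.4f} median={median:.4f} relative gap={gap:.2f}")

# Bus 380 CO values from Table 11-2 (median 0.01994, mean 0.04532).
print(f"Bus 380 CO relative gap: {relative_gap(0.04532, 0.01994):.2f}")
```

A positive gap indicates a right-skewed distribution, in which a small number of high-emission observations pull the mean above the median.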
Figure 11-2 Median and Mean of NOx Emission Rates in Cruise Mode by Bus

Figure 11-3 Median and Mean of CO Emission Rates in Cruise Mode by Bus

Figure 11-4 Median and Mean of HC Emission Rates in Cruise Mode by Bus
Table 11-2 Median and Mean of Three Pollutants in Cruise Mode by Bus
                 NOx (g/s)            CO (g/s)             HC (g/s)
Bus ID        Median     Mean      Median     Mean      Median     Mean
Bus 360      0.11666  0.14506     0.01618  0.02891     0.00120  0.00146
Bus 361      0.18479  0.18507     0.01091  0.01389     0.00122  0.00135
Bus 363      0.05924  0.07384     0.00534  0.01341     0.00012  0.00021
Bus 364      0.12779  0.14644     0.01259  0.01875     0.00237  0.00343
Bus 372      0.09092  0.09936     0.01262  0.01704     0.00181  0.00236
Bus 375      0.13714  0.16103     0.01254  0.02383     0.00121  0.00146
Bus 377      0.11139  0.11094     0.01454  0.02559     0.00064  0.00075
Bus 379      0.12570  0.15673     0.01394  0.02298     0.00151  0.00195
Bus 380      0.16713  0.18183     0.01994  0.04532     0.00110  0.00148
Bus 381      0.09227  0.11789     0.01074  0.02505     0.00060  0.00080
Bus 382      0.14987  0.16698     0.01342  0.02544     0.00130  0.00155
Bus 383      0.16355  0.18468     0.00921  0.01949     0.00126  0.00198
Bus 384      0.11597  0.13933     0.00934  0.01903     0.00181  0.00221
Bus 385      0.10244  0.13024     0.01266  0.02066     0.00187  0.00205
Bus 386      0.12254  0.13632     0.01147  0.02197     0.00129  0.00167

-------
       Figures 11-2 to 11-4 and Table 11-2 illustrate that NOx emissions are more consistent than
CO and HC emissions.  Across the 15 buses, Bus 380 has the largest median and mean for CO
emissions, while Bus 364 has the largest median and mean for HC emissions.  The above figures
and table demonstrate that although variability exists across buses, it is difficult to conclude that
there are  any true "high emitters" in the database. This conclusion is consistent with the result
for the other three modes. As was also noted in the acceleration mode data, Bus 363 has the
smallest mean and median HC emissions compared to the other 14 buses.

11.1.2 Engine Power Distribution by Bus in Cruise Mode

       Engine power distribution in cruise mode by bus is shown in Figure 11-5 and Table 11-3.
Bus 361 has the largest 1st quartile engine power in cruise mode while Bus 377 has the largest
median and 3rd quartile engine power in cruise mode. The maximum power values for each bus
match well with the manufacturer's engine power rating. Although variability for engine power
distribution exists across buses, it is difficult to conclude whether such variability is attributable
to individual buses, bus routes, or other factors. The relationship between power and emissions appears
consistent across the buses for cruise mode.
 Table 11-3 Engine Power Distribution (bhp) in Cruise Mode by Bus
Bus ID     Number    Min   1st Quartile   Median   3rd Quartile      Max     Mean
Bus 360      1653      0          14.68    71.25         169.03   275.46    97.70
Bus 361      3140      0          70.13   108.12         140.28   296.91   107.16
Bus 363      3286      0          10.46    47.19         112.37   275.55    71.45
Bus 364      2575      0          14.47    64.30         130.62   275.51    85.56
Bus 372      2278      0          30.13    68.23         118.10   275.49    79.77
Bus 375      2890      0          23.19    72.09         142.47   275.54    94.36
Bus 377      1647      0          17.93   118.01         210.27   275.50   121.33
Bus 379      2544      0          43.51   102.68         165.04   275.57   110.84
Bus 380      1242      0          18.85    91.07         187.71   275.56   109.41
Bus 381      2537      0           6.72    49.18         113.81   275.46    70.68
Bus 382      1208      0          32.39    81.02         124.97   275.55    89.42
Bus 383      3062      0          29.42    77.95         141.19   275.53    90.85
Bus 384      3638      0          21.82    61.20         115.75   275.46    72.69
Bus 385      3327      0          11.86    48.80         102.91   275.47    68.20
Bus 386      4539      0          19.24    53.43          94.38   275.30    61.66
      Figure 11-5 Histograms of Engine Power in Cruise Mode by Bus

-------
                        11.2 Model Development and Refinement

11.2.1 HTBR Tree Model Development

       The potential explanatory variables included in the emission rate model development
effort are:
       Vehicle characteristics: model year, odometer reading, bus ID (14 dummy variables);
       Roadway characteristics: dummy variable for road grade;
       Onroad load parameters: engine power (bhp), vehicle speed (mph), acceleration (mph/s);
       Engine operating parameters: engine oil temperature (deg F), engine oil pressure (kPa),
engine coolant temperature (deg F), barometric pressure reported from ECM (kPa);
       Environmental conditions: ambient temperature (deg C), ambient pressure (mbar),
ambient relative humidity (%).
       The HTBR technique is used first to identify potentially significant explanatory variables,
and this analysis provides the starting point for conceptual model development. The HTBR model
is used to guide the development of an OLS regression model, rather than as a model in its
own right. HTBR can be used as a data reduction tool and for identifying potential interactions
among the variables.  OLS regression is then applied to the identified variables to estimate a
preliminary "final" model.
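The core step of a regression tree, choosing the cutpoint on one variable that minimizes the summed deviance of the two resulting groups, can be sketched as follows. This is a simplified illustration, not the S-PLUS routine used in this research: the function names and the toy data are assumptions, and `min_leaf` only loosely mirrors the role of the `mincut` setting that appears in the tree output later in this chapter.

```python
def deviance(values):
    """Sum of squared deviations from the group mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y, min_leaf=1):
    """Return (cutpoint, total deviance) of the best binary split on x."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between identical x values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        total = deviance(left) + deviance(right)
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint cutpoint
        if best is None or total < best[1]:
            best = (cut, total)
    return best


# Toy data: engine power (bhp) against a transformed emission rate.
power = [5, 10, 15, 120, 150, 200]
rate = [0.12, 0.13, 0.11, 0.45, 0.50, 0.48]
cut, total = best_split(power, rate)
print(f"best cutpoint: {cut}, residual deviance: {total:.6f}")
```

A full tree repeats this search over every candidate variable and then recursively within each branch. The variables that win the early splits are the ones carried forward as candidates for the OLS model.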
       Although evidence in the literature suggests that a logarithmic transformation is most
suitable for modeling motor vehicle emissions (Washington 1994; Ramamurthy et al. 1998;
Fomunung 2000; Frey et al. 2002), this transformation needs to be verified through the Box-Cox
procedure.  The Box-Cox function in MATLAB™ can automatically identify a transformation
from the family of power transformations on emission data, with lambda ranging from -1.0 to 1.0.
The lambdas chosen by the Box-Cox procedure for cruise mode are 0.40619 for NOx, 0.012969 for
CO, and 0.241 for HC.  The Box-Cox procedure is only used to provide a guide for selecting a
transformation, so overly precise results are not needed (Neter et al. 1996).  It is often reasonable
to use a nearby lambda value that is easier to understand for the power transformation. Although
the lambdas chosen by the Box-Cox procedure differ between acceleration and cruise modes,
the nearby lambda values are the same for the two modes. In summary, the lambda values used
for transformations in cruise mode are 1/2 for NOx, 0 for CO (indicating a log transformation),
and 1/4 for HC. Figures 11-6 to 11-8 present the histogram, boxplot, and probability plots of
truncated emission rates in cruise mode for NOx, CO, and HC, while Figures 11-9 to 11-11
present the same plots for truncated transformed emission rates for NOx, CO, and HC, where a great
improvement is noted.
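The step from an estimated lambda to a convenient nearby value can be sketched as below. The candidate set of "easy" lambdas is an assumption made for illustration, as are the function names. The simple power and base-10 log forms are also an assumption, chosen because they match how the final models in this report are written. They differ from the textbook Box-Cox form only by a linear rescaling.

```python
import math

# Hypothetical set of easily interpreted lambda values.
CANDIDATE_LAMBDAS = (-1.0, -0.5, 0.0, 0.25, 0.5, 1.0)


def box_cox(y, lam):
    """Textbook Box-Cox transform of a positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def nearby_lambda(estimated):
    """Pick the candidate lambda closest to the estimated one."""
    return min(CANDIDATE_LAMBDAS, key=lambda c: abs(c - estimated))


def simple_transform(y, lam):
    """Simple power (or base-10 log) form assumed for the final models."""
    if y <= 0:
        raise ValueError("transform requires positive values")
    return math.log10(y) if lam == 0 else y ** lam


# Lambdas reported for cruise mode in the text above.
for pollutant, lam in (("NOx", 0.40619), ("CO", 0.012969), ("HC", 0.241)):
    print(pollutant, lam, "->", nearby_lambda(lam))
```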
Figure 11-6 Histogram, Boxplot, and Probability Plot of Truncated NOx Emission Rates in Cruise Mode

Figure 11-7 Histogram, Boxplot, and Probability Plot of Truncated CO Emission Rate in Cruise Mode

Figure 11-8 Histogram, Boxplot, and Probability Plot of Truncated HC Emission Rate in Cruise Mode

Figure 11-9 Histogram, Boxplot, and Probability Plot of Truncated Transformed NOx Emission Rate in Cruise Mode

Figure 11-10 Histogram, Boxplot, and Probability Plot of Truncated Transformed CO Emission Rate in Cruise Mode

Figure 11-11 Histogram, Boxplot, and Probability Plot of Truncated Transformed HC Emission Rate in Cruise Mode

11.2.1.1 NOx HTBR Tree Model Development

       Figure 11-12 illustrates the initial tree model used for the truncated transformed NOx
emission rate in cruise mode.  Results for the initial model are given in Table 11-4.  The tree
grew into a complex model, with a considerable number of branches and 32 terminal nodes. Fig-
ure 11-13 illustrates the amount of deviation explained corresponding to the number of terminal
nodes.
Figure 11-12 Original Untrimmed Regression Tree Model for Truncated Transformed NOx Emission Rate in Cruise Mode

Figure 11-13 Reduction in Deviation with the Addition of Nodes of Regression Tree for Truncated Transformed NOx Emission Rate in Cruise Mode

-------
Table 11-4 Original Untrimmed Regression Tree Results for Truncated Transformed NOx Emission Rate in Cruise Mode
 Regression tree:
 tree(formula = NOx.50 ~ model.year + odometer + temperature + baro + humidity +
        vehicle.speed + oil.temperture + oil.press + cool.temperature +
        eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
        bus364 + bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
        bus383 + bus384 + bus385 + dummy.grade, data = busdata!0242006.1.4,
        na.action = na.exclude, mincut = 400, minsize = 800, mindev = 0.01)
 Variables actually used in tree construction:
  [1] "engine.power"  "dummy.grade"   "baro"          "oil.press"
  [5] "humidity"      "vehicle.speed" "temperature"   "bus372"
  [9] "odometer"      "model.year"
 Number of terminal nodes:   32
 Residual mean deviance:  0.005398 = 212.4 / 39340
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
  -4.634e-001 -4.130e-002 -1.265e-003 -1.315e-016  3.646e-002  1.180e+000
       For model application purposes, it is desirable to select a final model specification that
balances the model's ability to explain the maximum amount of deviation with a simpler model
that is easy to interpret and apply. Figure 11-13 indicates that the reduction in deviation from
adding nodes beyond four, although potentially statistically significant, is very small.  A simplified
tree model was derived which ends in four terminal nodes as compared to the 32 terminal nodes in
the initial model.  The residual deviance only increased from 212.4 to 298.9 and yielded
a much cleaner model than the initial one.  Results are shown in Table 11-5 and Figure 11-14.
Based on the above analysis, the NOx cruise model will be developed from this result.
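Figure 11-14 and Table 11-5 (trimmed regression tree for truncated transformed NOx emission rate in cruise mode; only the following node listing is recoverable, with illegible counts shown as ...)
 node), split, n, deviance, yval
       * denotes terminal node

   2)  engine.power<52.525 ... 160.50  0.1831
     4)  engine.power<19.05 ...  47.70  0.1252  *
     5)  engine.power>19.05 7058  41.36  0.2588  *
   3)  engine.power>52.525 23094 285.90  0.4438
     6)  engine.power<109.555 10186  81.41  0.3791  *
     7)  engine.power>109.555 12908 128.40  0.4948  *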
       This tree model suggests that engine power is the most important explanatory variable
for NOx emissions. This finding is consistent with previous research results which verified the
important effect of engine power on NOx emissions (Ramamurthy et al. 1998; Clark et al. 2002;
Barth et al. 2004). Analysis in the previous chapter also indicates that engine power is correlated
not only with onroad load parameters such as vehicle speed, acceleration, and grade, but also with
engine operating parameters such as throttle position and engine oil pressure. In addition,
engine power in this research is derived from engine speed, engine torque, and percent engine
load, so engine power connects onroad modal activity with engine operating conditions.
This strengthens the case for introducing engine power into the conceptual
model and the need to improve the ability to simulate engine power for regional inventory
development.

11.2.1.2 CO HTBR Tree Model Development

       Figure 11-15 illustrates the initial tree model used for truncated transformed CO emis-
sion rate in cruise mode. Results for initial model are given in Table 11-6.  The tree grew into
a complex model with a considerable number of branches  and 65 terminal nodes. Figure 11-16
illustrates the amount of deviation explained corresponding to the number of terminal nodes.
Figure 11-15 Original Untrimmed Regression Tree Model for Truncated Transformed CO Emission Rate in Cruise Mode

Figure 11-16 Reduction in Deviation with the Addition of Nodes of Regression Tree for Truncated Transformed CO Emission Rate in Cruise Mode
Table 11-6 Original Untrimmed Regression Tree Results for Truncated Transformed CO Emission Rate in Cruise Mode
 Regression tree:
 tree(formula = log.CO ~ model.year + odometer + temperature + baro + humidity +
        vehicle.speed + oil.temperture + oil.press  + cool.temperature +
        eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
        bus364  +  bus372 + bus375  + bus377 + bus379  + bus380 + bus381 + bus382 +
        bus383  +  bus384 + bus385  + dummy.grade,  data = busdata!0242006.1.4,
        na.action = na.exclude, mincut = 400,  minsize = 800, mindev = 0.01)
 Variables actually used in tree construction:
  [1] "engine.power"     "oil.press"        "baro"
  [4] "cool.temperature" "vehicle.speed"    "acceleration"
  [7] "humidity"         "odometer"         "dummy.grade"
 [10] "temperature"      "eng.bar.press"    "model.year"
 [13] "oil.temperture"
 Number of terminal nodes:  65
 Residual mean deviance:  0.1089 = 4265  / 39150
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
  -2.335e+000 -1.783e-001 -1.233e-002  1.869e-016  1.691e-001  2.013e+000
       For model application purposes, it is desirable to select a final model specification that
balances the model's ability to explain the maximum amount of deviation with a simpler model
that is easy to interpret and apply. Figure 11-16 indicates that the reduction in deviation from
adding nodes beyond four, although potentially statistically significant, is very small.   A simplified
tree model was derived which ends in 4 terminal nodes as compared to the 65 terminal nodes in
the initial model.  The residual deviance only increased from 4265 to 5698 and yielded a
much more efficient model. Results are shown in Table 11-7 and Figure 11-17.  The CO cruise
emission rate model will be based upon these results.
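The "residual mean deviance" lines in Tables 11-6 and 11-7 can be reproduced from the residual deviance, the number of observations, and the number of terminal nodes. The sketch below assumes the usual convention for this kind of tree summary, in which the divisor is the number of observations minus the number of terminal nodes (the tables print it rounded to four significant digits). The function name is hypothetical.

```python
def residual_mean_deviance(residual_deviance, n_obs, n_terminal_nodes):
    """Residual deviance divided by (observations - terminal nodes)."""
    return residual_deviance / (n_obs - n_terminal_nodes)


N_CO = 39218  # CO observations in cruise mode (root node of the CO tree)

untrimmed = residual_mean_deviance(4265.0, N_CO, 65)
trimmed = residual_mean_deviance(5698.0, N_CO, 4)
print(f"untrimmed CO tree: {untrimmed:.4f}")
print(f"trimmed CO tree:   {trimmed:.4f}")
```

Both values agree with the tables (0.1089 and 0.1453), which confirms that the figures 4265 and 5698 quoted in the text are total residual deviances rather than means.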
 Figure 11-17 Trimmed Regression Tree Model for Truncated Transformed CO Emission Rate in Cruise Mode
Table 11-7 Trimmed Regression Tree Results for Truncated Transformed CO Emission Rate in Cruise Mode
 Regression tree:
 snip.tree(tree = tree(formula = log.CO ~ model.year + odometer +  temperature  +
        baro  +  humidity  + vehicle.speed + oil.temperture + oil.press +
        cool.temperature + eng.bar.press + engine.power + acceleration +
        bus360  +  bus361  + bus363  + bus364 + bus372 + bus375 + bus377 + bus379 +
        bus380  +  bus381  + bus382  + bus383 + bus384 + bus385 + dummy.grade,
        data  =  busdata!0242006.1.4,  na.action = na.exclude, mincut = 400,
        minsize  = 800,  mindev = 0.01),  nodes = c(4., 6., 7., 5.))
 Variables actually used in tree construction:
 [1]  "engine.power"
 Number of terminal nodes:   4
 Residual mean deviance:  0.1453 = 5698 / 39210
 Distribution of residuals:
         Min.      1st Qu.      Median        Mean     3rd Qu.        Max.
  -2.679e+000 -2.065e-001 -7.150e-003 -4.942e-015  2.041e-001  2.452e+000
 node),  split, n, deviance, yval
       *  denotes terminal node

 1)  root  39218 8170.0 -1.944
   2)  engine.power<114.355  27187 4482.0 -2.076
     4)  engine.power<15.445 8414 1639.0 -2.321 *
     5)  engine.power>15.445 18773 2115.0 -1.967 *
   3)  engine.power>114.355  12031 2147.0 -1.646
     6)  engine.power<181.235 7220 1146.0 -1.753 *
     7)  engine.power>181.235 4811  797.8 -1.487 *

-------
       This tree model suggests that engine power is the most important explanatory variable
for CO emissions. This finding is consistent with the result for NOx emissions.  This tree will be
used as a reference for linear regression model development.
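Read as a predictor, the trimmed CO tree in Table 11-7 is a step function of engine power. The sketch below applies it and back-transforms the result. It is illustrative only: the names are hypothetical, `log.CO` is assumed to be a base-10 logarithm (consistent with the form of the CO acceleration model in Chapter 10), and an observation exactly on a cutpoint is assigned to the upper branch by assumption.

```python
# (upper cutpoint in bhp, predicted log10 CO) for the first three leaves,
# taken from the terminal nodes of Table 11-7.
CO_TREE_LEAVES = [
    (15.445, -2.321),
    (114.355, -1.967),
    (181.235, -1.753),
]
CO_TREE_TOP_LEAF = -1.487  # engine power above 181.235 bhp


def predict_log_co(engine_power_bhp):
    """Return the leaf value (transformed CO) for a given engine power."""
    for upper_cut, leaf_value in CO_TREE_LEAVES:
        if engine_power_bhp < upper_cut:
            return leaf_value
    return CO_TREE_TOP_LEAF


def predict_co(engine_power_bhp):
    """Back-transform the leaf value, assuming a base-10 log."""
    return 10 ** predict_log_co(engine_power_bhp)


for power in (10.0, 100.0, 150.0, 250.0):
    print(f"{power:6.1f} bhp -> log CO {predict_log_co(power):.3f}, "
          f"CO {predict_co(power):.4f}")
```

The step pattern shows why the tree serves as a guide rather than as the final model: predictions jump at each cutpoint, whereas a regression on engine power varies smoothly.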

11.2.1.3 HC HTBR Tree Model Development

       Figure 11-18 illustrates the initial tree model used for the truncated transformed HC
emission rate in cruise mode. Results for the initial model are given in Table 11-8.  The tree grew
into a complex model with a considerable number of branches and 56 terminal nodes.
Figure 11-18 Original Untrimmed Regression Tree Model for Truncated Transformed HC Emission Rate in Cruise Mode
Table 11-8 Original Untrimmed Regression Tree Results for Truncated Transformed HC Emission Rate in Cruise Mode
 Regression tree:
 tree(formula = HC.25 ~ model.year +  odometer  +  temperature  + baro + humidity +
        vehicle.speed + oil.temperture + oil.press + cool.temperature +
        eng.bar.press + engine.power + acceleration + bus360  + bus361 + bus363  +
        bus364 +  bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
        bus383 +  bus384 + bus385 + dummy.grade, data = busdata!0242006.1.4,
        na.action = na.exclude,  mincut = 400, minsize = 800,  mindev  = 0.01)
 Variables actually used in tree construction:
  [1] "bus363"           "bus364"            "engine.power"
  [4] "oil.temperture"   "odometer"          "oil.press"
  [7] "humidity"         "cool.temperature"  "bus381"
 [10] "bus377"           "baro"              "temperature"
 [13] "bus372"           "vehicle.speed"     "dummy.grade"
 [16] "bus385"
 Number of terminal nodes:  56
 Residual mean deviance:   0.0008147 =  30.93  /  37960
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean      3rd Qu.        Max.
  -1.862e-001 -1.595e-002  -3.021e-003  -1.297e-018   1.230e-002  2.886e-001
       Figure 11-18 and Table 11-8 suggest that the tree analysis of HC emission rates identified
a number of buses that appear to exhibit significantly different emission rates from the other
buses under all load conditions (i.e., some of the bus dummy variables appeared as significant in
the initial tree splits). Two bus dummy variables split the data pool at the first two levels of the
HC tree model. This same result was noted for these buses in the acceleration mode. Although
variability exists for all three pollutants across the 15 buses, the division was even more obvious
for HC emissions (see Figure 11-4 and Table 11-2). Although it is tempting to develop different
emission rates for these buses to reduce emission rate deviation in the sample pool, it is difficult to
justify doing so.  Unless there is an obvious reason to classify these three buses as high emitters
(i.e., significantly higher than normal emitting vehicles, perhaps by as much as a few standard
deviations from the mean), and unless there are enough data to develop separate emission rate
models for high emitters, one cannot justify removing the data from the data set.  Until such
data exist to justify treating these buses as high emitters, the bus dummy variables for individual
buses are removed from the analyses and all 15 buses are treated as part of the whole data set.

       Another tree model was generated excluding the bus dummy variables.  However, odometer
reading also had to be excluded because the previous "Bus 363<0.5" tree cutpoint was
replaced by "odometer>282096" (i.e., the odometer split simply identified the same bus). This new
tree model is illustrated in Figure 11-19 and Table 11-9. The tree model is then trimmed for
application purposes, as was done for the NOx and CO models.
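Figure 11-19 and Table 11-9 (regression tree for truncated transformed HC emission rate in cruise mode, excluding bus dummy variables and odometer reading; only the following node listing is recoverable)
    3) baro>968.5 35063 49.420 0.1943
      6) engine.power<12.645  6821 13.850  0.1750  *
      7) engine.power>12.645  28242 32.420  0.1989
       14) oil.temperture<192.1 26727 29.900  0.2005
         28) baro<980.5 11265  9.610  0.1918  *
         29) baro>980.5 15462 18.820  0.2068  *
       15) oil.temperture>192.1 1515  1.244  0.1706  *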
       The new tree model suggests that barometric pressure is the most important explanatory
variable for HC emission rates.  However, this finding is undermined by the fact that all 2957
data points in the first left-hand branch of the tree (barometric pressure < 968.5) belong to Bus
363. Although this dataset was collected under a wide variety of environmental conditions, the
range of barometric pressure was limited for each individual bus tested.  As reported earlier, Bus
363 exhibited significantly lower HC emissions than the other buses (see Figure 11-4), but the
reason is not clear at this time. To develop a reasonable tree model given the limited data
collected, the environmental parameters are excluded from the model until a greater distribution of
environmental conditions can be represented in a test data set.  With data collected from a more
comprehensive testing program, environmental variables can be integrated into the model
directly, or perhaps correction factors for the emission rates can be developed. The secondary trimmed
tree is presented in Figure 11-20 and Table 11-10.
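The check behind this reasoning, whether a split on an environmental variable merely isolates one bus, can be sketched as follows. The function names and the records are hypothetical. Only the 968.5 cutpoint and the Bus 363 example come from the text above.

```python
def buses_by_branch(records, cutpoint):
    """Split (bus_id, value) records at cutpoint; return bus sets per branch."""
    left = {bus for bus, value in records if value < cutpoint}
    right = {bus for bus, value in records if value >= cutpoint}
    return left, right


def split_isolates_one_bus(records, cutpoint):
    """True if the left branch holds exactly one bus found nowhere else."""
    left, right = buses_by_branch(records, cutpoint)
    return len(left) == 1 and left.isdisjoint(right)


# Hypothetical (bus id, barometric pressure in mbar) records.
demo_records = [
    (363, 960.2), (363, 965.8), (363, 967.1),
    (360, 975.4), (364, 981.0), (372, 990.3), (380, 972.6),
]

left, right = buses_by_branch(demo_records, 968.5)
print("left branch buses:", sorted(left))
print("right branch buses:", sorted(right))
print("split isolates one bus:", split_isolates_one_bus(demo_records, 968.5))
```

When a split isolates a single bus in this way, the tree cannot distinguish the effect of the variable from the effect of the bus, which is the reason given above for setting the environmental parameters aside.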
-------
Table 11-10 Trimmed Regression Tree Results for Truncated Transformed HC in Cruise Mode
 Regression tree:
 snip.tree(tree = tree(formula = HC.25 ~ engine.power + vehicle.speed +
        acceleration + oil.temperture  +  oil.press  + cool.temperature +
        eng.bar.press, data = busdata!0242006.1.4,  na.action = na.exclude,
        mincut  =  400,  minsize = 800, mindev = 0.01),  nodes  = c(6.,  5.,  7.,
        4.))
 Variables actually used in tree construction:
 [1]  "eng.bar.press" "oil.press"     "engine.power"
 Number of terminal nodes:   4
 Residual mean deviance:  0.00148 = 56.27 / 38020
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
  -1.310e-001 -2.290e-002 -2.164e-003  1.281e-015  1.942e-002  3.220e-001
 node),  split,  n,  deviance, yval
       * denotes terminal node

 1)  root 38020 71.970 0.1876
   2)  eng.bar.press<99.9348 10827 24.640 0.1656
     4)  oil.press<345.25 4965 10.870 0.1400 *
     5)  oil.press>345.25 5862  7.754 0.1873 *
   3)  eng.bar.press>99.9348 27193 40.010 0.1963
     6)  engine.power<13.975 5879 12.660 0.1786 *
     7)  engine.power>13.975 21314 24.990 0.2012 *
       The tree model excluding bus dummy variables, odometer readings, and environmental
conditions is shown in Figure 11-20 and Table 11-10.  Although engine operating parameters such
as oil pressure might impact emissions, such variables are not easy to implement in real-world
models. After excluding engine barometric pressure and oil pressure from the tree model, leaving
engine power only, the residual deviance increased slightly from 56.27 to 65.56.  This final
tree model suggests that engine power is the most important explanatory variable for HC emissions,
a finding consistent with the analysis of NOx and CO emission rates. The final
HTBR tree for HC emissions is shown in Figure 11-21 and Table 11-11. The HC cruise emission rate
model will be developed based upon these results.
                                         11-23

-------
[Figure 11-21 image not reproduced. Tree splits: engine.power<15.335, engine.power<0.265,
engine.power<7.875; terminal node values: 0.1757, 0.1390, 0.1697, 0.1934]

 Figure 11-21 Final Regression Tree Model for Truncated Transformed HC and Engine Power in
                                      Cruise Mode
Table 11-11 Final Regression Tree Results for Truncated Transformed HC and Engine Power in
Cruise Mode
 Regression tree:
 snip.tree(tree = tree(formula = HC.25  ~  engine.power,  data =
        busdata!0242006.1.4,  na.action = na.exclude, mincut  = 400,  minsize =
        800,  mindev = 0.01),  nodes = c(11., 10., 3.))
 Number of terminal nodes:  4
 Residual mean deviance:  0.001725 =  65.56 /  38020
 Distribution of residuals:
         Min.     1st Qu.      Median         Mean      3rd Qu.        Max.
  -1.372e-001 -2.070e-002 -6.875e-004   1.742e-015   2.090e-002  3.309e-001
 node),  split, n, deviance, yval
       * denotes terminal node

  1)  root 38020 71.970 0.1876
    2)  engine.power<15.335 8298 21.630  0.1666
      4) engine.power<0.265 4617  9.741 0.1757  *
      5) engine.power>0.265 3681 11.020 0.1551
       10) engine.power<7.875 1746  3.849  0.1390  *
       11) engine.power>7.875 1935  6.311  0.1697  *
    3)  engine.power>15.335 29722 45.660 0.1934  *
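
The deviance figures quoted for the two HC trees can be cross-checked from the terminal-node rows of Tables 11-10 and 11-11. The sketch below is only an illustration in Python (the report's own analysis was run in S-PLUS); the function names are hypothetical, and the "fraction of root deviance explained" is a convenience measure introduced here, not a statistic reported in the tables.

```python
# Terminal-node deviances copied from Tables 11-10 and 11-11.
ROOT_DEVIANCE = 71.970

trimmed_tree_leaves = [10.870, 7.754, 12.660, 24.990]   # Table 11-10
power_only_leaves = [9.741, 3.849, 6.311, 45.660]       # Table 11-11


def residual_deviance(leaf_deviances):
    """Total residual deviance is the sum over terminal nodes."""
    return sum(leaf_deviances)


def deviance_explained(leaf_deviances, root_deviance=ROOT_DEVIANCE):
    """Fraction of the root deviance removed by the splits (hypothetical helper)."""
    return 1.0 - residual_deviance(leaf_deviances) / root_deviance


for name, leaves in [("trimmed tree", trimmed_tree_leaves),
                     ("power-only tree", power_only_leaves)]:
    print(f"{name}: residual deviance = {residual_deviance(leaves):.2f}, "
          f"explained = {deviance_explained(leaves):.3f}")
```

Summing the leaves reproduces the totals printed in the "Residual mean deviance" lines of the two tables, which suggests the 56.27 and 65.56 values are total residual deviances rather than means.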

-------
11.2.2 OLS Model Development and Refinement

       Once a manageable number of modal variables have been identified through regression
tree analysis, the modeling process moves into the phase in which ordinary least squares tech-
niques are used to obtain a final model. The research objective here is to identify the extent to
which the identified factors influence emission rate in cruise mode.  Modelers rely on previous
research, a priori knowledge, educated guesses, and stepwise regression procedures to identify
acceptable functional forms, to determine important interactions, and to derive statistically and
theoretically defensible models.  The final model will be our best understanding about the func-
tional relationship between independent variables and dependent variables.

11.2.2.1 NOx Emission Rate Model Development for Cruise Mode

       Based on previous analysis, truncated transformed NOx will serve as the dependent
variable. However, modelers should keep in mind that comparisons of statistical model
performance should always be made on the original untransformed scale of Y.
HTBR tree model results suggest that engine power is the best explanatory variable to begin with.

11.2.2.1.1 Linear Regression Model with Engine Power

       Let's select engine power to begin with, and estimate the model:

              $Y = \beta_0 + \beta_1\,\text{engine.power} + \text{Error}$                        (1.1)

-------
       The regression run yields the results shown in Table 11-12 and Figure 11-22.

Table 11-12 Regression Result for NOx Model 1.1
 Call: lm(formula = NOx.50 ~ engine.power, data  =  busdata!0242006.1.4,  na.action =
 na.exclude)
 Residuals:
      Min       1Q   Median      3Q   Max
  -0.5717 -0.06302 0.006377 0.06653 1.259

 Coefficients:
                 Value Std. Error  t value Pr(>|t|)
  (Intercept)    0.1815   0.0007   242.8528    0.0000
 engine.power   0.0018   0.0000   274.7573    0.0000

 Residual standard error: 0.09765 on 39372 degrees of  freedom
 Multiple R-Squared: 0.6572
 F-statistic:  75490 on 1 and 39372 degrees of freedom,  the  p-value  is 0

 Correlation of Coefficients:
              (Intercept)
 engine.power -0.7526

 Analysis of Variance Table

 Response: NOx.50

 Terms added sequentially  (first to last)
                 Df Sum of Sq  Mean Sq  F Value  Pr(F)
 engine.power     1  719.8396 719.8396  75491.58      0
    Residuals 39372  375.4263   0.0095
       The results suggest that engine power explains about 66% of the variance in truncated
transformed NOx. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is
statistically significant. To evaluate the model, residual normality is examined in the QQ plot
and constancy of variance is checked by examining residuals vs. fitted values.
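
As a cross-check on the summary statistics quoted for Model 1.1, the R-squared, F-statistic, and residual standard error can be recomputed from the sums of squares in the Analysis of Variance table. This is a minimal sketch in Python rather than the S-PLUS session used in the report; the variable names are illustrative.

```python
import math

# Values copied from the Analysis of Variance table for Model 1.1 (Table 11-12).
ss_model = 719.8396      # sum of squares for engine.power
ss_resid = 375.4263      # residual sum of squares
df_model = 1
df_resid = 39372

# Share of total variation attributed to the regression.
r_squared = ss_model / (ss_model + ss_resid)
# Mean squared error and the F ratio for the single-term model.
mse = ss_resid / df_resid
f_stat = (ss_model / df_model) / mse
resid_std_error = math.sqrt(mse)

print(f"R-squared = {r_squared:.4f}")
print(f"F = {f_stat:.0f}")
print(f"Residual standard error = {resid_std_error:.5f}")
```

The same three lines of arithmetic apply to the other regression tables in this section once their sums of squares and degrees of freedom are substituted.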

-------
[Figure 11-22 image not reproduced. Panels: (a) Scatter Plot; (b) Residual vs. Fit;
(c) Response vs. Fit; (d) Residuals Normal QQ]

               Figure 11-22 QQ and Residual vs. Fitted Plot for NOx Model 1.1
       The residual plot in Figure 11-22 shows a departure from linear regression assumptions,
indicating a need to explore a curvilinear regression function. Since the variability at the
different X levels appears to be fairly constant, a transformation on X is considered. A
transformation is considered first to avoid the multicollinearity that adding a second-order
term in X would introduce. Based on the prototype plot in Figure 11-22, the square root
transformation and logarithmic transformation are tested. Scatter plots and residual plots based
on each transformation are then prepared and analyzed to determine which is most effective.
              $Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \text{Error}$                     (1.2)

              $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \text{Error}$             (1.3)
       The result for Model 1.2 is shown in Table 11-13 and Figure 11-23, while the result for
Model 1.3 is shown in Table 11-14 and Figure 11-24.
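
The comparison among the untransformed, square root, and log10(x + 1) forms can be sketched as follows. This is an illustration only: the data are synthetic (generated from a square-root law with hypothetical coefficients and noise), not the bus data set, and `fit_simple_ols` is a helper written for this sketch rather than a routine from the report.

```python
import math
import random


def fit_simple_ols(x, y):
    """Return (intercept, slope, r_squared) for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


# Synthetic stand-in data (NOT the bus data set): the response is generated
# from a square-root law so the three candidate forms can be compared.
rng = random.Random(42)
power = [rng.uniform(0.0, 275.0) for _ in range(5000)]
response = [0.09 + 0.03 * math.sqrt(p) + rng.gauss(0.0, 0.08) for p in power]

candidates = {
    "linear (1.1)": power,
    "square root (1.2)": [math.sqrt(p) for p in power],
    "log10(x + 1) (1.3)": [math.log10(p + 1.0) for p in power],
}
results = {name: fit_simple_ols(x, response) for name, x in candidates.items()}
for name, (b0, b1, r2) in results.items():
    print(f"{name}: b0 = {b0:.4f}, b1 = {b1:.4f}, R^2 = {r2:.4f}")
```

Because only the predictor is transformed, each candidate remains a simple linear regression in the transformed X, and the R-squared values are directly comparable on the transformed response scale.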

-------
Table 11-13 Regression Result for NOx Model 1.2
 Call:  lm(formula = NOx.50  ~  engine.power^(1/2),  data = busdata!0242006.1.4,
 na.action = na.exclude)
 Residuals:
      Min       1Q     Median      3Q  Max
  -0.5007 -0.04881 -0.0008896 0.05047 1.22

 Coefficients:
                          Value Std. Error  t value  Pr(>|t|)
            (Intercept)  0.0874   0.0008   104.1024    0.0000
  I(engine.power^(1/2))  0.0311   0.0001   342.3056    0.0000

 Residual standard error:  0.08364  on 39372 degrees of freedom
 Multiple R-Squared: 0.7485
 F-statistic: 117200 on  1  and  39372  degrees of freedom, the p-value  is  0

 Correlation of Coefficients:
                        (Intercept)
 I(engine.power^(1/2)) -0.8649

 Analysis of Variance Table

 Response: NOx.50

 Terms added sequentially  (first  to last)
                           Df Sum of Sq  Mean Sq  F Value Pr(F)
 I(engine.power^(1/2))      1  819.8002 819.8002 117173.2     0
             Residuals  39372  275.4656   0.0070
[Figure 11-23 image not reproduced. Recovered panel titles: (a) Scatter Plot;
(c) Response vs. Fit]

               Figure 11-23 QQ and Residual vs. Fitted Plot for NOx Model 1.2
-------
Table 11-14 Regression Result for NOx Model 1.3
 Call:  lm(formula = NOx.50 ~ log10(engine.power  +  1),  data = busdata!0242006.1.4,
 na.action = na.exclude)
 Residuals:
      Min       1Q    Median      3Q   Max
  -0.4047 -0.06677 -0.002155 0.06107  1.182

 Coefficients:
                            Value Std. Error  t value Pr(>|t|)
             (Intercept)   0.0306   0.0012    25.5525   0.0000
 log10(engine.power + 1)   0.1895   0.0007   279.4403   0.0000

 Residual standard error: 0.09656 on 39372  degrees  of  freedom
 Multiple R-Squared: 0.6648
 F-statistic: 78090 on 1 and 39372 degrees  of  freedom,  the p-value is 0

 Correlation of Coefficients:
                         (Intercept)
 log10(engine.power + 1)  -0.9135

 Analysis of Variance Table

 Response:  NOx.50

 Terms added sequentially (first to last)
                            Df Sum of Sq  Mean Sq   F Value Pr(F)
 log10(engine.power + 1)     1  728.1347 728.1347 78086.87     0
               Residuals 39372  367.1311   0.0093
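[Figure 11-24 image not reproduced. Recovered panel titles: (a) Scatter Plot;
Response vs. Fit]

               Figure 11-24 QQ and Residual vs. Fitted Plot for NOx Model 1.3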
-------
       The results suggest that using square root transformed engine power increases the
amount of variance explained in truncated transformed NOx from about 66% (Model 1.1) to about
75% (Model 1.2), while using log transformed engine power leaves it at about 66% (Model 1.3).
       Model 1.2 improves the R² more than does Model 1.3. The residuals scatter plot for
Model 1.2 (Figure 11-23) shows a more reasonably linear relation than that for Model 1.3
(Figure 11-24). Figure 11-23 also shows that Model 1.2 does a better job of improving the
pattern of variance. The QQ plot shows approximate normality except in the two tails.

11.2.2.1.2 Linear Regression Model with Dummy Variables

       Figure 11-14 suggests that the relationship between NOx and engine power may be
somewhat different across the engine power ranges identified in the tree analysis. That is, there
may be higher or lower NOx emissions in different engine power operating ranges. One dummy
variable is created to represent the engine power ranges identified in Figure 11-14 for use in
linear regression analysis, as illustrated below:

                  Engine power (bhp)        Dummy 1
                       < 52.525                 1
                       > 52.525                 0
       This dummy variable and the interaction between dummy variable and engine power are
then tested to determine whether the use of the variables and interactions can help improve the
model.

    $Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \beta_2\,\text{dummy1} + \beta_3\,\text{dummy1} \cdot \text{engine.power}^{1/2} + \text{Error}$     (1.4)
       The result for Model 1.4 is shown in Table 11-15 and Figure 11-25.

-------
Table 11-15 Regression Result for NOx Model 1.4
 Call:  lm(formula = NOx.50 ~ engine.power^(1/2) + dummy1 * engine.power^(1/2), data =
 busdata!0242006.1.4,  na.action = na.exclude)
 Residuals:
      Min       1Q    Median      3Q   Max
  -0.4812 -0.04778 0.0001059 0.04843 1.195

 Coefficients:
                                 Value Std. Error   t value  Pr(>|t|)
                  (Intercept)   0.1581    0.0024    65.9078    0.0000
        I(engine.power^(1/2))   0.0254    0.0002   122.2468    0.0000
                       dummy1  -0.0682    0.0026   -25.9438    0.0000
 I(engine.power^(1/2)):dummy1   0.0020    0.0003     6.1264    0.0000

 Residual standard error: 0.08224 on 39370 degrees of freedom
 Multiple R-Squared: 0.7569
 F-statistic:  40850 on 3 and 39370 degrees of freedom, the p-value  is  0

 Correlation of Coefficients:
                              (Intercept) I(engine.power^(1/2))  dummy1
        I(engine.power^(1/2))  -0.9742
                       dummy1  -0.9123      0.8888
 I(engine.power^(1/2)):dummy1   0.6175     -0.6339               -0.8171

 Analysis of Variance Table

 Response:  NOx.50

 Terms added sequentially (first to last)
                                 Df Sum of Sq  Mean Sq  F Value         Pr(F)
        I(engine.power^(1/2))     1  819.8002 819.8002 121203.8 0.000000e+000
                       dummy1     1    8.9202   8.9202   1318.8 0.000000e+000
 I(engine.power^(1/2)):dummy1     1    0.2539   0.2539     37.5 9.073785e-010
                    Residuals 39370  266.2915   0.0068

-------
[Figure 11-25 image not reproduced. Panels: (a) Residuals vs. Fit; (b) Response vs. Fit;
(c) Residuals Normal QQ]

              Figure 11-25 QQ and Residual vs. Fitted Plot for NOx Model 1.4
       The results suggest that using the dummy variable and its interaction with transformed
engine power increases the amount of variance explained in truncated transformed NOx
from about 75% (Model 1.2) to about 76% (Model 1.4).
       Model 1.4 thus improves the R² only slightly over Model 1.2.  The residuals scatter plot
for Model 1.4 (Figure 11-25) shows a slightly more linear relation, and Figure 11-25
shows that Model 1.4 may also do a slightly better job of improving the pattern of variance. The
QQ plot shows general normality with exceptions arising in the tails.  However, it is important
to note that the model improvement, in terms of the amount of variance explained by the model,
is marginal at best.
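
To make the effect of the dummy variable and its interaction concrete, the fitted Model 1.4 can be collapsed into one intercept and slope per power regime. The sketch below uses the coefficient values printed in Table 11-15; the function names are hypothetical, and the response is the transformed NOx variable (NOx.50), not the emission rate itself.

```python
# Coefficients copied from Table 11-15 (Model 1.4).
B0 = 0.1581    # intercept
B1 = 0.0254    # sqrt(engine.power)
B2 = -0.0682   # dummy1
B3 = 0.0020    # sqrt(engine.power):dummy1
CUTPOINT_BHP = 52.525


def regime_coefficients(dummy1):
    """Collapse Model 1.4 into (intercept, slope) for one power regime."""
    return B0 + B2 * dummy1, B1 + B3 * dummy1


def predict_sqrt_nox(engine_power):
    """Predicted transformed NOx (the NOx.50 response) for power >= 0 bhp."""
    if engine_power < 0:
        raise ValueError("the square-root form needs engine power >= 0")
    dummy1 = 1 if engine_power < CUTPOINT_BHP else 0
    intercept, slope = regime_coefficients(dummy1)
    return intercept + slope * engine_power ** 0.5


low = regime_coefficients(1)
high = regime_coefficients(0)
print(f"power < {CUTPOINT_BHP}: Y = {low[0]:.4f} + {low[1]:.4f} * sqrt(power)")
print(f"power > {CUTPOINT_BHP}: Y = {high[0]:.4f} + {high[1]:.4f} * sqrt(power)")
```

Written this way, the model is two separate regression lines that meet (approximately) near the cutpoint, which is the added complexity weighed against the small gain in explained variance.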

11.2.2.1.3 Model Discussion

       Previous sections describe the model development process from one OLS model to
another. To test whether the linear regression with power was a beneficial addition
to the regression tree model, the mean ERs at HTBR end nodes (single value) are compared to
the predictions from the linear regression function with engine power.  The results of the
performance evaluation are shown in Table 11-16.  The improvement in R² associated with moving
toward a linear function of engine power is tremendous (from 0.00003 to 0.529).  Hence, the use
of the linear regression function will provide a significant improvement in spatial and temporal
model prediction capability. However, this linear regression function might still be improved.
Since the R² and slope in Table 11-16 are derived by comparing model predictions and actual
observations for emission rates (untransformed y), these numbers differ from those reported in
the linear regression outputs.
       Two transforms of engine power were tested: square root transformation and log
transformation.  The results in Table 11-16 suggest that the linear regression function with
square root transformation performs slightly better.
       Given that the regression tree modeling exercise indicated that a number of power
cutpoints may play a role in the emissions process, an additional modeling run was performed.
The results in Table 11-16 suggest that the linear regression function with the dummy variable
performs slightly better than the model without the power cutpoints.

Table 11-16 Comparative Performance Evaluation of NOx Emission Rate Models
                                                  Coefficient of
 Model                                            determination    Slope     RMSE        MPE
 Mean ERs                                            0.00003       1.000    0.12008   -0.000006
 Linear regression (power)                           0.529         0.814    0.08542    0.01031
 Linear regression (power^0.5)                       0.614         0.975    0.07494    0.00707
 Linear regression (log(power))                      0.587         1.287    0.08043    0.00933
 Linear regression (power^0.5) w/dummy variables     0.627         1.011    0.07372    0.00704
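
The four evaluation measures in Table 11-16 can be computed from paired observed and predicted emission rates on the untransformed scale. The sketch below shows one plausible set of definitions; the report does not spell out its formulas in this section, so the choices here (R² as the squared correlation, slope from regressing observed on predicted, and MPE as the mean of predicted minus observed) are assumptions, and the sample values are hypothetical.

```python
import math


def evaluate_predictions(observed, predicted):
    """Return R^2, slope, RMSE and MPE for paired emission rates."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    s_pp = sum((p - mean_pred) ** 2 for p in predicted)
    s_oo = sum((o - mean_obs) ** 2 for o in observed)
    s_po = sum((p - mean_pred) * (o - mean_obs)
               for p, o in zip(predicted, observed))
    return {
        # Squared Pearson correlation between observed and predicted values.
        "r_squared": (s_po * s_po) / (s_pp * s_oo),
        # Slope of the line regressing observed on predicted values (assumed).
        "slope": s_po / s_pp,
        "rmse": math.sqrt(sum((p - o) ** 2
                              for p, o in zip(predicted, observed)) / n),
        # Mean prediction error, taken here as mean(predicted - observed).
        "mpe": mean_pred - mean_obs,
    }


# Hypothetical emission rates, for illustration only.
observed = [0.10, 0.20, 0.30, 0.40, 0.50]
predicted = [0.12, 0.18, 0.33, 0.38, 0.49]
metrics = evaluate_predictions(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In practice the predictions would first be back-transformed (for example, squared for the square-root response) before being passed to such a function, so that all models are compared on the same emission-rate scale.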
       Although the linear regression function with dummy variables performs slightly better
than the linear regression function with square root transformation, more explanatory variables
(the dummy variable and its interaction with engine power) are introduced and the complexity of
the regression model increases. There is only one regression function for Model 1.2, while there
are two regression functions for Model 1.4. There is also no obvious reason why the engine
would perform slightly differently within these power regimes, yielding different regression
slopes and intercepts. Several explanations are possible. The fuel injection systems in these
engines may operate slightly differently under low load (near-idle) and high load conditions.
The fuel injection system may be controlled by the engine computer. Alternatively, there may be
a sufficient number of low power and high power cruise operations that are incorrectly
classified and would be better classified as idle or acceleration events (perhaps due to GPS
speed data errors). In any case, because the model with dummy variables does not perform
appreciably better than the model without them, the dummy variables are not included in the
final model selection at this time. These dummy variables are, however, worth exploring when
additional data from other engine technology groups become available for analysis. Model 1.2 is
selected as the preliminary 'final' model.
       The next step in model evaluation is to once again examine the residuals for the improved
model. A principal objective was to verify that the statistical properties of the regression model
conform to a set of properties of least squares estimators. In summary, these properties require
that the error terms be normally distributed, have a mean of zero, and have uniform variance.
       Test for Constancy of Error Variance
       A plot of the residuals versus the fitted values is useful in identifying any patterns in the
residuals.  Figure 11-23 plot (b) shows this plot for NOx Model 1.2. Without considering
variance due to high emission points and zero load data, there is no obvious pattern in the
residuals across the fitted values.
       Test of Normality of Error Terms
       The first informal test of normality of error terms is a quantile-quantile plot of the
residuals. Figure 11-23 plot (d) shows the normal quantile plot for NOx Model 1.2.  The second
informal test is to compare actual frequencies of the residuals against expected frequencies
under normality.  Under normality, we expect about 68 percent of the residuals to fall between
±√MSE and about 90 percent to fall between ±1.645√MSE.  In fact, 81.79% of residuals fall
within the first limits, while 94.05% of residuals fall within the second limits.
Thus the actual frequencies here are reasonably consistent with those expected under normality.
The heavy tails at both ends are a cause for concern, but are due to the nature of the data set.
For example, even after the transformation, the response variable is not truly normally
distributed.
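
The second informal normality check can be sketched as follows: count the share of residuals inside ±√MSE and ±1.645√MSE and compare with the roughly 68% and 90% expected under normality. The residuals here are synthetic normal draws used purely to illustrate the calculation; `coverage_within` and its arguments are hypothetical names, and MSE is taken as the residual sum of squares over n minus the number of fitted parameters.

```python
import math
import random


def coverage_within(residuals, n_params, multiplier):
    """Share of residuals inside +/- multiplier * sqrt(MSE)."""
    n = len(residuals)
    mse = sum(r * r for r in residuals) / (n - n_params)
    limit = multiplier * math.sqrt(mse)
    return sum(1 for r in residuals if abs(r) <= limit) / n


# Synthetic normal residuals (illustration only, not the bus data).
rng = random.Random(7)
residuals = [rng.gauss(0.0, 0.08) for _ in range(100_000)]

within_1 = coverage_within(residuals, n_params=2, multiplier=1.0)
within_1645 = coverage_within(residuals, n_params=2, multiplier=1.645)
print(f"within 1 sqrt(MSE): {within_1:.3f} (about 0.68 under normality)")
print(f"within 1.645 sqrt(MSE): {within_1645:.3f} (about 0.90 under normality)")
```

Observed shares well above these benchmarks indicate that residuals are more concentrated near zero than a normal distribution with the same MSE would be, which typically accompanies heavier tails.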
       Based on the above analysis, the final NOx emission rate model selected for cruise mode is:

                            $\text{NOx} = \left(0.087 + 0.0311\,\text{engine.power}^{1/2}\right)^2$
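
Applying the final equation means evaluating the fitted line in square-root space and then squaring to return to the emission-rate scale. A minimal sketch, assuming engine power is supplied in bhp as in the dummy-variable table; the function name is hypothetical and the output carries whatever emission-rate units the underlying data use.

```python
def nox_cruise_rate(engine_power_bhp):
    """Cruise-mode NOx emission rate from the Model 1.2 equation.

    Coefficients are the rounded values in the final equation; the
    square-root fit is squared to return to the untransformed scale.
    """
    if engine_power_bhp < 0:
        raise ValueError("the square-root form needs engine power >= 0")
    return (0.087 + 0.0311 * engine_power_bhp ** 0.5) ** 2


for bhp in (0, 25, 100, 225):
    print(f"{bhp:>3} bhp -> {nox_cruise_rate(bhp):.4f}")
```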

       Analysis results support the observation that the final NOx emission model is significantly
better at explaining variability without making the model too complex. Since there is only one
engine type, additional complexity may not be valid in terms of transferability. This model is
specific to the engine classes employed in the transit bus operations. Different models may need
to be developed for other engine classes and duty cycles.

-------
11.2.2.2 CO Emission Rate Model Development for Cruise Mode

       Based on previous analysis, truncated transformed CO will serve as the dependent
variable. However, modelers should keep in mind that comparisons of statistical models should
always be made on the original untransformed scale of Y.  HTBR tree model
results suggest that engine power is the best explanatory variable to begin with.

11.2.2.2.1 Linear Regression Model with Engine Power

       Let's select engine power to begin with, and estimate the model:

              $Y = \beta_0 + \beta_1\,\text{engine.power} + \text{Error}$                        (2.1)

       The regression run yields the results shown in Table 11-17 and Figure 11-26.

Table 11-17 Regression Result for CO Model 2.1
 Call:  lm(formula = log.CO ~ engine.power,  data = busdata!0242006.1.4,  na.action =
 na.exclude)
 Residuals:
     Min      1Q   Median     3Q   Max
  -2.779 -0.2088 -0.01417 0.2153 2.376

 Coefficients:
                  Value Std. Error   t value  Pr(>|t|)
  (Intercept)    -2.2230    0.0030  -751.4277    0.0000
 engine.power    0.0033    0.0000   125.1304    0.0000

 Residual standard error: 0.3859 on 39216 degrees of freedom
 Multiple R-Squared:  0.2853
 F-statistic:  15660 on 1  and 39216 degrees of freedom,  the p-value is 0

 Correlation  of Coefficients:
              (Intercept)
 engine.power -0.7525

 Analysis of  Variance Table

 Response: log.CO

 Terms  added  sequentially (first to last)
                 Df Sum of Sq  Mean Sq  F Value Pr(F)
 engine.power     1  2331.251 2331.251 15657.62     0
    Residuals 39216  5838.839    0.149
       These results suggest that engine power explains about 29% of the variance in truncated
transformed CO. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is
statistically significant. To evaluate the model, residual normality is examined in the QQ plot
and constancy of variance is checked by examining residuals vs. fitted values.

-------
[Figure 11-26 image not reproduced. Panels: (a) Scatter Plot; (b) Residual vs. Fit;
(c) Response vs. Fit; (d) Residuals Normal QQ]

               Figure 11-26 QQ and Residual vs. Fitted Plot for CO Model 2.1
       Although the residual plot in Figure 11-26 shows a linear relationship between engine
power and truncated transformed CO, square root transformation and logarithmic transformation
are tested to see whether transformation would be useful to improve the model. Scatter plots
and residual plots based on each transformation should then be prepared and analyzed to decide
which transformation is most effective.
              $Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \text{Error}$                     (2.2)

              $Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \text{Error}$             (2.3)
       The results for Model 2.2 are shown in Table 11-18 and Figure 11-27, while the results
for Model 2.3 are shown in Table 11-19 and Figure  11-28.

-------
Table 11-18 Regression Result for CO Model 2.2
 Call:  lm(formula = log.CO ~ engine.power^(1/2), data  =  busdata!0242006.1.4,
        na.action = na.exclude)
 Residuals:
     Min      1Q   Median     3Q   Max
  -2.679 -0.2124 -0.01769 0.2178 2.319

 Coefficients:
                           Value Std. Error    t value  Pr(>|t|)
            (Intercept)   -2.3645    0.0039  -610.0636     0.0000
  I(engine.power^(1/2))    0.0526    0.0004    125.3638     0.0000

 Residual standard error: 0.3857 on 39216 degrees  of freedom
 Multiple R-Squared: 0.2861
 F-statistic:  15720 on 1 and 39216 degrees of  freedom, the  p-value is  0

 Correlation of Coefficients:
                        (Intercept)
 I(engine.power^(1/2)) -0.8646

 Analysis of Variance Table

 Response: log.CO

 Terms  added sequentially (first to last)
                          Df Sum of Sq  Mean  Sq  F Value Pr(F)
 I(engine.power^(1/2))     1  2337.466 2337.466 15716.09     0
             Residuals 39216  5832.624    0.149
[Figure 11-27 image not reproduced. Panels: (a) Scatter Plot; (b) Residual vs. Fit;
(c) Response vs. Fit; (d) Residuals Normal QQ]

               Figure 11-27 QQ and Residual vs. Fitted Plot for CO Model 2.2

-------
Table 11-19 Regression Result for CO Model 2.3
 Call:  lm(formula = log.CO ~ log10(engine.power +  1),  data  =  busdata!0242006.1.4,
 na.action = na.exclude)
 Residuals:
     Min      1Q  Median     3Q   Max
  -2.636 -0.2225 -0.0167 0.2193 2.308

 Coefficients:
                             Value Std. Error   t  value   Pr(>|t|)
             (Intercept)   -2.4326    0.0050  -489.4690     0.0000
 log10(engine.power + 1)    0.3031    0.0028   107.5567     0.0000

 Residual standard error: 0.4011 on 39216 degrees  of  freedom
 Multiple R-Squared: 0.2278
 F-statistic:  11570 on 1 and 39216 degrees of freedom, the  p-value is  0

 Correlation of Coefficients:
                         (Intercept)
 log10(engine.power + 1) -0.9132

 Analysis of Variance Table

 Response: log.CO

 Terms  added sequentially (first to last)
                            Df Sum of Sq  Mean Sq   F  Value  Pr(F)
 log10(engine.power + 1)     1  1861.106  1861.106  11568.45      0
               Residuals 39216  6308.983    0.161

[Figure 11-28 image not reproduced. Panels: (a) Scatter Plot; (b) Residual vs. Fit;
(c) Response vs. Fit; (d) Residuals Normal QQ]

               Figure 11-28 QQ and Residual vs. Fitted Plot for CO Model 2.3

-------
       The results suggest that using square root transformed engine power leaves the amount
of variance explained in truncated transformed CO at about 29% (Model 2.2), while using log
transformed engine power decreases it to about 23% (Model 2.3).
       Of the two transformations, Model 2.2 yields a higher R² than Model 2.3. The residuals
scatter plot for Model 2.2 (Figure 11-27) shows a more reasonably linear relationship than that
for Model 2.3 (Figure 11-28). Figure 11-27 also shows that Model 2.2 does a better job of
improving the pattern of variance than Model 2.3. The QQ plot shows approximate normality
except in the two tails.  Model 2.1 and Model 2.2 are both acceptable at this point.

11.2.2.2.2 Linear Regression Model with Dummy Variables

       Figure 11-17 suggests that the relationship between CO and engine power may be some-
what different across the engine power ranges identified in the tree analysis. That is, there may
be higher or lower CO emissions  in different engine power operating ranges. One dummy vari-
able is created to represent different engine power ranges identified in Figure 11-17 for use in
linear regression analysis as illustrated below:
                      Engine power (bhp)     Dummy 1
                           <114.355             1
                           >114.355             0
       This dummy variable and  the interaction between dummy variable and engine power are
then tested to determine whether the use of the variable and interactions can help improve the
model.

    $Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \beta_2\,\text{dummy1} + \beta_3\,\text{dummy1} \cdot \text{engine.power}^{1/2} + \text{Error}$     (2.4)
       The regression yields the results shown in Table 11-20 and Figure 11-29.

-------
Table 11-20 Regression Result for CO Model 2.4
 *** Linear Model ***

 Call:  lm(formula = log.CO ~ engine.power^(1/2)  + dummy1 * engine.power^(1/2),  data =
 busdata!0242006.1.4, na.action = na.exclude)
 Residuals:
     Min      1Q   Median     3Q  Max
  -2.714 -0.2081 -0.01473 0.2136 2.37

 Coefficients:
                                 Value Std. Error   t value  Pr(>|t|)
                  (Intercept)  -2.6690    0.0250  -106.5896    0.0000
        I(engine.power^(1/2))   0.0772    0.0019    41.2399    0.0000
                       dummy1   0.3472    0.0254    13.6516    0.0000
 I(engine.power^(1/2)):dummy1  -0.0338    0.0020   -17.0016    0.0000

 Residual standard error:  0.3836 on 39214 degrees of freedom
 Multiple R-Squared:  0.2936
 F-statistic:  5432 on 3 and 39214 degrees of freedom,  the p-value is 0

 Analysis of Variance Table

 Response:  log.CO

 Terms added sequentially (first to last)
                                 Df Sum of Sq  Mean Sq  F Value Pr(F)
        I(engine.power^(1/2))     1  2337.466 2337.466 15881.03     0
                       dummy1     1    18.325   18.325   124.50     0
 I(engine.power^(1/2)):dummy1     1    42.545   42.545   289.05     0
                    Residuals 39214  5771.754    0.147
[Figure 11-29 image not reproduced. Panels: (a) Residuals vs. Fit; (b) Response vs. Fit;
(c) Residuals Normal QQ]

               Figure 11-29 QQ and Residual vs. Fitted Plot for CO Model 2.4

-------
       Model 2.4 improves R² only marginally over Model 2.2, and the amount of variance
explained in truncated transformed CO remains at about 29%, the same as for Model 2.1 and
Model 2.2.  The residuals scatter plot for Model 2.4 (Figure 11-29) shows a reasonably linear
relationship.  Figure 11-29 also shows that Model 2.4 does a good job of improving the pattern
of variance.  The QQ plot shows general normality with exceptions arising in the tails. These
three models (Model 2.1, Model 2.2, and Model 2.4) are all acceptable.

11.2.2.2.3 Model Discussion

       The previous sections outline the model development process from a regression tree
model, to a simple OLS model, to more complex OLS models.  The results of each step in the
model improvement process are presented in Table 11-21.  The mean emission rates at HTBR end
nodes (single value) are compared to the results of various linear regression functions with
engine power. Since the R² and slope in Table 11-21 are derived by comparing model predictions
and actual observations for emission rates (untransformed y), these numbers are different from
those encountered in the linear regression models.

Table 11-21 Comparative Performance Evaluation of CO Emission Rate Models
                                                  Coefficient of
 Model                                            determination    Slope     RMSE        MPE
 Mean ERs                                            0.000005      1.000    0.047559   0.0000002
 Linear regression (power)                           0.0880        1.422    0.04622    0.00749
 Linear regression (power^0.5)                       0.0899        1.984    0.04662    0.00804
 Linear regression (log(power))                      0.0659        2.560    0.04736    0.00866
 Linear regression (power^0.5) w/dummy variables     0.0915        1.657    0.04634    0.00777
       The improvement in R² associated with moving toward a linear function of engine power
is significant. Hence, the use of the linear regression function will provide a significant
improvement in spatial and temporal model prediction capability.  However, this linear
regression function might still be improved.
       Results suggest that a linear regression function with square root transformation performs
slightly better than the others in terms of R² and that the use of dummy variables can further
improve model performance. However, given the marginal improvement in R², one could argue that
use of untransformed engine power may be just as reasonable considering the slope, RMSE, and
MPE.  Although the linear regression function with dummy variables performs slightly better than
other linear regression models, more explanatory variables (the dummy variable and its
interaction with engine power) are introduced and the complexity of the regression model
increases. As discussed in Section 11.2.2.1, there is no compelling reason to include the dummy
variables in the model, given that: 1) the model with dummy variables is more complex without
significantly improving model performance, and 2) there is no compelling engineering reason at
this time to support the difference in model performance within these specific power regions.
These dummy variables are, however, worth exploring when additional data from other engine
technology groups become available for analysis.
       Considering all four parameters together, Model 2.1 is recommended as the preliminary
'final' model.  The next step in model evaluation is to once again examine the residuals for the
improved model.  A principal objective was to verify that the statistical properties of the
regression model conform to a set of properties of least squares estimators. In summary, these
properties require that the error terms be normally distributed, have a mean of zero, and have
uniform variance.
       Test for Constancy of Error Variance
       A plot of the residuals versus the fitted values is useful in identifying patterns in the
residuals.  Figure 11-26 plot (b) shows this plot for CO Model 2.1. Without considering variance
due to high emission points and zero load data, there is no obvious pattern in the residuals across
the fitted values.
       Test of Normality of Error Terms
       The first informal test normally reserved for the test of normality of error terms is a
quantile-quantile plot of the residuals. Figure  11-26 plot (c) shows the normal quantile plot of
CO model 2.1. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality. Under normality, we expect 68 percent of the residuals
to fall between ±\sqrt{MSE} and about 90 percent to fall between ±1.645\sqrt{MSE}. Actually, 95.20%
of residuals fall within the first limits, while 96.97% of residuals fall within the second limits.
Thus the actual frequencies here are reasonably consistent with those expected under normality.
The heavy tails at both ends are a cause for concern, but these tails are due to the nature of the
data set. For example, even after the transformation, the response variable is not truly normally
distributed.
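       The frequency comparison described above can be sketched in a few lines. The following
Python fragment is a minimal illustration, not the procedure used in this study: the residuals are
synthetic draws from a normal distribution (an assumption made only so the example runs), and the
names coverage_within and n_params are hypothetical. MSE is taken as the residual sum of squares
divided by n minus the number of fitted parameters.

```python
import math
import random


def coverage_within(residuals, n_params, multiplier):
    """Fraction of residuals within +/- multiplier * sqrt(MSE)."""
    n = len(residuals)
    # MSE = residual sum of squares / residual degrees of freedom
    mse = sum(r * r for r in residuals) / (n - n_params)
    limit = multiplier * math.sqrt(mse)
    return sum(1 for r in residuals if abs(r) <= limit) / n


# Hypothetical residuals, generated only for illustration.
random.seed(0)
residuals = [random.gauss(0.0, 0.04) for _ in range(20000)]

share_1 = coverage_within(residuals, n_params=2, multiplier=1.0)
share_2 = coverage_within(residuals, n_params=2, multiplier=1.645)
print(f"within 1.000*sqrt(MSE): {share_1:.3f} (normal theory: about 0.68)")
print(f"within 1.645*sqrt(MSE): {share_2:.3f} (normal theory: about 0.90)")
```

       For normally distributed residuals both shares should be close to the nominal 68 and 90
percent. Shares well above the nominal values would suggest residuals that are concentrated
around zero, with a few large residuals inflating the MSE.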
       Based on the above  analysis, the final CO emission rate model for the cruise mode is:
                            CO = 10^{(-2.223 + 0.0033 engine.power)}
11.2.2.3 HC Emission Rate Model Development for Cruise Mode

       Based on previous analysis, truncated transformed HC will serve as the dependent
variable. However, modelers should keep in mind that comparisons between statistical models
should always be made on the original untransformed scale of Y. Previous analysis
results suggest that engine power is the best explanatory variable to begin with.

11.2.2.3.1 Linear Regression Model with Engine Power

       Let's select engine power to begin with, and estimate the model:

                      Y = \beta_0 + \beta_1 engine.power + Error                        (3.1)

       The regression run shows the results in Table 11-22 and Figure 11-30.

Table 11-22 Regression Result for HC Model 3.1
 Call:  lm(formula = HC.25 ~ engine.power, data = busdata10242006.1.4, na.action =
 na.exclude)
 Residuals:
     Min      1Q     Median      3Q    Max
  -0.123 -0.0212 0.00002295 0.02228 0.3279

 Coefficients:
                 Value Std. Error  t value Pr(>|t|)
  (Intercept)    0.1769   0.0003   537.0480   0.0000
 engine.power   0.0001   0.0000    43.0656   0.0000

 Residual standard error: 0.04248 on 38018 degrees of freedom
 Multiple R-Squared: 0.04651
 F-statistic:  1855 on 1 and 38018 degrees of freedom, the p-value  is 0

 Correlation of Coefficients:
              (Intercept)
 engine.power -0.7501

 Analysis of Variance Table

 Response: HC.25

 Terms  added sequentially  (first to last)
                 Df Sum of Sq  Mean Sq  F Value Pr(F)
 engine.power     1   3.34748 3.347484 1854.647     0
    Residuals 38018  68.61934 0.001805
       The results suggest that engine power explains about 5% of the variance in truncated
transformed HC. The F-statistic shows that \beta_1 ≠ 0, and the linear relationship is statistically
significant. To evaluate the model, normality is examined in the QQ plot and constancy of variance
is checked by examining residuals vs. fitted values.
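       As an illustration of how the quantities reported for Model 3.1 relate to one another, the
following Python sketch fits a single-predictor least squares line and computes R2 and the
F-statistic. It is a simplified stand-in for the S-PLUS lm() run, not a reproduction of it: the
data are synthetic (generated with an intercept, slope, and noise level loosely resembling the
regression result above), and the function name simple_ols is hypothetical.

```python
import random


def simple_ols(x, y):
    """Return intercept, slope, R-squared and F-statistic for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    # One regression degree of freedom, n - 2 residual degrees of freedom
    f_stat = (sst - sse) / (sse / (n - 2))
    return intercept, slope, r2, f_stat


# Synthetic data for illustration only (not the bus data set).
random.seed(0)
power = [random.uniform(0.0, 300.0) for _ in range(5000)]
hc25 = [0.177 + 0.0001 * p + random.gauss(0.0, 0.04) for p in power]

intercept, slope, r2, f_stat = simple_ols(power, hc25)
print(f"intercept={intercept:.4f} slope={slope:.6f} R2={r2:.4f} F={f_stat:.1f}")
```

       For a single predictor, F equals R2/(1 - R2) multiplied by the residual degrees of
freedom. Applying this to the reported values gives 0.04651/(1 - 0.04651) x 38018, or about 1854,
which agrees with the reported F value within the rounding of R2.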
       [Panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
                Figure 11-30 QQ and Residual vs. Fitted Plot for HC Model 3.1
       The residual plot in Figure 11-30 shows a slight departure from linear regression assump-
tions indicating a need to explore a curvilinear regression function. Since the variability at the
different X levels appears to be fairly constant, a transformation on X is considered.  A transformation
is considered first to avoid the multicollinearity that adding a second-order term of X would
introduce. Based on the prototype plot in Figure 11-30, the square root transformation and
logarithmic transformation are tested. Scatter plots and residual plots based on each transformation
are then prepared and analyzed to determine which transformation is most effective.
                     Y = \beta_0 + \beta_1 engine.power^{1/2} + Error                    (3.2)
                  Y = \beta_0 + \beta_1 \log_{10}(engine.power + 1) + Error              (3.3)
       The results for Model 3.2 are shown in Table 11-23 and Figure 11-31, while the results
for Model 3.3 are shown in Table 11-24 and Figure 11-32.
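       The comparison of candidate transformations of X can be sketched as follows. This Python
fragment is illustrative only: the response is synthetic and is deliberately generated from a
square root relationship, so the outcome says nothing about the bus data; the function name
r_squared and the variable names are hypothetical. For a single predictor, R2 equals the squared
correlation between the predictor and the response.

```python
import math
import random


def r_squared(x, y):
    """R-squared of a simple linear regression of y on x (squared correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


# Synthetic engine power and response, for illustration only.
random.seed(1)
power = [random.uniform(0.0, 300.0) for _ in range(5000)]
hc25 = [0.170 + 0.0022 * math.sqrt(p) + random.gauss(0.0, 0.005) for p in power]

# Candidate forms of the explanatory variable, as in Models 3.1 to 3.3.
candidates = {
    "power": power,
    "power^(1/2)": [math.sqrt(p) for p in power],
    "log10(power + 1)": [math.log10(p + 1.0) for p in power],
}
scores = {name: r_squared(x, hc25) for name, x in candidates.items()}
best = max(scores, key=scores.get)
for name, value in scores.items():
    print(f"{name:18s} R2 = {value:.4f}")
print("highest R2:", best)
```

       R2 alone should not decide the choice; the residual and QQ plots for each candidate
still need to be examined as described above.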
Table 11-23 Regression Result for HC Model 3.2
 Call:  lm(formula = HC.25 ~ engine.power^(1/2), data = busdata10242006.1.4,  na.action
         =  na.exclude)
 Residuals:
      Min       1Q     Median      3Q    Max
  -0.1233 -0.02113 -0.0002419 0.02195 0.3266
 Coefficients:
                           Value Std. Error  t value Pr(>|t|)
            (Intercept)   0.1700   0.0004   396.7451   0.0000
 I(engine.power^(1/2))    0.0022   0.0000    47.6385   0.0000
 Residual standard error: 0.04227 on 38018 degrees of  freedom
 Multiple R-Squared: 0.05633
 F-statistic:  2269 on 1 and 38018 degrees of freedom,  the p-value  is  0

 Correlation of Coefficients:
                       (Intercept)
 I(engine.power^(1/2))  -0.8625

 Analysis of Variance Table

 Response:  HC.25

 Terms added sequentially (first to last)
                          Df Sum of Sq  Mean Sq  F Value Pr(F)
 I(engine.power^(1/2))     1   4.05395 4.053948 2269.422      0
             Residuals 38018  67.91288 0.001786
       [Figure 11-31 (HC Model 3.2). Panel: (a) Scatter Plot]
Table 11-24 Regression Result for HC Model 3.3
 Call:  lm(formula = HC.25 ~ log10(engine.power + 1), data = busdata10242006.1.4,
        na.action = na.exclude)
 Residuals:
     Min       1Q     Median      3Q    Max
  -0.127 -0.02073 -0.0003198 0.02203 0.3226
 Coefficients:
                           Value Std. Error  t value Pr(>|t|)
             (Intercept)  0.1653   0.0005   313.2136   0.0000
 log10(engine.power + 1)  0.0139   0.0003    46.4046   0.0000
 Residual standard error: 0.04233 on 38018 degrees of freedom
 Multiple R-Squared: 0.05361
 F-statistic:  2153 on 1 and 38018 degrees of freedom, the p-value is  0

 Correlation of Coefficients:
                         (Intercept)
 log10(engine.power + 1)  -0.9114

 Analysis of Variance Table

 Response:  HC.25

 Terms added sequentially (first to last)
                            Df Sum of Sq  Mean Sq F Value Pr(F)
 log10(engine.power + 1)     1   3.85779 3.857786 2153.39      0
               Residuals 38018  68.10904 0.001791
       [Panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
               Figure 11-32 QQ and Residual vs. Fitted Plot for HC Model 3.3
       The results suggest that by using transformed engine power, the model still explains only
about 5% of the variance in truncated transformed HC (R2 of 0.056 for Model 3.2 and 0.054 for
Model 3.3, compared with 0.047 for Model 3.1). The improvement is very small.
       Model 3.2 improves R2 relative to Model 3.3. The scatter plot for Model 3.2 (Figure
11-31) also shows a better linear relationship than Model 3.3  (Figure 11-32). Figure 11-31 also
shows that Model 3.2 does a good job  of improving the pattern of variance. The QQ plot shows
general normality with the exceptions  arising in the tails.

11.2.2.3.2 Linear Regression Model with Dummy Variables

       Figure 11-21 suggests that the relationship between HC and engine power may differ
across the engine power ranges. One dummy variable is created to represent different engine
power ranges identified in Figure 11-21 for use in linear regression analysis as illustrated below:
                    Engine power (bhp)      Dummy 1
                         < 15.335               1
                         > 15.335               0

       This dummy variable and the interaction between dummy variable and engine power
are then tested to determine whether the use of the variable and interaction can help improve the
model.
 Y = \beta_0 + \beta_1 \log_{10}(engine.power + 1) + \beta_2 dummy1 + \beta_3 dummy1 \log_{10}(engine.power + 1) + Error     (3.4)
       The regression run shows the results in Table  11-25 and Figure 11-33.
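       The structure of Model 3.4 can be made concrete with a short sketch. Because the dummy
variable enters both as a main effect and as an interaction with the transformed power term, the
least squares estimates are equivalent to fitting one line per power range: the range with
dummy1 = 0 gives \beta_0 and \beta_1, and the range with dummy1 = 1 gives \beta_0 + \beta_2 and
\beta_1 + \beta_3. The Python fragment below is a hedged illustration under two assumptions: the
demonstration data are noise-free values built from the coefficients printed in Table 11-25, and
observations exactly at 15.335 bhp are assigned dummy1 = 0 (the table does not say which range
they belong to). The function names are hypothetical.

```python
import math

THRESHOLD_BHP = 15.335  # boundary between the two engine power ranges


def fit_line(x, y):
    """Least squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_dummy_interaction(power, y, threshold=THRESHOLD_BHP):
    """Return (b0, b1, b2, b3) of Y = b0 + b1*x + b2*d + b3*d*x, x = log10(power + 1)."""
    x = [math.log10(p + 1.0) for p in power]
    # Assumption: observations exactly at the threshold get dummy = 0.
    low = [(xi, yi) for p, xi, yi in zip(power, x, y) if p < threshold]
    high = [(xi, yi) for p, xi, yi in zip(power, x, y) if p >= threshold]
    b0, b1 = fit_line([xi for xi, _ in high], [yi for _, yi in high])  # dummy = 0
    a0, a1 = fit_line([xi for xi, _ in low], [yi for _, yi in low])    # dummy = 1
    return b0, b1, a0 - b0, a1 - b1


# Noise-free demonstration data built from the coefficients printed in Table 11-25.
TABLE_COEF = (0.1695, 0.0124, 0.0022, -0.0249)
demo_power = [1.0, 5.0, 10.0, 20.0, 50.0, 100.0, 200.0]
demo_y = []
for p in demo_power:
    x_val = math.log10(p + 1.0)
    d = 1.0 if p < THRESHOLD_BHP else 0.0
    c0, c1, c2, c3 = TABLE_COEF
    demo_y.append(c0 + c1 * x_val + c2 * d + c3 * d * x_val)

fitted = fit_dummy_interaction(demo_power, demo_y)
print([round(c, 4) for c in fitted])
```

       With noise-free inputs the sketch recovers the four coefficients it was given, which
confirms only that the two-line fit and the dummy-variable parameterization describe the same
model.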
Table 11-25 Regression Result for HC Model 3.4
 Call:  lm(formula = HC.25 ~ log10(engine.power + 1)  + dummy1 * log10(engine.power +
 1),  data = busdata10242006.1.4,  na.action = na.exclude)
 Residuals:
      Min      1Q     Median      3Q    Max
  -0.1292 -0.0209 -0.0007262  0.02123 0.3423
 Coefficients:
                                  Value Std. Error   t value  Pr(>|t|)
                    (Intercept)  0.1695    0.0015   109.7632    0.0000
        log10(engine.power + 1)  0.0124    0.0008    15.7058    0.0000
                         dummy1  0.0022    0.0017     1.3388    0.1807
 dummy1:log10(engine.power + 1) -0.0249    0.0012   -20.1153    0.0000
 Residual standard error:  0.04184 on 38016 degrees of freedom
 Multiple R-Squared:  0.07514
 F-statistic:  1030 on 3 and 38016 degrees of freedom,  the p-value is 0

 Analysis of Variance Table

 Response:  HC.25

 Terms added sequentially (first to last)
                                   Df Sum of Sq  Mean Sq  F Value Pr(F)
        log10(engine.power + 1)     1   3.85779 3.857786 2203.411     0
                         dummy1     1   0.84128 0.841276  480.503     0
 dummy1:log10(engine.power + 1)     1   0.70843 0.708425  404.624     0
                      Residuals 38016  66.55934 0.001751
       [Panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
               Figure 11-33 QQ and Residual vs. Fitted Plot for HC Model 3.4
       The results suggest that by using dummy variables and interactions with transformed en-
gine power, the model only increases the amount of variance explained in truncated transformed
HC from about 5% to about 8%.
       Model 3.4 slightly improved R2 relative to Model 3.2. The F-statistic shows that not all
\beta values are equal to zero, and the linear relationship is statistically significant. The gap in the
residuals plot may reflect the shift in intercept and slope between the two regression
functions.

11.2.2.3.3 Model Discussion

       The previous sections outline the model development process from regression tree model,
to a simple OLS model, to more complex OLS models.  To test whether the
linear regression with power was a beneficial addition to the regression tree model, the mean
ERs at HTBR end nodes (single value) are compared to the predictions from the linear regres-
sion function with engine power. The results of the performance evaluation are shown in Table
11-26. Because the R2 and slope in Table 11-26 are derived by comparing model predictions with
actual observations for emission rates (untransformed Y), these numbers differ from the results
reported for the linear regression models above. The improvement in R2 associated with moving
toward a linear function of engine power is nearly imperceptible. Hence, the use of the linear
regression function will provide almost no improvement in spatial and temporal model
prediction capability.  This linear regression function might still be improved.
Table 11-26 Comparative Performance Evaluation of HC Emission Rate Models
                                                   Coefficient of
                                                   determination    Slope      RMSE         MPE
Mean ERs                                              0.00002       1.000    0.0020519   0.0000003
Linear regression (power)                             0.00766       0.886    0.0020984   0.00047397
Linear regression (power^0.5)                         0.00912       0.724    0.0020845   0.00040936
Linear regression (log(power))                        0.00950       0.820    0.0020831   0.00040857
Linear regression (log(power)) w/dummy variables      0.00939      -1.142    0.0022933   0.00097449
       Results suggest that the linear regression function with log transformation performs
slightly better than the others and that the use of dummy variables can further improve model
performance, but again there is almost no perceptible change in terms of explained variance.
Although the linear regression function with log transformation and dummy variables performs
slightly better than the linear regression function with square root transformation alone, the
revised model introduces additional explanatory variables (dummy variables and the interaction
with engine power) and increases the complexity of the regression model without significantly
improving the model.  As discussed in Section 11.2.2.1, there is no compelling reason to include
the dummy variables in the model, given that:  1) the second model is more complex without sig-
nificantly improving model performance, and 2) there is no compelling engineering reason at this
time to support the difference in model performance within these specific power regions. These
dummy variables are, however, worth exploring when additional data from other engine technol-
ogy groups become available for analysis.
       Model 3.2 is recommended as the preliminary "final" model (although one might argue
that using the regression tree results directly would also probably be acceptable). The next step
in model evaluation is to once again examine the residuals for the improved model. A principal
objective was to verify that the statistical properties of the regression model conform to a set of
properties of least squares estimators. In summary, these properties require that the error terms
be normally distributed, have a mean of zero, and have uniform variance.
       Test for Constancy of Error Variance
       A plot of the residuals versus the fitted values is useful in identifying any patterns in the
residuals. Figure 11-31 plot (c) shows this plot for HC Model 3.2. Without considering variance
due to high emission points and zero load data, there  is no obvious pattern in the residuals across
the fitted values.
       Test of Normality of Error Terms
       The first informal test normally reserved for the test of normality of error terms is a
quantile-quantile plot of the residuals. Figure 11-31 plot (d) shows the normal quantile plot of
the HC model. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality.  Under normality, we expect  68 percent of the residuals
to fall between ±\sqrt{MSE} and about 90 percent to fall between ±1.645\sqrt{MSE}.  Actually, 95.20%
of residuals fall within the first limits, while 96.99% of residuals fall within the second limits.
Thus, the actual frequencies here are reasonably consistent with those expected under normality.
The heavy tails at both ends are a cause for concern, but are due to the nature of the data set. For
example, even after the transformation, the response variable is not truly normally distributed.
       The final HC emission rate model selected for cruise mode is:

                            HC = [0.170 + 0.0022 (engine.power)^{1/2}]^4
                      11.3 Conclusions and Further Considerations

       In this research, engine power is used as the main explanatory variable to develop cruise
emission rate models.  The explanatory ability of engine power varies by pollutant. In general,
NOx is more highly correlated with engine power than are the other two pollutants.
       Inter-bus variability analysis indicated that some of the 15 buses are higher emitters than
others (especially noted for HC emissions).  However, none of the buses appear to qualify as
traditional high-emitters, which would exhibit emission rates of two to three standard devia-
tions above the mean.  Hence, it is difficult to classify any of these 15 buses as high emitters
for modeling purposes. At this point, these 15 buses are treated as a whole data set for model
development. Modelers should keep in mind that although no true high-emitters are present in
the database, such vehicles may  behave significantly differently than the vehicles tested. Hence,
data from high-emitting vehicles should be collected and examined in future studies.
       Some high HC emissions events are noted in cruise mode.  After screening engine speed,
engine power, engine oil temperature, engine oil pressure, engine coolant temperature, ECM
pressure, and other parameters, no variables  were identified that could be linked to these high
emissions events.  These events may represent natural variability in onroad emissions, or some
other variable (such as grade or an engine variable that is not measured) may be linked to these
events.
       Engine power is selected as the most important variable for all three pollutants based on
HTBR tree models.  This finding is consistent with previous research results which verified the
important role of engine power (Ramamurthy et al. 1998; Clark et al. 2002; Barth et al. 2004).
The noted HC relationship is significant but fairly weak. Analysis in previous chapters also indi-
cates that engine power is correlated with not only onroad load parameters such as vehicle speed,
acceleration, and grade, but also potentially correlated with engine operating parameters such
as throttle position and engine oil pressure. On the other hand, engine power in this research is
derived from engine speed,  engine torque and percent engine load.
       The regression tree models still suggest that some other variables, like oil pressure and
engine barometric pressure, may also impact the HC emissions. Further analysis demonstrates
that engine power alone can achieve explanatory ability similar to that obtained by
using engine power together with other variables.  To develop models that are efficient and
easy to implement, only engine power is used to develop emission models. However, additional
investigation into  these variables is warranted as additional detailed data from engine testing
become available  for analysis.
       Given the relationships noted between engine indicated HP and emission rates, it is
imperative that data be collected to develop solid relationships in engine power demand models
(estimating power demand as a function of speed/acceleration, grade, vehicle characteristics,
surface roughness, inertial losses, etc.) for use in regional inventory development and microscale
impact assessment.
       In summary, the cruise emission rate models selected for implementation are:

                        NOx = [0.0087 + 0.0311 (engine.power)^{1/2}]^2


                          CO = 10^{(-2.223 + 0.0033 engine.power)}


                         HC = [0.170 + 0.0022 (engine.power)^{1/2}]^4
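       For implementation, the three cruise models translate directly into code. The Python
sketch below simply evaluates the equations above; the function names are hypothetical, and the
units (engine power in bhp, emission rates in grams/second) are assumed from the surrounding
text rather than stated with the equations.

```python
import math


def nox_cruise(engine_power):
    """NOx cruise emission rate: square of the fitted square-root-scale model."""
    return (0.0087 + 0.0311 * math.sqrt(engine_power)) ** 2


def co_cruise(engine_power):
    """CO cruise emission rate: antilog of the fitted log10-scale model."""
    return 10.0 ** (-2.223 + 0.0033 * engine_power)


def hc_cruise(engine_power):
    """HC cruise emission rate: fourth power of the fitted fourth-root-scale model."""
    return (0.170 + 0.0022 * math.sqrt(engine_power)) ** 4


# Example evaluation at a few engine power levels (assumed to be bhp).
for power in (0.0, 50.0, 100.0, 200.0):
    print(f"power={power:6.1f}  NOx={nox_cruise(power):.5f}  "
          f"CO={co_cruise(power):.5f}  HC={hc_cruise(power):.6f}")
```

       Each function back-transforms the prediction to the original emission rate scale, which
is the scale on which the models are compared in the performance evaluations.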
                                    CHAPTER 12
                             12. MODEL VERIFICATION
       In the previous chapters, three statistically-derived modal emission rate models were de-
veloped for use in predicting emissions of NOx, CO and HC from transit buses. This chapter dis-
cusses the reasons for using engine power instead of surrogate power variables in emission rate
modeling, the necessity of developing a linear regression model rather than using mean emission
rates, the need to introduce driving mode with load modeling, the possibility of combining ac-
celeration and cruise modes, and other issues.

                    12.1 Engine Power vs. Surrogate Power Variables

       The first step towards verifying the model is to compare the explanatory power of real
load data and surrogate power variables. Different approaches have been proposed by several re-
searchers.  The MOVES model employs vehicle specific power (VSP), defined as instantaneous
power per unit mass of the vehicle (Jimenez-Palacios 1999).
       VSP is a measure of the road load on a vehicle, defined as the power per unit mass to
overcome road grade, rolling and aerodynamic resistance, and inertial acceleration (Jimenez-
Palacios 1999; U.S. EPA 2002b; Nam 2003; Younglove et al. 2005):
             VSP = v * (a * (1 + \gamma) + g * grade + g * C_R) + 0.5 \rho * C_D * A * v^3 / m
             where:
                 v:      vehicle speed (assuming no headwind) in m/s
                 a:      vehicle acceleration in m/s^2
                 \gamma: mass factor accounting for the rotational masses (~0.1)
                 g:      acceleration due to gravity
                 grade:  road grade
                 C_R:    rolling resistance (~0.0135)
                 \rho:   air density (1.2)
                 C_D:    aerodynamic drag coefficient
                 A:      the frontal area
                 m:      vehicle mass in metric tons
       Using typical values for coefficients, in SI units the equation becomes (C_D A/m ~ 0.0005)
(Younglove et al. 2005):
             VSP = v * (1.1 * a + 9.81 * grade + 0.132) + 0.001208 * v^3
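       The general VSP definition can be written as a small function. The Python sketch below
evaluates the general form with the typical light-duty values quoted in the text as defaults; it
does not attempt to reproduce the coefficients of the simplified equation. The function name vsp,
the argument names, and the value g = 9.81 m/s^2 are assumptions introduced for illustration, and
the ratio C_D*A/m must be supplied by the caller.

```python
def vsp(v, a, grade, cda_over_m, mass_factor=0.1, g=9.81, c_r=0.0135, rho=1.2):
    """Vehicle specific power from the general road-load form.

    v: speed in m/s; a: acceleration in m/s^2; grade: road grade (fraction);
    cda_over_m: drag coefficient times frontal area divided by vehicle mass.
    Defaults are the typical light-duty values quoted in the text.
    """
    # Inertial, grade and rolling resistance terms, all proportional to speed
    road_load = v * (a * (1.0 + mass_factor) + g * grade + g * c_r)
    # Aerodynamic drag term, proportional to speed cubed
    aero = 0.5 * rho * cda_over_m * v ** 3
    return road_load + aero


# Example: steady 10 m/s on level road, then the same speed accelerating at 1 m/s^2.
print(vsp(v=10.0, a=0.0, grade=0.0, cda_over_m=0.0005))
print(vsp(v=10.0, a=1.0, grade=0.0, cda_over_m=0.0005))
```

       Keeping the coefficients as arguments makes it explicit that values such as the mass
factor would need to be re-estimated before the expression could be applied to transit buses.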
       The VSP approach to emission characterization was developed by several researchers
(Jimenez-Palacios 1999; U.S. EPA 2002b; Nam 2003; Younglove et al. 2005) and further devel-
oped as part of the MOVES model. The coefficients used to estimate VSP differ among pre-
vious studies because of the choice of typical values for the coefficients. This surrogate power
variable (VSP) is not suitable for comparison with the engine load data in this study, for two
reasons. First, the implementation approach used in MOVES is based upon VSP bins, not on
instantaneous VSP. Second, the coefficients given in the above equation are specific to
light-duty vehicles, not transit buses. For example, a mass factor of 0.1 is not suitable to
describe the inertial loss characteristics of a transit bus.
       Other research efforts have used surrogate power variables such as the inertial power sur-
rogate, defined as acceleration times velocity, and the drag power surrogate, defined as acceleration
times velocity squared (Fomunung 2000). Barth and Frey also used acceleration times velocity
for power demand estimation (Barth and Norbeck 1997; Frey et al. 2002). Both surrogate vari-
ables for power demand can be evaluated for NOx in cruise mode. Using surrogate variables
instead of real load data, the model is:

        Y = \beta_0 + \beta_1 acceleration + \beta_2 vehicle.speed + \beta_3 vehicle.speed * acceleration +
             \beta_4 vehicle.speed^2 * acceleration + Error
       The regression run shows the results in Table 12-1 and Figure 12-1.
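       To make the surrogate terms explicit, the following Python sketch builds the five terms
of the surrogate-variable model and evaluates the fitted function using the coefficients as
printed (rounded to four decimals) in Table 12-1. It is an illustration only: the function
names are hypothetical, the units of speed and acceleration are those of the underlying data set
and are not restated here, and the rounded coefficients will not reproduce the original fitted
values exactly. The response is square-root transformed NOx (NOx.50), so a prediction is squared
to return to the emission rate scale.

```python
# Coefficients as printed (rounded) in Table 12-1, in the order:
# intercept, vehicle.speed, acceleration, speed*acceleration, speed^2*acceleration
TABLE_12_1_COEF = [0.1996, 0.0043, 0.0738, 0.0066, -0.0001]


def surrogate_terms(speed, accel):
    """Model terms: constant, speed, acceleration, inertial and drag power surrogates."""
    return [1.0, speed, accel, speed * accel, speed ** 2 * accel]


def predict_nox50(speed, accel, coef=TABLE_12_1_COEF):
    """Predicted square-root-transformed NOx for one observation."""
    return sum(c * t for c, t in zip(coef, surrogate_terms(speed, accel)))


def predict_nox(speed, accel, coef=TABLE_12_1_COEF):
    """Back-transform the prediction to the NOx emission rate scale."""
    return predict_nox50(speed, accel, coef) ** 2


print(surrogate_terms(10.0, 2.0))
print(predict_nox50(10.0, 2.0), predict_nox(10.0, 2.0))
```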
Table 12-1 Regression Result for NOx Model 1
 Call:  lm(formula = NOx.50 ~ vehicle.speed * acceleration + vehicle.speed^2 :
        acceleration,  data = busdata10242006.1.4,  na.action =  na.exclude)
 Residuals:
      Min       1Q   Median      3Q   Max
  -0.4779 -0.08625 0.001824 0.08759 1.338
 Coefficients:
                                    Value  Std. Error    t value   Pr(>|t|)
                     (Intercept)   0.1996     0.0018    113.0559     0.0000
                   vehicle.speed   0.0043     0.0001     77.4369     0.0000
                    acceleration   0.0738     0.0052     14.2957     0.0000
      vehicle.speed:acceleration   0.0066     0.0004     15.5704     0.0000
 acceleration:I(vehicle.speed^2)  -0.0001     0.0000    -13.7590     0.0000
 Residual standard error:  0.1323 on 39369 degrees of freedom
 Multiple R-Squared:  0.3708
 F-statistic:  5801 on 4 and 39369 degrees of freedom,  the p-value is 0

 Correlation of Coefficients:
                                 (Intercept) vehicle.speed acceleration vehicle.speed:acceleration
                   vehicle.speed   -0.9243
                    acceleration    0.0796      -0.0590
      vehicle.speed:acceleration   -0.0825       0.0569      -0.9114
 acceleration:I(vehicle.speed^2)    0.0782      -0.0593       0.7978          -0.9678

 Analysis of Variance Table

 Response:  NOx.50
 Terms added sequentially (first to last)
                                    Df Sum of Sq  Mean Sq  F Value Pr(F)
                   vehicle.speed     1  122.5215 122.5215  6999.67     0
                    acceleration     1  278.9165 278.9165 15934.55
      vehicle.speed:acceleration     1    1.4036   1.4036    80.19
 acceleration:I(vehicle.speed^2)     1    3.3136   3.3136   189.31
                       Residuals 39369  689.1106   0.0175
       [Panels: (a) Residual vs. Fit; (b) Residuals Normal QQ]
                Figure 12-1 QQ and Residual vs. Fitted Plot for NOx Model 1
       The results suggest that the surrogate variable model can explain about 37% of the vari-
ance in truncated transformed NOx, whereas the OLS model developed in Chapter 10 explained
more than 75% of the cruise mode variance. Considering the theoretical equation of engine
power presented much earlier in Chapter 3, the surrogate variables can only represent some, and
not all, of the components of engine power. Given the importance of engine power in explaining
the variability of emissions, it is essential that field data collection efforts include the measure-
ment of indicated load data as well as all of the operating conditions necessary to estimate bhp
load when second-by-second emission rate data are collected.

                 12.2 Mean Emission Rates vs. Linear Regression Model

       The modeling approach employed in this research involved the separation of data into
separate driving modes for analysis and then applying modeling techniques to derive emission
rates as a function of engine load. Although constant emission rates in grams/second were ad-
equate for idle, motoring, and non-motoring deceleration modes, modeling efforts in Chapters 10
and 11 demonstrated that a linear regression function should improve spatial and temporal model
prediction capability significantly for acceleration and cruise modes. However, one verification
comparison that should be undertaken is on the overall  benefit of introducing engine load into the
modeling regime vs. simply using average emission rate values for each operating mode. This
comparison will provide insight into the overall effect of introducing engine load (even though  it
is only introduced into acceleration and cruise modes).
       There are a number of model goodness-of-fit criteria that can be used to assess the dif-
ference between the emissions predicted by the load-based modal emission rate model and the
mode-only emission rate models.  Normally, one would compare the alternative model perfor-
mance for an independent set of data collected from similar vehicles, which is currently not
available.  Alternatively, model developers would set aside a significant subset of the data in the
model development data set so that the data are not used in model development and instead used
in model comparisons.  However, there were not enough data available to do this. Hence, at this
time, the only comparisons that can be made are for alternative model performance using the
same data that were used to develop the models presented in this research effort.
       The performance of the models is first evaluated by comparing model predictions and ac-
tual observations for emission rates.  The performance of the model can be evaluated in terms of
precision and accuracy (Neter et al. 1996). The R2 value is an indication of precision. Usually,
higher R2 values imply a higher degree of precision and less unexplained variability in model
predictions than lower R2 values.  The slope of the trend line for the  observed versus predicted
values is an indication of accuracy. A slope of one indicates an accurate prediction,  in that the
prediction of the model corresponds to an observation.
       The model's predictive ability is also evaluated using the root mean square error (RMSE)
and the mean prediction error (MPE) (Neter et al. 1996).  The RMSE is a measure of prediction
error.  When comparing two models,  the model with a smaller RMSE is a better predictor of
the observed phenomenon. Ideally, mean prediction error is close to zero. RMSE and MPE are
calculated as follows:
              RMSE = \sqrt{ (1/n) \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 }        Equation (12-1)
              MPE = (1/n) \sum_{i=1}^{n} (y_i - \hat{y}_i)                    Equation (12-2)
       where:
                 RMSE:      =      root mean square error
                 n:         =      number of observations
                 y_i:       =      observation y
                 \hat{y}_i: =      predicted mean of observation y
                 MPE:       =      mean prediction error
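       The four performance measures used in this chapter can be computed together. The Python
sketch below is a hedged illustration: the function name is hypothetical, the prediction error is
taken as observed minus predicted (an assumed sign convention), the slope is obtained by
regressing observed values on predicted values, and R2 is the squared correlation between the two.
The small data set is invented, with predictions set to group means to mimic a mean emission rate
model.

```python
import math


def evaluate_predictions(observed, predicted):
    """Return R-squared, slope (observed on predicted), RMSE and MPE."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]  # assumed sign convention
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mpe = sum(errors) / n
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    slope = s_op / s_pp            # trend line of observed versus predicted
    r2 = s_op ** 2 / (s_pp * s_oo)  # squared correlation
    return {"r2": r2, "slope": slope, "rmse": rmse, "mpe": mpe}


# Invented example: two groups, each predicted by its own mean.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
result = evaluate_predictions(observed, predicted)
print(result)
```

       When the predictions are group means, the slope is exactly one and the MPE is exactly
zero, which is the pattern expected for a mean emission rate model evaluated on the data used
to compute the means.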
       To test whether the linear regression with power was a beneficial addition to the regres-
sion tree model, the mean ERs at HTBR end nodes (single value) are compared to the predictions
from the linear regression function with engine power. The results of the performance evaluation
are shown in Table 12-2.
Table 12-2 Comparative Performance Evaluation between Mode-Only Models and Linear Re-
gression Models
                           Coefficient of
                           determination    Slope (\beta_1)     RMSE         MPE
NOx
  Mean ERs                     0.438            1.000         0.08725     0.000002
  Linear Regression            0.665            1.102         0.07122     0.021463
CO
  Mean ERs                     0.248            1.000         0.07406    -0.000004
  Linear Regression            0.491            1.749         0.06691     0.010285
HC
  Mean ERs                     0.0686           1.000         0.00190     0.0000005
  Linear Regression            0.0677           1.213         0.00192     0.000223
       For NOx and CO, the R2 values indicate that the load-based modal emission model performs
better than mean emission rates (R2 of 0.665 vs. 0.438 for NOx and 0.491 vs. 0.248 for CO), so the
use of a linear regression function improves model performance. The results shown in Table 12-2
reinforce the importance of introducing linear regression functions in acceleration and cruise
mode.  For HC, there is no discernible
difference in model performance. Combining this finding with the performance results for HC
noted in Chapters 8 through 11, using constant emission rates for each operating mode could be
justified for this data set. When additional data are collected, researchers should compare mean
emission rates approaches to power-based approaches to ensure that power demand models for
HC are necessary.

12.3 Mode-specific Load Based Modal Emission Rate Model vs. Emission Rate Models as a
                                Function of Engine Load

       Modal modeling approaches are becoming widely accepted as more accurate in making
realistic estimates of mobile source contributions to local and regional air quality.  Research at
Georgia Tech has clearly identified that modal operation is a better indicator of emission rates
than average speed (Bachman 1998).  The analysis of emissions with respect to driving modes,
also referred to as modal emissions, has been performed in recent research  studies (Barth et al.
1996; Bachman 1998; Fomunung et al. 1999; Frey et al. 2002; Nam 2003; Barth et al. 2004).
These studies indicated that driving modes might have the ability to explain a certain portion of
the variability in emissions data. In Chapters  10 and 11, emission rates were derived as a func-
tion of driving mode (cruise, idle, acceleration, and deceleration operations) and engine power
because previous research efforts had separately suggested that vehicle emission rates were
highly correlated with modal activity and engine power. In this research, five driving modes are
introduced in total: idle mode, deceleration motoring mode, revised deceleration mode, accelera-
tion mode, and cruise mode.
       Chapters 10 and 11 did not compare the combined modal and engine power models to
models that use power alone to predict emission rates.  To test the effect of adding driving modes
in the emission rate model, the derivation of a load-only model for NOx emissions is illustrated
in detail. Load-only CO emissions models and HC emissions models are also derived for com-
parison purposes and presented in final form (however, the detailed regression plots and tables
are omitted for the purposes of brevity).
       As in previous chapters, the first step for a load-based only model is to select the most
important variable for NOx emissions. When using the entire database at once (data are not broken
into mode subsets for this derivation), the appropriate transformation for NOx is 1/4 based on
Box-Cox results, rather than the 1/2 value used in developing models for acceleration and cruise
mode (see Chapters 10 and 11). The trimmed HTBR tree models for NOx are illustrated in Figure 12-2
and Table 12-3.
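[Figure 12-2: regression tree on engine.power. Root split at engine.power<41.535; the left branch
splits at engine.power<4.515 (terminal means 0.2768 and 0.4246); the right branch splits at
engine.power<96.255 (terminal means 0.5926 and 0.6933).]
         Figure 12-2 Trimmed Regression Tree Model for Truncated Transformed NOx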

-------
Table 12-3 Trimmed Regression Tree Results for Truncated Transformed NOx
 Regression tree:
 tree(formula = NOx.25 ~ engine.power + vehicle.speed + acceleration +
        oil.temperture + oil.press  + cool.temperature + eng.bar.press +
        model.year  +  odometer + bus360  + bus361 + bus363 + bus364 + bus372 +
        bus375  +  bus377 + bus379 +  bus380 + bus381 + bus382 + bus383 + bus384 +
        bus385  +  dummy.grade,  data  = busdata10242006.1,  na.action = na.exclude,
        mincut  =  3000,  minsize = 6000,  mindev = 0.1)
 Variables actually used in tree construction:
 [1]  "engine.power"
 Number of terminal nodes:  4
 Residual mean deviance:  0.005837 = 618.6 / 106000
 Distribution of residuals:
         Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
  -5.187e-001 -4.510e-002 -9.204e-003  3.768e-016  5.004e-002  6.557e-001
 node),  split,  n, deviance, yval
       * denotes terminal node

 1)  root 105976 3058.00 0.4991
   2)  engine.power<41.535 62441  666.60 0.3823
     4)  engine.power<4.515 17897  195.50 0.2768 *
     5)  engine.power>4.515 44544  192.20 0.4246 *
   3)  engine.power>41.535 43535  316.60 0.6667
     6)  engine.power<96.255 11504   61.56 0.5926  *
     7)  engine.power>96.255 32031  169.20 0.6933  *
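
       The trimmed tree in Table 12-3 amounts to a four-interval lookup on engine.power. The
sketch below encodes the terminal-node means and node deviances copied from the table; the
function and variable names are hypothetical, and sending a value that falls exactly on a split
point to the upper node is an assumption, since the printed output shows only "<" and ">".

```python
# Terminal nodes of the trimmed tree in Table 12-3:
# (upper bound on engine.power, mean of transformed NOx, i.e. NOx^0.25).
TREE_NODES = [
    (4.515, 0.2768),
    (41.535, 0.4246),
    (96.255, 0.5926),
    (float("inf"), 0.6933),
]


def tree_predict_transformed_nox(engine_power):
    """Return the node mean of NOx^0.25 for one engine.power value."""
    for upper_bound, node_mean in TREE_NODES:
        if engine_power < upper_bound:
            return node_mean
    # Only reached for NaN input, where every comparison is False.
    raise ValueError("engine_power must be a real number")


# Deviance bookkeeping from the same table.
terminal_deviance = 195.50 + 192.20 + 61.56 + 169.20   # terminal nodes
root_deviance = 3058.00                                 # root node
explained_fraction = 1.0 - terminal_deviance / root_deviance

if __name__ == "__main__":
    for power in (0.0, 20.0, 60.0, 150.0):
        print(power, tree_predict_transformed_nox(power))
    print(f"fraction of deviance explained: {explained_fraction:.3f}")
```

       The summed terminal deviance is close to the residual deviance of 618.6 printed in the
table, and the explained fraction is about 0.80, so four engine.power intervals alone account
for most of the variance in transformed NOx.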
       After testing different transformations for Y and adding dummy variables according to
HTBR results, Table 12-4 and Figure 12-3 show that a load-based only model for NOx emissions
is a fairly good model, considering the constancy of error variance and normality of error terms.
The final load-based only model for NOx emissions is:

$$\mathrm{NO_x} = \left[0.230 + 0.195\,\log_{10}(\text{engine.power} + 1)\right]^{4}$$

       The regression results are shown in Table 12-4 and Figure 12-3.

-------
Table 12-4 Regression Result for NOx Load-Based Only Emission Rate Model
 Call:  lm(formula = NOx.25 ~ log10(engine.power + 1), data = busdata10242006.1,
       na.action = na.exclude)
 Residuals:
     Min       1Q   Median      3Q   Max
 -0.4683 -0.04297 -0.01329 0.04138 0.663
 Coefficients:
                           Value Std. Error  t value Pr(>|t|)
             (Intercept)  0.2303     0.0005 489.9131   0.0000
 log10(engine.power + 1)  0.1950     0.0003 657.2170   0.0000

 Residual standard error:  0.0754 on 105974 degrees of freedom
 Multiple R-Squared:  0.803
 F-statistic:  431900 on 1 and 105974 degrees of freedom, the p-value is 0

 Correlation of Coefficients:
                         (Intercept)
 log10(engine.power + 1) -0.8702

 Analysis of Variance Table

 Response:  NOx.25

 Terms added sequentially (first to last)
                             Df Sum of Sq  Mean Sq  F Value Pr(F)
 log10(engine.power + 1)      1  2455.676 2455.676 431934.2     0
               Residuals 105974   602.494    0.006
[Figure 12-3 panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
  Figure 12-3 QQ and Residual vs. Fitted Plot for Load-Based Only NOx Emission Rate Model
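
       The summary statistics in Table 12-4 can be cross-checked against the sums of squares in
its Analysis of Variance block. The short sketch below uses only numbers copied from that table;
the variable names are illustrative.

```python
import math

# Values copied from the Analysis of Variance block of Table 12-4.
ss_regression = 2455.676   # sum of squares for log10(engine.power + 1)
ss_residual = 602.494      # residual sum of squares
df_residual = 105974       # residual degrees of freedom

r_squared = ss_regression / (ss_regression + ss_residual)
mean_sq_residual = ss_residual / df_residual
residual_std_error = math.sqrt(mean_sq_residual)
f_value = ss_regression / mean_sq_residual   # one regression degree of freedom

if __name__ == "__main__":
    print(f"R-squared           = {r_squared:.3f}")
    print(f"Residual std. error = {residual_std_error:.4f}")
    print(f"F value             = {f_value:.1f}")
```

       The recomputed values agree with the reported R-Squared of 0.803, the residual standard
error of 0.0754, and an F value of roughly 431,900, to the precision printed in the table.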

-------
       Following the same derivation techniques, the final load-only model for CO emissions is:

$$\mathrm{CO} = 10^{\left[-2.659 + 0.0899\,(\text{engine.power})^{1/2}\right]}$$

       Following the same derivation techniques, the final load-only model for HC emissions is:

$$\mathrm{HC} = 10^{\left[-3.306 + 0.0382\,(\text{engine.power})^{1/2}\right]}$$
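
       For readers who want to evaluate the three load-only models numerically, a minimal sketch
follows. The coefficients are those given in this section. The function name is hypothetical, the
output is assumed to be in g/s (the unit used for the constant rates in Table 13-1), and
engine.power is assumed to be non-negative and expressed in the units of the AATA data set.

```python
import math


def load_only_rates(engine_power):
    """Evaluate the load-only NOx, CO and HC models at one engine.power value.

    Assumes engine_power >= 0 in the data set's own units and that the
    back-transformed results are emission rates in g/s.
    """
    if engine_power < 0:
        raise ValueError("engine_power must be non-negative")
    # NOx was fitted on the 1/4-power scale, so the prediction is raised to 4.
    nox = (0.230 + 0.195 * math.log10(engine_power + 1)) ** 4
    # CO and HC were fitted on the log10 scale, so the prediction is 10**(...).
    co = 10 ** (-2.659 + 0.0899 * math.sqrt(engine_power))
    hc = 10 ** (-3.306 + 0.0382 * math.sqrt(engine_power))
    return {"NOx": nox, "CO": co, "HC": hc}


if __name__ == "__main__":
    for power in (0.0, 25.0, 100.0):   # arbitrary illustrative values
        print(power, load_only_rates(power))
```

       Note that simple back-transformation of a model fitted on a transformed scale does not
in general return the mean rate on the original scale, which may contribute to the non-zero MPE
values reported for the regression models.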

       The performance of the load-only models relative to the combined mode and load models
developed in Chapters 8 through 11 is presented in Table 12-5.

Table 12-5 Comparative Performance Evaluation Between Load-Based Only Emission Rate (ER)
Model and Load-Based Modal Emission Rate Model

                                      Coefficient of
                                      determination (R2)   Slope (β1)   RMSE       MPE
NOx
  Load-Only Emission Rate Model       0.715                1.181        0.06494    0.011382
  Mode/Load Emission Rate Models      0.665                1.102        0.07122    0.021463
CO
  Load-Only Emission Rate Model       0.246                2.071        0.07886    0.015568
  Mode/Load Emission Rate Models      0.490                1.749        0.06691    0.010285
HC
  Load-Only Emission Rate Model       0.0672               0.982        0.00197    0.000499
  Mode/Load Emission Rate Models      0.0677               1.213        0.00192    0.000223

       For NOx, both models perform well in explaining the variance of emission rates, reinforcing
the importance of including engine power as a variable in explaining the variance of NOx
emission rates. Results suggest that a mode/load modal emission modeling approach performs
slightly better than load-only emission rate models for CO. For HC, there is no discernible
difference in model performance.  Combining this finding with the performance results for HC
noted in Chapters  8 through 11, using constant emission rates for each operating mode could be
justified for this data set. When additional data are collected, researchers should compare mode-
only approaches to power-based approaches to ensure that power demand models for HC are
necessary.

-------
                   12.4 Separation of Acceleration and Cruise Modes

       In this research effort, separate models were developed for acceleration and cruise modes
(Chapters 10 and 11). However, it may be possible to combine acceleration and cruise mode
activity into a new "combined driving" mode. As noted in Chapter 10, although engine power
distribution for acceleration mode is different from cruise mode, these two modes share a similar
pattern. A quick analysis of the impact of combining acceleration and cruise mode is presented
in this section.
       After examining HTBR results, selecting the important explanatory variables, testing
different transformations for X and Y, and adding dummy variables according to HTBR results, the
final NOx emission model for combined driving mode is:

$$\mathrm{NO_x} = \left[0.113 + 0.0266\,(\text{engine.power})^{1/2}\right]^{2}$$

       The final CO emission model for combined driving mode is:

$$\mathrm{CO} = 10^{\left[-2.238 + 0.0043\,(\text{engine.power})\right]}$$

       while the final HC emission model for combined driving mode is:

$$\mathrm{HC} = \left[0.167 + 0.0028\,(\text{engine.power})^{1/2}\right]^{4}$$
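
       As a rough worked illustration of how a combined driving model can differ from a
mode-specific one, the combined-mode NOx model above can be evaluated next to the cruise-mode NOx
model listed in Table 13-1. The engine.power value of 100 is an arbitrary choice made here for
illustration, not a value singled out by the report, and the results are taken to be in g/s as in
Table 13-1.

```latex
\begin{align*}
\text{combined driving mode: } \mathrm{NO_x}
  &= \left[0.113 + 0.0266\,(100)^{1/2}\right]^{2} = (0.379)^{2} \approx 0.144 \\
\text{cruise mode: } \mathrm{NO_x}
  &= \left[0.0087 + 0.0311\,(100)^{1/2}\right]^{2} = (0.3197)^{2} \approx 0.102
\end{align*}
```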

       To test whether combining acceleration and cruise modes would benefit the load-based
modal emission model, the predictions from the linear regression function  for combined driving
mode are compared to the predictions from sub-models for acceleration and cruise mode in the
load-based modal emission model. Because the other elements are the same in the two models, they
are excluded from the test.  The results of the performance evaluation are shown in Table 12-6.

-------
Table 12-6 Comparative Performance Evaluation between Linear Regression with Combined
Mode and Linear Regression with Acceleration and Cruise Modes

                                      Coefficient of
                                      determination (R2)   Slope (β1)   RMSE       MPE
NOx
  Combined Driving Mode               0.531                0.921        0.08488    0.00840
  Acceleration & Cruise Mode          0.527                0.953        0.09312    0.03904
CO
  Combined Driving Mode               0.177                1.594        0.10395    0.02305
  Acceleration & Cruise Mode          0.452                1.775        0.08966    0.01873
HC
  Combined Driving Mode               0.0338               0.907        0.00204    0.00042
  Acceleration & Cruise Mode          0.0410               0.905        0.00203    0.00041

       Results shown in Table 12-6 suggest that separate linear regression functions for
acceleration and cruise modes perform significantly better than linear regression functions with
combined driving mode for CO. For NOx and HC, both models perform similarly with respect to
explaining the variance of emission rates.  In general, these results support introducing
acceleration and cruise mode into the conceptual model. However, as new data become available for
testing, researchers should examine whether it is reasonable to simply separate idle and
deceleration modes from other driving modes and then apply a simple power-based model to the
remaining combined driving activity for NOx.

              12.5 MOBILE6.2 vs. Load-Based Modal Emission Rate Model

       The final step undertaken in the model verification process was a comparison of predic-
tion results from MOBILE6.2 and the load-based modal emission rate model developed in this
research. Comparisons are based upon the Ann Arbor transit vehicle test data. These data were
used to develop the modal emission rates for this report, but were not used in developing the
MOBILE6.2 model. Normally, one would compare alternative model performance using an
independent set of data collected from similar vehicles, which is currently not available. Hence,
the comparisons that will be presented are far from unbiased.  When new data from an indepen-
dent test fleet become available, these comparisons should be performed again.
       To facilitate the emission rate prediction comparison, lookup tables for MOBILE6.2
transit bus emission rates on arterial roads were first created for average speeds from 2.5 mph to
65 mph.  The MOBILE6.2 calendar year was set to January 2002 since the data set was collected
during October 2001. The temperature was set to 75 °F, since the emission rates for transit buses
in MOBILE6.2 do not change with temperature.  Emissions predictions from MOBILE6.2 were
then obtained by combining lookup tables and corresponding speed values in the AATA data set.
The results of the performance evaluation are shown in Table 12-7.
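
       The lookup step described above can be sketched as follows. Several details are
assumptions rather than facts from the report: that the lookup table stores grams/mile rates by
average speed, that rates between tabulated speeds are linearly interpolated, that speeds outside
the table are clamped to its end points, and that a per-second rate is obtained as g/mile times
mph divided by 3600. How zero-speed (idle) seconds were handled is not stated in this section.
The table values in the example are placeholders, not MOBILE6.2 output.

```python
import bisect


def lookup_rate(speed_mph, table):
    """Interpolate a g/mile rate from a list of (speed_mph, g_per_mile) pairs.

    The table must be sorted by speed. Speeds outside the tabulated range
    are clamped to the nearest end point (an assumption of this sketch).
    """
    speeds = [s for s, _ in table]
    rates = [r for _, r in table]
    if speed_mph <= speeds[0]:
        return rates[0]
    if speed_mph >= speeds[-1]:
        return rates[-1]
    i = bisect.bisect_right(speeds, speed_mph)   # first tabulated speed above
    s0, s1 = speeds[i - 1], speeds[i]
    r0, r1 = rates[i - 1], rates[i]
    return r0 + (r1 - r0) * (speed_mph - s0) / (s1 - s0)


def grams_per_second(speed_mph, table):
    """Convert the interpolated g/mile rate to g/s at the given speed."""
    # g/mile * miles/hour / (3600 s/hour) = g/s
    return lookup_rate(speed_mph, table) * speed_mph / 3600.0


# Placeholder numbers for illustration only; NOT MOBILE6.2 output.
EXAMPLE_TABLE = [(2.5, 40.0), (10.0, 30.0), (30.0, 20.0), (65.0, 15.0)]

if __name__ == "__main__":
    for speed in (5.0, 20.0, 45.0):
        print(speed, grams_per_second(speed, EXAMPLE_TABLE))
```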
Table 12-7 Comparative Performance Evaluation between MOBILE 6.2 and Load-Based Modal ER Model

                                      Coefficient of
                                      determination (R2)   Slope (β1)   RMSE       MPE
NOx
  MOBILE 6.2                          0.172                0.706        0.10825    0.011217
  Load-Based Modal ER Model           0.665                1.102        0.07122    0.021463
CO
  MOBILE 6.2                          0.0195               1.690        0.08516    0.013399
  Load-Based Modal ER Model           0.491                1.749        0.06691    0.010285
HC
  MOBILE 6.2                          0.0408               0.584        0.00194    0.000173
  Load-Based Modal ER Model           0.0677               1.213        0.00192    0.000223

       Results suggest that the load-based modal emission rate model performs significantly better
than MOBILE6.2 for NOx and CO, and slightly better for HC. The performance of the load-based
modal emission rate model is not surprising because the same data used to develop the
model are used in the comparison. Results suggest that the load-based modal emission model
performs well in explaining the variance of NOx and CO emission rates on a microscopic
level.  The slight differences in RMSE and MPE indicate that both models (MOBILE6.2 and the
load-based modal emission model) perform well at the macroscopic level, and should perform
similarly when used in regional inventory development.

                                   12.6  Conclusions

       In general, the results provided here are encouraging for the load-based modal emission
model.  The comparison between engine power and surrogate power variables confirms
the important role of engine power in explaining the variability of emissions. The comparison
between the load-only emission rate model and the load-based modal emission rate model shows
that the impact of driving mode on emissions is significant for NOx and CO emissions while no
such trend is discernible for HC.  The comparison between acceleration and  cruise modes and
combined driving mode indicates that the relationships between engine power and emissions are
slightly different for acceleration and cruise modes.  Splitting the database into five modes (idle
mode, decelerating motoring mode, deceleration non-motoring mode, acceleration mode, and
cruise mode) appears warranted.

-------
       The data used to develop the load based modal emission model in this research are very
limited since the data set contained only 15 transit buses. Inter-bus variability is more obvious
for HC emissions since Bus 363 has the lowest HC emissions compared with the other 14 buses.
This kind of variability might influence the explanatory variables of the modal emission model
for HC emissions. When new data become available, these models should be re-derived to ob-
tain further improved performance in applications to the transit bus fleet.

-------
                                     CHAPTER 13
                                  13. CONCLUSIONS
       The goal of this research is to provide emission rate models that fill the gap between
existing models and ideal models for predicting emissions of NOx, CO, and HC from heavy-duty
diesel vehicles. The researchers at Georgia Institute of Technology have developed a beta
version of HDDV-MEM (Guensler et al. 2005), which is based upon vehicle technology groups,
engine emission characteristics, and vehicle modal activity.  The HDDV-MEM first predicts
second-by-second engine power demand as a function of onroad vehicle  operating conditions
and then applies brake-specific emission rates to these activity predictions. The HDDV-MEM
consists of three modules: a vehicle activity module (with vehicle activity tracked by a vehicle
technology group), an engine power module, and an emission rate module.
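
       The two-step logic of HDDV-MEM described above (predict power demand, then apply a
brake-specific emission rate) reduces to a unit conversion once the power trace is known. The
sketch below is illustrative only: the function names are hypothetical, and the units (brake
horsepower for power and g/bhp-hr for the brake-specific rate) are assumptions, as are the example
numbers. It is not the HDDV-MEM implementation.

```python
def emission_rate_g_per_s(power_bhp, rate_g_per_bhp_hr):
    """Convert a brake-specific rate (g/bhp-hr) to g/s at one power demand."""
    # bhp * g/(bhp*hr) = g/hr; divide by 3600 s/hr to obtain g/s.
    return power_bhp * rate_g_per_bhp_hr / 3600.0


def trip_emissions_g(power_trace_bhp, rate_g_per_bhp_hr):
    """Sum emissions over a second-by-second power trace (one value per second)."""
    return sum(emission_rate_g_per_s(p, rate_g_per_bhp_hr)
               for p in power_trace_bhp)


if __name__ == "__main__":
    # Made-up three-second trace and rate, for illustration only.
    print(trip_emissions_g([0.0, 90.0, 180.0], 4.0))
```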
       Using second-by-second data collected from onroad vehicles, the research effort reported
herein developed models to predict emission rates as a function of onroad operating conditions
that affect vehicle emissions.  Such models should be robust and ensure that assumptions about
the underlying distribution of the data are verified and that assumptions associated with appli-
cable  statistical methods are not violated. Due to the general lack of data available for develop-
ment of heavy-duty vehicle modal emission rate models, this study focuses on development of an
analytical methodology that is repeatable with different data sets collected across space and time.
The only acceptable second-by-second data set in which emission rate and applicable load and
vehicle activity data had been collected in parallel was the AATA bus emissions database
collected by Sensors, Inc., for use by the U.S. EPA.
       The models developed in this report are applicable to transit buses only, and are not ap-
plicable to all transit buses (see limitations discussion in Section 13.2). However, a significant
contribution of the research is in the development of the analytical framework established for
analysis of second-by-second emission rate data collected in parallel with engine load and other
onroad operating parameters, and in the development of applicable processes for developing sta-
tistical models using such data. To demonstrate the capability of the modeling framework, three
modal emission rate models have been developed for prediction of NOx, CO and HC emissions
from mid-1990s transit buses.
       The AATA transit bus data set was first post-processed through a quality control/quality
assurance process.  Data problems were identified and corrected during this stage of the research
effort.  The types of errors checked include: loss of data, erroneous ECM data, GPS dropouts,
and synchronization errors.  Data records in which any data element was missing were
removed to avoid biasing the results.  No erroneous ECM data were identified.  Six buses
experienced GPS dropouts and synchronization errors, and these problems were treated as described
in Chapter 4. Emission rate variability was also assessed across the sample of buses to identify
any potential high-emitters that may behave differently than other buses under normal operating
conditions and therefore warrant separate model development. However, no high-emitters were
identified. To find the true 'high-emitters', modelers need to include a representative sample
of buses to try to ensure that mean emissions and response rates to operating variables are rep-
resented in the data. Since there are only 15 buses in the data set, modelers could not exclude
buses that showed higher emission rates than the others.
       Model development  then proceeded through a structured series of steps.  Transformations
of emission rates (NOx, CO, and HC) were verified through a Box-Cox procedure to better satisfy
specific modeling assumptions, such as linearity or normality.  HTBR regression tree results
were used to identify the most important explanatory variables for emission rates.  OLS regres-
sion models were developed for transformed emission rates using chosen  explanatory variables.
Dummy variables were created to represent the cut points  identified in HTBR trees. Interaction
effects for identified explanatory variables were also tested to see whether they could improve
the model.  The models were comparatively evaluated and the most efficient models for each
pollutant were  selected.  By demonstrating statistical "robustness" and sufficiency  in previous
chapters, the main goal of this research, that of "developing new load-based models with  signifi-
cant improvement", was achieved.
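
       The Box-Cox step in the sequence above can be illustrated with a small grid search over
candidate power transformations. This is a simplified sketch, not the procedure run for the
report: it uses the textbook Box-Cox profile log-likelihood for a mean-only model (no explanatory
variables), a hypothetical candidate set that includes the 1/4, 1/2, and log transformations used
in this report, and requires strictly positive data.

```python
import math


def boxcox_loglik(y, lam):
    """Box-Cox profile log-likelihood for a positive sample, mean-only model."""
    n = len(y)
    if lam == 0:
        z = [math.log(v) for v in y]                 # lambda = 0 is the log transform
    else:
        z = [(v ** lam - 1.0) / lam for v in y]
    mean_z = sum(z) / n
    var_z = sum((v - mean_z) ** 2 for v in z) / n    # ML variance estimate
    log_jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(var_z) + log_jacobian


def best_lambda(y, candidates=(0.0, 0.25, 0.5, 1.0)):
    """Return the candidate power with the highest profile log-likelihood."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive data")
    return max(candidates, key=lambda lam: boxcox_loglik(y, lam))
```

       Calling best_lambda on a vector of positive emission rates returns the candidate power
whose transformed values look most nearly normal. In a regression setting the variance term would
be replaced by the residual variance of the fitted model.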
       This chapter will review the key accomplishments  of this research. The chapter provides
the final models selected for implementation and begins with a summary of the final models
developed for the transit buses, followed immediately by a discussion on the limitations of these
models. The chapter concludes with the lessons learned and recommendations on further re-
search.

-------
                         13.1  Transit Bus Emission Rate Models
       The goal of this research was to develop a methodology for creating load-based emis-
sion rate models designed to predict emission rates of NOx, CO, and HC from transit buses as a
function of onroad operating conditions. The models should be robust and ensure that statisti-
cal assumptions in model development are not violated. With limited available data, this study
developed a methodology that is repeatable with a different data set from across space and across
time.  The final estimated models are presented in Table 13-1.

Table 13-1 Load Based Modal Emission Models

Driving Mode                        Emission Rate Model
NOx
  Idle Mode                         0.033415 g/s
  Decelerating Motoring Mode        0.0097768 g/s
  Deceleration Non-Motoring Mode    0.045777 g/s
  Acceleration Mode                 $\mathrm{NO_x} = \left(-0.0195 + 0.201\,\log_{10}(\text{engine.power} + 1) + 0.0019\,\text{vehicle.speed}\right)^{2}$
  Cruise Mode                       $\mathrm{NO_x} = \left(0.0087 + 0.0311\,(\text{engine.power})^{1/2}\right)^{2}$
CO
  Idle Mode                         0.0059439 g/s
  Decelerating Motoring Mode        0.0052857 g/s
  Deceleration Non-Motoring Mode    0.0068557 g/s
  Acceleration Mode                 $\mathrm{CO} = 10^{\left(-3.747 + 1.341\,\log_{10}(\text{engine.power} + 1) - 0.0285\,\text{vehicle.speed}\right)}$
  Cruise Mode                       $\mathrm{CO} = 10^{\left(-2.223 + 0.0033\,\text{engine.power}\right)}$
HC
  Idle Mode                         0.00091777 g/s
  Decelerating Motoring Mode        0.001113 g/s
  Revised Deceleration Mode         0.001312 g/s
  Acceleration Mode                 $\mathrm{HC} = \left(0.114 + 0.0426\,\log_{10}(\text{engine.power} + 1)\right)^{4}$
  Cruise Mode                       $\mathrm{HC} = \left(0.170 + 0.0022\,(\text{engine.power})^{1/2}\right)^{4}$

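
       The models in Table 13-1 can be collected into a single lookup-and-evaluate routine, as in
the sketch below. The coefficients and constant rates are copied from the table. The mode keys,
function names, and the idea of summing one-second records into a trip total are illustrative
assumptions; engine.power and vehicle.speed are assumed to be in the units recorded in the AATA
data set, and the third deceleration row is treated as the same mode for all three pollutants.

```python
import math

# Constant rates (g/s) for modes without a regression model (Table 13-1).
CONSTANT_RATES = {
    "idle": {"NOx": 0.033415, "CO": 0.0059439, "HC": 0.00091777},
    "decel_motoring": {"NOx": 0.0097768, "CO": 0.0052857, "HC": 0.001113},
    "decel_non_motoring": {"NOx": 0.045777, "CO": 0.0068557, "HC": 0.001312},
}


def modal_emission_rates(mode, engine_power=0.0, vehicle_speed=0.0):
    """Return NOx, CO and HC rates (g/s) for one second of operation."""
    if mode in CONSTANT_RATES:
        return dict(CONSTANT_RATES[mode])
    if engine_power < 0:
        raise ValueError("engine_power must be non-negative")
    if mode == "acceleration":
        log_p = math.log10(engine_power + 1)
        return {
            "NOx": (-0.0195 + 0.201 * log_p + 0.0019 * vehicle_speed) ** 2,
            "CO": 10 ** (-3.747 + 1.341 * log_p - 0.0285 * vehicle_speed),
            "HC": (0.114 + 0.0426 * log_p) ** 4,
        }
    if mode == "cruise":
        root_p = math.sqrt(engine_power)
        return {
            "NOx": (0.0087 + 0.0311 * root_p) ** 2,
            "CO": 10 ** (-2.223 + 0.0033 * engine_power),
            "HC": (0.170 + 0.0022 * root_p) ** 4,
        }
    raise ValueError(f"unknown driving mode: {mode!r}")


def total_grams(trace):
    """Sum emissions over one-second (mode, engine_power, vehicle_speed) records."""
    totals = {"NOx": 0.0, "CO": 0.0, "HC": 0.0}
    for mode, power, speed in trace:
        rates = modal_emission_rates(mode, power, speed)
        for pollutant in totals:
            totals[pollutant] += rates[pollutant]   # g/s over a 1-second record
    return totals


if __name__ == "__main__":
    # Made-up four-second trace for illustration only.
    trace = [("idle", 0.0, 0.0), ("acceleration", 80.0, 10.0),
             ("cruise", 60.0, 30.0), ("decel_motoring", 0.0, 20.0)]
    print(total_grams(trace))
```

       Classifying each second into a driving mode is a separate step defined in earlier chapters
and is not reproduced here.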
       The transformations employed for the three pollutants in acceleration and cruise modes
are different.  The predictive capabilities of the models for the three pollutants also differ.
The R2 value is high for NOx and CO emission rates, but very low for HC emission rates.
HC models are not much better than simply using HTBR mean ERs.  The relatively poor perfor-
mance of the HC models is not an inherent limitation of the modal modeling approach. Instead,
it is a result of the lack of availability of a suitable explanatory variable for model development
purposes. Although the model with dummy variables and interactions works better, the final
model is not necessarily the best fit, but is one that can be readily implemented.
       The three models include all of those significant variables identified as affecting gram/
second emissions rates, with the exception of those variables that are highly correlated with indi-
vidual bus ID. Although a few of the vehicles behaved differently from other vehicles, modelers
could not reasonably include bus ID as a variable, nor environmental parameters of testing since
all low barometric pressure tests were conducted on one or two vehicles. Additional explora-
tion of environmental conditions should be conducted by collecting data for a larger fleet under a
wider variety of environmental conditions over a longer time.
       The new modal emission rate  models all indicate that engine power has a significant im-
pact on the acceleration and cruise emission rates. This observation strengthens the importance
of using load based emission data to develop new emission models and simulate engine power
in real world applications.  All three models were shown to be robust by use of several statistical
measures. Although some departures from accepted norms were noted, these departures were
judged not so serious as to compromise the usefulness of the models.  Hence, no remedial mea-
sures were taken.

                                 13.2 Model Limitations

       There are several limitations in the models estimated and presented in this work.  Theo-
retically, the models cannot be used to forecast emissions beyond the domain of variables used
in estimating the models.  These models were developed from 15 buses equipped with the same fuel
injection type, catalytic converter type, transmission type, and so on, so the models could not
consider the effect of variation in vehicle technologies on emissions. Nor can the models
account for the effect of emission control technology deterioration on emission levels, since
all buses were only 5 or 6 years old at the time testing was conducted.  Although the
speed/acceleration profiles of the AATA data set and the Atlanta buses are similar, there is no way
to estimate the effect of changes in vehicle technologies and deterioration on emissions in the
current and future fleet in Atlanta. Such a limitation introduces obvious uncertainties in the use
of the model to make predictions for other fleets.
       The predictive models are derived from a research effort conducted by  other parties.
Modeling at this time cannot control for those variables for which data were not collected. This
inability to control the variables may yield several uncertainties in the models. First, important
or useful variables that affect emission rates may not have been observed at all, so it
may be difficult to derive a model with sufficient explanatory power, or the variables that are
selected may simply be correlated with the true causal variables that affect instantaneous emission
rates.  Second, the interpretation of the effects of individual variables might be limited.
For example, the ability of negative load to explain the variability in emissions is limited
because negative loads were recorded as zero.
       An additional limitation imposed by the data is the uncertainty introduced by the actual
data collection process. The uncertainty in the GPS position will introduce significant instan-
taneous error in grade computation (grade should be collected by means other than GPS). Al-
though filter limits were imposed on the rate of change of engine speed  (RPM), fuel flow, and
vehicle speed data, data could yield unreasonable instantaneous vehicle acceleration or decelera-
tion rates, and still be within reasonable absolute limits. This uncertainty may bias predictions.
       The possible presence of outliers has the potential to cause a misleading fit by dispropor-
tionately pulling the fitted regression line away  from the majority of the data points (Neter et al.
1996). Cook's distance plots indicated that some points do have influence over the regression
fit. However, none of these points is indicative of obvious errors in the data. It is difficult to
determine whether those extreme values were actually outliers or not. Because the data passed
through EPA's rigorous QA/QC procedures and no "true" outliers were identified, these high-emission
events are assumed to be representative of events that occur in the real world.  Therefore, all of
these data were retained in model development. When additional data become available, researchers
should make it a priority to examine these high  emissions events to identify the underlying causal
factors.

                                  13.3 Lessons Learned

       Because driving mode definitions varied across previous research efforts, findings from
these efforts are not directly comparable. This study independently developed driving mode defi-
nitions through comparison across critical values.  Suitable modal activity definition can divide
the data into several homogeneous groups according to emission rates and driving conditions.
Unlike previous research  efforts which only present pairwise comparisons of modal average es-
timates or HTBR regression tree analyses, this study compared distributions of engine operating
characteristics under proposed vehicle mode definitions by defining applicable vehicle modes.
       A representative data set is the most critical issue for development of the final version of
the proposed model. This issue plays an important role no matter which modeling approach is
employed. The representative data set should reflect the real world with respect to vehicle emis-
sions and activity  patterns. The data set used for the proposed model consists of EPA AATA data
and includes 15 buses. At the time this research was conducted, the AATA data were the only
applicable data set that contained all required data (second-by-second emission rates, engine load,
and applicable operating variables) collected in parallel. New data sets will improve model
performance in the future.
       A combination of tree and OLS regression methods was used to estimate NOx, CO and
HC emission models from EPA's transit bus database tested by Sensors, Inc.  The HTBR
technique was used as a tool to reveal underlying data structure and identify useful explanatory
variables and was demonstrated as a powerful tool that will allow researchers to deal with large
multivariate data sets with mixed mode (discrete and continuous) variables.

                                   13.4 Contributions

       This research verifies that vehicle emission rates are highly correlated with modal ve-
hicle activity. Furthermore, the relationship between engine power and emissions is also sig-
nificant and is quantified for the available data. Research results indicate that engine power is
more powerful than surrogate variables in predicting second-by-second grams/second emission
rates.  Hence, to improve our understanding of emission rates, it is  important to examine not only
vehicle operating modes, but also engine power distributions. Based upon the important role
of engine power in explaining the variability of emissions, it is critical to include the load data
measurement (and collection of all onroad operating parameters to estimate load, such as grade)
during the emission data collection procedure.
       Another major contribution of the work is the establishment of a framework for emission
rate model development suitable for predicting emissions at the microscopic level.  As more databases
become available,  the model development steps can be re-run to develop a more robust load-based
modal emission model based on the same philosophy. This living modeling framework provides
the ability to integrate necessary vehicle activity data and emission rate algorithms to support
second-by-second and link-based emissions prediction. Combined with a GIS  framework, models
derived through this methodology will improve spatial/temporal emissions modeling.

                        13.5 Recommendation for Further Studies

       The methodology developed and applied in this research can, and should, be used to
estimate similar models for the on-road fleet consisting of transit buses and heavy-duty vehicles.
Since emissions of these vehicles are heavily dependent on vehicle dynamics (that is, load and
power), a successful validation will provide further evidence of the "correctness" of the method
employed here. When new data become available and these models are re-derived, modelers
can expect further improved performance in applications to the transit bus fleet and eventually to
other heavy-duty vehicle fleets.
       Given the important role of engine power in explaining the variability of emissions, en-
gine load data should be measured during the emission data collection procedure and all param-
eters necessary to estimate onroad load (such as grade and vehicle payload) should be included in
the data collection efforts.  Similarly, simulation of engine power demand for onroad operations
becomes important in the implementation of emission inventory modeling for heavy-duty transit
buses.  Refinement of roadway characteristic data (grade, etc.) for urban areas is paramount and
research efforts that can quantify drive train inertial losses under various operating conditions
will help enhance modal model development.
       Because all buses tested were of the same model with the same engine, the test data were
valuable from the perspective of controlling potential explanatory variables related to vehicle
characteristics. However, this homogeneity also limits the ability to explain the effects of
vehicle technology groups and deterioration of emission control technologies on emissions.
Expanded data collection efforts should focus on identification of appropriate vehicle technology
groups and high-emitting vehicle groups. In these test programs, it will also be important to test
buses under real-world operating conditions (on a variety of routes, road types and grades,
environmental conditions, passenger loadings, etc.). These high-resolution data collection efforts
will provide the data needed by modelers to develop new and enhanced modal emission rate models
for a variety of heavy-duty vehicle classes.
                                  14.  REFERENCES
Ahanotu, D. (1999). Heavy-Duty Vehicle Weight and Horsepower Distributions: Measurement
       of Class-Specific Temporal and Spatial Variability. School of Civil and Environmental
       Engineering. Atlanta, GA, Georgia Institute of Technology. Ph.D. dissertation.

AMS. (2005). "A look at U.S. air pollution laws and their amendments."  Retrieved July 30,
       2005, from http://www.ametsoc.org/sloan/cleanair/cleanairlegisl.html

Avol, et al. (2001). "Respiratory effects of relocating to areas of differing air pollution levels."
       Am. J. Respir. Crit. Care. Med. 164: 2067-2072.

Bachman, W. (1998). A GIS-Based Modal Model of Automobile Exhaust Emissions Final Re-
       port. Atlanta, GA, Prepared by Georgia Institute of Technology for U.S. Environmental
       Protection Agency. EPA-600/R-98-097.

Bachman, W., W. Sarasua, et al. (2000). "Modeling Regional Mobile Source Emissions in a GIS
       Framework." Transportation Research C 8(1-6): 205-229.

Barth, M., F. An, et al.  (1996). "Modal Emission Modeling: A Physical Approach." Transporta-
       tion Research Record 1520: 81-88.

Barth, M., F. An, et al.  (2000). "Comprehensive Modal Emissions Model (CMEM), Version 2.0
       User's Guide."  http://pah.cert.ucr.edu/cmem/cmem_users_guide.pdf. January 2000.

Barth, M., G. Scora, et al. (2004). A Modal Emission Model for Heavy Duty Diesel Vehicles.
       Proceedings of the 83rd Transportation Research Board Annual Meeting Proceedings
       (CD-ROM), Washington, DC.

Barth, M. and J. Norbeck (1997). NCHRP Project 25-11: The Development of a Comprehensive
       Modal Emission Model. Proceedings of the 7th CRC On-Road Vehicle Emissions Work-
       shop, Coordinating Research Council, Atlanta, GA.

Breiman, L., J. Friedman, et al. (1984). Classification and Regression Trees. Wadsworth
       International Group, Belmont, CA.
Brown, J. Edward, et al. (2001). "Heavy Duty Diesel Fine Particulate Matter Emissions:
      Development and Application of On-Road Measurement Capabilities." Research Triangle Park,
      NC, Prepared by ARCADIS Geraghty & Miller, Inc. for U.S. Environmental Protection
      Agency. EPA-600/R-01-079.

Browning, L. (1998). Update of Heavy-Duty Engine Emission Conversion Factors — Analysis
      of Fuel  Economy, Non-Engine Fuel Economy Improvements and Fuel Densities, U.S.
      Environmental Protection Agency.

CARB (1991).  Modal Acceleration Testing. Mailout No. 91-12; Mobile Source Division; El
      Monte,  CA.

CARB (2002).  "Heavy-Duty Diesels Compression Ignition Engine Emissions and Testing." Cali-
      fornia Air Resources Board Emissions Inventory Series 1(10).

CARB (2004). "California's Air Quality History Key Events." California Air Resources Board.
      Retrieved July 2, 2004, from http://www.arb.ca.gov/html/brochure/history.htm

CARB (2007). "EMFAC." California Air Resources Board. Retrieved July 20, 2007, from
      http://www.arb.ca.gov/msei/onroad/latest_version.htm

Carlock, M. A.  (1994). An Analysis of High Emitting Vehicles in the On-road Vehicle Fleet. Pro-
      ceedings of the 87th Air and Waste Management Association Annual Meeting Proceeding
      Pittsburgh, PA.

CEDF (2002). "Nitrogen Oxides: How NOx Emissions Affect Human Health and the Environ-
      ment." Environmental Defense.

CFR (2007a). Calculations: exhaust emissions (40CFR86.1342-90). Code of Federal Regula-
      tions. National Archives and Records Administration.

CFR (2007b). Urban Dynamometer Schedules (40CFR86. Appendix I). Code of Federal Regula-
      tions. National Archives and Records Administration.

CFR (2004a). National Primary and Secondary Ambient Air Quality Standards (40CFR50). Code
      of Federal Regulations. National Archives and Records Administration.

CFR (2004b). Gross Vehicle Weight Rating (40CFR86.1803). Code of Federal Regulations. Na-
      tional Archives and Records Administration.

CFR (2004c). Useful Life (40CFR86.1805). Code of Federal Regulations. National Archives and
      Records Administration.
Chakravarti, Laha, and Roy (1967). Handbook of Methods of Applied Statistics, Volume I, John Wiley.

Clark, N. N., J. M. Kern, et al. (2002). "Factors Affecting Heavy-Duty Diesel Vehicle Emis-
       sions." Journal of the Air & Waste Management Association 52:  84-94.

Clark, N. N., A. S. Khan, et al. (2005). Idle Emissions from Heavy-Duty Diesel Vehicles, Center
       for Alternative Fuels, Engines, and Emissions (CAFEE), Department of Mechanical and
       Aerospace Engineering, West Virginia University (WVU).

Conover, W. J. (1980). Practical Non-parametric Statistics, John Wiley and Sons; New York, NY.

Copt, S. and S. Heritier (2006). Robust MM-Estimation and Inference in Mixed Linear Models.
       Department of Econometrics, Working Papers, University of Sydney.

Davis, W., K. Wark, et al., Eds. (1998). Air Pollution Its Origin and Control. 3rd Edition, 2003
       Special Studies. Addison Wesley Longman, Inc. Menlo Park, California.

Denis, M. J. S., P. Cicero-Fernandez, et al. (1994). "Effects of In-Use Driving Conditions and Ve-
       hicle/Engine Operating Parameters on "Off-Cycle" Events: Comparison with Federal Test
       Procedure Conditions." Journal of the Air & Waste Management Association 44(1): 31-38.

DieselNet. (2006). "Heavy-Duty FTP Transient Cycle." Retrieved December 20, 2006,  from
       http://www.dieselnet.com/standards/cycles/ftp_trans.html

Dreher, D. and R. Harley (1998). "A Fuel-Based Inventory for Heavy-Duty Diesel Truck Emis-
       sions." Journal of the Air & Waste Management Association 48:  352-358.

Easton, V. J. and J. H. McColl. (2005). "Statistics Glossary." Retrieved March 28, 2005, from
       http://www.stats.gla.ac.uk/steps/glossary/index.html.

Ensfield, C. (2002). On-Road Emissions Testing of 18 Tier 1 Passenger Cars and 17 Diesel Powered
       Public Transport Buses. Saline, Michigan, Sensors, Inc.

FCAP (2004). "Ambient Air Quality Trends: An Analysis of Data Collected by the U.S.
       Environmental Protection Agency." Foundation for Clean Air Progress.

Feng, C., S. Yoon, et al. (2005). Data Needs for a Proposed Modal Heavy-Duty Diesel Vehicle
       Emission Model. Proceedings of the 98th Air and Waste Management Association Annual
       Meeting Proceeding  (CD-ROM), Pittsburgh, PA.

Fomunung, I., S. Washington, et al. (1999). "A Statistical Model for Estimating Oxides of Nitrogen
       Emissions from Light-Duty Motor Vehicles." Transportation Research D 4(5): 333-352.
Fomunung, I., S. Washington, et al. (2000). "Validation of the MEASURE Automobile
       Emissions Model: A Statistical Analysis." Journal of Transportation Statistics 3(2): 65-84.

Fomunung, I. W. (2000). Predicting emissions rates for the Atlanta on-road light duty vehicular
       fleet as a function of operating modes, control technologies, and engine characteristics.
       Civil and Environmental Engineering. Atlanta, Georgia Institute of Technology. Ph.D.
       dissertation.

Frey, H. C., A. Unal, et al. (2002). Recommended Strategy for On-Board Emission Data Analysis
       and Collection for the New Generation Model. Raleigh, NC, Prepared by Computational
       Laboratory for Energy, Air, and Risk, Department of Civil Engineering, North Carolina
       State University, Prepared for Office of Transportation and Air Quality, U.S. Environmen-
       tal Protection Agency, http://www.epa.gov/otaq/models/ngm/ncsu.pdf.

Frey, H. C. and J. Zheng (2001). Methods and Example Case  Study for Analysis of Variability
       and Uncertainty in Emissions Estimation (AUVEE). Research Triangle Park, NC, Pre-
       pared by North Carolina State University for Office of Air Quality Planning and Stan-
       dards, U.S. Environmental Protection Agency.

Gajendran, P. and N. N. Clark (2003). "Effect of Truck Operating Weight on Heavy-Duty Diesel
       Emissions." Environmental Science and Technology 37: 4309-4317.

Gauderman, et al. (2002). "Association between air pollution and lung function growth in Southern
       California children: Results from a second cohort." Am J Resp Crit Care Med 166(1): 74-84.

Gautam, M. and N. Clark (2003). Heavy-Duty Vehicle Chassis Dynamometer Testing for Emis-
       sions Inventory, Air Quality Modeling, Source Apportionment and Air Toxics Emission
       Inventory; Phase  I Report. Coordinating Research Council, Project No. E-55/E-59.

Gillespie, T. (1992). Fundamentals of Vehicle Dynamics. Warrendale, PA, Society of Automotive
       Engineers, Inc.

Granell, J. L., R. Guensler, et al. (2002). Using Locality-Specific Fleet Distributions in Emissions
       Inventories: Current Practice, Problems, and Alternatives. Proceedings of the 81st Trans-
       portation Research Board Annual Meeting (CD-ROM), Washington, DC.

Grant, C., R. Guensler, et al. (1996). Variability of Heavy-Duty Vehicle  Operating Mode Fre-
       quencies for Prediction of Mobile Emissions. Proceedings of the 89th Air and Waste
       Management Association Annual Meeting Proceeding (CD-ROM), Pittsburgh,  PA.
Guensler, R. (1993). "Data Needs for Evolving Motor Vehicle Emission Modeling Approaches."
       In: Transportation Planning and Air Quality II, Paul Benson, Ed.; American Society of
       Civil Engineers: New York, NY; 1993.

Guensler, R., S. Yoon, et al. (2005). Heavy-Duty Diesel Vehicle Modal Emissions Modeling
       Framework. Regional Applied Research Effort (RARE) Project. Presented to U.S. Envi-
       ronmental Protection Agency, Georgia Institute of Technology.

Guensler, R., S. Yoon, et al. (2006). Heavy-Duty Diesel Vehicle Modal Emissions Modeling
       Framework. Regional Applied Research Effort (RARE) Project. Presented to U.S. Envi-
       ronmental Protection Agency, Georgia Institute of Technology.

Guensler, R., D. Sperling, et al. (1991). Uncertainty in the Emission Inventory for Heavy-Duty
       Diesel-Powered Trucks. Proceedings of the 84th Air and Waste Management Association
       Annual Meeting Proceedings (CD-ROM), Pittsburgh, PA.

Guensler, R., S. Washington, et al. (1998). "Overview of MEASURE Modeling Framework."
       Proc. Conf. Transport Plan Air Quality A: 51-70.

Hallmark, S. L. (1999). Analysis and Prediction of Individual Vehicle Activity for Microscopic
       Traffic Modeling. Civil and Environmental Engineering. Atlanta, Georgia Institute of
       Technology. Ph.D. dissertation.

Heywood, J. B. (1998). Internal Combustion Engine Fundamentals. New York, The McGraw Hill
       Publishing Company.

HowStuffWorks (2005).   Retrieved December 30, 2005, from http://www.howstuffworks.com

Jimenez-Palacios, J. (1999). Understanding and Quantifying Motor Vehicle Emissions with
       Vehicle Specific Power and TILDAS Remote Sensing. Cambridge, MA, Massachusetts
       Institute of Technology. Ph.D. dissertation.

Kelly, N. A. and P. J. Groblicki (1993). "Real-World Emissions from a Modern Production
       Vehicle Driven in Los Angeles." Journal of the Air & Waste Management Association
       43(10): 1351-1357.

Kittelson, D. B., D. F. Dolan, et al.  (1978). "Diesel Exhaust Particle Size Distribution - Fuels and
       Additive Effects." SAE Paper No. 780787.

Koehler, K. J. and K. Larntz (1980). "An empirical investigation of goodness-of-fit statistics for
       sparse multinomials." Journal of the American Statistical Association 75: 336-344.
Koupal, J., M. Cumberworth, et al. (2002). Draft Design and Implementation Plan for EPA's
       Multi-Scale Motor Vehicle and Equipment Emissions System (MOVES), U.S. Environmental
       Protection Agency. EPA-420/P-02-006.

Koupal, J., E. K. Nam, et al. (2004). The MOVES Approach to Modal Emission Modeling.
       Proceedings of the 14th CRC On-Road Vehicle Emissions Workshop, Coordinating Research
       Council, San Diego, CA.

Li, L. (2004). Calculating the Confidence Intervals Using Bootstrap, Department of Statistics,
       University of Toronto, Presented for a project of Ontario Power Generation on October
       28, 2004.

Lindhjem, C. and T. Jackson (1999). Update of Heavy-Duty Emission Levels (Model Years
       1998-2004+) for Use in MOBILE6, U.S. Environmental Protection Agency.

Lloyd, A. C. and T. A. Cackette  (2001). "Diesel Engines: Environmental Impact and Control."
       Journal of the Air & Waste Management Association 51: 809-847.

MOBILE6. (2007) Access http://www.epa.gov/otaq/m6.htm

Nam, E. K. (2003). Proof of Concept Investigation for the Physical Emission Rate Estimator
       (PERE) to be Used in MOVES, Ford Research and Advanced Engineering.

Neter, J., M. H. Kutner, et al. (1996). Applied Linear Statistical Models, McGraw-Hill: Chicago IL.

Newton, K., W. Steeds, et al. (1996). The Motor Vehicle. Warrendale, PA, Society of Automotive
       Engineers, Inc.

NRC (2000). Modeling Mobile-Source Emissions. Washington, D.C., National Academy Press,
       National Research Council.

Peters, J. M., et al. (1999). "A study of twelve Southern California communities with differing
       levels and types of air pollution. II. Effects on pulmonary function." Am. J. Respir. Crit.
       Care Med. 159: 768-775.

Prucz, J. C., N. N. Clark, et al. (2001). "Exhaust Emissions from Engines of the Detroit Diesel
       Corporation in Transit Buses: A Decade of Trends." Environmental Science and Technology
       35: 1755-1764.

Ramamurthy, R. and N. Clark (1999). "Atmospheric Emissions Inventory Data for Heavy-Duty
       Vehicles." Environmental Science and Technology 33: 55-62.
Ramamurthy, R., N. N. Clark, et al. (1998). "Models for Predicting Transient Heavy Duty Ve-
       hicle Emissions." Society of Automotive Engineers SAE 982652.

Roess, R. P., E. S. Prassas, et al. (2004). Traffic Engineering, Pearson Education, Inc.

SCAQMD (2000). Multiple Air Toxics Exposure Study (MATES-II), South Coast Air Quality
       Management District Governing Board.

Schlappi, M. G., R. G. Marshall, et al. (1993). "Truck travel in the San Francisco Bay Area."
       Transportation Research Record 1383: 85-94.

Siegel, S. and N. Castellan (1988). Non-parametric Statistics for the Behavioural Sciences, 2nd
       Edition. McGraw-Hill. January 1988.

Singer, B. C. and R. A. Harley (1996). "A fuel-based motor vehicle emission inventory." Journal
       of the Air & Waste Management Association 46: 581-593.

StatsDirect. (2005). "Statistical Help." http://www.statsdirect.com/ Retrieved May 30, 2005.

TRB (1995). Expanding Metropolitan Highways: Implications for Air Quality and Energy Use.
       Washington, DC, Transportation Research Board, National Academy Press.

U.S. EPA (1993). User's Guide to MOBILE5a. http://www.epa.gov/otaq/models/mobile5/mob5ug.pdf

U.S. EPA (1995). National Air Quality and Emission Trends Report  1995, Office of Air Quality
       Planning and Standards, U.S. Environmental Protection Agency, http://www.epa.gov/air/
       airtrends/aqtrnd95/report/

U.S. EPA (1997). Emissions Standards Reference Guide for Heavy-Duty and Nonroad Engines.
       http://www.epa.gov/otaq/cert/hd-cert/stds-eng.pdf. EPA-420-F-97-014.

U.S. EPA (1998). Update of Fleet  Characterization Data for Use in MOBILE6 - Final Report,
       U.S. Environmental Protection Agency.  EPA-420/P-98-016.

U.S. EPA (2001a). EPA's New Generation Mobile Source Emissions Model: Initial Proposal and
       Issues, U.S. Environmental Protection Agency. EPA-420/R-01-007.

U.S. EPA (2001b). Update of Heavy-duty Emission Levels (Model Years 1988-2004) for Use in
       MOBILE6, U.S. Environmental Protection Agency. EPA-420/R-99-010.
U.S. EPA (2001c). Heavy Duty Diesel Fine Particulate Matter Emissions: Development And
       Application Of On-Road Measurement Capabilities. Research Triangle Park, NC. Prepared
       by National Risk Management Research Laboratory for Office of Air Quality Planning
       and Standards, U.S. Environmental Protection Agency. EPA-600/R-01-079.

U.S. EPA (2002a). "MOBILE6 Vehicle Emission Modeling Software." Retrieved July 20, 2007,
       from http://www.epa.gov/otaq/m6.htm.

U.S. EPA (2002b). Update Heavy-Duty Engine Emission Conversion Factors for MOBILE6,
       Analysis of Fuel Economy, Non-Engine Fuel Economy Improvements and Fuel Densi-
       ties. EPA-420/P-98-014.

U.S. EPA (2002c). Methodology for Developing Modal Emission Rates for EPA's Multi-Scale
       Motor Vehicle and Equipment Emission System. Raleigh, NC, Prepared by North Caro-
       lina State University for Office of Transportation and Air Quality, U.S. Environmental
       Protection Agency. EPA-420/R-01-027.

U.S. EPA (2002d). Update Heavy-duty Engine Emission Conversion Factors for MOBILE6:
       Analysis of BSFCs and Calculation of Heavy-duty Engine Emission Conversion Factors.
       EPA-420/P-98-015.

U.S. EPA (2003). National Air Quality and Emissions Trends Report, 2003 Special Studies Edi-
       tion. Research Triangle Park, NC, Office of Air Quality and Standards, U.S. Environmen-
       tal Protection Agency. EPA-454/R-03-005.

U.S. EPA (2004c). Technical Guidance on the Use of MOBILE6 for Emissions Inventory
       Preparation. Publication No. EPA420-R-04-013. U.S. Environmental Protection Agency.

U.S. EPA (2005). "Fine Particle (PM2.5) Designations." Retrieved October 20, 2005, from http://
       www.epa.gov/pmdesignations/index.htm.

U.S. EPA (2006). "National Ambient Air Quality Standards (NAAQS)." Retrieved October 30,
       2006, from http://www.epa.gov/air/criteria.html

Washington, S. (1994). Estimation of a vehicular carbon monoxide modal emissions model and
       assessment of an intelligent transportation technology, University of California at Davis.
       Ph.D. dissertation.

Washington, S., J. Leonard, et al. (1997b). Forecasting Vehicle Modes of Operation needed as
       Input to 'Modal' Emissions Models. Proceedings of the 4th International Scientific
       Symposium on Transport and Air Pollution, Lyon, France.

Washington, S., L. F. Mannering, et al. (2003). Statistical and Econometric Methods for
       Transportation Data Analysis, CRC Press LLC.

Washington, S., J. Wolf, et al. (1997a). "Binary Recursive Partitioning Method for Modeling Hot-
       Stabilized Emissions from Motor Vehicles." Transportation Research Record 1587: 96-105.

Whitley, E. and J.  Ball (2002). "Statistics review 6: Nonparametric methods." Critical Care 6:
       509-513.

Wolf-Heinrich, H. (1998). Aerodynamics of Road Vehicles, Society of Automotive Engineers,
       Inc., USA.

Wolf, J., R. Guensler, et al. (1998). "High Emitting Vehicle Characterization Using Regression
       Tree Analysis." Transportation Research Record 1641: 58-65.

Yoon, S. (2005c). A New Heavy-Duty Vehicle Visual Classification and Activity Estimation
       Method For Regional Mobile Source Emissions Modeling. School of Civil and Environ-
       mental Engineering. Atlanta, Georgia Institute of Technology. Ph.D. dissertation.

Yoon, S., H. Li, et al. (2005a). Transit Bus Engine Power Simulation: Comparison of Speed-
       Acceleration-Road Grade Matrices to Second-by-Second Speed, Acceleration, and Road
       Grade Data. Proceedings of the 98th Air and Waste Management Association Annual
       Meeting Proceeding (CD-ROM), Pittsburgh, PA.

Yoon, S., H. Li, et al. (2005b). A Methodology for Developing Transit Bus Speed-Acceleration
       Matrices to be used in Load-Based Mobile  Source Emissions Models. Proceedings of the
       84th Transportation Research Board Annual Meeting Proceedings (CD-ROM), Washing-
       ton, DC.

Yoon, S., M. Rodgers, et al. (2004b). "Engine and Vehicle Characteristics of Heavy-Duty Diesel
       Vehicles in the Development of Emissions Inventories: Model Year, Engine Horsepower
       and Vehicle Weight." Transportation Research Record 1880: 99-107.

Yoon, S., P. Zhang, et al. (2004a). A Heavy-Duty Vehicle Visual Classification Scheme:
       Heavy-Duty Vehicle Reclassification Method for Mobile Source Emissions Inventory
       Development. Proceedings of the 97th Air and Waste Management Association Annual Meeting
       Proceeding (CD-ROM), Pittsburgh, PA.
Younglove, T., G. Scora, et al. (2005). Designing On-road Vehicle Test Programs for Effective
       Vehicle Emission Model Development. Proceedings of the 84th Transportation Research
       Board Annual Meeting Proceedings (CD-ROM), Washington, DC.

Zeldovich, Y. B., P. Y. Sadonikov, et al. (1947). "The oxidation of nitrogen in combustion and
       explosions." Acta Physicochimica USSR 21(4): 577-628.