United States
Environmental Protection
Agency
EPA/600/R-07/106
July 2007
Transit Bus Load-Based
Modal Emission Rate Model
Development
-------
TRANSIT BUS LOAD-BASED
MODAL EMISSION RATE MODEL
DEVELOPMENT
by
Chunxia Feng
Randall Guensler
Michael Rodgers
School of Civil and Environmental Engineering
Georgia Institute of Technology
Atlanta, GA
Contract No: EP-05C-000033
EPA Project Officer: Sue Kimbrough
U.S. Environmental Protection Agency
National Risk Management Research Laboratory
Air Pollution Prevention and Control Division
Research Triangle Park, NC 27711
U.S. Environmental Protection Agency
Office of Research and Development
Washington, DC 20460
-------
ABSTRACT
Heavy-duty diesel vehicle (HDDV) operations are a major source of oxides of nitrogen
(NOx) and particulate matter (PM) emissions in metropolitan areas nationwide. Although HDDVs
constitute a small portion of the onroad fleet, they typically contribute more than 45% of
NOx and 75% of PM onroad mobile source emissions (U.S. EPA 2003). HDDV emissions are also a
large source of global greenhouse gas and toxic air contaminant emissions. Over the last several
decades, both government and private industry have made extensive efforts to regulate and control
mobile source emissions. The relative importance of emissions from HDDVs has increased
significantly because today's gasoline-powered vehicles are more than 95% cleaner than vehicles
in 1968.
In current regional and microscale modeling conducted in every state except California,
HDDV emission rates are taken from the U.S. Environmental Protection Agency's (EPA's)
MOBILE 6.2 model (U.S. EPA 2001a). The U.S. EPA
is currently developing a new set of modeling tools for the estimation of emissions produced by
onroad and off-road mobile sources. The new Multi-scale mOtor Vehicle & equipment Emission
System, known as MOVES (U.S. EPA 2001a), is a modeling system designed to better predict
emissions from onroad operations.
The major effort of this research is to develop a new heavy-duty vehicle load-based modal
emission rate model that overcomes some of the limitations of existing models and emission
rate prediction methods. This model is part of the proposed Heavy-Duty Diesel Vehicle Modal
Emission Model (HDDV-MEM), which was developed by the Georgia Institute of Technology
(Guensler, et al. 2006). HDDV-MEM differs from other proposed HDDV modal models (Barth,
et al. 2004; Frey, et al. 2002; Nam 2003) in that the modeling framework first predicts
second-by-second engine power demand as a function of vehicle operating conditions and then applies
brake-specific emission rates to these activity predictions.
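The two-step structure described above (predict engine power second by second, then apply a brake-specific emission rate) can be illustrated with a minimal sketch. It is not the report's engine power module. It assumes a generic textbook road-load equation, and every vehicle parameter, the accessory load, and the emission rate (expressed here in g/kWh rather than g/bhp-hr) are hypothetical placeholder values chosen only for illustration.

```python
from dataclasses import dataclass

G = 9.81        # gravitational acceleration, m/s^2
RHO_AIR = 1.2   # assumed air density, kg/m^3


@dataclass
class Bus:
    # All values below are hypothetical placeholders, not data from the report.
    mass_kg: float = 14000.0
    c_rr: float = 0.008            # rolling resistance coefficient
    c_d: float = 0.7               # aerodynamic drag coefficient
    frontal_area_m2: float = 7.5
    drivetrain_eff: float = 0.85
    accessory_kw: float = 5.0      # constant accessory load


def engine_power_kw(bus, speed_mps, accel_mps2, grade=0.0):
    """Step 1: estimate engine power demand for one second of operation."""
    rolling = bus.c_rr * bus.mass_kg * G
    aero = 0.5 * RHO_AIR * bus.c_d * bus.frontal_area_m2 * speed_mps ** 2
    grade_force = bus.mass_kg * G * grade
    inertia = bus.mass_kg * accel_mps2
    tractive_kw = (rolling + aero + grade_force + inertia) * speed_mps / 1000.0
    # Negative tractive power (braking or motoring) is clipped to zero here.
    return max(tractive_kw, 0.0) / bus.drivetrain_eff + bus.accessory_kw


def emissions_g(bus, trace, rate_g_per_kwh):
    """Step 2: apply a brake-specific rate to each second of predicted power."""
    total = 0.0
    for speed_mps, accel_mps2 in trace:
        power_kw = engine_power_kw(bus, speed_mps, accel_mps2)
        total += power_kw * rate_g_per_kwh / 3600.0  # one second of operation
    return total
```

For example, `emissions_g(Bus(), [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)], 10.0)` returns the grams emitted over a three-second launch from rest under these assumed values. Keeping the power step separate from the rate step means either one can be replaced without touching the other.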
-------
FOREWORD
The U.S. Environmental Protection Agency (EPA) is charged by Congress with protecting
the Nation's land, air, and water resources. Under a mandate of national environmental laws,
the agency strives to formulate and implement actions leading to a compatible balance between
human activities and the ability of natural systems to support and nurture life. To meet this
mandate, EPA's research program is providing data and technical support for solving environmental
problems today and building a science knowledge base necessary to manage our ecological
resources wisely, understand how pollutants affect our health, and prevent or reduce environmental
risks in the future.
The National Risk Management Research Laboratory (NRMRL) is the agency's center
for investigation of technological and management approaches for preventing and reducing risks
from pollution that threaten human health and the environment. The focus of the laboratory's
research program is on methods and their cost-effectiveness for prevention and control of
pollution to air, land, water, and subsurface resources; protection of water quality in public water
systems; remediation of contaminated sites, sediments, and ground water; prevention and control
of indoor air pollution; and restoration of ecosystems. NRMRL collaborates with both public and
private sector partners to foster technologies that reduce the cost of compliance and to
anticipate emerging problems. NRMRL's research provides solutions to environmental problems by:
developing and promoting technologies that protect and improve the environment; advancing
scientific and engineering information to support regulatory and policy decisions; and providing
the technical support and information transfer to ensure implementation of environmental
regulations and strategies at the national, state, and community levels.
This publication has been produced as part of the laboratory's strategic long-term
research plan. It is published and made available by EPA's Office of Research and Development to
assist the user community and to link researchers with their clients.
Sally Gutierrez, Director
National Risk Management Research Laboratory
-------
EPA REVIEW NOTICE
This report has been peer and administratively reviewed by the U.S. Environmental
Protection Agency and approved for publication. Mention of trade names or commercial products
does not constitute endorsement or recommendation for use. This document is available to the
public through the National Technical Information Service, Springfield, Virginia 22161.
-------
TABLE OF CONTENTS
ABSTRACT ii
FOREWORD iii
EPA REVIEW NOTICE iv
LIST OF ACRONYMS xxii
SUMMARY xxv
1. INTRODUCTION 1-1
1.1 Emissions from Heavy-Duty Diesel Vehicles 1-1
1.2 Current Heavy-Duty Vehicle Emissions Modeling Practices 1-2
1.3 Research Approaches and Objectives 1-2
1.4 Summary of Research Contributions 1-3
1.5 Report Organization 1-4
2. HEAVY-DUTY DIESEL VEHICLE EMISSIONS 2-1
2.1 How Diesel Engine Works 2-1
2.1.1 The Internal Combustion Engine 2-1
2.1.2 Comparison with the Gasoline Engine 2-3
2.2 Diesel Engine Emissions 2-4
2.2.1 Oxides of Nitrogen and Ozone Formation 2-4
2.2.2 Fine Particulate Matter (PM2.5) 2-5
2.3 Heavy-Duty Diesel Vehicle Emission Regulations 2-6
2.3.1 National Ambient Air Quality Standards 2-6
2.3.2 Heavy-Duty Engine Certification Standards 2-7
2.3.3 Heavy-Duty Engine Emission Regulations 2-8
2.4 Heavy-Duty Diesel Vehicle Emission Modeling 2-8
3. HEAVY-DUTY DIESEL VEHICLE EMISSIONS MODELING 3-1
3.1 VMT-Based Vehicle Emission Models 3-1
3.1.1 MOBILE 3-1
-------
3.1.2 EMFAC 3-5
3.1.3 Summary 3-6
3.2 Fuel-Based Vehicle Emission Models 3-7
3.3 Modal Emission Rate Models 3-8
3.3.1 CMEM 3-8
3.3.2 MEASURE 3-9
3.3.3 MOVES 3-10
3.3.4 HDDV-MEM 3-11
3.3.4.1 Model Development Approaches 3-12
3.3.4.2 Vehicle Activity Module 3-13
3.3.4.3 Engine Power Module 3-14
3.3.4.4 Emission Rate Module 3-18
3.3.4.5 Emission Outputs 3-19
4. EMISSION DATASET DESCRIPTION AND POST-PROCESSING PROCEDURE 4-1
4.1 Transit Bus Dataset 4-1
4.1.1 Data Collection Method 4-2
4.1.2 Transit Bus Data Parameters 4-4
4.1.3 Sensors, Inc. Data Processing Procedure 4-5
4.1.4 Data Quality Assurance/Quality Check 4-6
4.1.5 Database Formation 4-11
4.1.6 Data Summary 4-12
4.2 Heavy-duty Vehicle Dataset 4-14
4.2.1 Data Collection Method 4-14
4.2.2 Heavy-duty Vehicle Data Parameters 4-16
4.2.3 Data Quality Assurance/Quality Control Check 4-17
4.2.4 Database Formation 4-20
4.2.5 Data Summary 4-20
5. METHODOLOGICAL APPROACH 5-1
5.1 Modeling Goal and Objectives 5-1
5.2 Statistical Method 5-2
5.2.1 Parametric Methods 5-2
5.2.1.1 The t-Test 5-2
5.2.1.2 Ordinary Least Squares Regression 5-3
5.2.1.3 Robust Regression 5-5
5.2.2 Nonparametric Methods 5-5
5.2.2.1 Chi-Square Test 5-5
-------
5.2.2.2 Kolmogorov-Smirnov Two-Sample Test 5-6
5.2.2.3 Wilcoxon Mann-Whitney Test 5-7
5.2.2.4 Analysis of Variance (ANOVA) 5-7
5.2.2.5 HTBR 5-8
5.3 Modeling Approach 5-11
5.4 Model Validation 5-13
6. DATA SET SELECTION AND ANALYSIS OF EXPLANATORY VARIABLES 6-1
6.1 Data Set Used for Model Development 6-1
6.2 Representative Ability of the Transit Bus Data Set 6-3
6.3 Variability in Emissions Data 6-5
6.3.1 Inter-bus Variability 6-5
6.3.2 Descriptive Statistics for Emissions Data 6-8
6.4 Potential Explanatory Variables 6-18
6.4.1 Vehicle Characteristics 6-19
6.4.2 Roadway Characteristics 6-22
6.4.3 Onroad Load Parameters 6-23
6.4.4 Environmental Conditions 6-23
6.4.5 Summary 6-24
6.5 Selection of Explanatory Variables 6-24
7. MODAL ACTIVITY DEFINITIONS DEVELOPMENT 7-1
7.1 Overview of Current Modal Activity Definitions 7-1
7.2 Proposed Modal Activity Definitions and Validation 7-3
7.3 Conclusions 7-11
8. IDLE MODE DEVELOPMENT 8-1
8.1 Critical Value for Speed in Idle Mode 8-1
8.2 Critical Value for Acceleration in Idle Mode 8-4
8.3 Emission Rate Distribution by Bus in Idle Mode 8-8
8.4 Discussions 8-13
8.4.1 High HC Emissions 8-13
8.4.2 High Engine Operating Parameters 8-15
8.5 Idle Emission Rates Estimation 8-16
8.6 Conclusions and Further Considerations 8-19
9. DECELERATION MODE DEVELOPMENT 9-1
9.1 Critical Value for Deceleration Rates in Deceleration Mode 9-1
9.2 Analysis of Deceleration Mode Data 9-5
9.2.1 Emission Rate Distribution by Bus in Deceleration Mode 9-5
-------
9.2.2 Engine Power Distribution by Bus in Deceleration Mode 9-9
9.3 The Deceleration Motoring Mode 9-12
9.4 Deceleration Emission Rate Estimations 9-15
9.5 Conclusions and Further Considerations 9-19
10. ACCELERATION MODE DEVELOPMENT 10-1
10.1 Critical Value for Acceleration in Acceleration Mode 10-1
10.2 Analysis of Acceleration Mode Data 10-6
10.2.1 Emission Rate Distribution by Bus in Acceleration Mode 10-6
10.2.2 Engine Power Distribution by Bus in Acceleration Mode 10-10
10.3 Model Development and Refinement 10-12
10.3.1 HTBR Tree Model Development 10-12
10.3.1.1 NOx HTBR Tree Model Development 10-16
10.3.1.2 CO HTBR Tree Model Development 10-20
10.3.1.3 HC HTBR Tree Model Development 10-22
10.3.2 OLS Model Development and Refinement 10-29
10.3.2.1 NOx Emission Rate Model Development for Acceleration Mode 10-29
10.3.2.1.1 Linear Regression Model with Engine Power 10-29
10.3.2.1.2 Linear Regression Model with Engine Power and Vehicle
Speed 10-34
10.3.2.1.3 Linear Regression Model with Dummy Variables 10-36
10.3.2.1.4 Model Discussions 10-38
10.3.2.2 CO Emission Rate Model Development for Acceleration Mode 10-42
10.3.2.2.1 Linear Regression Model with Engine Power 10-42
10.3.2.2.2 Linear Regression Model with Engine Power and Vehicle
Speed 10-46
10.3.2.2.3 Linear Regression Model with Dummy Variables 10-47
10.3.2.3 HC Emission Rate Model Development for Acceleration Mode 10-54
10.3.2.3.1 Linear Regression with Engine Power 10-55
10.3.2.3.2 Linear Regression Model with Dummy Variables 10-59
10.3.2.3.3 Model Discussions 10-61
10.4 Conclusions and Further Considerations 10-63
11. CRUISE MODE DEVELOPMENT 11-1
11.1 Analysis of Cruise Mode Data 11-1
11.1.1 Emission Rate Distribution by Bus in Cruise Mode 11-2
11.1.2 Engine Power Distribution by Bus in Cruise Mode 11-5
11.2 Model Development and Refinement 11-7
-------
11.2.1 HTBR Tree Model Development 11-7
11.2.1.1 NOx HTBR Tree Model Development 11-11
11.2.1.2 CO HTBR Tree Model Development 11-15
11.2.1.3 HC HTBR Tree Model Development 11-19
11.2.2 OLS Model Development and Refinement 11-25
11.2.2.1 NOx Emission Rate Model Development for Cruise Mode 11-25
11.2.2.1.1 Linear Regression Model with Engine Power 11-25
11.2.2.1.2 Linear Regression Model with Dummy Variables 11-30
11.2.2.1.3 Model Discussion 11-32
11.2.2.2 CO Emission Rate Model Development for Cruise Mode 11-35
11.2.2.2.1 Linear Regression Model with Engine Power 11-35
11.2.2.2.2 Linear Regression Model with Dummy Variables 11-39
11.2.2.2.3 Model Discussion 11-41
11.2.2.3 HC Emission Rate Model Development for Cruise Mode 11-43
11.2.2.3.1 Linear Regression Model with Engine Power 11-43
11.2.2.3.2 Linear Regression Model with Dummy Variables 11-47
11.2.2.3.3 Model Discussion 11-49
11.3 Conclusions and Further Considerations 11-51
12. MODEL VERIFICATION 12-1
12.1 Engine Power vs. Surrogate Power Variables 12-1
12.2 Mean Emission Rates vs. Linear Regression Model 12-4
12.3 Mode-specific Load Based Modal Emission Rate Model vs. Emission Rate
Models as a Function of Engine Load 12-6
12.4 Separation of Acceleration and Cruise Modes 12-11
12.5 MOBILE6.2 vs. Load-Based Modal Emission Rate Model 12-12
12.6 Conclusions 12-13
13. CONCLUSIONS 13-1
13.1 Transit Bus Emission Rate Models 13-3
13.2 Model Limitations 13-4
13.3 Lessons Learned 13-5
13.4 Contributions 13-6
13.5 Recommendation for Further Studies 13-6
14. REFERENCES 14-1
-------
TABLE OF FIGURES
Figure 2.1 Actions of a four-stroke gasoline internal combustion engine — Adapted
from (HowStuffWorks 2005) 2-2
Figure 2.2 Actions of a four-stroke diesel engine (HowStuffWorks 2005) 2-3
Figure 3-1 FTP Transient Cycle (DieselNet 2006) 3-3
Figure 3-2 Urban Dynamometer Driving Schedule Cycle for Heavy-Duty
Vehicle (DieselNet 2006) 3-3
Figure 3-3 CARB's Four Mode Cycles (CARB 2002) 3-6
Figure 3-4 A Framework of Heavy-Duty Diesel Vehicle Modal Emission
Model (Guensler et al. 2005) 3-12
Figure 3-5 Primary Elements in the Drivetrain (Gillespie 1992) 3-14
Figure 4-1 Bus Routes Tested for U.S. EPA (Ensfield 2002) 4-3
Figure 4-2 SEMTECH-D in Back of Bus (Ensfield 2002) 4-4
Figure 4-3 Bus 380 GPS vs. ECM Vehicle Speed (Ensfield 2002) 4-6
Figure 4-4 Example Check for Erroneous GPS Data for Bus 360 (Ensfield 2002) 4-8
Figure 4-5 Example Check for Synchronization Errors for Bus 360 4-9
Figure 4-6 Histograms of Engine Power for Zero Speed Data Based on Three
Different Time Delays 4-10
Figure 4-7 General Criteria for Maximum Grades (Roess et al. 2004) 4-11
Figure 4-8 Onroad Diesel Emissions Characterization Facility (U.S. EPA 2001c) 4-14
Figure 4-9 Example Check for Erroneous Measured Horsepower for Test 3DRI2-2 4-18
Figure 4-10 Vehicle Speed Correlation (U.S. EPA 2001c) 4-19
Figure 4-11 Vehicle Speed Error for Different Speed Ranges (U.S. EPA 2001c) 4-19
Figure 6-1 HTBR Regression Tree Result for NOx Emission Rate for All Data Sets 6-2
Figure 6-2 HTBR Regression Tree Result for CO Emission Rate for All Data Sets 6-2
Figure 6-3 HTBR Regression Tree Result for HC Emission Rate for All Data Sets 6-3
Figure 6-4 Transit Bus Speed-Acceleration Matrix 6-4
-------
Figure 6-5 Test Environmental Conditions 6-5
Figure 6-6 Median and Mean of NOx Emission Rates by Bus 6-6
Figure 6-7 Median and Mean of CO Emission Rates by Bus 6-7
Figure 6-8 Median and Mean of HC Emission Rates by Bus 6-7
Figure 6-9 Empirical Cumulative Distribution Function Based on Bus Based
Median Emission Rates for Transit Buses 6-8
Figure 6-10 Histogram, Boxplot, and Probability Plot of NOx Emission Rate 6-9
Figure 6-11 Histogram, Boxplot, and Probability Plot of CO Emission Rate 6-10
Figure 6-12 Histogram, Boxplot, and Probability Plot of HC Emission Rate 6-10
Figure 6-13 Histogram, Boxplot, and Probability Plot of Truncated NOx
Emission Rate 6-12
Figure 6-14 Histogram, Boxplot, and Probability Plot of Truncated CO
Emission Rate 6-13
Figure 6-15 Histogram, Boxplot, and Probability Plot of Truncated HC
Emission Rate 6-13
Figure 6-16 Histogram, Boxplot, and Probability Plot of Truncated Transformed
NOx Emission Rate 6-15
Figure 6-17 Histogram, Boxplot, and Probability Plot of Truncated Transformed
CO Emission Rate 6-15
Figure 6-18 Histogram, Boxplot, and Probability Plot of Truncated Transformed
HC Emission Rate 6-16
Figure 6-19 The X Classes and Typical Vehicle Configurations 6-20
Figure 6-20 Throttle Position vs. Engine Power for Transit Bus Data Set 6-27
Figure 6-21 Scatter plots for environmental parameters 6-29
Figure 7-1 Average NOx Modal Emission Rates for Different Activity Definitions 7-5
Figure 7-2 Average CO Modal Emission Rates for Different Activity Definitions 7-5
Figure 7-3 Average HC Modal Emission Rates for Different Activity Definitions 7-6
Figure 7-4 HTBR Regression Tree Result for NOx Emission Rate 7-8
Figure 7-5 HTBR Regression Tree Result for CO Emission Rate 7-8
Figure 7-6 HTBR Regression Tree Result for HC Emission Rate 7-9
Figure 8-1 Engine Power vs. NOx Emission Rate for Three Critical Values 8-2
Figure 8-2 Engine Power vs. CO Emission Rate for Three Critical Values 8-2
Figure 8-3 Engine Power vs. HC Emission Rate for Three Critical Values 8-3
Figure 8-4 Engine Power Distribution for Three Critical Values based on NOx Emissions 8-3
Figure 8-5 Engine Power vs. NOx Emission Rate for Four Options 8-5
-------
Figure 8-6 Engine Power vs. CO Emission Rate for Four Options 8-6
Figure 8-7 Engine Power vs. HC Emission Rate for Four Options 8-6
Figure 8-8 Engine Power Distribution for Four Options based on NOx Emission Rates 8-7
Figure 8-9 Histograms of Three Pollutants for Idle Mode 8-9
Figure 8-10 Median and Mean of NOx Emission Rates in Idle Mode by Bus 8-9
Figure 8-11 Median and Mean of CO Emission Rates in Idle Mode by Bus 8-10
Figure 8-12 Median and Mean of HC Emission Rates in Idle Mode by Bus 8-10
Figure 8-13 Histograms of Engine Power in Idle Mode by Bus 8-12
Figure 8-14 Tree Analysis Results for High HC Emission Rates by Bus and Trip 8-14
Figure 8-15 Time Series Plot for Bus 360 Trip 4 Idle Segment 1 (130 Seconds) 8-14
Figure 8-16 Time Series Plot for Bus 360 Trip 4 Idle Segment 38 (516 Seconds) 8-15
Figure 8-17 Time Series Plot for Bus 372 Trip 1 Idle Segment 1 (500 Seconds) 8-15
Figure 8-18 Time Series Plot for Bus 383 Trip 1 Idle Segment 12 (1258 Seconds) 8-16
Figure 8-19 Graphical Illustration of Bootstrap (Adapted from Li 2004) 8-17
Figure 8-20 Bootstrap Results for Idle Emission Rate Estimation 8-18
Figure 9-1 Engine Power Distribution for Three Options 9-3
Figure 9-2 Engine Power vs. NOx Emission Rate for Three Options 9-3
Figure 9-3 Engine Power vs. CO Emission Rate for Three Options 9-4
Figure 9-4 Engine Power vs. HC Emission Rate for Three Options 9-4
Figure 9-5 Histograms of Three Pollutants for Deceleration Mode 9-5
Figure 9-6 Median and Mean of NOx Emission Rates in Deceleration Mode by Bus 9-6
Figure 9-7 Median and Mean of CO Emission Rates in Deceleration Mode by Bus 9-6
Figure 9-8 Median and Mean of HC Emission Rates in Deceleration Mode by Bus 9-7
Figure 9-9 Histograms of Engine Power in Deceleration Mode by Bus 9-11
Figure 9-10 Engine Power vs. Vehicle Speed, Engine Power vs. Engine Speed,
and Vehicle Speed vs. Engine Speed 9-12
Figure 9-11 Histograms for Three Pollutants in Deceleration Motoring Mode (a)
and Deceleration Non-Motoring Mode (b) 9-13
Figure 9-12 Bootstrap Results for NOx Emission Rate Estimation in
Deceleration Mode 9-16
Figure 9-13 Bootstrap Results for CO Emission Rate Estimation in
Deceleration Mode 9-16
Figure 9-14 Bootstrap Results for HC Emission Rate Estimation in
Deceleration Mode 9-17
Figure 9-15 Emission Rate Estimation Based on Bootstrap for Deceleration Mode 9-17
-------
Figure 10-1 Engine Power Distribution for Three Options 10-2
Figure 10-2 Engine Power vs. NOx Emission Rate (g/s) for Three Options 10-2
Figure 10-3 Engine Power vs. CO Emission Rate (g/s) for Three Options 10-3
Figure 10-4 Engine Power vs. HC Emission Rate (g/s) for Three Options 10-3
Figure 10-5 Engine Power vs. Emission Rate for Acceleration Mode and Cruise Mode 10-6
Figure 10-6 Histograms of Three Pollutants for Acceleration Mode 10-7
Figure 10-7 Median and Mean of NOx Emission Rates in Acceleration Mode by Bus 10-8
Figure 10-8 Median and Mean of CO Emission Rates in Acceleration Mode by Bus 10-9
Figure 10-9 Median and Mean of HC Emission Rates in Acceleration Mode by Bus 10-9
Figure 10-10 Histograms of Engine Power in Acceleration Mode by Bus 10-11
Figure 10-11 Histogram, Boxplot, and Probability Plot of Truncated NOx
Emission Rate in Acceleration Mode 10-13
Figure 10-12 Histogram, Boxplot, and Probability Plot of Truncated CO
Emission Rate in Acceleration Mode 10-14
Figure 10-13 Histogram, Boxplot, and Probability Plot of Truncated HC
Emission Rate in Acceleration Mode 10-14
Figure 10-14 Histogram, Boxplot, and Probability Plot of Truncated Transformed
NOx Emission Rate in Acceleration Mode 10-15
Figure 10-15 Histogram, Boxplot, and Probability Plot of Truncated Transformed
CO Emission Rate in Acceleration Mode 10-15
Figure 10-16 Histogram, Boxplot, and Probability Plot of Truncated Transformed
HC Emission Rate in Acceleration Mode 10-16
Figure 10-17 Original Untrimmed Regression Tree Model for Truncated
Transformed NOx Emission Rate in Acceleration Mode 10-17
Figure 10-18 Reduction in Deviation with the Addition of Nodes of Regression
Tree for Truncated Transformed NOx Emission Rate in Acceleration Mode 10-18
Figure 10-19 Trimmed Regression Tree Model for Truncated Transformed NOx
Emission Rate in Acceleration Mode 10-18
Figure 10-20 Original Untrimmed Regression Tree Model for Truncated
Transformed CO Emission Rate in Acceleration Mode 10-20
Figure 10-21 Reduction in Deviation with the Addition of Nodes of Regression
Tree for Truncated Transformed CO Emission Rate in Acceleration Mode 10-21
Figure 10-22 Trimmed Regression Tree Model for Truncated Transformed CO
Emission Rate in Acceleration Mode 10-21
-------
Figure 10-23 Original Untrimmed Regression Tree Model for Truncated
Transformed HC Emission Rate in Acceleration Mode 10-23
Figure 10-25 Trimmed Regression Tree Model for Truncated Transformed
HC in Acceleration Mode 10-25
Figure 10-26 Secondary Trimmed Regression Tree Model for Truncated
Transformed HC Emission Rate in Acceleration Mode 10-26
Figure 10-27 Final Regression Tree Model for Truncated Transformed HC
and Engine Power in Acceleration Mode 10-28
Figure 10-28 QQ and Residual vs. Fitted Plot for NOx Model 1.1 10-31
Figure 10-29 QQ and Residual vs. Fitted Plot for NOx Model 1.2 10-32
Figure 10-30 QQ and Residual vs. Fitted Plot for NOx Model 1.3 10-33
Figure 10-31 QQ and Residual vs. Fitted Plot for NOx Model 1.4 10-36
Figure 10-32 QQ and Residual vs. Fitted Plot for NOx Model 1.5 10-38
Figure 10-33 QQ and Residual vs. Fitted Plot for CO Model 2.1 10-43
Figure 10-34 QQ and Residual vs. Fitted Plot for CO Model 2.2 10-44
Figure 10-35 QQ and Residual vs. Fitted Plot for CO Model 2.3 10-45
Figure 10-36 QQ and Residual vs. Fitted Plot for CO Model 2.4 10-47
Figure 10-37 QQ and Residual vs. Fitted Plot for CO Model 2.5 10-50
Figure 10-38 QQ and Residual vs. Fitted Plot for CO Model 2.6 10-52
10-55
Figure 10-39 QQ and Residual vs. Fitted Plot for HC Model 3.1 10-56
Figure 10-40 QQ and Residual vs. Fitted Plot for HC Model 3.2 10-57
Figure 10-41 QQ and Residual vs. Fitted Plot for HC Model 3.3 10-58
Figure 10-42 QQ and Residual vs. Fitted Plot for HC Model 3.4 10-61
Figure 11-1 Histograms of Three Pollutants for Cruise Mode 11-2
Figure 11-2 Median and Mean of NOx Emission Rates in Cruise Mode by Bus 11-3
Figure 11-3 Median and Mean of CO Emission Rates in Cruise Mode by Bus 11-3
Figure 11-4 Median and Mean of HC Emission Rates in Cruise Mode by Bus 11-4
Figure 11-5 Histograms of Engine Power in Cruise Mode by Bus 11-6
Figure 11-6 Histogram, Boxplot, and Probability Plot of Truncated NOx Emission
Rates in Cruise Mode 11-8
Figure 11-7 Histogram, Boxplot, and Probability Plot of Truncated CO Emission
Rate in Cruise Mode 11-9
Figure 11-8 Histogram, Boxplot, and Probability Plot of Truncated HC Emission
Rate in Cruise Mode 11-9
-------
Figure 11-9 Histogram, Boxplot, and Probability Plot of Truncated Transformed
NOx Emission Rate in Cruise Mode 11-10
Figure 11-10 Histogram, Boxplot, and Probability Plot of Truncated Transformed
CO Emission Rate in Cruise Mode 11-10
Figure 11-11 Histogram, Boxplot, and Probability Plot of Truncated Transformed
HC Emission Rate in Cruise Mode 11-11
Figure 11-12 Original Untrimmed Regression Tree Model for Truncated
Transformed NOx Emission Rate in Cruise Mode 11-12
Figure 11-13 Reduction in Deviation with the Addition of Nodes of Regression
Tree for Truncated Transformed NOx Emission Rate in Cruise Mode 11-12
Figure 11-14 Trimmed Regression Tree Model for Truncated Transformed NOx
Emission Rate in Cruise Mode 11-14
Figure 11-15 Original Untrimmed Regression Tree Model for Truncated
Transformed CO Emission Rate in Cruise Mode 11-16
Figure 11-16 Reduction in Deviation with the Addition of Nodes of Regression
Tree for Truncated Transformed CO Emission Rate in Cruise Mode 11-16
Figure 11-17 Trimmed Regression Tree Model for Truncated Transformed CO
Emission Rate in Cruise Mode 11-18
Figure 11-18 Original Untrimmed Regression Tree Model for Truncated
Transformed HC Emission Rate in Cruise Mode 11-19
Figure 11-19 Trimmed Regression Tree Model for Truncated Transformed HC
Emission Rate in Cruise Mode 11-21
Figure 11-20 Secondary Trimmed Regression Tree Model for Truncated
Transformed HC in Cruise Mode 11-22
Figure 11-21 Final Regression Tree Model for Truncated Transformed HC
and Engine Power in Cruise Mode 11-24
Figure 11-22 QQ and Residual vs. Fitted Plot for NOx Model 1.1 11-27
Figure 11-23 QQ and Residual vs. Fitted Plot for NOx Model 1.2 11-28
Figure 11-24 QQ and Residual vs. Fitted Plot for NOx Model 1.3 11-29
Figure 11-25 QQ and Residual vs. Fitted Plot for NOx Model 1.4 11-32
Figure 11-26 QQ and Residual vs. Fitted Plot for CO Model 2.1 11-36
Figure 11-27 QQ and Residual vs. Fitted Plot for CO Model 2.2 11-37
Figure 11-28 QQ and Residual vs. Fitted Plot for CO Model 2.3 11-38
Figure 11-29 QQ and Residual vs. Fitted Plot for CO Model 2.4 11-40
Figure 11-30 QQ and Residual vs. Fitted Plot for HC Model 3.1 11-44
-------
Figure 11-31 QQ and Residual vs. Fitted Plot for HC Model 3.2 11-45
Figure 11-32 QQ and Residual vs. Fitted Plot for HC Model 3.3 11-46
Figure 11-33 QQ and Residual vs. Fitted Plot for HC Model 3.4 11-48
Figure 12-1 QQ and Residual vs. Fitted Plot for NOx Model 1 12-4
Figure 12-2 Trimmed Regression Tree Model for Truncated Transformed NOx 12-7
Figure 12-3 QQ and Residual vs. Fitted Plot for Load-Based Only NOx Emission
Rate Model 12-9
-------
TABLE OF TABLES
Table 2-1. National Ambient Air Quality Standards (U.S. EPA 2006) 2-6
Table 2-2. Heavy-Duty Engine Emissions Standards (U.S. EPA 1997) 2-8
Table 3-1. Heavy-Duty Vehicle NOx Emission Rates in MOBILE6 3-4
Table 3-2 Heavy-Duty Vehicle CO Emission Rates in MOBILE6 3-4
Table 3-3 Heavy-Duty Vehicle HC Emission Rates in MOBILE6 3-4
Table 4-1 Buses Tested for U.S. EPA (Ensfield 2002) 4-2
Table 4-2 Transit Bus Parameters Given by the U.S. EPA (Ensfield 2002) 4-4
Table 4-3 List of Parameters Used in Explanatory Analysis for Transit Bus 4-12
Table 4-4 Summary of Transit Bus Database 4-13
Table 4-5 Onroad Tests Conducted with Pre-Rebuild Engine 4-15
Table 4-6 Onroad Tests Conducted with Post-Rebuild Engine 4-16
Table 4-7 List of Parameters Given in Heavy-duty Vehicle Dataset Provided by
U.S. EPA 4-17
Table 4-8 List of Parameters Used in Explanatory Analysis for HDDV 4-20
Table 4-9 Summary of Heavy-Duty Vehicle Data (U.S. EPA 2001c) 4-21
Table 5-1 ANOVA Table for Single-Factor Study (Neter et al. 1996) 5-8
Table 6-1 Basic Summary Statistics for Emissions Rate Data for Transit Bus 6-9
Table 6-2 Basic Summary Statistics for Truncated Emissions Rate Data 6-12
Table 6-4 Percent of High Emission Points by Bus 6-18
Table 6-5 Correlation Matrix for Transit Bus Data Set 6-25
Table 7-1 Comparison of Modal Activity Definition 7-3
-------
Table 7-2 Four Different Mode Definitions and Modal Variables 7-4
Table 7-3 Results for Pairwise Comparison for Modal Average Estimates
In Terms of P-value 7-7
Table 7-4 Sensitivity Test Results for Four Mode Definition 7-9
Table 8-1 Engine Power Distribution for Three Critical Values for Three Pollutants 8-4
Table 8-2 Percentage of Engine Power Distribution for Three Critical Values for
Three Pollutants 8-4
Table 8-3 Engine Power Distribution for Four Options for Three Pollutants 8-7
Table 8-4 Percentage of Engine Power Distribution for Three Critical Values
for Three Pollutants 8-8
Table 8-5 Median, and Mean of Three Pollutants in Idle Mode by Bus 8-11
Table 8-6 Engine Power Distribution in Idle Mode by Bus 8-13
Table 8-7 Idle Mode Statistical Analysis Results for NOx, CO, and HC 8-17
Table 8-8 Idle Emission Rates Estimation and 95% Confidence Intervals
Based on Bootstrap 8-18
Table 9-1 Engine Power Distribution for Three Options for Three Pollutants 9-2
Table 9-2 Percentage of Engine Power Distribution for Three Options for
Three Pollutants 9-2
Table 9-3 Median, and Mean for NOx, CO, and HC in Deceleration Mode by Bus 9-7
Table 9-4 High HC Emissions Distribution by Bus and Trip for Deceleration Mode 9-9
Table 9-5 Engine Power Distributions in Deceleration Mode by Bus 9-10
Table 9-6 Comparison of Emission Distributions between Deceleration Mode and
Two Sub-Modes (Deceleration Motoring Mode and Deceleration
Non-Motoring Mode) 9-14
Table 9-7 Emission Rate Estimation and 95% Confidence Intervals Based on
Bootstrap for Deceleration Mode 9-18
Table 10-1 Engine Power Distribution for Three Options for Three Pollutants 10-4
Table 10-2 Percentage of Engine Power Distribution for Three Options for Three
Pollutants 10-4
Table 10-3 Engine Power Distribution for Acceleration Mode and Cruise Mode 10-5
Table 10-4 Median and Mean of Three Pollutants in Acceleration Mode by Bus 10-7
Table 10-5 Engine Power Distribution in Acceleration Mode by Bus 10-10
-------
Table 10-6 Original Untrimmed Regression Tree Results for Truncated Transformed
NOx Emission Rate in Acceleration Mode 10-17
Table 10-7 Trimmed Regression Tree Results for Truncated Transformed NOx
Emission Rate in Acceleration Mode 10-19
Table 10-8 Original Untrimmed Regression Tree Results for Truncated
Transformed CO Emission Rate in Acceleration Mode 10-20
Table 10-9 Trimmed Regression Tree Results for Truncated Transformed CO
Emission Rate in Acceleration Mode 10-22
Table 10-10 Original Untrimmed Regression Tree Results for Truncated
Transformed HC Emission Rate in Acceleration Mode 10-23
Table 10-11 Trimmed Regression Tree Results for Truncated Transformed HC
in Acceleration Mode 10-25
Table 10-12 Secondary Trimmed Regression Tree Results for Truncated
Transformed HC Emission Rate in Acceleration Mode 10-27
Table 10-13 Final Regression Tree Results for Truncated Transformed HC
and Engine Power in Acceleration Mode 10-28
Table 10-14 Regression Result for NOx Model 1.1 10-30
Table 10-15 Regression Result for NOx Model 1.2 10-32
Table 10-16 Regression Result for NOx Model 1.3 10-33
Table 10-17 Regression Result for NOx Model 1.4 10-35
Table 10-18 Regression Result for NOx Model 1.5 10-37
Table 10-19 Comparative Performance Evaluation of NOx Emission Rate Models 10-40
Table 10-20 Regression Result for CO Model 2.1 10-42
Table 10-21 Regression Result for CO Model 2.2 10-44
Table 10-22 Regression Result for CO Model 2.3 10-45
Table 10-23 Regression Result for CO Model 2.4 10-46
Table 10-24 Regression Result for CO Model 2.5 10-49
Table 10-25 Regression Result for CO Model 2.6 10-51
Table 10-26 Comparative Performance Evaluation of CO Emission Rate Models 10-53
Table 10-27 Regression Result for HC Model 3.1 10-55
Table 10-28 Regression Result for HC Model 3.2 10-57
Table 10-29 Regression Result for HC Model 3.3 10-58
-------
Table 10-31 Comparative Performance Evaluation of HC Emission Rate Models 10-62
Table 11-1 Engine Power Distribution for Cruise Mode 11-1
Table 11-2 Median and Mean of Three Pollutants in Cruise Mode by Bus 11-4
Table 11-3 Engine Power Distribution in Cruise Mode by Bus 11-5
Table 11-4 Original Untrimmed Regression Tree Results for Truncated Transformed
NOx Emission Rate in Cruise Mode 11-13
Table 11-5 Trimmed Regression Tree Results for Truncated Transformed NOx
Emission Rate in Cruise Mode 11-14
Table 11-6 Original Untrimmed Regression Tree Results for Truncated Transformed
CO Emission Rate in Cruise Mode 11-17
Table 11-7 Trimmed Regression Tree Results for Truncated Transformed CO
Emission Rate in Cruise Mode 11-18
Table 11-8 Original Untrimmed Regression Tree Results for Truncated Transformed
HC Emission Rate in Cruise Mode 11-20
Table 11-9 Trimmed Regression Tree Results for Truncated Transformed HC
Emission Rate in Cruise Mode 11-21
Table 11-10 Trimmed Regression Tree Results for Truncated Transformed HC in
Cruise Mode 11-23
Table 11-11 Final Regression Tree Results for Truncated Transformed HC and
Engine Power in Cruise Mode 11-24
Table 11-12 Regression Result for NOx Model 1.1 11-26
Table 11-13 Regression Result for NOx Model 1.2 11-28
Table 11-15 Regression Result for NOx Model 1.4 11-31
Table 11-16 Comparative Performance Evaluation of NOx Emission Rate Models 11-33
Table 11-17 Regression Result for CO Model 2.1 11-35
Table 11-18 Regression Result for CO Model 2.2 11-37
Table 11-19 Regression Result for CO Model 2.3 11-38
Table 11-20 Regression Result for CO Model 2.4 11-40
Table 11-21 Comparative Performance Evaluation of CO Emission Rate Models 11-41
Table 11-22 Regression Result for HC Model 3.1 11-43
Table 11-23 Regression Result for HC Model 3.2 11-45
Table 11-24 Regression Result for HC Model 3.3 11-46
Table 11-25 Regression Result for HC Model 3.4 11-48
Table 11-26 Comparative Performance Evaluation of HC Emission Rate Models 11-49
Table 12-1 Regression Result for NOx Model 1 12-3
Table 12-2 Comparative Performance Evaluation between Mode-Only Models
and Linear Regression Models 12-6
Table 12-3 Trimmed Regression Tree Results for Truncated Transformed NOx 12-8
Table 12-4 Regression Result for NOx Load-Based Only Emission Rate Model 12-9
Table 12-5 Comparative Performance Evaluation Between Load-Based Only
Emission Rate (ER) Model and Load-Based Modal Emission Rate Model 12-10
Table 12-6 Comparative Performance Evaluation between Linear Regression with
Combined Mode and Linear Regression with Acceleration and Cruise
Modes 12-12
Table 12-7 Comparative Performance Evaluation between MOBILE 6.2 and Load-Based
Modal ER Model 12-13
Table 13-1 Load Based Modal Emission Models 13-3
LIST OF ACRONYMS
% percent
AADT annual average daily traffic
AATA Ann Arbor Transit Authority
Acc acceleration
ANOVA analysis of variance
APPCD Air Pollution Prevention and Control Division
bhp brake horsepower
BSFC brake specific fuel consumption
C Celsius
CARB California Air Resources Board
CART classification and regression trees
CE-CERT College of Engineering - Center for Environmental Research and Technology
CMEM Comprehensive Modal Emissions Model
CO carbon monoxide
deg degree
df degrees of freedom
DPS drag power surrogate
DVD digital video disc
ECM electronic control module
EMFAC CARB's mobile source emission factor model
E(MS) expected mean square
EPA Environmental Protection Agency
F Fahrenheit
FHWA Federal Highway Administration
FR Federal Register
FTP Federal Test Procedure
g/bhp-hr grams per brake-horsepower-hour
g/h grams per hour
g/s grams per second
GIS geographic information system
GPS global positioning system
GVWR gross vehicle weight rating
HC hydrocarbon
HDD heavy-duty diesel
HDDV heavy-duty diesel vehicle
HDDV-MEM Heavy-Duty Diesel Vehicle-Modal Emission Model
HDV heavy-duty vehicle
HDV8B heavy-duty vehicle 8B
HDV-UDDS heavy-duty vehicle urban dynamometer driving schedule
Hg mercury
HHDDE heavy-heavy duty diesel engine
HTBR hierarchical tree-based regression
Hz hertz
IC internal combustion
IPS inertial power surrogate
kPa kilopascal
K/S Kolmogorov-Smirnov
LAFY Los Angeles freeway
LANF Los Angeles non-freeway
lb pound
lb-ft pound-feet
LDV light-duty vehicle
LHDDE light-heavy duty diesel engine
MARTA Metropolitan Atlanta Rapid Transit Authority
MDPV medium duty passenger vehicle
MEASURE Mobile Emissions Assessment System for Urban and Regional Evaluation
MM method of moments
mg/m3 milligrams per cubic meter
MHDDE medium-heavy duty diesel engine
MOBILE EPA's mobile source emission rate model
MOBILE6 EPA's mobile source emission rate model
MOVES Motor Vehicle Emission Simulator
MPE mean prediction error
mpg miles per gallon
mph miles per hour
mph/s miles per hour per second
MS mean square
N2 nitrogen
NAAQS National Ambient Air Quality Standards
NCSU North Carolina State University
NGM EPA's Next Generation Model (mobile sources)
NIPER National Institute for Petroleum and Energy Research
NIST National Institute of Standards and Technology
NO nitric oxide
NO2 nitrogen dioxide
NONROAD EPA's emission rate model for non-road sources
NOx nitrogen oxides
NRMRL National Risk Management Research Laboratory
NYNF New York non-freeway
O2 oxygen
O3 ozone
ODEC Onroad Diesel Emissions Characterization
OLS ordinary least squares
OTAQ Office of Transportation and Air Quality
Pb lead
PCV positive crankcase ventilation
PERE Physical Emission Rate Estimator
PM particulate matter
PM10 particulate matter < 10 microns
PM2.5 particulate matter < 2.5 microns
ppmv parts per million by volume
QA/QC quality assurance/quality control
QQ quantile-quantile
RARE Regional Applied Research Effort
RMSE root mean square error
RPM revolutions per minute
SCFM standard cubic feet per minute
SO2 sulfur dioxide
SS sum of squares
SSE sum of squares due to errors
SSTO total sum of squares
SUV sport utility vehicle
TB-EPDS transit bus engine power demand simulator
TIUS Truck Inventory and Use Survey
TRB Transportation Research Board
UCR-CERT University of California Riverside - Center for Environmental Research and Technology
UDDS urban dynamometer driving schedule
U.S. EPA U.S. Environmental Protection Agency
UV ultraviolet
VIF variance inflation factor
VMT vehicle miles traveled
VOCs volatile organic compounds
VSP vehicle specific power
µg/m3 micrograms per cubic meter
µm micron
SUMMARY
Heavy-duty diesel vehicle (HDDV) operations are a major source of pollutant emissions
in major metropolitan areas. Accurate estimation of heavy-duty diesel vehicle emissions is
essential in air quality planning efforts because highway and non-road heavy-duty diesel emissions
account for a significant fraction of the oxides of nitrogen (NOx) and particulate matter (PM)
emissions inventories. MOBILE6 (U.S. EPA 2002a), EPA's mobile source emission rate model,
uses an "average trip-based" approach to modeling as opposed to a more fundamental and robust
modal modeling approach.
The major effort of this research is to develop a new heavy-duty vehicle load-based modal
emission rate model that overcomes some of the limitations of existing models and emission
rate prediction methods. This model is part of the proposed Heavy-Duty Diesel Vehicle Modal
Emission Modeling (HDDV-MEM) framework, which was developed by the Georgia Institute of
Technology. HDDV-MEM first predicts second-by-second engine power demand as a function of
vehicle operating conditions and then applies brake-specific emission rates to these activity
predictions.
To provide better estimates of microscale emissions, this modeling approach is
designed to predict second-by-second emissions from on-road vehicle operations. This research
statistically analyzes the database provided by EPA and yields a model that predicts emissions
at the microscale level based on engine power demand and driving mode. Research results
demonstrate the importance of including the influence of engine power demand on emissions
and of simulating engine power in real-world applications. The modeling approach provides
a significant improvement in HDDV emissions modeling compared to the current average speed
cycle-based emissions models.
CHAPTER 1
1. INTRODUCTION
1.1 Emissions from Heavy-Duty Diesel Vehicles
Heavy-duty diesel vehicle (HDDV) operations are a major source of oxides of
nitrogen (NOx) and particulate matter (PM) emissions in metropolitan areas nationwide.
Although HDDVs constitute a small portion of the on-road fleet, they typically contribute more
than 45% of NOx and 75% of PM on-road mobile source emissions (U.S. EPA 2003). HDDV
emissions are a large source of global greenhouse gas and toxic air contaminant emissions.
According to a 2002 Environmental Defense report, NOx causes many environmental problems,
including acid rain, haze, global warming, and nutrient overloading leading to water quality
degradation (CEDF 2002). HDDV emissions are also harmful to human health and the environment
(SCAQMD 2000). Groundbreaking long-term studies of children's health conducted in California
have demonstrated that particle pollution may significantly reduce lung function growth in
children (Avol 2001, Gauderman 2002, Peters 1999). Previous studies have stressed the
significance of emissions from HDVs in urban non-attainment areas, especially for ozone (for
which nitrogen oxides are a precursor) and PM2.5 (Gautam and Clark 2003, Lloyd and Cackette 2001).
Over the last several decades, both government and private industry have made extensive
efforts to regulate and control mobile source emissions. In 1961, the first automotive emissions
control technology in the nation, Positive Crankcase Ventilation (PCV), was mandated by the
California Motor Vehicle State Bureau of Air Sanitation to control hydrocarbon crankcase
emissions, and the PCV requirement went into effect on domestic passenger vehicles for sale in
California in 1963 (CARB 2004). At the same time, the first federal Clean Air Act was enacted.
Although this act initially dealt only with reducing air pollution by setting emissions standards
for stationary sources such as power plants and steel mills, the amendments of 1965, 1966 and
1967 focused on establishing standards for automobile emissions (AMS 2005). Emission control
was first required on light-duty gasoline vehicles (LDVs) by U.S. EPA in the 1968 model year.
Developed and refined over a period of more than 30 years, these controls have become more
effective at reducing LDV emissions (FCAP 2004).
The relative importance of emissions from HDDVs has increased significantly because
today's gasoline powered vehicles are more than 95% cleaner than vehicles in 1968. HDDVs
typically have a life cycle of over one million miles and may be on the road as long as 30
years. Given this durability and reliability, and with continued increases in goods movement,
HDDVs will continue to play a major role in emission inventories, and modeling of HDDV
emissions is going to become increasingly important in air quality planning.
1.2 Current Heavy-Duty Vehicle Emissions Modeling Practices
In current regional and microscale modeling conducted in every state except California,
HDDV emissions rates are taken from the U.S. Environmental Protection Agency's (EPA's)
MOBILE 6.2 model (U.S. EPA 2001b). MOBILE 6.2 emission rates were derived from baseline
emission rates (grams/brake-horsepower-hour) developed in the laboratory using engine
dynamometer test cycles. While different driving cycles have been developed over the years,
dynamometer testing is conceptually designed to obtain a "representative sample" of vehicle
operations. These work-based emission rates are then modified through a series of conversion
and correction factors to obtain approximate emission rates in units of grams/mile that can be
applied to on-road vehicle activity (vehicle miles traveled), as a function of temperature,
humidity, altitude, average vehicle speed, etc. (Guensler 1993). The conversion process used to
translate laboratory emission rates to on-road emission rates employs fuel density, brake
specific fuel consumption, and fuel economy for each HDDV technology class. However, the
emission rate conversion process does not appropriately account for the impacts of roadway
operating conditions on brake specific fuel consumption and fuel economy (Guensler, et al. 1991).
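The conversion from work-based to distance-based emission rates can be sketched in a few lines. This is a minimal illustration and not the MOBILE 6.2 implementation: it assumes the conversion factor (bhp-hr per mile) takes the commonly cited form of fuel density divided by the product of brake specific fuel consumption and fuel economy. The function names and all input values are hypothetical.

```python
def conversion_factor(fuel_density_lb_per_gal, bsfc_lb_per_bhp_hr, fuel_economy_mpg):
    """Return engine work per mile (bhp-hr/mi).

    Assumed form: fuel density / (BSFC * fuel economy), i.e.
    (lb/gal) / ((lb/bhp-hr) * (mi/gal)) = bhp-hr/mi.
    """
    if bsfc_lb_per_bhp_hr <= 0 or fuel_economy_mpg <= 0:
        raise ValueError("BSFC and fuel economy must be positive")
    return fuel_density_lb_per_gal / (bsfc_lb_per_bhp_hr * fuel_economy_mpg)


def grams_per_mile(rate_g_per_bhp_hr, fuel_density_lb_per_gal,
                   bsfc_lb_per_bhp_hr, fuel_economy_mpg):
    """Convert a work-based rate (g/bhp-hr) to a distance-based rate (g/mi)."""
    return rate_g_per_bhp_hr * conversion_factor(
        fuel_density_lb_per_gal, bsfc_lb_per_bhp_hr, fuel_economy_mpg)


# Hypothetical inputs chosen only to make the arithmetic easy to follow;
# they are not values from MOBILE 6.2 or from this report.
example_rate = grams_per_mile(
    rate_g_per_bhp_hr=5.0,
    fuel_density_lb_per_gal=7.0,
    bsfc_lb_per_bhp_hr=0.35,
    fuel_economy_mpg=5.0,
)
print(f"{example_rate:.1f} g/mi")
```

Because a factor of this kind uses a single brake specific fuel consumption and fuel economy value per technology class, it cannot reflect how those quantities change with roadway operating conditions.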
The U.S. EPA is currently developing a new set of modeling tools for the estimation of
emissions produced by on-road and off-road mobile sources. The new Motor Vehicle Emissions
Simulator, known as MOVES (Koupal, et al. 2004), is a modeling system designed to better
predict emissions from on-road operations. The philosophy behind MOVES is to develop a model
that is as directly data-driven as possible, meaning that emission rates are developed from
second-by-second or binned emission rate data.
MOBILE = Current mobile source emissions model used for State Implementation Plan emission inventories.
MOVES = Motor Vehicle Emissions Simulator, next generation mobile source emissions model. The model will
be used for State Implementation Plan emission inventories and will replace the current MOBILE model.
1.3 Research Approaches and Objectives
The major effort of this research is to develop a new heavy-duty vehicle load-based modal
emission rate model that overcomes some of the limitations of existing models and emission
rate prediction methods. This model is part of the proposed Heavy-Duty Diesel Vehicle Modal
Emission Modeling (HDDV-MEM) framework, which was developed by the Georgia Institute of
Technology (Guensler, et al. 2006). HDDV-MEM differs from other proposed HDDV modal models
(Barth, et al. 2004, Frey, et al. 2002, Nam 2003) in that the modeling framework first predicts
second-by-second engine power demand as a function of vehicle operating conditions and then
applies brake-specific emission rates to these activity predictions. This means that HDDV
emission rates are predicted as a function of engine horsepower loads for different driving
modes. Hence, the basic algorithm and matrix calculation in HDDV-MEM should be transferable to
MOVES. The new model implementation is similar in general structure to the previous modal
emission rate model known as the Mobile Emissions Assessment System for Urban and Regional
Evaluation (MEASURE) model, developed by the Georgia Institute of Technology several years ago
(Bachman 1998, Guensler, et al. 1998, Bachman, et al. 2000).
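The two-step, load-based idea described above (predict engine power for each second, then apply a brake-specific emission rate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the HDDV-MEM implementation or the regression models developed later in this report: the function names, the per-mode rate table, and every numeric value are hypothetical.

```python
SECONDS_PER_HOUR = 3600.0


def emission_rate_g_per_s(power_bhp, rate_g_per_bhp_hr):
    """Emissions for one second of operation (g/s).

    power (bhp) * rate (g/bhp-hr) gives g/hr; dividing by 3600 gives g/s.
    """
    return power_bhp * rate_g_per_bhp_hr / SECONDS_PER_HOUR


def total_emissions_g(activity, rates_by_mode):
    """Sum emissions over second-by-second activity.

    activity: iterable of (driving_mode, engine_power_bhp), one entry per second.
    rates_by_mode: dict mapping driving mode to a brake-specific rate (g/bhp-hr).
    """
    return sum(
        emission_rate_g_per_s(power_bhp, rates_by_mode[mode])
        for mode, power_bhp in activity
    )


# Hypothetical brake-specific NOx rates by driving mode (g/bhp-hr).
hypothetical_rates = {"cruise": 4.0, "acceleration": 5.0}

# Hypothetical activity: two seconds of cruise, one second of acceleration.
hypothetical_activity = [("cruise", 180.0), ("cruise", 180.0), ("acceleration", 360.0)]

print(f"{total_emissions_g(hypothetical_activity, hypothetical_rates):.2f} g")
```

Keeping the power prediction separate from the emission rate lookup is what allows the same rates to be applied to any source of second-by-second activity data.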
The major effort of this research consists of a number of specific objectives outlined
below:
• Develop a new load-based modal emission rate model to improve spatial/temporal
emissions modeling;
• Develop an HDDV modal emission rate model to more accurately estimate on-road
HDDV emissions;
• Develop a modal model that can be verified at multiple levels;
• Develop an HDDV modal emission rate model that can be integrated into MOVES.
1.4 Summary of Research Contributions
This research makes four major contributions. First, it establishes a framework for
emission rate modeling suitable for predicting emissions at different scales (microscale,
mesoscale, and macroscale). Because the model is developed using on-board emissions data
collected under real-world conditions, it will provide capabilities for integrating the
necessary vehicle activity data and emission rate algorithms to support second-by-second and
link-based emissions prediction. Combined with a GIS framework, this model will improve
spatial/temporal emissions modeling.
MEASURE = Mobile Emissions Assessment System for Urban and Regional Evaluation Model. This model is a
prototype GIS-based modal emissions model.
Second, the relationship between engine power and emissions is explored and integrated
into the modeling framework. Research results indicate that engine power represents load in
the proposed model better than surrogate variables do. Because engine power plays an important
role in explaining the variability of emissions, load data should be measured during the
emission data collection procedure. Development of methods to simulate real-world engine power
is equally important.
Third, this research verifies that vehicle emission rates are highly correlated with modal
vehicle activity. To gain a better understanding of driving modes, it is important to examine
not only emission distributions, but also engine power distributions.
Finally, a dynamic framework is created for further improvement. As more databases
become available, this approach could be re-run to obtain a more reliable load-based modal emis-
sion model based on the same philosophy.
1.5 Report Organization
Chapter 2 examines the diesel fuel combustion process and its relationship to diesel en-
gine emissions formation. Chapter 3 overviews the existing heavy-duty vehicle emission models
and presents the proposed heavy-duty diesel vehicle modal emission model (HDDV-MEM).
Chapter 4 provides an overview of the emission rate testing databases provided by U.S. EPA, the
quality assurance and quality control (QA/QC) procedures to review the validity of the data, and
the methods used to post-process these databases to correct data deficiencies. In Chapter 5, the
various statistical models considered for data analysis are discussed. Chapter 6 selects the data-
base used to develop the conceptual model and discusses the influence of explanatory variables
on emissions. Chapter 7 covers sensitivity tests of driving mode definitions and outlines the
potential impacts on derived models. Chapters 8 to 11 elaborate the different emission models
developed for idle, deceleration, acceleration and cruise driving modes. In Chapter 12, research
results are verified. Finally, Chapter 13 presents a discussion and conclusion on research results.
CHAPTER 2
2. HEAVY-DUTY DIESEL VEHICLE EMISSIONS
Diesel engines differ from gasoline engines in terms of the combustion processes
and engine size, giving rise to their different emission properties and therefore different emis-
sions standards. This chapter examines the diesel fuel combustion process and its relationship to
diesel engine emissions formation followed by a summary of the emission regulations for diesel
engines.
2.1 How a Diesel Engine Works
By far the predominant engine design for transportation vehicles is the reciprocating
internal combustion (IC) engine, which operates on either a four-stroke or a two-stroke cycle.
The two-stroke engine is commonly found in lower-power applications such as snowmobiles,
lawnmowers, mopeds, outboard motors and motorcycles, while both gasoline and diesel automotive
engines are classified as four-stroke engines. To understand the formation and control of
emissions, it is necessary to first develop an understanding of the operation of the internal
combustion engine.
2.1.1 The Internal Combustion Engine
Internal combustion engines generate power by converting the chemical energy stored in
fuels into mechanical energy. The engine is termed "internal combustion" because combustion
occurs in a confined space called a combustion chamber. Combustion of the fuel charge inside
a chamber causes a rapid rise in temperature and pressure of the gases in the chamber, which are
permitted to expand. The expanding gases are used to move a piston, turbine blades, rotor, or the
engine itself.
The four-stroke gasoline engine cycle is also called the Otto cycle, in honor of Nikolaus Otto,
who is credited with inventing the process in 1867. The four piston strokes are illustrated in
Figure 2-1. The following processes take place during one cycle of operation:
1. Intake stroke: the piston starts at the top, the intake valve opens, and the piston
moves down to let the engine take in a fresh charge composed of a mixture of fuel and air (for
a spark-ignition or gasoline engine) or air only (for an auto-ignition or diesel engine).
(Part 1 of the figure.)
2. Compression stroke: the piston then moves back up to compress the fuel/air mixture
(gasoline engines) or the air only (diesel engines). In gasoline engines, combustion is started
by ignition from a spark plug. In diesel engines, auto-ignition occurs when fuel is injected
into the compressed air, which has reached a temperature high enough to cause self-ignition.
(Part 2 of the figure.)
3. Expansion stroke: when the piston reaches the top of its stroke, the combustion process
results in a substantial increase in the gas temperature and pressure and drives the piston
down. (Part 3 of the figure.)
4. Exhaust stroke: once the piston hits the bottom of its stroke, the exhaust valve opens
and the exhaust leaves the cylinder into the exhaust manifold and then into the tail pipe.
Discharge of the burnt gases (exhaust) from the cylinder makes room for the next cycle.
(Part 4 of the figure.)
[Figure 2-1 panels: Intake, Compression, Combustion, Exhaust]
Figure 2-1 Actions of a four-stroke gasoline internal combustion engine - Adapted from
(HowStuffWorks 2005)
Figure 2-1 is a diagrammatic representation of the four strokes of an internal combustion
engine. The upper end of the cylinder consists of a clearance space in which ignition and com-
bustion occur. The expanding medium pushes against the piston head inside the cylinder, caus-
ing the piston to move; this straight line motion of the piston is converted into the desired rotary
motion of the wheels by means of a drivetrain consisting of a connecting rod and crankshaft.
Figure 2-1 illustrates that the only stroke that delivers useful work is the expansion stroke; the
other three strokes are thus termed idle strokes. The reader interested in a detailed description
of the internal combustion engine is referred to specialized texts, such as Heywood (Heywood
1998) and Newton et al. (Newton, et al. 1996).
2.1.2 Comparison with the Gasoline Engine
The diesel engine employs the compression ignition cycle. German engineer Rudolf Die-
sel developed the idea for the diesel engine and received the patent on February 23, 1893. His
goal was to create an engine with high efficiency. Figure 2-2 is a diagrammatic representation
of the four strokes of a diesel engine. The main differences between the gasoline engine and the
diesel engine are:
• A gasoline engine compresses at a ratio of 8:1 to 12:1, while a diesel engine compresses
at a ratio of 14:1 to as high as 25:1. The higher compression ratio of the diesel engine
leads to higher peak combustion temperatures and better fuel efficiency.
• Unlike a gasoline engine, which takes in a mixture of gas and air, compresses it and
ignites the mixture with a spark, a diesel engine takes in just air, compresses it and then
injects fuel into the compressed air. The heat of the compressed air spontaneously ig-
nites the fuel.
• Gasoline engines generally use either carburetion, in which the air and fuel are mixed
long before the air enters the cylinder, or port fuel injection, in which the fuel is injected
just prior to the intake stroke (outside the cylinder), while diesel engines use direct fuel
injection, in which the diesel fuel is injected directly into the cylinder.
Figure 2-2 Actions of a four-stroke diesel engine (HowStuffWorks 2005)
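The link between compression ratio and the heating of the intake air can be illustrated with a rough calculation. The sketch below is not from the source: it assumes ideal, reversible adiabatic compression of air with a constant specific heat ratio of 1.4 and a hypothetical intake temperature of 300 K, so it overstates the temperatures reached in a real engine, where heat is lost to the cylinder walls. The function name is illustrative.

```python
def end_of_compression_temperature_k(intake_temp_k, compression_ratio, gamma=1.4):
    """Ideal adiabatic estimate of air temperature after compression (K).

    Uses T2 = T1 * r**(gamma - 1), where r is the compression ratio and
    gamma is the ratio of specific heats (assumed constant at 1.4 for air).
    """
    if compression_ratio < 1:
        raise ValueError("compression ratio must be at least 1")
    return intake_temp_k * compression_ratio ** (gamma - 1.0)


# Compression ratios spanning the gasoline (8:1 to 12:1) and diesel
# (14:1 to 25:1) ranges quoted in the text; 300 K intake air is an assumption.
for ratio in (8, 12, 14, 25):
    temperature = end_of_compression_temperature_k(300.0, ratio)
    print(f"{ratio:>2}:1 -> about {temperature:.0f} K")
```

Under these assumptions the estimated temperature rises steadily with compression ratio, which is consistent with the statement that a diesel engine relies on the heat of the compressed air to ignite the injected fuel.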
2.2 Diesel Engine Emissions
Like any other internal combustion engine, diesel engines convert the chemical energy
contained in diesel fuel into mechanical power. Diesel fuel is injected under pressure into the
engine cylinder, where it mixes with air and combustion occurs. Diesel fuel is heavier and oilier
than gasoline. Diesel fuel evaporates much more slowly than gasoline, with a boiling point that
is actually higher than that of water. The lean nature of the diesel-air mixture results in a
combustion environment that produces lower emission rates of carbon monoxide (CO) and
hydrocarbons (HC) compared to gasoline-powered engines. However, diesel engines do produce
relatively high emissions of oxides of nitrogen (NOx) and particulate matter (PM), especially
fine particulate matter. This section discusses oxides of nitrogen and particulate emissions in
detail.
2.2.1 Oxides of Nitrogen and Ozone Formation
Oxides of nitrogen, a mixture of nitric oxide (NO) and nitrogen dioxide (NO2), are
produced from the destruction of atmospheric nitrogen (N2) during the combustion process.
Atmospheric air generally consists of 80% N2 and 20% O2, and these gases are stable at
moderate temperatures and pressures. However, under the high temperature and pressure
conditions of combustion, excess oxygen in the combustion chamber reacts with N2 to create NO,
which is quickly transformed into NO2. The role of nitrogen contained in the air in NO
formation was initially postulated by Zeldovich (Zeldovich, et al. 1947). In near-stoichiometric
or lean systems the mechanisms associated with NO formation (as many as 30 or so independent
chemical reactions that also involve participation of hydrocarbon species) can generally be
simplified to the following:
Reaction 1: O2 ⇌ O + O
Reaction 2: O + N2 ⇌ NO + N
Reaction 3: N + O2 ⇌ NO + O
In near-stoichiometric and fuel-rich mixtures, where the concentration of OH radicals can
be high, the following reaction also takes place:
Reaction 4: N + OH ⇌ NO + H
Reaction 4, together with reactions 1, 2 and 3, is known as the extended Zeldovich
mechanism. It is also important to note that emitted nitric oxide (NO) will oxidize to nitrogen
dioxide (NO2) in the atmosphere over a period of a few hours.
Oxides of nitrogen (NOx) are reactive gases that cause a host of environmental concerns
and adversely affect human health and welfare. Nitrogen dioxide (NO2), in particular, is a
brownish gas that has been linked with higher susceptibility to respiratory infection, increased
airway resistance in asthmatics, and decreased pulmonary function. Most importantly, NOx
emitted from heavy-duty vehicles plays a major role in the formation of ground level ozone
pollution, which causes wide-ranging damage to human health and the environment (U.S. EPA
1995). Ozone is a colorless, highly reactive gas with a distinctive odor. Ozone is formed
naturally by electrical discharge (lightning) and in the upper atmosphere at altitudes between
15 and 35 km. Stratospheric ozone protects the Earth from harmful ultraviolet radiation from the
sun. However, ground level ozone is formed by chemical reactions involving NOx and volatile
organic compounds (VOCs) combining in the presence of heat and sunlight. These two categories
of pollutants are also referred to as ozone precursors. The production of photochemical
oxidants usually occurs over several hours, which means that the highest concentrations of ozone
normally occur on summer afternoons, in areas downwind of major sources of ozone precursors.
The simplified reaction processes are illustrated as:
NO2 + VOC + sunlight (UV) => NO2 + O2 + sunlight (UV) => NO + O3
At ground level, elevated ozone concentrations can cause health and environmental
problems. Ozone can affect the human cardiac and respiratory systems, irritating the eyes, nose,
throat, and lungs. Symptoms of ozone exposure include itchy and watery eyes, sore throats,
swelling within the nasal passages and nasal congestion. Effects from ozone are experienced
only for the period of exposure to elevated levels. EPA promulgated 8-hour ozone standards in
1997 and designated an area as nonattainment if it has violated, or has contributed to violations
of, the national 8-hour ozone standard over a three-year period.
2.2.2 Fine Particulate Matter (PM2.5)
Particulate matter (PM) is a complex mixture of solid and liquid particles (excluding
water) that are suspended in air. These particles typically consist of a mixture of inorganic
and organic chemicals, including carbon, sulfates, nitrates, metals, acids, and semivolatile
compounds. The size of PM in air ranges from approximately 0.005 to 100 micrometers (µm) in
aerodynamic diameter, from the size of just a few atoms to about the thickness of a human hair.
U.S. EPA defined three general categories for PM as coarse (10 to 2.5 µm), fine (2.5 µm or
smaller), and ultrafine (0.1 µm or smaller).
Heavy-duty diesel vehicles are known to emit large quantities of small particles
(Kittelson, et al. 1978). A majority of the PM found in diesel exhaust is in the nanometer size
range. Lloyd found that more than 90% of fine particles from heavy-duty vehicles are smaller
than 1 µm in diameter (Lloyd and Cackette 2001).
Fine PM can not only cause human health problems and property damage, but also adversely
affect the environment by reducing visibility and retarding plant growth (Davis, et al. 1998).
Health studies have shown a significant association between exposure to fine particles
and premature death from heart or lung diseases. Other important effects include aggravation of
respiratory and cardiovascular disease, lung disease, decreased lung function, or asthma attacks.
Individuals particularly sensitive to fine particle exposure include older adults, people with
heart and lung disease, and children (U.S. EPA 2005). EPA promulgated the PM2.5 standard in
1997 and included a 24-hour standard for PM2.5 set at 65 micrograms per cubic meter (µg/m3),
and an annual standard of 15 µg/m3.
2.3 Heavy-Duty Diesel Vehicle Emission Regulations
2.3.1 National Ambient Air Quality Standards
The Clean Air Act, which was last amended in 1990, requires the U.S. EPA to set National
Ambient Air Quality Standards (NAAQS) to safeguard public health against six common
air pollutants: ozone (O3), particulate matter (PM), sulfur dioxide (SO2), carbon monoxide (CO),
nitrogen dioxide (NO2) and lead (Pb). The Clean Air Act established two types of national air
quality standards. Primary standards set limits to protect public health, including the health of
"sensitive" populations such as asthmatics, children, and the elderly. Secondary standards set
limits to protect public welfare, including protection against decreased visibility, damage to
animals, crops, vegetation, and buildings (CFR 2004a). Table 2-1 illustrates the current NAAQS
for ambient concentrations of various pollutants. Units of measure for the standards are parts per
million by volume (ppmv), milligrams per cubic meter of air (mg/m3), and micrograms per cubic
meter of air (µg/m3).
Table 2-1. National Ambient Air Quality Standards (U.S. EPA 2006)
Pollutant               Averaging Time          Standard Value            Standard Type
Carbon Monoxide (CO)    8-hour Average          9 ppmv (10 mg/m3)         Primary
                        1-hour Average          35 ppmv (40 mg/m3)        Primary
Nitrogen Dioxide (NO2)  Annual Arithmetic Mean  0.053 ppmv (100 µg/m3)    Primary & Secondary
Ozone (O3)              1-hour Average          0.12 ppmv (235 µg/m3)     Primary & Secondary
                        8-hour Average          0.08 ppmv (157 µg/m3)     Primary & Secondary
Lead (Pb)               Quarterly Average       1.5 µg/m3                 Primary & Secondary
Particulate (PM10)      Annual Arithmetic Mean  50 µg/m3                  Primary & Secondary
                        24-hour Average         150 µg/m3                 Primary & Secondary
Particulate (PM2.5)     Annual Arithmetic Mean  15 µg/m3                  Primary & Secondary
                        24-hour Average         65 µg/m3                  Primary & Secondary
Sulfur Dioxide (SO2)    Annual Arithmetic Mean  0.030 ppmv (80 µg/m3)     Primary
                        24-hour Average         0.14 ppmv (365 µg/m3)     Primary
                        3-hour Average          0.50 ppmv (1300 µg/m3)    Secondary
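The paired volume and mass concentrations in Table 2-1 are related by a simple ideal gas conversion, which the following sketch illustrates. It is an illustration only and assumes reference conditions of 25 °C and 1 atm, at which one mole of an ideal gas occupies about 24.45 liters; the function names are hypothetical, and the molecular weights are rounded standard values.

```python
# Molar volume of an ideal gas at 25 °C and 1 atm (assumed reference conditions).
MOLAR_VOLUME_L_PER_MOL = 24.45

# Rounded molecular weights in g/mol.
MOLECULAR_WEIGHT_G_PER_MOL = {
    "CO": 28.01,
    "NO2": 46.01,
    "O3": 48.00,
    "SO2": 64.07,
}


def ppmv_to_mg_per_m3(ppmv, pollutant):
    """Convert a gas concentration from ppmv to mg/m3."""
    return ppmv * MOLECULAR_WEIGHT_G_PER_MOL[pollutant] / MOLAR_VOLUME_L_PER_MOL


def ppmv_to_ug_per_m3(ppmv, pollutant):
    """Convert a gas concentration from ppmv to micrograms per cubic meter."""
    return 1000.0 * ppmv_to_mg_per_m3(ppmv, pollutant)


# Compare against the rounded values listed in Table 2-1.
print(f"CO, 9 ppmv: {ppmv_to_mg_per_m3(9, 'CO'):.1f} mg/m3")
print(f"NO2, 0.053 ppmv: {ppmv_to_ug_per_m3(0.053, 'NO2'):.0f} ug/m3")
print(f"O3, 0.08 ppmv: {ppmv_to_ug_per_m3(0.08, 'O3'):.0f} ug/m3")
```

The results agree with the tabulated mass concentrations to within rounding, which suggests the table values were derived under similar reference conditions.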
2.3.2 Heavy-Duty Engine Certification Standards
Heavy-duty vehicles are defined as vehicles of GVWR (gross vehicle weight rating)
above 8,500 lbs in the federal jurisdiction and above 14,000 lbs in California (model year 1995
and later). Diesel engines used in heavy-duty vehicles are further divided into service classes by
GVWR, as follows:
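• Light heavy-duty diesel engines: 8,500 < LHDDE < 19,500
• Medium heavy-duty diesel engines: 19,500 ≤ MHDDE ≤ 33,000
• Heavy heavy-duty diesel engines: HHDDE > 33,000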
Under the federal light-duty Tier 2 regulation (phased in beginning 2004), vehicles of
GVWR up to 10,000 lbs used for personal transportation have been re-classified as
"medium-duty passenger vehicles" (MDPV - primarily larger SUVs and passenger vans) and are
subject to the light-duty vehicle legislation. Thus, the same diesel engine model used for the
8,500-10,000 lbs vehicle category may be classified as either light- or heavy-duty and
certified to different standards, depending on the manufacturer-defined application (CFR
2004b). Except for the heavy-duty vehicles classified as LDVs, all heavy-duty vehicle emissions
standards are established using the engine dynamometer certification process.
2.3.3 Heavy-Duty Engine Emission Regulations
EPA regulates heavy-duty vehicle emissions for compliance with emissions standards
over the useful life of the engine. Useful life is defined as follows (U.S. EPA and California)
(CFR 2004c):
LHDDE - 8 years/110,000 miles (whichever occurs first)
MHDDE - 8 years/185,000 miles
HHDDE - 8 years/290,000 miles
Federal useful life requirements were later increased to 10 years, with no change to
the above mileage numbers, for the urban bus PM standard (1994+) and for the NOx standard
(1998+). The emission warranty period is 5 years/100,000 miles (5 years/100,000 miles/3,000
hours in California), but no less than the basic mechanical warranty for the engine family. Table
2-2 shows the heavy-duty engine emissions standards by model year group.
Table 2-2. Heavy-Duty Engine Emissions Standards (U.S. EPA 1997)
Year
HC (g/bhp-hr)
CO (g/bhp-hr)
N
-------
CHAPTER 3
3. HEAVY-DUTY DIESEL VEHICLE EMISSIONS MODELING
Several models are currently used to estimate emissions from heavy-duty vehicles. A
comprehensive review of the existing heavy-duty vehicle emission models will help modelers
understand the different approaches and how they can contribute to the development of enhanced
emission rate modeling techniques.
The most common emission rate models are VMT-based or cycle-based models developed from
laboratory driving cycle test data. Fuel-based models estimate emissions as a function of
fuel usage rate as well as other parameters. In the 1990s, even the proposed enhanced modal
models, designed to predict emissions as a function of vehicle speed and acceleration profiles,
were still based upon statistical analysis of cycle-based data (Bachman 2000; Fomunung
2000). More recent emission rate modeling frameworks propose to model modal emission
rates on a second-by-second basis directly from the vehicle operating mode.
3.1 VMT-Based Vehicle Emission Models
The current emission rate models used by state and federal agencies include the Mobile
Source Emission Model (MOBILE) series of models developed by the U.S. Environmental
Protection Agency (U.S. EPA) and the Emission Factor Emission Inventory Model (EMFAC) series
developed by the California Air Resources Board (CARB).
3.1.1 MOBILE
MOBILE (U.S. EPA 1993), developed by the U.S. EPA in the late 1970s to estimate
vehicle emissions, has since become the nation's standard for assessing the emission impacts of
various transportation inputs. MOBILE uses the method of base emission rates and correction
factors. This model has undergone significant expansion and improvement over the years. The
latest version is MOBILE6, released in February 2002 (U.S. EPA 2002a).
-------
MOBILE is based on engine dynamometer test data from selected driving cycles. The
Federal Test Procedure (FTP) transient cycle is composed of a unique profile of stops, starts,
constant speed cruises, accelerations, and decelerations. Different driving cycles are developed
to simulate both urban and freeway driving. A concern with driving cycles is that they may not
be sufficiently representative of real-world emissions (Kelly and Groblicki 1993; Denis et al.
1994). For HDV emission rates, MOBILE combines base emission rates with conversion
factors that convert the g/bhp-hr emissions estimates observed in the laboratory to g/mile
emission rates, to be consistent with available travel information. These conversion factors
contribute a large source of uncertainty to the MOBILE model since the BSFC (brake specific
fuel consumption) data are aggregated for the fleet and may not represent in-use vehicle
characteristics (Guensler et al. 1991). The accuracy of the conversion factors improved in
MOBILE6 due to improved data, but fundamental flaws remain (Guensler et al. 2006).
3.1.1.1 Diesel Engine Test Cycles
EPA currently uses the transient Federal Test Procedure (FTP) engine dynamometer
cycle, which includes both engine cold and warm start operations, for heavy-duty vehicles (CFR
Title 40, Part 86.1333). Unlike the chassis dynamometer test for light-duty vehicles, the engine is
removed from the vehicle's chassis, mounted on the engine dynamometer test stand, and
operated over the transient FTP test cycle. The transient cycle (Figure 3-1) consists of four phases: the
first is a NYNF (New York Non Freeway) phase typical of light urban traffic with frequent stops
and starts, the second is LANF (Los Angeles Non Freeway) phase typical of crowded urban
traffic with few stops, the third is a LAFY (Los Angeles Freeway) phase simulating crowded
expressway traffic in Los Angeles, and the fourth phase repeats the first NYNF phase. This cycle
consists of a cold start after parking overnight, followed by idling, acceleration and deceleration
phases, and a wide variety of different speeds and loads sequenced to simulate the running of the
vehicle that corresponds to the engine being tested. There are few stabilized running conditions,
and the average load factor is about 20 to 25% of the maximum horsepower available at a given
speed.
Emission and operation parameters are measured while the engine operates during the
test cycle. The engine torque is determined by applying performance percentages with an engine
lug curve (maximum torque curve). Engine torque is then converted to engine brake horsepower
using engine revolution per minute (RPM). Brake specific emissions rates are reported in g/
bhp-hr and then converted to g/mile using pre-defined conversion factors (CFR Title 40, Part
86.1342-90).
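The torque-to-horsepower step described above can be sketched as follows. This is a minimal illustration, not the regulatory calculation: the function names, the 1 Hz record layout, and the sample torque, speed, and emission values are all hypothetical. The only fixed relationship used is the standard definition of horsepower (33,000 ft-lbf per minute).

```python
import math


def torque_from_percent(percent_of_max, max_torque_lb_ft):
    """Torque (lb-ft) as a percentage of the lug-curve (maximum) torque.

    max_torque_lb_ft is the lug-curve value at the current engine speed;
    how that value is looked up is outside this sketch.
    """
    return percent_of_max / 100.0 * max_torque_lb_ft


def brake_horsepower(torque_lb_ft, engine_rpm):
    """bhp = torque x 2*pi x rpm / 33,000 (about torque x rpm / 5252)."""
    return torque_lb_ft * 2.0 * math.pi * engine_rpm / 33_000.0


def brake_specific_rate(grams_per_second, torque_lb_ft, engine_rpm):
    """Cycle brake-specific rate (g/bhp-hr) from 1 Hz records.

    Assumes one record per second and non-negative torque throughout.
    """
    total_grams = sum(grams_per_second)
    work_bhp_hr = sum(
        brake_horsepower(t, n) for t, n in zip(torque_lb_ft, engine_rpm)
    ) / 3600.0
    return total_grams / work_bhp_hr


# Hypothetical one-hour steady test: 500 lb-ft at 2000 rpm, 0.2 g/s of pollutant
seconds = 3600
rate = brake_specific_rate([0.2] * seconds, [500.0] * seconds, [2000.0] * seconds)
print(f"{brake_horsepower(500.0, 2000.0):.1f} bhp, {rate:.2f} g/bhp-hr")
```

The resulting g/bhp-hr value is what a conversion factor would later turn into g/mile.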
-------
[Figure 3-1 phase labels: NYNF, LANF, LAFY, NYNF]
Figure 3-1 FTP Transient Cycle (DieselNet 2006)
Because the engine dynamometer test procedure does not directly account for the impacts
of load and grade changes, a chassis dynamometer test procedure and cycle, known as the
HDV urban dynamometer driving schedule (HDV-UDDS) and sometimes referred to as "cycle D",
was developed [CFR Title 40, Part 86, App. I]. This cycle is different from the UDDS cycle for
light-duty vehicles (FTP-72). The HDV cycle lasts 1060 seconds and covers 5.55 miles. The
average speed for the HDV UDDS is 18.86 mph while the maximum speed is 58 mph. Figure 3-2
shows the speed profile for the chassis UDDS test.
[Figure 3-2 plot: speed profile versus time (s), covering roughly 0 to 1000 s]
Figure 3-2 Urban Dynamometer Driving Schedule Cycle for Heavy-Duty Vehicle (DieselNet 2006)
-------
3.1.1.2 Baseline Emission Rates
Baseline emission rates (g/bhp-hr) for heavy-duty vehicles are obtained from the engine
dynamometer test results collected during U.S. EPA's cooperative test program with engine
manufacturers. The zero mile levels and deterioration rates for NOx, CO, and HC are presented
in the following tables for heavy-duty gasoline and diesel engines. All the emission rates are
available from "Update of Heavy-Duty Emission Levels (Model Years 1998-2004+) for Use in
MOBILE6" (Lindhjem and Jackson 1999).
Table 3-1. Heavy-Duty Vehicle NOx Emission Rates in MOBILE6
              Zero Mile Level (g/bhp-hr)            Deterioration (g/bhp-hr/10,000 miles)
Model Year    Gasoline   Diesel Engine              Gasoline   Diesel Engine
Class         Engine     Heavy    Med.    Light     Engine     Heavy    Med.     Light
1988-1989     4.96       6.28     6.43    4.34      0.044      0.01     0.009    0.002
1990          3.61       4.85     4.85    4.85      0.026      0.004    0.006    0.011
1991-1993     3.24       4.56     4.53    1.38      0.038      0.004    0.007    0.003
1994-1997     3.24       4.61     4.61    1.08      0.038      0.003    0.001    0.001
1998-2003     2.59       3.68     3.69    3.26      0.038      0.003    0.001    0.001
2004+         2.59       1.84     1.84    1.63      0.038      0.003    0.001    0.001
Table 3-2 Heavy-Duty Vehicle CO Emission Rates in MOBILE6
              Zero Mile Level (g/bhp-hr)            Deterioration (g/bhp-hr/10,000 miles)
Model Year    Gasoline   Diesel Engine              Gasoline   Diesel Engine
Class         Engine     Heavy    Med.    Light     Engine     Heavy    Med.     Light
1988-1989     13.84      1.34     1.70    1.21      0.246      0.008    0.018    0.022
1990          6.89       1.81     1.81    1.81      0.213      0.005    0.007    0.012
1991-1993     7.10       1.82     1.26    0.40      0.255      0.003    0.010    0.004
1994-1997     7.10       1.07     0.85    1.19      0.255      0.004    0.009    0.003
1998-2003     7.10       1.07     0.85    1.19      0.255      0.004    0.009    0.003
2004+         7.10       1.07     0.85    1.19      0.255      0.004    0.009    0.003
Table 3-3 Heavy-Duty Vehicle HC Emission Rates in MOBILE6
              Zero Mile Level (g/bhp-hr)            Deterioration (g/bhp-hr/10,000 miles)
Model Year    Gasoline   Diesel Engine              Gasoline   Diesel Engine
Class         Engine     Heavy    Med.    Light     Engine     Heavy    Med.     Light
1988-1989     0.62       0.47     0.66    0.64      0.023      0.001    0.002    0.002
1990          0.35       0.52     0.52    0.52      0.023      0.000    0.001    0.001
1991-1993     0.33       0.30     0.40    0.47      0.021      0.000    0.001    0.001
1994-1997     0.33       0.22     0.31    0.26      0.021      0.001    0.001    0.001
1998-2003     0.33       0.22     0.31    0.26      0.021      0.001    0.001    0.001
2004+         0.33       0.22     0.31    0.26      0.021      0.001    0.001    0.001
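As an illustration of how a zero mile level and a deterioration rate might be combined, the sketch below assumes simple linear deterioration with accumulated mileage. The linear form and the function name are assumptions made for this example and are not taken from the MOBILE6 documentation; the two input values are the heavy diesel engine NOx entries for model years 1998-2003 in Table 3-1.

```python
def emission_rate(zero_mile_level, deterioration_rate, mileage):
    """Brake-specific emission rate (g/bhp-hr) at a given odometer reading.

    Assumes linear deterioration: ZML + DR x (mileage / 10,000 miles).
    """
    return zero_mile_level + deterioration_rate * (mileage / 10_000.0)


# NOx, heavy diesel engine, model years 1998-2003 (Table 3-1)
NOX_ZML = 3.68      # g/bhp-hr
NOX_DR = 0.003      # g/bhp-hr per 10,000 miles

rate_at_150k = emission_rate(NOX_ZML, NOX_DR, 150_000)
print(f"{rate_at_150k:.3f} g/bhp-hr")  # 3.68 + 0.003 x 15
```

Under this assumption, 150,000 miles of accumulated travel adds only 0.045 g/bhp-hr to the zero mile level, which shows how small the tabulated diesel deterioration rates are relative to the base rates.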
-------
3.1.1.3 Conversion Factors
Because emission standards for both gasoline and diesel heavy-duty vehicles are ex-
pressed in terms of grams per brake-horsepower hour (g/bhp-hr), the MOBILE6.2 model em-
ploys conversion factors of brake horsepower-hour per mile (bhp-hr/mile) to convert the emis-
sion certification data from engine testing to grams per mile. Conversion factors are a function
of fuel density, brake-specific fuel consumption (BSFC), and fuel economy for each HDV class
(U.S. EPA 2002b). The conversion factors were calculated using Equation 3-1:
$$\text{Conversion Factor (bhp-hr/mi)} = \frac{\text{Fuel Density (lb/gal)}}{\text{BSFC (lb/bhp-hr)} \times \text{Fuel Economy (mi/gal)}} \qquad \text{(Equation 3-1)}$$
To calculate BSFC, U.S. EPA first obtained data for model years 1987 through 1996
supplied by six engine manufacturers (U.S. EPA 2002d). U.S. EPA then performed regression
analysis for BSFCs by model year for each weight class and used a logarithmic curve to extrapolate
values prior to 1988 and after 1995, since sales data were only available for model years 1988
through 1995 (U.S. EPA 2002d).
Fuel economy was calculated using a regression curve derived from the 1992 Truck
Inventory and Use Survey (TIUS) conducted by the U.S. Census Bureau. Fuel densities were
determined from National Institute for Petroleum and Energy Research (NIPER) publications
for both gasoline and diesel (Browning 1998). Using the equation defining the conversion factor
together with the data described above, weight class specific conversion factors were calculated
for gasoline and diesel vehicles for model years 1987 through 1996 (U.S. EPA 2002c).
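A short sketch of Equation 3-1 follows. The function names and the input values (fuel density, BSFC, fuel economy) are illustrative placeholders chosen for this example only; they are not the values used in MOBILE6. The emission rate of 3.68 g/bhp-hr is the heavy diesel NOx zero mile level for 1998-2003 from Table 3-1.

```python
def conversion_factor(fuel_density_lb_per_gal, bsfc_lb_per_bhp_hr,
                      fuel_economy_mi_per_gal):
    """Equation 3-1: bhp-hr of engine work per mile traveled."""
    return fuel_density_lb_per_gal / (bsfc_lb_per_bhp_hr * fuel_economy_mi_per_gal)


def grams_per_mile(rate_g_per_bhp_hr, cf_bhp_hr_per_mi):
    """Convert a brake-specific rate (g/bhp-hr) to a travel-based rate (g/mi)."""
    return rate_g_per_bhp_hr * cf_bhp_hr_per_mi


# Illustrative inputs only: 7.1 lb/gal diesel, 0.40 lb/bhp-hr BSFC, 6.0 mi/gal
cf = conversion_factor(7.1, 0.40, 6.0)
nox_g_per_mi = grams_per_mile(3.68, cf)
print(f"CF = {cf:.3f} bhp-hr/mi, NOx = {nox_g_per_mi:.2f} g/mi")
```

Because the conversion factor multiplies the emission rate directly, any error in the fleet-aggregated BSFC or fuel economy carries through proportionally to the g/mile estimate.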
3.1.2 EMFAC
EMFAC (CARB 2007) was developed by CARB separately from MOBILE based upon
the presence of vehicle technologies in the on-road fleet that would be subject to the more stringent
standards and fuels used in California. The latest version, EMFAC 2002, was released in
September 2002. EMFAC can estimate emissions for calendar years 1970 to 2040.
EMFAC abandoned the use of conversion factors from EMFAC 2000 and used chassis
dynamometer data collected for 70 trucks tested over the Urban Dynamometer Driving Schedule
(UDDS). Although the use of UDDS test data marked a significant improvement, it is hard to
say that the UDDS adequately represents the full range of heavy-duty diesel operation. Although
the cycle was constructed from actual truck activity data, it lacks the extended cruises known to
cause many trucks to default to a high NOx emitting, fuel saving mode referred to as "Off-Cycle"
NOx. The cycle also lacks the hard accelerations known to result in high emissions of particulate
matter (CARB 2002).
CARB continues to develop additional modal test cycles designed to better depict the
emissions of HDDVs under real-world conditions, including emissions from engine programming
to go "off-cycle" at certain speeds. Activity data from instrumented truck studies conducted by
Battelle and Jack Faucett Associates for CARB (CARB 2002) have been used to develop a
four-mode heavy heavy-duty diesel cycle. Figure 3-3 shows these four mode cycles developed by
CARB. The creep mode produced the greatest gram per mile results, followed by the transient
and the cruise modes. The transient and cruise modes produced higher and lower emissions,
respectively, than the UDDS (CARB 2002).
[Figure 3-3 panel labels: Creep, Cruise, Idle, Transient]
Figure 3-3 CARB's Four Mode Cycles (CARB 2002)
3.1.3 Summary
EPA's MOBILE series models have improved significantly through a series of model
revisions since the 1970s. However, the MOBILE series of models still has major modeling
defects for the heavy-duty components. These defects have been widely recognized for more than
10 years (Guensler et al. 1991). One of the most frequently stated defects is that vehicle emission
rates are characterized by fleet average speed, which aggregates other vehicle activity factors and
may therefore yield significant bias in emissions characterization.
In developing emissions inventories using the MOBILE and EMFAC (CARB 2007)
emission rate models, vehicle activity is estimated using travel demand models. The
estimation of VMT was based on EPA's fleet characterization study (U.S. EPA 1998). It is common to
estimate heavy-duty travel as a fixed percentage of predicted traffic volumes (TRB 1995). This
estimate is not correct since heavy-duty truck travel does not follow the same spatial and
temporal patterns as light-duty vehicle travel (Schlappi et al. 1993).
3.2 Fuel-Based Vehicle Emission Models
The fuel-based emission inventory models for heavy-duty diesel trucks combine vehicle
activity data (i.e., volume of diesel fuel consumed) with emission rates normalized to fuel
consumption (i.e., mass of pollutant emitted per unit volume of fuel burned) to estimate emissions
within a region of interest (Dreher and Harley 1998). This approach was proposed to increase
the accuracy of truck VMT estimation by combining state-level truck VMT with statewide fuel sales
to estimate total heavy-duty truck activity, using the amount of fuel consumed as a measure of
activity.
In California, fuel consumption data are available through tax records at the statewide
level and this statewide fuel consumption can be apportioned to provide emission estimates for
an individual air basin by month, day of week, and time of day. At the same time, emission rates
are normalized to fuel consumption using Equation 3-2:
$$EI_P = \frac{S_P}{BSFC} \qquad \text{(Equation 3-2)}$$
where: $EI_P$: emission index for pollutant P, in units of mass of pollutant emitted
per unit mass of fuel burned;
$S_P$: brake specific pollutant emission rate obtained from the dynamometer
test, expressed in g/bhp-hr units;
$BSFC$: brake specific fuel consumption of the engine being tested, also in
g/bhp-hr.
Exhaust emissions are estimated by multiplying vehicle activity, as measured by the
volume of fuel used, by emission rates which are normalized to fuel consumption and expressed as
grams of pollutant emitted per gallon of diesel fuel burned instead of grams of pollutant per mile
(Dreher and Harley 1998). Average emission rates for subgroups of vehicles are weighted by
the fraction of total fuel used by each vehicle subgroup to obtain an overall fleet-average
emission rate. The fleet-average emission rate is multiplied by regional fuel sales to compute
pollutant emissions (Singer and Harley 1996).
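The fuel-based calculation chain described above (emission index, fuel-weighted fleet average, regional total) can be sketched as follows. All function names and numeric inputs are hypothetical and serve only to show the arithmetic; the two-subgroup fleet and the fuel sales figure are invented for the example.

```python
def emission_index(s_g_per_bhp_hr, bsfc_g_per_bhp_hr):
    """Equation 3-2: grams of pollutant per gram of fuel burned."""
    return s_g_per_bhp_hr / bsfc_g_per_bhp_hr


def rate_per_gallon(ei_g_per_g, fuel_density_g_per_gal):
    """Convert a mass-based emission index to grams of pollutant per gallon."""
    return ei_g_per_g * fuel_density_g_per_gal


def fleet_average_rate(subgroups):
    """Fuel-weighted fleet-average rate (g/gal).

    subgroups: list of (fraction_of_total_fuel, rate_g_per_gal) pairs whose
    fractions must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in subgroups)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("fuel fractions must sum to 1")
    return sum(fraction * rate for fraction, rate in subgroups)


def regional_emissions_kg(fleet_rate_g_per_gal, fuel_sales_gal):
    """Regional emissions (kg) = fleet-average rate x fuel sold."""
    return fleet_rate_g_per_gal * fuel_sales_gal / 1000.0


# Hypothetical inputs: two vehicle subgroups and one million gallons of diesel
ei = emission_index(4.0, 200.0)                        # 0.02 g pollutant / g fuel
fleet_rate = fleet_average_rate([(0.75, 80.0), (0.25, 40.0)])
total_kg = regional_emissions_kg(fleet_rate, 1_000_000)
print(f"EI = {ei:.3f} g/g, fleet rate = {fleet_rate:.1f} g/gal, total = {total_kg:.0f} kg")
```

Weighting by fuel share rather than by vehicle count means that subgroups burning more fuel dominate the fleet average, which is the intent of the approach.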
The advantages of the fuel-based approach include the fact that fuel-use data are
available from tax records in California. Furthermore, emission rates normalized to fuel consumption
vary considerably less over the full range of driving conditions than travel-normalized emission
factors (Singer and Harley 1996). The disadvantages are also clear. Comparable tax records are not
available for other states, so input data are difficult to obtain outside of California, limiting the
scope of the modeling approach. Furthermore, users must run two models, first to predict fuel use
and then to predict emission rates, which is not statistically efficient.
3.3 Modal Emission Rate Models
Modal emission rate models work on the premise that emissions are better modeled as a
function of specific modes of vehicle operation (idle, steady-state cruise, various levels of
acceleration/deceleration, etc.) than as a function of average vehicle speed (Bachman 1998;
Ramamurthy et al. 1998; U.S. EPA 2001b). Emissions from heavy-duty vehicles powered by diesel cycle
engines are more likely to be a function of engine brake work output than emissions from normal
gasoline vehicles, because instantaneous emissions levels of diesel engines are highly correlated with the
instantaneous work output of the engine (U.S. EPA 2001b).
With the consideration of vehicle modal activity, EPA and various research communities
have been developing modal activity-based emission models. The report published by National
Research Council (NRC 2000) comprehensively reviewed the modeling of mobile source emis-
sions and provided recommendations for the improvement of future mobile source emission
models. The following sections will introduce the most representative modal emission models
one by one.
3.3.1 CMEM
The Comprehensive Modal Emissions Model (CMEM) (Barth et al. 2000) was developed
by the Center for Environmental Research and Technology at the University of California, Riverside
(CE-CERT). Development of CMEM was first funded by the National Cooperative Highway
Research Program (1995-2000) and has since been enhanced and improved with EPA funding
(2000-present). Beginning in 2001, CE-CERT created a modal-based inventory at the micro-
(intersection), meso- (highway link), and macro- (region) scale levels for light-duty vehicles (LDV) and
heavy-duty diesel (HDD) vehicles. The CMEM model derives a fuel rate from road-load and a
simple powertrain model. Emissions rates are then derived empirically from the fuel rate. Fuel
rate, or fuel consumption per unit time, forms the basis for CMEM.
The CMEM HDD emissions model (Barth et al. 2004) adopted the same approach as the
light-duty vehicle model. In that model, second-by-second tailpipe emissions are modeled as the
product of three components: fuel rate (FR), engine-out emission indices (grams of emissions/
gram of fuel), and an emission after-treatment pass fraction. The model is composed of six
modules: 1) engine power demand; 2) engine speed; 3) fuel rate; 4) engine control unit; 5) engine-out
emissions; and 6) after-treatment pass fraction. The vehicle power demand is determined based
on operating variables [second-by-second vehicle speed (from which acceleration can be derived,
although acceleration can also be supplied as a separate input variable), grade, and accessory use
(such as air conditioning)] and specific vehicle parameters (vehicle mass, engine displacement,
cross-sectional area, aerodynamics, vehicle accessory load, transmission efficiency,
drive-train efficiency, and so on). The core of the model is the fuel rate calculation, which is a function of
power demand and engine speed. Engine speed is determined based on vehicle velocity, gear
shift schedule, and power demand (Barth et al. 2004). The model uses a total of 35 parameters to
estimate vehicle tailpipe emissions.
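The three-component product described above can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not CMEM itself: the function names, the constant emission index, the constant pass fraction, and the fuel-rate trace are hypothetical, and the real model derives each component from its own module.

```python
def tailpipe_rate(fuel_rate_g_per_s, engine_out_index_g_per_g, pass_fraction):
    """Tailpipe emissions (g/s) = fuel rate x engine-out index x pass fraction."""
    return fuel_rate_g_per_s * engine_out_index_g_per_g * pass_fraction


def trip_emissions(fuel_rates_g_per_s, engine_out_index_g_per_g, pass_fraction):
    """Sum second-by-second tailpipe emissions (g) over a 1 Hz fuel-rate trace."""
    return sum(
        tailpipe_rate(fr, engine_out_index_g_per_g, pass_fraction)
        for fr in fuel_rates_g_per_s
    )


# Hypothetical 4-second trace with a constant index and pass fraction
fuel_trace = [2.0, 3.5, 3.0, 1.5]          # g fuel per second
total_g = trip_emissions(fuel_trace, 0.02, 0.5)
print(f"{total_g:.3f} g over {len(fuel_trace)} s")
```

A pass fraction of 1.0 would represent an engine with no after-treatment, in which case tailpipe emissions equal engine-out emissions.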
3.3.2 MEASURE
The Mobile Emissions Assessment System for Urban and Regional Evaluation (MEA-
SURE) (Bachman et al. 2000) model was developed by Georgia Institute of Technology in the
late 1990s. The MEASURE model is developed within a geographic information system (GIS)
and employs modal emission rates, varying emissions according to vehicle technologies and
modal operation (cruise, acceleration, deceleration, idle). The model emission rate database
consists of more than 13,000 laboratory tests conducted by the EPA and CARB using standard-
ized test cycle conditions and alternative cycles (Bachman 1998). The aggregate modal model
within MEASURE employs emission rates based on theoretical engine-emissions relationships.
The relationships are dependent on both modal and vehicle technology variables, and they are
"aggregate" in the sense that they rely on bag data to derive their modal activities (Washington
et al. 1997a). Emission rates were statistically derived from the emission rate data as a function
of operating mode power demand surrogates. The model uses statistical techniques to predict
emission rates using a process that utilizes the best aspects of hierarchical tree-based regression
(HTBR) and ordinary least squares regression (OLS) (Breiman et al. 1984). HTBR is used to
reduce the number of predictor variables to a manageable number, and to identify useful interac-
tions among the variables; then OLS regression techniques are applied until a satisfactory model
is obtained (Fomunung et al. 2000). Vehicle activity variables include average speed, accel-
eration rates, deceleration rates, idle time, and surrogates for power demand. The MEASURE
model for light-duty vehicles was completed in 2000.
MEASURE provides the following benefits since it has been developed under the GIS
platform (Bachman et al. 2000): 1) manages topographical parameters that affect emissions;
2) calculates emissions from vehicle modal activities; 3) allows a 'layered' approach to
individual vehicle activity estimation; and 4) aggregates emission estimates into grid cells for use in
photochemical air quality models.
3.3.3 MOVES
To keep pace with new analysis needs, modeling approaches, and data, the U.S. EPA's
Office of Transportation and Air Quality (OTAQ) is developing a modeling system termed
MOVES (Koupal et al. 2004; U.S. EPA 2001a). This new system will estimate emissions for
on-road and non-road sources, cover a broad range of pollutants, and allow multiple scale analysis,
from fine-scale analysis to national inventory estimation. In the future, MOVES will serve as
the replacement for MOBILE6 and NONROAD (U.S. EPA 2001a). This project was previously
known as the New Generation Mobile Source Emissions Model (NGM) (U.S. EPA 2001a).
The current plan for MOVES is to use vehicle specific power (VSP) as a variable on
which emission rates can be based (Koupal et al. 2002). The VSP approach to emissions
characterization was developed by Jimenez-Palacios (1999). VSP is a function of
speed, acceleration, road grade, etc., as shown in Equation 3-3:
$$VSP = v \times \left(a \times (1 + \varepsilon) + g \times grade + g \times C_R\right) + 0.5\,\rho \times C_D \times A \times v^3 / m \qquad \text{(Equation 3-3)}$$
where: $v$: vehicle speed (assuming no headwind) (m/s)
$a$: vehicle acceleration (m/s^2)
$\varepsilon$: mass factor accounting for the rotational masses (~0.1), constant
$g$: acceleration due to gravity (m/s^2)
$grade$: road grade (ratio of rise to run)
$C_R$: rolling resistance (~0.0135)
$\rho$: air density (1.2)
$C_D$: aerodynamic drag coefficient (dimensionless)
$A$: the frontal area (m^2)
$m$: vehicle mass (metric tons)
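A minimal sketch of Equation 3-3 is given below. One unit assumption is made explicit here and is not stated in the text: for the aerodynamic term to share units with the first term (W/kg, numerically equal to kW per metric ton), the sketch divides by vehicle mass in kilograms, so the mass supplied in metric tons is converted first. The drag coefficient, frontal area, and mass in the example are hypothetical bus-like values.

```python
G = 9.81  # acceleration due to gravity (m/s^2)


def vehicle_specific_power(speed, accel, grade, drag_coeff, frontal_area,
                           mass_tonnes, mass_factor=0.1, rolling_coeff=0.0135,
                           air_density=1.2):
    """VSP in kW per metric ton (numerically W/kg), after Equation 3-3.

    speed in m/s, accel in m/s^2, grade as rise/run, frontal_area in m^2.
    Assumption: the aerodynamic term is divided by mass in kilograms so that
    both terms are expressed in W/kg.
    """
    mass_kg = mass_tonnes * 1000.0
    road_term = speed * (accel * (1.0 + mass_factor)
                         + G * grade
                         + G * rolling_coeff)
    aero_term = 0.5 * air_density * drag_coeff * frontal_area * speed ** 3 / mass_kg
    return road_term + aero_term


# Hypothetical 15-tonne bus at 15 m/s, accelerating at 0.5 m/s^2 on a 2% grade
vsp = vehicle_specific_power(speed=15.0, accel=0.5, grade=0.02,
                             drag_coeff=0.6, frontal_area=7.0, mass_tonnes=15.0)
print(f"VSP = {vsp:.2f} kW/tonne")
```

At this speed the acceleration, grade, and rolling terms account for most of the result, and the aerodynamic term contributes well under 1 kW/tonne.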
The basic concept of MOVES starts with the characterization of vehicle activity and the
development of relationships between characterized vehicle activity and energy consumption,
and between energy consumption and vehicle emissions (Nam 2003). The U.S. EPA established a
modal binning approach, developed using VSP, to characterize the relationship between vehicle
activity and energy consumption. Originally, a total of 14 modal bins were developed based on
different VSP ranges (U.S. EPA 2001a). This approach was revised in two different ways. U.S.
EPA refined the VSP binning approach by associating second-by-second speed, engine
rpm, and acceleration rates; the original 14 VSP bins were combined with five different speed
operating modes and redefined as a total of 37 VSP bins
(Koupal et al. 2004). Researchers at North Carolina State University (NCSU) divided each of the
14 bins into four strata representing two engine sizes and two odometer reading categories, and this
approach was referred to as the "56-bin" approach (U.S. EPA 2002b).
Another important conceptual model for MOVES was developed by NCSU in 2002 (Frey
et al. 2002). Dr. Frey summarized the conceptual analytical methodology in the report
"Recommended Strategy for On-Board Emission Data Analysis and Collection for the New Generation
Model" (Frey et al. 2002). This method uses a power demand estimate (P) as the variable on which
emission rates are based (Frey et al. 2002), as shown in Equation 3-4.
$$P = v \times a \qquad \text{(Equation 3-4)}$$
where: $P$: power demand (mph^2/s)
$v$: vehicle speed (mph)
$a$: vehicle acceleration (mph/s)
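As an illustration of Equation 3-4, the sketch below computes the power demand surrogate from a second-by-second speed trace. The backward-difference estimate of acceleration, the zero acceleration assigned to the first record, and the sample trace are assumptions of this example rather than details of the NCSU method.

```python
def power_demand(speeds_mph):
    """Return P = v x a (mph^2/s) for each record of a 1 Hz speed trace.

    Acceleration is estimated as the change in speed from the previous
    second; the first record is assigned zero acceleration.
    """
    demands = []
    previous = None
    for speed in speeds_mph:
        accel = 0.0 if previous is None else speed - previous  # mph/s at 1 Hz
        demands.append(speed * accel)
        previous = speed
    return demands


# Hypothetical 5-second trace: launch, brief cruise, then deceleration
trace = [0.0, 5.0, 12.0, 12.0, 8.0]
print(power_demand(trace))
```

Negative values mark deceleration seconds, which a modal analysis would typically handle separately from positive power demand.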
This method uses on-board emissions data where data are collected under real-world
conditions to develop a modal emission model which can estimate emissions at different scales
such as microscale, mesoscale, and macroscale. The philosophy is similar to MEASURE (Fomu-
nung 2000), which first segregated the data into four modes based on suitable modal definitions,
then developed an OLS regression model for each mode using explanatory variables selected by
HTBR techniques. These explanatory variables include model year, humidity, temperature, alti-
tude, grade, pressure, and power. Second and third powers of speed and acceleration were also
included in the regression analysis.
3.3.4 HDDV-MEM
Researchers at the Georgia Institute of Technology have developed a beta version of
HDDV-MEM, which is based on vehicle technology groups, engine emission characteristics, and
vehicle modal activity (Guensler et al. 2005). The HDDV-MEM first predicts second-by-second
engine power demand as a function of on-road vehicle operating conditions and then applies
brake-specific emission rates to these activity predictions. The HDDV-MEM consists of three
modules: a vehicle activity module (with vehicle activity tracked by vehicle technology group),
an engine power module, and an emission rate module. The model framework is illustrated in
Figure 3-4.
-------
[Figure 3-4 flowchart; legible panel labels include Operating Environment, Onroad Operations, Engine Power Functions, and Accessory Load]
Figure 3-4 A Framework of Heavy-Duty Diesel Vehicle Modal Emission Model (Guensler et al. 2005)
3.3.4.1 Model Development Approaches
The HDDV-MEM modeling framework is designed for transportation infrastructure
implementation on a link-by-link basis. While the modeling routines are actually amenable to
implementation on a vehicle-by-vehicle basis, the large number of vehicles operating on infrastructure
links precludes practical application of the model in this manner. As such, the model framework
capitalizes upon previous experience gained in development of the MEASURE modeling
framework, in which vehicle technology groups were employed. A new heavy-duty vehicle visual
classification scheme, an EPA and Federal Highway Administration (FHWA) hybrid
vehicle classification scheme developed by Yoon et al. (2004b), classifies vehicle
technology groups by engine horsepower ratings, vehicle GVWR, vehicle configurations, and
vehicle travel characteristics (Yoon 2005c). Whereas the MEASURE model employs
load surrogates for the implementation of a light-duty modal modeling regime, this new
modeling framework directly implements heavy-duty vehicle operating loads and uses these load
predictions in the emission prediction process. An engine power module is designed for this task.
-------
Emission rates are first established for various heavy-duty technology groups (engine
and vehicle family, displacement, certification group, drivetrain, fuel delivery system, emission
control system, etc.) based upon statistical analysis of standard engine dynamometer
certification data, or on-road emission rate data when available (Wolf et al. 1998; Fomunung et al. 2000).
The following subsections discuss the three main modules in the HDDV-MEM.
3.3.4.2 Vehicle Activity Module
The vehicle activity module provides hourly vehicle volumes for each vehicle technol-
ogy group on each transportation link in the modeled transportation system. The annual average
daily traffic (AADT) estimate for each road link is processed to yield vehicle-hours of operation
per hour for each technology group (using truck percentages, VMT fraction by vehicle technol-
ogy group, diesel fraction, hourly volume apportionment of daily travel, link length, and average
vehicle speed) (Guensler et al. 2005; Yoon 2005c), as shown in Equation 3-5.
$$VA_{v,h,s,f} = \left(AADT_s \times \frac{NL_s}{TNL_s} \times HVF_{v,h} \times VF_v \times DF_v\right) \times \frac{SL_s}{AS_v} \qquad \text{(Equation 3-5)}$$
where: $VA$: the estimated vehicle activity (veh-hr/hr)
$v$: the vehicle technology group
$h$: the hour of day
$s$: the transportation link
$f$: the facility type for the link
$AADT_s$: the annual average daily traffic for the link (number of vehicles)
$NL_s$: the number of lanes in the specific link direction
$TNL_s$: the total number of lanes on the link
$HVF_{v,h}$: the hourly vehicle fraction
$VF_v$: the VMT fraction for each vehicle technology group
$DF_v$: the diesel vehicle fraction for each technology group
$SL_s$: the link length (miles)
$AS_v$: the link average speed of the technology group (mph)
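Equation 3-5 can be sketched in a few lines. The function and parameter names are hypothetical, and the input values (traffic volume, lane counts, fractions, link length, and speed) are invented for the example rather than taken from the report.

```python
def vehicle_activity(aadt, lanes_in_direction, total_lanes, hourly_fraction,
                     vmt_fraction, diesel_fraction, link_length_mi,
                     avg_speed_mph):
    """Equation 3-5: vehicle-hours of operation per hour on one link.

    The first factor is the hourly directional volume for one technology
    group; the second (link length / average speed) is the hours each
    vehicle spends on the link.
    """
    if total_lanes <= 0 or avg_speed_mph <= 0:
        raise ValueError("total_lanes and avg_speed_mph must be positive")
    hourly_volume = (aadt * (lanes_in_direction / total_lanes)
                     * hourly_fraction * vmt_fraction * diesel_fraction)
    return hourly_volume * (link_length_mi / avg_speed_mph)


# Hypothetical link: 40,000 AADT, 2 of 4 lanes, 1.5 miles long, 50 mph
va = vehicle_activity(aadt=40_000, lanes_in_direction=2, total_lanes=4,
                      hourly_fraction=0.06, vmt_fraction=0.05,
                      diesel_fraction=0.9, link_length_mi=1.5,
                      avg_speed_mph=50.0)
print(f"{va:.2f} veh-hr/hr")
```

In this example, 54 vehicles of the technology group pass in the hour and each spends 0.03 hours on the link, giving 1.62 vehicle-hours of activity.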
To estimate on-road running emissions from each link, two sets of calculations are
performed. On-road vehicle activity (vehicle-hr) for each hour is multiplied by engine power
demand for observed link operations (positive tractive power demand plus auxiliary power
demand), and then by baseline emission rates (g/bhp-hr). These calculations are processed
separately for each speed/acceleration matrix cell (Yoon et al. 2005b). Emissions from motoring/
idling activity are calculated by determining the vehicle-hours of motoring/idling activity
on each link for each hour and multiplying them by the baseline idle emission rate (g/hr).
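The two sets of calculations might be organized as in the sketch below. The data layout (one tuple per speed/acceleration cell), the function names, and all numeric values are hypothetical; the sketch only shows how vehicle-hours, power demand, and emission rates multiply together.

```python
def running_emissions_g(cells, baseline_rate_g_per_bhp_hr):
    """Running emissions (g) summed over speed/acceleration matrix cells.

    cells: iterable of (vehicle_hours, tractive_bhp, auxiliary_bhp) tuples.
    Only positive tractive power is counted; auxiliary power is always added.
    """
    total = 0.0
    for vehicle_hours, tractive_bhp, auxiliary_bhp in cells:
        power_bhp = max(tractive_bhp, 0.0) + auxiliary_bhp
        total += vehicle_hours * power_bhp * baseline_rate_g_per_bhp_hr
    return total


def idle_emissions_g(idle_vehicle_hours, idle_rate_g_per_hr):
    """Motoring/idling emissions (g) = vehicle-hours x idle rate (g/hr)."""
    return idle_vehicle_hours * idle_rate_g_per_hr


# Hypothetical link-hour with two speed/acceleration cells plus idling
cells = [(0.5, 120.0, 10.0), (0.25, 200.0, 10.0)]
running = running_emissions_g(cells, baseline_rate_g_per_bhp_hr=4.0)
idling = idle_emissions_g(idle_vehicle_hours=0.1, idle_rate_g_per_hr=50.0)
print(f"running = {running:.0f} g, idle = {idling:.0f} g, total = {running + idling:.0f} g")
```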
-------
3.3.4.3 Engine Power Module
Internal combustion engines translate linear piston work (force through a distance) to a
crankshaft, rotating the crankshaft and creating engine output torque (work performed in angular
rotation). The crankshaft rotation speed (engine speed in revolutions per minute) is a function
of engine combustion and physical design parameters (mean effective cylinder pressure, stroke
length, connecting rod angle, etc.). The torque available at the crankshaft (engine output shaft)
is less than the torque generated by the pistons, in that there are torque losses inside the engine
associated with operating a variety of internal engine components. Torque is transferred from the
engine output shaft to the driveshaft via the transmission (sometimes through a torque-converter,
i.e., fluid coupling) and through a series of gears that allows the drive shaft to rotate at differ-
ent speeds relative to engine crankshaft speed. The drive shaft rotation is then transferred to the
drive axle via the rear differential. The ring and pinion gears in the rear differential translate the
rotation of the drive shaft by 90 degrees from the drive shaft running along the vehicle to the
drive axle that runs across the vehicle. Torque available at the drive axle is now delivered direct-
ly to the drive wheels. This process generates the tractive force used to overcome road friction,
wind resistance, road grade (gravity), and other resistive forces, allowing the vehicle to acceler-
ate on the roadway. Figure 3-5 illustrates the primary components of concern.
[Figure 3-5 component labels: Engine, Bellhousing, Transmission, Driveshaft, Differential, Axle shaft]
Figure 3-5 Primary Elements in the Drivetrain (Gillespie 1992)
The vehicle drivetrain (engine, torque converter, transmission, drive shaft, rear differential,
axles, and wheels) is designed as a system to convert engine torque into useful tractive force
at the wheel-to-pavement interface. When the tractive force is greater than the sum of forces
acting against the vehicle, the vehicle accelerates in the direction of travel. Given that on-road
speed/acceleration patterns for HDDVs can be observed (or empirically modeled), the modal
modeling approach works backwards from observed speed and acceleration to estimate the
tractive force (and power) that was available at the wheels to meet the observed conditions. Then,
working backwards from tractive force, the model accounts for additional power losses that
occurred between the engine and the wheels to predict the total brake-horsepower output of the
engine. Force components that reduce available wheel torque and tractive force include:
• Aerodynamic drag, which depends on the frontal area, the drag coefficient, and the
square of the vehicle speed;
• Tire rolling resistance, which is determined by the coefficient of rolling resistance,
vehicle mass, and road grade (where the coefficient of rolling resistance is a function
of tire construction and size; tire pressure; axle geometry, i.e., caster and camber; and
whether the wheels are driven or towed);
• Grade load, which is determined by the roadway grade and vehicle mass; and
• Inertial load, which is determined by the vehicle's mass and acceleration.
The tractive force required at the interface between the tires and the road to overcome these
resistive forces and provide vehicle acceleration is given by Equation 3-6 (Gillespie 1992):
$$F_T = F_D + F_R + F_W + F_I + ma \qquad \text{(Equation 3-6)}$$
where: $F_T$: the tractive force available at the wheels (lbf)
$F_D$: the force necessary to overcome aerodynamic drag (lbf)
$F_R$: the force required to overcome tire rolling resistance (lbf)
$F_W$: the force required to overcome gravitational force (lbf)
$F_I$: the force required to overcome inertial loss (lbf)
$m$: the vehicle mass (lbm)
$a$: the vehicle acceleration (ft/sec2)
Load prediction models could employ a wide variety of aerodynamic drag (Wolf-Hein-
rich 1998) and rolling resistance functional forms, some of which may be more appropriate for
certain vehicle designs and at certain vehicle speeds. Note that vehicle mass is a critical param-
eter that must be included in the load-based modeling approach. Therefore, estimates of gross
vehicle weight must be included in any transit (vehicle weight plus passenger loading) or heavy-
duty truck (vehicle weight plus cargo payload) application. The following subsections describe
each force in Equation 3-6, taken from Yoon et al. (Yoon et al. 2005a).
Aerodynamic Drag Force
As a vehicle moves forward through the atmosphere, drag forces are created at the in-
terface of the front of the vehicle and by the vacuum generated at the tail of the vehicle. The
flow of the air around the vehicle creates a very complex set of forces providing both resistance
to forward motion and vehicle lift. The net aerodynamic drag force is a function of air density,
aerodynamic drag coefficient, vehicle frontal area, and effective vehicle velocity, as shown in
Equation 3-7 (Yoon et al. 2005a).
$$F_D = \frac{\rho}{2g} \times C_d \times A_f \times V^2 \qquad \text{(Equation 3-7)}$$
where: $F_D$: aerodynamic drag force
$\rho$: the air density (lb/ft3)
$g$: the acceleration of gravity (32.2 ft/sec2)
$C_d$: the aerodynamic drag coefficient
$A_f$: the vehicle frontal area (ft2)
$V$: the effective vehicle velocity (ft/sec)
Rolling Resistance Force (FR)
Rolling resistance force is the sum of the forces required to overcome the combined fric-
tion resistance at the tires. Tires deform at their contact point with the ground as they roll along
the roadway surface. Rolling resistance is caused by contact friction, the tires' resistance to
deformation, aerodynamic drag at the tire, etc. The force required to overcome rolling resistance
can be expressed with rolling resistance coefficient, vehicle weight, and road grade, as shown in
Equation 3-8 (Yoon et al. 2005a).
$$F_R = C_r \times m \times g \times \cos(\theta) \qquad \text{(Equation 3-8)}$$
where: $F_R$: force required to overcome rolling resistance
$C_r$: the rolling resistance coefficient
$\theta$: the road grade (degrees)
$m$: vehicle mass in metric tons
$g$: acceleration due to gravity
Gravitational Weight Force (Fw)
The gravitational force components account for the effect of gravity on vehicle weight
when the vehicle is operating on a grade. The grade angle is positive on uphill grades (generating
a positive resistance) and negative on downgrades (creating a negative resistance), as shown
in Equation 3-9 (Yoon et al. 2005a).
$$F_W = m \times g \times \sin(\theta) \qquad \text{(Equation 3-9)}$$
where: $F_W$: gravitational weight force
$m$: vehicle mass in metric tons
$g$: acceleration due to gravity
$\theta$: the road grade (degrees)
Drivetrain Inertial Loss (FI)
The engine, transmission, drive shaft, axles and wheels are all in rotation. The rotational
speed of each component depends upon the transmission gear ratio, the final drive ratio, and the
location of the component in the drive train (i.e., the total gear ratio between each component
and the wheels). The rotational moment of inertia of the various drivetrain components consti-
tutes a resistance to change in motion. The torque delivered by each rotating component to the
next component in the power chain (engine to clutch/torque converter, clutch/torque converter
to transmission, transmission to drive shaft, drive shaft to axle, axle to wheel) is reduced by the
amount necessary to increase angular rotation of the spinning mass during vehicle acceleration.
Given the torque loss at each component, the reduction in motive force available at the wheels
due to inertial losses along the drivetrain can be modeled (Wolf-Heinrich 1998). This model
term is most significant under low speed acceleration conditions, such as vehicle operation in
truck and rail yards where vehicles are lugging heavy loads over short distances. However, as
will be discussed later, significant new data will be required to incorporate the inertial loss effects
into modal models. The inertial loss term is shown in Equation 3-10 (Yoon et al. 2005a).
$$F_I = \frac{a \times I_{EFF}}{r^2} = \frac{a \times \left[ I_W + (G_d^2 \times I_D) + (G_t^2 \times G_d^2) \times (I_E + I_T) \right]}{r^2} \qquad \text{(Equation 3-10)}$$
where: $a$: the acceleration in the direction of vehicle motion (ft/sec2)
$I_{EFF}$: the effective moment of inertia (ft-lbf-sec2)
$I_W$: the rotational moment of inertia of the wheels and axles (ft-lbf-sec2)
$I_D$: the rotational moment of inertia of the drive shaft (ft-lbf-sec2)
$I_T$: the rotational moment of inertia of the transmission (ft-lbf-sec2)
$I_E$: the rotational moment of inertia of the engine (ft-lbf-sec2)
$G_t$: the gear ratio at the engine transmission
$G_d$: the gear ratio in the differential
$r$: wheel radius (ft)
Power Demand
Using the equations outlined above, the total engine power demand, which is the combi-
nation of tractive power and auxiliary power demands, can be expressed in Equation 3-11 (Yoon
et al. 2005a):
$$P = \left[ \left( \frac{V}{550} \right) \times (F_D + F_R + F_W + F_I + ma) \right] + AP \qquad \text{(Equation 3-11)}$$
where: $P$: total engine power demand
$V$: the vehicle speed (ft/s)
$F_D$: the force necessary to overcome aerodynamic drag (lbf)
$F_R$: the force required to overcome tire rolling resistance (lbf)
$F_W$: the force required to overcome gravitational force (lbf)
$F_I$: the force required to overcome inertial loss (lbf)
$m$: the vehicle mass (lbm)
$a$: the vehicle acceleration (ft/sec2)
$AP$: the auxiliary power demand (bhp)
550: the conversion factor to bhp
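The load calculation described above can be illustrated with a minimal Python sketch. It is not the HDDV-MEM implementation. To keep units consistent it assumes the vehicle weight is supplied in lbf and converts it to mass by dividing by g, whereas the text lists mass in lbm and in metric tons. The function names, the default air density, and the coefficient values in the usage note are illustrative assumptions.

```python
import math

G_FT_S2 = 32.2          # acceleration of gravity (ft/sec^2)
HP_CONVERSION = 550.0   # ft-lbf/sec per horsepower


def tractive_force_lbf(speed_fps, accel_fps2, grade_deg, weight_lbf,
                       drag_coeff, frontal_area_ft2, rolling_coeff,
                       air_density_lb_ft3=0.0765, inertial_loss_lbf=0.0):
    """Sum the resistive forces and the acceleration term, in lbf.

    Assumption: weight is given in lbf, so mass in slugs is weight / g.
    The drivetrain inertial loss is passed in directly because the text
    notes that new data are needed to estimate it.
    """
    theta = math.radians(grade_deg)
    f_drag = (air_density_lb_ft3 / (2.0 * G_FT_S2)) * drag_coeff \
        * frontal_area_ft2 * speed_fps ** 2
    f_roll = rolling_coeff * weight_lbf * math.cos(theta)
    f_grade = weight_lbf * math.sin(theta)
    f_accel = (weight_lbf / G_FT_S2) * accel_fps2
    return f_drag + f_roll + f_grade + inertial_loss_lbf + f_accel


def engine_power_demand_bhp(speed_fps, tractive_lbf, aux_power_bhp=0.0):
    """Tractive power (force x speed / 550) plus auxiliary power, in bhp."""
    return (speed_fps / HP_CONVERSION) * tractive_lbf + aux_power_bhp
```

For example, with hypothetical values of a 30,000 lbf bus at a steady 55 ft/sec on level road, a drag coefficient of 0.7, an 80 ft2 frontal area, and a rolling resistance coefficient of 0.01, the sketch yields roughly 50 bhp before auxiliary loads.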
3.3.4.4 Emission Rate Module
The emission rate module provides work-related emission rates (g/bhp-hr) and idle emis-
sion rates (g/hr) for each technology group. The basic application of the HDDV-MEM incorpo-
rates a simple emission rate modeling approach. The predicted engine power demand (bhp) for
each second of vehicle operation is multiplied by emission rates in gram/bhp-sec for a given bhp
load. Technology groups (i.e., vehicles that perform similarly on the certification tests) are estab-
lished based upon the engine and control system characteristics and each technology group is as-
signed a constant g/bhp-sec emission rate based upon regression tree and other statistical analysis
of certification data. Under the assumption that testing cycles represent the typical modal
activities of on-road vehicles, such emission rates are applied to on-road activity data.
Given the large repository of certification data, detailed statistical analysis of the certification
test results can be used to obtain applicable emission rates for these statistically derived vehicle
technology groups. The data required for analysis must come from chassis dynamometer (the
engine remains in the vehicle and the vehicle is tested on a heavy-duty treadmill) and on-road
test programs in which second-by-second grams/second emission rate data have been collected
concurrently with axle-hp loads.
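The constant-rate approach described above can be sketched as follows. This is a simplified illustration, not the HDDV-MEM code: the function name is hypothetical, idle is assumed to mean zero vehicle speed, negative power demand is assumed to produce no work-related emissions, and the rates in the usage note are made up.

```python
def second_by_second_emissions(power_bhp, speed_mph,
                               work_rate_g_per_bhp_hr, idle_rate_g_per_hr):
    """Return grams emitted in each one-second record.

    Assumption: a record with zero speed is idle and receives the idle
    rate; all other records receive power x work-related rate.
    """
    emissions_g = []
    for power, speed in zip(power_bhp, speed_mph):
        if speed == 0:
            emissions_g.append(idle_rate_g_per_hr / 3600.0)
        else:
            # Negative demand (e.g., downhill coasting) is clipped to zero.
            emissions_g.append(max(power, 0.0) * work_rate_g_per_bhp_hr / 3600.0)
    return emissions_g
```

With a hypothetical work-related rate of 5 g/bhp-hr and an idle rate of 36 g/hr, one idle second contributes 0.01 g and one second at 36 bhp contributes 0.05 g.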
At this moment, HDDV-MEM accepts EPA's baseline running emission rate data as
work-related emission rates and EMFAC2002 idling emission rate test data as idle emission
rates. Diesel vehicle registration fractions and annual mileage accumulation rates are employed
to develop calendar year emission rates for each technology group. In the future, a constant
emission rate need not be used as more refined testing data become available. Linear, polyno-
mial, or generalized relationships can be established between gram/second emission rate and
tractive horsepower (axle horsepower) and other variables. Sufficient testing data are required to
establish statistically significant samples for each technology group.
3.3.4.5 Emission Outputs
HDDV-MEM outputs link-specific emissions in grams per hour (g/hr) for VOCs, CO,
NOX, and PM for each vehicle type. Toxic air contaminant emission rates (benzene, 1,3-butadiene,
formaldehyde, acetaldehyde, and acrolein) are also estimated in grams/hour for each vehicle
type using the MOBILE6.2-modeled ratios of air toxics to VOC for each calendar year. HDDV-
MEM provides not only hourly emissions, but also aggregated total daily emissions (in accor-
dance with input command options). The structure of output files, which provide link-specific
hourly emissions, can be directly incorporated with roadway network features in a GIS environ-
ment for use in interactive air quality analysis in various spatial scales, i.e., national, regional,
and local scales (Guensler et al. 2005; Yoon 2005c).
CHAPTER 4
4. EMISSION DATASET DESCRIPTION AND POST-PROCESSING PROCEDURE
Using second-by-second data collected from on-road vehicles (Brown et al. 2001, Ens-
field 2002), the research effort reported here developed models to predict emission rates as a
function of on-road operating conditions that affect vehicle emissions. Such models should be
robust and ensure that assumptions about the underlying distribution of the data are verified
and that assumptions associated with applicable statistical methods are not violated. Due to
the general lack of data available for development of heavy-duty vehicle modal emission rate
models, this study focuses on development of an analytical methodology that is repeatable with
different datasets collected across space and time. Two second-by-second datasets are available in
which emission rate and applicable load and vehicle activity data were collected in parallel
(Brown et al. 2001, Ensfield 2002). One is a transit bus dataset, collected on
diesel transit buses operated by the Ann Arbor Transit Authority (AATA) in 2001 (Ensfield 2002),
and the other is a heavy HDV (HDV8B) dataset prepared by the National Risk Management
Research Laboratory (NRMRL) in 2001 (Brown et al. 2001). Each is summarized in the following
sections.
4.1 Transit Bus Dataset
The transit bus emissions dataset was prepared by Sensors, Inc. (Ensfield 2002). Sensors,
Inc. has supplied gas analyzers and portable emissions testing systems worldwide for over three
decades. Their products, SEMTECH-G for gasoline powered vehicles, and SEMTECH-D for
diesel powered vehicles, are commercially available for on-vehicle emission test applications. In
October 2001, Sensors, Inc. conducted real-world, on-road emissions measurements of 15 heavy-
duty transit buses for U.S. EPA (Ensfield 2002). Transit buses were provided by the AATA and
all of them were New Flyer models with Detroit Diesel Series 50 engines. Table 4-1 summarizes
the buses tested for U.S. EPA.
Table 4-1 Buses Tested for U.S. EPA (Ensfield 2002)
Bus #  Bus ID  Model  Odometer  Engine Series        Displacement  Peak Torque  Test Date
               Year                                  (liters)      (lb-ft)
1      BUS360  1995   270476    SERIES 50 8047 GK40  8.5           890          10/25/2001
2      BUS361  1995   280484    SERIES 50 8047 GK38  8.5           890          10/25/2001
3      BUS363  1995   283708    SERIES 50 8047 GK37  8.5           890          10/24/2001
4      BUS364  1995   247379    SERIES 50 8047 GK42  8.5           890          10/24/2001
5      BUS372  1995   216278    SERIES 50 8047 GK41  8.5           890          10/26/2001
6      BUS375  1996   211438    SERIES 50 8047 GK39  8.5           890          10/25/2001
7      BUS377  1996   252253    SERIES 50 8047 GK36  8.5           890          10/24/2001
8      BUS379  1996   260594    SERIES 50 8047 GK35  8.5           890          10/23/2001
9      BUS380  1996   223471    SERIES 50 8047 GK28  8.5           890          10/23/2001
10     BUS381  1996   200459    SERIES 50 8047 GK29  8.5           890          10/22/2001
11     BUS382  1996   216502    SERIES 50 8047 GK30  8.5           890          10/17/2001
12     BUS383  1996   199188    SERIES 50 8047 GK31  8.5           890          10/19/2001
13     BUS384  1996   222245    SERIES 50 8047 GK32  8.5           890          10/17/2001
14     BUS385  1996   209470    SERIES 50 8047 GK33  8.5           890          10/18/2001
15     BUS386  1996   228770    SERIES 50 8047 GK34  8.5           890          10/19/2001
4.1.1 Data Collection Method
A total of 15 files were provided for the purpose of model development (Ensfield 2002).
Each file represents data collected from a different transit bus. Five of these buses were 1995
model year and the rest were 1996 model year. All of the bus test periods lasted approximately
two hours. The buses operated along standard Ann Arbor bus routes and stopped at all regular
stops although the buses did not board or discharge any passengers. The routes were mostly
different for each test, and were selected for a wide variety of driving conditions. All of the bus
routes for the test are shown in Figure 4-1.
Figure 4-1 Bus Routes Tested for U.S. EPA (Ensfield 2002).
Sensors, Inc. engineers performed the instrument setup and data collection for all the
buses. The test equipment, a SEMTECH-D analyzer, is shown in Figure 4-2. Because engine computer
vehicle interface (SAE J1708) data were collected at 10 Hz, Sensors, Inc. engineers manually
started and stopped data collection at approximately 30-minute intervals to keep file sizes
manageable. A total of four trip files were generated per bus. Zero drift was checked between data
collections. The four files for each bus were then combined into one file after post-processing, so
the time stamps for each bus are sometimes not continuous. To simplify the derivation of other
variables, such as acceleration, and to keep the data manageable, the data for each bus were
separated into trips based on continuous time. After this processing, there were 62 "trips" in the
transit bus database.
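Separating records into trips on the basis of continuous time can be sketched as below. The function name and the one-second gap threshold are assumptions for illustration; the report does not state the exact rule that was applied.

```python
def split_into_trips(timestamps_s, max_gap_s=1.0):
    """Group sorted timestamps (seconds) into runs of continuous time.

    Assumption: a new trip starts whenever the gap between consecutive
    records exceeds max_gap_s.
    """
    trips = []
    current = []
    for t in timestamps_s:
        if current and t - current[-1] > max_gap_s:
            trips.append(current)
            current = []
        current.append(t)
    if current:
        trips.append(current)
    return trips
```

Derived variables such as acceleration can then be computed within each trip, so that no difference is taken across a gap in time.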
Figure 4-2 SEMTECH-D in Back of Bus (Ensfield 2002)
4.1.2 Transit Bus Data Parameters
Each of the 15 data files shares the same format. The data fields included in each file are
summarized in Table 4-2.
Table 4-2 Transit Bus Parameters Given by the U.S. EPA (Ensfield 2002)
Category                       Parameters
Test Information               Date; Time
Vehicle Characteristics        License number; Engine size; Instrument configuration number
Roadway Characteristics        GPS Latitude (degree); GPS Longitude (degree); GPS Altitude (feet);
                               Grade (%)
Onroad Load Parameters         Vehicle speed (mph); Engine speed (rpm); Torque (lb-ft); Engine
                               power (bhp)
Engine Operating Parameters    Engine load (%); Throttle position (0 - 100%); Fuel volumetric flow
                               rate (gal/s); Fuel specific gravity; Fuel mass flow rate (g/s);
                               Calculated instantaneous fuel economy (mpg); Engine oil temperature
                               (deg F); Engine oil pressure (kPa); Engine warning lamp (Binary);
                               Engine coolant temperature (deg F); Barometric pressure reported
                               from ECM (kPa); Calculated exhaust flow rate (SCFM)
Environment Conditions         Ambient temperature (deg C); Ambient pressure (mbar); Ambient
                               relative humidity (%); Ambient absolute humidity (grains/lb air)
Vehicle Emission               HC, CO, NOx, CO2 emission (in ppm, g/sec, g/kg-fuel, g/bhp-hr units)
4.1.3 Sensors, Inc. Data Processing Procedure
It is helpful to understand how Sensors, Inc. processed the dataset after data collection.
This information is important for data quality assurance and quality control. This section is
adapted primarily from the Sensors, Inc. field data collection report (Ensfield 2002).
Data Synchronization: According to the Sensors, Inc. report, the analytical instruments, vehicle
interface, and global positioning system (GPS) equipment each reported data to the
SEMTECH data logger asynchronously and at differing rates, but with a timestamp at millisecond
precision. The first step of the post-processing procedure is to eliminate the extra data by
interpolating and synchronizing all the data to 1 Hz. With all the raw data synchronized to the
same data rate, the data are then time-aligned so that engine data correspond to emissions data
in real time.
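The interpolation of an asynchronous stream onto a 1 Hz grid can be sketched as below. This is an illustrative linear interpolation, not the SEMTECH post-processor, whose interpolation method is not described in the text; the function name is hypothetical and the timestamps are assumed to be strictly increasing.

```python
import math
from bisect import bisect_right


def resample_to_1hz(times_s, values):
    """Linearly interpolate an irregular (time, value) stream onto whole seconds.

    Assumption: times_s is strictly increasing. Only whole seconds that fall
    inside the observed time range are produced.
    """
    if not times_s:
        return []
    first = math.ceil(times_s[0])
    last = math.floor(times_s[-1])
    resampled = []
    for t in range(first, last + 1):
        i = bisect_right(times_s, t)      # index of first sample later than t
        if i == len(times_s):             # t coincides with the final sample
            resampled.append((t, values[-1]))
            continue
        t0, t1 = times_s[i - 1], times_s[i]
        frac = (t - t0) / (t1 - t0)
        resampled.append((t, values[i - 1] + frac * (values[i] - values[i - 1])))
    return resampled
```

Applying the same function to each instrument stream puts all variables on a common one-second time base, after which a fixed time shift can be applied to align them.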
Mass Emissions Calculations: Mass emissions (gram/second) are calculated by fuel flow
method. With access to real-time, second-by-second fuel flow rates, a value for transient mass
emissions is computed as shown by the equation below. Using NO as an example, NO mass
emissions are calculated on a second-by-second basis (Ensfield 2002).
$$NO_{g/s} = NO_{fs} \times Fuelflow \qquad \text{(Equation 4-1)}$$
where $NO_{g/s}$: NO emissions (grams/second)
$NO_{fs}$: NO emission rate (grams of NO per gram of fuel)
$Fuelflow$: flow of fuel per unit time (grams per second).
Fuel specific emissions are the ratios of the mass of each pollutant to the fuel in the
combusted air/fuel mixture. The mass fuel flow rate is converted from fuel volumetric flow rate
using fuel specific gravity.
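The fuel flow method can be written out as a short sketch. The function names are hypothetical, and the conversion constant assumes water at approximately 1 g/mL (3785.41 g per U.S. gallon); the report does not state which reference density was used.

```python
WATER_G_PER_GAL = 3785.41  # assumed mass of one U.S. gallon of water (g)


def fuel_mass_flow_g_s(fuel_flow_gal_s, specific_gravity):
    """Convert volumetric fuel flow (gal/s) to mass flow (g/s)."""
    return fuel_flow_gal_s * specific_gravity * WATER_G_PER_GAL


def mass_emission_g_s(fuel_specific_g_per_g, fuel_flow_g_s):
    """Mass emission rate = fuel-specific emission x fuel mass flow."""
    return fuel_specific_g_per_g * fuel_flow_g_s
```

For a hypothetical record with 0.001 gal/s of fuel at a specific gravity of 0.85 and 0.03 g of pollutant per gram of fuel, the sketch gives about 3.22 g/s of fuel and about 0.097 g/s of pollutant.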
Brake Specific Emissions Calculations: Engine torque is first computed by applying the
engine load parameter, which represents the ratio between current engine torque and maximum
engine torque, to the engine lug curve (maximum torque curve). Engine horsepower is then con-
verted from engine torque using engine speed data. Work (bhp-hr) is computed for each second
of the test, and brake specific emissions are reported as the sum of the grams of pollutant emitted
over the desired interval (one second) divided by the total work.
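The brake-specific calculation can be sketched in the same way. The relation horsepower = torque (lb-ft) x rpm / 5252 is the standard conversion; the function names are hypothetical, and the maximum torque at the current engine speed is passed in directly because the lug curve itself is not given in the text.

```python
def engine_power_from_load_bhp(load_pct, engine_rpm, max_torque_lbft):
    """Engine power from percent load and the lug-curve torque at this speed."""
    torque_lbft = (load_pct / 100.0) * max_torque_lbft
    return torque_lbft * engine_rpm / 5252.0


def brake_specific_g_per_bhp_hr(mass_rate_g_s, power_bhp, interval_s=1.0):
    """Grams emitted over the interval divided by the work done (bhp-hr).

    Returns None when no positive work is done, since the ratio is undefined.
    """
    work_bhp_hr = power_bhp * interval_s / 3600.0
    if work_bhp_hr <= 0:
        return None
    return mass_rate_g_s * interval_s / work_bhp_hr
```

As an illustration, 50% load at 2,100 rpm against an 890 lb-ft maximum torque corresponds to roughly 178 bhp, and 0.1 g/s emitted at 180 bhp corresponds to 2.0 g/bhp-hr.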
Vehicle Speed Validation: Vehicle speed is a critical parameter that influences the de-
rived parameters, acceleration and emission rates. It is important for researchers to understand
the method of measurement and data accuracy. Sensors, Inc. measured vehicle speed using two
methods: vehicle Electronic Control Module (ECM) and Global Positioning System (GPS).
Figure 4-3 shows the GPS vs. ECM comparison for Bus 380. According to the Sensors, Inc. report
(Ensfield 2002), the regression analysis shows that the ECM data are around 10% higher than the
GPS data. Sensors, Inc. researchers believe that this comparison suggests that GPS data may
be more reliable for on-road testing. Buses of model year 1995 were equipped with an
earlier-version ECM that did not provide vehicle speed, so GPS velocity data were used in place of
the ECM data. Buses of model year 1996 were equipped with the current-version ECM that can provide
vehicle speed, and vehicle speed was reported after validation with the GPS data. GPS data
were within 1% accuracy based upon analysis of 10 miles of data (Ensfield 2002).
[Plot: GPS vs ECM Vehicle Speed Comparison, Bus 1, Trip 1; series: VEH SPEED (mph) and GPS SPEED
(mph); x-axis: Elapsed Time (sec)]
Figure 4-3 Bus 380 GPS vs. ECM Vehicle Speed (Ensfield 2002)
4.1.4 Data Quality Assurance/Quality Check
After understanding the manner in which Sensors, Inc. processed the reported data set,
the data set for each bus was screened to check for errors or possible problems. Possible sources
of errors associated with data collection should be considered before undertaking data analysis
for the development of a model. The types of errors checked are listed below.
Loss of Data: Emission data are missing for some buses. For example, bus 382 had missing
HC data for 343 seconds. Buses 361, 377 and 384 have similar problems. There might be
several reasons for loss of data. Communication between instruments might be lost or a particular
vehicle may have failed to report a particular variable. These records are removed from the
test database and not employed in development of HC models because the missing instantaneous
emission values would be recorded as zero, introducing significant bias to the result. Similarly,
calculated fuel economy data are missing for some buses.
Erroneous ECM Data: There were some cases where certain engine parameters were well
outside physical limits, and these erroneous ECM data were filtered out with pre-defined filter
limits. The following filter limits (Ensfield 2002) were imposed on the rate of change of RPM,
fuel flow, and vehicle speed data:
• Rate of change limit for RPM = 10,000 (RPM)/sec
• Rate of change limit for Fuel flow = 0.003 (gal/sec)/sec
• Rate of change limit for Vehicle speed = 21 (mph)/sec
According to the Sensors, Inc. report, these filters remove the data outside the defined limits.
The SEMTECH post-processor automatically interpolates between the remaining data and produces
results at 1 Hz as before (Ensfield 2002). Because Sensors, Inc. completed this procedure by
manually plotting the ECM parameters and computed mass results, all the buses' data were screened
again to check for any remaining data spikes for data quality assurance purposes. No remaining
errors of this kind were identified. The modeler should keep in mind, however, that the filters
may have removed valid data: an engine acceleration or deceleration flagged as "unreasonable" by
its rate of change could still have been within reasonable absolute limits.
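A rate-of-change filter of the kind listed above can be sketched as follows. This is not the SEMTECH post-processor: the function name is hypothetical, the first sample is assumed to be valid, and each sample is compared with the most recent accepted sample so that the return from a spike is not itself flagged.

```python
# Rate-of-change limits from the list above (units per second).
RATE_LIMITS = {"rpm": 10000.0, "fuel_gal_s": 0.003, "speed_mph": 21.0}


def flag_rate_spikes(values, limit_per_s, dt_s=1.0):
    """Flag samples whose rate of change exceeds limit_per_s.

    Assumption: the rate is measured against the last accepted sample,
    and the first sample is always accepted.
    """
    flags = [False] * len(values)
    last_good = 0
    for i in range(1, len(values)):
        elapsed_s = (i - last_good) * dt_s
        if abs(values[i] - values[last_good]) / elapsed_s > limit_per_s:
            flags[i] = True
        else:
            last_good = i
    return flags
```

Flagged samples would then be dropped and the gap filled by interpolating between the remaining points, as the report describes.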
GPS Dropouts: There were a few instances when the GPS lost communication with the
satellite for unknown reasons, and these erroneous GPS data were removed manually (Ensfield
2002). To guarantee data quality, the modeler screened all GPS data again to check for any
remaining erroneous cases. The screening for erroneous GPS data is based on the consistency
between GPS data and engine parameters. The secondary screening found that bus
360 data still contained some erroneous GPS data. The questionable area covers the first
434 seconds of the trip (see Figure 4-4). The corresponding GPS data are shown in red in the left
panel, and the right panel shows the time series plot for the checked area. Although the GPS
signals are reported as fixed positions in the left panel and vehicle speed data are reported as
zero in the right panel, engine speed and engine power in the right panel show that bus 360 did
move during that period. This error might be due to GPS dropouts.
Figure 4-4 Example Check for Erroneous GPS Data for Bus 360 (Ensfield 2002)
Due to GPS dropouts, the GPS signals were reported as some fixed positions. At the
same time, the vehicle speed might be reported as zero while other ECM data, such as engine
speed and engine power, would show that the bus did move during that period. If the modeler
fails to screen and remove such data, they will be classified as idle mode and will distort
the analysis results for idle mode. The modeler screened all buses manually
and found that six buses had such problems (buses 360, 361, 363, 364, 375, and 377). Usually,
this type of error was prevalent during the beginning of the bus trip. All erroneous data were
removed manually. The correction of the database to remove these erroneous data is critical to
model development (initial models associated with development of idle and load-based emission
rates were problematic until this database error was identified and corrected by the author).
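The consistency check between reported speed and engine parameters was performed manually in this study. A first-pass automated screen might look like the sketch below, in which the function name and the 100 bhp threshold are purely hypothetical; flagged records would still need manual review, because a stationary bus can legitimately draw engine power.

```python
def flag_possible_gps_dropout(speed_mph, engine_power_bhp,
                              power_threshold_bhp=100.0):
    """Flag records that report zero speed while engine power is high.

    Assumption: the threshold is illustrative only and is not taken
    from the report.
    """
    return [speed == 0 and power > power_threshold_bhp
            for speed, power in zip(speed_mph, engine_power_bhp)]
```

Records flagged this way are candidates for removal before idle-mode emission rates are estimated.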
Synchronization Errors: Data were checked for synchronization errors. An example
plot of such a check is presented in Figure 4-5 where part of the trip for Bus 360 is used. The
selected area covers about 200 seconds. Their GPS data are shown as the green/red part in the
left figure. The figure on the right illustrates the time series plot for the area checked. The speed
for red points in both figures is 0 mph. Although NOx correlates well to engine load and engine
speed, vehicle speed doesn't correlate well to engine data and NOx emissions data. Bus 360
was equipped with an earlier version ECM that did not provide vehicle speed. GPS velocity
data were used in place of the ECM data. According to the Sensors, Inc. report, data synchronization
was only done between emissions data and engine data, not between vehicle speed and emissions data
(Ensfield 2002).
Figure 4-5 Example Check for Synchronization Errors for Bus 360
All bus data were checked for this type of error, and such errors were identified in all of
the test data for six buses (buses 360, 361, 363, 364, 375, and 377). Coincidentally, these six
buses also had GPS dropout problems. According to Frey and Zheng (2001), small errors in
synchronization do not substantially affect estimates of total trip emissions, but such deviations
do influence estimates in micro-scale analysis. To choose the delay time to apply to the
GPS data and vehicle speed data, the author compared the impacts of using a 2-second, 3-second,
and 4-second delay. Figure 4-6 illustrates histograms of engine power for zero speed data
based on the three proposed time delay options. A 3-second delay is chosen because the engine
power distribution for zero speed data based on a 3-second delay is more reasonable. Compared
with the 2-second delay results, zero speed data contain fewer data points with higher engine
power (>150 brake horsepower) for the 3-second delay. Meanwhile, zero speed data contain more
data points with lower engine power (<20 brake horsepower) for the 3-second delay than for the
4-second delay.
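The delay comparison can be sketched as follows. The function name is hypothetical, and the direction of the shift (speed assumed to lag the engine data) is an assumption, since the text does not state it. The 150 bhp and 20 bhp cut points are the ones quoted above.

```python
def zero_speed_power_summary(speed_mph, engine_power_bhp, delay_s):
    """Summarize engine power at zero speed after shifting speed by delay_s.

    Assumption: the speed reported at second t + delay_s belongs with the
    engine data recorded at second t.
    """
    aligned_speed = speed_mph[delay_s:]
    aligned_power = engine_power_bhp[:len(engine_power_bhp) - delay_s]
    zero_speed_power = [power for speed, power in zip(aligned_speed, aligned_power)
                        if speed == 0]
    return {
        "n_zero_speed": len(zero_speed_power),
        "n_high_power": sum(1 for power in zero_speed_power if power > 150),
        "n_low_power": sum(1 for power in zero_speed_power if power < 20),
    }
```

Running the summary for delays of 2, 3, and 4 seconds gives the counts that the histograms in Figure 4-6 display graphically.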
[Histogram panels: 2-second delay; 3-second delay; 4-second delay]
Figure 4-6 Histograms of Engine Power for Zero Speed Data Based on Three Different Time Delays
Road Grade Validation: According to the Sensors, Inc. report, the GPS data were used for grade
calculation. Combining the velocity at time t with the difference in altitude between time t and t-1
second, the instantaneous grade is computed as shown in Equation 4-2 (Ensfield 2002).
$$Grade_t = \frac{altitude_t - altitude_{t-1}}{velocity_t} \qquad \text{(Equation 4-2)}$$
where $Grade_t$: road grade at time t
$t$: time, t or t-1 second
$velocity_t$: vehicle speed in feet per second at time t
$altitude_t$: altitude in feet at time t or t-1
The calculation formula can generate significant errors given the uncertainty in the GPS
position, particularly at low speeds where there is less of a differential in distance over the one-
second interval (Ensfield 2002). In the real world, the maximum recommended grade for use
in road design depends upon the type of facility, the terrain on which it is built, and the design
speed. Figure 4-7, reproduced from Traffic Engineering (Roess et al. 2004), presents a
general overview of usual practice. Roess et al. (2004) indicated that these criteria represent a
balance between the operating comfort of motorists and passengers and the practical constraints
of design and construction in more severe terrains.
[Chart panels: maximum grade versus design speed, with curves for level, rolling, and mountainous
terrain]
Figure 4-7 General Criteria for Maximum Grades (Roess et al. 2004)
The modeler screened the grade data in the database and found that 0.42% of the data
have grades above 10%, while 2% of the road grade data have a rate of change above 5%.
This means some road grade data are dubious or erroneous. Following the Sensors, Inc.
recommendations, road grade data are used only as a reference and are not used directly
in model development.
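The grade calculation and the screening statistics can be sketched together. The function names are hypothetical. The sketch assumes that grade is expressed in percent, that the 10% threshold applies to the absolute value of grade, and that the rate of change is the difference between consecutive one-second grades; the report does not spell out these details.

```python
def instantaneous_grade_pct(altitude_ft, velocity_fps):
    """Grade (%) from the one-second altitude change divided by velocity.

    Returns None where the grade is undefined (first record or zero speed).
    """
    grades = [None]
    for t in range(1, len(altitude_ft)):
        if velocity_fps[t] <= 0:
            grades.append(None)
        else:
            rise_ft = altitude_ft[t] - altitude_ft[t - 1]
            grades.append(100.0 * rise_ft / velocity_fps[t])
    return grades


def grade_screening_fractions(grades_pct, max_grade_pct=10.0, max_change_pct=5.0):
    """Fraction of grades above the limit and fraction of large second-to-second changes."""
    valid = [g for g in grades_pct if g is not None]
    changes = [abs(b - a) for a, b in zip(grades_pct, grades_pct[1:])
               if a is not None and b is not None]
    steep = sum(1 for g in valid if abs(g) > max_grade_pct) / len(valid) if valid else 0.0
    jumpy = sum(1 for c in changes if c > max_change_pct) / len(changes) if changes else 0.0
    return steep, jumpy
```

The sketch also makes the low-speed problem visible: with a small velocity in the denominator, an altitude error of a few feet produces a very large computed grade.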
4.1.5 Database Formation
The data dictionaries of the source files were reviewed for parameter content. Not all
variables reported will be included in explanatory analysis. A standard file structure was de-
signed to accommodate the available format. Emissions rate data with units of grams/second
were selected to develop the proposed emission rate model. Because volumetric fuel rate, fuel
specific gravity, and fuel mass flow rate are used to calculate mass emissions (g/s), these vari-
ables will be excluded in further analysis. Similarly, because percent engine load, engine torque,
and engine speed are used to calculate engine power (brake horsepower), only engine power
(bhp) is selected to represent power related variables. Exhaust flow rate is excluded because it is
back-computed from the mass emissions generated with the fuel flow method. Fuel economy is
excluded because it is 30 second moving average data and computed for a test period by sum-
ming the fuel consumed and dividing by the distance traveled. Because GPS data were used for
grade calculation and road grade data would only be used as reference, a dummy variable was
created to represent different road grade ranges.
At the same time, variables that might be helpful in explaining variability in vehicle emis-
sions were included in the proposed file structure although they were not provided in the original
dataset. These variables include model year, odometer reading, and acceleration. Acceleration
data were derived from speed data using the central difference method. Table 4-3 summarizes the
parameter list for explanatory analysis.
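Deriving acceleration from 1 Hz speed data with a central difference can be sketched as below. The function name is hypothetical, and the treatment of the first and last records of a trip (one-sided differences) is an assumption, since the report does not describe it.

```python
def central_difference_accel(speed_mph, dt_s=1.0):
    """Acceleration (mph/s) from evenly spaced speed samples.

    Interior points use (v[t+1] - v[t-1]) / (2 * dt). The two end points
    use one-sided differences (an assumption of this sketch).
    """
    n = len(speed_mph)
    if n < 2:
        return [0.0] * n
    accel = [(speed_mph[1] - speed_mph[0]) / dt_s]
    for t in range(1, n - 1):
        accel.append((speed_mph[t + 1] - speed_mph[t - 1]) / (2.0 * dt_s))
    accel.append((speed_mph[-1] - speed_mph[-2]) / dt_s)
    return accel
```

The function is meant to be applied to each continuous-time trip separately, so that no difference spans a gap in the record.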
Table 4-3 List of Parameters Used in Explanatory Analysis for Transit Bus
Category                       Parameters
Test Information               Date; Time
Vehicle Characteristics        License number; Model year; Odometer reading; Engine size;
                               Instrument configuration number
Roadway Characteristics        Dummy variable for road grade range
Onroad Load Parameters         Engine power (bhp); Vehicle speed (mph); Acceleration (mph/s)
Engine Operating Parameters    Throttle position (0 - 100%); Engine oil temperature (deg F); Engine
                               oil pressure (kPa); Engine warning lamp (Binary); Engine coolant
                               temperature (deg F); Barometric pressure reported from ECM (kPa)
Environmental Conditions       Ambient temperature (deg C); Ambient pressure (mbar); Ambient
                               relative humidity (%); Ambient absolute humidity (grains/lb air)
Vehicle Emissions              HC, CO, NOx emission (in g/sec)
4.1.6 Data Summary
After the post-processing procedure was completed, the summary of the emissions and
activity data as well as environmental and roadway characteristics is given in Table 4-4.
Table 4-4 Summary of Transit Bus Database
         Number of   Vehicle Operation         Emission Data                      Environmental Characteristics
Bus ID   Seconds     Avg Speed   Avg Engine    Avg CO     Avg NOx    Avg HC      Avg Ambient   Avg Ambient   Avg Humidity
         of Data     (mph)       Power (bhp)   (g/s)      (g/s)      (g/s)       Temperature   Pressure      (grains/lb air)
                                                                                 (deg C)       (mbar)
360      7606        11.116      71.952        0.029652   0.11049    0.001838    20.358        977.16        24.512
361      5153        25.804      87.536        0.018965   0.1484     0.001304    16.666        971.08        26.745
363      7623        14.626      65.822        0.022419   0.066047   0.000239    25.623        965.69        88.396
364      5284        19.046      79.599        0.020627   0.12341    0.003492    20.358        985.58        33.227
372      5275        21.45       72.395        0.016582   0.087625   0.002371    21.375        982.05        32.494
375      7323        16.814      86.307        0.031844   0.13697    0.001377    17.5          977.52        24.394
377      780S        12.518      78.121        0.028571   0.074597   0.000557    26.012        973.08        70.653
379      7880        15.118      84.82         0.030731   0.10658    0.001807    23.788        974.27        70.818
380      8006        13.035      72.987        0.052504   0.10393    0.001073    23.648        973.22        67.525
381      7282        16.335      65.724        0.034294   0.090166   0.000609    22.465        987.82        46.016
382      3136        19.947      85.224        0.052822   0.14089    0.00132     21.746        994.71        27.868
383      7943        18.253      67.249        0.026207   0.11873    0.001803    21.282        983.55        44.646
384      8453        18.262      64.199        0.036183   0.10457    0.00137     18.17         992.7         22.494
385      8423        16.559      62.512        0.023527   0.095998   0.001693    21.842        991.34        29.766
386      10339       17.319      62.979        0.047062   0.10635    0.00147     20.389        985.65        37.239
4.2 Heavy-duty Vehicle Dataset
The heavy-duty vehicle emission dataset was prepared by the U.S. EPA National Risk
Management Research Laboratory (NRMRL) (U.S. EPA 2001b). EPA's Onroad Diesel Emissions
Characterization (ODEC) facility has been collecting real-world gaseous emissions data for
many years (U.S. EPA 2001c). The on-road facility incorporated a 1990 Kenworth T800
tractor-trailer as its test vehicle to collect this database. When this truck was purchased, it
had already logged over 900,000 miles and was due for an overhaul of its Detroit Diesel Series 60
engine. The vehicle was tested both before and after the overhaul. NRMRL collected
the test data for U.S. EPA from 1999 to 2000 and included all the results and findings in a report
titled "Heavy Duty Diesel Fine Particulate Matter Emissions: Development and Application of
On-Road Measurement Capabilities" (U.S. EPA 2001c).
4.2.1 Data Collection Method
The general capabilities of the ODEC facility are shown in Figure 4-8. The facility is designed
to collect data while traveling along public roadways using a 1990 Kenworth T800 tractor-trailer.
This truck was tested using two types of tests. During 'parametric' testing, the truck systematically
follows a test matrix representing the full range of load, grade, speed and acceleration conditions. During
'highway' testing, the truck travels along an interstate highway with no specific agenda other than
covering the distance safely and efficiently; speed and acceleration vary randomly with grade, speed limit, and
traffic effects. Tables 4-5 and 4-6 summarize the tests completed by NRMRL for U.S. EPA.
[Figure 4-8: schematic of the facility instrumentation. Labels recoverable from the figure: stack measurements (opacity, temperature, velocity head, static pressure); engine measurements (intake, exhaust, coolant, and oil temperatures; speed, RPM); drive shaft measurements (torque; speed, RPM); operational measurements (speed, km/h; front-to-rear g-force); computerized data acquisition system; exhaust sample measurements (O2, %; CO2, %; CO, % and ppm; NOx, ppm; THC, ppm).]
Figure 4-8 Onroad Diesel Emissions Characterization Facility (U.S. EPA 2001c)
-------
Table 4-5 Onroad Tests Conducted with Pre-Rebuild Engine

Test ID   Load (lb GCW)   Grade(s) (%)   Comments
3FOOV     79280           Zero           Constant Speed Testing
3FOOC     79280           Zero           Coast Down & Acceleration
3FOOA     79280           Zero           Governed Acceleration & Short-shift Acceleration
3HOOV     61060           Zero           Constant Speed Testing
3HOOC     61060           Zero           Coast Down & Acceleration
3HOOA     61060           Zero           Governed Acceleration & Short-shift Acceleration
3EOOV     42840           Zero           Constant Speed Testing
3EOOC     42840           Zero           Coast Down & Acceleration
3EOOA     42840           Zero           Governed Acceleration & Short-shift Acceleration
3FOGA     79280           Zero           Governed Acceleration
3FOSA     79280           Zero           Short-shift Acceleration
3FOV      79280           Zero           Constant Speed Testing
3HOGA     61060           Zero           Governed Acceleration
3HOSA     61060           Zero           Short-shift Acceleration
3HOV      61060           Zero           Constant Speed Testing
3EOGA     42840           Zero           Governed Acceleration
3EOSA     42840           Zero           Short-shift Acceleration
3EOV      42840           Zero           Constant Speed Testing
3F3&6     79280           3.1, 6.0       Uphill Grade Tests
3H3&6     61060           3.1, 6.0       Uphill Grade Tests
3E3&6     42840           3.1, 6.0       Uphill Grade Tests
3F-SEQ    79280           Zero           Dyno Sequence Simulations
3DRI      79280           Various        Open Highway Tests - Tunnel
3FIL      61060           Various        Open Highway Tests - Filters
3DIOX*    61060           Various        Open Highway Tests - Dioxin

*Note: These tests are not available.
-------
Table 4-6 Onroad Tests Conducted with Post-Rebuild Engine

Test ID   Load (lb GCW)   Grade(s) (%)   Comments
5FOV      74000           Zero           Constant Speed Testing
5FOC*     74000           Zero           Coast Down & Acceleration
5FOA*     74000           Zero           Governed Acceleration & Short-shift Acceleration
5HOV      61440           Zero           Constant Speed Testing
5HOC*     61440           Zero           Coast Down & Acceleration
5HOA*     61440           Zero           Governed Acceleration & Short-shift Acceleration
5EOV      42600           Zero           Constant Speed Testing
5EOC*     42600           Zero           Coast Down & Acceleration
5EOA*     42600           Zero           Governed Acceleration & Short-shift Acceleration
5F3&6     74000           3.1, 6.0       Uphill Grade Tests
5H3&6     61440           3.1, 6.0       Uphill Grade Tests
5E3&6     42600           3.1, 6.0       Uphill Grade Tests
5F-SEQ*   74000           Zero           Dyno Sequence Simulations
5Plume    61440           Various        Open Highway Tests - Plume
5NOxB*    61440           Various        Open Highway Tests - Burst
5DIOX*    61440           Various        Open Highway Tests - Dioxin

*Note: These test results are not available.
4.2.2 Heavy-duty Vehicle Data Parameters
A total of 42 files were collected for the pre-rebuild engine and a total of 38 files were collected
for the post-rebuild engine. Each file represents data collected for a different engine and test.
Preliminary analysis of individual files indicated that the file format was the same for all
available files. The data fields included in each file are summarized in Table 4-7 below.
-------
Table 4-7 List of Parameters Given in Heavy-duty Vehicle Dataset Provided by U.S. EPA

Category                      Parameters
Test Information              Date; Time
Vehicle Characteristics       Vehicle make/model; Model year; Engine type; Engine rating;
                              Vehicle maintenance history
Onroad Load Parameters        Truck load weight (lb); Vehicle speed (mph); Measured engine power (bhp)
Engine Operating Parameters   Engine speed (RPM); Shaft volts; Torque volts; Fuel H/C ratio; Fuel factor;
                              Engine intake air temperature (deg F); Engine exhaust air temperature (deg F);
                              Engine coolant temperature (deg F); Engine oil temperature (deg F)
Environment Conditions        Barometric pressure (inches Hg); Ambient humidity (%)
Vehicle Emissions             CO, NOx, and HC emission (in ppm, g/hr, g/kg fuel and g/hp-hr units)
4.2.3 Data Quality Assurance/Quality Control Check
Although a total of 80 tests were completed for that project, preliminary screening found
that some test files were missing from the data DVD provided by U.S. EPA to the researchers.
The missing test files include: 3DIOX, 5EOC, 5HOC, 5FOC, 5F-SEQ, 5NOxB, and 5DIOX.
For quality assurance purposes, the available data files were screened to check for errors or
possible problems. Possible sources of error in data collection should be considered before
developing the model. The types of errors checked are listed below.
Loss of Data: Measured horsepower (engine power) and emission data were missing
for some tests. Tests 3F-SEQ, 3FIL1, 3FIL2, and 3FIL3 had no measured horsepower data for
the entire test. These test files could not be included in emission model development. In addition,
tests 3EOOA, 3EOOC, 3EOOV, 3FOGA, 3FOSA, 3FOV, 3HOSA, 3FIL4, 3FIL5, 3FIL7, 3FIL8,
3FIL9, 3FIL10, and 5HOV had no HC emission data. These tests were therefore excluded
from HC emission model development. Test 3HOSA also had no CO emission data and
was excluded from CO emission model development.
Duplicated Records: A notable issue was duplicate records with different emission values
for the same time stamp in some test files. After communicating with Mr. Brown, who prepared this dataset
for EPA, the reason was identified: the data were recorded at rates as high as 10 Hz to improve
the resolution of the data. To keep these files consistent with other test files, the data were post-processed
to one data point per second.
Erroneous Load Data: The "measured horsepower" field is engine power data calculated
from measurement of the drive shaft torque and rotational speed. Results from the literature
review show that engine power is a major explanatory variable, so this variable was screened
to check for errors or possible problems. An example of a check
of measured horsepower is given in Figure 4-9. The observed relationship between measured
horsepower and engine speed is to some extent a relationship between vehicle speed and engine
speed, which can be found in "Fundamentals of Vehicle Dynamics" (Gillespie 1992). At a
given gear ratio, the relationship between engine speed and road speed is approximately
linear. The geometric progression in the left figure reflects the choices made in selection
of transmission gear ratios. The right figure shows a problematic linear relationship between
measured horsepower and vehicle speed. Essentially, the right figure appears to show no gear
changes as vehicle speed increases, indicating that measured horsepower has been calculated
incorrectly for this test. Such problems exist in the 3DRI series of tests and in test 5Plume. These
test files were removed from emission model development.
[Figure 4-9: two-panel plot for Test 3DRI2-2, Open Highway Tests; horizontal axis: Measured Horsepower (bhp).]
Figure 4-9 Example Check for Erroneous Measured Horsepower for Test 3DRI2-2
Vehicle Speed Validation: The author reviewed NRMRL's report (U.S. EPA 2001c)
related to vehicle speed validation. Vehicle speed data were measured with a Datron LSI optical
speed sensor. The product literature specifies an accuracy of +/- 0.2% and a reproducibility
of +/- 0.1% over the measurement range of 0.5 to 400 kph. Figure 4-10 from NRMRL's report
correlates the speed measurement to a drive shaft speed sensor that was scaled using a National
Institute of Standards and Technology (NIST)-traceable frequency source. The outliers at
low speed occurred when the truck was turning (the trailer-mounted speed sensor
travels less distance than the tractor does during turns). Notwithstanding these points, the
correlation is a good indication of speed measurement precision.
[Figure 4-10: scatter plot; horizontal axis: Drive Shaft Speed, rpm.]
Figure 4-10 Vehicle Speed Correlation (U.S. EPA 2001c)
At the same time, NRMRL provided Figure 4-11 (U.S. EPA 2001c) to show the precision
for four ranges of vehicle speed, along with similar estimates of accuracy. This figure will help
researchers deal with speed measurement noise in the future.
[Figure 4-11: bar chart of precision (correlation error) and accuracy estimate by speed range: 10-30, 30-45, 45-60, and above 60 mph.]
Figure 4-11 Vehicle Speed Error for Different Speed Ranges (U.S. EPA 2001c)
-------
4.2.4 Database Formation
The data dictionaries of the source files were reviewed for parameter content (Table 4-8).
Not all reported variables are included in the explanatory analysis. A standard file structure was
designed to accommodate the available format. Emissions data with units of grams/second were
selected to develop the proposed emission model. All variables used to calculate mass emissions
were excluded from further analysis. Similarly, because the "measured horsepower" field is
calculated from measurements of drive shaft torque and rotational speed, only "measured horsepower"
is used to represent power-related variables. At the same time, variables like acceleration that
might be helpful in explaining variability in vehicle emissions were included in the proposed file
structure although they were not provided in the original dataset. Acceleration data were derived
from speed data using the central difference method.
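The central difference calculation mentioned above can be sketched as follows. This is a minimal Python illustration, not the authors' processing code: the function name, the 1-second time step, and the one-sided differences used at the first and last records are assumptions, since the report does not describe how end points were handled.

```python
def central_difference_acceleration(speed_mph, dt=1.0):
    """Estimate acceleration (mph/s) from a speed trace sampled every dt seconds."""
    n = len(speed_mph)
    if n < 2:
        # A single record carries no information about acceleration.
        return [0.0] * n
    accel = [0.0] * n
    # Interior records: central difference (v[t+1] - v[t-1]) / (2 * dt).
    for t in range(1, n - 1):
        accel[t] = (speed_mph[t + 1] - speed_mph[t - 1]) / (2.0 * dt)
    # End records: one-sided differences (an assumption for this sketch).
    accel[0] = (speed_mph[1] - speed_mph[0]) / dt
    accel[-1] = (speed_mph[-1] - speed_mph[-2]) / dt
    return accel


# Hypothetical second-by-second speed trace in mph.
print(central_difference_acceleration([0.0, 1.0, 4.0, 9.0]))  # [1.0, 2.0, 4.0, 5.0]
```

Compared with a simple forward difference, the central difference averages the change over the seconds before and after each record, which tends to damp second-to-second measurement noise in the speed signal.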
Table 4-8 List of Parameters Used in Explanatory Analysis for HDDV

Category                      Parameters
Test Information              Date; Time
Vehicle Characteristics       Vehicle make/model; Model year; Engine type; Engine rating;
                              Vehicle maintenance history
Onroad Load Parameters        Truck load weight (lb); Vehicle speed (mph); Acceleration (mph/s);
                              Measured engine power (bhp)
Engine Operating Parameters   Engine intake air temperature (deg F); Engine exhaust air temperature (deg F);
                              Engine coolant temperature (deg F); Engine oil temperature (deg F)
Environment Conditions        Barometric pressure (inches Hg); Ambient moisture (%)
Vehicle Emissions             CO, NOx, and HC emission (in g/s units)
4.2.5 Data Summary
After the post-processing procedure was completed, a summary of the emissions and
activity data as well as environmental and roadway characteristics is given in Table 4-9.
-------
Table 4-9 Summary of Heavy-Duty Vehicle Data (U.S. EPA 2001c)

                  Vehicle Operation       Emission Data                 Environment Characteristics
Test ID  Seconds  Avg Speed  Avg Engine   Avg CO    Avg NOx   Avg HC    Barometric   Ambient
         of Data  (mph)      Power (bhp)  (g/s)     (g/s)     (g/s)     Pressure     Moisture
                                                                        (inches Hg)
3FOOV    4430     43.55      163.10       0.11633   0.27983   0.001442  28.273       1.6874
3FOOC    7991     36.49      323.79       0.08200   0.19566   0.001166  28.272       1.6874
3FOOA    1904     43.55      475.12       0.17476   0.34262   0.001471  28.272       1.6874
3HOOV    3718     43.66      130.99       0.08386   0.22701   0.001429  28.273       1.6874
3HOOC    7593     39.43      112.50       0.07456   0.17866   0.001414  28.272       1.6874
3HOOA    1959     48.04      218.50       0.20521   0.32078   0.001751  30.423       1.3573
3EOOV    3863     41.41      123.42       0.10896   0.21157   NA        28.273       1.6874
3EOOC    7962     39.31      104.95       0.07489   0.14908   NA        28.272       1.6874
3EOOA    1810     50.15      197.07       0.22324   0.26108   NA        30.137       1.9020
3FOGA    577      35.93      302.14       0.23114   0.41269   NA        29.995       0.4685
3FOSA    792      36.26      287.45       0.25140   0.37947   NA        29.995       0.4685
3FOV     3635     41.65      152.23       0.14879   0.28413   NA        29.995       0.4685
3HOGA    594      33.81      253.63       0.30036   0.48494   0.002159  29.690       1.6059
3HOSA    707      34.27      223.73       NA        0.32498   NA        29.690       1.6059
3HOV     3331     41.53      143.38       0.08892   0.27712   0.002436  28.020       0.4742
3EOGA    421      32.91      233.93       0.37978   0.30728   0.000589  29.976       0.5812
3EOSA    571      31.99      180.73       0.23652   0.33325   0.003042  29.976       0.5812
3EOV     3395     42.64      103.63       0.08879   0.25745   0.002805  29.976       0.5812
3F3&6    8629     36.59      131.00       0.14409   0.31374   0.001426  28.282       1.2520
3H3&6    10573    43.13      107.06       0.16769   0.27507   0.001753  28.273       1.6874
3E3&6    9825     44.74      121.69       0.16617   0.23913   0.001839  28.250       1.5716
3FIL4    12456    66.54      152.91       0.06994   0.29925   NA        29.238       0.3886
3FIL5    13738    58.76      129.99       0.06354   0.22315   NA        29.238       0.3886
3FIL6    6415     66.94      130.11       0.06273   0.20833   0.001409  29.238       0.3886
3FIL7    10678    62.76      164.82       0.07042   0.28353   NA        29.854       0.1480
3FIL8    12248    64.70      147.26       0.06688   0.26035   NA        29.773       0.1484
3FIL9    11956    65.62      153.44       0.06551   0.20905   NA        29.418       0.1502
3FIL10   12367    63.71      167.73       0.07481   0.35788   NA        30.132       0.1466
5FOV     4895     32.87      96.09        0.10716   0.23558   0.002828  30.101       0.5761
5HOV     4091     42.36      126.14       0.12564   0.30933   NA        30.179       0.6091
5EOV     4407     42.60      105.84       0.10681   0.29045   0.002894  30.278       0.8601
5F3&6a   6971     36.24      147.99       0.13716   0.31607   0.003111  28.004       0.9070
5F3&6b   5058     38.69      133.54       0.14044   0.30661   0.001924  28.009       0.8862
5H3&6a   6919     39.74      133.01       0.12723   0.28763   0.002397  28.024       0.8138
5H3&6b   6951     39.44      148.26       0.15400   0.32910   0.002807  28.014       1.2149
5E3&6    10807    46.01      124.07       0.13981   0.27674   0.002827  28.024       1.0131
-------
CHAPTER 5
5. METHODOLOGICAL APPROACH
The following chapter lays the theoretical foundation of the conceptual framework of
model development. This chapter outlines the statistical methods, addresses issues that arise in
statistical modeling, and presents the solutions that are employed to address these issues. This
chapter will serve as a guide or "road map" for the underlying methodology of the model devel-
opment process.
5.1 Modeling Goal and Objectives
The goal of this research is to provide emission rate models that fill the gap between the
existing models and ideal models for predicting emissions of NOx, CO, and HC from heavy-duty
diesel vehicles. Problems in existing models, like EPA's MOBILE series and CARB's EMFAC
series of models, have been highlighted in previous chapters. U.S. EPA is currently developing a
new set of modeling tools for the estimation of emissions produced by on-road and off-road
mobile sources. MOVES, a new model under development by EPA's OTAQ, is a modeling system
designed to better predict emissions from on-road operations. The philosophy behind MOVES
is the development of a model that is as directly data-driven as possible, meaning that emission
rates are developed from second-by-second or binned data.
Using second-by-second data collected from on-road vehicles, this research effort will
develop models that predict emissions as a function of on-road variables known to affect vehicle
emissions. The model should be robust and ensure that assumptions about the underlying distri-
bution of the data are verified and the properties of parameter estimates are not violated. With
limited available data, this study focuses on development of an analytical methodology that is
repeatable with a different data set from across space and across time. As more data become
available, the proposed model will need to be re-estimated to ensure that the model is transfer-
able across additional HDV engine types, operating conditions, environmental conditions, and
even perhaps geographical regions.
-------
5.2 Statistical Method
The purpose of statistical modeling was to determine which explanatory variables sig-
nificantly influence vehicle emissions so that the data can be stratified by those variables and a
corresponding regression relationship can be developed. For many statistical problems there are
several possible solutions. In comparing the means of two small groups, for instance, we could
use a t test, a t test with a transformation, a Mann-Whitney U test, or one of several others. The
choice of method depends on the plausibility of normal assumptions, the importance of obtaining
a confidence interval, the ease of calculation, etc.
Parametric or nonparametric approaches to evaluation can be applied. Parametric
methods are used when the distribution is either known with certainty or can be guessed with a certain
degree of certainty. These methods are meaningful only for continuous data which are sampled
from a population with an underlying normal distribution or whose distribution can be rendered
normal by mathematical transformation. Analysts must be careful to ensure that significant
errors are not introduced when assumptions are not met. In contrast, nonparametric methods make
no assumptions about the distribution of the data or about the functional form of the regression
equation. Nonparametric methods are especially useful in situations where the assumptions
required by parametric methods are in question. Brief overviews and underlying theories of statistical
methods that might be used in this research are addressed in the following sections.
5.2.1 Parametric Methods
5.2.1.1 The t-Test
Student's t-test is one of the most commonly used techniques for testing whether the
means of two groups are statistically different from each other. This test tries to determine
whether the measured difference between two groups is large enough to reject the null hypothesis
or whether such differences are just due to "chance". The formula for the t-test (Equation 5-1) is
a ratio. The numerator of the ratio is just the difference between the two means or averages. The
denominator is a measure of the variability or dispersion of the data.
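$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \qquad s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$ (Equation 5-1)
where $\bar{x}_1$ and $\bar{x}_2$ are the sample means, $s_1^2$ and $s_2^2$ are the sample variances, $n_1$ and $n_2$ are
the sample sizes, and t is a Student t quantile with $n_1 + n_2 - 2$ degrees of freedom.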
Usually a significance level of 0.05 (or equivalently, 5%) is employed in statistical
analyses. The significance level of a statistical hypothesis test is a fixed probability of wrongly
rejecting the null hypothesis H0, if it is in fact true. Another index is the p-value, which is the probability
of getting a value of the test statistic as extreme as or more extreme than that observed by chance
alone, if the null hypothesis H0 is true. The p-value is compared with the chosen significance
level of the test and, if it is smaller, the result is significant. That is, if the null hypothesis were to
be rejected at the 5% significance level, this would be reported as "p < 0.05".
The assumptions for the t-test include: 1) the populations are normally distributed; 2) variances
in the two populations are equal; and 3) the populations are independent. The results of
the analysis may be incorrect or misleading when assumptions are violated. For example, if
the assumption of independence for the sample values is violated, then the two-sample t-test is
simply not appropriate. If the assumption of normality is violated or outliers are present, the
two-sample t-test may not be the most powerful available test. This could mean the difference
between detecting a true difference or not. A nonparametric test or a transformation
may result in a more powerful test.
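The calculation behind a two-sample t-test can be sketched in a few lines. The example below is a minimal illustration that assumes the pooled-variance (equal-variance) form with n1 + n2 - 2 degrees of freedom, consistent with the assumptions listed above; the function name and the sample values are hypothetical and are not taken from the report's data.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    x1, x2 = mean(sample1), mean(sample2)
    # statistics.variance() returns the sample variance (n - 1 denominator).
    s1_sq, s2_sq = variance(sample1), variance(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    # Numerator: difference in means; denominator: its standard error.
    t = (x1 - x2) / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, df


# Hypothetical emission-rate samples (arbitrary units).
t_value, dof = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_value, dof)  # -1.0 8
```

The resulting t value would then be compared against a Student t distribution with the returned degrees of freedom to obtain a p-value; that lookup is left out here to keep the sketch free of third-party dependencies.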
5.2.1.2 Ordinary Least Squares Regression
Regression analysis is a statistical methodology that utilizes the relation between two
or more quantitative variables so that one variable can be predicted from the other, or others
(Neter et al. 1996). There are many different kinds of regression models, like the linear regres-
sion model, exponential regression model, logistic regression model, and so on. Among them,
linear regression is a commonly used and easily understood statistical method. Linear regression
explores relationships that can be described by straight lines or their generalization to many di-
mensions. Regression allows a single response variable to be described by one or more predictor
variables.
Ordinary least squares (OLS) regression is a common statistical technique for quantifying
the relationship between a continuous dependent variable and one or more independent variables
(Neter et al. 1996). The independent variables may be either continuous or discrete. Neter et al.
(1996) provide the basic OLS regression equation for a single variable regression model as
shown in Equation 5-2:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$ (Equation 5-2)
where:
$Y_i$ = value of the response variable in the ith trial
$\beta_0$, $\beta_1$ = regression parameters
$X_i$ = value of the predictor variable in the ith trial
$\varepsilon_i$ = random error term with mean $E\{\varepsilon_i\} = 0$ and variance $\sigma^2\{\varepsilon_i\} = \sigma^2$;
$\varepsilon_i$ and $\varepsilon_j$ are uncorrelated so that their covariance is zero.
The parameters of the OLS regression equation, $\beta_0$ and $\beta_1$, are found by the least squares
method, which requires that the sum of squares of errors be minimized. The Gauss-Markov theorem
(Neter et al. 1996) states that, among all unbiased estimators that are linear combinations of the $Y_i$,
the OLS estimators of regression coefficients have the smallest variance; i.e., they are the best
linear unbiased estimators. The Gauss-Markov theorem does not tell one to use least squares all
the time, but it strongly suggests use of least squares (Neter et al. 1996).
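For the single-variable model, minimizing the sum of squared errors has a well-known closed-form solution, which the following minimal Python sketch illustrates. The function name and the example numbers are hypothetical; the sketch is meant only to show the least squares calculation, not the estimation procedure used later in this report.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1, sse): least squares intercept, slope, and error sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx           # slope estimate
    b0 = y_bar - b1 * x_bar  # intercept estimate
    # Residual (error) sum of squares that least squares minimizes.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse


# Hypothetical data lying exactly on the line y = 1 + 2x.
print(fit_simple_ols([0, 1, 2, 3], [1, 3, 5, 7]))  # (1.0, 2.0, 0.0)
```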
In linear regression, there are key assumptions that must be met, including:
• The $Y_i$ are independent normal random variables;
• The expected value of the error terms $\varepsilon_i$ is zero;
• The error terms $\varepsilon_i$ are assumed to have constant variance $\sigma^2$;
• The error terms $\varepsilon_i$ are assumed normally distributed;
• The error terms $\varepsilon_i$ are assumed to be uncorrelated so that their covariance is zero; and
• The error terms $\varepsilon_i$ are independent of the explanatory variable.
If the above assumptions are violated, the regression equation may yield biased results
(Neter et al. 1996). For example, if the explanatory variable is not independent of the error term,
larger sample sizes do not lead to lower standard errors for the parameters, and the parameter
estimates (slope, etc.) are biased. If the error is not distributed normally, there may
be fat tails. Consequently, use of the normal distribution may underestimate true 95% confidence
intervals.
5.2.1.3 Robust Regression
OLS models generally rely on the normality assumption and are often fitted by means of
the least squares estimators. However, these estimation techniques are sensitive to departures
from this underlying assumption, a weakness that can lead to erroneous
interpretations (Copt and Heritier 2006). Robust regression procedures dampen the influence of
outlying cases, as compared to OLS estimation, in an effort to provide a better fit for the
majority of cases. Robust regression procedures are useful when a known, smooth regression function
is to be fitted to data that are "noisy", with a number of outlying cases, so that the assumption of
a normal distribution for the error terms is not appropriate (Neter et al. 1996). MM estimators,
which combine a high-breakdown initial fit with an efficient M-estimation step, are designed
to be both highly robust against outliers and highly efficient.
5.2.2 Nonparametric Methods
Nonparametric methods have several advantages compared with parametric methods.
Nonparametric methods require no or very limited assumptions to be made about the format
of the data, and they may therefore be preferable when the assumptions required for paramet-
ric methods are not valid (Whitley and Ball 2002). Nonparametric methods can be useful for
dealing with unexpected, outlying observations that might be problematic with a parametric
approach. Nonparametric methods are intuitive and are simple to carry out by hand, for small
samples at least.
However, nonparametric methods may lack power as compared with more traditional
approaches (Siegel 1988). This lack of power is a particular concern if the sample size is small
or if the assumptions for the corresponding parametric method hold true (e.g., normality of the
data). Nonparametric methods are geared toward hypothesis testing rather than estimation of ef-
fects. It is often possible to obtain nonparametric estimates and associated confidence intervals,
but this process is not generally straightforward. In addition, appropriate computer software for
nonparametric methods can be limited, although the situation is improving.
5.2.2.1 Chi-Square Test
The chi-square test (Koehler and Larnz 1980), the best-known goodness-of-fit test, assumes that
the observations are independent and that the sample size is reasonably large. This method can
be used to test whether a sample fits a known distribution, or whether two unknown distributions
from different samples are the same. The test can detect major departures from a logistic
response function, but is not sensitive to small departures from a logistic response function. The
test assumptions are that the sample is random and that the measurement scale is at least ordinal
(Conover 1980; Neter et al. 1996).
Pearson's chi-square goodness of fit test statistic is shown in Equation 5-3 (StatsDirect
2005):
$$T = \sum_{j=1}^{c} \frac{(O_j - E_j)^2}{E_j}$$ (Equation 5-3)
where $O_j$ are the observed counts, $E_j$ are the corresponding expected counts, and c is the number of
classes for which counts/frequencies are being analyzed.
The test statistic is distributed approximately as a chi-square random variable with c-1
degrees of freedom. The test has relatively low power (chance of detecting a real effect) with
all but large numbers or big deviations from the null hypothesis (all classes contain observations
that could have been in those classes by chance).
The handling of small expected frequencies is controversial. Koehler and Larnz asserted
that the chi-square approximation is adequate provided all of the following are true: total of ob-
served counts (N) > 10; number of classes (c) > 3; all expected values > 0.25 (Koehler and Larnz
1980).
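The Pearson statistic and the adequacy conditions quoted above are straightforward to compute. The following minimal Python sketch is an illustration only; the function names and the example counts are hypothetical, and the adequacy check simply encodes the three conditions as they are stated in this section.

```python
def pearson_chi_square(observed, expected):
    """Return (statistic, degrees of freedom) for Pearson's goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    # Sum over classes of (O_j - E_j)^2 / E_j.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # c - 1 degrees of freedom
    return statistic, df


def approximation_adequate(observed, expected):
    """Check the three conditions quoted in the text: N > 10, c > 3, all E_j > 0.25."""
    return (sum(observed) > 10
            and len(observed) > 3
            and all(e > 0.25 for e in expected))


# Hypothetical counts in four classes, tested against a uniform expectation.
obs = [10, 20, 30, 40]
exp = [25.0, 25.0, 25.0, 25.0]
print(pearson_chi_square(obs, exp))      # (20.0, 3)
print(approximation_adequate(obs, exp))  # True
```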
5.2.2.2 Kolmogorov-Smirnov Two-Sample Test
The Kolmogorov-Smirnov (K/S) two-sample test (Chakravart and Roy 1967) compares
the empirical distribution functions of two samples, $E_1$ and $E_2$. The Kolmogorov-Smirnov test is
a nonparametric test, which can be used to test whether two samples are governed by the
same distribution by comparing their empirical distribution functions.
The Kolmogorov-Smirnov two-sample test statistic can be defined as shown in Equation
5-4 (Chakravart and Roy 1967):
$$D = \max_i \left| E_1(i) - E_2(i) \right|$$ (Equation 5-4)
where $E_1$ and $E_2$ are the empirical distribution functions for the two samples.
-------
The Kolmogorov-Smirnov (K/S) two-sample test provides an improved methodology
over the chi-square test since data do not have to be assigned arbitrarily to bins. Further, it is a
nonparametric test, so a distribution does not have to be assumed. However, the main disadvantage
of the K/S test is similar to that of the chi-square test: the number of separate tests that
would have to be conducted to cover all the possible combinations of variables in the datasets is
orders of magnitude too large to be logistically feasible (Hallmark 1999).
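The K/S two-sample statistic is the largest vertical gap between the two empirical distribution functions, which can be computed without binning. The sketch below is a minimal Python illustration; the function name and the sample values are hypothetical, and the critical-value or p-value lookup is omitted.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample1, sample2):
    """Return D, the maximum absolute difference between two empirical CDFs."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the gap at every observation in the pooled sample.
    for v in s1 + s2:
        e1 = bisect_right(s1, v) / n1  # fraction of sample 1 that is <= v
        e2 = bisect_right(s2, v) / n2  # fraction of sample 2 that is <= v
        d = max(d, abs(e1 - e2))
    return d


# Hypothetical samples: partially overlapping ranges.
print(ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```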
5.2.2.3 Wilcoxon Mann-Whitney Test
The Wilcoxon Mann-Whitney test (Easton and McColl 2005) is one of the most powerful
of the nonparametric tests for comparing two populations. This test is used to test the null
hypothesis that two populations have identical distribution functions against the alternative
hypothesis that the two distribution functions differ only with respect to location (median), if at all.
The Wilcoxon Mann-Whitney test does not require the assumption that the differences
between the two samples are normally distributed. In many applications, the Wilcoxon
Mann-Whitney test is used in place of the two-sample t-test when the normality assumption is
questionable. This test can also be applied when the observations in a sample of data are ranks, that
is, ordinal data rather than direct measurements.
The Mann-Whitney U statistic is defined as shown in Equation 5-5 (StatsDirect 2005):
$$U = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - \sum_{i=n_1+1}^{n_1+n_2} R_i$$ (Equation 5-5)
where samples of size $n_1$ and $n_2$ are pooled and $R_i$ are the ranks.
U can be resolved as the number of times observations in one sample precede observa-
tions in the other sample in the ranking. Wilcoxon rank sum, Kendall's S and the Mann-Whitney
U test are exactly equivalent tests. In the presence of ties the Mann-Whitney test is also equiva-
lent to a chi-square test for trend.
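The interpretation of U as a count of orderings between the two samples can be checked numerically. The minimal Python sketch below assumes the common rank-sum form U = n1*n2 + n2(n2+1)/2 - (sum of the pooled ranks of the second sample), with tied values given average ranks; the function names and sample values are hypothetical. Under that form, U equals the number of (sample 1, sample 2) pairs in which the sample 1 value is larger, with ties counted as one half.

```python
def average_ranks(values):
    """Return 1-based ranks of values, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """U computed from the pooled rank sum of the second sample."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    rank_sum_2 = sum(ranks[n1:])
    return n1 * n2 + n2 * (n2 + 1) / 2.0 - rank_sum_2


def count_pairs(sample1, sample2):
    """Direct count of pairs where the sample 1 value exceeds the sample 2 value."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in sample1 for y in sample2)


# Hypothetical samples containing ties.
a, b = [1, 2, 2], [2, 3]
print(mann_whitney_u(a, b), count_pairs(a, b))  # 1.0 1.0
```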
5.2.2.4 Analysis of Variance (ANOVA)
ANOVA (Analysis of Variance) (Neter et al. 1996), sometimes called an F test, is closely
related to the t-test. The major difference is that, where the t-test measures the difference
between the means of two groups, an ANOVA tests the difference between the means of two
or more groups. ANOVA modeling does not require any assumptions about the nature of the
statistical relation between the response and explanatory variables, nor does it require that the
explanatory variables be quantitative.
-------
The ANOVA, or single-factor ANOVA, compares several groups of observations, all of
which are independent, but each group of observations may have a different mean. A test of
great importance is whether or not all the means are equal. The advantage of using ANOVA
rather than multiple t-tests is that it reduces the probability of a type I error (making multiple
comparisons increases the likelihood of finding something by chance). One potential drawback to
an ANOVA is that it can only tell that there is a significant difference between groups, not which
groups are significantly different from each other. The breakdowns of the total sum of squares
and degrees of freedom, together with the resulting mean squares, are presented in an ANOVA
table such as Table 5-1.
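Table 5-1 ANOVA Table for Single-Factor Study (Neter et al. 1996)

Source of Variation         Sum of Squares (SS)                                    Degrees of Freedom (df)   Mean Square (MS)              Expected Mean Square E{MS}
Between treatments          $SSTR = \sum_i n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2$    $r - 1$                   $MSTR = \frac{SSTR}{r - 1}$   $\sigma^2 + \frac{\sum_i n_i (\mu_i - \mu_.)^2}{r - 1}$
Error (within treatments)   $SSE = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2$        $n_T - r$                 $MSE = \frac{SSE}{n_T - r}$   $\sigma^2$
Total                       $SSTO = \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2$       $n_T - 1$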
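The entries of a single-factor ANOVA table can be computed directly from grouped data. The following minimal Python sketch is an illustration under the usual textbook definitions of the between-treatment and error sums of squares; the function name, the dictionary keys, and the example groups are hypothetical.

```python
def one_way_anova(groups):
    """Return the main single-factor ANOVA table entries for a list of groups."""
    r = len(groups)                     # number of treatments (groups)
    n_t = sum(len(g) for g in groups)   # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_t
    group_means = [sum(g) / len(g) for g in groups]
    # Between-treatment sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Error sum of squares: squared deviations within each group.
    sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    mstr = sstr / (r - 1)
    mse = sse / (n_t - r)
    return {"SSTR": sstr, "SSE": sse, "SSTO": sstr + sse,
            "df_treatment": r - 1, "df_error": n_t - r,
            "MSTR": mstr, "MSE": mse, "F": mstr / mse}


# Three hypothetical groups with means 2, 3, and 4.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(table["SSTR"], table["SSE"], table["F"])  # 6.0 6.0 3.0
```

The F ratio would then be compared with an F distribution with (r - 1, nT - r) degrees of freedom to judge whether all group means can be considered equal.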
A factorial ANOVA can examine data that are classified on multiple independent
variables. A factorial ANOVA can show whether there are significant main effects of the
independent variables and whether there are significant interaction effects between independent variables
in a set of data. Interaction effects occur when the impact of one independent variable depends
on the level of the second independent variable (Neter et al. 1996). Computation can be
performed with standard statistical software such as SAS.
5.2.2.5 HTBR
Hierarchical tree-based regression (HTBR) (Breiman et al. 1984) is a forward step-wise variable selection method, similar
to forward stepwise regression. This method is also known as Classification and Regression
Tree (CART) analysis. This technique generates a "tree" structure by dividing the sample data
recursively into a number of groups. The groups are selected to maximize some measure of
difference in the response variable in the resulting groups. As summarized by Washington et al.
(1997a), this method is based upon iteratively asking and answering the
following questions: (1) which variable of all of the variables 'offered' in the model should be
selected to produce the maximum reduction in variability of the response? and (2) which value
of the selected variable (discrete or continuous) results in the maximum reduction in variability
of the response? The HTBR terminology is similar to that of a tree; there are branches, branch
splits or internal nodes, and leaves or terminal nodes (Washington et al. 1997a).
To explain the method in mathematical terms, the definitions presented by Washington
et al. (1997a) are used. The first step is to define the deviance at a node. A node
represents a data set containing L observations. The deviance, $D_a$, can be estimated as shown in
Equation 5-6:
$$D_a = \sum_{l=1}^{L} (y_{l,a} - \bar{x}_a)^2$$ (Equation 5-6)
where
$D_a$ = total deviance at node a, or the sum of squared error (SSE) at the node
$y_{l,a}$ = lth observation of dependent variable y at node a
$\bar{x}_a$ = estimated mean of L observations in node a
Next, the algorithm seeks to split the observations at node a on a value of an independent
variable, $X_i$, into two branches and corresponding nodes b and c, containing M and N of the
original L observations (M + N = L), respectively. The deviance reduction function evaluated
over all possible Xs then can be defined as shown in Equations 5-7 through 5-9:
$$\Delta_{(\text{all } X)} = D_a - D_b - D_c$$ (Equation 5-7)
$$D_b = \sum_{m=1}^{M} (y_{m,b} - \bar{x}_b)^2$$ (Equation 5-8)
$$D_c = \sum_{n=1}^{N} (y_{n,c} - \bar{x}_c)^2$$ (Equation 5-9)
where
$\Delta_{(\text{all } X)}$ = the total deviance reduction function evaluated over the domain of all Xs
$D_b$ = total deviance at node b
$D_c$ = total deviance at node c
$y_{m,b}$ = mth observation on dependent variable y in node b
$y_{n,c}$ = nth observation on dependent variable y in node c
$\bar{x}_b$ = estimated mean of M observations in node b
$\bar{x}_c$ = estimated mean of N observations in node c
The variable $X_k$ and its optimum split value are sought so that the reduction in deviance is
maximized, or more formally (as shown in Equation 5-10):
$$\Delta_{(\text{all } X)} = \sum_{l=1}^{L} (y_{l,a} - \bar{x}_a)^2 - \sum_{m=1}^{M} (y_{m,b} - \bar{x}_b)^2 - \sum_{n=1}^{N} (y_{n,c} - \bar{x}_c)^2 = \max$$ (Equation 5-10)
where
$\Delta_{(\text{all } X)}$ = the total deviance reduction function evaluated over the domain of all Xs
$y_{l,a}$ = lth observation of dependent variable y at node a
$\bar{x}_a$ = estimated mean of L observations in node a
$y_{m,b}$ = mth observation on dependent variable y in node b
$y_{n,c}$ = nth observation on dependent variable y in node c
$\bar{x}_b$ = estimated mean of M observations in node b
$\bar{x}_c$ = estimated mean of N observations in node c
The maximum reduction occurs at a specific value of the independent variable $X_k$.
When the data are split at this point, the remaining samples have a much smaller variance than
the original data set. Thus, the reduction in node a deviance is greatest when the deviances at
nodes b and c are smallest. Numerical search procedures are employed to maximize Equation
5-10 by varying the selection of variables used as a basis for a split and the value to use for each
variable at a split.
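The split search described by Equations 5-6 through 5-10 can be illustrated with a minimal Python sketch. It is not the S-Plus implementation used in this research: the function names, the exhaustive search over midpoints between adjacent predictor values, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def deviance(y: List[float]) -> float:
    """Sum of squared deviations from the node mean (Equation 5-6)."""
    if not y:
        return 0.0
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y)


def best_split(x: List[float], y: List[float]) -> Optional[Tuple[float, float]]:
    """Return (threshold, deviance reduction) for the best binary split on x.

    Observations with x <= threshold go to node b, the rest to node c.
    Returns None when x has fewer than two distinct values.
    """
    d_a = deviance(y)
    values = sorted(set(x))
    best: Optional[Tuple[float, float]] = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2.0  # midpoint convention (an assumption)
        y_b = [yi for xi, yi in zip(x, y) if xi <= threshold]
        y_c = [yi for xi, yi in zip(x, y) if xi > threshold]
        reduction = d_a - deviance(y_b) - deviance(y_c)  # Equation 5-7
        if best is None or reduction > best[1]:
            best = (threshold, reduction)
    return best


def best_variable(predictors: Dict[str, List[float]],
                  y: List[float]) -> Optional[Tuple[str, float, float]]:
    """Pick the predictor and threshold that maximize Equation 5-10."""
    best: Optional[Tuple[str, float, float]] = None
    for name, x in predictors.items():
        split = best_split(x, y)
        if split is not None and (best is None or split[1] > best[2]):
            best = (name, split[0], split[1])
    return best


# Hypothetical toy data: the response jumps when "load" exceeds about 6.
load = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
speed = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0]
rate = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
print(best_variable({"load": load, "speed": speed}, rate))
```

Applying `best_variable` recursively to the two resulting nodes, subject to stopping criteria such as a minimum node size, would grow the full tree.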
In growing a regression tree, the binary partitioning algorithm recursively splits the data
in each node until the node is homogeneous or contains too few observations. If left
unconstrained, a regression tree model can "grow" until it results in a complex model with a
single observation at each terminal node that explains all the deviance. However, for application
purposes, it is desirable to create criteria that balance the model's ability to explain the
maximum amount of deviance with a simpler model that is easy to interpret and apply. Some
software, such as S-Plus™, allows the user to select such criteria. The software allows the user
to interact with the data in the following manner to select variables and help simplify the final
model:
• Response variable: the response variable is selected by the user from a list of fields
from the data set;
• Predictor variables: one or more independent variables can be selected by the user
from a list of fields associated with the dataset;
• Minimum number of observations allowed at a single split: sets the minimum number
of observations that must be present before a split is allowed (default is 5);
• Minimum node size: sets the allowed sample size at each node (default is 10); and
• Minimum node deviance: the deviance allowed at each node (default is 0.01).
However, unlike OLS regression models, HTBR lacks formal measures of model fit, such as
t-statistics, the F-ratio, and R-squared, to name a few. Thus, the HTBR model is used to guide
the development of an OLS regression model, rather than as a model in its own right. Similar
uses of HTBR techniques have been developed and applied in previous research (Washington
et al. 1997a; Washington et al. 1997b; Fomunung et al. 1999; Frey et al. 2002).
5.3 Modeling Approach
The model development process starts by using HTBR both as a data reduction tool
and to identify potential interactions among the variables. OLS regression or robust
regression is then applied to the identified variables to estimate a preliminary "final" model.
The model is then checked for compliance with normality assumptions and for goodness of fit.
Several diagnostic tools are available to perform these checks. Once a preliminary
"final" model is obtained, regression coefficients are examined using their t-statistics and
correlation coefficients to determine which variables should be removed or retained in the model
for further analysis. However, this procedure can lead to the removal of potentially important
inter-correlated explanatory variables. In fact, variable agreement with underlying scientific
principles of combustion, pollutant formation, and emission controls (cause-effect relationships)
should be the basis for the ultimate decisions regarding variable selection. Thus, a t-statistic
may indicate that a parameter is insignificant (at a significance level of 0.05), while theory
indicates that the parameter should be retained in the model for further analysis. This type of
error is usually referred to as a type II error (Fomunung 2000).
F-statistics and the adjusted coefficient of multiple determination, adjusted $R^2$, are used
to determine the effect size of the parameters. Adding more explanatory variables to the
regression model can only increase $R^2$ and never reduce it, because SSE can never become larger
with more X variables and the total sum of squares (SSTO) is always the same for a given set of
responses. The adjusted coefficient of multiple determination adjusts $R^2$ by dividing each sum
of squares by its associated degrees of freedom. The F-test is used to test whether the parameter
can be dropped even if the t-statistic is appropriate.
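For reference, the two coefficients can be written out using the standard textbook definitions. The symbols $n$ (number of observations) and $p$ (number of estimated regression parameters, including the intercept) are not defined in the text above and are introduced here as assumptions:

```latex
R^2 = 1 - \frac{SSE}{SSTO},
\qquad
R_a^2 = 1 - \frac{SSE / (n - p)}{SSTO / (n - 1)}
      = 1 - \left( \frac{n - 1}{n - p} \right) \frac{SSE}{SSTO}
```

Because the factor $(n-1)/(n-p)$ grows as parameters are added, the adjusted coefficient can decrease when a new variable reduces SSE only slightly.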
In multiple regression analysis, the predictor or explanatory variables tend to be corre-
lated among themselves and with other variables related to the response variable but not included
in the model. The effects of multicollinearity are many and can be severe. Neter et al. (Neter
et al. 1996) have documented a few of these: when multicollinearity exists the interpretation of
partial slope coefficients becomes meaningless; multicollinearity can lead to estimated regression
coefficients that vary widely from one sample to another; and there may be several regression
functions that provide equally good fits to the data, making the effects of individual predictor
variables difficult to assess.
Several informal diagnostic tools have been suggested to detect this problem. A frequently
used technique is to calculate simple correlation coefficients between the predictor variables to
detect inter-correlation among independent variables. Large correlation values are an indication
that multicollinearity may exist. Large changes in the estimated regression coefficients when a
predictor variable is added or deleted are also an indication. Finally, multicollinearity may be
a problem if estimated regression coefficients have an algebraic sign opposite to that expected
from theoretical considerations or prior experience (i.e., the beta coefficient is compensating
for the beta coefficient of a correlated explanatory variable).
A formal method of detecting this problem is the variance inflation factor (VIF), which is
a measure of how much the variances of the estimated regression coefficients are inflated as com-
pared to when the predictor variables are not linearly related (Neter et al. 1996). This method is
widely used because it can provide quantitative measurements of the impact of multicollinear-
ity. The largest VIF value among all Xs is used to assess the severity of multicollinearity. As a
rule of thumb, a VIF in excess of 10 is frequently used as an indication that multicollinearity is
severe.
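As an illustration of the VIF calculation, the sketch below uses the standard result that the VIFs are the diagonal elements of the inverse of the predictor correlation matrix, which is equivalent to $1/(1 - R_k^2)$ for each predictor. The function names and the toy data are hypothetical, and the code assumes that no predictor is constant.

```python
import math
from typing import List


def correlation_matrix(columns: List[List[float]]) -> List[List[float]]:
    """Pearson correlation matrix of the predictor columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def invert(matrix: List[List[float]]) -> List[List[float]]:
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns: List[List[float]]) -> List[float]:
    """VIF for each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]


# Hypothetical predictors with a correlation of 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vifs = variance_inflation_factors([x1, x2])
print(vifs, "severe" if max(vifs) > 10 else "not severe")
```

The final line applies the rule of thumb quoted above to the largest VIF.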
Diagnostic plots are examined to verify the normality and homoscedasticity (i.e., homogeneity
of variance) assumptions as well as the goodness of fit. Because of the difficulty in assessing
normality, it is usually recommended that non-constancy of error variance be investigated
first (Neter et al. 1996). The plots used to identify patterns in the residuals are considered
informal diagnostic tools and include plots of the residuals versus the fitted values and plots
of the square root of absolute residuals versus the fitted values. The normality of the residuals
can be studied from histograms, box plots, and normal probability plots of the residuals. In
addition, observed frequencies can be compared with the frequencies expected under normality.
Heteroscedasticity and/or inappropriate regression functions may induce a departure from
normality. When OLS is applied to heteroscedastic models, the estimated variance is a
biased estimator of the true variance. OLS either overestimates or underestimates the true
variance and, in general, it is not possible to determine the nature of the bias. The variances,
and the standard errors, may therefore be either understated or overstated.
5.4 Model Validation
Model validity refers to the stability and reasonableness of the regression coefficients,
the plausibility and usability of the regression function, and the ability to generalize inferences
drawn from the regression function. Validation is a useful and necessary part of the model-build-
ing process (Neter et al. 1996).
Two basic ways of validating a regression model are internal and external. Internal
validation consists of model checking for plausibility of signs and magnitudes of estimated coef-
ficients, agreement with earlier empirical results and theory, and model diagnostic checks such as
distribution of error terms, normality of error terms, etc. Internal validation will be performed as
part of the model estimation procedure.
External validation checks the model and its predictive ability against new data, such as
data from another location or time, or against a holdout sample. Because there are only 15
buses/engines in the data set, it is not practical to split the data set and hold out a sample
for validation purposes; splitting the data set would influence the regression estimators.
However, suggestions and procedures for external validation will be provided.
CHAPTER 6
6. DATA SET SELECTION AND ANALYSIS OF EXPLANATORY VARIABLES
6.1 Data Set Used for Model Development
Development of a modal model designed to predict emissions on a second-by-second basis
as a function of engine load requires appropriate emission test data. Modal modeling requires
second-by-second vehicle emissions data, collected in parallel with corresponding revealed
engine load data. In 2004, only two data sets could be identified for use in this modeling
effort. U.S. EPA provided two major HDV activity and emission databases to develop the emission
rate model (Ensfield 2002; U.S. EPA 2001b). One is a transit bus database, which includes
emissions data collected on diesel transit buses operated by the AATA in 2001; the other is a
heavy HDV (HDV8B) database prepared by NRMRL in 2001. The transit bus database consists of data
collected from 15 buses with the same type of engine, while the HDV8B database consists of only
one truck engine tested extensively on-road under pre-rebuild and post-rebuild engine
conditions. To decide whether it is suitable to combine these two data sets or treat them
individually, two dummy variables were added to the databases to describe vehicle types. For the
first dummy variable, named "bus", 1 was assigned for transit buses and 0 for others. For the
second dummy variable, 1 was assigned for the HDDV with the pre-rebuild engine and 0 for others.
HTBR was applied to all data sets to examine whether transit buses behave differently from
HDDVs. The regression trees and results for NOx, CO, and HC emission rates are given in
Figures 6-1 to 6-3.
Figure 6-1 HTBR Regression Tree Result for NOx Emission Rate for All Data Sets
Figure 6-2 HTBR Regression Tree Result for CO Emission Rate for All Data Sets
Figure 6-3 HTBR Regression Tree Result for HC Emission Rate for All Data Sets
The dummy variable for bus is selected as the first split in all three trees above. Therefore,
transit buses and HDDVs should be treated separately. Since there are 15 engines in the transit
bus data set and only one engine (tested pre-rebuild and post-rebuild) in the HDDV data set,
the transit bus data set should be used for the final version of the conceptual model development.
6.2 Representative Ability of the Transit Bus Data Set
The transit bus data set was collected by Sensors, Inc. in October 2001 (Ensfield 2002). The
buses tested came from the AATA and included 15 New Flyer models with Detroit Diesel Series
50 engines. All of the buses were of model years 1995 and 1996. Each bus test lasted
approximately 2 hours. The buses operated along standard AATA bus routes and stopped
at all regular stops, although they did not board or discharge any passengers (Ensfield 2002).
The routes were mostly different for each test and were selected to cover a wide variety of
driving conditions (see Figure 4-1).
Figure 6-4 shows the speed-acceleration matrix developed with second-by-second data.
There are two high speed/acceleration frequency peaks here. One is the bin of speed < 2.5 mph
and acceleration (-0.25 mph/s, 0.25 mph/s) and contains 26.11% of the observations, while the
other is the combination of several adjacent bins which covers speed (22.5 mph, 47.5 mph) and
acceleration (-0.75 mph/s, 0.75 mph/s).
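A speed-acceleration matrix of this kind can be tabulated from second-by-second records with a short Python sketch. The bin layout is inferred from the bin edges quoted above (5 mph speed bins and 0.5 mph/s acceleration bins, each centered on a multiple of its width) and should be treated as an assumption, as should the function names and the toy records.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def bin_center(value: float, width: float) -> float:
    """Center of the bin containing value; bins are centered on multiples of width."""
    return math.floor(value / width + 0.5) * width


def speed_accel_matrix(speeds_mph: List[float],
                       accels_mph_s: List[float],
                       speed_width: float = 5.0,
                       accel_width: float = 0.5) -> Dict[Tuple[float, float], float]:
    """Percent of observations in each (speed bin, acceleration bin) cell."""
    counts = Counter(
        (bin_center(s, speed_width), bin_center(a, accel_width))
        for s, a in zip(speeds_mph, accels_mph_s)
    )
    total = sum(counts.values())
    return {cell: 100.0 * n / total for cell, n in counts.items()}


# Hypothetical second-by-second records: two idle seconds and two cruise seconds.
speeds = [0.0, 1.0, 30.0, 31.0]
accels = [0.0, 0.1, 0.2, -0.6]
matrix = speed_accel_matrix(speeds, accels)
for cell in sorted(matrix):
    print(cell, matrix[cell])
```

With this layout, the cell keyed `(0.0, 0.0)` corresponds to the idle bin described above (speed below 2.5 mph and acceleration between -0.25 and 0.25 mph/s).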
Figure 6-4 Transit Bus Speed-Acceleration Matrix
Georgia Institute of Technology researchers collected more than 6.5 million seconds of
transit bus speed and position data using Georgia Tech Trip Data Collectors (an onboard
computer with GPS receiver, data storage, and wireless communication device) installed on two
Metropolitan Atlanta Rapid Transit Authority (MARTA) buses in 2004 (Yoon et al. 2005b). With
second-by-second data, the research team developed transit bus speed/acceleration matrices for
the combinations of roadway facility type (arterial or local road) and time range (morning,
midday, afternoon, night). For each matrix, two high acceleration/deceleration frequency peaks
were also found. This finding is consistent with the AATA data set, indicating at least that the
on-road operations of the buses in Ann Arbor are similar to operations in the Atlanta region.
The AATA data set was also collected under a wide variety of environmental conditions. The
temperature ranged from 10 °C to 30 °C, the relative humidity ranged from 15% to 65%, and
the barometric pressure ranged from 960 mbar to 1000 mbar (Figure 6-5). The data set can
therefore be used to examine the impact of environmental conditions on emissions.
Figure 6-5 Test Environmental Conditions (plotted by bus number)
The transit buses tested were provided by the AATA, and all of them are New Flyer models
with Detroit Diesel Series 50 engines. Since these buses used consistent engine technologies
(i.e., fuel injection type, catalytic converter type, transmission type, and so on), the ability
of the estimated emission models to incorporate the effect of other vehicle technologies is
limited. Another limitation is that the effects of emission control technology deterioration
on emission levels cannot be fully considered, since these buses were only 5 or 6 years old
during the test.
6.3 Variability in Emissions Data
6.3.1 Inter-bus Variability
Data are presented to illustrate the variability in the observed data. Inter-bus variability is
illustrated using the median and mean of NOx, CO, and HC emission rates for each bus in Figures
6-6 to 6-8. The difference between median and mean is an indicator of skewness in the
distribution of emission rates.
Figure 6-6 Median and Mean of NOx Emission Rates by Bus
Figure 6-7 Median and Mean of CO Emission Rates by Bus
Figure 6-8 Median and Mean of HC Emission Rates by Bus
The purpose of inter-bus variability analysis was to characterize the range of variability in
vehicle average emissions among all of the buses, to determine whether the data set is relatively
homogeneous. Although there are some clusters among the buses as suggested from Figures 6-6
to 6-8 and some skewness in the distribution as suggested by upper tails in Figure 6-9, it is not
obvious that this data set lacks homogeneity and should be separated into different groups. Thus,
this data set is treated as a single group for purposes of analysis and model development.
Figure 6-9 Empirical Cumulative Distribution Function Based on Bus-Based Median Emission Rates for Transit Buses
6.3.2 Descriptive Statistics for Emissions Data
Applicable numerical summary statistics, such as variable means and standard deviations,
are presented in Table 6-1. Relatively simple graphics such as histograms and boxplots describ-
ing variable distributions are presented in Figures 6-10 to 6-12. It may also be necessary to as-
sess whether the individual variables are normally distributed prior to any further analysis using
parametric methods that are based upon this assumption.
Table 6-1 Basic Summary Statistics for Emissions Rate Data for Transit Bus
*** Summary Statistics for data in: transitbus.data ***
            NOx             CO              HC
Min:        0.000000e+000   0.000000e+000   0.000000e+000
1st Qu.:    2.195000e-002   3.030000e-003   4.200000e-004
Mean:       1.052101e-001   3.183675e-002   1.438709e-003
Median:     5.058000e-002   7.540000e-003   9.300000e-004
3rd Qu.:    1.731100e-001   2.197000e-002   1.840000e-003
Max:        2.427900e+000   3.057700e+000   6.679000e-002
Total N:    1.075350e+005   1.075350e+005   1.075350e+005
NA's:       0.000000e+000   0.000000e+000   0.000000e+000
Std Dev.:   1.162344e-001   8.479305e-002   1.956353e-003
Figure 6-10 Histogram, Boxplot, and Probability Plot of NOx Emission Rate
Figure 6-11 Histogram, Boxplot, and Probability Plot of CO Emission Rate
Figure 6-12 Histogram, Boxplot, and Probability Plot of HC Emission Rate
(Axes: emission rate in g/s; quantiles of standard normal.)
Further analysis indicated that there are some zero values in the emission data. There
are several possible reasons for zero values. Missing data caused by loss of communication
between instruments or failure of a particular vehicle were recorded as zero in the data set.
Those zero values were already identified in the data post-processing procedure in Chapter 4.
Zero values might also have occurred when the reference air contained significant amounts of a
pollutant, so that the instrument systematically reported negative emission values. Sensors,
Inc. suggested that negative data should be set to zero. Thus, these values were artificially
recorded as zero, not observed by the test equipment as zero. These zero values would create
truncation issues in the model, since the Sensors, Inc. transit bus data set contained only valid
positive emission data. Truncation arises when a random variable is not observable over its
entire range. Truncation could not be treated as a missing-data problem in which the missing
observations are random. In statistics, analysis can be limited to data that meet certain
criteria or to a data distribution where values above or below a certain point have been
eliminated (or cannot occur). A program was written in MATLAB® to check for the presence of zero
emission values in the data set. There were 1.45% zero values for NOx emissions, 1.65% zero
values for CO emissions, and 3.84% zero values for HC emissions. Since negative emission values
were not observable for the transit bus data set, further analysis focuses on truncated data
sets with valid positive emission data only.
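The zero-value check and truncation step can be sketched as follows. This is a Python illustration, not the MATLAB program referred to above; the function names and the toy emission rates are hypothetical.

```python
from typing import List


def zero_share(rates: List[float]) -> float:
    """Percent of records whose emission rate was recorded as zero."""
    return 100.0 * sum(1 for r in rates if r == 0.0) / len(rates)


def truncate_positive(rates: List[float]) -> List[float]:
    """Keep only valid positive emission rates."""
    return [r for r in rates if r > 0.0]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


# Hypothetical second-by-second emission rates in g/s.
rates = [0.0, 0.02, 0.05, 0.0, 0.13]
positive = truncate_positive(rates)

print("zero share (%):", zero_share(rates))
print("mean before truncation:", mean(rates))
print("mean after truncation:", mean(positive))
```

Because only zeros are removed, the truncated mean equals the original mean multiplied by the ratio of the original sample size to the truncated sample size.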
Numerical summary statistics, such as variable means and standard deviations, for the
truncated emission data are presented in Table 6-2, and relatively simple graphics such as
histograms and boxplots describing variable distributions are presented in Figures 6-13 to 6-15.
The mean of truncated NOx emission data increases 1.26%, while the mean of truncated CO
emission data increases 1.23% and the mean of truncated HC emission data increases 0.99%,
compared with the means of the original data set.
Table 6-2 Basic Summary Statistics for Truncated Emissions Rate Data
            NOx             CO              HC
Min:        1.000000e-005   1.000000e-005   1.000000e-005
1st Qu.:    2.256000e-002   3.190000e-003   4.700000e-004
Mean:       1.067578e-001   3.236955e-002   1.496171e-003
Median:     5.243500e-002   7.770000e-003   9.900000e-004
3rd Qu.:    1.749625e-001   2.246000e-002   1.880000e-003
Max:        2.427900e+000   3.057700e+000   6.679000e-002
Total N:    1.059760e+005   1.057650e+005   1.034050e+005
NA's:       0.000000e+000   0.000000e+000   0.000000e+000
Std Dev.:   1.163785e-001   8.539871e-002   1.973375e-003
Figure 6-13 Histogram, Boxplot, and Probability Plot of Truncated NOx Emission Rate
Figure 6-14 Histogram, Boxplot, and Probability Plot of Truncated CO Emission Rate
Figure 6-15 Histogram, Boxplot, and Probability Plot of Truncated HC Emission Rate
(Axes: truncated emission rate in g/s; quantiles of standard normal.)
These boxplots for truncated emission data show that there are some obvious outliers in
the measured emissions of all three pollutants, and the histograms suggest a high degree of non-
normality, also indicated in the probability plots. There is thus a need to transform the response
variable to correct for this condition. Transformations are used to present data on a different
scale. In modeling and statistical applications, transformations are often used to improve the
compatibility of the data with assumptions underlying a modeling process, to linearize the rela-
tion between two variables whose relationship is non-linear, or to modify the range of values of a
variable (Washington et al. 2003).
6.3.3 Transformation for Emissions Data
Although evidence in the literature suggests that a logarithmic transformation is most
suitable for modeling motor vehicle emissions (Washington 1994; Ramamurthy et al. 1998;
Fomunung 2000; Frey et al. 2002), this choice needs to be verified through the Box-Cox
procedure. The Box-Cox function in MATLAB® can automatically identify a transformation
from the family of power transformations on emission data, with lambda ranging from -1.0 to 1.0.
The lambdas chosen by the Box-Cox procedure are 0.22875 for truncated NOx, -0.0648 for truncated
CO, and 0.14631 for truncated HC.
The Box-Cox procedure is only used to provide a guide for selecting a transformation,
so overly precise results are not needed (Neter et al. 1996). It is often reasonable to use a
nearby lambda value with the power transformation. The lambda values used for transformations
are 1/4 for truncated NOx, 0 for truncated CO, and 0 for truncated HC (a lambda of 0 corresponds
to the logarithmic transformation). Histograms, boxplots, and normal probability plots describing
the transformed variable distributions are presented in Figures 6-16 to 6-18, where a great
improvement is noted.
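The lambda selection can be sketched in Python as a grid search over the Box-Cox profile log-likelihood. This is not the MATLAB function used in this research: the function names, the 0.01 grid spacing, and the toy data are assumptions, and the transformation is written in the standard form $(y^\lambda - 1)/\lambda$, which differs from the simple power $y^\lambda$ only by a linear rescaling.

```python
import math
from typing import List, Optional


def box_cox(y: List[float], lam: float) -> List[float]:
    """Box-Cox transform of positive data; lambda = 0 gives the natural log."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def profile_log_likelihood(y: List[float], lam: float) -> float:
    """Box-Cox profile log-likelihood (up to an additive constant)."""
    z = box_cox(y, lam)
    n = len(z)
    mean = sum(z) / n
    variance = sum((v - mean) ** 2 for v in z) / n
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(v) for v in y)


def best_lambda(y: List[float], grid: Optional[List[float]] = None) -> float:
    """Grid search over lambda in [-1, 1], the range quoted in the text."""
    if grid is None:
        grid = [i / 100.0 for i in range(-100, 101)]
    return max(grid, key=lambda lam: profile_log_likelihood(y, lam))


# Hypothetical positive data whose logarithms are evenly spaced and symmetric.
sample = [math.exp(v) for v in (-2.0, -1.0, 0.0, 1.0, 2.0)]
print("selected lambda:", best_lambda(sample))
```

As noted above, the selected value serves only as a guide; a nearby convenient lambda such as 0 or 1/4 would then be used for the actual transformation.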
Figure 6-16 Histogram, Boxplot, and Probability Plot of Truncated Transformed NOx Emission Rate
Figure 6-17 Histogram, Boxplot, and Probability Plot of Truncated Transformed CO Emission Rate
Figure 6-18 Histogram, Boxplot, and Probability Plot of Truncated Transformed HC Emission Rate
Although transformations can improve a specific modeling assumption such as linearity or
normality, they can often result in the violation of others. Thus, transformations must be used
in an iterative fashion, with continued checking of other modeling assumptions as
transformations are made. Washington et al. (2003) suggested that comparisons of statistical
models should always be made on the original untransformed scale of Y, and that this extends to
goodness-of-fit statistics and model validation exercises.
6.3.4 Identification of High Emitter
From a modeling viewpoint, it is important to accurately predict the number of 'high
emitter' vehicles in the fleet (older technology, poorly maintained, or tampered vehicles that emit
significantly elevated emissions relative to the fleet average under all operating conditions) and
the fraction of activities that yield high emissions for normal emitting vehicles. Historic practic-
es to identify 'high emitters' in a data set have relied on judgment to set cut points that are often
indefensible from a statistical, and sometimes even practical, perspective. U.S. EPA uses five
times the prevailing emission standards as the cut point across all pollutants (U.S. EPA 1993),
while CARB has defined different emission regimes ranging from normal to super emitters and
used different criteria for each regime (CARB 1991; Carlock 1994) (see Table 6-3).
Table 6-3 CARB Emission Regime Definition (Carlock 1994)
Emitter Status   NOx               CO                 HC
Normal           < 1 standard      < 1 standard       < 1 standard
Moderate         1 to 2 standard   1 to 2 standard    1 to 2 standard
High             2 to 3 standard   2 to 6 standard    2 to 4 standard
Very High        3 to 4 standard   6 to 10 standard   5 to 9 standard
Super            > 4 standard      > 10 standard      > 9 standard
In contrast, the methodology employed in MEASURE database development at Georgia
Tech is statistically based. Wolf et al. used regression tree techniques to classify vehicles into
classes that behave similarly, exhibit similar technology characteristics, and exhibit similar mean
emission rates under standardized testing conditions (Wolf et al. 1998). The cut points within
each technology class are then defined on the basis of pre-selected percentiles of a normal distri-
bution of the emission rates for each pollutant. The analysis by Wolf et al. specified a cut point
of 97.73 percent (that is, mean + 2 standard deviations), which implies that approximately 2.27
percent of the vehicles in each technology class are high emitters.
For this research, although inter-bus variability exists in the data set, these 15 buses
should be treated as one technology class because they shared the same fuel injection type,
catalytic converter type, and transmission type, and their model years and odometer readings
were similar. As in the approach of Wolf et al. (1998), the emissions value located two standard
deviations above the mean of the normalized emissions distribution is used as a cut point to
distinguish between normal and high emission points. Theoretically, this method will consistently
identify approximately 2.27 percent of the data as high emission points; that is, 97.73 percent
of the population should fall into the normal status. Analysis results showed that 0.33 percent
of NOx emissions, 3.76 percent of CO emissions, and 1.37 percent of HC emissions were identified
as high emission points. The distribution of those high emission points by bus is shown
in Table 6-4.
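The cut-point rule can be sketched in Python as follows. The sketch assumes that the "normalized" distribution refers to the transformed emission rates (lambda of 0 for a logarithm, otherwise a simple power) and that the sample standard deviation is used; neither detail is stated explicitly above. Function names and the toy data are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List


def upper_tail_probability(k: float) -> float:
    """Share of a normal distribution lying more than k standard deviations above the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def transform(rates: List[float], lam: float) -> List[float]:
    """Power transformation; lambda = 0 is treated as the natural logarithm."""
    if lam == 0:
        return [math.log(r) for r in rates]
    return [r ** lam for r in rates]


def flag_high_emissions(rates: List[float], lam: float, k: float = 2.0) -> List[bool]:
    """Flag records above mean + k standard deviations on the transformed scale."""
    z = transform(rates, lam)
    cut_point = mean(z) + k * stdev(z)
    return [value > cut_point for value in z]


# Hypothetical positive emission rates with one extreme record.
rates = [1.0] * 20 + [1.0e6]
flags = flag_high_emissions(rates, lam=0)

print("theoretical share above the cut point:", upper_tail_probability(2.0))
print("records flagged:", sum(flags), "of", len(flags))
```

The observed shares reported above differ from the theoretical 2.27 percent, which is expected whenever the transformed data are not exactly normal.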
Table 6-4 Percent of High Emission Points by Bus
            NOx      CO       HC
bus 360     0.02%    2.80%    5.06%
bus 361     0.32%    1.08%    0.25%
bus 363     0.06%    3.10%    0.00%
bus 364     0.04%    0.87%    7.38%
bus 372     0.00%    0.13%    1.96%
bus 375     0.69%    3.16%    0.27%
bus 377     0.00%    4.44%    0.00%
bus 379     0.67%    2.85%    1.17%
bus 380     0.52%    7.67%    0.69%
bus 381     0.10%    4.76%    0.14%
bus 382     1.14%    8.12%    0.36%
bus 383     0.88%    3.44%    1.82%
bus 384     0.50%    5.10%    1.33%
bus 385     0.55%    2.10%    0.60%
bus 386     0.20%    6.63%    0.57%
Total       0.36%    3.81%    1.38%
For each individual bus, the highest proportion is 1.14 percent for bus 382 for NOx
emissions, 8.12 percent for bus 382 for CO emissions, and 7.38 percent for bus 364 for HC
emissions. No evidence from Table 6-4 suggests that there are any "high emitters" (older
technology, poorly maintained, or tampered vehicles) in the data set. This conclusion makes
sense since all buses were only 5 or 6 years old during the test. Another finding is that a
small fraction of a bus's observed activity exhibited disproportionately high emissions. Such
activities reported in the literature include hard accelerations at low speeds, moderate
accelerations at high speeds, or equivalent accelerations against gravity (Fomunung 2000). Given
that high emission points make up only 0.33 percent of the data set for NOx, 3.76 percent for CO,
and 1.37 percent for HC, it is not necessary to develop two different models for normal
emissions and high emissions. Based on this analysis, these 15 buses should be treated as one
technology class since no high emitters were identified.
6.4 Potential Explanatory Variables
There are four main groups of parameters that affect vehicle emissions, as indicated in
the literature (Guensler 1993; Clark et al. 2002). These groups are: 1) vehicle characteristics,
including vehicle type, make, model year, engine type, transmission type, frontal area, drag
coefficient, rolling resistance, vehicle maintenance history, etc.; 2) roadway characteristics,
including road grade and possibly pavement surface roughness, etc.; 3) on-road load parameters,
such as on-road driving trace (sec-by-sec) or speed/acceleration profile, vehicle payload,
on-road operating modes, driver behavior, etc.; and 4) environmental conditions, including
humidity, ambient temperature, and ambient pressure (Feng et al. 2005; Guensler et al. 2005).
In general, emissions from HDDVs are more likely to be a function of brake-horsepower
load on the engine (especially for NOx) than emissions from light-duty gasoline vehicles, because
instantaneous emission levels of diesel engines are highly correlated with the instantaneous
work output of the engine (Ramamurthy et al. 1999; Feng et al. 2005). In particular, the
higher the engine load, the higher the NOx emissions. The emissions modeling framework (from
which most of the items below are derived) is outlined in the Regional Applied Research Effort
(RARE) report (Guensler et al. 2006). The goal of that modeling regime was to predict on-road
load and then apply appropriate emission rates to the load. Most of the items outlined below are
related to the amount of engine load that a vehicle will experience. Although each of the
variables below is important, the values are not always available in on-road testing data
(future data collection efforts should ensure that all of these data are collected). However,
engine load in the AATA database could be used in emission rate model development for this
research. In addition, some factors, such as temperature and humidity, may affect emission rates
independent of load, or perhaps interact with load. The model should incorporate such variables.
6.4.1 Vehicle Characteristics
Factors related to vehicle characteristics influencing heavy-duty diesel vehicle emissions
which are summarized in the literature include vehicle class (i.e., weight, engine size, horsepow-
er rating), model year, vehicle mileage, emission control system (i.e., engine exhaust aftertreat-
ment system), transmission type, inspection and maintenance history, etc. (Guensler 1993; Clark
et al. 2002).
The effect of vehicle class on emissions is significant. Five main factors that cause a
vehicle to demand engine power are vehicle speed, vehicle acceleration, drive train inertial ac-
celeration, vehicle weight, and road grade. As the required power and work performed by the
vehicle increase, the amount of fuel burned to produce that power also increases, and the appli-
cable emission rates also generally increase. Thus, emissions vary as a function of vehicle class
and vehicle configuration. The higher truck classes with larger engines are heavier and, thus,
typically produce more emissions. Vehicle configurations with large frontal areas and high drag
coefficients will yield higher emissions when operated at higher speeds and/or accelerated at
higher rates.
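The dependence of power demand on speed, acceleration, weight, grade, and drag described above can be illustrated with a generic road-load calculation. The following Python sketch is not the power demand equation used in this research; the function name, the SI units, and all coefficient values (air density, drag coefficient, frontal area, rolling resistance, and rotating-inertia factor) are assumptions chosen only for illustration.

```python
import math

G = 9.81           # gravitational acceleration, m/s^2
AIR_DENSITY = 1.2  # kg/m^3 (assumed ambient value)


def road_load_power_kw(speed_mps, accel_mps2, grade_pct, mass_kg,
                       frontal_area_m2=8.0, drag_coeff=0.7,
                       rolling_coeff=0.008, inertia_factor=0.0):
    """Approximate tractive power (kW) at the wheels.

    All default coefficients are hypothetical placeholders, not values
    taken from the report.
    """
    theta = math.atan(grade_pct / 100.0)
    # Force to accelerate the vehicle (plus an optional allowance for
    # rotating drive train components via inertia_factor).
    inertial = mass_kg * (1.0 + inertia_factor) * accel_mps2
    # Force to climb the grade.
    grade = mass_kg * G * math.sin(theta)
    # Rolling resistance.
    rolling = rolling_coeff * mass_kg * G * math.cos(theta)
    # Aerodynamic drag grows with the square of speed.
    drag = 0.5 * AIR_DENSITY * drag_coeff * frontal_area_m2 * speed_mps ** 2
    return (inertial + grade + rolling + drag) * speed_mps / 1000.0


light = road_load_power_kw(15.0, 0.5, 2.0, mass_kg=12000.0)
heavy = road_load_power_kw(15.0, 0.5, 2.0, mass_kg=18000.0)
```

Under these assumed coefficients, the same speed and acceleration trace demands more power from a heavier vehicle or on a steeper grade, which is the qualitative behavior described in the text.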
The concept of vehicle technology groups is to identify and track subsets of vehicles that
have similar on-road load responses and similar laboratory emission rate performance. The basic
premise is that vehicles in the same heavy-duty vehicle class, employing similar drive train sys-
tems, and of the same size and shape have similar load relationships. There is also an important
practical consideration in establishing vehicle technology groups. Researchers need to be able to
identify these vehicles in the field during traffic counting exercises.
The starting point for technology group criteria is a visual classification scheme. Yoon et
al. (Yoon et al. 2004a) developed a new HDV visual classification scheme called the X-scheme
based on the number of axles and gross vehicle weight ratings (GVWR) as a hybrid scheme
between the FHWA truck and U.S. EPA HDV classification schemes. With field-observed HDV
volumes, emissions rates estimated using the X-scheme were 34.4% and 32.5% higher for NOx
and PM, compared to using the standard U.S. EPA guidance (U.S. EPA 2004c). The X-scheme
reflects vehicle composition in the field more realistically than does the standard U.S. EPA guid-
ance (U.S. EPA 2004c), which shifted heavy-HDV volumes into light- or medium-HDV volumes
21% more frequently than the X-scheme. Figure 6-19 shows X-scheme classes and their typical
figures (Yoon et al. 2004a).
[Figure: X-scheme class labels (X2, X3) shown against the EPA classes HDV2b, HDV3, HDV4, HDV5, HDV6, HDV7, HDV8a, and HDV8b, with typical vehicle silhouettes]
Figure 6-19 The X Classes and Typical Vehicle Configurations
Vehicle age and model year effects are accounted for because some vehicle models have
much lower average emissions. Researchers from West Virginia University reported that most
regulated emissions from engines produced by Detroit Diesel Corporation have declined over
the years and the expected trend of decreasing emission levels with the model year of the engine
is clear and consistent for PM, HC, CO and NOx, starting with the 1990 models (Prucz et al.
2001). Information on vehicle age can be obtained from a registration database using vehicle
identification numbers and truck manufacturer records. The registration database can be sorted
by calendar year and show vehicles registered in the given year by model year. However, given
the differences noted between field-observation fleet composition and registration data in the
light-duty fleet (Granell et al. 2002), significant additional research efforts designed to model the
on-road subfleet composition (classifications and model year distributions) are even more war-
ranted for HDVs. It is also important to keep in mind that heavy-duty engines accumulate miles
of travel very rapidly and that engine rebuilding is a common practice. Hence, the age of the
vehicle does not necessarily equal the age of the engine. Previous field work in Atlanta indicates
that on-road surveys provide better information on fleet composition (Ahanotu 1999). To refine
the model, appropriate data sets that include detailed information on engine type, transmission
type, etc. will be needed to appropriately subdivide the observed on-road groups and continue to
develop respective emission rates. The data collection challenge in this area is daunting, but it is
worthwhile to perform once to provide a library of information that can be used in a large num-
ber of modeling applications.
Vehicle weight is critical to the engine power that must be supplied to produce
the tractive force needed to overcome inertial and drag forces, and it therefore influences vehicle
emissions. NOx emissions increase as the vehicle weight increases, and this relationship does not vary
much from vehicle to vehicle (Gajendran and Clark 2003). The effects of vehicle age, engine
horsepower ratings, transmission type, and engine exhaust aftertreatment were also investigated
in other literature (Clark et al. 2002; Feng et al. 2005).
The vast majority of heavy-duty vehicles are normal emitters, but a small percentage of
vehicles are high-emitters under every operating condition, typically because they have been
tampered with or they are malfunctioning (i.e., defective or mal-maintained engine sensors or
actuators). As the vehicle ages, general engine wear and tear will increase emission rates mod-
erately due to normal degradation of emission controls of properly functioning vehicles. On the
other hand, as vehicles age, the probability increases that some of the vehicles will malfunction
and produce significantly higher emissions (i.e., become high-emitters). Probability functions
that classify vehicles within specific model years (and later, within specific statistically-derived
vehicle technology groups) are currently being developed through the assessment of certification
testing and various roadside emissions tests. Obtaining additional detailed sources of data for
developing failure models appears to be warranted.
After engine horsepower at the output shaft has been reduced by power losses associ-
ated with fluid pressures, operation of air conditioning, and other accessory loads, there is still an
additional and significant drop in available power from the engine before reaching the wheels.
Power is required to overcome mechanical friction within the transmission and differential, inter-
nal working resistance in hydraulic couplings and friction of the vehicle weight on axle bearings.
The combined effect of these components is parameterized as drive train efficiency. However,
the more difficult and more significant component of power loss in the drive train is associated
with the inertial resistance to rotational acceleration of drive train components (Gillespie 1992).
A heavy-duty truck drive train is significantly more massive than its light-duty counter-
part. The net effect of drive train inertial losses when operating in higher gears on the freeway
may not be significant enough to be included in the model (relative to the other load-related com-
ponents in the model for these heavy vehicles). However, recent studies appear to indicate very
high truck emission rates (gram/second) in "creep mode" stop and start driving activities noted
in ports and rail yards. Thus, high inertial loads for low gear, low speed, and acceleration opera-
tions may contribute significantly to emissions from mobile sources in freight transfer yards and
therefore should not be ignored (Guensler et al. 2006).
The inertial losses are a function of a wide variety of physical drive train characteristics
(transmission and differential types, component mass, etc.) and on-road operating conditions. To
refine the use of inertial losses in the modal model, new drive train testing data will be designed
to evaluate the inertial losses for various engine, drive shaft, differential, axle, and wheel com-
binations and to establish generalized drive train technology classes. Then, gear selection prob-
ability matrices for each drive train technology class and gear and final drive ratio data can be
provided in lookup tables for model implementation, in place of the inertial assumptions current-
ly employed. However, data are currently significantly lacking for development of such lookup
tables.
6.4.2 Roadway Characteristics
The three basic geometric elements of a roadway are the horizontal alignment, the cross-
slope or amount of super-elevation and the longitudinal profile or grade. Among them, road
grade has been shown to have significant impact on engine load and vehicle emissions (Guensler
1993). Other roadway characteristics, such as lane width, are also noted to have a significant
impact on the speed-acceleration profiles of heavy-duty vehicles and can therefore affect engine
load (Grant et al. 1996).
6.4.3 Onroad Load Parameters
Onroad load parameters include on-road driving trace (second-by-second) or speed/ac-
celeration profile, engine load, on-road operating modes (i.e., idling, motoring, acceleration,
deceleration, and cruise), driver behavior, and so on. Vehicle speed and acceleration are integral
components for the estimation of vehicle road load, and therefore engine load. Previous studies
indicated that increased engine power requirements could result in the increase in NOx emissions
(Ramamurthy and Clark 1999; Feng et al. 2005). Clark et al. reported that the vehicle applica-
tions and duty cycles can have an effect on the emission produced (Clark et al. 2002). This study
found that over a typical day of use for any vehicle, one that stops and then accelerates more
often might produce higher distance-specific emissions, providing all else is held constant.
Passenger and freight payloads together with the vehicle tare weight contribute to the
demand for power that must be supplied to produce the tractive force needed to overcome inertial
and drag forces. Passenger loading functions for transit operations can be obtained through anal-
ysis of fare data or on-board passenger count programs. On the heavy-duty truck side, on-road
freight weight distributions by vehicle class can be derived from roadside weigh station studies.
Ahanotu conducted detailed weigh-in-motion studies in Atlanta and found that reasonable load
distributions by truck class and time of day could be applied in such a modal modeling approach
(Ahanotu 1999). Although additional field studies are warranted to examine the validity of the
Atlanta results over time and the transferability of findings in Atlanta to other metropolitan areas
(especially considering the potential variability in commodity transport, such as agricultural
goods, that may occur in other areas), the modeling methodology seems appropriate.
6.4.4 Environmental Conditions
Environmental conditions under which the vehicle is operated include humidity, ambient
temperature, and ambient pressure. U.S. EPA is currently conducting studies to find the effect of
ambient conditions on HDDV emissions (NRC 2000). The current MOBILE6.2 model includes
correction factors to account for the impact of environmental conditions on vehicle emission
rates. Given the lack of compelling additional data available for analysis, it may be necessary
to ignore the effects of these environmental parameters (altitude, temperature, and humidity) or
simply incorporate the existing MOBILE6.2 correction factors. Preliminary analyses of the data
and methods used to derive the MOBILE6.2 environmental correction factors indicate that the
embedded equations in MOBILE6.2 probably need to be revisited.
6.4.5 Summary
It is impossible for a modeler to include all explanatory variables identified in the literature
review for model development because the explanatory variables available for model develop-
ment and model validation are only a subset of potential explanatory variables identified above.
Therefore, the conceptual model will only include available variables and derived variables in
the data set provided.
6.5 Selection of Explanatory Variables
As mentioned earlier, available explanatory variables for transit buses are only a subset of
potential explanatory variables identified. In brief, available explanatory variables can be sum-
marized as:
• Test information: date, time;
• Vehicle characteristics: license number; model year, odometer reading, engine size,
instrument configuration number;
• Roadway characteristics: road grade (%);
• Onroad load parameters: engine power (bhp), vehicle speed (mph), acceleration
(mph/s);
• Engine operating parameters: throttle position (0 - 100%), engine oil temperature
(deg F), engine oil pressure (kPa), engine warning lamp (Binary), engine coolant tem-
perature (deg F), barometric pressure reported from ECM (kPa);
• Environmental conditions: ambient temperature (deg C), ambient pressure (mbar),
ambient relative humidity (%), ambient absolute humidity (grains/lb air).
The most important question related to engine power is how to simulate engine power in
the real world for application purposes. Georgia Institute of Technology researchers developed
a transit bus engine power demand simulator (TB-EPDS), which estimates transit bus power
demand for given speed, acceleration, and road grade conditions (Yoon et al. 2005a; Yoon et al.
2005b). Speed-acceleration-road grade matrices were developed from speed and location data
obtained using a Georgia Tech Trip Data Collector. The researchers conclude that speed-accel-
eration-road grade matrices at the link level or the route level are both acceptable for regional
inventory development. However, for micro-scale air quality impact analysis, link-based
matrices should be employed (Yoon et al. 2005a). Although significant uncertainties still exist for
inertial loss, which is significant at low speeds, and for motoring mode with negative engine power,
this research showed that using engine power as load data is possible for application purposes.
Thus, we concluded that engine power could be used as load data in the estimated emission models.
The relationships between explanatory variables were investigated using S-Plus®. Three
variables (engine size, instrument configuration number, and engine warning lamp) were excluded
because each has only a single value for all records. There are 14 explanatory
variables included in the correlation analysis. The correlation matrix is shown in Table 6-5.
Table 6-5 Correlation Matrix for Transit Bus Data Set
*** Correlations for data in: transitbus.data ***

                         model.year      odometer   temperature          baro
model.year              1.000000000  -0.655273106   0.047048515   0.394378106
odometer               -0.655273106   1.000000000   0.186771499  -0.704310642
temperature             0.047048515   0.186771499   1.000000000  -0.326938545
baro                    0.394378106  -0.704310642  -0.326938545   1.000000000
SCB.RH                  0.068411842   0.343814465   0.488214011  -0.632480147
humid                   0.030997734   0.390261480   0.751260451  -0.649522446
grade                  -0.004241021   0.000527370  -0.005590441   0.002384338
vehicle.speed          -0.014916204  -0.062908098  -0.225478003   0.054918347
throttle.position      -0.001868240   0.009346571  -0.091132660  -0.014470281
oil.temperature         0.051759069  -0.011881827   0.042676227  -0.026744091
oil.pressure            0.050521339  -0.098442472  -0.073256993   0.034212231
coolant.temperature     0.206727241  -0.117710067   0.077114798   0.045844706
eng.bar.press           0.137781076  -0.248876183  -0.260525088   0.371021489
engine.power           -0.006066455   0.021283229  -0.059512654  -0.035718725

                             SCB.RH         humid         grade vehicle.speed
model.year             0.0684118427   0.030997734  -0.004241021  -0.014916204
odometer               0.3438144652   0.390261480   0.000527370  -0.062908098
temperature            0.4882140119   0.751260451  -0.005590441  -0.225478003
baro                  -0.6324801472  -0.649522446   0.002384338   0.054918347
SCB.RH                 1.0000000000   0.931879078  -0.006075112  -0.034502697
humid                  0.9318790788   1.000000000  -0.006411009  -0.117870984
grade                 -0.0060751123  -0.006411009   1.000000000   0.000896568
vehicle.speed         -0.0345026977  -0.117870984   0.000896568   1.000000000
throttle.position      0.0134235743  -0.024720165   0.020186507   0.387705398
oil.temperature         0.096018579   0.087317807  -0.007116669   0.018641433
oil.pressure          -0.0498528376  -0.077649741   0.009836954   0.567493814
coolant.temperature    0.2005559889   0.171558840  -0.014531524   0.072998199
eng.bar.press         -0.3663829274  -0.373540032   0.002132063   0.143270319
engine.power           0.0257436423  -0.003279122   0.021662091   0.303209657
acc                    0.0000403711   0.003340728   0.012930076   0.000224126

                      throttle.position  oil.temperature  oil.pressure
model.year                 -0.001868240      0.051759069   0.050521339
odometer                    0.009346571     -0.011881827  -0.098442472
temperature                -0.091132660      0.042676227  -0.073256993
baro                       -0.014470281     -0.026744091   0.034212231
SCB.RH                      0.013423574      0.096018570  -0.049852837
humid                      -0.024720165      0.087317807  -0.077649741
grade                       0.020186507     -0.007116669   0.009836954
vehicle.speed               0.387705398      0.018641433   0.567493814
throttle.position           1.000000000      0.012077329   0.681336402
oil.temperature             0.012077329      1.000000000  -0.117896787
oil.pressure                0.681336402     -0.117896787   1.000000000
coolant.temperature         0.059605193      0.335667341  -0.298083257
eng.bar.press               0.102861968      0.059886972   0.022549030
engine.power                0.959310116      0.007171781   0.656609695
acc                         0.660747116     -0.004185245   0.465493435

                      coolant.temperature  eng.bar.press  engine.power
model.year                    0.206727241    0.137781076  -0.006066455
odometer                     -0.117710067   -0.248876183   0.021283229
temperature                   0.077114798   -0.260525088  -0.059512654
baro                          0.045844706    0.371021489  -0.035718725
SCB.RH                        0.200555988   -0.366382927   0.025743642
humid                         0.171558840   -0.373540032  -0.003279122
grade                        -0.014531524    0.002132063   0.021662091
vehicle.speed                 0.072998199    0.143270319   0.303209657
throttle.position             0.059605193    0.102861968   0.959310116
oil.temperature               0.335667341    0.059886972   0.007171781
oil.pressure                 -0.298083257    0.022549030   0.656609695
coolant.temperature           1.000000000    0.284506753   0.050584845
eng.bar.press                 0.284506753    1.000000000   0.089702976
engine.power                  0.050584845    0.089702976   1.000000000
All variable pairs with correlation coefficients greater than 0.5 in absolute value were scrutinized
and subjected to further analysis, which helped in paring down the number of variables.
The values in the correlation matrix show that two pairs, throttle position and engine power, and
ambient relative humidity and ambient absolute humidity, are highly correlated (higher than 0.90). Model
year and odometer, odometer and barometric pressure, barometric pressure and ambient relative
humidity, barometric pressure and ambient absolute humidity, ambient absolute humidity and
temperature, oil pressure and throttle position, oil pressure and vehicle speed, oil pressure and
engine power, throttle position and acceleration, engine power and acceleration are moderately
correlated (absolute value higher than 0.50). Other pairs of variables, however, have only slight correlations.
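The screening step described above can be sketched in a few lines of Python. This is only an illustration, not the S-Plus procedure used by the authors: the function names and band labels are assumptions, the comparison is made on the absolute value of the coefficient, and the example coefficients are a small subset copied from Table 6-5.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_band(r, high=0.90, moderate=0.50):
    """Assign a coefficient to the bands used in the text (by magnitude)."""
    magnitude = abs(r)
    if magnitude > high:
        return "high"
    if magnitude > moderate:
        return "moderate"
    return "slight"


# Selected coefficients copied from Table 6-5.
pairs = {
    ("throttle.position", "engine.power"): 0.959310116,
    ("SCB.RH", "humid"): 0.931879078,
    ("model.year", "odometer"): -0.655273106,
    ("odometer", "baro"): -0.704310642,
    ("oil.pressure", "vehicle.speed"): 0.567493814,
    ("grade", "vehicle.speed"): 0.000896568,
}
bands = {pair: correlation_band(r) for pair, r in pairs.items()}
```

Working with the magnitude matters here because several of the moderately correlated pairs, such as model year and odometer, have negative coefficients.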
The relationship between throttle position and engine power is shown in Figure 6-20.
Since engine power is derived from percent engine load, engine torque, and engine speed, and
previous studies indicated that increased engine power requirements could result in the increase
in NOx emissions (Ramamurthy and Clark 1999; Feng et al. 2005), the author retained engine
power in the database.
[Figure: scatter plot of engine power against throttle position (0-100%) with a fitted linear regression line; R Square approximately 0.92]
Figure 6-20 Throttle Position vs. Engine Power for Transit Bus Data Set
Ambient relative humidity and ambient absolute humidity provide the same informa-
tion in two different ways, and either is enough to consider the influence of ambient humidity on
emissions. The author retained ambient relative humidity in the database.
Three other findings related to the correlation matrix are:
1. All environmental characteristics, like temperature, humidity, and barometric pres-
sure, are moderately correlated with each other (Figure 6-21), which indicates mod-
elers should consider such relationships when developing environmental factors.
2. Engine power is correlated with not only on-road load parameters such as vehicle
speed, acceleration, and road grade, but also engine operating parameters such as
throttle position and engine oil pressure. Engine power in this data set is derived
from measured engine speed, engine torque and percent engine load. On the other
hand, engine power could be derived theoretically from vehicle speed, accelera-
tion and road grade using an engine power demand equation. So, engine power
can connect on-road modal activity with engine operating conditions at this level.
This fact strengthens the case for introducing engine power into a conceptual
emissions model and for improving the ability to simulate engine power for regional
inventory development.
3. Engine operating parameters, like throttle position (0 - 100%), engine oil pres-
sure (kPa), engine oil temperature (deg F), engine coolant temperature (deg F), and
barometric pressure reported from ECM (kPa), are highly or moderately related
to on-road operating parameters. For example, engine power and throttle position
are highly correlated, while oil pressure and vehicle speed, oil pressure and en-
gine power, throttle position and acceleration are moderately correlated. Although
engine operating parameters may have power to explain the variability of emis-
sion data, it is difficult to obtain such data in the real world for modeling purposes.
These four variables are retained for further analysis of their relationships with
emissions. Although these four variables will be excluded from the emission model
at this time, analysis of these potential relationships may indicate a need for further
research in this area.
[Figure: three scatter plot panels of the environmental parameters; x-axes: Barometric Pressure (mbar) in two panels and Ambient Relative Humidity (%) in one panel]
Figure 6-21 Scatter plots for environmental parameters
CHAPTER 7
7. MODAL ACTIVITY DEFINITIONS DEVELOPMENT
7.1 Overview of Current Modal Activity Definitions
Current research suggests that vehicle emission rates are highly correlated with modal
vehicle activity. Modal activity is a vehicle activity characterized by cruise, idle, acceleration
or deceleration operation. Consequently, a modal approach to transportation-related air quality
modeling is becoming widely accepted as more accurate in making realistic estimates of mobile
source contribution to local and regional air quality. Research at Georgia Tech has clearly identi-
fied that modal operation is a better indicator of emission rates than average speed (Bachman
1998). The analysis of emissions with respect to driving modes, also referred to as modal
emissions, has been performed in several recent studies (Barth et al. 1996; Bachman 1998; Fomunung
et al. 1999; Frey et al. 2002; Nam 2003; Barth et al. 2004). These studies indicated that driving
modes might have the ability to explain a significant portion of variability of emission data.
Usually, driving can be divided into four modes: acceleration, deceleration, cruise, and idle. But
driving mode definitions in literature were somewhat arbitrary. To define the driving modes or
choose more reasonable definitions for the proposed modal emissions model, current driving
mode definitions used in different modal emission models need to be investigated first.
MEASURE's Definitions
Researchers at Georgia Tech developed the MEASURE model in 1998 (Guensler et al.
1998). This model was developed from more than 13,000 laboratory tests conducted by the
EPA and CARB using standardized test cycle conditions and alternative cycles (Bachman 1998).
Modal activity variables were introduced into the MEASURE model as follows: acceleration
(mph/sec), deceleration (mph/sec), cruise (mph), and percent of time in idle. In addition, two
surrogate variables were developed: the inertial power surrogate (IPS) (mph^2/s), defined
as acceleration times velocity, and the drag power surrogate (DPS) (mph^3/s), defined as
acceleration times velocity squared. Within each mode, several 'cut points', or threshold values,
were specified and used to create several categories. In total, six threshold values were defined
for acceleration, three for deceleration, five for cruise modes, seven for IPS, and seven for DPS.
Modal activity surrogate variables were added as the percent of cycle time spent in specified
operating conditions (Fomunung et al. 1999).
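The two surrogates and the "percent of cycle time" form of the modal variables can be illustrated as follows. This is a minimal Python sketch based only on the definitions given above; the function names, the example trace, and the cut point of 20 mph^2/s are hypothetical and are not the thresholds used in MEASURE.

```python
def power_surrogates(speed_mph, accel_mph_s):
    """Return (IPS, DPS) lists for a second-by-second trace.

    IPS = acceleration * velocity          (mph^2/s)
    DPS = acceleration * velocity squared  (mph^3/s)
    """
    ips = [a * v for v, a in zip(speed_mph, accel_mph_s)]
    dps = [a * v * v for v, a in zip(speed_mph, accel_mph_s)]
    return ips, dps


def percent_time_above(values, cut_point):
    """Percent of cycle time (records) spent above a cut point."""
    return 100.0 * sum(1 for x in values if x > cut_point) / len(values)


# Hypothetical four-second trace.
speed = [10.0, 20.0, 30.0, 30.0]
accel = [1.0, 2.0, 0.5, 0.0]
ips, dps = power_surrogates(speed, accel)
share = percent_time_above(ips, cut_point=20.0)  # hypothetical cut point
```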
NCSU's Definitions
Dr. Frey at NCSU defined four modes of operation (idle, acceleration, deceleration, and
cruise) for U.S. EPA's MOVES model in 2001 (Frey and Zheng 2001; Frey et al. 2002). The
following description is quoted directly from his report (Frey et al. 2002).
Idle is defined as based upon zero speed and zero acceleration. The
acceleration mode includes several considerations. First, the vehicle must be
moving and increasing in speed. Therefore, speed must be greater than zero and
the acceleration must be greater than zero. However, vehicle speed can vary
slightly during events that would typically be judged as cruising. Therefore,
in most instances, the acceleration mode is based upon a minimum accelera-
tion of 2 mph/sec. However, in some cases, a vehicle may accelerate slowly.
Therefore, if the vehicle has had a sustained acceleration rate averaging at least
1 mph/sec for at least three seconds or more, that is also considered accelera-
tion. Deceleration is defined in a similar manner as acceleration, except that the
criteria for deceleration are based upon negative acceleration rates. All other
events not classified as idle, acceleration, or deceleration, are classified as cruis-
ing. Thus, cruising is approximately steady speed driving but some drifting of
speed is allowed.
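The rules quoted above can be sketched as a second-by-second classifier. The Python code below is one possible reading of the quoted description, not NCSU's implementation: the function name is hypothetical, and the handling of "sustained" acceleration (any three-second window in which every second has the same sign and the average rate reaches 1 mph/sec) is an assumption.

```python
def classify_ncsu(speed_mph, accel_mph_s, hard_cut=2.0,
                  sustained_cut=1.0, window=3):
    """Label each second as idle, acceleration, deceleration, or cruise."""
    n = len(speed_mph)
    sustained_acc = [False] * n
    sustained_dec = [False] * n
    # Flag seconds that belong to a window of sustained (slow) acceleration
    # or deceleration. The window rule is an assumed interpretation.
    for start in range(n - window + 1):
        chunk = accel_mph_s[start:start + window]
        mean_rate = sum(chunk) / window
        if all(a > 0 for a in chunk) and mean_rate >= sustained_cut:
            for i in range(start, start + window):
                sustained_acc[i] = True
        if all(a < 0 for a in chunk) and mean_rate <= -sustained_cut:
            for i in range(start, start + window):
                sustained_dec[i] = True

    modes = []
    for i in range(n):
        v, a = speed_mph[i], accel_mph_s[i]
        if v == 0 and a == 0:
            modes.append("idle")
        elif v > 0 and a > 0 and (a >= hard_cut or sustained_acc[i]):
            modes.append("acceleration")
        elif a < 0 and (a <= -hard_cut or sustained_dec[i]):
            modes.append("deceleration")
        else:
            modes.append("cruise")
    return modes


# Hypothetical trace: stop, launch, gentle acceleration, cruise, braking.
speed = [0, 0, 3, 6, 7, 8, 9, 9, 9, 6, 3]
accel = [0, 0, 3, 3, 1, 1, 1, 0, 0, -3, -3]
modes = classify_ncsu(speed, accel)
```

In this trace the three seconds at 1 mph/sec are labeled acceleration only because they form a sustained window; an isolated second at the same rate would fall into cruise.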
Physical Emission Rate Estimator's (PERE's) Definitions
Dr. Nam developed his definitions when he introduced his Physical Emission Rate Esti-
mator (PERE) model in 2003 (Nam 2003). Idle is defined as speed less than 2 mph. Accelera-
tion mode is based on acceleration rate greater than 1 mph/sec. However, deceleration is based
on deceleration rate less than -0.2 mph/sec. Other events are classified as cruise mode and the
acceleration range is between -0.2 mph/sec and 1 mph/sec. Nam also mentioned in his report
that the definition of cruise (based only on acceleration) will change depending on the speed in
future studies.
Summary
Current driving mode definitions used in modal emission models differ significantly
from each other. NCSU used one absolute critical value, 2 mph/sec, for both acceleration
and deceleration modes. PERE, however, chose two different critical values, 1 mph/sec and -0.2
mph/sec, for acceleration and deceleration modes, respectively. The critical values (2 mph/sec, 1
mph/sec, and -0.2 mph/sec) were chosen somewhat arbitrarily. MEASURE used several threshold
values to add modal activity surrogate variables. Table 7-1 summarizes these modal activity
definitions.
Table 7-1 Comparison of Modal Activity Definition
                MEASURE                    NCSU                     PERE
Idle            Speed=0, Acc=0             Speed=0, Acc=0           Speed<2
Acceleration    Acc>6, Acc>5, Acc>4,       Acc>2 or Acc>1 for       Acc>1
                Acc>3, Acc>2, Acc>1        three seconds
Deceleration    Acc<-3, Acc<-2, Acc<-1     Acc<-2 or Acc<-1 for     Acc<-0.2
                                           three seconds
Cruise          Speed>70, Speed>60,        Other events             Acc from -0.2 to 1
                Speed>50, Speed>40,
                Speed>30
... dependent on the available speed/acceleration data and data quality. For example, a lack of zero speed
records does not mean that there is no idle activity in the data set.
The initial proposed modal activity definitions were defined as follows:
• Idle is defined as speed less than 2.5 mph and absolute acceleration less
than 0.5 mph/sec.
• Acceleration mode is based upon a minimum acceleration of 0.5 mph/sec.
• Deceleration is defined in a manner similar to acceleration, except that the criteria for
deceleration are based upon negative acceleration rates.
• All other events not classified as idle, acceleration, or deceleration, are classified as
cruise.
At the same time, several different critical values were chosen to examine the reasonable-
ness of the proposed criteria. Four different mode definitions using different critical values are
shown in Table 7-2.
Table 7-2 Four Different Mode Definitions and Modal Variables
                Idle                            Acceleration    Deceleration    Cruise
Definition 1    Speed < 2.5 & abs(acc) < 0.5    Acc > 0.5       Acc < -0.5      Other
Definition 2    Speed < 2.5 & abs(acc) < 1      Acc > 1         Acc < -1        Other
Definition 3    Speed < 2.5 & abs(acc) < 1.5    Acc > 1.5       Acc < -1.5      Other
Definition 4    Speed < 2.5 & abs(acc) < 2      Acc > 2         Acc < -2        Other
Note: Unit for speed is mph, unit for acceleration is mph/sec.
A program was written in MATLAB™ to determine the driving mode for second-by-
second data and estimate the average value of emissions for each of the driving modes. At the
same time, average modal emission rates were estimated for each mode based on different modal
activity definitions in Table 7-2. Figures 7-1 to 7-3 present a comparison of average modal
emission rates for different pollutants (NOx, CO, and HC).
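The original analysis was carried out in MATLAB and is not reproduced in this report. As an illustration only, the following Python sketch applies the mode rules of Table 7-2 to second-by-second records and averages the emission rate within each mode. The function names and the example records are hypothetical, and testing the idle rule before the acceleration and deceleration rules is an assumption.

```python
from collections import defaultdict


def classify_mode(speed_mph, accel_mph_s, accel_cut=0.5, idle_speed=2.5):
    """Assign a driving mode using the rules of Table 7-2.

    accel_cut is 0.5, 1, 1.5, or 2 mph/sec for Definitions 1 to 4.
    """
    if speed_mph < idle_speed and abs(accel_mph_s) < accel_cut:
        return "idle"
    if accel_mph_s > accel_cut:
        return "acceleration"
    if accel_mph_s < -accel_cut:
        return "deceleration"
    return "cruise"


def average_modal_rates(records, accel_cut):
    """Average emission rate by mode for (speed, accel, rate) records."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for speed, accel, rate in records:
        mode = classify_mode(speed, accel, accel_cut)
        totals[mode] += rate
        counts[mode] += 1
    return {mode: totals[mode] / counts[mode] for mode in totals}


# Hypothetical records: (speed in mph, acceleration in mph/sec, rate in g/sec).
records = [
    (0.0, 0.0, 0.03), (1.0, 0.2, 0.05),
    (10.0, 1.2, 0.20), (12.0, 0.8, 0.24),
    (30.0, 0.1, 0.12),
    (28.0, -0.7, 0.02), (20.0, -1.5, 0.04),
]
rates_def1 = average_modal_rates(records, accel_cut=0.5)
rates_def2 = average_modal_rates(records, accel_cut=1.0)
```

Raising the critical value moves mild acceleration and deceleration records into the cruise mode, which is why the modal averages shift from one definition to the next.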
Figure 7-1 Average NOx Modal Emission Rates for Different Activity Definitions
Figure 7-2 Average CO Modal Emission Rates for Different Activity Definitions
Figure 7-3 Average HC Modal Emission Rates for Different Activity Definitions
These four different modal activity definitions show a consistent pattern. The
average emissions during the acceleration mode are significantly higher than any other driving
mode for all of the pollutants. The average emission rate during deceleration mode is the lowest
of the four modes for NOx and CO emissions while the average emission rate during idle mode is
the lowest of the four modes for HC emissions. The average cruising emission rate is typically
higher than the average idling and decelerating emission rate, except for CO emission in defini-
tions 3 and 4.
To assess whether the average modal emission rates are statistically significantly different
from each other, two-sample tests were performed for each pair. Lilliefors tests for goodness of
fit to a normal distribution were first applied to each mode under each modal activity definition.
The results show that all of them reject the null hypothesis of a normal distribution at the 5%
level. A Kolmogorov-Smirnov two-sample test was chosen in place of the t-test because
the assumption of a normal distribution was questionable. The Kolmogorov-Smirnov two-sample
test is a test of the null hypothesis that two independent samples have been drawn from the same
population (or from populations with the same distribution). The test uses the maximal difference
between the cumulative frequency distributions of two samples as the test statistic. Results of
the Kolmogorov-Smirnov two-sample tests are presented in Table 7-3 in terms of p-values, where
"Acc" represents acceleration mode and "Dec" represents deceleration mode. The cases where
the p-value is less than 0.05 indicate that the distributions are different at the 5% level. All
p-values for the 72 possible pairwise comparisons are lower than 0.05, indicating that the distributions
for these pairs are statistically different from each other.
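The test statistic described above, the maximal difference between two empirical cumulative distributions, can be computed directly. The Python sketch below is illustrative and does not reproduce the p-values in Table 7-3: the function names are hypothetical, and the 5% decision uses the standard large-sample critical value rather than an exact p-value.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every observation equal to x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample statistic."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.358 at 5%
    return c_alpha * math.sqrt((n + m) / (n * m))


def distributions_differ(sample_a, sample_b, alpha=0.05):
    """Reject the null hypothesis of a common distribution if D is large."""
    d = ks_statistic(sample_a, sample_b)
    return d > ks_critical_value(len(sample_a), len(sample_b), alpha)
```

With samples as large as the modal groups in this data set, the critical value is very small, so even modest differences between modal distributions are detected.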
Table 7-3 Results for Pairwise Comparison for Modal Average Estimates In Terms of P-value
                     Idle-Acc   Idle-Dec   Idle-Cruise   Acc-Dec   Acc-Cruise   Dec-Cruise
Definition 1   NOx      0          0           0            0          0            0
               CO       0          0           0            0          0            0
               HC       0          0           0            0          0            0
Definition 2   NOx      0          0           0            0          0            0
               CO       0          0           0            0          0            0
               HC       0          0           0            0          0            0
Definition 3   NOx      0          0           0            0          0            0
               CO       0          0           0            0          0            0
               HC       0          0           0            0          0            0
Definition 4   NOx      0          0           0            0          0            0
               CO       0          0           0            0          0            0
               HC       0          0           0            0          0            0
The modal emission analysis results suggest that all four mode definitions proposed in
Table 7-2 appear reasonable. These modal definitions allow some explanation of differences in
emissions based upon driving mode, as revealed by the fact that the modal emission distributions
differ from each other. A further step is taken here to see which mode definition would be identified
as the most appropriate by using the HTBR technique. For each definition, three
dummy variables are added to represent idle, acceleration, and deceleration modes. The regression
trees developed between the emission data and these three dummy variables for each definition
are shown in Figures 7-4 to 7-6. The sensitivity test results based on these regression trees
for NOx, CO, and HC are summarized in Table 7-4.
Figure 7-4 HTBR Regression Tree Result for NOx Emission Rate (panels: (a) Definition 1, (b) Definition 2, (c) Definition 3, (d) Definition 4)
Figure 7-5 HTBR Regression Tree Result for CO Emission Rate (panels: (a) Definition 1, (b) Definition 2, (c) Definition 3, (d) Definition 4)
Figure 7-6 HTBR Regression Tree Result for HC Emission Rate (panels: (a) Definition 1, (b) Definition 2, (c) Definition 3, (d) Definition 4)
Table 7-4 Sensitivity Test Results for Four Mode Definitions
Pollutant  Definition    Mode          Number  Deviance  Mean ER    Residual Mean Deviance
NOx        All data      -             105976  1435.00   0.10680
NOx        Definition 1  Idle          29541   11.04     0.03235    0.006967 = 738.3 / 106000
NOx        Definition 1  Acceleration  25931   320.90    0.22480
NOx        Definition 1  Deceleration  22242   41.32     0.02671
NOx        Definition 1  Cruise        28262   365.10    0.13930
NOx        Definition 2  Idle          31064   16.05     0.03342    0.007658 = 811.5 / 106000
NOx        Definition 2  Acceleration  18894   206.50    0.23110
NOx        Definition 2  Deceleration  16644   21.14     0.02214
NOx        Definition 2  Cruise        39374   567.80    0.14070
NOx        Definition 3  Idle          32010   23.07     0.03470    0.00856 = 907.1 / 106000
NOx        Definition 3  Acceleration  13417   130.50    0.2297
NOx        Definition 3  Deceleration  12768   14.27     0.02065
NOx        Definition 3  Cruise        47781   739.30    0.14350
NOx        Definition 4  Idle          32717   30.240    0.03583    0.009397 = 995.8 / 106000
NOx        Definition 4  Acceleration  8719    77.150    0.22600
NOx        Definition 4  Deceleration  9452    9.191     0.02015
NOx        Definition 4  Cruise        55088   879.200   0.14490
CO         All data      -             105765  771.300   0.032370
CO         Definition 1  Idle          29287   2.166     0.005590   0.005795 = 612.9 / 105800
CO         Definition 1  Acceleration  25866   559.400   0.099740
CO         Definition 1  Deceleration  22456   3.903     0.006564
CO         Definition 1  Cruise        28156   47.380    0.018910
CO         Definition 2  Idle          30764   4.185     0.005944   0.005486283 = 580.2 / 105800
CO         Definition 2  Acceleration  18864   484.900   0.122400
CO         Definition 2  Deceleration  16919   2.410     0.005803
CO         Definition 2  Cruise        39218   88.710    0.021250
CO         Definition 3  Idle          31691   9.131     0.006610   0.005293 = 559.8 / 105800
CO         Definition 3  Acceleration  13402   410.100   0.147600
CO         Definition 3  Deceleration  13035   1.861     0.005454
CO         Definition 3  Cruise        47637   138.700   0.024440
CO         Definition 4  Idle          32375   15.5200   0.007365   0.005239 = 554 / 105800
CO         Definition 4  Acceleration  8712    339.1000  0.179700
CO         Definition 4  Deceleration  9681    0.7047    0.005049
CO         Definition 4  Cruise        54997   198.7000  0.028560
HC         All data      -             103405  0.40270   0.0014960
HC         Definition 1  Idle          28780   0.09337   0.0009217  3.648e-006 = 0.3772 / 103400
HC         Definition 1  Acceleration  25122   0.09143   0.0022310
HC         Definition 1  Deceleration  22287   0.07644   0.0012180
HC         Definition 1  Cruise        27216   0.11600   0.0016530
HC         Definition 2  Idle          30250   0.09492   0.0009176  3.629e-006 = 0.3752 / 103400
HC         Definition 2  Acceleration  18330   0.06668   0.0023860
HC         Definition 2  Deceleration  16805   0.05355   0.0011790
HC         Definition 2  Cruise        38020   0.16010   0.0016680
HC         Definition 3  Idle          31157   0.09651   0.0009258  3.636e-006 = 0.376 / 103400
HC         Definition 3  Acceleration  12999   0.04355   0.0025110
HC         Definition 3  Deceleration  12970   0.04256   0.0011600
HC         Definition 3  Cruise        46279   0.19330   0.0016890
HC         Definition 4  Idle          31849   0.09835   0.0009364  3.656e-006 = 0.378 / 103400
HC         Definition 4  Acceleration  8443    0.02944   0.0026390
HC         Definition 4  Deceleration  9613    0.03257   0.0011470
HC         Definition 4  Cruise        53500   0.21760   0.0017120
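The residual mean deviance figures in Table 7-4 can be reproduced from the per-mode deviances. The sketch below assumes that each mode's deviance is the sum of squared deviations from that mode's mean emission rate, and that the denominator is the number of observations minus the number of terminal nodes (the usual regression-tree convention; the table prints the denominator rounded). The function names are hypothetical.

```python
def residual_mean_deviance(groups):
    """groups maps a mode name to its list of emission rates."""
    n_obs = sum(len(values) for values in groups.values())
    total_deviance = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        # Within-mode deviance: squared deviations from the mode mean.
        total_deviance += sum((v - mean) ** 2 for v in values)
    # Assumed denominator: observations minus number of modes (tree leaves).
    return total_deviance / (n_obs - len(groups))

def rmd_from_table(deviances, counts):
    """Same quantity computed from tabulated per-mode deviances and counts."""
    return sum(deviances) / (sum(counts) - len(counts))

# NOx, Definition 1 rows of Table 7-4 (idle, acceleration, deceleration, cruise).
nox_def1 = rmd_from_table([11.04, 320.90, 41.32, 365.10],
                          [29541, 25931, 22242, 28262])
print(nox_def1)
```

A smaller value means the four modes of that definition leave less unexplained variation in the emission rates.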
7.3 Conclusions
Comparison of modal average estimates shows that the average modal emission rates are
statistically different from each other for three different pollutants. HTBR regression tree results
demonstrate that all four definitions can work well to divide the database. Comparisons of re-
sidual mean deviance indicate that definition 1 has the smallest residual mean deviance for NOx
(definition 4 for CO and definition 2 for HC). However, differences were small. At this time, it
is difficult to choose one definition for three pollutants based just on sensitivity analysis results
in this chapter. The analysis results in this section indicate that driving mode definition could
not be transferred directly from one research study to another research study. A better approach
would be to test several different critical values and obtain the most suitable definition instead of
testing only one definition developed from other research. For this research, more analysis will
be performed in the chapters that follow to develop the most suitable driving mode definitions.
-------
CHAPTER 8
8. IDLE MODE DEVELOPMENT
In Chapter 7, the concept of driving modes was introduced and several sensitivity tests
(comparison of modal average estimates, comparison of HTBR regression tree results, and com-
parison of residual mean deviance) were performed for four different mode definitions. Based
on sensitivity analysis results, it is difficult to choose one definition for three pollutants at this
moment. More analysis will be performed next to develop the most suitable driving mode defini-
tion. This chapter will focus on developing the suitable definition for idle mode.
Theoretically, idle mode is usually defined as zero speed and zero acceleration. In real
world data collection efforts, this definition must be refined due to the presence of speed mea-
surement error. In this research, idle mode will be defined by speed and acceleration. The criti-
cal value could not be deduced directly from previous research. It is better to test several critical
values statistically and identify the most suitable idle definition.
8.1 Critical Value for Speed in Idle Mode
Three critical values were tested to find an appropriate critical value for speed in defining
idle activity. Figures 8-1 to 8-3 illustrate engine power vs. emission rates for the three
pollutants for three critical speed values: 1 mph, 2.5 mph, and 5 mph. Figure 8-4 compares
engine power distributions for these three critical values. Because the engine power
distributions for the three pollutants exhibit similar patterns, only the distribution based on
NOx emissions is shown in Figure 8-4. Tables 8-1 and 8-2 provide the engine power distribution
for these three critical values in two ways: by number and by percentage.
Figure 8-1 Engine Power vs. NOx Emission Rate for Three Critical Values (panels: speed <= 1 mph, speed <= 2.5 mph, speed <= 5 mph; x-axis: Engine Power (bhp))
Figure 8-2 Engine Power vs. CO Emission Rate for Three Critical Values (panels: speed <= 1 mph, speed <= 2.5 mph, speed <= 5 mph; x-axis: Engine Power (bhp))
Figure 8-3 Engine Power vs. HC Emission Rate for Three Critical Values (panels: speed <= 1 mph, speed <= 2.5 mph, speed <= 5 mph; x-axis: Engine Power (bhp))
Figure 8-4 Engine Power Distribution for Three Critical Values based on NOx Emissions
Table 8-1 Engine Power Distribution for Three Critical Values for Three Pollutants
                      Engine Power (brake horsepower (bhp))
Speed      Pollutant  [0, 20)  [20, 30)  [30, 40)  [40, 50)  > 50  Total
< 5 mph    NOx        31631    2272      1323      152       2348  37726
< 5 mph    CO         31258    2269      1316      149       2342  37334
< 5 mph    HC         30737    2264      1321      147       2284  36753
< 2.5 mph  NOx        29222    2098      1196      83        1143  33742
< 2.5 mph  CO         28880    2096      1189      81        1139  33385
< 2.5 mph  HC         28373    2093      1194      80        1106  32846
< 1 mph    NOx        27516    2011      1100      51        700   31378
< 1 mph    CO         27217    2010      1093      51        699   31070
< 1 mph    HC         26713    2007      1099      48        680   30547
Table 8-2 Percentage of Engine Power Distribution for Three Critical Values for Three Pollutants
                      Engine Power (brake horsepower (bhp))
Speed      Pollutant  [0, 20)  [20, 30)  [30, 40)  [40, 50)  > 50   Total
< 5 mph    NOx        83.84%   6.02%     3.51%     0.40%     6.22%  100%
< 5 mph    CO         83.73%   6.08%     3.52%     0.40%     6.27%  100%
< 5 mph    HC         83.63%   6.16%     3.59%     0.40%     6.21%  100%
< 2.5 mph  NOx        86.60%   6.22%     3.54%     0.25%     3.39%  100%
< 2.5 mph  CO         86.51%   6.28%     3.56%     0.24%     3.41%  100%
< 2.5 mph  HC         86.38%   6.37%     3.64%     0.24%     3.37%  100%
< 1 mph    NOx        87.69%   6.41%     3.51%     0.16%     2.23%  100%
< 1 mph    CO         87.60%   6.47%     3.52%     0.16%     2.25%  100%
< 1 mph    HC         87.45%   6.57%     3.60%     0.16%     2.23%  100%
Based on the analysis above, a critical value of 5 mph includes more data points with
higher engine power (>50 bhp) than 2.5 mph and 1 mph (about 6.2% of observations, compared
with about 3.4% and 2.2%). However, the engine power distributions for 2.5 mph and 1 mph do
not differ greatly. These two critical values for speed will be tested further with different
acceleration values in the next section, and the results will be used to make a final decision
on the idle mode definition.
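The counts and percentages in Tables 8-1 and 8-2 come from binning engine power. A minimal sketch of that tabulation follows, assuming the bin edges shown in the table headers; the function name and the treatment of exactly 50 bhp (placed in the top bin) are assumptions of this example.

```python
import bisect

# Upper edges of the half-open bins [0, 20), [20, 30), [30, 40), [40, 50).
BIN_EDGES = [20, 30, 40, 50]
BIN_LABELS = ["[0, 20)", "[20, 30)", "[30, 40)", "[40, 50)", "50 and above"]

def power_distribution(powers_bhp):
    """Return {bin label: (count, percent)} for a list of engine power values."""
    if not powers_bhp:
        raise ValueError("no observations to tabulate")
    counts = [0] * len(BIN_LABELS)
    for power in powers_bhp:
        # bisect_right puts a value equal to an edge into the next bin up.
        counts[bisect.bisect_right(BIN_EDGES, power)] += 1
    total = len(powers_bhp)
    return {label: (count, 100.0 * count / total)
            for label, count in zip(BIN_LABELS, counts)}

# Hypothetical idle-mode engine power observations (bhp).
dist = power_distribution([5, 19.9, 20, 35, 49, 50, 120])
for label, (count, percent) in dist.items():
    print(f"{label:>12}  {count:3d}  {percent:6.2f}%")
```

Applied to the NOx rows of Table 8-1, the same arithmetic gives 31631 / 37726 = 83.84% for the lowest bin at the 5 mph critical value.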
8.2 Critical Value for Acceleration in Idle Mode
After setting the critical value for speed, the next step is to determine a critical value for
acceleration. In total, four options were tested.
• Option 1: speed < 2.5 mph and absolute acceleration < 2 mph/s
• Option 2: speed < 2.5 mph and absolute acceleration < 1 mph/s
• Option 3: speed < 1 mph and absolute acceleration < 2 mph/s
• Option 4: speed < 1 mph and absolute acceleration < 1 mph/s
Using the same method as in the previous section, Figures 8-5 to 8-7 illustrate engine
power vs. emission rates for the three pollutants for the four options above. Figure 8-8
compares the engine power distributions for data falling into these four options. Because the
engine power distributions for the three pollutants exhibit a similar pattern, only the
distribution based on NOx emissions is shown in Figure 8-8. Tables 8-3 and 8-4 provide the
engine power distribution for the four options in two ways: by number and by percentage.
Figure 8-5 Engine Power vs. NOx Emission Rate for Four Options (panels: Option 1 to Option 4; x-axis: Engine Power (bhp))
Figure 8-6 Engine Power vs. CO Emission Rate for Four Options (panels: Option 1 to Option 4; x-axis: Engine Power (bhp))
Figure 8-7 Engine Power vs. HC Emission Rate for Four Options
Figure 8-8 Engine Power Distribution for Four Options based on NOx Emission Rates (panels: Option 1 to Option 4; x-axis: Engine Power)
Table 8-3 Engine Power Distribution for Four Options for Three Pollutants
                      Engine Power (brake horsepower (bhp))
Option    Pollutant   [0, 20)  [20, 30)  [30, 40)  [40, 50)  > 50  Total
Option 1  NOx         28694    2075      1177      78        693   32717
Option 1  CO          28366    2073      1170      76        690   32375
Option 1  HC          27855    2070      1175      75        674   31849
Option 2  NOx         27571    2030      1120      53        290   31064
Option 2  CO          27284    2028      1114      51        287   30764
Option 2  HC          26771    2026      1119      51        283   30250
Option 3  NOx         27367    1999      1091      50        527   31034
Option 3  CO          27071    1998      1084      50        526   30729
Option 3  HC          26569    1995      1090      47        512   30213
Option 4  NOx         26719    1969      1057      34        205   29984
Option 4  CO          26446    1968      1051      34        204   29703
Option 4  HC          25944    1966      1056      32        198   29196
-------
Table 8-4 Percentage of Engine Power Distribution for Four Options for Three Pollutants
                      Engine Power (brake horsepower (bhp))
Option    Pollutant   [0, 20)  [20, 30)  [30, 40)  [40, 50)  > 50   Total
Option 1  NOx         87.70%   6.34%     3.60%     0.24%     2.12%  100.00%
Option 1  CO          87.62%   6.40%     3.61%     0.23%     2.13%  100.00%
Option 1  HC          87.46%   6.50%     3.69%     0.24%     2.12%  100.00%
Option 2  NOx         88.76%   6.53%     3.61%     0.17%     0.93%  100.00%
Option 2  CO          88.69%   6.59%     3.62%     0.17%     0.93%  100.00%
Option 2  HC          88.50%   6.70%     3.70%     0.17%     0.94%  100.00%
Option 3  NOx         88.18%   6.44%     3.52%     0.16%     1.70%  100.00%
Option 3  CO          88.10%   6.50%     3.53%     0.16%     1.71%  100.00%
Option 3  HC          87.94%   6.60%     3.61%     0.16%     1.69%  100.00%
Option 4  NOx         89.11%   6.57%     3.53%     0.11%     0.68%  100.00%
Option 4  CO          89.03%   6.63%     3.54%     0.11%     0.69%  100.00%
Option 4  HC          88.86%   6.73%     3.62%     0.11%     0.68%  100.00%
Based on the above analysis, data falling into option 2 and option 4 contain fewer data
points with higher engine power (>50 bhp) than data falling into option 1 and option 3.
However, no large difference is observed between the engine power distributions for option 2
and option 4. Based upon these results, idle mode is defined as speed < 2.5 mph and absolute
acceleration < 1 mph/s.
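The four candidate idle definitions differ only in two thresholds, so they can be compared with a single parameterized test. The sketch below is illustrative: the function names and the sample records are hypothetical, and strict inequalities are used as printed in the text.

```python
def is_idle(speed_mph, accel_mph_s, max_speed=2.5, max_abs_accel=1.0):
    """Idle test; the defaults correspond to the definition adopted in this chapter."""
    return speed_mph < max_speed and abs(accel_mph_s) < max_abs_accel

# (speed threshold in mph, absolute acceleration threshold in mph/s) per option.
OPTIONS = {1: (2.5, 2.0), 2: (2.5, 1.0), 3: (1.0, 2.0), 4: (1.0, 1.0)}

def count_idle_by_option(records):
    """records: iterable of (speed_mph, accel_mph_s) pairs, one per second."""
    records = list(records)
    return {option: sum(is_idle(s, a, max_speed, max_accel) for s, a in records)
            for option, (max_speed, max_accel) in OPTIONS.items()}

# Hypothetical second-by-second observations.
sample = [(0.0, 0.0), (0.5, 1.5), (2.0, 0.5), (2.0, 1.5), (3.0, 0.0)]
print(count_idle_by_option(sample))
```

Tightening either threshold can only shrink the idle set, which is why the totals in Table 8-3 fall from option 1 to option 4.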
8.3 Emission Rate Distribution by Bus in Idle Mode
After defining "speed < 2.5 mph and absolute acceleration < 1 mph/s" as idle mode,
emission rate histograms for each of the three pollutants for idle operations are presented in
Figure 8-9. Figure 8-9 shows significant skewness for all three pollutants in idle mode.
Inter-bus response variability for idle mode operations is illustrated in Figures 8-10 to 8-12
using the median and mean of NOx, CO, and HC emission rates. Table 8-5 presents the same
information in tabular form. The difference between the median and the mean is also an
indicator of skewness.
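A per-bus summary of this kind can be sketched as follows. The record layout (bus identifier, emission rate) and the function name are assumptions for illustration; the values are not taken from the study data.

```python
from collections import defaultdict
import statistics

def summarize_by_bus(records):
    """records: iterable of (bus_id, emission_rate). Returns {bus_id: (median, mean)}."""
    rates_by_bus = defaultdict(list)
    for bus_id, rate in records:
        rates_by_bus[bus_id].append(rate)
    return {bus_id: (statistics.median(rates), statistics.fmean(rates))
            for bus_id, rates in rates_by_bus.items()}

# Hypothetical idle-mode emission rates (g/s) for two buses.
records = [("A", 1.0), ("A", 2.0), ("A", 9.0), ("B", 3.0), ("B", 3.0)]
summary = summarize_by_bus(records)
for bus_id, (median, mean) in sorted(summary.items()):
    # A mean well above the median points to a long right tail.
    print(bus_id, median, mean, mean - median)
```

For a right-skewed distribution a few large values pull the mean above the median, which is the pattern the text describes.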
-------
Figure 8-9 Histograms of Three Pollutants for Idle Mode (x-axes: NOx Emission Rate (g/s), CO Emission Rate (g/s), HC Emission Rate (g/s))
Figure 8-10 Median and Mean of NOx Emission Rates in Idle Mode by Bus
Figure 8-11 Median and Mean of CO Emission Rates in Idle Mode by Bus (x-axis: Bus No)
Figure 8-12 Median and Mean of HC Emission Rates in Idle Mode by Bus (x-axis: Bus No)
Table 8-5 Median and Mean of Three Pollutants in Idle Mode by Bus
         NOx                 CO                  HC
Bus ID   Median    Mean      Median    Mean      Median   Mean
Bus 360  0.071020  0.059444  0.004830  0.009145  0.00072  0.002441
Bus 361  0.020455  0.020216  0.005740  0.008895  0.00063  0.000865
Bus 363  0.022555  0.032140  0.000670  0.005408  0.00007  0.000385
Bus 364  0.025050  0.026480  0.003110  0.003601  0.00071  0.000927
Bus 372  0.055210  0.054766  0.013150  0.011739  0.00220  0.002272
Bus 375  0.028880  0.035050  0.005390  0.013385  0.00076  0.001311
Bus 377  0.023370  0.025393  0.000960  0.001572  0.00019  0.000219
Bus 379  0.033210  0.038500  0.006730  0.011425  0.00085  0.001531
Bus 380  0.026200  0.027371  0.000930  0.001218  0.00024  0.000298
Bus 381  0.027115  0.028768  0.001915  0.004044  0.00020  0.000228
Bus 382  0.027605  0.036734  0.002980  0.009836  0.00034  0.000624
Bus 383  0.027790  0.027520  0.002290  0.002736  0.00065  0.000950
Bus 384  0.024210  0.026982  0.001205  0.003428  0.00043  0.000498
Bus 385  0.023750  0.024339  0.002590  0.005782  0.00043  0.000453
Bus 386  0.032140  0.030031  0.004860  0.006155  0.00055  0.000579
Figures 8-10 to 8-12 and Table 8-5 illustrate that bus 372 has the largest median and
the second largest mean for CO and HC emissions, and the second largest median and the second
largest mean for NOx emissions. The engine power distribution of bus 372 was compared to that
of the other buses in an effort to identify why its emission rates were significantly higher.
Table 8-6 and Figure 8-13 show that bus 372 has higher minimum (2nd), 1st quartile (2nd),
median (1st), and 3rd quartile (2nd) engine power compared to the other 14 buses. Engine power
in idle mode may include cooling fan, air compressor, air conditioner, and alternator loads
(Clark et al. 2005). Considering that the test buses and engines are similar in many ways, this
difference might be caused by variability across the engines, or may be associated with
unrecorded air conditioner use. In analyzing the database, the modeler could not identify a
contribution of the air conditioner to engine power in idle mode. Model development will
therefore include these data, but readers are cautioned that the noted variability indicates
that significant numbers of vehicles may need to be tested in the future if such inter-engine
differences are significant in the fleet. In addition, the role of air conditioning usage on
engine load in transit buses warrants additional research.
Figure 8-13 Histograms of Engine Power in Idle Mode by Bus (one panel per bus)
Table 8-6 Engine Power Distribution in Idle Mode by Bus
Bus ID   Min   1st Quartile  Median  3rd Quartile  Max
Bus 360  3.92  15.36         18.7    19.83         135.43
Bus 361  0     5.35          12.52   13.83         89.47
Bus 363  0     13.1          13.34   15.16         152.94
Bus 364  0     13.18         13.85   14.99         154.51
Bus 372  0     26.44         31.84   33.10         79.08
Bus 375  0     12.52         13.81   18.08         167.72
Bus 377  0     8.5           9.17    9.85          166.86
Bus 379  0     15.86         17.15   19.42         126.64
Bus 380  2.67  7.85          8.49    9.17          100.99
Bus 381  0     8.7           10.49   11.17         148.28
Bus 382  0     7.35          8.52    13.89         99.04
Bus 383  0     7.16          10.03   12.5          91.86
Bus 384  0     6.01          7.34    8.51          117.39
Bus 385  0     4.53          7.19    8.51          139.05
Bus 386  4.68  9.18          13.33   14.46         105.44
8.4 Discussions
8.4.1 High HC Emissions
Figure 8-7 shows that there are some high HC emissions in idle mode. Based on the
definition of "speed < 2.5 mph and absolute acceleration < 1 mph/s", 388/30250 = 1.28% of the
idle-mode data points for HC are high emissions. These high emissions were noted only in the
HC data, not in the NOx and CO data. All high HC emissions were coded as high-idle to
determine whether they are related to any other parameters. Tree analysis can be used for this
screening. After screening engine speed, engine power, engine oil temperature, engine oil
pressure, engine coolant temperature, ECM pressure, and other parameters, no specific operating
parameters related to these high-idle emissions were identified.
Regression tree analysis results by bus and by trip are presented in Figure 8-14. The
left panel shows that these high HC emissions occurred in buses 360 and 372, while the right
panel shows that they occurred in bus 360 trip 4 and bus 372 trip 1. Figure 8-14 also shows
that, even for HC, these high emissions are not common in idle mode. There are 1529 idle
segments in total for the 15 buses, but most of the high HC emissions came from just three
idle segments: bus 360 trip 4 idle segment 1 (130 seconds), bus 360 trip 4 idle segment 38
(516 seconds), and bus 372 trip 1 idle segment 1 (500 seconds). More specifically, bus 360
trip 4 idle segment 1 contains 102 high HC emissions, bus 360 trip 4 idle segment 38 contains
264, and bus 372 trip 1 idle segment 1 contains 13. Figures 8-15 to 8-17 present time series
plots of HC for these three idle segments, together with vehicle speed, engine speed, engine
power, engine oil temperature, engine oil pressure, engine coolant temperature, and ECM
pressure. These figures do not include NOx and CO because NOx and CO do not show such patterns
in these three idle segments. The three idle segments contain 379 high HC emissions in total,
so about 98% of the 388 high emissions came from only three idle segments. Excluding these
three idle segments on the basis of current information is difficult to justify. The modeler
prefers to keep these data since these outliers might reflect variability in the real world.
However, future data collection efforts should seek to identify the causes of such events.
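Counting high emissions per idle segment requires first splitting the second-by-second record into runs of consecutive idle seconds. A minimal sketch follows; the function names are hypothetical, and the `threshold` that marks an emission as "high" is a placeholder, since the cutoff used in the study is not stated in this section.

```python
def idle_segments(idle_flags):
    """Return (start_index, length) for each run of consecutive idle seconds."""
    segments = []
    start = None
    for i, is_idle in enumerate(idle_flags):
        if is_idle and start is None:
            start = i                      # a new idle segment begins
        elif not is_idle and start is not None:
            segments.append((start, i - start))
            start = None                   # the segment ended on the previous second
    if start is not None:                  # record ends while still idling
        segments.append((start, len(idle_flags) - start))
    return segments

def high_counts_per_segment(idle_flags, hc_rates, threshold):
    """Number of HC observations above a (hypothetical) threshold in each idle segment."""
    return [sum(1 for rate in hc_rates[start:start + length] if rate > threshold)
            for start, length in idle_segments(idle_flags)]

# Hypothetical second-by-second data.
flags = [True, True, False, True, True, True, False]
rates = [0.001, 0.02, 0.5, 0.03, 0.001, 0.04, 0.9]
print(idle_segments(flags))
print(high_counts_per_segment(flags, rates, threshold=0.01))
```

The shares quoted in the text follow directly from the counts: 388 / 30250 is about 1.28%, and (102 + 264 + 13) / 388 = 379 / 388 is about 97.7%.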
Figure 8-14 Tree Analysis Results for High HC Emission Rates by Bus and Trip (panels: By Bus, By Trip)
Figure 8-15 Time Series Plot for Bus 360 Trip 4 Idle Segment 1 (130 Seconds)
Figure 8-16 Time Series Plot for Bus 360 Trip 4 Idle Segment 38 (516 Seconds)
Figure 8-17 Time Series Plot for Bus 372 Trip 1 Idle Segment 1 (500 Seconds)
8.4.2 High Engine Operating Parameters
Figure 8-15 shows that engine speed jumped to about 2000 rpm during bus 360 trip 4 idle
segment 1, and the corresponding engine power and engine oil pressure jumped as well. This
jump lasted only 9 seconds. Several explanations are possible. Bus 360 may have moved slowly
from one location to another while the GPS failed to detect the movement. Alternatively, the
engine may have experienced a computer or sensor problem. This kind of jump, with higher
engine speeds (about 2000 rpm) accompanied by higher engine power and engine oil pressure in
idle mode, did occur in the real world. The jump shown in Figure 8-16 was not such an
occurrence since engine speed was only about 1000 rpm during that jump. After screening the
whole dataset, another example of a jump was found and is shown in Figure 8-18. The jump in
bus 383 trip 1 idle segment 12 lasts 28 seconds. Since there are only two observations of such
jumps in the whole database, there are not enough data to assess whether they constitute a new
mode. These observations suggest that attention should be paid to slow movement during idle
segments. Although these two idle segments show some unusual activity, the modeler will retain
them to avoid introducing bias in the results.
Figure 8-18 Time Series Plot for Bus 383 Trip 1 Idle Segment 12 (1258 Seconds)
8.5 Idle Emission Rates Estimation
Based on the definition of "speed < 2.5 mph and absolute acceleration < 1 mph/s", about
30% of the available data are classified as idle mode. Usually, modelers estimate the idle
emission rate by averaging all emission rates in idle mode. Although there are some data points
with higher engine power (> 50 bhp) in idle mode, about 90% of the idle-mode data exhibit
engine power between 0 and 20 bhp. After detailed analysis of all idle segments using time
series plots, no anomalies were noted, although some data may be incorrectly classified as idle
mode. To avoid introducing any significant bias, a single idle emission rate is developed for
each pollutant. When all data are pooled and treated as a whole, the mean and confidence
interval can reflect the distribution of emission rates in the real world. Table 8-7 provides
idle mode statistical analysis results for NOx, CO, and HC.
Table 8-7 Idle Mode Statistical Analysis Results for NOx, CO, and HC
              NOx      CO       HC
minimum       0.00121  0.00002  0.00001
1st Quartile  0.02201  0.00120  0.00026
mean          0.03342  0.00594  0.00092
median        0.02670  0.00293  0.00051
3rd Quartile  0.03549  0.00554  0.00079
maximum       0.40259  0.48118  0.05232
skewness      4.45050  13.1840  11.6100
Total Number  31064    30764    30250
Due to the non-normality of emission rates, the median value (the value that divides
observations into an upper and lower half) and the inter-quartile range (the range of values
that includes the middle 50% of the observations) are the most appropriate for describing the
distribution. The mean and skewness for the original data are presented in Table 8-7 as well.
Although transformations for the three pollutants were already discussed for the whole data
set in Chapter 6, the lambdas chosen by the Box-Cox procedure for the whole data set and for
idle mode are different. The lambdas for the whole data set are 0.22875 for NOx, -0.0648 for
CO, and 0.14631 for HC, while the lambdas for idle mode are -0.19619 for NOx, -0.0625 for CO,
and 0.002875 for HC. At the same time, using a transformation to estimate the mean and
construct confidence intervals creates other problems. Therefore the modeler uses the
bootstrap, another general class of methods, to obtain the estimates and construct confidence
intervals.
The bootstrap is a procedure that involves choosing random samples with replacement
from a data set and analyzing each sample the same way (Li 2004). To obtain the 95% confidence
interval, the simple method is to take the 2.5% and 97.5% percentiles of the P replications
T_1, T_2, ..., T_P as the lower and upper bounds, respectively. The bootstrap function in this
study resamples the emission data 1000 times and computes the mean, 2.5% percentile, and 97.5%
percentile of each sample. Results are presented in Figure 8-20 and Table 8-8.
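The percentile method described above can be sketched as follows. This simplified version bootstraps only the mean; the function name, the fixed seed, and the sample data are assumptions for illustration and are not the bootstrap function used in the study.

```python
import random
import statistics

def bootstrap_mean_ci(data, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap estimate and confidence interval for the mean."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    n = len(data)
    # Draw n values with replacement, n_resamples times, keeping each resample mean.
    means = sorted(statistics.fmean(rng.choices(data, k=n))
                   for _ in range(n_resamples))
    # The alpha/2 and 1 - alpha/2 percentiles of the replications bound the interval.
    lower = means[round(alpha / 2 * n_resamples)]
    upper = means[round((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(means), lower, upper

# Hypothetical right-skewed idle emission rates (g/s).
rates = [0.02] * 80 + [0.05] * 15 + [0.40] * 5
estimate, ci_low, ci_high = bootstrap_mean_ci(rates)
print(estimate, ci_low, ci_high)
```

No distributional assumption or transformation is needed, which is the reason the text gives for preferring the bootstrap for skewed emission data.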
Figure 8-19 Graphical Illustration of Bootstrap (Adopted from Li 2004) (elements: original data, resampling, bootstrap statistic)
Figure 8-20 Bootstrap Results for Idle Emission Rate Estimation (x-axes: Mean of NOx Emission Rate (g/s), Mean of CO Emission Rate (g/s))
Table 8-8 Idle Emission Rates Estimation and 95% Confidence Intervals Based on Bootstrap
                          Average                   2.5% Percentile             97.5% Percentile
NOx  Estimation           0.033415                  0.010754                    0.083266
NOx  Confidence Interval  (0.033162, 0.033669)      (0.010509, 0.010998)        (0.082279, 0.084252)
CO   Estimation           0.0059439                 0.00036116                  0.028429
CO   Confidence Interval  (0.0058184, 0.0060693)    (0.00034446, 0.00037775)    (0.028083, 0.028775)
HC   Estimation           0.00091777                0.000059167                 0.0037260
HC   Confidence Interval  (0.00089742, 0.00093811)  (0.000047572, 0.000070763)  (0.0036412, 0.0038108)
Based on Table 8-8, the modeler recommends an idle emission rate of 0.033415 g/s for
NOx, with 95% confidence interval (0.010754, 0.083266); 0.0059439 g/s for CO, with 95%
confidence interval (0.00036116, 0.028429); and 0.00091777 g/s for HC, with 95% confidence
interval (0.000059167, 0.0037260).
8.6 Conclusions and Further Considerations
In this research, idle mode is defined as "speed < 2.5 mph and absolute acceleration <1
mph/s". However the critical value could not be introduced from other research to this research
directly. It is more appropriate to test several critical values and obtain the most suitable one
instead of testing only one developed from other research.
Inter-bus variability analysis results indicate that bus 372 has among the largest median
and mean emission rates for NOx, CO, and HC. Meanwhile, bus 372 has higher minimum (2nd), 1st
quartile (2nd), median (1st), and 3rd quartile (2nd) engine power compared to the other 14
buses. Since the test buses and engines are similar in most ways, this difference might be
caused by variability of the engines or by air conditioner usage. However, the contribution of
the air conditioner to engine power in idle mode could not be identified in the database.
Future research regarding the role of the air conditioner in engine power and emission rates
in idle mode may be able to detect a difference.
Although some trips or some buses have higher mean and standard deviation than others,
this kind of variability will decrease when all data in idle mode are treated as a whole. On the
other hand, some elevated emissions events may simply reflect real world variability. Without
additional evidence, modelers should treat all data as a whole instead of removing outliers and
potentially biasing results.
There are two observations of an emissions jump that appears to be unrelated to engine
speed, engine power, and engine oil temperature, in a single idle segment. The modeler first as-
sumed that the bus moved too slowly from one location to another location for the GPS/ECM to
detect the movement. Other explanations might be an engine computer problem or sensor prob-
lem. These two jumps might be evidence to support further research on slow movements during
idle segments.
In summary, the modeler recommends an idle emission rate of 0.033415 g/s for NOx, with
95% confidence interval (0.010754, 0.083266); 0.0059439 g/s for CO, with 95% confidence
interval (0.00036116, 0.028429); and 0.00091777 g/s for HC, with 95% confidence interval
(0.000059167, 0.0037260).
-------
CHAPTER 9
9. DECELERATION MODE DEVELOPMENT
Chapter 7 introduced the concept of driving mode into the study and several sensitivity
tests were performed for four different definitions, including comparison of modal average emis-
sion rate estimates, HTBR regression tree results, and residual mean deviance. After developing
the idle mode definition and emission rate in Chapter 8, the next task is dividing the rest of the
vehicle activity data into driving mode (deceleration, acceleration and cruise) for further analy-
sis. The deceleration mode is examined first.
9.1 Critical Value for Deceleration Rates in Deceleration Mode
The first task related to analysis of emission rates in the deceleration mode is identify-
ing critical values for deceleration. The literature indicates that critical values of -1 mph/s and -2
mph/s should be examined. Because the critical value of "acceleration < -1 mph/s" also includes
all data that conform with a critical value of "acceleration < -2 mph/s", comparison of data that
fall between these two potential cut points is first performed. In summary, these three decelera-
tion bins for analysis include:
• Option 1: acceleration < -2 mph/s
• Option 2: acceleration > -2 mph/s & acceleration < -1 mph/s
• Option 3: acceleration > -1 mph/s & acceleration < 0 mph/s
If the critical value is set as -1 mph/s for deceleration mode, data falling into option 1 and
option 2 will be classified as deceleration mode while data falling into option 3 will be classified
as cruise mode. If the critical value is set as -2 mph/s for deceleration mode, data falling into op-
tion 1 will be classified as deceleration mode while data falling into option 2 and option 3 will be
classified as cruise mode.
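The relationship between the three deceleration bins and the two candidate critical values can be sketched as below. The function names are hypothetical. As printed, the bins leave the exact boundary values (-2 and -1 mph/s) unassigned, so this sketch places each boundary value in the higher-numbered bin as an assumption; it also assumes idle data have already been removed.

```python
def decel_option(accel_mph_s):
    """Return the deceleration bin (1, 2, or 3), or None for non-negative acceleration."""
    if accel_mph_s < -2.0:
        return 1
    if accel_mph_s < -1.0:
        return 2  # includes exactly -2.0 (assumed; unassigned in the text as printed)
    if accel_mph_s < 0.0:
        return 3  # includes exactly -1.0 (assumed)
    return None

def mode_for(accel_mph_s, critical_value):
    """Classify a decelerating observation given a critical value (-1 or -2 mph/s)."""
    if decel_option(accel_mph_s) is None:
        return None  # acceleration or steady speed: handled by other mode definitions
    return "deceleration" if accel_mph_s < critical_value else "cruise"

for accel in (-2.5, -1.5, -0.5):
    print(accel, decel_option(accel), mode_for(accel, -1.0), mode_for(accel, -2.0))
```

With a critical value of -1 mph/s, options 1 and 2 are both deceleration; with -2 mph/s, only option 1 is, and options 2 and 3 fall into cruise.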
Figure 9-1 illustrates engine power distribution for these three options. Figures 9-2 to 9-4
compare engine power vs. emission rate for three pollutants for three options. Tables 9-1 and 9-2
provide the distribution for these three options in two ways: by number and percentage.
Table 9-1 Engine Power Distribution for Three Options for Three Pollutants
                          Engine Power (brake horsepower (bhp))
Deceleration  Pollutants  [0, 20)  [20, 30)  [30, 40)  [40, 50)  > 50  Total
Option 1      NOx         9322     94        16        5         15    9452
Option 1      CO          9558     89        15        4         15    9681
Option 1      HC          9483     94        16        5         15    9613
Option 2      NOx         6748     127       101       42        174   7192
Option 2      CO          6800     126       99        42        171   7238
Option 2      HC          6754     125       99        42        172   7192
Option 3      NOx         6806     950       1062      562       4353  13733
Option 3      CO          6782     949       1061      558       4326  13676
Option 3      HC          6705     921       1044      541       4212  13423
Table 9-2 Percentage of Engine Power Distribution for Three Options for Three Pollutants
                          Engine Power (brake horsepower (bhp))
Deceleration  Pollutants  [0, 20)  [20, 30)  [30, 40)  [40, 50)  > 50   Total
Option 1      NOx         98.6%    1.0%      0.2%      0.1%      0.2%   100.0%
Option 1      CO          98.7%    0.9%      0.2%      0.0%      0.2%   100.0%
Option 1      HC          98.6%    1.0%      0.2%      0.1%      0.2%   100.0%
Option 2      NOx         93.8%    1.8%      1.4%      0.6%      2.4%   100.0%
Option 2      CO          93.9%    1.7%      1.4%      0.6%      2.4%   100.0%
Option 2      HC          93.9%    1.7%      1.4%      0.6%      2.4%   100.0%
Option 3      NOx         49.6%    6.9%      7.7%      4.1%      31.7%  100.0%
Option 3      CO          49.6%    6.9%      7.8%      4.1%      31.6%  100.0%
Option 3      HC          50.0%    6.9%      7.8%      4.0%      31.4%  100.0%
Figure 9-1 Engine Power Distribution for Three Options (panels: Option 1, Option 2, Option 3; x-axis: Engine Power (bhp))
Figure 9-2 Engine Power vs. NOx Emission Rate for Three Options (panels: Option 1, Option 2, Option 3; x-axis: Engine Power (bhp))
Figure 9-3 Engine Power vs. CO Emission Rate for Three Options (panels: Option 1, Option 2, Option 3; x-axis: Engine Power (bhp))
Figure 9-4 Engine Power vs. HC Emission Rate for Three Options (panels: Option 1, Option 2, Option 3; x-axis: Engine Power (bhp))
There is little difference between the engine power distributions for data falling into
option 1 and option 2, while the power distribution for option 3 is clearly different from
both in the above figures and tables. Tables 9-1 and 9-2 show that engine power is concentrated
in the lowest engine power regime (< 20 bhp) for options 1 and 2 (about 99% and 94% of
observations, respectively), compared with about 50% for option 3. Grouping options 1 and 2
together therefore better reflects the low power demand of the vehicle during real-world
deceleration. Hence, the critical value is set to -1 mph/s for deceleration mode.
9.2 Analysis of Deceleration Mode Data
9.2.1 Emission Rate Distribution by Bus in Deceleration Mode
After defining vehicle activity data with "acceleration < -1 mph/s" as deceleration mode,
emission rate histograms for each of the three pollutants for deceleration operations are
presented in Figure 9-5. Figure 9-5 shows significant skewness for all three pollutants in
deceleration mode. Inter-bus emission rate variability is illustrated by the median and mean
NOx, CO, and HC emission rates in deceleration mode for each bus in Figures 9-6 to 9-8 and
Table 9-3. The difference between the median and the mean is also an indicator of skewness.
Figure 9-5 Histograms of Three Pollutants for Deceleration Mode
9-5
-------
Figure 9-6 Median and Mean of NOx Emission Rates in Deceleration Mode by Bus
Figure 9-7 Median and Mean of CO Emission Rates in Deceleration Mode by Bus
9-6
-------
Figure 9-8 Median and Mean of HC Emission Rates in Deceleration Mode by Bus
Table 9-3 Median and Mean for NOx, CO, and HC in Deceleration Mode by Bus
                 NOx                  CO                   HC
Bus ID     Median    Mean       Median    Mean       Median    Mean
Bus 360    0.00325   0.01998    0.00502   0.00814    0.00040   0.00097
Bus 361    0.00624   0.02206    0.00384   0.00535    0.00079   0.00095
Bus 363    0.00483   0.01952    0.00446   0.00486    0.00004   0.00008
Bus 364    0.00324   0.01255    0.00474   0.00586    0.00551   0.00613
Bus 372    0.00437   0.01924    0.00578   0.00803    0.00161   0.00229
Bus 375    0.00499   0.01997    0.00410   0.00567    0.00066   0.00085
Bus 377    0.00414   0.01940    0.00317   0.00630    0.00034   0.00040
Bus 379    0.02664   0.03457    0.00397   0.00522    0.00078   0.00103
Bus 380    0.00525   0.01914    0.00359   0.00716    0.00060   0.00072
Bus 381    0.01666   0.02420    0.00369   0.00452    0.00034   0.00038
Bus 382    0.01214   0.03541    0.00450   0.00564    0.00073   0.00083
Bus 383    0.00741   0.02385    0.00322   0.00452    0.00128   0.00172
Bus 384    0.00828   0.02869    0.00259   0.00411    0.00113   0.00127
Bus 385    0.02066   0.02118    0.00377   0.00585    0.00088   0.00086
Bus 386    0.00341   0.01786    0.00406   0.00583    0.00091   0.00120
9-7
-------
Figures 9-6 to 9-8 and Table 9-3 illustrate that bus 379 has the largest median and the sec-
ond largest mean for NOx emissions, bus 372 has the largest median and the second largest mean
for CO emissions, while bus 364 has the largest median and mean for HC emissions. At the
same time, bus 382 has the largest mean for NOx emissions, and bus 360 has the largest mean for
CO emissions. The above figures and table demonstrate that although variability exists among
buses, it is difficult to determine which, if any, bus is a high emitter (i.e., a bus that exhibits ex-
tremely high emission rates under all operating conditions, which also may exhibit significantly
different emissions responses to operating activity than normal emitters).
A small number of very high HC emissions events are also noted in deceleration mode.
Based on the definition "acceleration < -1 mph/s", 1.49% (242/16237) of the HC data points in
deceleration mode are high emissions events. This occurred only for HC, not for NOx or CO.
All high HC emissions events were coded to determine whether they are related to any other
parameters; tree analysis can be used for this screening. After screening engine speed, engine
power, engine oil temperature, engine oil pressure, engine coolant temperature, ECM pressure,
and other parameters, no operating parameters appeared to be correlated with these high
emissions events.
The distribution of high HC emissions by bus and trip is presented in Table 9-4. Unlike idle
mode, where high HC emissions occurred mainly in three idle segments (bus 360, trip 4, idle
segment 1; bus 360, trip 4, idle segment 38; and bus 372, trip 1, idle segment 1), high HC emissions
in deceleration mode are dispersed among seven different buses and 18 different trips. Although
there is not enough evidence to suggest a specific bus is a "high emitter", bus 364 is worthy of
additional attention. There are 5284 data points for bus 364 and, among them, 887 data points are
classified as deceleration mode. Bus 364 has 408 high HC emissions data points in total, of which
193 fall in deceleration mode. The percentage of high HC emissions for bus 364 is 7.72%
(408/5284), while the percentage of high HC emissions for bus 364 in deceleration mode is about
21.8% (193/887). Given the limited available data, no conclusion could be drawn about high HC
emissions in deceleration mode. These potential outliers may simply reflect real-world emissions
variability for these engines.
Emission rate behavior as a function of operating mode and power for high-emitting ve-
hicles may differ significantly from normal-emitting vehicles. Since no high-emitting vehicle is
identified in the AATA data set, it is impossible for the modeler to examine such a difference. To
ensure that models are applicable to normal and high-emitters in the fleet, models have to have
both normal and high-emitters available in the analytical data set. Thus it is important to identify
high-emitting vehicles and bring them in for testing.
9-8
-------
Table 9-4 High HC Emissions Distribution by Bus and Trip for Deceleration Mode
Bus ID     Number of High HC Events
Bus 360    11
Bus 361    1
Bus 364    193
Bus 372    19
Bus 383    11
Bus 384    1
Bus 386    6

Bus and Trip       Number of High HC Events
Bus 360, trip 3    3
Bus 360, trip 4    8
Bus 361, trip 5    1
Bus 364, trip 1    46
Bus 364, trip 2    61
Bus 364, trip 3    86
Bus 372, trip 1    6
Bus 372, trip 2    4
Bus 372, trip 3    3
Bus 372, trip 4    6
Bus 383, trip 1    3
Bus 383, trip 2    3
Bus 383, trip 3    2
Bus 383, trip 4    3
Bus 384, trip 3    1
Bus 386, trip 1    1
Bus 386, trip 2    2
Bus 386, trip 4    3
9.2.2 Engine Power Distribution by Bus in Deceleration Mode
Engine power distribution by bus is shown in Figure 9-9 and Table 9-5. When the bus is
decelerating, the engine typically absorbs energy, yielding low or even negative engine power.
Table 9-5 reflects this characteristic of deceleration mode. According to the Sensors, Inc.
report (Ensfield 2001), negative engine power is recorded as zero power in the data, which
explains the large number of zero power values in the deceleration mode. The emission rates
under negative engine power conditions may be significantly different from those under positive
engine power. Further analysis will examine this question. Moreover, bus 372 has the greatest
3rd quartile engine power in deceleration mode, consistent with the finding in idle mode.
9-9
-------
Table 9-5 Engine Power Distributions in Deceleration Mode by Bus
Bus No     Minimum   1st Quartile   Median   3rd Quartile   Maximum
Bus 360    0         0              0        3.88           275.40
Bus 361    0         0              0        5.16           173.10
Bus 363    0         0              0        6.70           274.90
Bus 364    0         0              0        0              254.30
Bus 372    0         0              0        20.41          112.00
Bus 375    0         0              0        5.84           274.90
Bus 377    0         0              0        3.33           275.10
Bus 379    0         0              0        11.77          164.90
Bus 380    0         0              0        5.19           29.40
Bus 381    0         0              0        7.19           121.15
Bus 382    0         0              0        5.84           20.75
Bus 383    0         0              0        8.51           94.65
Bus 384    0         0              0        5.86           162.37
Bus 385    0         0              0        6.00           102.59
Bus 386    0         0              0        7.18           42.20
9-10
-------
Figure 9-9 Histograms of Engine Power in Deceleration Mode by Bus
9-11
-------
Based on definitions of "acceleration < -1 mph/s", about 1% of data points with high
engine power (>50 bhp) fall in deceleration mode (Table 9-1). Figure 9-10 illustrates plots of
engine power vs. vehicle speed, engine power vs. engine speed, and vehicle speed vs. engine
speed. Figure 9-10 shows that higher engine power always occurred with higher vehicle speed
and higher engine speed. These data points with higher engine power likely reflect the variabil-
ity of the real world and are all retained in the data set and mode definition to avoid potentially
biasing results.
Figure 9-10 Engine Power vs. Vehicle Speed, Engine Power vs. Engine Speed, and Vehicle
Speed vs. Engine Speed
9.3 The Deceleration Motoring Mode
Bus engines absorb energy during the deceleration mode, resulting in low or negative en-
gine power. According to the Sensors, Inc. report (Ensfield 2001), such negative power was re-
corded as zero power. The emissions under these negative engine power conditions may be sig-
nificantly different from those under positive engine power conditions, and therefore may need to
be included in the modeling regime as a separate mode of operation. To examine this possibility,
deceleration mode data were split into two mode bins for analysis. The first bin includes all data
points with zero engine power in deceleration mode, termed 'deceleration motoring mode.' The
9-12
-------
remaining data in the deceleration mode, which exhibit positive engine power, are classified as
deceleration non-motoring mode. The analysis will begin as a comparison of histograms of three
pollutants between deceleration motoring mode and deceleration non-motoring mode (Figure
9-11). Table 9-6 compares the mean, median, and skewness of emission distributions between
these two modes for the three pollutants. The statistical results for all deceleration data are also
presented as a reference. Figure 9-11 and Table 9-6 show that lower emission rates are more
prevalent in the deceleration motoring mode than in the deceleration non-motoring mode. Skew-
ness of emission distributions for deceleration motoring mode is also smaller.
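The summary statistics compared above can be sketched in a few lines of Python. This is only an illustration: the emission rates below are hypothetical, and the moment-based skewness formula is an assumption, since the report does not state which skewness estimator was used.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (an assumed estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical emission rates (g/s): mostly low values plus one high event.
rates = [0.001, 0.002, 0.002, 0.003, 0.004, 0.050]
summary = {
    "mean": statistics.mean(rates),
    "median": statistics.median(rates),
    "skewness": sample_skewness(rates),
}
```

For right-skewed data such as these, the mean exceeds the median and the skewness is positive, which is the pattern the text uses as an indicator of skewness.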
Figure 9-11 Histograms for Three Pollutants in Deceleration Motoring Mode (a) and Decelera-
tion Non-Motoring Mode (b)
To test the differences between deceleration motoring mode and deceleration non-motoring
mode, a Kolmogorov-Smirnov two-sample test was chosen rather than a standard t-test,
because the normal distribution assumption was questionable. The Kolmogorov-Smirnov
two-sample test is a test of the null hypothesis that two independent samples have been drawn
from the same population (or from populations with the same distribution). The test uses the
maximal difference between cumulative frequency distributions of two samples as the test
statistic. Results of the Kolmogorov-Smirnov two-sample tests demonstrate that the differences
in emission rates under deceleration motoring mode and deceleration non-motoring mode are
statistically significant.
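The test statistic described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names are hypothetical, and the critical value uses the standard large-sample approximation rather than an exact distribution.

```python
import math
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Return D = max |F_a(x) - F_b(x)| over all observed values x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d_max = 0.0
    for x in a + b:
        # Empirical CDF: fraction of observations less than or equal to x.
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d_max = max(d_max, abs(f_a - f_b))
    return d_max


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value: c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))
```

The null hypothesis of a common distribution is rejected at level alpha when the statistic exceeds the critical value for the two sample sizes.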
9-13
-------
Table 9-6 Comparison of Emission Distributions between Deceleration Mode and Two Sub-
Modes (Deceleration Motoring Mode and Deceleration Non-Motoring Mode)
                 NOx         CO          HC
Deceleration Mode
Number           16644       16919       16805
Minimum          0.00001     0.00001     0.00001
1st Quartile     0.00182     0.00249     0.00039
Median           0.00611     0.00398     0.00068
3rd Quartile     0.03155     0.00605     0.00120
Maximum          1.30640     0.85208     0.04200
Mean             0.02215     0.00580     0.00118
Skewness         6.02890     30.6459     5.76530
Sub-mode 1: Deceleration Motoring Mode
Number           10925       11304       11240
Minimum          0.00001     0.00001     0.00001
1st Quartile     0.00124     0.00269     0.00041
Median           0.00272     0.00401     0.00067
3rd Quartile     0.00816     0.00567     0.00110
Maximum          0.14930     0.20366     0.01425
Mean             0.00978     0.00528     0.00111
Skewness         3.08780     12.27120    3.92760
Sub-mode 2: Deceleration Non-Motoring Mode
Number           5719        5615        5565
Minimum          0.00002     0.00003     0.00001
1st Quartile     0.01973     0.00204     0.00034
Median           0.03431     0.00384     0.00069
3rd Quartile     0.05658     0.00741     0.00150
Maximum          1.30640     0.85208     0.04200
Mean             0.04576     0.00685     0.00131
Skewness         5.7018      26.8539     6.8026
9-14
-------
9.4 Deceleration Emission Rate Estimations
Using the "acceleration < -1 mph/s" cutpoint, about 16% of total data collected are clas-
sified in the deceleration mode. While deceleration emission rates could simply be estimated
directly by averaging all deceleration mode emission rates, the emission rate distribution is non-
normal. Because lambdas identified by the Box-Cox procedure for the whole dataset and decel-
eration mode subsets are different, and because using a transformation to estimate the mean and
construct confidence intervals will create other problems, the bootstrap (another class of general
methods) was used for estimation of the mean and for construction of confidence intervals. The
bootstrap function in this study resampled the emission rate data 1000 times and computed the
mean, 2.5%, and 97.5% percentile of each sample.
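A percentile bootstrap of the mean can be sketched as below. This is an illustration under stated assumptions, not the function used in the study: the names, the fixed seed, and the linear-interpolation percentile are hypothetical choices, and the sketch bootstraps only the mean, whereas the study also tracked the 2.5% and 97.5% percentiles of each resample.

```python
import random
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def bootstrap_mean_ci(values, n_resamples=1000, seed=0):
    """Return (sample mean, lower bound, upper bound) of a 95% percentile CI."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    return statistics.fmean(values), percentile(means, 2.5), percentile(means, 97.5)
```

Because the interval is read directly from the resampled means, no normality assumption or Box-Cox transformation is needed, which is the motivation given in the text.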
The results of the bootstrap analyses indicate that splitting the deceleration mode into
deceleration motoring mode and deceleration non-motoring mode using the zero engine power
criterion is warranted. The bootstrap distributions of mean emission rates for deceleration mode,
deceleration motoring mode, and deceleration non-motoring mode are presented in Figures 9-12
to 9-14 and Table 9-7. To illustrate the difference in emission rate estimation between
deceleration motoring mode and deceleration non-motoring mode, Figure 9-15 presents bootstrap
means and confidence intervals for the emission rates of all three pollutants. For reference
purposes, deceleration mode emission rate estimations are also presented. Table 9-7 and
Figure 9-15 show that the average emission rate for the deceleration motoring mode is much
lower than that for deceleration non-motoring mode for all pollutants, especially for NOx.
9-15
-------
Figure 9-12 Bootstrap Results for NOx Emission Rate Estimation in Deceleration Mode
Figure 9-13 Bootstrap Results for CO Emission Rate Estimation in Deceleration Mode
9-16
-------
Figure 9-14 Bootstrap Results for HC Emission Rate Estimation in Deceleration Mode
Figure 9-15 Emission Rate Estimation Based on Bootstrap for Deceleration Mode
9-17
-------
Table 9-7 Emission Rate Estimation and 95% Confidence Intervals Based on Bootstrap for
Deceleration Mode
                                                       2.5%                  97.5%
                              Average                  Percentile            Percentile
Deceleration Mode
NOx  Estimation               0.02215                  0.00024               0.10919
     Confidence Interval      0.02161 to 0.02268       0.00022 to 0.00027    0.10427 to 0.11411
CO   Estimation               0.00580                  0.00055               0.02191
     Confidence Interval      0.00562 to 0.00598       0.00051 to 0.00059    0.02067 to 0.02314
HC   Estimation               0.00118                  0.00004               0.00652
     Confidence Interval      0.00115 to 0.00121       0.00004 to 0.00004    0.00626 to 0.00679
Deceleration Motoring Mode
NOx  Estimation               0.00978                  0.00017               0.06540
     Confidence Interval      0.00945 to 0.01010       0.00015 to 0.00019    0.06306 to 0.06774
CO   Estimation               0.00529                  0.00072               0.01743
     Confidence Interval      0.00514 to 0.00543       0.00068 to 0.00075    0.01635 to 0.01850
HC   Estimation               0.00111                  0.00004               0.00652
     Confidence Interval      0.00109 to 0.00114       0.00004 to 0.00004    0.00621 to 0.00683
Deceleration Non-Motoring Mode
NOx  Estimation               0.04578                  0.00173               0.17187
     Confidence Interval      0.04457 to 0.04698       0.00152 to 0.00195    0.16343 to 0.18031
CO   Estimation               0.00686                  0.00037               0.02846
     Confidence Interval      0.00643 to 0.00728       0.00033 to 0.00040    0.02587 to 0.03104
HC   Estimation               0.00131                  0.00004               0.00650
     Confidence Interval      0.00125 to 0.00137       0.00003 to 0.00005    0.00594 to 0.00706
9-18
-------
Based on Table 9-7, the deceleration emission rate for NOx is set as 0.02215 g/s with
95% confidence interval (0.00024 to 0.10919), CO as 0.00580 g/s with 95% confidence interval
(0.00055 to 0.02191), and HC as 0.00118 g/s with 95% confidence interval (0.00004 to 0.00652).
The deceleration motoring emission rate for NOx is set as 0.00978 g/s with 95% confidence
interval (0.00017 to 0.06540), CO as 0.00529 g/s with 95% confidence interval (0.00072 to
0.01743), and HC as 0.00111 g/s with 95% confidence interval (0.00004 to 0.00652). The
deceleration non-motoring mode emission rate for NOx is set as 0.04578 g/s with 95% confidence
interval (0.00173 to 0.17187), CO as 0.00686 g/s with 95% confidence interval (0.00037 to
0.02846), and HC as 0.00131 g/s with 95% confidence interval (0.00004 to 0.00650).
9.5 Conclusions and Further Considerations
In this research, deceleration mode is defined as "acceleration < -1 mph/s". However the
emissions under negative engine power are different from those under positive engine power.
Hence, the deceleration mode is split into deceleration motoring mode and deceleration non-
motoring mode based on engine power.
Inter-bus variability analysis indicates that bus 372 has the largest 3rd Quartile value for
engine power among 15 buses in deceleration mode, consistent with the finding in idle mode. At
the same time, inter-bus variability analysis results show that bus 379 has the largest median and
the second largest mean for NOx emissions, bus 372 has the largest median and the second larg-
est mean for CO emissions, while bus 364 has the largest median and mean for HC emissions.
But it is difficult to conclude that these buses should be classified as high emitters or that there
are any special modes that should be modeled separately as high-emitting modes.
Some high HC emissions events are noted in deceleration mode. After screening engine
speed, engine power, engine oil temperature, engine oil pressure, engine coolant temperature,
ECM pressure, and other parameters, these operating parameters could not be linked to these
high emissions occurrences. Additional causal variables may be in play that are not included in
the data available for analysis.
Based on definitions of "acceleration < -1 mph/s", about 1% of data points exhibit some-
what unusually high engine power (> 50 bhp) in deceleration mode. Analysis shows that higher
engine power always happened with higher vehicle speed and higher engine speed. These high-
er-power data points likely reflect the variability in real world power demand (perhaps associated
with operations on grade, which could not be identified in the database). All of these data were
retained in the model to avoid potentially biasing the results.
9-19
-------
In summary, the deceleration non-motoring mode emission rate for NOx is set as 0.04578
g/s, CO as 0.00686 g/s, and HC as 0.00131 g/s. The deceleration motoring emission rate for NOx
is set as 0.00978 g/s, CO as 0.00529 g/s, and HC as 0.00111 g/s. Emission rate estimation for the
deceleration motoring mode is significantly lower than the deceleration non-motoring mode for
all three pollutants, especially for NOx.
9-20
-------
CHAPTER 10
10. ACCELERATION MODE DEVELOPMENT
After developing the idle mode definition and emission rate in Chapter 8 and deceleration
mode definitions and emission rates in Chapter 9, the next task is to divide the rest of the data
into acceleration and cruise mode. This chapter examines the definition of acceleration activity
and emission rates for acceleration activity.
10.1 Critical Value for Acceleration in Acceleration Mode
The first task related to analysis of emission rates in the acceleration mode is identifying
a critical value for acceleration. Two values were tested: 1 mph/s and 2 mph/s. Since the critical
value of "acceleration > 1 mph/s" will include all data under the critical value of "acceleration
> 2 mph/s", comparison of data falling between these two potential cut points is conducted first.
Once selected, the chosen critical value will be used to divide the data into acceleration mode
and cruise mode. Thus "acceleration > 0 mph/s and acceleration < 1 mph/s" will be another op-
tion. Similarly to analysis for deceleration mode, these three options will be:
• Option 1: acceleration > 2 mph/s
• Option 2: acceleration > 1 mph/s and acceleration < 2 mph/s
• Option 3: acceleration > 0 mph/s and acceleration < 1 mph/s
Figure 10-1 illustrates engine power distribution for these three options. Figures 10-2 to
10-4 compare engine power vs. emission rate for three pollutants for three options. Tables 10-1
and 10-2 provide the distribution for these three options in two ways: by number and percentage.
10-1
-------
Figure 10-1 Engine Power Distribution for Three Options
Figure 10-2 Engine Power vs. NOx Emission Rate (g/s) for Three Options
10-2
-------
Figure 10-3 Engine Power vs. CO Emission Rate (g/s) for Three Options
Figure 10-4 Engine Power vs. HC Emission Rate (g/s) for Three Options
-------
Table 10-1 Engine Power Distribution for Three Options for Three Pollutants
                                   Engine Power (brake horsepower (bhp))
Acceleration  Pollutants  (0 50)   (50 100)  (100 150)  (150 200)  > 200    Total
Option 1      NOx         322      446       852        1229       5870     8719
              CO          319      444       851        1228       5870     8712
              HC          318      440       833        1203       5649     8443
Option 2      NOx         613      865       1358       1324       6015     10175
              CO          606      858       1355       1321       6012     10152
              HC          605      843       1328       1287       5824     9887
Option 3      NOx         3208     4130      4378       2490       3205     17411
              CO          3190     4105      4362       2487       3185     17329
              HC          3104     3972      4195       2408       3131     16810
Table 10-2 Percentage of Engine Power Distribution for Three Options for Three Pollutants
                                   Engine Power (brake horsepower (bhp))
Acceleration  Pollutants  (0 50)   (50 100)  (100 150)  (150 200)  > 200    Total
Option 1      NOx         3.7%     5.1%      9.8%       14.1%      67.3%    100.0%
              CO          3.7%     5.1%      9.8%       14.1%      67.4%    100.0%
              HC          3.8%     5.2%      9.9%       14.2%      66.9%    100.0%
Option 2      NOx         6.0%     8.5%      13.3%      13.0%      59.1%    100.0%
              CO          6.0%     8.5%      13.3%      13.0%      59.2%    100.0%
              HC          6.1%     8.5%      13.4%      13.0%      58.9%    100.0%
Option 3      NOx         18.4%    23.7%     25.1%      14.3%      18.4%    100.0%
              CO          18.4%    23.7%     25.2%      14.4%      18.4%    100.0%
              HC          18.5%    23.6%     25.0%      14.3%      18.6%    100.0%
If the critical value is set as 1 mph/s for acceleration mode, data falling into option 1 and
option 2 will be classified as acceleration mode, while data falling into option 3 will be
classified as cruise mode. If the critical value is set as 2 mph/s for acceleration mode, data
falling into option 1 will be classified as acceleration mode, while data falling into option 2 and
option 3 will be classified as cruise mode. The above figures and tables show little difference
between the engine power distributions for data falling into option 1 and option 2, while the
power distribution for option 3 is clearly different from both. Tables 10-1 and 10-2 show that
engine power is concentrated in the higher engine power range (> 200 bhp) for options 1 and 2
(about 67% and 59% of data points, respectively), compared with about 18% for option 3.
Grouping options 1 and 2 therefore better reflects the real-world power demand of the vehicle in
acceleration mode. Hence, the critical value is set as 1 mph/s for acceleration mode.
10-4
-------
After defining "acceleration > 1 mph/s" as acceleration mode, cruise mode data will
consist of all of the remaining data in the database (i.e., data not previously classified into idle,
deceleration, and now acceleration). Unlike idle and deceleration mode, there is a general rela-
tionship between engine power and emission rate for acceleration mode and cruise mode. Even
though the engine power distribution for acceleration mode is different from that of cruise mode
(Table 10-3), these two modes share a relationship between engine power and emission rate (Fig-
ure 10-5), although there are potentially some significant differences noted in the HC chart.
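The mode definitions adopted so far can be summarized in a short sketch. This is an illustration only: the function name and the `is_idle` flag are hypothetical, and the idle criterion itself (developed in Chapter 8) is not reproduced here, so the caller is assumed to supply it.

```python
def classify_mode(acceleration_mph_s, engine_power_bhp, is_idle=False):
    """Assign one second of activity data to an operating mode.

    is_idle is a hypothetical flag standing in for the Chapter 8 idle definition.
    """
    if is_idle:
        return "idle"
    if acceleration_mph_s < -1.0:
        # Negative engine power is recorded as zero, so zero power marks motoring.
        if engine_power_bhp == 0:
            return "deceleration motoring"
        return "deceleration non-motoring"
    if acceleration_mph_s > 1.0:
        return "acceleration"
    # Everything not classified as idle, deceleration, or acceleration.
    return "cruise"
```

Checking idle first and leaving cruise as the fall-through case mirrors the order in which the modes are defined in the text.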
Table 10-3 Engine Power Distribution for Acceleration Mode and Cruise Mode
                                   Engine Power Distribution
            Pollutants  (0 50)   (50 100)  (100 150)  (150 200)  > 200    Total
Acceleration mode
Number      NOx         935      1311      2210       2553       11885    18894
            CO          925      1302      2206       2549       11882    18864
            HC          923      1283      2161       2490       11473    18330
Percentage  NOx         4.95%    6.94%     11.70%     13.51%     62.90%   100.00%
            CO          4.90%    6.90%     11.69%     13.51%     62.99%   100.00%
            HC          5.04%    7.00%     11.79%     13.58%     62.59%   100.00%
Cruise mode
Number      NOx         15885    8988      7173       3536       3792     39374
            CO          15834    8940      7145       3529       3770     39218
            HC          15481    8600      6830       3394       3715     38020
Percentage  NOx         40.34%   22.83%    18.22%     8.98%      9.63%    100.00%
            CO          40.37%   22.80%    18.22%     9.00%      9.61%    100.00%
            HC          40.72%   22.62%    17.96%     8.93%      9.77%    100.00%
10-5
-------
Figure 10-5 Engine Power vs. Emission Rate for Acceleration Mode and Cruise Mode
The relationships between emission rate and power for acceleration mode data will be ex-
plored in this chapter, while the relationships between emission rate and power for cruise mode
data will be explored in the next chapter.
10.2 Analysis of Acceleration Mode Data
10.2.1 Emission Rate Distribution by Bus in Acceleration Mode
After defining vehicle activity data with "acceleration > 1 mph/s" as acceleration mode,
emission rate histograms for each of the three pollutants for acceleration operations are presented
in Figure 10-6. Figure 10-6 shows significant skewness for all three pollutants in acceleration
mode. A small number of very high HC emissions events are also noted in acceleration mode.
After screening engine speed, engine power, engine oil temperature, engine oil pressure,
engine coolant temperature, ECM pressure, and other parameters, no operating parameters
appeared to be correlated with the high emissions events.
10-6
-------
Figure 10-6 Histograms of Three Pollutants for Acceleration Mode
Inter-bus response variability for acceleration mode operations is illustrated in Figures
10-7 to 10-9 using median and mean of NOx, CO, and HC emission rates. Table 10-4 presents
the same information in tabular form. The difference between median and mean is also an indi-
cator of skewness.
Table 10-4 Median and Mean of Three Pollutants in Acceleration Mode by Bus
                 NOx                  CO                   HC
Bus ID     Median    Mean       Median    Mean       Median    Mean
Bus 360    0.27729   0.25957    0.06527   0.09217    0.00159   0.00182
Bus 361    0.30170   0.28125    0.05177   0.08001    0.00184   0.00228
Bus 363    0.14459   0.14058    0.03836   0.09012    0.00022   0.00039
Bus 364    0.28948   0.26033    0.03501   0.05650    0.00306   0.00363
Bus 372    0.17834   0.18627    0.02980   0.03475    0.00250   0.00279
Bus 375    0.31092   0.28991    0.05929   0.08619    0.00143   0.00176
Bus 377    0.17827   0.17335    0.04755   0.09612    0.00104   0.00112
Bus 379    0.17788   0.20883    0.08430   0.10346    0.00222   0.00276
Bus 380    0.26410   0.26620    0.08238   0.19149    0.00210   0.00253
Bus 381    0.18011   0.19806    0.07856   0.12646    0.00095   0.00106
10-7
-------
Table 10-4 (continued)
                 NOx                  CO                   HC
Bus ID     Median    Mean       Median    Mean       Median    Mean
Bus 382    0.28966   0.29152    0.09234   0.18179    0.00263   0.00272
Bus 383    0.24419   0.26739    0.05355   0.13112    0.00308   0.00368
Bus 384    0.18775   0.22139    0.07111   0.17389    0.00401   0.00429
Bus 385    0.17783   0.21706    0.05141   0.07893    0.00361   0.00384
Bus 386    0.22674   0.24673    0.10412   0.23806    0.00272   0.00282
Figure 10-7 Median and Mean of NOx Emission Rates in Acceleration Mode by Bus
10-8
-------
Figure 10-8 Median and Mean of CO Emission Rates in Acceleration Mode by Bus
Figure 10-9 Median and Mean of HC Emission Rates in Acceleration Mode by Bus
10-9
-------
Figures 10-7 to 10-9 and Table 10-4 illustrate that NOx emissions are more consistent than
CO and HC emissions. Across the 15 buses, Bus 386 has the largest median and mean for CO
emissions, while Bus 384 has the largest median and mean for HC emissions. The above figures
and table demonstrate that although variability exists across buses, it is difficult to conclude that
there are any true "high emitters." That is, the emissions from these buses are not consistently more
than one or two standard deviations from the mean under normal operating conditions. Meanwhile,
Bus 363 has the smallest mean and median HC emissions compared to the other 14 buses.
10.2.2 Engine Power Distribution by Bus in Acceleration Mode
Engine power distribution in acceleration mode by bus is shown in Figure 10-10 and
Table 10-5. When the bus is accelerating, the engine will be required to produce more power.
Figure 10-10 and Table 10-5 reflect this characteristic of acceleration mode. The distribution
of engine power in acceleration mode is significantly different from deceleration mode and idle
mode. Bus 372 has the largest minimum engine power in acceleration mode, consistent with the
finding for idle mode and deceleration mode. The maximum power values for each bus match
well with the manufacturer's engine power rating. Although variability for engine power distri-
bution exists across buses, it is difficult to conclude that such variability is affected by individual
buses, bus routes, or other factors. The relationship between power and emissions appears con-
sistent across the buses for acceleration mode.
Table 10-5 Engine Power Distribution in Acceleration Mode by Bus
Bus ID     Number   Min      1st Quartile   Median   3rd Quartile   Max      Mean
Bus 360    1507     0        162.96         255.57   275.05         275.59   212.04
Bus 361    545      7.16     131.96         199.58   261.51         275.54   184.46
Bus 363    1287     0        111.52         200.39   267.06         275.59   180.03
Bus 364    931      0        142.82         228.25   270.01         275.56   197.27
Bus 372    728      34.42    145.57         213.51   264.70         275.56   199.81
Bus 375    1599     0        140.92         259.45   275.13         275.57   205.56
Bus 377    1751     3.35     166.25         256.89   275.08         275.60   212.09
Bus 379    1427     0        204.15         264.54   275.18         275.58   233.71
Bus 380    1823     0        202.69         262.11   275.15         275.54   228.55
Bus 381    1362     0        139.86         220.00   272.21         275.60   199.20
Bus 382    691      0        173.36         250.90   275.05         275.58   218.82
Bus 383    1043     0        161.16         250.37   275.08         275.59   213.70
Bus 384    1292     0        144.10         213.87   269.50         275.60   198.80
Bus 385    1377     0        143.51         226.37   274.99         275.55   201.67
Bus 386    1532     13.81    164.27         244.80   275.06         275.60   215.95
10-10
-------
Engine power distribution also shows that about 0.19% (36/18895) of data points show
zero load in acceleration mode. Of the 36 data points exhibiting zero indicated engine load,
about 92% (33/36) occurred on roads reported to have zero or negative grade. Due to the
inaccuracy of road grade values, it was not possible to simulate the engine power in this research.
However, in the real world, linear acceleration with zero load can happen on downhill stretches.
Applying load-based emission rates to predicted engine load will make it possible to take grade
into account in the overall modeling framework. Because only 36 data points with zero load were
included in the acceleration data, it was unnecessary to develop a sub-model for them.
Meanwhile, such zero loads in acceleration mode do reflect the variability of acceleration data in
the real world.
Figure 10-10 Histograms of Engine Power in Acceleration Mode by Bus
10-11
-------
10.3 Model Development and Refinement
10.3.1 HTBR Tree Model Development
The potential explanatory variables included in the emission rate model development ef-
fort include:
• Vehicle characteristics: model year, odometer reading, bus ID (14 dummy variables);
• Roadway characteristics: dummy variable for road grade;
• Onroad load parameters: engine power (bhp), vehicle speed (mph), acceleration
(mph/s);
• Engine operating parameters: engine oil temperature (deg F), engine oil pressure
(kPa), engine coolant temperature (deg F), barometric pressure reported from ECM
(kPa);
• Environmental conditions: ambient temperature (deg C), ambient pressure (mbar),
ambient relative humidity (%).
The HTBR technique is used first to identify potentially significant explanatory variables;
this analysis provides the starting point for conceptual model development. The HTBR model
is used to guide the development of an OLS regression model, not as a model in its own right.
HTBR can be used as a data reduction tool and for identifying potential interactions among the
variables. OLS regression is then used with the identified variables to estimate a preliminary
"final" model.
These 27 variables were first offered to the tree model. To arrive at the "best" model,
various regression tree models were created. The initial model was created by allowing the tree
to grow unconstrained for the first cut. Once an initial model was created, the supervised tech-
nique in S-PLUS was used to simplify the model by removing the lower branches of the tree that
explained the least deviance. For application purposes, the resulting tree was examined to ensure
that the model's predictive ability was not compromised by allowing the overall amount of devi-
ance to increase significantly.
The 27 variables include continuous, categorical, and dummy variables. Dummy variables
for buses can be used to indicate bus-to-bus variability. As in the analysis in Chapter 6,
these 15 buses could be treated as a single group for purposes of analysis and model
development. The HTBR technique can examine the potential additional influence of road grade
(i.e., above and beyond the contribution to power demand) using a dummy variable to represent
a grade category (the final model does not include this dummy variable due to the inaccuracy
of road grade values). Analysis results in Chapter 6 indicate that all environmental
characteristics, such as temperature, humidity, and barometric pressure, are moderately
correlated with each other. In addition, engine operating parameters, such as engine oil
pressure, engine oil temperature, engine coolant temperature, and barometric pressure reported
from the ECM, are highly or moderately correlated with on-road operating parameters such as
engine power, vehicle speed, and acceleration. The modeler should be aware of such correlations
among explanatory variables.
Although evidence in the literature suggests that a logarithmic transformation is most
suitable for modeling motor vehicle emissions (Washington 1994; Ramamurthy et al. 1998;
Fomunung 2000; Frey et al. 2002), this transformation needs to be verified through the Box-Cox
procedure. The Box-Cox function in MATLAB™ can automatically identify a transformation
from the family of power transformations on emission data, with lambda ranging from -1.0 to
1.0. The lambdas chosen by the Box-Cox procedure for acceleration mode are 0.683 for NOx,
0.094438 for CO, and 0.31919 for HC. The Box-Cox procedure is used only to provide a guide for
selecting a transformation, so overly precise results are not needed (Neter et al. 1996). It is
often reasonable to use a nearby lambda value that is easier to interpret for the power
transformation. Although the lambdas chosen by the Box-Cox procedure differ between
acceleration and cruise mode, the nearby lambda values are the same for these two modes. In
summary, the lambda values used for transformations in acceleration mode are 1/2 for NOx, 0 for
CO (indicating a log transformation), and 1/4 for HC. Figures 10-11 to 10-13 present
histograms, boxplots, and probability plots of truncated emission rates in acceleration mode for
NOx, CO, and HC, while Figures 10-14 to 10-16 present the same plots for truncated transformed
emission rates for NOx, CO, and HC, where a marked improvement toward normality is noted.
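The rounding of Box-Cox lambdas to nearby, easier-to-interpret exponents can be illustrated with a minimal Python sketch. The candidate set `CONVENIENT`, the helper names, and the use of the natural log for lambda = 0 are assumptions made for illustration; the report does not state them, and this is not the MATLAB routine used in the study.

```python
import math

# Lambdas reported in the text for acceleration mode (Box-Cox procedure).
REPORTED_LAMBDA = {"NOx": 0.683, "CO": 0.094438, "HC": 0.31919}

# Hypothetical set of "convenient" exponents to round to (an assumption).
CONVENIENT = (0.0, 0.25, 0.5, 1.0)


def nearest_convenient(lam, candidates=CONVENIENT):
    """Return the candidate exponent closest to the estimated lambda."""
    return min(candidates, key=lambda c: abs(c - lam))


def power_transform(y, lam):
    """Simple power transformation; lam == 0 is read as a log transformation.

    Natural log is assumed here; the report does not state the log base.
    """
    if y <= 0:
        raise ValueError("emission rate must be positive for this sketch")
    return math.log(y) if lam == 0 else y ** lam


# Exponent selected for each pollutant under the assumed candidate set.
chosen = {p: nearest_convenient(lam) for p, lam in REPORTED_LAMBDA.items()}
```

Under these assumptions the selected exponents match the values used in the text: 1/2 for NOx, 0 for CO, and 1/4 for HC.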
Figure 10-12 Histogram, Boxplot, and Probability Plot of Truncated CO Emission Rate in Acceleration Mode
Figure 10-13 Histogram, Boxplot, and Probability Plot of Truncated HC Emission Rate in Acceleration Mode
Figure 10-14 Histogram, Boxplot, and Probability Plot of Truncated Transformed NOx Emission
Rate in Acceleration Mode
Figure 10-15 Histogram, Boxplot, and Probability Plot of Truncated Transformed CO Emission
Rate in Acceleration Mode
Figure 10-16 Histogram, Boxplot, and Probability Plot of Truncated Transformed HC Emission
Rate in Acceleration Mode
10.3.1.1 NOx HTBR Tree Model Development
Figure 10-17 illustrates the initial tree model used for the truncated transformed NOx
emission rate in acceleration mode. Results for the initial model are given in Table 10-6. The
tree grew into a complex model, with a considerable number of branches and 36 terminal nodes.
Figure 10-18 illustrates the amount of deviation explained corresponding to the number of
terminal nodes.
Figure 10-17 Original Untrimmed Regression Tree Model for Truncated Transformed NOx Emission
Rate in Acceleration Mode
Table 10-6 Original Untrimmed Regression Tree Results for Truncated Transformed NOx Emission
Rate in Acceleration Mode
Regression tree:
tree(formula = NOx.50 ~ model.year + odometer + temperature + baro + humidity +
    vehicle.speed + oil.temperture + oil.press + cool.temperature +
    eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
    bus364 + bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
    bus383 + bus384 + bus385 + dummy.grade, data = busdata10242006.1.3,
    na.action = na.exclude, mincut = 400, minsize = 800, mindev = 0.01)
Variables actually used in tree construction:
[1] "engine.power"  "vehicle.speed" "temperature" "baro"
[5] "bus375"        "humidity"      "oil.press"   "odometer"
[9] "eng.bar.press" "bus379"        "model.year"  "oil.temperture"
Number of terminal nodes: 36
Residual mean deviance: 0.005538 = 104.4 / 18860
Distribution of residuals:
        Min.      1st Qu.       Median        Mean      3rd Qu.        Max.
 -3.769e-001  -4.176e-002  -4.298e-003  3.661e-017   3.957e-002  i.965e-001
For model application purposes, it is desirable to select a final model specification that
balances the model's ability to explain the maximum amount of deviation with a simpler model
that is easy to interpret and apply. Figure 10-18 indicates that the reduction in deviation from
adding nodes beyond four, although potentially statistically significant, is very small. A
simplified tree model was derived that ends in 4 terminal nodes, compared to the 36 terminal
nodes in the initial model. The residual deviance increased from 104.4 to 151.2 (the residual
mean deviance from 0.005538 to 0.008002), yielding a much more parsimonious model. Results are
shown in Table 10-7 and Figure 10-19. The NOx acceleration emission rate model will be
developed based upon these results.
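The trade-off between tree size and explained deviance can be made concrete with a short Python sketch. It uses only numbers reported in the S-PLUS output (root deviance 247.20 from Table 10-7, residual deviances 104.4 and 151.2); the function name is hypothetical and the calculation is an illustration, not part of the original analysis.

```python
# Root-node deviance of NOx.50 in acceleration mode (root row of Table 10-7).
ROOT_DEVIANCE = 247.20


def deviance_explained(residual_deviance, root_deviance=ROOT_DEVIANCE):
    """Fraction of the root deviance accounted for by the tree."""
    return 1.0 - residual_deviance / root_deviance


# Untrimmed tree with 36 terminal nodes (Table 10-6).
full_tree = deviance_explained(104.4)

# Trimmed tree with 4 terminal nodes (Table 10-7).
trimmed_tree = deviance_explained(151.2)

print(f"36 nodes: {full_tree:.3f}, 4 nodes: {trimmed_tree:.3f}")
```

On these figures the 36-node tree accounts for roughly 58% of the root deviance and the 4-node tree for roughly 39%, which quantifies what is given up in exchange for a tree that is easier to interpret and apply.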
Figure 10-18 Reduction in Deviation with the Addition of Nodes of Regression Tree for Truncated
Transformed NOx Emission Rate in Acceleration Mode
Figure 10-19 Trimmed Regression Tree Model for Truncated Transformed NOx Emission Rate in
Acceleration Mode
Table 10-7 Trimmed Regression Tree Results for Truncated Transformed NOx Emission Rate in
Acceleration Mode
Regression tree:
snip.tree(tree = tree(formula = NOx.50 ~ model.year + odometer + temperature +
baro + humidity + vehicle.speed + oil.temperture + oil.press +
cool.temperature + eng.bar.press + engine.power + acceleration +
bus360 + bus361 + bus363 + bus364 + bus372 + bus375 + bus377 + bus379 +
bus380 + bus381 + bus382 + bus383 + bus384 + bus385 + dummy.grade,
    data = busdata10242006.1.3, na.action = na.exclude, mincut = 400,
minsize = 800, mindev = 0.01), nodes = c(13., 7., 12., 2.))
Variables actually used in tree construction:
[1] "engine.power" "vehicle.speed" "temperature"
Number of terminal nodes: 4
Residual mean deviance: 0.008002 = 151.2 / 18890
Distribution of residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-4.265e-001 -5.813e-002 -7.517e-004 8.861e-016 5.810e-002 8.710e-001
node), split, n, deviance, yval
* denotes terminal node
1) root 18894 247.20 0.4669
2) engine.power<72.3 1397 13.67 0.2581 *
3) engine.power>72.3 17497 167.70 0.4836
6) vehicle.speed<25.95 13777 121.40 0.4662
12) temperature<20.5 4902 42.44 0.5034 *
13) temperature>20.5 8875 68.45 0.4456 *
7) vehicle.speed>25.95 3720 26.60 0.5482 *
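The trimmed tree in Table 10-7 can be read as a simple lookup rule. The Python sketch below encodes the four terminal nodes; the function names, the treatment of values exactly at a split point, and the back-transformation by squaring are assumptions made for illustration (the node values are means of the square-root-transformed NOx rate, so squaring them does not reproduce the mean of the untransformed rate).

```python
def nox_sqrt_from_tree(engine_power_bhp, vehicle_speed_mph, temperature_c):
    """Return the terminal-node mean of NOx**0.5 from Table 10-7.

    Values exactly at a split point are sent to the right-hand branch here;
    the S-PLUS printout does not show how ties are handled.
    """
    if engine_power_bhp < 72.3:
        return 0.2581          # node 2
    if vehicle_speed_mph >= 25.95:
        return 0.5482          # node 7
    if temperature_c < 20.5:
        return 0.5034          # node 12
    return 0.4456              # node 13


def nox_rate_from_tree(engine_power_bhp, vehicle_speed_mph, temperature_c):
    """Back-transform the node mean to the emission-rate scale by squaring."""
    y = nox_sqrt_from_tree(engine_power_bhp, vehicle_speed_mph, temperature_c)
    return y * y


# Example: high load, low speed, warm ambient temperature falls in node 13.
example = nox_sqrt_from_tree(150.0, 10.0, 25.0)
```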
This tree model suggests that engine power is the most important explanatory variable for
NOx emissions. This result is consistent with previous research that verified the important
effect of engine power on NOx emissions (Ramamurthy et al. 1998; Clark et al. 2002; Barth
et al. 2004). Analysis in the previous chapter also indicates that engine power is correlated
not only with on-road load parameters such as vehicle speed, acceleration, and grade, but also
with engine operating parameters such as throttle position and engine oil pressure. In
addition, engine power in this research is derived from engine speed, engine torque, and
percent engine load. Engine power can therefore link on-road modal activity with engine
operating conditions. This strengthens the case for introducing engine power into the
conceptual model and underscores the need to improve the ability to simulate engine power for
regional inventory development.
HTBR results suggest that temperature may be an important predictive variable for NOx
emissions under certain conditions. Temperature effects may need to be integrated into new
models in the form of a temperature correction factor. But adequate data are not yet available for
this purpose. For the time being, temperature is removed from consideration in further linear re-
gression model development, but the effect is probably significant and should be examined when
more comprehensive emission rate data collected under a wider variety of temperature conditions
are available for analysis.
10.3.1.2 CO HTBR Tree Model Development
Figure 10-20 illustrates the initial tree model used for truncated transformed CO emission
rate in acceleration mode. Results from the initial model are given in Table 10-8. The tree grew
into a complex model with a considerable number of branches and 33 terminal nodes. Figure 10-
21 illustrates the amount of deviation explained corresponding to the number of terminal nodes.
Figure 10-20 Original Untrimmed Regression Tree Model for Truncated Transformed CO Emis-
sion Rate in Acceleration Mode
Table 10-8 Original Untrimmed Regression Tree Results for Truncated Transformed CO Emis-
sion Rate in Acceleration Mode
Regression tree:
tree(formula = log.CO ~ model.year + odometer + temperature + baro + humidity +
vehicle.speed + oil.temperture + oil.press + cool.temperature +
eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
bus364 + bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
    bus383 + bus384 + bus385 + dummy.grade, data = busdata10242006.1.3,
    na.action = na.exclude, mincut = 400, minsize = 800, mindev = 0.01)
Variables actually used in tree construction:
[1] "engine.power" "humidity"   "vehicle.speed" "acceleration"
[5] "odometer"     "model.year" "baro"          "eng.bar.press"
Number of terminal nodes: 33
Residual mean deviance: 0.1184 = 2229 / 18830
Distribution of residuals:
        Min.      1st Qu.       Median        Mean     3rd Qu.        Max.
 -2.552e+000  -2.001e-001  -1.285e-002  3.025e-017  1.981e-001  1.653e+000
For model application purposes, it is desirable to select a final model specification that
balances the model's ability to explain the maximum amount of deviation with a simpler model
that is easy to interpret and apply. Figure 10-21 indicates that the reduction in deviation from
adding nodes beyond four, although potentially statistically significant, is very small. A
simplified tree model was derived that ends in four terminal nodes, compared to the 33 terminal
nodes in the initial model. The residual deviance increased from 2229 to 3093 (the residual mean
deviance from 0.1184 to 0.164), yielding a much cleaner model than the initial one. Results are
shown in Table 10-9 and Figure 10-22. The CO acceleration emission rate model will be developed
based upon these results.
Figure 10-21 Reduction in Deviation with the Addition of Nodes of Regression Tree for Trun-
cated Transformed CO Emission Rate in Acceleration Mode
Table 10-9 Trimmed Regression Tree Results for Truncated Transformed CO Emission Rate in
Acceleration Mode
Regression tree:
snip.tree(tree = tree(formula = log.CO ~ model.year + odometer + temperature +
baro + humidity + vehicle.speed + oil.temperture + oil.press +
cool.temperature + eng.bar.press + engine.power + acceleration +
bus360 + bus361 + bus363 + bus364 + bus372 + bus375 + bus377 + bus379 +
bus380 + bus381 + bus382 + bus383 + bus384 + bus385 + dummy.grade,
    data = busdata10242006.1.3, na.action = na.exclude, mincut = 400,
minsize = 800, mindev = 0.01), nodes = c(12., 7., 2., 13.))
Variables actually used in tree construction:
[1] "engine.power" "vehicle.speed"
Number of terminal nodes: 4
Residual mean deviance: 0.164 = 3093 / 18860
Distribution of residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-3.019e+000 -2.450e-001 -1.062e-002 -9.774e-017 2.430e-001 1.735e+000
node), split, n, deviance, yval
* denotes terminal node
1) root 18864 5309.0 -1.1990
2) engine.power<82.625 1624 560.0 -1.9810 *
3) engine.power>82.625 17240 3662.0 -1.1250
6) vehicle.speed<19.05 9752 1994.0 -0.9339
12) engine.power<152.965 2335 522.6 -1.2510 *
13) engine.power>152.965 7417 1163.0 -0.8342 *
7) vehicle.speed>19.05 7488 847.2 -1.3740 *
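The summary lines of Table 10-9 can be cross-checked against the node listing. The Python sketch below sums the terminal-node counts and deviances; the dictionary layout and variable names are illustrative assumptions, while the numbers are taken from the table.

```python
# Terminal nodes of the trimmed CO tree: node id -> (n, deviance), Table 10-9.
CO_TERMINAL_NODES = {
    2: (1624, 560.0),
    12: (2335, 522.6),
    13: (7417, 1163.0),
    7: (7488, 847.2),
}
ROOT_N, ROOT_DEVIANCE = 18864, 5309.0

n_total = sum(n for n, _ in CO_TERMINAL_NODES.values())
residual_deviance = sum(dev for _, dev in CO_TERMINAL_NODES.values())

# Residual mean deviance divides by (observations - number of terminal nodes).
residual_mean_deviance = residual_deviance / (n_total - len(CO_TERMINAL_NODES))

# Share of the root deviance accounted for by the four-node tree.
explained = 1.0 - residual_deviance / ROOT_DEVIANCE
```

The node counts add up to the root count of 18864, and the summed deviance (about 3093) divided by 18860 reproduces the reported residual mean deviance of 0.164. On these figures the four-node tree accounts for about 42% of the root deviance.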
This tree model suggests that engine power is the most important explanatory variable
for CO emissions, consistent with the result for NOx emissions. This tree will be used as a
reference for linear regression model development.
10.3.1.3 HC HTBR Tree Model Development
Figure 10-23 illustrates the initial tree model used for the truncated transformed HC
emission rate in acceleration mode. Results for the initial model are given in Table 10-10. The
tree grew into a complex model with a considerable number of branches and 31 terminal nodes.
Figure 10-23 Original Untrimmed Regression Tree Model for Truncated Transformed HC Emis-
sion Rate in Acceleration Mode
Table 10-10 Original Untrimmed Regression Tree Results for Truncated Transformed HC Emis-
sion Rate in Acceleration Mode
Regression tree:
tree(formula = HC.25 ~ model.year + odometer + temperature + baro + humidity +
vehicle.speed + oil.temperture + oil.press + cool.temperature +
eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
bus364 + bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
    bus383 + bus384 + bus385 + dummy.grade, data = busdata10242006.1.3,
    na.action = na.exclude, mincut = 400, minsize = 800, mindev = 0.01)
Variables actually used in tree construction:
[1] "odometer" "bus377" "bus381" "baro"
[5] "engine.power" "humidity" "vehicle.speed" "oil.press"
[9] "bus375" "oil.temperture" "acceleration" "bus384"
[13] "bus364" "model.year"
Number of terminal nodes: 31
Residual mean deviance: 0.0005694 = 10.42 / 18300
Distribution of residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.004e-001 -1.347e-002 -2.222e-003 1.386e-016 1.091e-002 2.755e-001
Figure 10-23 and Table 10-10 suggest that the tree analysis of HC emission rates identified
a number of buses that appear to exhibit significantly different emission rates under all load
conditions than the other buses (i.e., some of the bus dummy variables appeared significant in
the initial tree splits). Two bus dummy variables split the data pool at the top levels of the
HC tree model. In addition, the first cut point of "odometer > 282096" in the HC tree model
could be directly replaced by "bus 363 > 0.5", because only bus 363 has an odometer reading
larger than 282096. In effect, three bus dummy variables split the first three levels of the HC
tree model. Although higher emissions were noted for all three pollutants for some of the 15
buses, the division was even more obvious for HC emissions (see Figure 10-9 and Table 10-4),
consistent with the findings in idle and deceleration mode. Although it is tempting to develop
different emission rates for these buses to reduce emission rate deviation in the sample pool,
it is difficult to justify doing so. Unless there is an obvious reason to classify these three
buses as high emitters (i.e., significantly higher than normal emitting vehicles, perhaps by as
much as a few standard deviations from the mean), and unless there are enough data to develop
separate emission rate models for high emitters, one cannot justify removing the data from the
data set. Until data exist to justify treating these buses as high emitters, the bus dummy
variables for individual buses are removed from the analyses and all 15 buses are treated as
part of the whole data set.
Another tree model was generated excluding the bus dummy variables, model year, and
odometer. This new tree model is illustrated in Figure 10-25 and Table 10-11. The tree model is
then trimmed for application purposes, as was done for the NOx and CO models.
Figure 10-24 Reduction in Deviation with the Addition of Nodes of Regression Tree
for Truncated Transformed HC Emission Rate in Acceleration Mode
Figure 10-25 Trimmed Regression Tree Model for Truncated Transformed HC in Acceleration
Mode
Table 10-11 Trimmed Regression Tree Results for Truncated Transformed HC in Acceleration Mode
Regression tree:
snip.tree(tree = tree(formula = HC.25 ~ temperature + baro + humidity +
    vehicle.speed + oil.temperture + oil.press + cool.temperature +
    eng.bar.press + engine.power + acceleration + dummy.grade, data =
    busdata10242006.1.3, na.action = na.exclude, mincut = 400, minsize =
    800, mindev = 0.01), nodes = c(2., 6., 15., 14.))
Variables actually used in tree construction:
[1] "baro" "engine.power"
Number of terminal nodes: 4
Residual mean deviance: 0.001018 = 18.65 / 18330
Distribution of residuals:
        Min.      1st Qu.       Median        Mean     3rd Qu.        Max.
 -9.502e-002  -2.174e-002  -2.213e-003  9.390e-016  1.844e-002  3.100e-001
node), split, n, deviance, yval
      * denotes terminal node
1) root 18330 30.840 0.2099
  2) baro<969.5 1189 1.239 0.1286 *
  3) baro>969.5 17141 21.210 0.2155
    6) engine.power<56.24 850 1.069 0.1682 *
    7) engine.power>56.24 16291 18.140 0.2180
      14) baro<989.5 13717 13.970 0.2134 *
      15) baro>989.5 2574 2.372 0.2423 *
The new tree model suggests that barometric pressure is the most important explanatory
variable for HC emission rates. However, this finding is undermined by the fact that, of the
1189 data points (baro < 969.5) in the first left branch, 1187 belong to bus 363. Although this
dataset was collected under a wide variety of environmental conditions, the range of barometric
pressures was limited for each individual bus tested. As reported earlier, Bus 363 exhibited
significantly lower HC emissions than the other buses (see Figure 10-9); the reason is not
clear at this time. To develop a reasonable tree model given the limited data collected, the
environmental parameters are excluded from the model until a greater distribution of
environmental conditions can be represented in a test data set. With data collected from a more
comprehensive testing program, environmental variables can be integrated into the model
directly, or perhaps correction factors for the emission rates can be developed. The secondary
trimmed tree is presented in Figure 10-26 and Table 10-12.
Figure 10-26 Secondary Trimmed Regression Tree Model for Truncated Transformed HC Emis-
sion Rate in Acceleration Mode
Table 10-12 Secondary Trimmed Regression Tree Results for Truncated Transformed HC Emis-
sion Rate in Acceleration Mode
Regression tree:
snip.tree(tree = tree(formula = HC.25 ~ engine.power + vehicle.speed +
acceleration + oil.temperture + oil.press + cool.temperature +
    eng.bar.press, data = busdata10242006.1.3, na.action = na.exclude,
    mincut = 400, minsize = 800, mindev = 0.1), nodes = c(7., 13., 12., 2.))
Variables actually used in tree construction:
[1] "engine.power" "oil.press" "eng.bar.press"
Number of terminal nodes: 4
Residual mean deviance: 0.00136 = 24.92 / 18330
Distribution of residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.178e-001 -2.378e-002 6.119e-004 -4.275e-017 2.231e-002 3.223e-001
node), split, n, deviance, yval
* denotes terminal node
1) root 18330 30.840 0.2099
2) engine.power<54.555 988 1.779 0.1559 *
3) engine.power>54.555 17342 26.020 0.2130
6) oil.press<427.75 12457 18.610 0.2076
12) eng.bar.press<100.249 4989 9.241 0.1937 *
13) eng.bar.press>100.249 7468 7.763 0.2169 *
7) oil.press>427.75 4885 6.136 0.2266 *
This tree model suggests that engine power is the most important explanatory variable
for HC emissions, consistent with the analysis of NOx and CO emission rates. HTBR results also
suggest that oil pressure and engine barometric pressure may be important predictive variables
for HC emissions under certain conditions. After excluding engine barometric pressure and oil
pressure from the tree model, leaving engine power only, the residual deviance increased
slightly from 24.92 to 27.34. While engine operating parameters such as oil pressure and engine
barometric pressure may impact emissions, such variables are not easy to include in real-world
models. The final HTBR tree for HC emissions is shown in Figure 10-27 and Table 10-13. An
HC acceleration emission rate model will be developed based upon these results.
    5) engine.power>14.825 550 0.8171 0.1717 *
  3) engine.power>54.555 17342 26.0200 0.2130
    6) engine.power<98.385 1177 1.8580 0.2022 *
    7) engine.power>98.385 16165 24.0100 0.2137
10.3.2 OLS Model Development and Refinement
Once a manageable number of modal variables has been identified through regression
tree analysis, the modeling process moves into the phase in which ordinary least squares
techniques are used to obtain a final model. The research objective here is to identify the
extent to which the identified factors influence emission rates in acceleration mode. Modelers
rely on previous research, a priori knowledge, educated guesses, and stepwise regression
procedures to identify acceptable functional forms, to determine important interactions, and to
derive statistically and theoretically defensible models. The final model represents the best
current understanding of the functional relationship between the independent variables and the
dependent variable.
10.3.2.1 NOx Emission Rate Model Development for Acceleration Mode
Based on previous analysis, truncated transformed NOx will serve as the dependent
variable. However, modelers should keep in mind that comparisons of the performance of
statistical models should always be made on the original untransformed scale of Y. HTBR tree
model results suggest that engine power is the best variable to begin with. A linear regression
model with engine power will be developed first, followed by a combined engine power and
vehicle speed model.
10.3.2.1.1 Linear Regression Model with Engine Power
Let's select engine power to begin with, and estimate the model:
$$Y = \beta_0 + \beta_1\,(\text{engine.power}) + \text{Error} \tag{1.1}$$
The regression run yields the results shown in Table 10-14.
Table 10-14 Regression Result for NOx Model 1.1
Call: lm(formula = NOx.50 ~ engine.power, data = busdata10242006.1.3, na.action =
    na.exclude)
Residuals:
     Min       1Q   Median      3Q    Max
 -0.4093 -0.08133 0.005414 0.07084 0.9344
Coefficients:
               Value Std. Error  t value Pr(>|t|)
 (Intercept)  0.3054     0.0021 147.9391   0.0000
engine.power  0.0008     0.0000  83.3557   0.0000
Residual standard error: 0.09781 on 18892 degrees of freedom
Multiple R-Squared: 0.2689
F-statistic: 6948 on 1 and 18892 degrees of freedom, the p-value is 0
Correlation of Coefficients:
             (Intercept)
engine.power     -0.9387
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
                Df Sum of Sq  Mean Sq  F Value Pr(F)
engine.power     1   66.4763 66.47630 6948.175     0
   Residuals 18892  180.7482  0.00957
These results suggest that engine power explains about 27% of the variance in truncated
transformed NOx. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is
statistically significant. To evaluate the model, residual normality is checked by examining
the quantile-quantile (QQ) plot, and constancy of variance is checked by examining the plot of
residuals vs. fitted values.
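The F-statistic and residual standard error quoted for Model 1.1 follow directly from the analysis of variance entries in Table 10-14. The Python sketch below is an illustrative check using the printed sums of squares; the variable names are assumptions, and small differences from the table reflect rounding in the printed values.

```python
import math

# Analysis of variance entries for Model 1.1 (Table 10-14).
SS_REGRESSION = 66.4763   # sum of squares for engine.power, 1 df
SS_RESIDUAL = 180.7482    # residual sum of squares
DF_RESIDUAL = 18892

# Mean squared error and the statistics derived from it.
mean_sq_residual = SS_RESIDUAL / DF_RESIDUAL
f_statistic = (SS_REGRESSION / 1) / mean_sq_residual
residual_std_error = math.sqrt(mean_sq_residual)

print(f"F = {f_statistic:.1f}, residual standard error = {residual_std_error:.5f}")
```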
Panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ
Figure 10-28 QQ and Residual vs. Fitted Plot for NOx Model 1.1
The residual plot in Figure 10-28 shows a slight departure from linear regression
assumptions, indicating a need to explore a curvilinear regression function. Since the
variability at the different X levels appears to be fairly constant, a transformation on X is
considered. A transformation is considered first in order to avoid the multicollinearity that
would be introduced by adding a second-order term in X. Based on the prototype plot in Figure
10-28, the square root transformation and the logarithmic transformation are tested. Scatter
plots and residual plots based on each transformation are then prepared and analyzed to
determine which transformation is most effective.
$$Y = \beta_0 + \beta_1\,(\text{engine.power})^{1/2} + \text{Error} \tag{1.2}$$
$$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \text{Error} \tag{1.3}$$
The result for Model 1.2 is shown in Table 10-15 and Figure 10-29, while the result
for Model 1.3 is shown in Table 10-16 and Figure 10-30.
Table 10-15 Regression Result for NOx Model 1.2
Call: lm(formula = NOx.50 ~ engine.power^(1/2), data = busdata10242006.1.3,
    na.action = na.exclude)
Residuals:
     Min       1Q   Median      3Q    Max
 -0.4106 -0.07981 0.004093 0.06858 0.9248
Coefficients:
                        Value Std. Error t value Pr(>|t|)
          (Intercept)  0.1912     0.0030 63.2141   0.0000
I(engine.power^(1/2))  0.0196     0.0002 93.5953   0.0000
Residual standard error: 0.09455 on 18892 degrees of freedom
Multiple R-Squared: 0.3168
F-statistic: 8760 on 1 and 18892 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                      (Intercept)
I(engine.power^(1/2))     -0.9738
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
                         Df Sum of Sq  Mean Sq  F Value Pr(F)
I(engine.power^(1/2))     1   78.3199 78.31986 8760.082     0
            Residuals 18892  168.9047  0.00894
Panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ
Figure 10-29 QQ and Residual vs. Fitted Plot for NOx Model 1.2
Table 10-16 Regression Result for NOx Model 1.3
*** Linear Model ***
Call: lm(formula = NOx.50 ~ log10(engine.power + 1), data = busdata10242006.1.3,
    na.action = na.exclude)
Residuals:
     Min       1Q   Median      3Q    Max
 -0.4109 -0.07485 0.001841 0.06716 0.9119
Coefficients:
                           Value Std. Error t value Pr(>|t|)
            (Intercept)  -0.0514     0.0052 -9.7873   0.0000
log10(engine.power + 1)   0.2291     0.0023 99.6000   0.0000
Residual standard error: 0.09263 on 18892 degrees of freedom
Multiple R-Squared: 0.3443
F-statistic: 9920 on 1 and 18892 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                        (Intercept)
log10(engine.power + 1)     -0.9917
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
                           Df Sum of Sq  Mean Sq  F Value Pr(F)
log10(engine.power + 1)     1   85.1206 85.12056 9920.161     0
              Residuals 18892  162.1040  0.00858
Panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ
Figure 10-30 QQ and Residual vs. Fitted Plot for NOx Model 1.3
The results suggest that using square root transformed engine power increases the amount
of variance explained in truncated transformed NOx from about 27% (Model 1.1) to about 32%
(Model 1.2), while using log transformed engine power increases it to about 34% (Model 1.3).
Model 1.3 improves R² more than does Model 1.2. The residuals scatter plot for
Model 1.3 (Figure 10-30) shows a more reasonably linear relationship than that for Model 1.2
(Figure 10-29). Figure 10-30 also shows that Model 1.3 does a better job in improving the
pattern of variance. The QQ plot shows general normality, with exceptions arising in the tails.
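The R² values being compared can be recomputed from the regression and residual sums of squares in Tables 10-14 to 10-16. The Python sketch below does this for the three single-variable models; the dictionary and variable names are illustrative assumptions, while the sums of squares are taken from the tables.

```python
# (regression sum of squares, residual sum of squares) for each model,
# from the analysis of variance tables in Tables 10-14, 10-15, and 10-16.
SUMS_OF_SQUARES = {
    "1.1 engine.power": (66.4763, 180.7482),
    "1.2 engine.power^(1/2)": (78.3199, 168.9047),
    "1.3 log10(engine.power + 1)": (85.1206, 162.1040),
}

# R-squared is the regression share of the total sum of squares.
r_squared = {
    model: ss_reg / (ss_reg + ss_res)
    for model, (ss_reg, ss_res) in SUMS_OF_SQUARES.items()
}

best_model = max(r_squared, key=r_squared.get)

for model, value in r_squared.items():
    print(f"Model {model}: R-squared = {value:.4f}")
```

Because all three models share the same response and the same observations, the total sum of squares is the same in each case (about 247.22), so the ranking by R² is equivalent to the ranking by residual sum of squares.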
10.3.2.1.2 Linear Regression Model with Engine Power and Vehicle Speed
HTBR tree model results also suggest that vehicle speed may be an important predictive
variable for emissions under certain conditions. After developing a linear regression model with
engine power, adding vehicle speed might improve the model's predictive ability. The new model
is proposed as:
$$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \beta_2\,(\text{vehicle.speed}) + \text{Error} \tag{1.4}$$
The result for Model 1.4 is shown in Table 10-17 and Figure 10-31.
Table 10-17 Regression Result for NOx Model 1.4
Call: lm(formula = NOx.50 ~ log10(engine.power + 1) + vehicle.speed, data =
    busdata10242006.1.3, na.action = na.exclude)
Residuals:
     Min       1Q   Median      3Q    Max
 -0.4133 -0.07416 0.004219 0.06303 0.9019
Coefficients:
                           Value Std. Error t value Pr(>|t|)
            (Intercept)  -0.0195     0.0053 -3.6693   0.0002
log10(engine.power + 1)   0.2007     0.0025 79.3288   0.0000
          vehicle.speed   0.0019     0.0001 25.1554   0.0000
Residual standard error: 0.09112 on 18891 degrees of freedom
Multiple R-Squared: 0.3656
F-statistic: 5442 on 2 and 18891 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                        (Intercept) log10(engine.power + 1)
log10(engine.power + 1)     -0.9681
          vehicle.speed      0.2383                 -0.4470
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
                           Df Sum of Sq  Mean Sq  F Value Pr(F)
log10(engine.power + 1)     1   85.1206 85.12056 10251.92     0
          vehicle.speed     1    5.2540  5.25404   632.80     0
              Residuals 18891  156.8499  0.00830
Panels: (a) Residual vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ
Figure 10-31 QQ and Residual vs. Fitted Plot for NOx Model 1.4
The results suggest that using vehicle speed together with transformed engine power
increases the amount of variance explained in truncated transformed NOx from about 34%
(Model 1.3) to about 37% (Model 1.4). The residuals scatter plot for Model 1.4 (Figure 10-31)
shows a more reasonably linear relationship. Figure 10-31 also shows that Model 1.4 does a
better job in improving the pattern of variance. The QQ plot shows general normality, with
deviation at the tails.
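Applying Model 1.4 amounts to evaluating the fitted equation and undoing the square-root transformation. The Python sketch below uses the rounded coefficients printed in Table 10-17; the function names, the clamp at zero, and the back-transformation by squaring are assumptions made for illustration (squaring a predicted mean of NOx^0.5 is not an unbiased estimate of the mean NOx rate).

```python
import math

# Rounded coefficients from Table 10-17 (Model 1.4).
B0 = -0.0195        # intercept
B_LOG_POWER = 0.2007  # coefficient on log10(engine.power + 1)
B_SPEED = 0.0019      # coefficient on vehicle.speed


def sqrt_nox_model_1_4(engine_power_bhp, vehicle_speed_mph):
    """Predicted NOx**0.5 from Model 1.4."""
    return (B0
            + B_LOG_POWER * math.log10(engine_power_bhp + 1.0)
            + B_SPEED * vehicle_speed_mph)


def nox_model_1_4(engine_power_bhp, vehicle_speed_mph):
    """Back-transformed NOx emission rate.

    Negative predictions on the transformed scale are clamped to zero
    before squaring; this clamp is an assumption of the sketch.
    """
    y = max(sqrt_nox_model_1_4(engine_power_bhp, vehicle_speed_mph), 0.0)
    return y * y


# Example: 99 bhp at 20 mph gives -0.0195 + 0.2007 * 2 + 0.0019 * 20.
example_sqrt = sqrt_nox_model_1_4(99.0, 20.0)
example_rate = nox_model_1_4(99.0, 20.0)
```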
10.3.2.1.3 Linear Regression Model with Dummy Variables
Figure 10-19 suggests that the relationship between NOx and engine power may be
somewhat different across the engine power ranges identified in the tree analysis. That is,
there may be higher or lower NOx emissions in different engine power operating ranges. One
dummy variable is created to represent the engine power ranges identified in Figure 10-19 for
use in linear regression analysis, as illustrated below:
Engine power (bhp)    dummy1
< 72.30               1
> 72.30               0
This dummy variable and its interactions with engine power and vehicle speed are then
tested to determine whether they can help improve the model:
$$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \beta_2\,(\text{vehicle.speed}) + \beta_3\,(\text{dummy1}) + \beta_4\,(\text{dummy1}) \log_{10}(\text{engine.power} + 1) + \beta_5\,(\text{dummy1})(\text{vehicle.speed}) + \text{Error} \tag{1.5}$$
The result for Model 1.5 is shown in Table 10-18 and Figure 10-32.
Table 10-18 Regression Result for NOx Model 1.5
Call: lm(formula = NOx.50 ~ log10(engine.power + 1) + vehicle.speed + dummy1 *
    log10(engine.power + 1) + dummy1:vehicle.speed, data = busdata10242006.1.3,
    na.action = na.exclude)
Residuals:
     Min       1Q   Median      3Q    Max
 -0.4124 -0.07157 0.003012 0.06319 0.8924
Coefficients:
                                  Value Std. Error  t value Pr(>|t|)
                   (Intercept)   0.1439     0.0115  12.4979   0.0000
       log10(engine.power + 1)   0.1281     0.0052  24.8261   0.0000
                 vehicle.speed   0.0023     0.0001  28.9191   0.0000
                        dummy1  -0.1492     0.0148 -10.0783   0.0000
dummy1:log10(engine.power + 1)   0.0609     0.0081   7.4995   0.0000
          dummy1:vehicle.speed  -0.0035     0.0003 -10.4883   0.0000
Residual standard error: 0.09022 on 18888 degrees of freedom
Multiple R-Squared: 0.3781
F-statistic: 2297 on 5 and 18888 degrees of freedom, the p-value is 0
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
                                  Df Sum of Sq  Mean Sq  F Value         Pr(F)
       log10(engine.power + 1)     1   85.1206 85.12056 10456.89 0.000000e+000
                 vehicle.speed     1    5.2540  5.25404   645.45 0.000000e+000
                        dummy1     1    1.9017  1.90166   233.62 0.000000e+000
dummy1:log10(engine.power + 1)     1    0.3018  0.30180    37.08 1.158203e-009
          dummy1:vehicle.speed     1    0.8955  0.89546   110.01 0.000000e+000
                     Residuals 18888  153.7510  0.00814
10-37
-------
[Figure 10-32 panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
Figure 10-32 QQ and Residual vs. Fitted Plot for NOx Model 1.5
The results suggest that adding the dummy variable and its interactions with transformed
engine power and vehicle speed only slightly increases the amount of variance explained
in truncated transformed NOx, from about 37% (Model 1.4) to about 38% (Model 1.5).
The residuals scatter plot for Model 1.5 (Figure 10-32) shows a slightly more linear
relationship, and Figure 10-32 also suggests that Model 1.5 may do a slightly better job of
improving the pattern of variance. The QQ plot shows general normality, with exceptions
arising in the tails. However, it is important to note that the model improvement, in terms
of the amount of variance explained by the model, is marginal at best.
10.3.2.1.4 Model Discussions
The performance of alternative models can be evaluated by comparing model predictions
with actual observations of emission rates. Model performance can be evaluated in
terms of precision and accuracy (Neter et al. 1996). The R2 value is an indication of precision.
Usually, higher R2 values imply a higher degree of precision and less unexplained variability in
model predictions than lower R2 values. The slope of the trend line for the observed versus
predicted values is an indication of accuracy. A slope of one indicates an accurate prediction, in
that the model prediction corresponds to the observation. Because the R2 and slope are derived
by comparing model predictions with actual observations of emission rates, these values
differ from those reported for the linear regression models.
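The precision and accuracy measures described above can be sketched as follows. This is a minimal illustration under stated assumptions: the trend line is taken to be an ordinary least squares fit of observed values on predicted values with an intercept, the function name is hypothetical, and the sample values are invented for illustration only.

```python
def trend_line_stats(observed, predicted):
    """Slope and R2 of an OLS trend line of observed on predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    sxx = sum((p - mean_pred) ** 2 for p in predicted)
    syy = sum((o - mean_obs) ** 2 for o in observed)
    sxy = sum((p - mean_pred) * (o - mean_obs) for o, p in zip(observed, predicted))
    slope = sxy / sxx                  # accuracy indicator (ideal value: 1)
    r_squared = sxy ** 2 / (sxx * syy)  # precision indicator
    return slope, r_squared


# Illustrative values only, not data from this study.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.05, 0.10, 0.15, 0.20]
slope, r_squared = trend_line_stats(observed, predicted)
print(slope, r_squared)
```

In this invented example the predictions are perfectly correlated with the observations (R2 of one) but systematically too low, so the slope is two rather than one. This is why both measures are reported.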
The models' predictive ability is also evaluated using the root mean square error (RMSE)
and the mean prediction error (MPE) (Neter et al. 1996). The RMSE is a measure of prediction
error. When comparing two models, the model with a smaller RMSE is a better predictor of
the observed phenomenon. Ideally, mean prediction error is close to zero. RMSE and MPE are
calculated as follows:
Equation (10-1)
Equation (10-1)
where:
RMSE: root mean square error
MPE: mean prediction error
n: number of observations
y_i: observation y
ŷ_i: mean of observation y
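A minimal sketch of the two measures follows. It assumes the conventional definitions, RMSE as the square root of the mean squared difference between observed and predicted values and MPE as the mean of those differences. The sign convention (observed minus predicted) is an assumption, the function names are hypothetical, and the sample values are illustrative only.

```python
import math


def rmse(observed, predicted):
    """Root mean square error: sqrt(mean((observed - predicted)^2))."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mpe(observed, predicted):
    """Mean prediction error: mean(observed - predicted) (assumed sign)."""
    n = len(observed)
    return sum(o - p for o, p in zip(observed, predicted)) / n


# Illustrative values only, not data from this study.
observed = [0.10, 0.20, 0.30]
predicted = [0.12, 0.18, 0.27]
print(rmse(observed, predicted), mpe(observed, predicted))
```

Positive and negative errors cancel in the MPE but not in the RMSE, so a model can have an MPE near zero and still have a large RMSE.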
The previous sections describe the model development process step by step. To test whether
the linear regression with engine power was a beneficial addition to the regression tree
model, the mean ERs at the HTBR end nodes (a single value per node) are compared with the
predictions from the linear regression function with engine power. The results of the
performance evaluation are shown in Table 10-19. The improvement in R2 associated with moving
to a linear function of engine power is large. Hence, the use of the linear regression function
will provide a significant improvement in spatial and temporal model prediction capability.
However, this linear regression function might still be improved. Because the R2 and slope in
Table 10-19 are derived by comparing model predictions with actual observations of emission
rates (untransformed y), these values differ from those obtained from the linear regression
models.
Two transforms of engine power were tested: square root transformation and log trans-
formation. The results of the performance evaluation are shown in Table 10-19. Results suggest
that linear regression function with log transformation performs slightly better.
The addition of vehicle speed was also tested. The results of the performance evaluation
are shown in Table 10-19. Analysis results suggest that a linear regression function for engine
power and vehicle speed also performs slightly better.
Since the regression tree modeling exercise indicated that a number of power cutpoints
may play a role in the emissions process, an additional modeling run was performed. The results
of the performance evaluation are also shown in Table 10-19. Analysis results suggest that a
linear regression function with dummy variables performs slightly better than the model without
the power cutpoints.
Table 10-19 Comparative Performance Evaluation of NOx Emission Rate Models
Model                                         Coefficient of       Slope (β1)  RMSE     MPE
                                              determination (R2)
Mean ERs                                      0.00026              1.000       0.10455  0.00001
Linear Regression (Power)                     0.190                0.838       0.09463  0.00428
Linear Regression (Power^0.5)                 0.215                0.901       0.09321  0.00898
Linear Regression (log(Power))                0.236                1.012       0.09178  0.00872
Linear Regression (log(Power)+Speed)          0.268                1.001       0.08982  0.00837
Linear Regression (log(Power)+Speed+Dummy)    0.280                1.036       0.08912  0.00834
Although the linear regression function with dummy variables works slightly better than
the linear regression function with engine power and vehicle speed, it introduces more
explanatory variables (the dummy variable and its interactions) and increases the complexity
of the regression model. There is only one regression function for Model 1.4, while there are
two regression functions for Model 1.5. There is also no obvious reason why the engine would
perform slightly differently within these power regimes, yielding different regression slopes
and intercepts. The fuel injection systems in these engines may operate slightly differently
under low load (near-idle) and high load conditions, and the fuel injection system may be
controlled by the engine computer. Alternatively, a sufficient number of low power and high
power cruise operations may be incorrectly classified and might be better classified as idle
or acceleration events (perhaps due to GPS speed data errors). In any case, because the model
with dummy variables does not perform appreciably better than the model without them, the
dummy variables are not included in the final model selection at this time. These dummy
variables are, however, worth exploring when additional data from other engine technology
groups become available for analysis. Model 1.4 is selected as the preliminary 'final' model.
The next step in model evaluation is to once again examine the residuals for the improved
model. A principal objective was to verify that the statistical properties of the regression model
conform with a set of properties of least squares estimators. In summary, these properties require
that the error terms be normally distributed, have a mean of zero, and have uniform variance.
Test for Constancy of Error Variance
A plot of the residuals versus the fitted values is useful in identifying any patterns in the
residuals. Figure 10-31(a) shows this plot for the NOx model. Setting aside the variance due to
high emission points and zero load data, there is no obvious pattern in the residuals across the
fitted values.
Test of Normality of Error terms
The first informal test normally used to check the normality of error terms is a
quantile-quantile plot of the residuals. Figure 10-31 plot (c) shows the normal quantile plot for
the NOx model. The second informal test is to compare actual frequencies of the residuals
against expected frequencies under normality. Under normality, about 68 percent of the
residuals are expected to fall within ±√MSE and about 90 percent within ±1.645√MSE. In fact,
72.64% of residuals fall within the first limits, while 93.79% fall within the second limits.
Thus, the actual frequencies here are reasonably consistent with those expected under
normality. The heavy tails at both ends are a cause for concern, but are due to the nature of
the data set. For example, even after the transformation, the response variable is not truly
normally distributed.
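The frequency comparison described above can be sketched as follows. This is a minimal illustration: the function name is hypothetical, the residuals and MSE are invented values rather than study data, and the expected shares are computed from the standard normal distribution.

```python
import math
from statistics import NormalDist


def share_within(residuals, mse, k):
    """Fraction of residuals that fall within +/- k * sqrt(MSE)."""
    limit = k * math.sqrt(mse)
    inside = sum(1 for r in residuals if abs(r) <= limit)
    return inside / len(residuals)


# Expected shares under normality for the two limits used in the text.
std_normal = NormalDist()
expected_one = std_normal.cdf(1.0) - std_normal.cdf(-1.0)        # about 68%
expected_1645 = std_normal.cdf(1.645) - std_normal.cdf(-1.645)  # about 90%

# Illustrative residuals and MSE only, not data from this study.
residuals = [-0.25, -0.12, -0.05, 0.0, 0.04, 0.08, 0.15, 0.30]
mse = 0.01
print(expected_one, share_within(residuals, mse, 1.0))
print(expected_1645, share_within(residuals, mse, 1.645))
```

Observed shares well above the expected values in the first band, as reported for these models, indicate a distribution more peaked than normal with heavier tails.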
Based on the above analysis, the final NOx emission model for cruise mode is:
$$\text{NOx} = \left[-0.0195 + 0.201\,\log_{10}(\text{engine.power}+1) + 0.0019\,\text{vehicle.speed}\right]^2$$
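Applying the final NOx model amounts to evaluating the linear predictor and squaring it to undo the square root transformation of the response. A minimal sketch follows. The function name is hypothetical, engine power is in bhp as in the text, and vehicle speed and the emission rate are assumed to be in the same units as the underlying data set.

```python
import math


def predict_nox(engine_power, vehicle_speed):
    """NOx emission rate from the final model (Model 1.4 coefficients)."""
    root = (-0.0195
            + 0.201 * math.log10(engine_power + 1)
            + 0.0019 * vehicle_speed)
    # The model was fitted to the square root of NOx, so square to back-transform.
    return root ** 2


# Example inputs are illustrative only.
print(predict_nox(engine_power=99.0, vehicle_speed=20.0))
```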
Analysis results support the observation that the final NOx emission model explains
significantly more variability without making the model too complex. Since the data represent
only one engine type, a more complex model may not be transferable to other engines. This
model is specific to the engine classes employed in the transit bus operations. Different
models may need to be developed for other engine classes and duty cycles.
10.3.2.2 CO Emission Rate Model Development for Acceleration Mode
Based on previous analysis, truncated transformed CO will serve as the dependent
variable. However, modelers should keep in mind that comparisons between statistical models
should always be made on the original untransformed scale of Y. HTBR tree model
results suggest that engine power is the best variable to begin with.
10.3.2.2.1 Linear Regression Model with Engine Power
Let's select engine power to begin with, and estimate the model:
$$Y = \beta_0 + \beta_1\,\text{engine.power} + \text{Error} \tag{2.1}$$
The regression run yields the results shown in Table 10-20.
Table 10-20 Regression Result for CO Model 2.1
Call: lm(formula = log.CO ~ engine.power, data = busdata10242006.1.3, na.action =
na.exclude)
Residuals:
Min 1Q Median 3Q Max
-3.151 -0.3515 -0.05231 0.3448 1.453
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) -1.8549 0.0100 -185.2318 0.0000
engine.power 0.0031 0.0000 69.7761 0.0000
Residual standard error: 0.473 on 18862 degrees of freedom
Multiple R-Squared: 0.2052
F-statistic: 4869 on 1 and 18862 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept)
engine.power -0.939
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
engine.power 1 1089.300 1089.300 4868.698 0
Residuals 18862 4220.097 0.224
The results suggest that engine power explains about 21% of the variance in truncated
transformed CO. The F-statistic shows that β1 ≠ 0, so the linear relationship is statistically
significant. To evaluate the model, normality is examined in the QQ plot and constancy of
variance is checked by examining residuals vs. fitted values.
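The R-squared and F values reported in Table 10-20 can be recovered from the sums of squares in its analysis of variance table. The sketch below shows that arithmetic; the function names are hypothetical.

```python
def r_squared(ss_terms, ss_residual):
    """Share of total variation explained by the model terms."""
    explained = sum(ss_terms)
    return explained / (explained + ss_residual)


def f_statistic(ss_terms, df_terms, ss_residual, df_residual):
    """Overall F: mean square of the model terms over the residual mean square."""
    mean_sq_model = sum(ss_terms) / df_terms
    mean_sq_error = ss_residual / df_residual
    return mean_sq_model / mean_sq_error


# Sums of squares and degrees of freedom from Table 10-20 (CO Model 2.1).
r2_model_2_1 = r_squared([1089.300], 4220.097)
f_model_2_1 = f_statistic([1089.300], 1, 4220.097, 18862)
print(round(r2_model_2_1, 4), round(f_model_2_1, 1))
```

The computed values agree with the reported Multiple R-Squared of 0.2052 and F value of about 4869.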
[Figure 10-33 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 10-33 QQ and Residual vs. Fitted Plot for CO Model 2.1
The residual plot in Figure 10-33 shows a slight departure from linear regression
assumptions, indicating a need to explore a curvilinear regression function. Since the
variability at the different X levels appears to be fairly constant, a transformation on X is
considered. A transformation is considered first in order to avoid the multicollinearity that
would be introduced by adding a second-order term in X. Based on the prototype plot in Figure
10-33, the square root transformation and logarithmic transformation were tested. Scatter
plots and residual plots based on each transformation are then prepared and analyzed to
determine which transformation is most effective.
$$Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \text{Error} \tag{2.2}$$
$$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \text{Error} \tag{2.3}$$
The result for Model 2.2 is shown in Table 10-21 and Figure 10-34, while the result for
Model 2.3 is shown in Table 10-22 and Figure 10-35.
Table 10-21 Regression Result for CO Model 2.2
Call: lm(formula = log.CO ~ engine.power^(1/2), data = busdata10242006.1.3,
na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.798 -0.3492 -0.0529 0.3381 1.52
Coefficients:
                         Value  Std. Error    t value  Pr(>|t|)
(Intercept)            -2.3146      0.0149  -155.8023    0.0000
I(engine.power^(1/2))   0.0793      0.0010    77.1161    0.0000
Residual standard error: 0.4626 on 18862 degrees of freedom
Multiple R-Squared: 0.2397
F-statistic: 5947 on 1 and 18862 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                       (Intercept)
I(engine.power^(1/2))       -0.974
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
                          Df  Sum of Sq   Mean Sq   F Value  Pr(F)
I(engine.power^(1/2))      1   1272.706  1272.706  5946.896      0
Residuals              18862   4036.691     0.214
Residuals Normal QQ
Figure 10-34 QQ and Residual vs. Fitted Plot for CO Model 2.2
Table 10-22 Regression Result for CO Model 2.3
Call: lm(formula = log.CO ~ log10(engine.power + 1), data = busdata10242006.1.3,
na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.187 -0.3475 -0.05182 0.3313 2.475
Coefficients:
                           Value  Std. Error    t value  Pr(>|t|)
(Intercept)              -3.2695      0.0261  -125.3639    0.0000
log10(engine.power + 1)   0.9152      0.0114    80.0560    0.0000
Residual standard error: 0.4584 on 18862 degrees of freedom
Multiple R-Squared: 0.2536
F-statistic: 6409 on 1 and 18862 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                         (Intercept)
log10(engine.power + 1)      -0.9918
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
                            Df  Sum of Sq   Mean Sq   F Value  Pr(F)
log10(engine.power + 1)      1   1346.515  1346.515  6408.966      0
Residuals                18862   3962.882     0.210
[Figure 10-35 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 10-35 QQ and Residual vs. Fitted Plot for CO Model 2.3
The results suggest that by using transformed engine power, the model increases the
amount of variance explained in truncated transformed CO from about 21% to about 25%.
Model 2.3 improves the R2 more than Model 2.2 does. The residuals scatter plot for
Model 2.3 (Figure 10-35) shows a more reasonably linear relationship than that for Model 2.2
(Figure 10-34). Figure 10-35 also shows that Model 2.3 does a better job of improving the
pattern of variance. The QQ plot shows general normality, with exceptions arising in the tails.
10.3.2.2.2 Linear Regression Model with Engine Power and Vehicle Speed
HTBR tree model results also suggest that vehicle speed may be an important predictive
variable for emissions under certain conditions. After developing a linear regression model with
engine power, adding vehicle speed might improve the model predictive ability. The new model
is proposed as:
$$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{vehicle.speed} + \text{Error} \tag{2.4}$$
The result for Model 2.4 is shown in Table 10-23 and Figure 10-36.
Table 10-23 Regression Result for CO Model 2.4
Call: lm(formula = log.CO ~ log10(engine.power + 1) + vehicle.speed, data =
busdata10242006.1.3, na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.299 -0.236 -0.02889 0.2281 3.209
Coefficients:
                           Value  Std. Error    t value  Pr(>|t|)
(Intercept)              -3.7472      0.0225  -166.3169    0.0000
log10(engine.power + 1)   1.3412      0.0107   125.1282    0.0000
vehicle.speed            -0.0285      0.0003   -89.0585    0.0000
Residual standard error: 0.3846 on 18861 degrees of freedom
Multiple R-Squared: 0.4746
F-statistic: 8517 on 2 and 18861 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                         (Intercept)  log10(engine.power + 1)
log10(engine.power + 1)      -0.9683
vehicle.speed                 0.2380                  -0.4463
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
                            Df  Sum of Sq   Mean Sq   F Value  Pr(F)
log10(engine.power + 1)      1   1346.515  1346.515  9103.577      0
vehicle.speed                1   1173.140  1173.140  7931.415      0
Residuals                18861   2789.742     0.148
[Figure 10-36 panels: (a) Residual vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
Figure 10-36 QQ and Residual vs. Fitted Plot for CO Model 2.4
The results suggest that by using vehicle speed and transformed engine power, the model
increases the amount of variance explained in truncated transformed CO from about 25% to
about 47%.
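Because the analysis of variance terms in Table 10-23 are added sequentially, the contribution of each term to the explained variance can be read off as a running total. A minimal sketch of that arithmetic follows; the variable names are hypothetical.

```python
from itertools import accumulate

# Sequential sums of squares from Table 10-23 (CO Model 2.4), in order of entry.
sequential_ss = [
    ("log10(engine.power + 1)", 1346.515),
    ("vehicle.speed", 1173.140),
]
residual_ss = 2789.742

total_ss = sum(ss for _, ss in sequential_ss) + residual_ss

# Cumulative R2 after each term enters the model.
cumulative_r2 = [
    running / total_ss for running in accumulate(ss for _, ss in sequential_ss)
]

for (term, _), r2 in zip(sequential_ss, cumulative_r2):
    print(f"{term}: cumulative R2 = {r2:.4f}")
```

The first running total reproduces the R2 of Model 2.3 (about 0.25) and the second reproduces the R2 of Model 2.4 (about 0.47).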
Model 2.4 substantially improves on the R2 achieved by Model 2.3. The residuals scatter
plot for Model 2.4 (Figure 10-36) shows a reasonably linear relationship. Figure 10-36 also
shows that Model 2.4 does a slightly better job of improving the pattern of variance. The QQ
plot shows general normality, with exceptions arising in the tails.
10.3.2.2.3 Linear Regression Model with Dummy Variables
Figure 10-22 suggests that the relationship between CO and engine power may be some-
what different across the engine power ranges identified in the tree analysis. That is, there may
be higher or lower CO emissions in different engine power operating ranges. One dummy vari-
able is created to represent different engine power ranges identified in Figure 10-22 for use in
linear regression analysis as illustrated below:
Engine power (bhp)   Dummy 1
< 82.625             1
> 82.625             0
This dummy variable and the interaction between dummy variable and engine power are
then tested to determine whether the use of the variable and interactions can help improve the
model.
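$$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{vehicle.speed} + \beta_3\,\text{dummy1} + \beta_4\,\text{dummy1}\cdot\log_{10}(\text{engine.power}+1) + \beta_5\,\text{dummy1}\cdot\text{vehicle.speed} + \text{Error} \tag{2.5}$$
The results for Model 2.5 are shown in Table 10-24 and Figure 10-37.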
Table 10-24 Regression Result for CO Model 2.5
Call: lm(formula = log.CO ~ log10(engine.power + 1) + vehicle.speed
+ dummy1 * log10(engine.power + 1) + dummy1 * vehicle.speed,
data = busdata10242006.1.3, na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.383 -0.233 -0.02602 0.2235 2.124
Coefficients:
                                  Value  Std. Error    t value  Pr(>|t|)
(Intercept)                     -4.4320      0.0498   -89.0217    0.0000
log10(engine.power + 1)          1.6746      0.0222    75.4956    0.0000
vehicle.speed                   -0.0333      0.0003  -102.3796    0.0000
dummy1                           1.4402      0.0614    23.4537    0.0000
dummy1:log10(engine.power + 1)  -1.0349      0.0321   -32.2634    0.0000
dummy1:vehicle.speed             0.0414      0.0013    32.8802    0.0000
Residual standard error: 0.3655 on 18858 degrees of freedom
Multiple R-Squared: 0.5255
F-statistic: 4177 on 5 and 18858 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                                (Intercept)  log10(engine.power + 1)
log10(engine.power + 1)             -0.9926
vehicle.speed                        0.3000                  -0.4020
dummy1                              -0.8108                   0.8047
dummy1:log10(engine.power + 1)       0.6864                  -0.6915
dummy1:vehicle.speed                -0.0774                   0.1038
                                vehicle.speed   dummy1
dummy1                                -0.2432
dummy1:log10(engine.power + 1)         0.2780  -0.9559
dummy1:vehicle.speed                  -0.2581   0.0018
                                dummy1:log10(engine.power + 1)
dummy1:vehicle.speed                                   -0.1467
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
                                   Df  Sum of Sq   Mean Sq   F Value  Pr(F)
log10(engine.power + 1)             1   1346.515  1346.515  10079.07      0
vehicle.speed                       1   1173.140  1173.140   8781.31      0
dummy1                              1     23.180    23.180    173.51      0
dummy1:log10(engine.power + 1)      1    102.793   102.793    769.44      0
dummy1:vehicle.speed                1    144.430   144.430   1081.10      0
Residuals                       18858   2519.338     0.134
(a) Residuals vs. Fit
(b) Response vs. Fit
(c) Residuals Normal QQ
Figure 10-37 QQ and Residual vs. Fitted Plot for CO Model 2.5
Adding the dummy variable improves R2 from about 0.47 (Model 2.4) to about 0.53
(Model 2.5). The residuals scatter plot for Model 2.5 (Figure 10-37) shows a slightly more
linear relationship, and Figure 10-37 also suggests that Model 2.5 may improve the pattern of
variance. The QQ plot again shows general normality, with exceptions arising in the tails.
However, it is important to note that the model improvement, in terms of the amount of
variance explained by the model, is not large.
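A model with a dummy variable and its interactions is equivalent to two separate regression functions, one for each power range. The sketch below combines the coefficient estimates reported in Table 10-24 to show the two implied functions; the dictionary keys and function name are hypothetical.

```python
# Coefficient estimates for CO Model 2.5, as reported in Table 10-24.
COEF = {
    "intercept": -4.4320,
    "log_power": 1.6746,
    "speed": -0.0333,
    "dummy1": 1.4402,
    "dummy1:log_power": -1.0349,
    "dummy1:speed": 0.0414,
}


def effective_coefficients(dummy1):
    """Intercept and slopes that apply when dummy1 is 0 or 1."""
    return {
        "intercept": COEF["intercept"] + dummy1 * COEF["dummy1"],
        "log_power": COEF["log_power"] + dummy1 * COEF["dummy1:log_power"],
        "speed": COEF["speed"] + dummy1 * COEF["dummy1:speed"],
    }


below_cutpoint = effective_coefficients(1)  # engine power below 82.625 bhp
above_cutpoint = effective_coefficients(0)  # engine power above 82.625 bhp
print(below_cutpoint)
print(above_cutpoint)
```

Below the cutpoint the implied slope on transformed engine power is much smaller and the slope on vehicle speed changes sign, which is the difference between power regimes that the dummy variable is capturing.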
Three more dummy variables are then created to represent the engine power and
vehicle speed ranges in Figure 10-22, as follows:
Thresholds                                               Dummy21  Dummy22  Dummy23
engine.power < 82.625                                    1        0        0
engine.power [82.625, 152.96] & vehicle.speed < 19.05    0        1        0
engine.power > 152.96 & vehicle.speed < 19.05            0        0        1
engine.power > 82.625 & vehicle.speed > 19.05            0        0        0
These three dummy variables and their interactions with engine power and vehicle speed
are added to improve the model. The model is:
$$\begin{aligned}
Y = {} & \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \beta_2\,\text{vehicle.speed} \\
& + \beta_3\,\text{dummy21} + \beta_4\,\text{dummy21}\cdot\log_{10}(\text{engine.power}+1) + \beta_5\,\text{dummy21}\cdot\text{vehicle.speed} \\
& + \beta_6\,\text{dummy22} + \beta_7\,\text{dummy22}\cdot\log_{10}(\text{engine.power}+1) + \beta_8\,\text{dummy22}\cdot\text{vehicle.speed} \\
& + \beta_9\,\text{dummy23} + \beta_{10}\,\text{dummy23}\cdot\log_{10}(\text{engine.power}+1) + \beta_{11}\,\text{dummy23}\cdot\text{vehicle.speed} + \text{Error}
\end{aligned} \tag{2.6}$$
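The assignment of the three dummy variables from engine power and vehicle speed can be sketched as follows. The function name is hypothetical, and the handling of values exactly at a threshold is an assumption wherever the thresholds above list only strict inequalities (a speed of exactly 19.05 is placed in the baseline group here).

```python
def dummy_set_2(engine_power, vehicle_speed):
    """Return (dummy21, dummy22, dummy23) for the power and speed thresholds."""
    if engine_power < 82.625:
        return (1, 0, 0)
    if vehicle_speed < 19.05:
        if engine_power <= 152.96:
            return (0, 1, 0)  # power in [82.625, 152.96], low speed
        return (0, 0, 1)      # power above 152.96, low speed
    return (0, 0, 0)          # baseline: power of at least 82.625, higher speed


# Example inputs are illustrative only.
for power, speed in [(50.0, 10.0), (100.0, 10.0), (200.0, 10.0), (100.0, 30.0)]:
    print(power, speed, dummy_set_2(power, speed))
```

Exactly one group applies to each observation, so at most one of the three dummy variables is ever equal to one.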
The results for Model 2.6 are shown in Table 10-25 and Figure 10-38.
Table 10-25 Regression Result for CO Model 2.6
*** Linear Model ***
Call: lm(formula = log.CO ~ log10(engine.power + 1) + vehicle.speed + dummy21 *
log10(engine.power + 1) + dummy21 * vehicle.speed + dummy22 * log10(
engine.power + 1) + dummy22 * vehicle.speed + dummy23 * log10(
engine.power + 1) + dummy23 * vehicle.speed, data = busdata10242006.1.3,
na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.562 -0.2086 -0.02372 0.2012 2.124
Coefficients:
                                   Value  Std. Error   t value  Pr(>|t|)
(Intercept)                      -3.5895      0.0945  -37.9720    0.0000
log10(engine.power + 1)           1.1014      0.0389   28.3316    0.0000
vehicle.speed                    -0.0150      0.0007  -21.0912    0.0000
dummy21                           0.5978      0.1007    5.9384    0.0000
dummy22                          -1.4856      0.2216   -6.7035    0.0000
dummy23                          -2.3863      0.1632  -14.6202    0.0000
dummy21:log10(engine.power + 1)  -0.4617      0.0448  -10.3020    0.0000
dummy21:vehicle.speed             0.0231      0.0014   16.8659    0.0000
dummy22:log10(engine.power + 1)   0.8643      0.1048    8.2494    0.0000
dummy22:vehicle.speed            -0.0194      0.0016  -12.1421    0.0000
dummy23:log10(engine.power + 1)   1.3505      0.0701   19.2614    0.0000
dummy23:vehicle.speed            -0.0387      0.0012  -30.9943    0.0000
Residual standard error: 0.3517 on 18852 degrees of freedom
Multiple R-Squared: 0.5609
F-statistic: 2189 on 11 and 18852 degrees of freedom, the p-value is 0
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
                                    Df  Sum of Sq   Mean Sq   F Value          Pr(F)
log10(engine.power + 1)              1   1346.515  1346.515  10887.89  0.000000e+000
vehicle.speed                        1   1173.140  1173.140   9485.98  0.000000e+000
dummy21                              1     23.180    23.180    187.44  0.000000e+000
dummy22                              1     67.463    67.463    545.50  0.000000e+000
dummy23                              1    100.345   100.345    811.39  0.000000e+000
dummy21:log10(engine.power + 1)      1     35.491    35.491    286.98  0.000000e+000
dummy21:vehicle.speed                1     93.450    93.450    755.63  0.000000e+000
dummy22:log10(engine.power + 1)      1      3.681     3.681     29.76  4.942365e-008
dummy22:vehicle.speed                1      3.564     3.564     28.82  8.032376e-008
dummy23:log10(engine.power + 1)      1     12.318    12.318     99.61  0.000000e+000
dummy23:vehicle.speed                1    118.804   118.804    960.65  0.000000e+000
Residuals                        18852   2331.445     0.124
[Figure panels: (a) Residual vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
comparing model predictions and actual observations for emission rates (untransformed y), these
numbers will be different from those obtained from linear regression models.
Table 10-26 Comparative Performance Evaluation of CO Emission Rate Models
Model                                              Coefficient of       Slope  RMSE     MPE
                                                   determination (R2)
Mean ERs                                           0.00003              1.000  0.16032  -0.00002
Linear Regression (Power)                          0.0462               1.180  0.16516  0.05229
Linear Regression (Power^0.5)                      0.0502               1.227  0.16420  0.05006
Linear Regression (log(Power))                     0.0553               1.534  0.16455  0.05120
Linear Regression (log(Power)+Speed)               0.392                2.161  0.14252  0.04211
Linear Regression (log(Power)+Speed+Dummy Set 1)   0.406                1.765  0.13632  0.03689
Linear Regression (log(Power)+Speed+Dummy Set 2)   0.437                1.242  0.12565  0.03003
The improvement in R2 associated with moving to a linear function of engine power
is significant. Hence, the use of the linear regression function will provide a significant
improvement in spatial and temporal model prediction capability. However, this linear
regression function might still be improved.
Results suggest that a linear regression function with log transformation performs slightly
better than the others and that the use of dummy variables can further improve model
performance. Although the linear regression function with dummy variables performs slightly
better than the linear regression function with log transformation, the additional explanatory
variables (the dummy variables and their interactions with engine power) increase the
complexity of the regression model. As discussed in Section 10.3.2.1.4, there is no compelling
reason to include the dummy variables in the model since: 1) the models with dummy variables
are more complex without significantly improving model performance, and 2) there is no
compelling engineering reason at this time to support the difference in model performance
within these specific power regions. Yet, given the explanatory power of the power cutpoint
dummy variables (a 10% increase in explained variance), additional investigation into why
these variables are turning out to be significant is definitely warranted. It may be wise to
include such cutpoints in on-road models for various engine technology groups, and such dummy
variables are worth exploring when additional data from other engine technology groups become
available for analysis.
It can be argued that inclusion of the dummy variables for power is warranted. However,
Model 2.4 is chosen as the preliminary 'final' model based solely upon ease of implementation.
The next step in model evaluation is to once again examine the residuals for the improved model.
A principal objective was to verify that the statistical properties of the regression model conform
to a set of properties of least squares estimators. In summary, these properties require that the
error terms be normally distributed, have a mean of zero, and have uniform variance.
Test for Constancy of Error Variance
A plot of the residuals versus the fitted values is useful in identifying patterns in the
residuals. Figure 10-36 plot (a) shows this plot for CO Model 2.4. Without considering variance
due to high emission points and zero load data, there is no obvious pattern in the residuals across
the fitted values.
Test of Normality of Error Terms
The first informal test normally used to check the normality of error terms is a
quantile-quantile plot of the residuals. Figure 10-36 plot (c) shows the normal quantile plot of
CO Model 2.4. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality. Under normality, about 68 percent of the residuals are
expected to fall within ±√MSE and about 90 percent within ±1.645√MSE. In fact, 87.35%
of residuals fall within the first limits, while 92.19% fall within the second limits.
Thus, the actual frequencies here are reasonably consistent with those expected under normality.
The heavy tails at both ends are a cause for concern, but are due to the nature of the data set. For
example, even after the transformation, the response variable is not truly normally distributed.
Based on the above analysis, the final CO emission model for cruise mode is:
$$\text{CO} = 10^{\left[-3.747 + 1.341\,\log_{10}(\text{engine.power}+1) - 0.0285\,\text{vehicle.speed}\right]}$$
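Applying the final CO model amounts to evaluating the linear predictor and raising ten to that power to undo the base-10 log transformation of the response. A minimal sketch follows. The function name is hypothetical, engine power is in bhp as in the text, and vehicle speed and the emission rate are assumed to be in the same units as the underlying data set.

```python
import math


def predict_co(engine_power, vehicle_speed):
    """CO emission rate from the final model (Model 2.4 coefficients)."""
    exponent = (-3.747
                + 1.341 * math.log10(engine_power + 1)
                - 0.0285 * vehicle_speed)
    # The model was fitted to log10(CO), so exponentiate to back-transform.
    return 10 ** exponent


# Example inputs are illustrative only.
print(predict_co(engine_power=99.0, vehicle_speed=20.0))
```

Because the speed coefficient is negative, the predicted CO rate at a given engine power falls as vehicle speed rises.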
Analysis results support the observation that the final CO emission model (2.4) explains
significantly more variability without making the model too complex. Since the data represent
only one engine type, a more complex model may not be transferable to other engines. This
model is specific to the engine classes employed in the transit bus operations. Different
models may need to be developed for other engine classes and duty cycles.
10.3.2.3 HC Emission Rate Model Development for Acceleration Mode
Based on previous analysis, truncated transformed HC will serve as the dependent
variable. However, modelers should keep in mind that comparisons between statistical models
should always be made on the original untransformed scale of Y. HTBR tree model
results suggest that engine power is the best variable to begin with.
10.3.2.3.1 Linear Regression with Engine Power
Let's select engine power to begin with, and estimate the model:
$$Y = \beta_0 + \beta_1\,\text{engine.power} + \text{Error} \tag{3.1}$$
The regression run yields the results shown in Table 10-27 and Figure 10-39.
Table 10-27 Regression Result for HC Model 3.1
Call: lm(formula = HC.25 ~ engine.power, data = busdata10242006.1.3, na.action =
na.exclude)
Residuals:
Min 1Q Median 3Q Max
-0.1285 -0.02417 -0.00003173 0.02467 0.2904
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 0.1840 0.0009 216.4203 0.0000
engine.power 0.0001 0.0000 32.4947 0.0000
Residual standard error: 0.03989 on 18328 degrees of freedom
Multiple R-Squared: 0.05447
F-statistic: 1056 on 1 and 18328 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept)
engine.power -0.938
Analysis of Variance Table
Response: HC.25
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
engine.power 1 1.67991 1.679912 1055.908 0
Residuals 18328 29.15918 0.001591
[Figure 10-39 panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 10-39 QQ and Residual vs. Fitted Plot for HC Model 3.1
The results suggest that engine power explains about 5% of the variance in truncated
transformed HC. The F-statistic shows that β1 ≠ 0, so the linear relationship is statistically
significant. To evaluate the model, normality is examined in the QQ plot and constancy of
variance is checked by examining residuals vs. fitted values.
The residual plot in Figure 10-39 shows a slight departure from linear regression
assumptions, indicating a need to explore a curvilinear regression function. Since the
variability at the different X levels appears to be fairly constant, a transformation on X is
considered. A transformation is considered first in order to avoid the multicollinearity that
would be introduced by adding a second-order term in X. Based on the prototype plot in Figure
10-39, the square root transformation and logarithmic transformation are tested. Scatter plots
and residual plots based on each transformation are then prepared and analyzed to determine
which transformation is most effective.
$$Y = \beta_0 + \beta_1\,\text{engine.power}^{1/2} + \text{Error} \tag{3.2}$$
$$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power}+1) + \text{Error} \tag{3.3}$$
The result for Model 3.2 is shown in Table 10-28 and Figure 10-40, while the result for
Model 3.3 is shown in Table 10-29 and Figure 10-41.
Table 10-28 Regression Result for HC Model 3.2
Call: lm(formula = HC.25 ~ engine.power^(1/2), data = busdata10242006.1.3, na.action
= na.exclude)
Residuals:
Min 1Q Median 3Q Max
-0.1173 -0.02389 -0.0002473 0.0244 0.2969
Coefficients:
                        Value  Std. Error   t value  Pr(>|t|)
(Intercept)            0.1625      0.0013  127.4341    0.0000
I(engine.power^(1/2))  0.0034      0.0001   38.2005    0.0000
Residual standard error: 0.03948 on 18328 degrees of freedom
Multiple R-Squared: 0.07375
F-statistic: 1459 on 1 and 18328 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                       (Intercept)
I(engine.power^(1/2))      -0.9735
Analysis of Variance Table
Response: HC.25
Terms added sequentially (first to last)
                          Df  Sum of Sq   Mean Sq  F Value  Pr(F)
I(engine.power^(1/2))      1    2.27433  2.274333  1459.28      0
Residuals              18328   28.56475  0.001559
[Figure panels recovered: (a) Scatter Plot; (c) Response vs. Fit]
Table 10-29 Regression Result for HC Model 3.3
Call: lm(formula = HC.25 ~ log10(engine.power + 1), data = busdata10242006.1.3,
na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-0.1186 -0.02345 -0.00007336 0.02386 0.3004
Coefficients:
                          Value  Std. Error  t value  Pr(>|t|)
(Intercept)              0.1136      0.0022  50.8911    0.0000
log10(engine.power + 1)  0.0426      0.0010  43.4726    0.0000
Residual standard error: 0.03906 on 18328 degrees of freedom
Multiple R-Squared: 0.09347
F-statistic: 1890 on 1 and 18328 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                         (Intercept)
log10(engine.power + 1)      -0.9916
Analysis of Variance Table
Response: HC.25
Terms added sequentially (first to last)
                            Df  Sum of Sq   Mean Sq   F Value  Pr(F)
log10(engine.power + 1)      1    2.88268  2.882681  1889.863      0
Residuals                18328   27.95641  0.001525
Figure 10-41 QQ and Residual vs. Fitted Plot for HC Model 3.3
Panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ
-------
The results suggest that by using transformed engine power, the model increases the
amount of variance explained in truncated transformed HC from about 5% to about 9%.
Model 3.3 improves R2 relative to Model 3.2 (0.093 versus 0.074). The residuals scatter plot
for Model 3.3 (Figure 10-41) also shows a more reasonably linear relation than that for Model 3.2
(Figure 10-40). Figure 10-41 also shows that Model 3.3 does a better job of improving the pattern
of variance. The QQ plot shows general normality, with exceptions arising in the tails.
10.3.2.3.2 Linear Regression Model with Dummy Variables
Figure 10-26 suggests that the relationship between HC and engine power may differ
across the engine power ranges. One dummy variable is created to represent different engine
power ranges identified in Figure 10-26 for use in linear regression analysis as illustrated below:
Engine power (bhp) Dummy 1
< 54.555 1
> 54.555 0
This dummy variable and the interaction between dummy variable and engine power are
then tested to determine whether the use of the variable and interaction can help improve the
model.
$Y = \beta_0 + \beta_1 \log_{10}(\mathrm{engine.power} + 1) + \beta_2\,\mathrm{dummy1} + \beta_3\,\mathrm{dummy1} \cdot \log_{10}(\mathrm{engine.power} + 1) + \mathrm{Error}$ (3.4)
The results for Model 3.4 are shown in Table 10-30 and Figure 10-42.
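To make the role of the dummy variable and its interaction term concrete, the following Python sketch evaluates the form of Model 3.4 using the coefficient values reported in Table 10-30. The function names are hypothetical, and the handling of a value exactly equal to 54.555 bhp (treated here as the upper range) is an assumption.

```python
import math

# Coefficients as reported in Table 10-30; the response is HC**0.25.
B0, B1, B2, B3 = 0.1734, 0.0171, -0.0643, 0.0195
POWER_BREAK_BHP = 54.555  # dummy1 = 1 below this engine power, 0 otherwise

def model_3_4_transformed_hc(engine_power_bhp):
    """Predicted HC**0.25 from the form of Model 3.4."""
    dummy1 = 1.0 if engine_power_bhp < POWER_BREAK_BHP else 0.0
    log_power = math.log10(engine_power_bhp + 1.0)
    return B0 + B1 * log_power + B2 * dummy1 + B3 * dummy1 * log_power

def effective_line(dummy1):
    """Intercept and slope (per unit of log10(power + 1)) within one power range."""
    return B0 + B2 * dummy1, B1 + B3 * dummy1

print(effective_line(1.0))  # low power range
print(effective_line(0.0))  # high power range
```

The dummy variable shifts the intercept and the interaction term changes the slope, so the model is equivalent to fitting one line per power range while sharing a single error variance.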
-------
Table 10-30 Regression Result for HC Model 3.4
Call: lm(formula = HC.25 ~ log10(engine.power + 1) + dummy1 * log10(engine.power +
1), data = busdata10242006.1.3, na.action = na.exclude)
Residuals:
     Min       1Q    Median     3Q   Max
 -0.1278 -0.02305 0.0002278 0.0231 0.314
Coefficients:
                                  Value Std. Error  t value Pr(>|t|)
(Intercept)                      0.1734     0.0042  41.4191   0.0000
log10(engine.power + 1)          0.0171     0.0018   9.4715   0.0000
dummy1                          -0.0643     0.0062 -10.3151   0.0000
dummy1:log10(engine.power + 1)   0.0195     0.0039   4.9731   0.0000
Residual standard error: 0.03873 on 18326 degrees of freedom
Multiple R-Squared: 0.1084
F-statistic: 742.8 on 3 and 18326 degrees of freedom, the p-value is 0
Analysis of Variance Table
Response: HC.25
Terms added sequentially (first to last)
                                  Df Sum of Sq  Mean Sq  F Value         Pr(F)
log10(engine.power + 1)            1   2.88268 2.882681 1921.331 0.000000e+000
dummy1                             1   0.42377 0.423774  282.449 0.000000e+000
dummy1:log10(engine.power + 1)     1   0.03711 0.037107   24.732 6.647205e-007
Residuals                      18326  27.49553 0.001500
-------
Figure 10-42 QQ and Residual vs. Fitted Plot for HC Model 3.4
Panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ
The results suggest that by adding the dummy variable and its interaction with transformed engine
power, the model only increases the amount of variance explained in truncated transformed HC
from about 9% to about 11%.
Model 3.4 slightly improves R2 relative to Model 3.3. The residuals scatter plot for Model
3.4 (Figure 10-42) is not appreciably better, nor does Model 3.4 do a better job of improving the
pattern of variance. The QQ plot still shows general normality, with exceptions arising in the tails.
10.3.2.3.3 Model Discussions
The previous sections outline the model development process from regression tree
model, to a simple OLS model, to more complex OLS models. To test whether the linear regres-
sion with power was a beneficial addition to the regression tree model, the mean ERs at HTBR
end nodes (single value) were compared to the predictions from the linear regression function
with engine power. The results of the performance evaluation are shown in Table 10-31. The
improvement in R2 associated with moving toward a linear function of engine power is nearly
imperceptible. Hence, the use of the linear regression function provides almost no
improvement in spatial and temporal model prediction capability. This linear regression
function might still be improved. Since the R2 and slope in Table 10-31 are derived by comparing
model predictions and actual observations for emission rates, these numbers will be different
from those obtained from linear regression models.
Table 10-31 Comparative Performance Evaluation of HC Emission Rate Models
Model                                     Coefficient of       Slope   RMSE        MPE
                                          determination (R2)
Mean ERs                                  0.000090             1.000   0.0019072   0.00000022
Linear Regression (Power)                 0.0166               0.979   0.0019879   0.00061206
Linear Regression (Power^0.5)             0.0214               0.749   0.0019311   0.00040055
Linear Regression (log(Power))            0.0281               0.864   0.0019249   0.00040884
Linear Regression (log(Power) + Dummy)    0.0367               1.060   0.0019151   0.00040366
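The four measures in Table 10-31 compare model predictions with observations. The Python sketch below shows one common way to compute such measures. The exact definitions used in this research are not restated in this section, so the ones here are assumptions: R2 is taken as the squared correlation between observed and predicted values, the slope is from regressing observed on predicted values, and MPE is taken as the mean of (predicted minus observed). The function name and the data are hypothetical.

```python
import math

def evaluate_predictions(observed, predicted):
    """Return R2, slope, RMSE, and MPE under the assumed definitions."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "r2": s_op ** 2 / (s_oo * s_pp),   # squared Pearson correlation
        "slope": s_op / s_pp,              # observed regressed on predicted
        "rmse": math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n),
        "mpe": sum(p - o for o, p in zip(observed, predicted)) / n,
    }

# Hypothetical emission rates (NOT from the report); each prediction is 0.0002 too high.
observed = [0.0010, 0.0014, 0.0021, 0.0030]
predicted = [0.0012, 0.0016, 0.0023, 0.0032]
metrics = evaluate_predictions(observed, predicted)
print(metrics)
```

In this constructed case the predictions track the observations perfectly apart from a constant offset, so R2 and slope are 1 while RMSE and MPE both equal the offset. This illustrates why the table reports bias and error measures alongside R2.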
Results suggest that the linear regression function with log transformation performs
slightly better than the others and that the use of dummy variables can further improve model
performance, but again there is almost no perceptible change in terms of explained variance.
Although the linear regression function with log transformation and dummy variables performs
slightly better than linear regression function with log transformation alone, the revised model
introduces additional explanatory variables (dummy variables and the interaction with engine
power) and increases the complexity of regression model without significantly improving the
model. As discussed in Section 10.3.2.1.4, there is no compelling reason to include the dummy
variables in the model, given that: 1) the second model is more complex without significantly
improving model performance, and 2) there is no compelling engineering reason at this time to
support the difference in model performance within these specific power regions. These dummy
variables are, however, worth exploring when additional data from other engine technology
groups become available for analysis.
Model 3.3 is recommended as the preliminary 'final' model (although one might argue
that using the regression tree results directly would also probably be acceptable). The next step
in model evaluation is to once again examine the residuals for the improved model. A principal
objective was to verify that the statistical properties of the regression model conform to a set of
properties of least squares estimators. In summary, these properties require that the error terms
be normally distributed, have a mean of zero, and have the same variance.
Test for Constancy of Error Variance
A plot of the residuals versus the fitted values is useful in identifying any patterns in
the residuals. Figure 10-41 plot (b) is residuals vs. fit for HC Model 3.3. Without considering
variance due to high emission points and zero load data, it can be seen that there is no obvious
pattern in the residuals across the fitted values.
-------
Test of Normality of Error Terms
The first informal test normally used for normality of error terms is a
quantile-quantile plot of the residuals. Figure 10-41 plot (d) shows the normal quantile plot of
HC Model 3.3. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality. Under normality, we expect 68 percent of the residuals to
fall between ±√MSE and about 90 percent to fall between ±1.645√MSE. Actually, 84.83% of
residuals fall within the first limits, while 93.60% of residuals fall within the second limits. Thus,
the actual frequencies here are reasonably consistent with those expected under normality. The
heavy tails at both ends are a cause for concern, but this is due to the nature of the data set. For
example, even after the transformation, the response variable is not truly normally distributed.
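The second informal test can be sketched as follows. The residual values below are hypothetical and serve only to show the counting step; the MSE value is the residual mean square reported for Model 3.3 in Table 10-29. The function name share_within is an assumption.

```python
import math

def share_within(residuals, mse, multiplier):
    """Fraction of residuals with absolute value <= multiplier * sqrt(MSE)."""
    limit = multiplier * math.sqrt(mse)
    return sum(1 for r in residuals if abs(r) <= limit) / len(residuals)

# Hypothetical residuals (NOT from the report).
residuals = [-0.09, -0.03, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 0.05, 0.12]
mse = 0.001525  # residual mean square for Model 3.3 (Table 10-29)

within_1 = share_within(residuals, mse, 1.0)        # compare with about 68% under normality
within_1645 = share_within(residuals, mse, 1.645)   # compare with about 90% under normality
print(within_1, within_1645)
```

Shares well above the expected 68% within the first limits, combined with a few large residuals, are the pattern produced by a peaked distribution with heavy tails.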
Based on the above analysis, the final HC emission model for acceleration mode is:
$HC = [0.114 + 0.0426 \log_{10}(\mathrm{engine.power} + 1)]^{4}$
10.4 Conclusions and Further Considerations
In this research, acceleration mode is defined as "acceleration >1 mph/s". Data not
considered to be in idle, deceleration or acceleration mode will be deemed to be in cruise mode.
Compared to cruise mode activity, the engine power is more concentrated in higher engine power
ranges (> 200 bhp) for acceleration mode activity.
Inter-bus variability analysis indicated that some of the 15 buses are higher emitters than
others (especially noted for HC emissions). However, none of the buses appears to qualify as a
traditional high-emitter, which would exhibit emission rates of two to three standard deviations
above the mean. Hence, it is difficult to classify any of these 15 buses as high emitters for mod-
eling purposes. At this moment, these 15 buses are treated as a whole for model development.
Modelers should keep in mind that although no true high-emitters are present in the database,
such vehicles may behave significantly differently from the vehicles tested. Hence, data from
high-emitting vehicles should be collected and examined in future studies.
Some high HC emissions events are noted in acceleration mode. After screening engine
speed, engine power, engine oil temperature, engine oil pressure, engine coolant temperature,
ECM pressure, and other parameters, no variables were identified that could be linked to these
high emissions events. These events may represent natural variability in on-road emissions, or
some other variable (such as grade or an engine variable that is not measured) may be linked to
these events.
Engine power is selected as the most important variable for all three pollutants based on
HTBR tree models. This finding is consistent with previous research results which verified the
important role of engine power (Ramamurthy et al. 1998; Clark et al. 2002; Barth et al. 2004).
The HC relationship is significant but fairly weak. Analysis in previous chapters also indicates
that engine power is correlated with not only on-road load parameters such as vehicle speed,
acceleration, and grade, but also potentially with engine operating parameters such as throttle po-
sition and engine oil pressure. On the other hand, engine power in this research is derived from
engine speed, engine torque and percent engine load.
The regression tree models suggest that some other variables, like oil pressure and en-
gine barometric pressure, may also impact the HC emissions. Further analysis demonstrates that
by using engine power alone one might be able to achieve explanatory ability similar to using
engine power and other variables. To develop models that are efficient and easy to implement,
only engine power is used to develop emission models. However, additional investigation into
these variables is warranted as additional detailed data from engine testing become available for
analysis.
Given the relationships noted between engine indicated HP and emission rates, it is
imperative that data be collected to develop solid relationships in engine power demand models
(estimating power demand as a function of speed/acceleration, grade, vehicle characteristics,
surface roughness, inertial losses, etc.) for use in regional inventory development and microscale
impact assessment.
In summary, the modeler recommends the following acceleration emission models:
$NO_x = [-0.0195 + 0.201 \log_{10}(\mathrm{engine.power} + 1) + 0.0019\,\mathrm{vehicle.speed}]^{2}$
$CO = 10^{[-3.747 + 1.341 \log_{10}(\mathrm{engine.power} + 1) - 0.0285\,\mathrm{vehicle.speed}]}$
$HC = [0.114 + 0.0426 \log_{10}(\mathrm{engine.power} + 1)]^{4}$
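For model application, the three recommended acceleration models can be written as simple functions. The Python sketch below is an illustration only. The function names are hypothetical, and the units are assumed to follow the rest of the report: engine power in bhp, vehicle speed in mph, and emission rates in g/s.

```python
import math

def nox_acceleration_g_per_s(engine_power_bhp, vehicle_speed_mph):
    """NOx rate: square of the fitted linear predictor (square-root transformation)."""
    root = (-0.0195 + 0.201 * math.log10(engine_power_bhp + 1.0)
            + 0.0019 * vehicle_speed_mph)
    return root ** 2

def co_acceleration_g_per_s(engine_power_bhp, vehicle_speed_mph):
    """CO rate: 10 raised to the fitted linear predictor (log10 transformation)."""
    exponent = (-3.747 + 1.341 * math.log10(engine_power_bhp + 1.0)
                - 0.0285 * vehicle_speed_mph)
    return 10.0 ** exponent

def hc_acceleration_g_per_s(engine_power_bhp):
    """HC rate: fourth power of the fitted linear predictor (fourth-root transformation)."""
    return (0.114 + 0.0426 * math.log10(engine_power_bhp + 1.0)) ** 4

# Example: 99 bhp at 20 mph, chosen so that log10(power + 1) = 2.
print(nox_acceleration_g_per_s(99.0, 20.0))
print(co_acceleration_g_per_s(99.0, 20.0))
print(hc_acceleration_g_per_s(99.0))
```

Each function simply inverts the transformation applied to the response during model fitting; no correction for retransformation bias is applied in this sketch.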
-------
CHAPTER 11
11. CRUISE MODE DEVELOPMENT
After developing the idle mode definition and emission rate in Chapter 8, the deceleration mode
definition and emission rate in Chapter 9, and the acceleration emission model in Chapter 10, the
next task is to develop the cruise mode emission model.
11.1 Analysis of Cruise Mode Data
After dividing the database into idle mode, deceleration mode, and acceleration mode,
cruise mode data will be all of the remaining data in the database (i.e., data not previously clas-
sified into idle, deceleration, and acceleration). Unlike the idle and deceleration modes, there is
a general relationship between engine power and emission rate for acceleration mode and cruise
mode. The engine power distribution for data collected in the cruise mode is provided in Table
11-1.
Table 11-1 Engine Power Distribution for Cruise Mode
                           Engine Power Distribution (bhp)
             Pollutant    0-50    50-100   100-150   150-200    >=200       All
Number       NOx         15885      8988      7173      3536     3792     39374
             CO          15834      8940      7145      3529     3770     39218
             HC          15481      8600      6830      3394     3715     38020
Percentage   NOx        40.34%    22.83%    18.22%     8.98%    9.63%   100.00%
             CO         40.37%    22.80%    18.22%     9.00%    9.61%   100.00%
             HC         40.72%    22.62%    17.96%     8.93%    9.77%   100.00%
Emission rate histograms for each of the three pollutants for cruise operations are presented
in Figure 11-1. Figure 11-1 shows significant skewness for all three pollutants for cruise
mode. Some high HC emissions events are noted in cruise mode. After screening engine speed,
engine power, engine oil temperature, engine oil pressure, engine coolant temperature, ECM
pressure, and other parameters, no operating parameters appeared to correlate with the high
emissions events.
Figure 11-1 Histograms of Three Pollutants for Cruise Mode
11.1.1 Emission Rate Distribution by Bus in Cruise Mode
Inter-bus response variability for cruise mode operations is illustrated in Figures 11-2 to
11-4 using median and mean of NOx, CO, and HC emission rates. Table 11-2 presents the same
information in tabular form. The difference between median and mean is also an indicator of
skewness.
Figure 11-2 Median and Mean of NOx Emission Rates in Cruise Mode by Bus
Figure 11-3 Median and Mean of CO Emission Rates in Cruise Mode by Bus
Figure 11-4 Median and Mean of HC Emission Rates in Cruise Mode by Bus
Table 11-2 Median and Mean of Three Pollutants in Cruise Mode by Bus
                  NOx                  CO                   HC
Bus ID      Median    Mean       Median    Mean       Median    Mean
Bus 360     0.11666   0.14506    0.01618   0.02891    0.00120   0.00146
Bus 361     0.18479   0.18507    0.01091   0.01389    0.00122   0.00135
Bus 363     0.05924   0.07384    0.00534   0.01341    0.00012   0.00021
Bus 364     0.12779   0.14644    0.01259   0.01875    0.00237   0.00343
Bus 372     0.09092   0.09936    0.01262   0.01704    0.00181   0.00236
Bus 375     0.13714   0.16103    0.01254   0.02383    0.00121   0.00146
Bus 377     0.11139   0.11094    0.01454   0.02559    0.00064   0.00075
Bus 379     0.12570   0.15673    0.01394   0.02298    0.00151   0.00195
Bus 380     0.16713   0.18183    0.01994   0.04532    0.00110   0.00148
Bus 381     0.09227   0.11789    0.01074   0.02505    0.00060   0.00080
Bus 382     0.14987   0.16698    0.01342   0.02544    0.00130   0.00155
Bus 383     0.16355   0.18468    0.00921   0.01949    0.00126   0.00198
Bus 384     0.11597   0.13933    0.00934   0.01903    0.00181   0.00221
Bus 385     0.10244   0.13024    0.01266   0.02066    0.00187   0.00205
Bus 386     0.12254   0.13632    0.01147   0.02197    0.00129   0.00167
-------
Figures 11-2 to 11-4 and Table 11-2 illustrate that NOx emissions are more consistent than
CO and HC emissions. Across the 15 buses, Bus 380 has the largest median and mean for CO
emissions, while Bus 364 has the largest median and mean for HC emissions. The above figures
and table demonstrate that although variability exists across buses, it is difficult to conclude that
there are any true "high emitters" in the database. This conclusion is consistent with the result
for the other three modes. As was also noted in the acceleration mode data, Bus 363 has the
smallest mean and median HC emissions compared to the other 14 buses.
11.1.2 Engine Power Distribution by Bus in Cruise Mode
Engine power distribution in cruise mode by bus is shown in Figure 11-5 and Table 11-3.
Bus 361 has the largest 1st quartile engine power in cruise mode while Bus 377 has the largest
median and 3rd quartile engine power in cruise mode. The maximum power values for each bus
match well with the manufacturer's engine power rating. Although variability for engine power
distribution exists across buses, it is difficult to conclude that such variability is affected by
individual buses, bus routes, or other factors. The relationship between power and emissions appears
consistent across the buses for cruise mode.
Table 11-3 Engine Power Distribution in Cruise Mode by Bus
Bus ID     Number   Min   1st Quartile   Median   3rd Quartile      Max     Mean
Bus 360      1653     0          14.68    71.25         169.03   275.46    97.70
Bus 361      3140     0          70.13   108.12         140.28   296.91   107.16
Bus 363      3286     0          10.46    47.19         112.37   275.55    71.45
Bus 364      2575     0          14.47    64.30         130.62   275.51    85.56
Bus 372      2278     0          30.13    68.23         118.10   275.49    79.77
Bus 375      2890     0          23.19    72.09         142.47   275.54    94.36
Bus 377      1647     0          17.93   118.01         210.27   275.50   121.33
Bus 379      2544     0          43.51   102.68         165.04   275.57   110.84
Bus 380      1242     0          18.85    91.07         187.71   275.56   109.41
Bus 381      2537     0           6.72    49.18         113.81   275.46    70.68
Bus 382      1208     0          32.39    81.02         124.97   275.55    89.42
Bus 383      3062     0          29.42    77.95         141.19   275.53    90.85
Bus 384      3638     0          21.82    61.20         115.75   275.46    72.69
Bus 385      3327     0          11.86    48.80         102.91   275.47    68.20
Bus 386      4539     0          19.24    53.43          94.38   275.30    61.66
Figure 11-5 Histograms of Engine Power in Cruise Mode by Bus
-------
11.2 Model Development and Refinement
11.2.1 HTBR Tree Model Development
The potential explanatory variables included in the emission rate model development effort
include:
Vehicle characteristics: model year, odometer reading, bus ID (14 dummy variables);
Roadway characteristics: dummy variable for road grade;
Onroad load parameters: engine power (bhp), vehicle speed (mph), acceleration (mph/s);
Engine operating parameters: engine oil temperature (deg F), engine oil pressure (kPa),
engine coolant temperature (deg F), barometric pressure reported from ECM (kPa);
Environmental conditions: ambient temperature (deg C), ambient pressure (mbar), ambient
relative humidity (%).
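The 14 bus dummy variables correspond to the 15 test buses with one bus left as the reference category. In the tree formulas in this chapter (for example Table 11-4), dummies appear for Bus 360 through Bus 385 and none appears for Bus 386, so Bus 386 is taken as the reference in the Python sketch below. The function name bus_dummies is hypothetical.

```python
BUS_IDS = [360, 361, 363, 364, 372, 375, 377, 379,
           380, 381, 382, 383, 384, 385, 386]
REFERENCE_BUS = 386  # no dummy variable for this bus appears in the tree formulas

def bus_dummies(bus_id):
    """Return the 14 indicator variables (bus360 ... bus385) for one record."""
    if bus_id not in BUS_IDS:
        raise ValueError("unknown bus ID: %r" % bus_id)
    return {"bus%d" % b: int(b == bus_id) for b in BUS_IDS if b != REFERENCE_BUS}

row = bus_dummies(363)
print(row["bus363"], sum(row.values()))   # indicator for Bus 363 and total of all indicators
print(sum(bus_dummies(386).values()))     # the reference bus has all indicators equal to zero
```

With this coding, a tree split such as "bus363 < 0.5" separates all records of Bus 363 from the records of the other buses.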
The HTBR technique is used first to identify potentially significant explanatory variables, and
this analysis provides the starting point for conceptual model development. The HTBR model
is used to guide the development of an OLS regression model, rather than as a model in its
own right. HTBR can be used as a data reduction tool and for identifying potential interactions
among the variables. OLS regression is then used with the identified variables to estimate a
preliminary "final" model.
Although evidence in the literature suggests that a logarithmic transformation is most
suitable for modeling motor vehicle emissions (Washington 1994; Ramamurthy et al. 1998;
Fomunung 2000; Frey et al. 2002), this transformation needs to be verified through the Box-Cox
procedure. The Box-Cox function in MATLAB™ can automatically identify a transformation
from the family of power transformations on emission data, ranging from -1.0 to 1.0. The
lambdas chosen by the Box-Cox procedure for cruise mode are 0.40619 for NOx, 0.012969 for
CO, and 0.241 for HC. The Box-Cox procedure is only used to provide a guide for selecting a
transformation, so overly precise results are not needed (Neter et al. 1996). It is often reasonable to
use a nearby lambda value that is easier to understand for the power transformation. Although
the lambdas chosen by the Box-Cox procedure are different for acceleration and cruise modes,
the nearby lambda values are the same for these two modes. In summary, the lambda values used
for transformations are 1/2 for NOx, 0 for CO (indicating a log transformation), and 1/4 for HC
for cruise mode. Figures 11-6 to 11-8 present the histogram, boxplot, and probability plots of
truncated emission rates in cruise mode for NOx, CO, and HC, while Figures 11-9 to 11-11 present
the same plots for truncated transformed emission rates for NOx, CO, and HC, where a great
improvement is noted.
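The step from an estimated lambda to a nearby, easier to interpret value, and the resulting transformations, can be sketched in Python as follows. The candidate grid of "convenient" lambdas is an assumption made for illustration, as are the function names. The transformation is written in the simple power form implied by the response names used in this research (NOx.50, log.CO, HC.25), with a base-10 logarithm for lambda = 0.

```python
import math

# Assumed grid of convenient lambda values within the -1.0 to 1.0 search range.
CANDIDATE_LAMBDAS = (-1.0, -0.5, 0.0, 0.25, 0.5, 1.0)

def nearby_lambda(estimate):
    """Pick the candidate lambda closest to the Box-Cox estimate."""
    return min(CANDIDATE_LAMBDAS, key=lambda c: abs(c - estimate))

def transform(rate, lam):
    """Power transformation of an emission rate; lambda = 0 means log10."""
    return math.log10(rate) if lam == 0.0 else rate ** lam

def back_transform(value, lam):
    """Inverse of transform()."""
    return 10.0 ** value if lam == 0.0 else value ** (1.0 / lam)

# Lambdas reported for cruise mode.
for pollutant, estimate in (("NOx", 0.40619), ("CO", 0.012969), ("HC", 0.241)):
    print(pollutant, estimate, "->", nearby_lambda(estimate))
```

Under these assumptions the three reported estimates map to 1/2, 0, and 1/4, which matches the values adopted in the text.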
Figure 11-6 Histogram, Boxplot, and Probability Plot of Truncated NOx Emission Rates in Cruise Mode
Figure 11-7 Histogram, Boxplot, and Probability Plot of Truncated CO Emission Rate in Cruise Mode
Figure 11-8 Histogram, Boxplot, and Probability Plot of Truncated HC Emission Rate in Cruise Mode
Figure 11-9 Histogram, Boxplot, and Probability Plot of Truncated Transformed NOx Emission
Rate in Cruise Mode
Figure 11-10 Histogram, Boxplot, and Probability Plot of Truncated Transformed CO Emission
Rate in Cruise Mode
Figure 11-11 Histogram, Boxplot, and Probability Plot of Truncated Transformed HC Emission
Rate in Cruise Mode
11.2.1.1 NOx HTBR Tree Model Development
Figure 11-12 illustrates the initial tree model used for the truncated transformed NOx
emission rate in cruise mode. Results for the initial model are given in Table 11-4. The tree
grew into a complex model, with a considerable number of branches and 32 terminal nodes. Fig-
ure 11-13 illustrates the amount of deviation explained corresponding to the number of terminal
nodes.
Figure 11-12 Original Untrimmed Regression Tree Model for Truncated Transformed NOx Emission
Rate in Cruise Mode
Figure 11-13 Reduction in Deviation with the Addition of Nodes of Regression Tree for Truncated
Transformed NOx Emission Rate in Cruise Mode
Table 11-4 Original Untrimmed Regression Tree Results for Truncated Transformed NOx Emission
Rate in Cruise Mode
Regression tree:
tree(formula = NOx.50 ~ model.year + odometer + temperature + baro + humidity +
vehicle.speed + oil.temperture + oil.press + cool.temperature + eng.bar.press +
engine.power + acceleration + bus360 + bus361 + bus363 + bus364 + bus372 + bus375 +
bus377 + bus379 + bus380 + bus381 + bus382 + bus383 + bus384 + bus385 + dummy.grade,
data = busdata10242006.1.4, na.action = na.exclude, mincut = 400, minsize = 800,
mindev = 0.01)
Variables actually used in tree construction:
[1] "engine.power" "dummy.grade" "baro" "oil.press"
[5] "humidity" "vehicle.speed" "temperature" "bus372"
[9] "odometer" "model.year"
Number of terminal nodes: 32
Residual mean deviance: 0.005398 = 212.4 / 39340
Distribution of residuals:
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max.
 -4.634e-001  -4.130e-002  -1.265e-003  -1.315e-016   3.646e-002   1.180e+000
For model application purposes, it is desirable to select a final model specification that
balances the model's ability to explain the maximum amount of deviation with a simpler model
that is easy to interpret and apply. Figure 11-13 indicates that reduction in deviation with addition
of nodes after four, although potentially statistically significant, is very small. A simplified tree
model was derived which ends in four terminal nodes as compared to the 32 terminal nodes in
the initial model. The residual deviance only increased from 212.4 to 298.9 and yielded
a much cleaner model than the initial one. Results are shown in Table 11-5 and Figure 11-14.
Based on the above analysis, the NOx cruise model will be developed based on this result.
-------
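node), split, n, deviance, yval
      * denotes terminal node
2) engine.power<52.525 … 160.50 0.1831
  4) engine.power<19.05 … 47.70 0.1252 *
  5) engine.power>19.05 7058 41.36 0.2588 *
3) engine.power>52.525 23094 285.90 0.4438
  6) engine.power<109.555 10186 81.41 0.3791 *
  7) engine.power>109.555 12908 128.40 0.4948 *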
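Read as a prediction rule, the four terminal nodes of the trimmed tree assign one mean value of the transformed response (NOx to the power 1/2) to each engine power range. The Python sketch below encodes the cutpoints and node means listed above. The function names are hypothetical, and squaring the node mean is only a simple back-transformation, not necessarily the mean emission rate within the node.

```python
def nox_tree_transformed(engine_power_bhp):
    """Terminal-node mean of NOx**0.5 from the four-node trimmed tree."""
    if engine_power_bhp < 52.525:
        return 0.1252 if engine_power_bhp < 19.05 else 0.2588
    return 0.3791 if engine_power_bhp < 109.555 else 0.4948

def nox_tree_rate(engine_power_bhp):
    """Simple back-transformation of the node mean to the emission rate scale."""
    return nox_tree_transformed(engine_power_bhp) ** 2

for power in (10.0, 30.0, 80.0, 200.0):
    print(power, nox_tree_transformed(power), nox_tree_rate(power))
```

The step pattern of the node means, which rises with each power range, is what motivates replacing the tree with a continuous function of engine power in the regression step.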
-------
This tree model suggests that engine power is the most important explanatory variable
for NOx emissions. This finding is consistent with previous research results which verified the
important effect of engine power on NOx emissions (Ramamurthy et al. 1998; Clark et al. 2002;
Barth et al. 2004). Analysis in previous chapters also indicates that engine power is correlated not
only with onroad load parameters such as vehicle speed, acceleration, and grade, but also with
engine operating parameters such as throttle position and engine oil pressure. On the other hand,
engine power in this research is derived from engine speed, engine torque and percent engine
load. So engine power can connect onroad modal activity with engine operating conditions to
that extent. This fact strengthens the importance of introducing engine power into the conceptual
model and the need to improve the ability to simulate engine power for regional inventory
development.
11.2.1.2 CO HTBR Tree Model Development
Figure 11-15 illustrates the initial tree model used for truncated transformed CO emis-
sion rate in cruise mode. Results for initial model are given in Table 11-6. The tree grew into
a complex model with a considerable number of branches and 65 terminal nodes. Figure 11-16
illustrates the amount of deviation explained corresponding to the number of terminal nodes.
Figure 11-15 Original Untrimmed Regression Tree Model for Truncated Transformed CO Emission
Rate in Cruise Mode
Figure 11-16 Reduction in Deviation with the Addition of Nodes of Regression Tree for Truncated
Transformed CO Emission Rate in Cruise Mode
Table 11-6 Original Untrimmed Regression Tree Results for Truncated Transformed CO Emission
Rate in Cruise Mode
Regression tree:
tree(formula = log.CO ~ model.year + odometer + temperature + baro + humidity +
vehicle.speed + oil.temperture + oil.press + cool.temperature +
eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
bus364 + bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
bus383 + bus384 + bus385 + dummy.grade, data = busdata10242006.1.4,
na.action = na.exclude, mincut = 400, minsize = 800, mindev = 0.01)
Variables actually used in tree construction:
[1] "engine.power" "oil.press" "baro"
[4] "cool.temperature" "vehicle.speed" "acceleration"
[7] "humidity" "odometer" "dummy.grade"
[10] "temperature" "eng.bar.press" "model.year"
[13] "oil.temperture"
Number of terminal nodes: 65
Residual mean deviance: 0.1089 = 4265 / 39150
Distribution of residuals:
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max.
 -2.335e+000  -1.783e-001  -1.233e-002   1.869e-016   1.691e-001   2.013e+000
For model application purposes, it is desirable to select a final model specification that
balances the model's ability to explain the maximum amount of deviation with a simpler model
that is easy to interpret and apply. Figure 11-16 indicates that reduction in deviation with addition
of nodes after 4, although potentially statistically significant, is very small. A simplified
tree model was derived which ends in 4 terminal nodes as compared to the 65 terminal nodes in
the initial model. The residual deviance only increased from 4265 to 5698 and yielded a
much more efficient model. Results are shown in Table 11-7 and Figure 11-17. The CO cruise
emission rate model will be based upon these results.
Figure 11-17 Trimmed Regression Tree Model for Truncated Transformed CO Emission Rate in
Cruise Mode
Table 11-7 Trimmed Regression Tree Results for Truncated Transformed CO Emission Rate in
Cruise Mode
Regression tree:
snip.tree(tree = tree(formula = log.CO ~ model.year + odometer + temperature +
baro + humidity + vehicle.speed + oil.temperture + oil.press +
cool.temperature + eng.bar.press + engine.power + acceleration +
bus360 + bus361 + bus363 + bus364 + bus372 + bus375 + bus377 + bus379 +
bus380 + bus381 + bus382 + bus383 + bus384 + bus385 + dummy.grade,
data = busdata10242006.1.4, na.action = na.exclude, mincut = 400,
minsize = 800, mindev = 0.01), nodes = c(4., 6., 7., 5.))
Variables actually used in tree construction:
[1] "engine.power"
Number of terminal nodes: 4
Residual mean deviance: 0.1453 = 5698 / 39210
Distribution of residuals:
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max.
 -2.679e+000  -2.065e-001  -7.150e-003  -4.942e-015   2.041e-001   2.452e+000
node), split, n, deviance, yval
      * denotes terminal node
1) root 39218 8170.0 -1.944
  2) engine.power<114.355 27187 4482.0 -2.076
    4) engine.power<15.445 8414 1639.0 -2.321 *
    5) engine.power>15.445 18773 2115.0 -1.967 *
  3) engine.power>114.355 12031 2147.0 -1.646
    6) engine.power<181.235 7220 1146.0 -1.753 *
    7) engine.power>181.235 4811 797.8 -1.487 *
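The summary figures in Table 11-7 can be cross-checked against the node listing. The Python sketch below sums the terminal-node deviances to recover the residual deviance, divides by the degrees of freedom (observations minus terminal nodes) to recover the residual mean deviance, and computes the share of root deviance removed by the four-node tree. The last quantity is derived here for illustration and is not a figure reported in the table; the variable names are hypothetical.

```python
ROOT_DEVIANCE = 8170.0  # deviance at the root node

# Terminal nodes of the trimmed CO tree: node id -> (n, deviance, yval).
TERMINAL_NODES = {
    4: (8414, 1639.0, -2.321),
    5: (18773, 2115.0, -1.967),
    6: (7220, 1146.0, -1.753),
    7: (4811, 797.8, -1.487),
}

residual_deviance = sum(dev for _, dev, _ in TERMINAL_NODES.values())
total_n = sum(n for n, _, _ in TERMINAL_NODES.values())
residual_mean_deviance = residual_deviance / (total_n - len(TERMINAL_NODES))
share_explained = 1.0 - residual_deviance / ROOT_DEVIANCE

print(total_n)                  # should match n at the root node
print(residual_deviance)        # compare with 5698 in the table
print(residual_mean_deviance)   # compare with 0.1453 in the table
print(share_explained)
```

The same bookkeeping applies to any of the trimmed trees in this chapter and gives a quick check that a reconstructed node listing is internally consistent.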
-------
This tree model suggests that engine power is the most important explanatory variable
for CO emissions. This finding is consistent with the finding for NOx emissions. This tree will
be used as a reference for linear regression model development.
11.2.1.3 HC HTBR Tree Model Development
Figure 11-18 illustrates the initial tree model used for truncated transformed HC emission
rate in cruise mode. Results for the initial model are given in Table 11-8. The tree grew into a
complex model with a considerable number of branches and 56 terminal nodes.
Figure 11-18 Original Untrimmed Regression Tree Model for Truncated Transformed HC Emission
Rate in Cruise Mode
Table 11-8 Original Untrimmed Regression Tree Results for Truncated Transformed HC Emission
Rate in Cruise Mode
Regression tree:
tree(formula = HC.25 ~ model.year + odometer + temperature + baro + humidity +
vehicle.speed + oil.temperture + oil.press + cool.temperature +
eng.bar.press + engine.power + acceleration + bus360 + bus361 + bus363 +
bus364 + bus372 + bus375 + bus377 + bus379 + bus380 + bus381 + bus382 +
bus383 + bus384 + bus385 + dummy.grade, data = busdata10242006.1.4,
na.action = na.exclude, mincut = 400, minsize = 800, mindev = 0.01)
Variables actually used in tree construction:
[1] "bus363" "bus364" "engine.power"
[4] "oil.temperture" "odometer" "oil.press"
[7] "humidity" "cool.temperature" "bus381"
[10] "bus377" "baro" "temperature"
[13] "bus372" "vehicle.speed" "dummy.grade"
[16] "bus385"
Number of terminal nodes: 56
Residual mean deviance: 0.0008147 = 30.93 / 37960
Distribution of residuals:
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max.
 -1.862e-001  -1.595e-002  -3.021e-003  -1.297e-018   1.230e-002   2.886e-001
Figure 11-18 and Table 11-8 suggest that the tree analysis of HC emission rates identi-
fied a number of buses that appear to exhibit significantly different emission rates under all load
conditions than the other buses (i.e., some of the bus dummy variables appeared as significant in
the initial tree splits). Two bus dummy variables split the data pool at the first two levels of the
HC tree model. This same result was noted for these buses in the acceleration mode. Although
variability exists for three pollutants across 15 buses, the division was even more obvious for HC
emissions (see Figure 11-4 and Table 11-2). Although it is tempting to develop different emis-
sion rates for these buses to reduce emission rate deviation in the sample pool, it is difficult to
justify doing so. Unless these is an obvious reason to classify these three buses as high emitters
(i.e., significantly higher than normal emitting vehicles, perhaps by as much as a few standard
deviations from the mean), and unless there are enough data to develop separate emission rate
models for high emitters, one cannot justify removing the data from the data set. Until such
data exist to justify treating these buses as high emitters, the bus dummy variables for individual
buses are removed from the analyses and all 15 buses are treated as part of the whole data set.
Another tree model was generated excluding the bus dummy variables. However, the odometer
reading also had to be excluded because it replaced the previous "Bus 363<0.5" tree cutpoint
with "odometer>282096" (i.e., the odometer split isolated the same bus). This new tree
model is illustrated in Figure 11-19 and Table 11-9. The tree model is then trimmed for
application purposes, as was done for the NOx and CO models.
-------
3) baro>968.5 35063 49.420 0.1943
  6) engine.power<12.645 6821 13.850 0.1750 *
  7) engine.power>12.645 28242 32.420 0.1989
    14) oil.temperture<192.1 26727 29.900 0.2005
      28) baro<980.5 11265 9.610 0.1918 *
      29) baro>980.5 15462 18.820 0.2068 *
    15) oil.temperture>192.1 1515 1.244 0.1706 *
The new tree model suggests that barometric pressure is the most important explanatory
variable for HC emission rates. However, this finding is challenged by the fact that all the 2957
data points in the first left hand branch of the tree (barometric pressure < 968.5) belong to Bus
363. Although this dataset was collected under a wide variety of environmental conditions, the
range of barometric pressure was limited for each individual bus tested. As reported earlier, Bus
363 exhibited significantly lower HC emissions than the other buses (see Figure 11-4), but the
reason is not clear at this time. To develop a reasonable tree model given the limited data col-
lected, the environmental parameters are excluded from the model until a greater distribution of
environmental conditions can be represented in a test data set. With data collected from a more
comprehensive testing program, environmental variables can be integrated into the model direct-
ly, or perhaps correction factors for the emission rates can be developed. The secondary trimmed
tree is presented in Figure 11-20 and Table 11-10.
[Figure 11-20: trimmed regression tree diagram; split labels recovered from the image: eng.bar.press, oil.press<345.25]
Table 11-10 Trimmed Regression Tree Results for Truncated Transformed HC in Cruise Mode
Regression tree:
snip.tree(tree = tree(formula = HC.25 ~ engine.power + vehicle.speed +
acceleration + oil.temperture + oil.press + cool.temperature +
eng.bar.press, data = busdata!0242006.1.4, na.action = na.exclude,
mincut = 400, minsize = 800, mindev = 0.01), nodes = c(6., 5., 7.,
4.))
Variables actually used in tree construction:
[1] "eng.bar.press" "oil.press" "engine.power"
Number of terminal nodes: 4
Residual mean deviance: 0.00148 = 56.27 / 38020
Distribution of residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.310e-001 -2.290e-002 -2.164e-003 1.281e-015 1.942e-002 3.220e-001
node), split, n, deviance, yval
* denotes terminal node
1) root 38020 71.970 0.1876
  2) eng.bar.press<99.9348 10827 24.640 0.1656
    4) oil.press<345.25 4965 10.870 0.1400 *
    5) oil.press>345.25 5862 7.754 0.1873 *
  3) eng.bar.press>99.9348 27193 40.010 0.1963
    6) engine.power<13.975 5879 12.660 0.1786 *
    7) engine.power>13.975 21314 24.990 0.2012 *
The tree model excluding bus dummy variables, odometer readings, and environmental
conditions is shown in Figure 11-20 and Table 11-10. This final tree model suggests that engine
power is the most important explanatory variable for HC emissions. This finding is consistent
with analysis of NOx and CO emission rates. Although engine operating parameters such as oil
pressure might impact emissions, such variables are not easy to implement in real-world models.
After excluding engine barometric pressure and oil pressure from the tree model, leaving engine
power only, the residual deviance increased slightly from 56.27 to 65.56 (residual mean
deviance from 0.00148 to 0.001725). The final HTBR tree for HC emissions is shown in
Figure 11-21 and Table 11-11. The HC cruise emission rate model will be developed based upon
these results.
Figure 11-21 Final Regression Tree Model for Truncated Transformed HC and Engine Power in
Cruise Mode
Table 11-11 Final Regression Tree Results for Truncated Transformed HC and Engine Power in
Cruise Mode
Regression tree:
snip.tree(tree = tree(formula = HC.25 ~ engine.power, data =
busdata!0242006.1.4, na.action = na.exclude, mincut = 400, minsize =
800, mindev = 0.01), nodes = c(11., 10., 3.))
Number of terminal nodes: 4
Residual mean deviance: 0.001725 = 65.56 / 38020
Distribution of residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.372e-001 -2.070e-002 -6.875e-004 1.742e-015 2.090e-002 3.309e-001
node), split, n, deviance, yval
* denotes terminal node
1) root 38020 71.970 0.1876
  2) engine.power<15.335 8298 21.630 0.1666
    4) engine.power<0.265 4617 9.741 0.1757 *
    5) engine.power>0.265 3681 11.020 0.1551
      10) engine.power<7.875 1746 3.849 0.1390 *
      11) engine.power>7.875 1935 6.311 0.1697 *
  3) engine.power>15.335 29722 45.660 0.1934 *
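The end-node structure in Table 11-11 amounts to a piecewise-constant lookup on engine power. The following Python sketch illustrates that reading of the table. The function names are hypothetical, and the back-transformation assumes that HC.25 denotes the fourth root of the HC emission rate, which this section does not state explicitly.

```python
# End-node means (yval) of transformed HC from Table 11-11, keyed by the
# upper engine-power cutpoint (bhp) of each terminal node.
HC25_END_NODES = [
    (0.265, 0.1757),   # node 4:  engine.power < 0.265
    (7.875, 0.1390),   # node 10: 0.265 < engine.power < 7.875
    (15.335, 0.1697),  # node 11: 7.875 < engine.power < 15.335
]
HC25_TOP_NODE = 0.1934  # node 3: engine.power > 15.335


def predict_hc25(engine_power):
    """Return the end-node mean of transformed HC for a given engine power."""
    for upper_cut, yval in HC25_END_NODES:
        if engine_power < upper_cut:
            return yval
    return HC25_TOP_NODE


def predict_hc(engine_power):
    """Back-transform, ASSUMING HC.25 is the fourth root of the HC rate."""
    return predict_hc25(engine_power) ** 4
```

Because every observation in a node receives the same predicted value, a tree of this kind cannot respond to changes in engine power within a node, which is one motivation for the regression functions developed in the next section.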
11.2.2 OLS Model Development and Refinement
Once a manageable number of modal variables have been identified through regression
tree analysis, the modeling process moves into the phase in which ordinary least squares tech-
niques are used to obtain a final model. The research objective here is to identify the extent to
which the identified factors influence emission rate in cruise mode. Modelers rely on previous
research, a priori knowledge, educated guesses, and stepwise regression procedures to identify
acceptable functional forms, to determine important interactions, and to derive statistically and
theoretically defensible models. The final model will be our best understanding about the func-
tional relationship between independent variables and dependent variables.
11.2.2.1 NOx Emission Rate Model Development for Cruise Mode
Based on previous analysis, truncated transformed NOx will serve as the dependent
variable. However, modelers should keep in mind that comparisons of statistical model
performance should always be made on the original untransformed scale of Y. HTBR tree model
results suggest that engine power is the best explanatory variable to begin with.
11.2.2.1.1 Linear Regression Model with Engine Power
Let's select engine power to begin with, and estimate the model:
$Y = \beta_0 + \beta_1\,(\text{engine.power}) + \text{Error}$ (1.1)
The regression run yields the results shown in Table 11-12 and Figure 11-22.
Table 11-12 Regression Result for NOx Model 1.1
Call: lm(formula = NOx.50 ~ engine.power, data = busdata!0242006.1.4, na.action =
na.exclude)
Residuals:
Min 1Q Median 3Q Max
-0.5717 -0.06302 0.006377 0.06653 1.259
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 0.1815 0.0007 242.8528 0.0000
engine.power 0.0018 0.0000 274.7573 0.0000
Residual standard error: 0.09765 on 39372 degrees of freedom
Multiple R-Squared: 0.6572
F-statistic: 75490 on 1 and 39372 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept)
engine.power -0.7526
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
engine.power 1 719.8396 719.8396 75491.58 0
Residuals 39372 375.4263 0.0095
The results suggest that engine power explains about 66% of the variance in truncated
transformed NOx. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is
statistically significant. To evaluate the model, residual normality is examined in the QQ plot
and constancy of variance is checked by examining residuals vs. fitted values.
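The summary statistics quoted for Model 1.1 can be recovered from the Analysis of Variance table alone. The short Python sketch below redoes that arithmetic using the sums of squares and degrees of freedom printed in Table 11-12; the variable names are illustrative only.

```python
# Sums of squares and degrees of freedom from the Analysis of Variance
# table in Table 11-12 (NOx Model 1.1).
ss_regression = 719.8396
ss_residual = 375.4263
df_regression = 1
df_residual = 39372

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total            # share of variance explained
mean_sq_residual = ss_residual / df_residual    # MSE
f_statistic = (ss_regression / df_regression) / mean_sq_residual
residual_std_error = mean_sq_residual ** 0.5

print(round(r_squared, 4), round(f_statistic), round(residual_std_error, 5))
```

The computed values agree, to rounding, with the Multiple R-Squared, F Value, and residual standard error reported in Table 11-12.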
[Figure panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 11-22 QQ and Residual vs. Fitted Plot for NOx Model 1.1
The residual plot in Figure 11-22 shows a departure from linear regression assumptions,
indicating a need to explore a curvilinear regression function. Since the variability at the
different X levels appears to be fairly constant, a transformation on X is considered. A
transformation is considered first because adding a second-order term in X would introduce
multicollinearity. Based on the prototype plot in Figure 11-22, the square root transformation
and logarithmic transformation are tested. Scatter plots and residual plots based on each
transformation are then prepared and analyzed to determine which transformation is most effective.
$Y = \beta_0 + \beta_1\,(\text{engine.power})^{1/2} + \text{Error}$ (1.2)
$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \text{Error}$ (1.3)
The result for Model 1.2 is shown in Table 11-13 and Figure 11-23, while the result for
Model 1.3 is shown in Table 11-14 and Figure 11-24.
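The comparison of candidate transformations can be sketched as follows. This is a minimal Python illustration on SYNTHETIC data (not the bus data set): the response is generated from the square root of power plus noise, so the example only demonstrates the mechanics of fitting each functional form and comparing R². The helper simple_ols and all numeric settings are assumptions made for the illustration.

```python
import math
import random


def simple_ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares; return (b0, b1, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, 1.0 - ss_res / syy


# SYNTHETIC data for illustration only: the response is built from the
# square root of power plus Gaussian noise.
rng = random.Random(0)
power = [rng.uniform(0.0, 250.0) for _ in range(2000)]
y = [0.09 + 0.03 * math.sqrt(p) + rng.gauss(0.0, 0.05) for p in power]

candidates = {
    "power": power,                                             # Model 1.1 form
    "sqrt(power)": [math.sqrt(p) for p in power],               # Model 1.2 form
    "log10(power + 1)": [math.log10(p + 1.0) for p in power],   # Model 1.3 form
}
r2_by_form = {name: simple_ols(x, y)[2] for name, x in candidates.items()}
best_form = max(r2_by_form, key=r2_by_form.get)
```

Only the predictor is transformed here, so the three R² values refer to the same response and are directly comparable.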
Table 11-13 Regression Result for NOx Model 1.2
Call: lm(formula = NOx.50 ~ engine.power^(1/2), data = busdata!0242006.1.4,
na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-0.5007 -0.04881 -0.0008896 0.05047 1.22
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 0.0874 0.0008 104.1024 0.0000
I(engine.power^(1/2)) 0.0311 0.0001 342.3056 0.0000
Residual standard error: 0.08364 on 39372 degrees of freedom
Multiple R-Squared: 0.7485
F-statistic: 117200 on 1 and 39372 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept)
I(engine.power^(1/2)) -0.8649
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
I(engine.power^(1/2)) 1 819.8002 819.8002 117173.2 0
Residuals 39372 275.4656 0.0070
[Figure 11-23 (NOx Model 1.2); surviving panel titles: (a) Scatter Plot, (c) Response vs. Fit]
Table 11-14 Regression Result for NOx Model 1.3
Call: lm(formula = NOx.50 ~ log10(engine.power + 1), data = busdata!0242006.1.4,
na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-0.4047 -0.06677 -0.002155 0.06107 1.182
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 0.0306 0.0012 25.5525 0.0000
log10(engine.power + 1) 0.1895 0.0007 279.4403 0.0000
Residual standard error: 0.09656 on 39372 degrees of freedom
Multiple R-Squared: 0.6648
F-statistic: 78090 on 1 and 39372 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept)
log10(engine.power + 1) -0.9135
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
log10(engine.power + 1) 1 728.1347 728.1347 78086.87 0
Residuals 39372 367.1311 0.0093
[Figure 11-24 (NOx Model 1.3); surviving panel titles: (a) Scatter Plot, Response vs. Fit]
The results suggest that using square root transformed engine power increases the amount of
variance explained in truncated transformed NOx from about 66% (Model 1.1) to about 75%
(Model 1.2), whereas using log transformed engine power leaves it at about 66% (Model 1.3).
Model 1.2 improves R² more than does Model 1.3. The residuals scatter plot for
Model 1.2 (Figure 11-23) shows a more reasonably linear relation than Model 1.3 (Figure 11-24).
Figure 11-23 also shows that Model 1.2 does a better job in improving the pattern of variance.
The QQ plot shows approximate normality except in the two tails.
11.2.2.1.2 Linear Regression Model with Dummy Variables
Figure 11-14 suggests that the relationship between NOx and engine power may be
somewhat different across the engine power ranges identified in the tree analysis. That is, there
may be higher or lower NOx emissions in different engine power operating ranges. One dummy
variable is created to represent different engine power ranges identified in Figure 11-14 for use in
linear regression analysis as illustrated below:
Engine power (bhp) Dummy 1
< 52.525 1
> 52.525 0
This dummy variable and the interaction between dummy variable and engine power are
then tested to determine whether the use of the variables and interactions can help improve the
model.
$Y = \beta_0 + \beta_1\,(\text{engine.power})^{1/2} + \beta_2\,\text{dummy1} + \beta_3\,\text{dummy1} \cdot (\text{engine.power})^{1/2} + \text{Error}$ (1.4)
The result for Model 1.4 is shown in Table 11-15 and Figure 11-25.
Table 11-15 Regression Result for NOx Model 1.4
Call: lm(formula = NOx.50 ~ engine.power^(1/2) + dummy1 * engine.power^(1/2), data =
busdata!0242006.1.4, na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-0.4812 -0.04778 0.0001059 0.04843 1.195
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 0.1581 0.0024 65.9078 0.0000
I(engine.power^(1/2)) 0.0254 0.0002 122.2468 0.0000
dummy1 -0.0682 0.0026 -25.9438 0.0000
I(engine.power^(1/2)):dummy1 0.0020 0.0003 6.1264 0.0000
Residual standard error: 0.08224 on 39370 degrees of freedom
Multiple R-Squared: 0.7569
F-statistic: 40850 on 3 and 39370 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept) I(engine.power^(1/2)) dummy1
I(engine.power^(1/2)) -0.9742
dummy1 -0.9123 0.8888
I(engine.power^(1/2)):dummy1 0.6175 -0.6339 -0.8171
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
I(engine.power^(1/2)) 1 819.8002 819.8002 121203.8 0.000000e+000
dummy1 1 8.9202 8.9202 1318.8 0.000000e+000
I(engine.power^(1/2)):dummy1 1 0.2539 0.2539 37.5 9.073785e-010
Residuals 39370 266.2915 0.0068
[Figure panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
Figure 11-25 QQ and Residual vs. Fitted Plot for NOx Model 1.4
The results suggest that by using a dummy variable and its interaction with transformed engine
power, the model increases the amount of variance explained in truncated transformed NOx
from about 75% (Model 1.2) to about 76% (Model 1.4).
Model 1.4 improves R² only slightly over Model 1.2. The residuals scatter plot
for Model 1.4 (Figure 11-25) shows a slightly more linear relation, and Figure 11-25
suggests that Model 1.4 may also do a slightly better job in improving the pattern of variance.
The QQ plot shows general normality, with exceptions arising in the tails. However, it is
important to note that the model improvement, in terms of amount of variance explained by the
model, is marginal at best.
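A model with a dummy variable and its interaction with the predictor is equivalent to two separate regression lines, one on each side of the power cutpoint. The Python sketch below makes that explicit using the coefficient values printed in Table 11-15. In the extracted table the coefficient labels and values are listed separately, so the pairing used here assumes both lists are in the same order; the function names are hypothetical.

```python
import math

# Coefficients from Table 11-15 (NOx Model 1.4). The pairing of labels to
# values ASSUMES the labels and values appear in the same order.
B0 = 0.1581        # (Intercept)
B1 = 0.0254        # sqrt(engine.power)
B2 = -0.0682       # dummy1
B3 = 0.0020        # sqrt(engine.power):dummy1
POWER_CUTPOINT = 52.525  # bhp; dummy1 = 1 below this value


def model_1_4_line(engine_power):
    """Return (intercept, slope) of the line that applies at this power."""
    dummy1 = 1 if engine_power < POWER_CUTPOINT else 0
    return B0 + B2 * dummy1, B1 + B3 * dummy1


def predict_sqrt_nox(engine_power):
    """Predicted square-root-transformed NOx (the NOx.50 response)."""
    intercept, slope = model_1_4_line(engine_power)
    return intercept + slope * math.sqrt(engine_power)
```

Under this pairing, the line below the cutpoint has intercept 0.0899 and slope 0.0274, while the line above it has intercept 0.1581 and slope 0.0254.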
11.2.2.1.3 Model Discussion
Previous sections describe the model development process from one OLS model to another.
To test whether the linear regression with power was a beneficial addition
to the regression tree model, the mean ERs at HTBR end nodes (single value) are compared to
the predictions from the linear regression function with engine power. The results of the
performance evaluation are shown in Table 11-16. The improvement in R² associated with moving
toward a linear function of engine power is tremendous. Hence, the use of the linear regression
function will provide a significant improvement in spatial and temporal model prediction
capability. However, this linear regression function might still be improved. Since the R² and slope
in Table 11-16 are derived by comparing model predictions and actual observations for emission
rates (untransformed Y), these numbers differ from those reported for the linear regression models.
Two transformations of engine power were tested: square root transformation and log
transformation. The results of the performance evaluation are shown in Table 11-16. These results
suggest that the linear regression function with square root transformation performs slightly better.
Given that the regression tree modeling exercise indicated that a number of power cutpoints
may play a role in the emissions process, an additional modeling run was performed. The
results of the performance evaluation are shown in Table 11-16. Analysis results suggest that the
linear regression function with dummy variable performs slightly better than the model without
the power cutpoints.
Table 11-16 Comparative Performance Evaluation of NOx Emission Rate Models
Model                                             Coefficient of determination   Slope   RMSE      MPE
Mean ERs                                          0.00003                        1.000   0.12008   -0.000006
Linear regression (power)                         0.529                          0.814   0.08542   0.01031
Linear regression (power^0.5)                     0.614                          0.975   0.07494   0.00707
Linear regression (log(power))                    0.587                          1.287   0.08043   0.00933
Linear regression (power^0.5) w/dummy variables   0.627                          1.011   0.07372   0.00704
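The four performance measures in Table 11-16 can be computed from paired observed and predicted emission rates on the untransformed scale. The Python sketch below shows one way to do so. The exact definitions are not given in this section, so the ones used here are assumptions and are labeled as such in the code; the sample numbers are made up solely to exercise the function.

```python
import math


def evaluate_predictions(observed, predicted):
    """Return (r_squared, slope, rmse, mpe) for paired emission rates.

    ASSUMED definitions (not spelled out in this section):
      r_squared - squared Pearson correlation of observed and predicted
      slope     - OLS slope from regressing observed on predicted
      rmse      - root mean square of (predicted - observed)
      mpe       - mean of (predicted - observed)
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    s_pp = sum((p - mean_pred) ** 2 for p in predicted)
    s_oo = sum((o - mean_obs) ** 2 for o in observed)
    s_po = sum((p - mean_pred) * (o - mean_obs)
               for p, o in zip(predicted, observed))
    errors = [p - o for p, o in zip(predicted, observed)]
    r_squared = s_po ** 2 / (s_pp * s_oo)
    slope = s_po / s_pp
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mpe = sum(errors) / n
    return r_squared, slope, rmse, mpe


# Made-up numbers purely to exercise the function.
obs = [0.10, 0.20, 0.30, 0.40]
pred = [0.12, 0.18, 0.33, 0.37]
r2, slope, rmse, mpe = evaluate_predictions(obs, pred)
```

With these definitions, predictions that equal the mean of the observations in each group give a slope of exactly 1 and an MPE of 0, which matches the pattern of the Mean ERs row.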
Although the linear regression function with dummy variables performs slightly better
than the linear regression function with square root transformation, more explanatory variables
(the dummy variable and its interaction with engine power) are introduced and the complexity of
the regression model increases. There is only one regression function for Model 1.2, while there
are two regression functions for Model 1.4. There is also no obvious reason why the engine
would perform slightly differently within these power regimes, yielding different regression
slopes and intercepts. The fuel injection systems in these engines may operate slightly differently
under low load (near-idle) and high load conditions. The fuel injection system may be controlled
by the engine computer, or there may be a sufficient number of low power cruise operations
and high power cruise operations that are incorrectly classified and may be better classified
as idle or acceleration events (perhaps due to GPS speed data errors). In any case, because the
model with dummy variables does not perform appreciably better than the model without the
dummy variables, the dummy variables are not included in the final model selection at this time.
These dummy variables are, however, worth exploring when additional data from other engine
technology groups become available for analysis. Model 1.2 is selected as the preliminary 'final'
model.
The next step in model evaluation is to once again examine the residuals for the improved
model. A principal objective was to verify that the statistical properties of the regression model
conform to a set of properties of least squares estimators. In summary, these properties require
that the error terms be normally distributed, have a mean of zero, and have uniform variance.
Test for Constancy of Error Variance
A plot of the residuals versus the fitted values is useful in identifying any patterns in the
residuals. Figure 11-23 plot (b) shows this plot for NOx Model 1.2. Without considering variance
due to high emission points and zero load data, there is no obvious pattern in the residuals
across the fitted values.
Test of Normality of Error terms
The first informal test normally reserved for the test of normality of error terms is a
quantile-quantile plot of the residuals. Figure 11-23 plot (d) shows the normal quantile plot of
NOx Model 1.2. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality. Under normality, we expect 68 percent of the residuals
to fall between ±√MSE and about 90 percent to fall between ±1.645√MSE. Actually, 81.79%
of residuals fall within the first limits, while 94.05% of residuals fall within the second limits.
Thus the actual frequencies here are reasonably consistent with those expected under normality.
The heavy tails at both ends are a cause for concern, but are due to the nature of the data set. For
example, even after the transformation, the response variable is not truly normally distributed.
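The second informal test can be sketched in a few lines of Python. The residuals below are SYNTHETIC normal draws, used only to show that the benchmark shares of about 68 percent and 90 percent are recovered when normality holds; the function name and the choice of two estimated parameters (intercept and one slope) are assumptions made for the illustration.

```python
import math
import random


def share_within(residuals, multiplier, n_params=2):
    """Fraction of residuals within +/- multiplier * sqrt(MSE).

    MSE is the residual sum of squares divided by (n - n_params); two
    parameters corresponds to a single-predictor model with an intercept.
    """
    n = len(residuals)
    mse = sum(r * r for r in residuals) / (n - n_params)
    limit = multiplier * math.sqrt(mse)
    return sum(1 for r in residuals if abs(r) <= limit) / n


# SYNTHETIC normal residuals, used only to show the benchmark shares.
rng = random.Random(1)
residuals = [rng.gauss(0.0, 0.08) for _ in range(20000)]
share_1 = share_within(residuals, 1.0)      # expected near 0.68 under normality
share_2 = share_within(residuals, 1.645)    # expected near 0.90 under normality
```

Shares well above these benchmarks, as reported for the bus data, indicate residuals that are more concentrated near zero than a normal distribution with the same variance, which is what heavy tails produce.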
Based on the above analysis, the final NOx emission rate model selected for cruise mode is:
$\mathrm{NO_x} = \left(0.087 + 0.0311\,(\text{engine.power})^{1/2}\right)^2$
Analysis results support the observation that the final NOx emission model is significantly
better at explaining variability without making the model too complex. Since there is only one
engine type, complexity may not be valid in terms of transferability. This model is specific to the
engine classes employed in the transit bus operations. Different models may need to be devel-
oped for other engine classes and duty cycles.
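The selected cruise-mode NOx model can be written as a short function. This Python sketch simply evaluates the final equation above; the function name is hypothetical, engine power is taken to be in bhp as in the dummy variable table, and the rejection of negative power values is an assumption made here because the square root is undefined for them. The emission rate units are those of the underlying data set and are not restated in this section.

```python
import math


def nox_cruise_rate(engine_power_bhp):
    """NOx emission rate in cruise mode from the selected Model 1.2.

    The model predicts the square root of NOx, so the fitted value is
    squared to return to the original emission-rate scale.
    """
    if engine_power_bhp < 0:
        # ASSUMPTION: negative power is treated as invalid input.
        raise ValueError("engine power must be non-negative")
    sqrt_nox = 0.087 + 0.0311 * math.sqrt(engine_power_bhp)
    return sqrt_nox ** 2
```

For example, at zero engine power the function returns the squared intercept, and the predicted rate rises monotonically with power.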
11.2.2.2 CO Emission Rate Model Development for Cruise Mode
Based on previous analysis, truncated transformed CO will serve as the dependent
variable. However, modelers should keep in mind that the comparisons should always be made
on the original untransformed scale of Y when comparing statistical models. HTBR tree model
results suggest that engine power is the best explanatory variable to begin with.
11.2.2.2.1 Linear Regression Model with Engine Power
Let's select engine power to begin with, and estimate the model:
$Y = \beta_0 + \beta_1\,(\text{engine.power}) + \text{Error}$ (2.1)
The regression run yields the results shown in Table 11-17 and Figure 11-26.
Table 11-17 Regression Result for CO Model 2.1
Call: lm(formula = log.CO ~ engine.power, data = busdata!0242006.1.4, na.action =
na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.779 -0.2088 -0.01417 0.2153 2.376
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) -2.2230 0.0030 -751.4277 0.0000
engine.power 0.0033 0.0000 125.1304 0.0000
Residual standard error: 0.3859 on 39216 degrees of freedom
Multiple R-Squared: 0.2853
F-statistic: 15660 on 1 and 39216 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept)
engine.power -0.7525
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
engine.power 1 2331.251 2331.251 15657.62 0
Residuals 39216 5838.839 0.149
These results suggest that engine power explains about 29% of the variance in truncated
transformed CO. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is
statistically significant. To evaluate the model, residual normality is examined in the QQ plot
and constancy of variance is checked by examining residuals vs. fitted values.
[Figure panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 11-26 QQ and Residual vs. Fitted Plot for CO Model 2.1
Although the residual plot in Figure 11-26 shows a linear relationship between engine
power and truncated transformed CO, square root transformation and logarithmic transformation
are tested to see whether transformation would be useful to improve the model. Scatter plots
and residual plots based on each transformation should then be prepared and analyzed to decide
which transformation is most effective.
$Y = \beta_0 + \beta_1\,(\text{engine.power})^{1/2} + \text{Error}$ (2.2)
$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \text{Error}$ (2.3)
The results for Model 2.2 are shown in Table 11-18 and Figure 11-27, while the results
for Model 2.3 are shown in Table 11-19 and Figure 11-28.
Table 11-18 Regression Result for CO Model 2.2
Call: lm(formula = log.CO ~ engine.power^(1/2), data = busdata!0242006.1.4,
na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.679 -0.2124 -0.01769 0.2178 2.319
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) -2.3645 0.0039 -610.0636 0.0000
I(engine.power^(1/2)) 0.0526 0.0004 125.3638 0.0000
Residual standard error: 0.3857 on 39216 degrees of freedom
Multiple R-Squared: 0.2861
F-statistic: 15720 on 1 and 39216 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept)
I(engine.power^(1/2)) -0.8646
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
I(engine.power^(1/2)) 1 2337.466 2337.466 15716.09 0
Residuals 39216 5832.624 0.149
[Figure panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 11-27 QQ and Residual vs. Fitted Plot for CO Model 2.2
Table 11-19 Regression Result for CO Model 2.3
Call: lm(formula = log.CO ~ log10(engine.power + 1), data = busdata!0242006.1.4,
na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.636 -0.2225 -0.0167 0.2193 2.308
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) -2.4326 0.0050 -489.4690 0.0000
log10(engine.power + 1) 0.3031 0.0028 107.5567 0.0000
Residual standard error: 0.4011 on 39216 degrees of freedom
Multiple R-Squared: 0.2278
F-statistic: 11570 on 1 and 39216 degrees of freedom, the p-value is 0
Correlation of Coefficients:
(Intercept)
log10(engine.power + 1) -0.9132
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
log10(engine.power + 1) 1 1861.106 1861.106 11568.45 0
Residuals 39216 6308.983 0.161
[Figure panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 11-28 QQ and Residual vs. Fitted Plot for CO Model 2.3
The results suggest that using transformed engine power does not improve the model: the
amount of variance explained in truncated transformed CO remains at about 29% with the square
root transformation (Model 2.2) and decreases to about 23% with the log transformation (Model 2.3).
Of the two transformations, Model 2.2 yields the higher R². The residuals scatter plot for
Model 2.2 (Figure 11-27) shows a more reasonably linear relationship than that for Model 2.3
(Figure 11-28). Figure 11-27 also shows that Model 2.2 does a better job of improving the
pattern of variance than Model 2.3. The QQ plot shows approximate normality except in the
two tails. Model 2.1 and Model 2.2 are both acceptable at this point.
11.2.2.2.2 Linear Regression Model with Dummy Variables
Figure 11-17 suggests that the relationship between CO and engine power may be some-
what different across the engine power ranges identified in the tree analysis. That is, there may
be higher or lower CO emissions in different engine power operating ranges. One dummy vari-
able is created to represent different engine power ranges identified in Figure 11-17 for use in
linear regression analysis as illustrated below:
Engine power (bhp) Dummy 1
<114.355 1
>114.355 0
This dummy variable and the interaction between dummy variable and engine power are
then tested to determine whether the use of the variable and interactions can help improve the
model.
$Y = \beta_0 + \beta_1\,(\text{engine.power})^{1/2} + \beta_2\,\text{dummy1} + \beta_3\,\text{dummy1} \cdot (\text{engine.power})^{1/2} + \text{Error}$ (2.4)
The regression yields the results shown in Table 11-20 and Figure 11-29.
Table 11-20 Regression Result for CO Model 2.4
*** Linear Model ***
Call: lm(formula = log.CO ~ engine.power^(1/2) + dummy1 * engine.power^(1/2), data =
busdata!0242006.1.4, na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-2.714 -0.2081 -0.01473 0.2136 2.37
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) -2.6690 0.0250 -106.5896 0.0000
I(engine.power^(1/2)) 0.0772 0.0019 41.2399 0.0000
dummy1 0.3472 0.0254 13.6516 0.0000
I(engine.power^(1/2)):dummy1 -0.0338 0.0020 -17.0016 0.0000
Residual standard error: 0.3836 on 39214 degrees of freedom
Multiple R-Squared: 0.2936
F-statistic: 5432 on 3 and 39214 degrees of freedom, the p-value is 0
Analysis of Variance Table
Response: log.CO
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
I(engine.power^(1/2)) 1 2337.466 2337.466 15881.03 0
dummy1 1 18.325 18.325 124.50 0
I(engine.power^(1/2)):dummy1 1 42.545 42.545 289.05 0
Residuals 39214 5771.754 0.147
[Figure panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
Figure 11-29 QQ and Residual vs. Fitted Plot for CO Model 2.4
Model 2.4 improves R² only marginally over Model 2.2; the amount of variance explained in
truncated transformed CO remains at about 29%, the same as for Model 2.1 and Model 2.2.
The residuals scatter plot for Model 2.4 (Figure 11-29)
shows a reasonably linear relationship. Figure 11-29 also shows that Model 2.4 does a good job
of improving the pattern of variance. The QQ plot shows general normality, with exceptions
arising in the tails. These three models (Model 2.1, Model 2.2, and Model 2.4) are all acceptable.
11.2.2.2.3 Model Discussion
The previous sections outline the model development process from a regression tree
model, to a simple OLS model, to more complex OLS models. The results of
each step in the model improvement process are presented in Table 11-21. The mean emission
rates at HTBR end nodes (single value) are compared to the results of various linear regression
functions with engine power. Since the R² and slope in Table 11-21 are derived by comparing
model predictions and actual observations for emission rates (untransformed Y), these numbers
are different from those encountered in linear regression models.
Table 11-21 Comparative Performance Evaluation of CO Emission Rate Models
Model                                             Coefficient of determination   Slope   RMSE       MPE
Mean ERs                                          0.000005                       1.000   0.047559   0.0000002
Linear regression (power)                         0.0880                         1.422   0.04622    0.00749
Linear regression (power^0.5)                     0.0899                         1.984   0.04662    0.00804
Linear regression (log(power))                    0.0659                         2.560   0.04736    0.00866
Linear regression (power^0.5) w/dummy variables   0.0915                         1.657   0.04634    0.00777
The improvement in R² associated with moving toward a linear function of engine power
is significant. Hence, the use of the linear regression function will provide a significant
improvement in spatial and temporal model prediction capability. However, this linear regression
function might still be improved.
Results suggest that a linear regression function with square root transformation performs
slightly better than the others and that the use of dummy variables can further improve model
performance. However, given the marginal improvement in R², one could argue that use of
untransformed engine power may be just as reasonable considering the slope, RMSE, and MPE.
Although the linear regression function with dummy variables performs slightly better than other
linear regression models, more explanatory variables (the dummy variable and its interaction
with engine power) are introduced and the complexity of the regression model increases. As
discussed in Section 11.2.2.1, there is no compelling reason to include the dummy variables in
the model, given that: 1) the second model is more complex without significantly improving model
performance, and 2) there is no compelling engineering reason at this time to support the
difference in model performance within these specific power regions. These dummy variables are,
however, worth exploring when additional data from other engine technology groups become
available for analysis.
Considering all four parameters together, Model 2.1 is recommended as the preliminary
'final' model. The next step in model evaluation is to once again examine the residuals for the
improved model. A principal objective was to verify that the statistical properties of the regres-
sion model conform to a set of properties of least squares estimators. In summary, these proper-
ties require that the error terms be normally distributed, have a mean of zero, and have uniform
variance.
Test for Constancy of Error Variance
A plot of the residuals versus the fitted values is useful in identifying patterns in the
residuals. Figure 11-26 plot (b) shows this plot for CO Model 2.1. Without considering variance
due to high emission points and zero load data, there is no obvious pattern in the residuals across
the fitted values.
Test of Normality of Error Terms
The first informal test normally reserved for the test of normality of error terms is a
quantile-quantile plot of the residuals. Figure 11-26 plot (d) shows the normal quantile plot of
CO Model 2.1. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality. Under normality, we expect 68 percent of the residuals
to fall between ±√MSE and about 90 percent to fall between ±1.645√MSE. Actually, 95.20%
of residuals fall within the first limits, while 96.97% of residuals fall within the second limits.
Thus the actual frequencies here are reasonably consistent with those expected under normality.
The heavy tails at both ends are a cause for concern, but these tails are due to the nature of the
data set. For example, even after the transformation, the response variable is not truly normally
distributed.
Based on the above analysis, the final CO emission rate model for the cruise mode is:
$\mathrm{CO} = 10^{(-2.223 + 0.0033\,\text{engine.power})}$
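Because the CO model is fitted to a logarithmic response, predictions must be exponentiated to return to the emission-rate scale. The Python sketch below evaluates the final equation above; the function name is hypothetical, and engine power is taken to be in bhp as in the dummy variable table.

```python
def co_cruise_rate(engine_power_bhp):
    """CO emission rate in cruise mode from the selected Model 2.1.

    The fitted response is log.CO, taken here to be the base-10 logarithm
    (as the final equation implies), so the fit is exponentiated.
    """
    log10_co = -2.223 + 0.0033 * engine_power_bhp
    return 10.0 ** log10_co
```

One consequence of the log-linear form is that the effect of power is multiplicative: each additional 100 bhp multiplies the predicted CO rate by 10^0.33, or roughly 2.1.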
11.2.2.3 HC Emission Rate Model Development for Cruise Mode
Based on previous analysis, truncated transformed HC will serve as the dependent
variable. However, modelers should keep in mind that the comparisons should always be made
on the original untransformed scale of Y when comparing statistical models. Previous analysis
results suggest that engine power is the best explanatory variable to begin with.
11.2.2.3.1 Linear Regression Model with Engine Power
Let's select engine power to begin with, and estimate the model:
$Y = \beta_0 + \beta_1 \cdot \text{engine.power} + \text{Error}$ (3.1)
The regression run shows the results in Table 11-22 and Figure 11-30.
Table 11-22 Regression Result for HC Model 3.1
Call: lm(formula = HC.25 ~ engine.power, data = busdata10242006.1.4, na.action =
na.exclude)
Residuals:
    Min       1Q      Median      3Q       Max
 -0.123  -0.0212  0.00002295  0.02228  0.3279
Coefficients:
               Value   Std. Error  t value   Pr(>|t|)
(Intercept)    0.1769  0.0003      537.0480  0.0000
engine.power   0.0001  0.0000       43.0656  0.0000
Residual standard error: 0.04248 on 38018 degrees of freedom
Multiple R-Squared: 0.04651
F-statistic: 1855 on 1 and 38018 degrees of freedom, the p-value is 0
Correlation of Coefficients:
               (Intercept)
engine.power   -0.7501
Analysis of Variance Table
Response: HC.25
Terms added sequentially (first to last)
               Df     Sum of Sq  Mean Sq   F Value   Pr(F)
engine.power       1   3.34748   3.347484  1854.647  0
Residuals      38018  68.61934   0.001805
The results suggest that engine power explains about 5% of the variance in truncated
transformed HC. The F-statistic shows that $\beta_1 \neq 0$, and the linear relationship is statistically
significant. To evaluate the model, normality is examined in the QQ plot and constancy of variance
is checked by examining residuals vs. fitted values.
-------
[Figure panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 11-30 QQ and Residual vs. Fitted Plot for HC Model 3.1
The residual plot in Figure 11-30 shows a slight departure from linear regression assump-
tions indicating a need to explore a curvilinear regression function. Since the variability at the
different X levels appears to be fairly constant, a transformation on X is considered. A
transformation is considered first to avoid the multicollinearity introduced by adding a second-order
term in X. Based on the prototype plot in Figure 11-30, the square root transformation and logarithmic
transformation are tested. Scatter plots and residual plots based on each transformation
should then be prepared and analyzed to determine which transformation is most effective.
$Y = \beta_0 + \beta_1 \cdot \text{engine.power}^{1/2} + \text{Error}$ (3.2)
$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \text{Error}$ (3.3)
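The comparison of candidate transformations on X can be sketched as follows. This is a minimal illustration on simulated data, not the bus data set: the helper `simple_ols`, the simulated `power` values, and the noise level are assumptions introduced here, and the response is deliberately generated from a square-root relationship so that the expected ranking is known.

```python
import math
import random


def simple_ols(x, y):
    """Fit y = b0 + b1*x by least squares and return (b0, b1, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1, (sxy * sxy) / (sxx * syy)


# Hypothetical data: transformed response built from sqrt(engine power) plus noise.
random.seed(1)
power = [random.uniform(0.0, 250.0) for _ in range(5000)]
y = [0.17 + 0.0022 * math.sqrt(p) + random.gauss(0.0, 0.005) for p in power]

# Candidate predictors mirroring Models 3.1, 3.2, and 3.3.
candidates = {
    "linear": power,
    "sqrt": [math.sqrt(p) for p in power],
    "log10": [math.log10(p + 1.0) for p in power],
}
fits = {name: simple_ols(x, y) for name, x in candidates.items()}
best = max(fits, key=lambda name: fits[name][2])
for name, (b0, b1, r2) in fits.items():
    print(f"{name:>6}: b0={b0:.4f} b1={b1:.5f} R2={r2:.4f}")
print("highest R2:", best)
```

Because all three fits share the same transformed response, their R-squared values are directly comparable here; comparisons against models with a different response transformation should be made on the untransformed scale, as the text notes.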
The results for Model 3.2 are shown in Table 11-23 and Figure 11-31, while the results
for Model 3.3 are shown in Table 11-24 and Figure 11-32.
-------
Table 11-23 Regression Result for HC Model 3.2
Call: lm(formula = HC.25 ~ engine.power^(1/2), data = busdata10242006.1.4, na.action
= na.exclude)
Residuals:
     Min        1Q      Median       3Q      Max
 -0.1233  -0.02113  -0.0002419  0.02195  0.3266
Coefficients:
                        Value   Std. Error  t value   Pr(>|t|)
(Intercept)             0.1700  0.0004      396.7451  0.0000
I(engine.power^(1/2))   0.0022  0.0000       47.6385  0.0000
Residual standard error: 0.04227 on 38018 degrees of freedom
Multiple R-Squared: 0.05633
F-statistic: 2269 on 1 and 38018 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                        (Intercept)
I(engine.power^(1/2))   -0.8625
Analysis of Variance Table
Response: HC.25
Terms added sequentially (first to last)
                        Df     Sum of Sq  Mean Sq   F Value   Pr(F)
I(engine.power^(1/2))       1   4.05395   4.053948  2269.422  0
Residuals               38018  67.91288   0.001786
(a) Scatter Plot
-------
Table 11-24 Regression Result for HC Model 3.3
Call: lm(formula = HC.25 ~ log10(engine.power + 1), data = busdata10242006.1.4,
na.action = na.exclude)
Residuals:
    Min        1Q      Median       3Q      Max
 -0.127  -0.02073  -0.0003198  0.02203  0.3226
Coefficients:
                          Value   Std. Error  t value   Pr(>|t|)
(Intercept)               0.1653  0.0005      313.2136  0.0000
log10(engine.power + 1)   0.0139  0.0003       46.4046  0.0000
Residual standard error: 0.04233 on 38018 degrees of freedom
Multiple R-Squared: 0.05361
F-statistic: 2153 on 1 and 38018 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                          (Intercept)
log10(engine.power + 1)   -0.9114
Analysis of Variance Table
Response: HC.25
Terms added sequentially (first to last)
                          Df     Sum of Sq  Mean Sq   F Value  Pr(F)
log10(engine.power + 1)       1   3.85779   3.857786  2153.39  0
Residuals                 38018  68.10904   0.001791
[Figure panels: (a) Scatter Plot; (b) Residual vs. Fit; (c) Response vs. Fit; (d) Residuals Normal QQ]
Figure 11-32 QQ and Residual vs. Fitted Plot for HC Model 3.3
-------
The results suggest that by using transformed engine power, the amount of variance
explained in truncated transformed HC remains at about 5% (Model 3.2 and Model 3.3). The
improvement is very small.
Model 3.2 yields a slightly higher $R^2$ than Model 3.3 (0.0563 vs. 0.0536). The scatter plot for Model 3.2 (Figure
11-31) also shows a better linear relationship than Model 3.3 (Figure 11-32). Figure 11-31 also
shows that Model 3.2 does a good job of improving the pattern of variance. The QQ plot shows
general normality with the exceptions arising in the tails.
11.2.2.3.2 Linear Regression Model with Dummy Variables
Figure 11-21 suggests that the relationship between HC and engine power may differ
across the engine power ranges. One dummy variable is created to represent different engine
power ranges identified in Figure 11-21 for use in linear regression analysis as illustrated below:
Engine power (bhp) Dummy 1
< 15.335 1
> 15.335 0
This dummy variable and the interaction between dummy variable and engine power
are then tested to determine whether the use of the variable and interaction can help improve the
model.
$Y = \beta_0 + \beta_1 \log_{10}(\text{engine.power} + 1) + \beta_2 \cdot \text{dummy1} + \beta_3 \cdot \text{dummy1} \cdot \log_{10}(\text{engine.power} + 1) + \text{Error}$ (3.4)
The regression run shows the results in Table 11-25 and Figure 11-33.
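The role of the dummy variable and its interaction term can be illustrated with a short sketch. The default coefficients are the rounded values printed in Table 11-25; the function names are hypothetical, and assigning a power of exactly 15.335 bhp to the upper range is an assumption, since the text does not state where the boundary value falls.

```python
import math

THRESHOLD_BHP = 15.335  # engine power split stated in the text


def dummy1(engine_power):
    """Return 1 for engine power below the threshold and 0 otherwise.

    Treating a value exactly equal to the threshold as 0 is an assumption.
    """
    return 1 if engine_power < THRESHOLD_BHP else 0


def model_34_transformed_hc(engine_power, b0=0.1695, b1=0.0124, b2=0.0022, b3=-0.0249):
    """Evaluate the Model 3.4 form on the transformed (HC^0.25) scale."""
    x = math.log10(engine_power + 1.0)
    d = dummy1(engine_power)
    # With d = 1 the intercept becomes b0 + b2 and the slope becomes b1 + b3.
    return b0 + b1 * x + b2 * d + b3 * d * x


for bhp in (0.0, 9.0, 99.0):
    print(bhp, dummy1(bhp), round(model_34_transformed_hc(bhp), 4))
```

With these rounded coefficients the slope on log10(engine.power + 1) is 0.0124 above the threshold and 0.0124 - 0.0249 = -0.0125 below it, which is the sense in which the dummy variable describes two regression functions.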
-------
Table 11-25 Regression Result for HC Model 3.4
Call: lm(formula = HC.25 ~ log10(engine.power + 1) + dummy1 * log10(engine.power +
1), data = busdata10242006.1.4, na.action = na.exclude)
Residuals:
     Min       1Q      Median       3Q      Max
 -0.1292  -0.0209  -0.0007262  0.02123  0.3423
Coefficients:
                                  Value    Std. Error  t value   Pr(>|t|)
(Intercept)                        0.1695  0.0015      109.7632  0.0000
log10(engine.power + 1)            0.0124  0.0008       15.7058  0.0000
dummy1                             0.0022  0.0017        1.3388  0.1807
dummy1:log10(engine.power + 1)    -0.0249  0.0012      -20.1153  0.0000
Residual standard error: 0.04184 on 38016 degrees of freedom
Multiple R-Squared: 0.07514
F-statistic: 1030 on 3 and 38016 degrees of freedom, the p-value is 0
Analysis of Variance Table
Response: HC.25
Terms added sequentially (first to last)
                                  Df     Sum of Sq  Mean Sq   F Value   Pr(F)
log10(engine.power + 1)               1   3.85779   3.857786  2203.411  0
dummy1                                1   0.84128   0.841276   480.503  0
dummy1:log10(engine.power + 1)        1   0.70843   0.708425   404.624  0
Residuals                         38016  66.55934   0.001751
[Figure panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
Figure 11-33 QQ and Residual vs. Fitted Plot for HC Model 3.4
-------
The results suggest that by using dummy variables and interactions with transformed engine
power, the model only increases the amount of variance explained in truncated transformed
HC from about 5% to about 8%.
Model 3.4 slightly improves $R^2$ relative to Model 3.2 (0.0751 vs. 0.0563). The F-statistic shows that the
$\beta$ coefficients are not all equal to zero, and the linear relationship is statistically significant. The gap in the
residuals plot may reflect the shift in intercept and slope between the two regression
functions.
11.2.2.3.3 Model Discussion
The previous sections outline the model development process from regression tree model,
to a simple OLS model, to more complex OLS models. Since the performance of the models
is evaluated by comparing model predictions and actual observations for emission rates, the
R2 and slope are different from those in previous linear regression models. To test whether the
linear regression with power was a beneficial addition to the regression tree model, the mean
ERs at HTBR end nodes (single value) are compared to the predictions from the linear regres-
sion function with engine power. The results of the performance evaluation are shown in Table
11-26. The improvement in R2 associated with moving toward a linear function of engine power
is nearly imperceptible. Hence, the use of the linear regression function will provide almost no
significant improvement in spatial and temporal model prediction capability. This linear regres-
sion function might still be improved. Since the R2 and slope in Table 11-26 are derived by
comparing model predictions and actual observations for emission rates (untransformed y), these
numbers are different from the results obtained from linear regression models.
Table 11-26 Comparative Performance Evaluation of HC Emission Rate Models
Model                                              Coefficient of   Slope    RMSE       MPE
                                                   determination
Mean ERs                                           0.00002           1.000   0.0020519  0.0000003
Linear regression (power)                          0.00766           0.886   0.0020984  0.00047397
Linear regression (power^0.5)                      0.00912           0.724   0.0020845  0.00040936
Linear regression (log(power))                     0.00950           0.820   0.0020831  0.00040857
Linear regression (log(power)) w/dummy variables   0.00939          -1.142   0.0022933  0.00097449
Results suggest that the linear regression function with log transformation performs
slightly better than the others and that the use of dummy variables can further improve model
performance, but again there is almost no perceptible change in terms of explained variance.
Although the linear regression function with log transformation and dummy variables performs
slightly better than the linear regression function with square root transformation alone, the
revised model introduces additional explanatory variables (dummy variables and the interaction
with engine power) and increases the complexity of the regression model without significantly
improving the model. As discussed in Section 11.2.2.1, there is no compelling reason to include
the dummy variables in the model, given that: 1) the second model is more complex without sig-
nificantly improving model performance, and 2) there is no compelling engineering reason at this
time to support the difference in model performance within these specific power regions. These
dummy variables are, however, worth exploring when additional data from other engine technol-
ogy groups become available for analysis.
Model 3.2 is recommended as the preliminary "final" model (although one might argue
that using the regression tree results directly would also probably be acceptable). The next step
in model evaluation is to once again examine the residuals for the improved model. A principal
objective was to verify that the statistical properties of the regression model conform to a set of
properties of least squares estimators. In summary, these properties require that the error terms
be normally distributed, have a mean of zero, and have uniform variance.
Test for Constancy of Error Variance
A plot of the residuals versus the fitted values is useful in identifying any patterns in the
residuals. Figure 11-31 plot (c) shows this plot for HC Model 3.2. Without considering variance
due to high emission points and zero load data, there is no obvious pattern in the residuals across
the fitted values.
Test of Normality of Error Terms
The first informal test normally reserved for the test of normality of error terms is a
quantile-quantile plot of the residuals. Figure 11-31 plot (d) shows the normal quantile plot of
the HC model. The second informal test is to compare actual frequencies of the residuals against
expected frequencies under normality. Under normality, we expect 68 percent of the residuals
to fall between $\pm\sqrt{\mathrm{MSE}}$ and about 90 percent to fall between $\pm 1.645\sqrt{\mathrm{MSE}}$. Actually, 95.20%
of residuals fall within the first limits, while 96.99% of residuals fall within the second limits.
Thus, the actual frequencies here are reasonably consistent with those expected under normality.
The heavy tails at both ends are a cause for concern, but are due to the nature of the data set. For
example, even after the transformation, the response variable is not exactly normally distributed.
The final HC emission rate model selected for cruise mode is:
$\mathrm{HC} = [0.170 + 0.0022 \cdot \text{engine.power}^{1/2}]^4$
-------
11.3 Conclusions and Further Considerations
In this research, engine power is used as the main explanatory variable to develop cruise
emission rate models. The explanatory ability of engine power varies by pollutant. In general,
NOx is more highly correlated with engine power than are the other two
pollutants.
Inter-bus variability analysis indicated that some of the 15 buses are higher emitters than
others (especially noted for HC emissions). However, none of the buses appear to qualify as
traditional high-emitters, which would exhibit emission rates of two to three standard devia-
tions above the mean. Hence, it is difficult to classify any of these 15 buses as high emitters
for modeling purposes. At this point, these 15 buses are treated as a whole data set for model
development. Modelers should keep in mind that although no true high-emitters are present in
the database, such vehicles may behave significantly differently than the vehicles tested. Hence,
data from high-emitting vehicles should be collected and examined in future studies.
Some high HC emissions events are noted in cruise mode. After screening engine speed,
engine power, engine oil temperature, engine oil pressure, engine coolant temperature, ECM
pressure, and other parameters, no variables were identified that could be linked to these high
emissions events. These events may represent natural variability in onroad emissions, or some
other variable (such as grade or an engine variable that is not measured) may be linked to these
events.
Engine power is selected as the most important variable for three pollutants based on
HTBR tree models. This finding is consistent with previous research results which verified the
important role of engine power (Ramamurthy et al. 1998; Clark et al. 2002; Barth et al. 2004).
The noted HC relationship is significant but fairly weak. Analysis in previous chapters also indi-
cates that engine power is correlated with not only onroad load parameters such as vehicle speed,
acceleration, and grade, but also potentially correlated with engine operating parameters such
as throttle position and engine oil pressure. On the other hand, engine power in this research is
derived from engine speed, engine torque and percent engine load.
The regression tree models still suggest that some other variables, like oil pressure and
engine barometric pressure, may also impact the HC emissions. Further analysis demonstrates
that using engine power alone might achieve explanatory ability similar to that of
using engine power together with other variables. To develop models that are efficient and
easy to implement, only engine power is used to develop emission models. However, additional
investigation into these variables is warranted as additional detailed data from engine testing
become available for analysis.
-------
Given the relationships noted between engine indicated HP and emission rates, it is
imperative that data be collected to develop solid relationships in engine power demand models
(estimating power demand as a function of speed/acceleration, grade, vehicle characteristics,
surface roughness, inertial losses, etc.) for use in regional inventory development and microscale
impact assessment.
In summary, the cruise emission rate models selected for implementation are:
$\mathrm{NO_x} = [0.0087 + 0.0311 \cdot \text{engine.power}^{1/2}]^2$
$\mathrm{CO} = 10^{(-2.223 + 0.0033 \cdot \text{engine.power})}$
$\mathrm{HC} = [0.170 + 0.0022 \cdot \text{engine.power}^{1/2}]^4$
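The three cruise-mode equations can be evaluated directly, as in the minimal sketch below. The function names are hypothetical, and the units (engine power in bhp, emission rates in grams per second) are assumed from the surrounding chapters rather than stated alongside the equations.

```python
import math


def cruise_nox(engine_power):
    """Cruise-mode NOx rate: back-transform of the square-root-scale model."""
    return (0.0087 + 0.0311 * math.sqrt(engine_power)) ** 2


def cruise_co(engine_power):
    """Cruise-mode CO rate: back-transform of the log10-scale model."""
    return 10.0 ** (-2.223 + 0.0033 * engine_power)


def cruise_hc(engine_power):
    """Cruise-mode HC rate: back-transform of the fourth-root-scale model."""
    return (0.170 + 0.0022 * math.sqrt(engine_power)) ** 4


for bhp in (0.0, 50.0, 100.0, 200.0):
    print(bhp, cruise_nox(bhp), cruise_co(bhp), cruise_hc(bhp))
```

The models apply to non-negative engine power only, since two of them take a square root of the input.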
-------
CHAPTER 12
12. MODEL VERIFICATION
In the previous chapters, three statistically-derived modal emission rate models were de-
veloped for use in predicting emissions of NOx, CO and HC from transit buses. This chapter dis-
cusses the reasons for using engine power instead of surrogate power variables in emission rate
modeling, the necessity of developing a linear regression model rather than using mean emission
rates, the need to introduce driving mode with load modeling, the possibility of combining ac-
celeration and cruise modes, and other issues.
12.1 Engine Power vs. Surrogate Power Variables
The first step towards verifying the model is to compare the explanatory power of real
load data and surrogate power variables. Different approaches have been proposed by several re-
searchers. The MOVES model employs vehicle specific power (VSP), defined as instantaneous
power per unit mass of the vehicle (Jimenez-Palacios 1999).
VSP is a measure of the road load on a vehicle, defined as the power per unit mass to
overcome road grade, rolling and aerodynamic resistance, and inertial acceleration (Jimenez-
Palacios 1999; U.S. EPA 2002b; Nam 2003; Younglove et al. 2005):
$\mathrm{VSP} = v \cdot [a(1 + \gamma) + g \cdot \text{grade} + g \cdot C_R] + 0.5 \rho \cdot C_D \cdot A \cdot v^3 / m$
where:
$v$: vehicle speed (assuming no headwind) in m/s
$a$: vehicle acceleration in m/s²
$\gamma$: mass factor accounting for the rotational masses (~0.1)
$g$: acceleration due to gravity
grade: road grade
$C_R$: rolling resistance (~0.0135)
$\rho$: air density (1.2 kg/m³)
$C_D$: aerodynamic drag coefficient
$A$: the frontal area
$m$: vehicle mass in metric tons
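The general VSP expression can be written as a small function, shown here as a sketch. The function and parameter names are hypothetical; the defaults for the mass factor, rolling resistance, and air density are the typical light-duty values listed above, and the combined aerodynamic term `cd_a_over_m` is left for the caller to supply because it depends on the vehicle.

```python
def vsp(speed, accel, grade, cd_a_over_m, gamma=0.1, c_r=0.0135, rho=1.2, g=9.81):
    """Vehicle specific power from the general road-load expression.

    speed in m/s, accel in m/s^2, grade as a fraction (rise over run).
    cd_a_over_m is C_D * A / m; its units must be consistent with rho and speed.
    Defaults for gamma, c_r, and rho are typical light-duty values.
    """
    inertial_and_road = speed * (accel * (1.0 + gamma) + g * grade + g * c_r)
    aerodynamic = 0.5 * rho * cd_a_over_m * speed ** 3
    return inertial_and_road + aerodynamic


# Example: 10 m/s, 1 m/s^2, level road, with an illustrative aerodynamic term.
print(vsp(10.0, 1.0, 0.0, cd_a_over_m=0.0005))
```

Keeping the coefficients as parameters makes it straightforward to substitute values appropriate for transit buses, if such values are available, instead of the light-duty defaults.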
-------
Using typical values for coefficients, in SI units the equation becomes ($C_D A/m \approx 0.0005$)
(Younglove et al. 2005):
$\mathrm{VSP} = v \cdot (1.1 a + 9.81 \cdot \text{grade} + 0.132) + 0.001208 \cdot v^3$
The VSP approach to emission characterization was developed by several researchers
(Jimenez-Palacios 1999; U.S. EPA 2002b; Nam 2003; Younglove et al. 2005) and further developed
as part of the MOVES model. The coefficients used to estimate VSP differ across previous
research because of the choice of typical values of coefficients. This surrogate power
variable (VSP) is not suitable to compare with engine load data for this study. First, the
implementation approach that is used in MOVES is based upon VSP bins, and not on instantaneous
VSP. Second, the coefficients given in the above equation are specific for light-duty vehicles, not
for transit buses. For example, a mass factor of 0.1 is not suitable to describe the inertial loss
characteristics of a transit bus.
Other research efforts have used surrogate power variables such as the inertial power
surrogate, defined as acceleration times velocity, and the drag power surrogate, defined as acceleration
times velocity squared (Fomunung 2000). Barth and Frey also used acceleration times velocity
for power demand estimation (Barth and Norbeck 1997; Frey et al. 2002). Both surrogate
variables for power demand can be tested against NOx in cruise mode. Using surrogate variables
instead of real load data, the model is:
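$Y = \beta_0 + \beta_1 \cdot \text{acceleration} + \beta_2 \cdot \text{vehicle.speed} + \beta_3 \cdot \text{vehicle.speed} \cdot \text{acceleration} + \beta_4 \cdot \text{vehicle.speed}^2 \cdot \text{acceleration} + \text{Error}$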
The regression run shows the results in Table 12-1 and Figure 12-1.
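The surrogate-variable model can be sketched as a function that builds the four regressors and applies the fitted coefficients. The function names are hypothetical, the coefficients are the rounded values printed in Table 12-1, and the units of speed and acceleration are assumed to match those of the data set, which are not restated in this section.

```python
def surrogate_terms(speed, accel):
    """Return (acceleration, speed, speed*acceleration, speed^2*acceleration)."""
    return (accel, speed, speed * accel, speed ** 2 * accel)


# Rounded coefficients from Table 12-1: intercept, then the four terms above.
TABLE_12_1_COEFS = (0.1996, 0.0738, 0.0043, 0.0066, -0.0001)


def predict_sqrt_nox(speed, accel, coefs=TABLE_12_1_COEFS):
    """Predict NOx on the transformed (square-root) scale used for NOx.50."""
    intercept, *slopes = coefs
    return intercept + sum(b * x for b, x in zip(slopes, surrogate_terms(speed, accel)))


print(predict_sqrt_nox(10.0, 2.0))
```

The printed coefficients are rounded to four decimals, so predictions from this sketch will not reproduce the original fitted values exactly.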
-------
Table 12-1 Regression Result for NOx Model 1
Call: lm(formula = NOx.50 ~ vehicle.speed * acceleration + vehicle.speed^2 :
acceleration, data = busdata10242006.1.4, na.action = na.exclude)
Residuals:
     Min        1Q    Median       3Q    Max
 -0.4779  -0.08625  0.001824  0.08759  1.338
Coefficients:
                                   Value    Std. Error  t value   Pr(>|t|)
(Intercept)                         0.1996  0.0018      113.0559  0.0000
vehicle.speed                       0.0043  0.0001       77.4369  0.0000
acceleration                        0.0738  0.0052       14.2957  0.0000
vehicle.speed:acceleration          0.0066  0.0004       15.5704  0.0000
acceleration:I(vehicle.speed^2)    -0.0001  0.0000      -13.7590  0.0000
Residual standard error: 0.1323 on 39369 degrees of freedom
Multiple R-Squared: 0.3708
F-statistic: 5801 on 4 and 39369 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                                   (Intercept)  vehicle.speed  acceleration  vehicle.speed:acceleration
vehicle.speed                      -0.9243
acceleration                        0.0796      -0.0590
vehicle.speed:acceleration         -0.0825       0.0569        -0.9114
acceleration:I(vehicle.speed^2)     0.0782      -0.0593         0.7978       -0.9678
Analysis of Variance Table
Response: NOx.50
Terms added sequentially (first to last)
                                   Df     Sum of Sq  Mean Sq   F Value   Pr(F)
vehicle.speed                          1  122.5215   122.5215   6999.67  0
acceleration                           1  278.9165   278.9165  15934.55  0
vehicle.speed:acceleration             1    1.4036     1.4036     80.19  0
acceleration:I(vehicle.speed^2)        1    3.3136     3.3136    189.31  0
Residuals                          39369  689.1106     0.0175
-------
[Figure panels: (a) Residual vs. Fit; (b) Residuals Normal QQ]
Figure 12-1 QQ and Residual vs. Fitted Plot for NOx Model 1
The results suggest that the surrogate variable model can explain about 37% of the
variance in truncated transformed NOx, whereas the OLS model developed in Chapter 10 explained
more than 75% of the cruise mode variance. Considering the theoretical equation of engine
power presented much earlier in Chapter 3, the surrogate variables can only represent some, and
not all, of the components of engine power. Given the importance of engine power in explaining
the variability of emissions, it is essential that field data collection efforts include the measure-
ment of indicated load data as well as all of the operating conditions necessary to estimate bhp
load when second-by-second emission rate data are collected.
12.2 Mean Emission Rates vs. Linear Regression Model
The modeling approach employed in this research involved the separation of data into
separate driving modes for analysis and then applying modeling techniques to derive emission
rates as a function of engine load. Although constant emission rates in grams/second were ad-
equate for idle, motoring, and non-motoring deceleration modes, modeling efforts in Chapters 10
and 11 demonstrated that a linear regression function should improve spatial and temporal model
prediction capability significantly for acceleration and cruise modes. However, one verification
comparison that should be undertaken is on the overall benefit of introducing engine load into the
modeling regime vs. simply using average emission rate values for each operating mode. This
comparison will provide insight into the overall effect of introducing engine load (even though it
is only introduced into acceleration and cruise modes).
There are a number of model goodness-of-fit criteria that can be used to assess the dif-
ference between the emissions predicted by the load-based modal emission rate model and the
mode-only emission rate models. Normally, one would compare the alternative model
performance for an independent set of data collected from similar vehicles, which is currently not
available. Alternatively, model developers would set aside a significant subset of the data in the
model development data set so that the data are not used in model development and instead used
in model comparisons. However, there were not enough data available to do this. Hence, at this
time, the only comparisons that can be made are for alternative model performance using the
same data that were used to develop the models presented in this research effort.
The performance of the models is first evaluated by comparing model predictions and ac-
tual observations for emission rates. The performance of the model can be evaluated in terms of
precision and accuracy (Neter et al. 1996). The $R^2$ value is an indication of precision. Usually,
higher R2 values imply a higher degree of precision and less unexplained variability in model
predictions than lower R2 values. The slope of the trend line for the observed versus predicted
values is an indication of accuracy. A slope of one indicates an accurate prediction, in that the
prediction of the model corresponds to an observation.
The model's predictive ability is also evaluated using the root mean square error (RMSE)
and the mean prediction error (MPE) (Neter et al. 1996). The RMSE is a measure of prediction
error. When comparing two models, the model with a smaller RMSE is a better predictor of
the observed phenomenon. Ideally, mean prediction error is close to zero. RMSE and MPE are
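calculated as follows:
$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$ Equation (12-1)
$\mathrm{MPE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)$ Equation (12-2)
where:
RMSE = root mean square error
$n$ = number of observations
$y_i$ = observation y
$\hat{y}_i$ = predicted mean of observation y
MPE = mean prediction error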
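The four performance measures described above can be computed together, as in the following sketch. The function name `evaluate` is hypothetical, and two conventions are assumptions made here: the trend line regresses observed values on predicted values, and the prediction error is taken as observed minus predicted.

```python
import math


def evaluate(observed, predicted):
    """Return (r_squared, slope, rmse, mpe) for paired observations and predictions.

    slope is that of the least squares trend line of observed (y) on predicted (x).
    The sign convention for mpe (observed minus predicted) is an assumption.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    sxx = sum((p - mean_pred) ** 2 for p in predicted)
    syy = sum((o - mean_obs) ** 2 for o in observed)
    sxy = sum((p - mean_pred) * (o - mean_obs) for o, p in zip(observed, predicted))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    mpe = sum(o - p for o, p in zip(observed, predicted)) / n
    return r_squared, slope, rmse, mpe


# Illustrative values only: predictions that are exactly half of the observations.
print(evaluate([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0]))
```

In this illustration the predictions are perfectly correlated with the observations but systematically low, so the coefficient of determination equals one while the slope, RMSE, and MPE all reveal the bias. This is why precision and accuracy are reported separately.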
To test whether the linear regression with power was a beneficial addition to the regres-
sion tree model, the mean ERs at HTBR end nodes (single value) are compared to the predictions
from the linear regression function with engine power. The results of the performance evaluation
are shown in Table 12-2.
-------
Table 12-2 Comparative Performance Evaluation between Mode-Only Models and Linear Regression
Models
Pollutant  Model               Coefficient of   Slope ($\beta_1$)  RMSE     MPE
                               determination
NOx        Mean ERs            0.438            1.000              0.08725   0.000002
NOx        Linear Regression   0.665            1.102              0.07122   0.021463
CO         Mean ERs            0.248            1.000              0.07406  -0.000004
CO         Linear Regression   0.491            1.749              0.06691   0.010285
HC         Mean ERs            0.0686           1.000              0.00190   0.0000005
HC         Linear Regression   0.0677           1.213              0.00192   0.000223
For NOx and CO, the $R^2$ values indicate that the load-based modal emission model performs
better than mean emission rates (0.665 vs. 0.438 for NOx and 0.491 vs. 0.248 for CO), so the use of linear
regression functions improves model performance. The results shown in Table 12-2 reinforce the importance of
introducing linear regression functions in acceleration and cruise mode. For HC, there is no discernible
difference in model performance. Combining this finding with the performance results for HC
noted in Chapters 8 through 11, using constant emission rates for each operating mode could be
justified for this data set. When additional data are collected, researchers should compare mean
emission rates approaches to power-based approaches to ensure that power demand models for
HC are necessary.
12.3 Mode-specific Load Based Modal Emission Rate Model vs. Emission Rate Models as a
Function of Engine Load
Modal modeling approaches are becoming widely accepted as more accurate in making
realistic estimates of mobile source contributions to local and regional air quality. Research at
Georgia Tech has clearly identified that modal operation is a better indicator of emission rates
than average speed (Bachman 1998). The analysis of emissions with respect to driving modes,
also referred to as modal emissions, has been performed in recent research studies (Barth et al.
1996; Bachman 1998; Fomunung et al. 1999; Frey et al. 2002; Nam 2003; Barth et al. 2004).
These studies indicated that driving modes might have the ability to explain a certain portion of
the variability in emissions data. In Chapters 10 and 11, emission rates were derived as a func-
tion of driving mode (cruise, idle, acceleration, and deceleration operations) and engine power
because previous research efforts had separately suggested that vehicle emission rates were
highly correlated with modal activity and engine power. In this research, five driving modes are
introduced in total: idle mode, deceleration motoring mode, revised deceleration mode, accelera-
tion mode, and cruise mode.
Chapters 10 and 11 did not compare the combined modal and engine power models to
models that use power alone to predict emission rates. To test the effect of adding driving modes
in the emission rate model, the derivation of a load-only model for NOx emissions is illustrated
in detail. Load-only CO emissions models and HC emissions models are also derived for com-
parison purposes and presented in final form (however, the detailed regression plots and tables
are omitted for the purposes of brevity).
As in previous chapters, the first step for a load-based-only model is to select the most
important variable for NOx emissions. When using the entire database at once (data are not broken
into mode subsets for this derivation), the appropriate transformation for NOx is ¼ based on Box-Cox
results, rather than the ½ value used in developing models for acceleration and cruise mode
(see Chapters 10 and 11). The trimmed HTBR tree models for NOx are illustrated in Figure 12-2
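and Table 12-3.
[Figure: regression tree. Root split: engine.power < 41.535. Left branch split: engine.power < 4.515 (terminal nodes 0.2768 and 0.4246). Right branch split: engine.power < 96.255 (terminal nodes 0.5926 and 0.6933).]
Figure 12-2 Trimmed Regression Tree Model for Truncated Transformed NOx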
-------
Table 12-3 Trimmed Regression Tree Results for Truncated Transformed NOx
Regression tree:
tree(formula = NOx.25 ~ engine.power + vehicle.speed + acceleration +
oil.temperture + oil.press + cool.temperature + eng.bar.press +
model.year + odometer + bus360 + bus361 + bus363 + bus364 + bus372 +
bus375 + bus377 + bus379 + bus380 + bus381 + bus382 + bus383 + bus384 +
bus385 + dummy.grade, data = busdata10242006.1, na.action = na.exclude,
mincut = 3000, minsize = 6000, mindev = 0.1)
Variables actually used in tree construction:
[1] "engine.power"
Number of terminal nodes: 4
Residual mean deviance: 0.005837 = 618.6 / 106000
Distribution of residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-5.187e-001 -4.510e-002 -9.204e-003 3.768e-016 5.004e-002 6.557e-001
node), split, n, deviance, yval
* denotes terminal node
1) root 105976 3058.00 0.4991
2) engine.power<41.535 62441 666.60 0.3823
4) engine.power<4.515 17897 195.50 0.2768 *
5) engine.power>4.515 44544 192.20 0.4246 *
3) engine.power>41.535 43535 316.60 0.6667
6) engine.power<96.255 11504 61.56 0.5926 *
7) engine.power>96.255 32031 169.20 0.6933 *
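The trimmed tree in Table 12-3 amounts to a piecewise-constant predictor on the transformed scale, which can be sketched as below. The function names are hypothetical, values exactly equal to a split point are assigned to the upper branch as an assumption, and raising the node mean to the fourth power is a simple back-transform used here for illustration rather than a bias-corrected estimate.

```python
def tree_nox_fourth_root(engine_power):
    """Return the terminal-node mean of NOx^0.25 for a given engine power."""
    if engine_power < 41.535:
        return 0.2768 if engine_power < 4.515 else 0.4246
    return 0.5926 if engine_power < 96.255 else 0.6933


def tree_nox(engine_power):
    """Naive back-transform of the node mean to the original NOx scale."""
    return tree_nox_fourth_root(engine_power) ** 4


# Node sizes and means from Table 12-3, used to check the root-node mean.
NODES = ((17897, 0.2768), (44544, 0.4246), (11504, 0.5926), (32031, 0.6933))
root_mean = sum(n * y for n, y in NODES) / sum(n for n, _ in NODES)
print(round(root_mean, 4))
```

The weighted average of the four terminal-node means reproduces the root-node value of 0.4991 reported in the table, which is a quick consistency check on the transcribed tree.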
After testing different transformations for Y and adding dummy variables according to
HTBR results, Table 12-4 and Figure 12-3 show that a load-based-only model for NOx emissions
is a fairly good model, considering the constancy of error variance and normality of error terms.
The final load-based-only model for NOx emissions is:
$\mathrm{NO_x} = [0.230 + 0.195 \log_{10}(\text{engine.power} + 1)]^4$
-------
Table 12-4 Regression Result for NOx Load-Based Only Emission Rate Model
Call: lm(formula = NOx.25 ~ log10(engine.power + 1), data = busdata10242006.1,
na.action = na.exclude)
Residuals:
     Min        1Q    Median       3Q    Max
 -0.4683  -0.04297  -0.01329  0.04138  0.663
Coefficients:
                          Value   Std. Error  t value   Pr(>|t|)
(Intercept)               0.2303  0.0005      489.9131  0.0000
log10(engine.power + 1)   0.1950  0.0003      657.2170  0.0000
Residual standard error: 0.0754 on 105974 degrees of freedom
Multiple R-Squared: 0.803
F-statistic: 431900 on 1 and 105974 degrees of freedom, the p-value is 0
Correlation of Coefficients:
                          (Intercept)
log10(engine.power + 1)   -0.8702
Analysis of Variance Table
Response: NOx.25
Terms added sequentially (first to last)
                          Df      Sum of Sq  Mean Sq   F Value   Pr(F)
log10(engine.power + 1)        1  2455.676   2455.676  431934.2  0
Residuals                 105974   602.494      0.006
[Figure panels: (a) Residuals vs. Fit; (b) Response vs. Fit; (c) Residuals Normal QQ]
Figure 12-3 QQ and Residual vs. Fitted Plot for Load-Based Only NOx Emission Rate Model
-------
Following the same derivation techniques, the final load-only model for CO emissions is:
$\mathrm{CO} = 10^{[-2.659 + 0.0899 \cdot \text{engine.power}^{1/2}]}$
Following the same derivation techniques, the final load-only model for HC emissions is:
$\mathrm{HC} = 10^{[-3.306 + 0.0382 \cdot \text{engine.power}^{1/2}]}$
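The three load-only models stated in this section can be evaluated with the sketch below. The function names are hypothetical, and the units (engine power in bhp, emission rates in grams per second) are assumed from the rest of the report.

```python
import math


def load_only_nox(engine_power):
    """Load-only NOx rate: fourth-power back-transform of the log10 model."""
    return (0.230 + 0.195 * math.log10(engine_power + 1.0)) ** 4


def load_only_co(engine_power):
    """Load-only CO rate: log10-scale model in the square root of engine power."""
    return 10.0 ** (-2.659 + 0.0899 * math.sqrt(engine_power))


def load_only_hc(engine_power):
    """Load-only HC rate: log10-scale model in the square root of engine power."""
    return 10.0 ** (-3.306 + 0.0382 * math.sqrt(engine_power))


for bhp in (0.0, 25.0, 99.0, 200.0):
    print(bhp, load_only_nox(bhp), load_only_co(bhp), load_only_hc(bhp))
```

Unlike the mode-specific models, these functions apply one equation per pollutant across all driving modes, so zero engine power (for example, idle or motoring) is a valid input.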
The performance of the load-only models relative to the combined mode and load models
developed in Chapters 8 through 11 is presented in Table 12-5.
Table 12-5 Comparative Performance Evaluation Between Load-Based Only Emission Rate (ER)
Model and Load-Based Modal Emission Rate Model
Pollutant  Model                             Coefficient of   Slope ($\beta_1$)  RMSE     MPE
                                             determination
NOx        Load-Only Emission Rate Model     0.715            1.181              0.06494  0.011382
NOx        Mode/Load Emission Rate Models    0.665            1.102              0.07122  0.021463
CO         Load-Only Emission Rate Model     0.246            2.071              0.07886  0.015568
CO         Mode/Load Emission Rate Models    0.490            1.749              0.06691  0.010285
HC         Load-Only Emission Rate Model     0.0672           0.982              0.00197  0.000499
HC         Mode/Load Emission Rate Models    0.0677           1.213              0.00192  0.000223
For NOx, both models perform well in explaining the variance of emission rates, reinforcing
the importance of including engine power as a variable in explaining the variance of NOx
emission rates. Results suggest that a mode/load modal emission modeling approach performs
slightly better than load-only emission rate models for CO. For HC, there is no discernible
difference in model performance. Combining this finding with the performance results for HC
noted in Chapters 8 through 11, using constant emission rates for each operating mode could be
justified for this data set. When additional data are collected, researchers should compare mode-
only approaches to power-based approaches to ensure that power demand models for HC are
necessary.
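The four comparison statistics used in this chapter can be sketched in a few lines of Python. The definitions below are assumptions made for illustration: R2 and the slope are taken from a simple regression of observed on predicted rates, RMSE is the root mean square error, and MPE is taken as the mean of (predicted - observed). The report's own definitions appear in earlier chapters and may differ, in particular in the sign convention for MPE. The function name is hypothetical.

```python
import math

def evaluate(observed, predicted):
    """Return R2, slope, RMSE, and MPE for paired observed/predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of squares and cross-products about the means.
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    # Prediction errors (assumed sign convention: predicted minus observed).
    errors = [p - o for p, o in zip(predicted, observed)]
    return {
        "r2": s_po ** 2 / (s_pp * s_oo),
        "slope": s_po / s_pp,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mpe": sum(errors) / n,
    }

# Example with made-up numbers: predictions offset from observations by 0.5.
print(evaluate([1.0, 2.0, 4.0], [1.5, 2.5, 4.5]))
```

Under these assumed definitions, a constant offset leaves R2 and the slope at 1 while RMSE and MPE both equal the offset, which is why the tables report all four statistics rather than R2 alone.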
-------
12.4 Separation of Acceleration and Cruise Modes
In this research effort, separate models were developed for acceleration and cruise modes
(Chapters 10 and 11). However, it may be possible to combine acceleration and cruise mode
activity into a new "combined driving" mode. As noted in Chapter 10, although engine power
distribution for acceleration mode is different from cruise mode, these two modes share a similar
pattern. A quick analysis of the impact of combining acceleration and cruise mode is presented
in this section.
After examining HTBR results, selecting the important explanatory variables, testing dif-
ferent transformations for X and Y, and adding dummy variables according to HTBR results, the
final NOx emission model for combined driving mode is:
NOx = [0.113 + 0.0266(engine.power)^(1/2)]^2
The final CO emission model for combined driving mode is:
CO = 10^[-2.238 + 0.0043(engine.power)]
while the final HC emission model for combined driving mode is:
HC = [0.167 + 0.0028(engine.power)^(1/2)]^4
To test whether combining acceleration and cruise modes would benefit the load-based
modal emission model, the predictions from the linear regression function for combined driving
mode are compared to the predictions from sub-models for acceleration and cruise mode in the
load-based modal emission model. Because the other elements are the same in the two models, they
are excluded from the test. The results of the performance evaluation are shown in Table 12-6.
-------
Table 12-6 Comparative Performance Evaluation between Linear Regression with Combined
Mode and Linear Regression with Acceleration and Cruise Modes

                                Coefficient of
                                determination (R2)   Slope (β1)   RMSE      MPE
NOx
  Combined Driving Mode         0.531                0.921        0.08488   0.00840
  Acceleration & Cruise Mode    0.527                0.953        0.09312   0.03904
CO
  Combined Driving Mode         0.177                1.594        0.10395   0.02305
  Acceleration & Cruise Mode    0.452                1.775        0.08966   0.01873
HC
  Combined Driving Mode         0.0338               0.907        0.00204   0.00042
  Acceleration & Cruise Mode    0.0410               0.905        0.00203   0.00041
Results shown in Table 12-6 suggest that separate linear regression functions for acceleration
and cruise modes perform significantly better than linear regression functions with combined
driving mode for CO. For NOx and HC, both models perform similarly with respect to explaining
the variance of emission rates. In general, these results support introducing acceleration and
cruise mode into the conceptual model. However, as new data become available for testing,
researchers should examine whether it is reasonable to simply separate idle and deceleration
modes from other driving modes and then apply a simple power-based model to the remaining
combined driving activity for NOx.
12.5 MOBILE6.2 vs. Load-Based Modal Emission Rate Model
The final step undertaken in the model verification process was a comparison of predic-
tion results from MOBILE6.2 and the load-based modal emission rate model developed in this
research. Comparisons are based upon the Ann Arbor transit vehicle test data. These data were
used to develop the modal emission rates for this report, but were not used in developing the
MOBILE6.2 model. Normally, one would compare alternative model performance using an
independent set of data collected from similar vehicles, which is currently not available. Hence,
the comparisons that will be presented are far from unbiased. When new data from an indepen-
dent test fleet become available, these comparisons should be performed again.
To facilitate the emission rate prediction comparison, lookup tables for MOBILE6.2
transit bus emission rates on arterial roads were first created for average speeds from 2.5 mph to
65 mph. The MOBILE6.2 calendar year was set to January 2002 since the data set was collected
during October 2001. The temperature was set as 75 °F, since the emission rates for transit buses
in MOBILE6.2 do not change with temperature. Emissions predictions from MOBILE6.2 were
then obtained by combining lookup tables and corresponding speed values in the AATA data set.
The results of the performance evaluation are shown in Table 12-7.
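The lookup-table step described above can be sketched as follows. This is a hedged illustration only: the speed bins and gram-per-mile rates are HYPOTHETICAL placeholders, not MOBILE6.2 output, and the function names are invented for the example. The sketch also assumes linear interpolation between tabulated speeds, clamping outside the 2.5 to 65 mph range, and conversion to grams per second by multiplying by speed and dividing by 3600; the report does not state how these details were handled.

```python
from bisect import bisect_left

# HYPOTHETICAL placeholder table: average speed (mph) -> emission rate (g/mile).
SPEEDS_MPH = [2.5, 5.0, 10.0, 20.0, 40.0, 65.0]
RATES_G_PER_MILE = [40.0, 30.0, 22.0, 16.0, 12.0, 11.0]

def rate_g_per_mile(speed_mph):
    """Linearly interpolate the table, clamping to its end points."""
    if speed_mph <= SPEEDS_MPH[0]:
        return RATES_G_PER_MILE[0]
    if speed_mph >= SPEEDS_MPH[-1]:
        return RATES_G_PER_MILE[-1]
    i = bisect_left(SPEEDS_MPH, speed_mph)
    x0, x1 = SPEEDS_MPH[i - 1], SPEEDS_MPH[i]
    y0, y1 = RATES_G_PER_MILE[i - 1], RATES_G_PER_MILE[i]
    return y0 + (y1 - y0) * (speed_mph - x0) / (x1 - x0)

def rate_g_per_second(speed_mph):
    """Convert g/mile to g/s: (g/mile) * (miles/hour) / (3600 s/hour)."""
    return rate_g_per_mile(speed_mph) * speed_mph / 3600.0

# Example: apply the lookup to a short, made-up second-by-second speed trace.
trace_mph = [0.0, 3.0, 7.5, 12.0, 25.0]
print([rate_g_per_second(v) for v in trace_mph])
```

One consequence of a purely speed-based conversion is that predicted gram-per-second emissions go to zero at zero speed, which is one reason a speed-only model is limited in explaining second-by-second variability.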
Table 12-7 Comparative Performance Evaluation between MOBILE 6.2 and Load-Based Modal ER Model

                                Coefficient of
                                determination (R2)   Slope (β1)   RMSE      MPE
NOx
  MOBILE 6.2                    0.172                0.706        0.10825   0.011217
  Load-Based Modal ER Model     0.665                1.102        0.07122   0.021463
CO
  MOBILE 6.2                    0.0195               1.690        0.08516   0.013399
  Load-Based Modal ER Model     0.491                1.749        0.06691   0.010285
HC
  MOBILE 6.2                    0.0408               0.584        0.00194   0.000173
  Load-Based Modal ER Model     0.0677               1.213        0.00192   0.000223
Results suggest that the load-based modal emission rate model performs significantly better
than MOBILE6.2 for NOx and CO, and slightly better for HC. The performance of the load-based
modal emission rate model is not surprising because the same data used to develop the
model are used in the comparison. Results suggest that the load-based modal emission model
performs well in explaining the variance of NOx and CO emission rates at the microscopic
level. The slight differences in RMSE and MPE indicate that both models (MOBILE6.2 and the
load-based modal emission model) perform well at the macroscopic level, and should perform
similarly when used in regional inventory development.
12.6 Conclusions
In general, the results provided here are encouraging for the load-based modal emission
model. The comparison between engine power and surrogate power variables confirms
the important role of engine power in explaining the variability of emissions. The comparison
between the load-only emission rate model and the load-based modal emission rate model shows
that the impact of driving mode on emissions is significant for NOx and CO emissions, while no
such trend is discernible for HC. The comparison between acceleration and cruise modes and
combined driving mode indicates that the relationships between engine power and emissions are
slightly different for acceleration and cruise modes. Splitting the database into five modes (idle
mode, decelerating motoring mode, deceleration non-motoring mode, acceleration mode, and
cruise mode) appears warranted.
-------
The data used to develop the load-based modal emission model in this research are very
limited, since the data set contained only 15 transit buses. Inter-bus variability is more obvious
for HC emissions, since Bus 363 has the lowest HC emissions compared with the other 14 buses.
This kind of variability might influence the explanatory variables of the modal emission model
for HC emissions. When new data become available, these models should be re-derived to
obtain further improved performance in applications to the transit bus fleet.
-------
CHAPTER 13
13. CONCLUSIONS
The goal of this research is to provide emission rate models that fill the gap between
existing models and ideal models for predicting emissions of NOx, CO, and HC from heavy-duty
diesel vehicles. The researchers at Georgia Institute of Technology have developed a beta ver-
sion of HDDV-MEM (Guensler et al. 2005), which is based upon vehicle technology groups,
engine emission characteristics, and vehicle modal activity. The HDDV-MEM first predicts
second-by-second engine power demand as a function of onroad vehicle operating conditions
and then applies brake-specific emission rates to these activity predictions. The HDDV-MEM
consists of three modules: a vehicle activity module (with vehicle activity tracked by a vehicle
technology group), an engine power module, and an emission rate module.
Using second-by-second data collected from onroad vehicles, the research effort reported
herein developed models to predict emission rates as a function of onroad operating conditions
that affect vehicle emissions. Such models should be robust and ensure that assumptions about
the underlying distribution of the data are verified and that assumptions associated with appli-
cable statistical methods are not violated. Due to the general lack of data available for develop-
ment of heavy-duty vehicle modal emission rate models, this study focuses on development of an
analytical methodology that is repeatable with different data sets collected across space and time.
The only acceptable second-by-second data set in which emission rate and applicable load and
vehicle activity data had been collected in parallel was the AATA bus emissions database
collected by Sensors, Inc., for use by the U.S. EPA.
The models developed in this report are applicable to transit buses only, and are not ap-
plicable to all transit buses (see limitations discussion in Section 13.2). However, a significant
contribution of the research is in the development of the analytical framework established for
analysis of second-by-second emission rate data collected in parallel with engine load and other
onroad operating parameters, and in the development of applicable processes for developing sta-
tistical models using such data. To demonstrate the capability of the modeling framework, three
modal emission rate models have been developed for prediction of NOx, CO and HC emissions
from mid-1990s transit buses.
The AATA transit bus data set was first post-processed through a quality control/quality
assurance process. Data problems were identified and corrected during this stage of the research
effort. The types of errors checked include: loss of data, erroneous ECM data, GPS dropouts,
and synchronization errors. Data records for which all data elements were not collected were
removed to avoid any bias to the results. No erroneous ECM data were identified. Six buses ex-
perienced GPS dropouts and synchronization errors and these problems were treated as described
in Chapter 4. Emission rate variability was also assessed across the sample of buses to identify
any potential high-emitters that may behave differently than other buses under normal operating
conditions and therefore warrant separate model development. However, no high-emitters were
identified. To find the true 'high-emitters', modelers need to include a representative sample
of buses to try to ensure that mean emissions and response rates to operating variables are rep-
resented in the data. Since there are only 15 buses in the data set, modelers could not exclude
buses that showed higher emission rates than the others.
Model development then proceeded through a structured series of steps. Transformations
of emission rates (NOx, CO, and HC) were verified through a Box-Cox procedure to better satisfy
specific modeling assumptions, such as linearity or normality. HTBR regression tree results
were used to identify the most important explanatory variables for emission rates. OLS regres-
sion models were developed for transformed emission rates using chosen explanatory variables.
Dummy variables were created to represent the cut points identified in HTBR trees. Interaction
effects for identified explanatory variables were also tested to see whether they could improve
the model. The models were comparatively evaluated and the most efficient models for each
pollutant were selected. By demonstrating statistical "robustness" and sufficiency in previous
chapters, the main goal of this research, that of "developing new load-based models with signifi-
cant improvement", was achieved.
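The Box-Cox step in this sequence can be illustrated with a small, self-contained Python sketch. It is a simplified assumption-laden example rather than the procedure used in the report: it evaluates the Box-Cox profile log-likelihood for a response with no explanatory variables (the report applied the procedure in a regression setting), searches a coarse grid of candidate exponents, and uses made-up data. All function names are hypothetical. In this notation the square-root, base-10 logarithm, and fourth-root forms that appear in the report's final models correspond to exponents of 1/2, 0, and 1/4.

```python
import math

def boxcox(y, lam):
    """Box-Cox transform of positive values y for exponent lam."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def profile_loglik(y, lam):
    """Profile log-likelihood of lam for an intercept-only normal model."""
    n = len(y)
    z = boxcox(y, lam)
    mean_z = sum(z) / n
    var_z = sum((v - mean_z) ** 2 for v in z) / n
    # The Jacobian term (lam - 1) * sum(log y) makes values comparable across lam.
    return -0.5 * n * math.log(var_z) + (lam - 1.0) * sum(math.log(v) for v in y)

def best_lambda(y, grid):
    """Return the grid value with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(y, lam))

# Made-up data whose logarithms are evenly spaced, so the log transform fits best.
y = [math.exp(x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
print(best_lambda(y, [-1.0, -0.5, 0.0, 0.5, 1.0]))
```

In practice the selected exponent is usually rounded to a convenient value (such as 0, 1/4, or 1/2) so that the fitted model can be back-transformed and implemented easily.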
This chapter will review the key accomplishments of this research. The chapter provides
the final models selected for implementation and begins with a summary of the final models
developed for the transit buses, followed immediately by a discussion on the limitations of these
models. The chapter concludes with the lessons learned and recommendations on further re-
search.
-------
13.1 Transit Bus Emission Rate Models
The goal of this research was to develop a methodology for creating load-based emis-
sion rate models designed to predict emission rates of NOx, CO, and HC from transit buses as a
function of onroad operating conditions. The models should be robust and ensure that statisti-
cal assumptions in model development are not violated. With limited available data, this study
developed a methodology that is repeatable with a different data set from across space and across
time. The final estimated models are presented in Table 13-1.
Table 13-1 Load Based Modal Emission Models

Driving Mode                       Emission Rate
NOx
  Idle Mode                        0.033415 g/s
  Decelerating Motoring Mode       0.0097768 g/s
  Deceleration Non-Motoring Mode   0.045777 g/s
  Acceleration Mode                NOx = (-0.0195 + 0.201 log10(engine.power + 1) + 0.0019 vehicle.speed)^2
  Cruise Mode                      NOx = (0.0087 + 0.0311 (engine.power)^(1/2))^2
CO
  Idle Mode                        0.0059439 g/s
  Decelerating Motoring Mode       0.0052857 g/s
  Deceleration Non-Motoring Mode   0.0068557 g/s
  Acceleration Mode                CO = 10^(-3.747 + 1.341 log10(engine.power + 1) - 0.0285 vehicle.speed)
  Cruise Mode                      CO = 10^(-2.223 + 0.0033 engine.power)
HC
  Idle Mode                        0.00091777 g/s
  Decelerating Motoring Mode       0.001113 g/s
  Revised Deceleration Mode        0.001312 g/s
  Acceleration Mode                HC = (0.114 + 0.0426 log10(engine.power + 1))^4
  Cruise Mode                      HC = (0.170 + 0.0022 (engine.power)^(1/2))^4
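To show how the entries of Table 13-1 might be combined into a single second-by-second calculation, the Python sketch below dispatches on pollutant and driving mode. It is an illustrative reading of the table, not the report's implementation: the function name, mode labels, and dictionary layout are hypothetical, the third deceleration entry for HC (labelled "Revised Deceleration Mode" in the table) is stored under the same key as the non-motoring deceleration rates, and engine power and vehicle speed are assumed to be supplied in the units used to fit the models.

```python
import math

# Constant emission rates (g/s) for idle and deceleration modes, from Table 13-1.
# For HC, "decel_non_motoring" holds the table's "Revised Deceleration Mode" value.
CONSTANT_RATES = {
    "NOx": {"idle": 0.033415, "decel_motoring": 0.0097768, "decel_non_motoring": 0.045777},
    "CO": {"idle": 0.0059439, "decel_motoring": 0.0052857, "decel_non_motoring": 0.0068557},
    "HC": {"idle": 0.00091777, "decel_motoring": 0.001113, "decel_non_motoring": 0.001312},
}

def emission_rate(pollutant, mode, engine_power=0.0, vehicle_speed=0.0):
    """Return the emission rate in g/s for one second of operation."""
    if pollutant not in CONSTANT_RATES:
        raise ValueError("unknown pollutant: %s" % pollutant)
    if mode in CONSTANT_RATES[pollutant]:
        return CONSTANT_RATES[pollutant][mode]
    if mode not in ("acceleration", "cruise"):
        raise ValueError("unknown mode: %s" % mode)
    log_power = math.log10(engine_power + 1.0)
    root_power = math.sqrt(engine_power)
    if pollutant == "NOx":
        if mode == "acceleration":
            return (-0.0195 + 0.201 * log_power + 0.0019 * vehicle_speed) ** 2
        return (0.0087 + 0.0311 * root_power) ** 2
    if pollutant == "CO":
        if mode == "acceleration":
            return 10 ** (-3.747 + 1.341 * log_power - 0.0285 * vehicle_speed)
        return 10 ** (-2.223 + 0.0033 * engine_power)
    # Remaining case: HC.
    if mode == "acceleration":
        return (0.114 + 0.0426 * log_power) ** 4
    return (0.170 + 0.0022 * root_power) ** 4

# Example: total NOx over a short, made-up trace of (mode, power, speed) records.
trace = [("idle", 0.0, 0.0), ("acceleration", 99.0, 10.0), ("cruise", 100.0, 30.0)]
print(sum(emission_rate("NOx", m, p, v) for m, p, v in trace))
```

Summing the per-second rates over a trace gives total grams emitted, which is how a modal model of this kind would typically feed link-level or trip-level inventories.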
The transformations employed for the three pollutants in acceleration and cruise modes
are different. The predictive capabilities of the models for the three pollutants are also
different. The R2 value is high for NOx and CO emission rates, but very low for HC emission rates.
HC models are not much better than simply using HTBR mean ERs. The relatively poor
performance of the HC models is not an inherent limitation of the modal modeling approach. Instead,
it is a result of the lack of availability of a suitable explanatory variable for model development
purposes. Although the model with dummy variables and interactions works better, the final
model is not necessarily the best fit, but is one that can be readily implemented.
The three models include all of those significant variables identified as affecting gram/
second emissions rates, with the exception of those variables that are highly correlated with indi-
vidual bus ID. Although a few of the vehicles behaved differently from other vehicles, modelers
could not reasonably include bus ID as a variable, nor environmental parameters of testing since
all low barometric pressure tests were conducted on one or two vehicles. Additional explora-
tion of environmental conditions should be conducted by collecting data for a larger fleet under a
wider variety of environmental conditions over a longer time.
The new modal emission rate models all indicate that engine power has a significant impact
on the acceleration and cruise emission rates. This observation reinforces the importance of
using load-based emission data to develop new emission models and simulate engine power in
real-world applications. All three models were shown to be robust by use of several statistical
measures. Although some departures from accepted norms were noted, these departures were
judged not so serious as to compromise the usefulness of the models. Hence, no remedial mea-
sures were taken.
13.2 Model Limitations
There are several limitations in the models estimated and presented in this work. Theo-
retically, the models cannot be used to forecast emissions beyond the domain of variables used
in estimating the models. These models were developed from 15 buses equipped with the same fuel
injection type, catalytic converter type, transmission type, and so on, so the models could not
consider the effect of variation in vehicle technologies on emissions. Another limitation is that
the models cannot account for the effect of emission control technology deterioration on emission
levels, since all buses were only 5 or 6 years old at the time testing was conducted. Although the
speed/acceleration profiles between the AATA data set and the Atlanta buses are similar, there is no way
to estimate the effect of changes in vehicle technologies and deterioration on emissions in the
current and future fleet in Atlanta. Such a limitation introduces obvious uncertainties in the use
of the model to make predictions for other fleets.
The predictive models are derived from a research effort conducted by other parties.
Modeling at this time cannot control for those variables for which data were not collected. This
inability to control the variables may yield several uncertainties in the models. First, important
or useful variables relevant to emission rates may not have been observed at all, so it may be
difficult to derive a model with sufficient explanatory power, or variables that are selected may
simply be correlated with the true causal variables that are affecting instantaneous emission
rates. Second, the interpretation of the effects of individual variables might be limited. For
example, the ability of negative load to explain the variability in emissions is limited because
negative loads were recorded as zero.
An additional limitation imposed by the data is the uncertainty introduced by the actual
data collection process. The uncertainty in the GPS position will introduce significant instan-
taneous error in grade computation (grade should be collected by means other than GPS). Al-
though filter limits were imposed on the rate of change of engine speed (RPM), fuel flow, and
vehicle speed data, data could yield unreasonable instantaneous vehicle acceleration or decelera-
tion rates, and still be within reasonable absolute limits. This uncertainty may bias predictions.
The possible presence of outliers has the potential to cause a misleading fit by dispropor-
tionately pulling the fitted regression line away from the majority of the data points (Neter et al.
1996). Cook's distance plots indicated that some points do have influence over the regression
fit. However, none of these points is indicative of obvious errors in data. It is difficult to
determine whether those extreme values were actually outliers or not. Because the data passed
through EPA's rigorous QA/QC procedures, no "true" outliers are assumed to exist, and these
high-emission events are assumed to be representative of events that occur in the real world.
Therefore, all of these data were retained in model development. When additional data become
available, researchers
should make it a priority to examine these high emissions events to identify the underlying causal
factors.
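For readers unfamiliar with the influence diagnostic mentioned above, the following Python sketch computes Cook's distance for a simple linear regression with one explanatory variable. It is a generic textbook illustration with made-up data, not a reproduction of the report's diagnostics, which were run on models with transformed responses; the function name is hypothetical.

```python
def cooks_distances(x, y):
    """Cook's distance for each point in a one-variable least-squares fit."""
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        # Leverage of the point: diagonal element of the hat matrix.
        h = 1.0 / n + (xi - x_bar) ** 2 / s_xx
        distances.append((e * e) / (p * mse) * h / (1.0 - h) ** 2)
    return distances

# Made-up example: the last point has high leverage and departs from the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 20.0]
d = cooks_distances(x, y)
print(d.index(max(d)))
```

A large Cook's distance flags a point whose removal would noticeably change the fitted line; as the text notes, such a point is not necessarily an error and may represent a real high-emission event.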
13.3 Lessons Learned
Because driving mode definitions varied across previous research efforts, findings from
these efforts are not directly comparable. This study independently developed driving mode defi-
nitions through comparison across critical values. Suitable modal activity definition can divide
the data into several homogeneous groups according to emission rates and driving conditions.
Unlike previous research efforts which only present pairwise comparisons of modal average es-
timates or HTBR regression tree analyses, this study compared distributions of engine operating
characteristics under proposed vehicle mode definitions by defining applicable vehicle modes.
A representative data set is the most critical issue for development of the final version of
the proposed model. This issue plays an important role no matter which modeling approach is
employed. The representative data set should reflect the real world with respect to vehicle emis-
sions and activity patterns. The data set used for the proposed model consists of EPA AATA data
and includes 15 buses. At the time this research was conducted, the AATA data were the only
applicable data set that contained all required data (second-by-second emission rates, engine load,
and applicable operating variables) collected in parallel. New data sets will improve model
performance in the future.
A combination of tree and OLS regression methods was used to estimate NOx, CO, and
HC emission models from EPA's database of transit buses tested by Sensors, Inc. The HTBR
technique was used as a tool to reveal underlying data structure and identify useful explanatory
variables and was demonstrated as a powerful tool that will allow researchers to deal with large
multivariate data sets with mixed mode (discrete and continuous) variables.
13.4 Contributions
This research verifies that vehicle emission rates are highly correlated with modal ve-
hicle activity. Furthermore, the relationship between engine power and emissions is also sig-
nificant and is quantified for the available data. Research results indicate that engine power is
more powerful than surrogate variables in predicting second-by-second grams/second emission
rates. Hence, to improve our understanding of emission rates, it is important to examine not only
vehicle operating modes, but also engine power distributions. Based upon the important role
of engine power in explaining the variability of emissions, it is critical to include the load data
measurement (and collection of all onroad operating parameters to estimate load, such as grade)
during the emission data collection procedure.
Another major contribution of the work is the establishment of a framework for emission
rate model development suitable for predicting emissions at the microscopic level. As more databases
become available, the model development steps can be re-run to develop a more robust load-based
modal emission model based on the same philosophy. This living modeling framework provides
the ability to integrate necessary vehicle activity data and emission rate algorithms to support
second-by-second and link-based emissions prediction. Combined with a GIS framework, models
derived through this methodology will improve spatial/temporal emissions modeling.
13.5 Recommendation for Further Studies
The methodology developed and applied in this research can, and should, be used to
estimate similar models for the on-road fleet consisting of transit buses and heavy-duty vehicles.
Since emissions of these vehicles are heavily dependent on vehicle dynamics (that is, load and
power), a successful validation will provide further evidence of the "correctness" of the method
employed here. When new data become available and these models are re-derived, modelers
can expect further improved performance in applications to the transit bus fleet and eventually to
other heavy-duty vehicle fleets.
Given the important role of engine power in explaining the variability of emissions, en-
gine load data should be measured during the emission data collection procedure and all param-
eters necessary to estimate onroad load (such as grade and vehicle payload) should be included in
the data collection efforts. Similarly, simulation of engine power demand for onroad operations
becomes important in the implementation of emission inventory modeling for heavy-duty transit
buses. Refinement of roadway characteristic data (grade, etc.) for urban areas is paramount and
research efforts that can quantify drive train inertial losses under various operating conditions
will help enhance modal model development.
Because all buses tested were of the same model with the same engine, the test data were
valuable from the perspective of controlling potential explanatory variables related to vehicle
characteristics. However, these data simultaneously constrain the ability to explain the effect of
vehicle technology groups and deterioration of emission control technologies on emissions data.
Expanded data collection efforts should focus on identification of appropriate vehicle technology
groups and high-emitting vehicle groups. In these test programs, it will also be important to test
buses under their real-world operating conditions (on a variety of routes, road types and grades,
onroad operating conditions, environmental conditions, passenger loadings, etc.) to better reflect
real world conditions. These high-resolution data collection efforts will provide the data needed
by modelers to develop new and enhanced modal emission rate models for a variety of heavy-
duty vehicle classes.
-------
14. REFERENCES
Ahanotu, D. (1999). Heavy-Duty Vehicle Weight and Horsepower Distributions: Measurement
of Class-Specific Temporal and Spatial Variability. School of Civil and Environmental
Engineering. Atlanta, GA, Georgia Institute of Technology. Ph.D. dissertation.
AMS. (2005). "A look at U.S. air pollution laws and their amendments." Retrieved July 30,
2005, from http://www.ametsoc.org/sloan/cleanair/cleanairlegisl.html
Avol, et al. (2001). "Respiratory effects of relocating to areas of differing air pollution levels."
Am. J. Respir. Crit. Care. Med. 164: 2067-2072.
Bachman, W. (1998). A GIS-Based Modal Model of Automobile Exhaust Emissions Final Re-
port. Atlanta, GA, Prepared by Georgia Institute of Technology for U.S. Environmental
Protection Agency. EPA-600/R-98-097.
Bachman, W., W. Sarasua, et al. (2000). "Modeling Regional Mobile Source Emissions in a GIS
Framework." Transportation Research C 8(1-6): 205-229.
Barth, M., F. An, et al. (1996). "Modal Emission Modeling: A Physical Approach." Transporta-
tion Research Record 1520: 81-88.
Barth, M., F. An, et al. (2000). "Comprehensive Modal Emissions Model (CMEM), Version 2.0
User's Guide." http://pah.cert.ucr.edu/cmem/cmem_users_guide.pdf. January 2000.
Barth, M., G. Scora, et al. (2004). A Modal Emission Model for Heavy Duty Diesel Vehicles.
Proceedings of the 83rd Transportation Research Board Annual Meeting Proceedings
(CD-ROM), Washington, DC.
Barth, M. and J. Norbeck (1997). NCHRP Project 25-11: The Development of a Comprehensive
Modal Emission Model. Proceedings of the 7th CRC On-Road Vehicle Emissions Work-
shop, Coordinating Research Council, Atlanta, GA.
Breiman, L., J. Friedman, et al. (1984). Classification and Regression Trees. Wadsworth
International Group, Belmont, CA.
-------
Brown, J. Edward, et al. (2001). "Heavy Duty Diesel Fine Particulate Matter Emissions:
Development and Application of On-Road Measurement Capabilities." Research Triangle Park,
NC, Prepared by ARCADIS Geraghty & Miller, Inc. for U.S. Environmental Protection
Agency. EPA-600/R-01-079.
Browning, L. (1998). Update of Heavy-Duty Engine Emission Conversion Factors — Analysis
of Fuel Economy, Non-Engine Fuel Economy Improvements and Fuel Densities, U.S.
Environmental Protection Agency.
CARB (1991). Modal Acceleration Testing. Mailout No. 91-12; Mobile Source Division; El
Monte, CA.
CARB (2002). "Heavy-Duty Diesels Compression Ignition Engine Emissions and Testing." Cali-
fornia Air Resources Board Emissions Inventory Series 1(10).
CARB (2004). "California's Air Quality History Key Events." California Air Resources Board.
Retrieved July 2, 2004, from http://www.arb.ca.gov/html/brochure/history.htm
CARB (2007). "EMFAC." California Air Resources Board. Retrieved July 20, 2007, from
http://www.arb.ca.gov/msei/onroad/latest_version.htm
Carlock, M. A. (1994). An Analysis of High Emitting Vehicles in the On-road Vehicle Fleet. Pro-
ceedings of the 87th Air and Waste Management Association Annual Meeting Proceeding
Pittsburgh, PA.
CEDF (2002). "Nitrogen Oxides: How NOx Emissions Affect Human Health and the Environ-
ment." Environmental Defense.
CFR (2007a). Calculations: exhaust emissions (40CFR86.1342-90). Code of Federal Regula-
tions. National Archives and Records Administration.
CFR (2007b). Urban Dynamometer Schedules (40CFR86. Appendix I). Code of Federal Regula-
tions. National Archives and Records Administration.
CFR (2004a). National Primary and Secondary Ambient Air Quality Standards (40CFR50). Code
of Federal Regulations. National Archives and Records Administration.
CFR (2004b). Gross Vehicle Weight Rating (40CFR86.1803). Code of Federal Regulations. Na-
tional Archives and Records Administration.
CFR (2004c). Useful Life (40CFR86.1805). Code of Federal Regulations. National Archives and
Records Administration.
-------
Chakravarti, Laha, and Roy (1967). Handbook of Methods of Applied Statistics, Volume I, John Wiley.
Clark, N. N., J. M. Kern, et al. (2002). "Factors Affecting Heavy-Duty Diesel Vehicle Emis-
sions." Journal of the Air & Waste Management Association 52: 84-94.
Clark, N. N., A. S. Khan, et al. (2005). Idle Emissions from Heavy-Duty Diesel Vehicles, Center
for Alternative Fuels, Engines, and Emissions (CAFEE), Department of Mechanical and
Aerospace Engineering, West Virginia University (WVU).
Conover, W. J. (1980). Practical Non-parametric Statistics, John Wiley and Sons; New York, NY.
Copt, S. and S. Heritier (2006). Robust MM-Estimation and Inference in Mixed Linear Models.
Department of Econometrics, Working Papers, University of Sydney.
Davis, W., K. Wark, et al., Eds. (1998). Air Pollution Its Origin and Control. 3rd Edition, 2003
Special Studies. Addison Wesley Longman, Inc. Menlo Park, California.
Denis, M. J. S., P. Cicero-Fernandez, et al. (1994). "Effects of In-Use Driving Conditions and Ve-
hicle/Engine Operating Parameters on "Off-Cycle" Events: Comparison with Federal Test
Procedure Conditions." Journal of the Air & Waste Management Association 44(1): 31-38.
DieselNet. (2006). "Heavy-Duty FTP Transient Cycle." Retrieved December 20, 2006, from
http://www.dieselnet.com/standards/cycles/ftp_trans.html
Dreher, D. and R. Harley (1998). "A Fuel-Based Inventory for Heavy-Duty Diesel Truck Emis-
sions." Journal of the Air & Waste Management Association 48: 352-358.
Easton, V. J. and J. H. McColl. (2005). "Statistics Glossary." Retrieved March 28, 2005, from
http://www.stats.gla.ac.uk/steps/glossary/index.html.
Ensfield, C. (2002). On-Road Emissions Testing of 18 Tier 1 Passenger Cars and 17 Diesel
Powered Public Transport Buses. Saline, Michigan, Sensors, Inc.
FCAP (2004). "Ambient Air Quality Trends: An Analysis of Data Collected by the U.S.
Environmental Protection Agency." Foundation for Clean Air Progress.
Feng, C., S. Yoon, et al. (2005). Data Needs for a Proposed Modal Heavy-Duty Diesel Vehicle
Emission Model. Proceedings of the 98th Air and Waste Management Association Annual
Meeting Proceeding (CD-ROM), Pittsburgh, PA.
Fomunung, I., S. Washington, et al. (1999). "A Statistical Model for Estimating Oxides of Nitrogen
Emissions from Light-Duty Motor Vehicles." Transportation Research D 4D(5): 333-352.
-------
Fomunung, I., S. Washington, et al. (2000). "Validation of the MEASURE Automobile
Emissions Model: A Statistical Analysis." Journal of Transportation Statistics 3(2): 65-84.
Fomunung, I. W. (2000). Predicting emissions rates for the Atlanta on-road light duty vehicular
fleet as a function of operating modes, control technologies, and engine characteristics.
Civil and Environmental Engineering. Atlanta, Georgia Institute of Technology. Ph.D.
dissertation.
Frey, H. C., A. Unal, et al. (2002). Recommended Strategy for On-Board Emission Data Analysis
and Collection for the New Generation Model. Raleigh, NC, Prepared by Computational
Laboratory for Energy, Air, and Risk, Department of Civil Engineering, North Carolina
State University, Prepared for Office of Transportation and Air Quality, U.S. Environmen-
tal Protection Agency, http://www.epa.gov/otaq/models/ngm/ncsu.pdf.
Frey, H. C. and J. Zheng (2001). Methods and Example Case Study for Analysis of Variability
and Uncertainty in Emissions Estimation (AUVEE). Research Triangle Park, NC, Pre-
pared by North Carolina State University for Office of Air Quality Planning and Stan-
dards, U.S. Environmental Protection Agency.
Gajendran, P. and N. N. Clark (2003). "Effect of Truck Operating Weight on Heavy-Duty Diesel
Emissions." Environmental Science and Technology 37: 4309-4317.
Gauderman, et al. (2002). "Association between air pollution and lung function growth in Southern
California children: Results from a second cohort." Am J Resp Crit Care Med 166(1): 74-84.
Gautam, M. and N. Clark (2003). Heavy-Duty Vehicle Chassis Dynamometer Testing for Emis-
sions Inventory, Air Quality Modeling, Source Apportionment and Air Toxics Emission
Inventory; Phase I Report. Coordinating Research Council, Project No. E-55/E-59.
Gillespie, T. (1992). Fundamentals of Vehicle Dynamics. Warrendale, PA, Society of Automotive
Engineers, Inc.
Granell, J. L., R. Guensler, et al. (2002). Using Locality-Specific Fleet Distributions in Emissions
Inventories: Current Practice, Problems, and Alternatives. Proceedings of the 81st Trans-
portation Research Board Annual Meeting (CD-ROM), Washington, DC.
Grant, C., R. Guensler, et al. (1996). Variability of Heavy-Duty Vehicle Operating Mode Fre-
quencies for Prediction of Mobile Emissions. Proceedings of the 89th Air and Waste
Management Association Annual Meeting Proceeding (CD-ROM), Pittsburgh, PA.
Guensler, R. (1993). "Data Needs for Evolving Motor Vehicle Emission Modeling Approaches."
In: Transportation Planning and Air Quality II, Paul Benson, Ed.; American Society of
Civil Engineers: New York, NY; 1993.
Guensler, R., S. Yoon, et al. (2005). Heavy-Duty Diesel Vehicle Modal Emissions Modeling
Framework. Regional Applied Research Effort (RARE) Project. Presented to U.S. Envi-
ronmental Protection Agency, Georgia Institute of Technology.
Guensler, R., S. Yoon, et al. (2006). Heavy-Duty Diesel Vehicle Modal Emissions Modeling
Framework. Regional Applied Research Effort (RARE) Project. Presented to U.S. Envi-
ronmental Protection Agency, Georgia Institute of Technology.
Guensler, R., D. Sperling, et al. (1991). Uncertainty in the Emission Inventory for Heavy-Duty
Diesel-Powered Trucks. Proceedings of the 84th Air and Waste Management Association
Annual Meeting Proceedings (CD-ROM), Pittsburgh, PA.
Guensler, R., S. Washington, et al. (1998). "Overview of MEASURE Modeling Framework."
Proc. Conf. Transport Plan Air Quality A: 51-70.
Hallmark, S. L. (1999). Analysis and Prediction of Individual Vehicle Activity for Microscopic
Traffic Modeling. Civil and Environmental Engineering. Atlanta, Georgia Institute of
Technology. Ph.D. dissertation.
Heywood, J. B. (1998). Internal Combustion Engine Fundamentals. New York, The McGraw Hill
Publishing Company.
HowStuffWorks (2005). Retrieved December 30, 2005, from http://www.howstuffworks.com
Jimenez-Palacios, J. (1999). Understanding and Quantifying Motor Vehicle Emissions with
Vehicle Specific Power and TILDAS Remote Sensing. Cambridge, MA, Massachusetts
Institute of Technology. Ph.D. dissertation.
Kelly, N. A. and P. J. Groblicki (1993). "Real-World Emissions from a Modern Production
Vehicle Driven in Los Angeles." Journal of the Air & Waste Management Association
43(10): 1351-1357.
Kittelson, D. B., D. F. Dolan, et al. (1978). "Diesel Exhaust Particle Size Distribution - Fuels and
Additive Effects." SAE Paper No. 780787.
Koehler, K. J. and K. Larnz (1980). "An empirical investigation of goodness-of-fit statistics for
sparse multinomials." Journal of the American Statistical Association 75: 336-344.
Koupal, J., M. Cumberworth, et al. (2002). Draft Design and Implementation Plan for EPA's
Multi-Scale Motor Vehicle and Equipment Emissions System (MOVES), U.S. Environmental
Protection Agency. EPA-420/P-02-006.
Koupal, J., N. E. Nam, et al. (2004). The MOVES Approach to Modal Emission Modeling. Pro-
ceedings of the 14th CRC On-Road Vehicle Emissions Workshop, Coordinating Research
Council, San Diego, CA.
Li, L. (2004). Calculating the Confidence Intervals Using Bootstrap, Department of Statistics,
University of Toronto, Presented for a project of Ontario Power Generation on October
28, 2004.
Lindhjem, C. and T. Jackson (1999). Update of Heavy-Duty Emission Levels (Model Years
1998-2004+) for Use in MOBILE6, U.S. Environmental Protection Agency.
Lloyd, A. C. and T. A. Cackette (2001). "Diesel Engines: Environmental Impact and Control."
Journal of the Air & Waste Management Association 51: 809-847.
MOBILE6 (2007). Accessed at http://www.epa.gov/otaq/m6.htm
Nam, E. K. (2003). Proof of Concept Investigation for the Physical Emission Rate Estimator
(PERE) to be Used in MOVES, Ford Research and Advanced Engineering.
Neter, J., M. H. Kutner, et al. (1996). Applied Linear Statistical Models, McGraw-Hill: Chicago IL.
Newton, K., W. Steeds, et al. (1996). The Motor Vehicle. Warrendale, PA, Society of Automotive
Engineers, Inc.
NRC (2000). Modeling Mobile-Source Emissions. Washington, D.C., National Academy Press,
National Research Council.
Peters, J. M., et al. (1999). "A study of twelve Southern California communities with differing
levels and types of air pollution. II. Effects on pulmonary function." Am. J. Respir. Crit.
Care Med. 159: 768-775.
Prucz, J. C., N. N. Clark, et al. (2001). "Exhaust Emissions from Engines of the Detroit Diesel
Corporation in Transit Buses: A Decade of Trends." Environmental Science and Technology
35: 1755-1764.
Ramamurthy, R. and N. Clark (1999). "Atmospheric Emissions Inventory Data for Heavy-Duty
Vehicles." Environmental Science and Technology 33: 55-62.
Ramamurthy, R., N. N. Clark, et al. (1998). "Models for Predicting Transient Heavy Duty Ve-
hicle Emissions." Society of Automotive Engineers SAE 982652.
Roess, R. P., E. S. Prassas, et al. (2004). Traffic Engineering, Pearson Education, Inc.
SCAQMD (2000). Multiple Air Toxics Exposure Study (MATES-II), South Coast Air Quality
Management District Governing Board.
Schlappi, M. G., R. G. Marshall, et al. (1993). "Truck travel in the San Francisco Bay Area."
Transportation Research Record 1383: 85-94.
Siegel, S. and N. Castellan (1988). Non-parametric Statistics for the Behavioural Sciences, 2nd
Edition. McGraw-Hill. January 1988.
Singer, B. C. and R. A. Harley (1996). "A fuel-based motor vehicle emission inventory." Journal
of the Air & Waste Management Association 46: 581-593.
StatsDirect. (2005). "Statistical Help." http://www.statsdirect.com/ Retrieved May 30, 2005.
TRB (1995). Expanding Metropolitan Highways: Implications for Air Quality and Energy Use.
Washington, DC, Transportation Research Board, National Academy Press.
U.S. EPA (1993). User's Guide to MOBILE5a. http://www.epa.gov/otaq/models/mobile5/mob5ug.pdf
U.S. EPA (1995). National Air Quality and Emission Trends Report 1995, Office of Air Quality
Planning and Standards, U.S. Environmental Protection Agency, http://www.epa.gov/air/
airtrends/aqtrnd95/report/
U.S. EPA (1997). Emissions Standards Reference Guide for Heavy-Duty and Nonroad Engines.
http://www.epa.gov/otaq/cert/hd-cert/stds-eng.pdf. EPA-420-F-97-014.
U.S. EPA (1998). Update of Fleet Characterization Data for Use in MOBILE6 - Final Report,
U.S. Environmental Protection Agency. EPA-420/P-98-016.
U.S. EPA (2001a). EPA's New Generation Mobile Source Emissions Model: Initial Proposal and
Issues, U.S. Environmental Protection Agency. EPA-420/R-01-007.
U.S. EPA (2001b). Update of Heavy-duty Emission Levels (Model Years 1988-2004) for Use in
MOBILE6, U.S. Environmental Protection Agency. EPA-420/R-99-010.
U.S. EPA (2001c). Heavy Duty Diesel Fine Particulate Matter Emissions: Development And
Application Of On-Road Measurement Capabilities. Research Triangle Park, NC. Prepared
by National Risk Management Research Laboratory for Office of Air Quality Planning
and Standards, U.S. Environmental Protection Agency. EPA-600/R-01-079.
U.S. EPA (2002a). "MOBILE6 Vehicle Emission Modeling Software." Retrieved July 20, 2007,
from http://www.epa.gov/otaq/m6.htm.
U.S. EPA (2002b). Update Heavy-Duty Engine Emission Conversion Factors for MOBILE6,
Analysis of Fuel Economy, Non-Engine Fuel Economy Improvements and Fuel Densi-
ties. EPA-420/P-98-014.
U.S. EPA (2002c). Methodology for Developing Modal Emission Rates for EPA's Multi-Scale
Motor Vehicle and Equipment Emission System. Raleigh, NC, Prepared by North Caro-
lina State University for Office of Transportation and Air Quality, U.S. Environmental
Protection Agency. EPA-420/R-01-027.
U.S. EPA (2002d). Update Heavy-duty Engine Emission Conversion Factors for MOBILE6:
Analysis of BSFCs and Calculation of Heavy-duty Engine Emission Conversion Factors.
EPA-420/P-98-015.
U.S. EPA (2003). National Air Quality and Emissions Trends Report, 2003 Special Studies Edi-
tion. Research Triangle Park, NC, Office of Air Quality and Standards, U.S. Environmen-
tal Protection Agency. EPA-454/R-03-005.
U.S. EPA (2004c). Technical Guidance on the Use of MOBILE6 for Emissions Inventory
Preparation. Publication No. EPA420-R-04-013. U.S. Environmental Protection Agency.
U.S. EPA (2005). "Fine Particle (PM2.5) Designations." Retrieved October 20, 2005, from
http://www.epa.gov/pmdesignations/index.htm.
U.S. EPA (2006). "National Ambient Air Quality Standards (NAAQS)." Retrieved October 30,
2006, from http://www.epa.gov/air/criteria.html.
Washington, S. (1994). Estimation of a vehicular carbon monoxide modal emissions model and
assessment of an intelligent transportation technology, University of California at Davis.
Ph.D. dissertation.
Washington, S., J. Leonard, et al. (1997b). Forecasting Vehicle Modes of Operation Needed as
Input to 'Modal' Emissions Models. Proceedings of the 4th International Scientific
Symposium on Transport and Air Pollution, Lyon, France.
Washington, S., L. F. Mannering, et al. (2003). Statistical and Econometric Methods for
Transportation Data Analysis, CRC Press LLC.
Washington, S., J. Wolf, et al. (1997a). "Binary Recursive Partitioning Method for Modeling Hot-
Stabilized Emissions from Motor Vehicles." Transportation Research Record 1587: 96-105.
Whitley, E. and J. Ball (2002). "Statistics review 6: Nonparametric methods." Critical Care 6:
509-513.
Wolf-Heinrich, H. (1998). Aerodynamics of Road Vehicles, Society of Automotive Engineers
Inc., USA.
Wolf, J., R. Guensler, et al. (1998). "High Emitting Vehicle Characterization Using Regression
Tree Analysis." Transportation Research Record 1641: 58-65.
Yoon, S. (2005c). A New Heavy-Duty Vehicle Visual Classification and Activity Estimation
Method For Regional Mobile Source Emissions Modeling. School of Civil and Environ-
mental Engineering. Atlanta, Georgia Institute of Technology. Ph.D. dissertation.
Yoon, S., H. Li, et al. (2005a). Transit Bus Engine Power Simulation: Comparison of Speed-
Acceleration-Road Grade Matrices to Second-by-Second Speed, Acceleration, and Road
Grade Data. Proceedings of the 98th Air and Waste Management Association Annual
Meeting Proceeding (CD-ROM), Pittsburgh, PA.
Yoon, S., H. Li, et al. (2005b). A Methodology for Developing Transit Bus Speed-Acceleration
Matrices to be used in Load-Based Mobile Source Emissions Models. Proceedings of the
84th Transportation Research Board Annual Meeting Proceedings (CD-ROM), Washing-
ton, DC.
Yoon, S., M. Rodgers, et al. (2004b). "Engine and Vehicle Characteristics of Heavy-Duty Diesel
Vehicles in the Development of Emissions Inventories: Model Year, Engine Horsepower
and Vehicle Weight." Transportation Research Record 1880: 99-107.
Yoon, S., P. Zhang, et al. (2004a). A Heavy-Duty Vehicle Visual Classification Scheme:
Heavy-Duty Vehicle Reclassification Method for Mobile Source Emissions Inventory
Development. Proceedings of the 97th Air and Waste Management Association Annual Meeting
Proceeding (CD-ROM), Pittsburgh, PA.
Younglove, T., G. Scora, et al. (2005). Designing On-road Vehicle Test Programs for Effective
Vehicle Emission Model Development. Proceedings of the 84th Transportation Research
Board Annual Meeting Proceedings (CD-ROM), Washington, DC.
Zeldovich, Y. B., P. Y. Sadovnikov, et al. (1947). "The oxidation of nitrogen in combustion and
explosions." Acta Physicochimica USSR 21(4): 577-628.