Automobile Exhaust Emission Modal Analysis Model Extension and Refinement

EPA-460/3-74-024
October 1974

by H. T. McAdams
Calspan Corporation
P.O. Box 235, 4455 Genesee Street
Buffalo, New York 14221

Contract No. 68-03-0435
EPA Project Officer: C. J. Domke

Prepared for
U.S. ENVIRONMENTAL PROTECTION AGENCY
Office of Air and Waste Management
Office of Mobile Source Air Pollution Control
Certification and Surveillance Division
Ann Arbor, Michigan 48105

This report is issued by the Environmental Protection Agency to report technical data of interest to a limited number of readers. Copies are available free of charge to Federal employees, current contractors and grantees, and nonprofit organizations, as supplies permit, from the Air Pollution Technical Information Center, Environmental Protection Agency, Research Triangle Park, North Carolina 27711; or, for a fee, from the National Technical Information Service, 5285 Port Royal Road, Springfield, Virginia 22161.

This report was furnished to the Environmental Protection Agency by Calspan Corporation, in fulfillment of Contract No. 68-03-0435. The contents of this report are reproduced herein as received from Calspan Corporation. The opinions, findings, and conclusions expressed are those of the author and not necessarily those of the Environmental Protection Agency. Mention of company or product names is not to be considered as an endorsement by the Environmental Protection Agency.

Publication No. EPA-460/3-74-024

ABSTRACT

This report on modal analysis of automobile emissions was prepared for the United States Environmental Protection Agency, Division of Certification and Surveillance, Ann Arbor, Michigan, under EPA Contract No. 68-03-0435. The work reported herein constitutes a refinement and extension of a modal analysis exhaust emission model originally developed under EPA Contract No. 68-01-0435. This earlier effort was released as EPA-460/3-74-005, "Automobile Exhaust Emission Modal Analysis Model".

The modal analysis exhaust emission model makes it possible to calculate the amounts of emission products emitted by individual vehicles or groups of vehicles over an arbitrary driving sequence. Refinements to the model permit an improvement in computational efficiency and a reduction in input data requirements. Extensions of the model include a scheme for computation of fuel usage in terms of CO2, CO and HC output by means of a carbon-balance approach and a procedure for more definitive assessment of the precision of the model in predicting group emissions.

ACKNOWLEDGEMENTS

Support and encouragement by the Environmental Protection Agency is gratefully acknowledged. Particular thanks go to C.J. Domke, M.E. Williams, and L.A. Platte for their guidance and suggestions.

A number of individuals at Calspan Corporation contributed significantly to the work. These included P.E. Yates, who performed much of the computer analysis, and A.C. Keller, who provided valuable suggestions and criticism. Finally, special recognition goes to Paul Kunselman, formerly of Calspan, who was a prime mover in the development of the original model and whose insight was instrumental in the follow-on refinements and improvements.

TABLE OF CONTENTS

1 INTRODUCTION
2 MODEL COMPUTATIONAL EFFICIENCY
  2.1 ESSENTIAL FEATURES OF THE BASIC MODEL
  2.2 SIMPLIFICATION OF EMISSION INTEGRATION OVER DRIVING SEQUENCES
  2.3 HYPSOMETRIC ANALYSIS OF DRIVING SEQUENCES
  2.4 VEHICLE AND DRIVING SEQUENCE AS VECTORS AFFECTING EMISSIONS
3 MODAL TESTING REQUIREMENTS
4 GROUP EMISSION PREDICTIONS
  4.1 MODEL PERFORMANCE FOR THE SDS BASE CASE
    4.1.1 Accuracy of Group Emission Prediction
    4.1.2 Precision of Group Emission Prediction
    4.1.3 Sampling Considerations
  4.2 MODEL PERFORMANCE FOR ARBITRARY DRIVING SEQUENCES
    4.2.1 Theoretical Background
    4.2.2 Variance Computations for Arbitrary Driving Sequences
5 PREDICTION OF FUEL ECONOMY
  5.1 PREDICTION OF CO2
  5.2 PREDICTION OF MILES PER GALLON
6 SUMMARY AND CONCLUSIONS
APPENDIX I - VARIANCE FUNCTIONS FOR REGRESSION ESTIMATES
APPENDIX II - COMPUTER PROGRAM REVISIONS FOR INCREASED COMPUTATIONAL EFFICIENCY

LIST OF FIGURES

1 Average Speeds and Accelerations for Accel/Decel and Steady State Modes
2 Acceleration Versus Speed, Mode 23
3 Acceleration Versus Speed, Mode 26
4 Acceleration Versus Speed, Composite for All Accel/Decel Modes
5 Speed and Acceleration Versus Time, Mode 23
6 Cumulative Averages for Speed and Acceleration, Mode 23
7 Speed/Acceleration Test Design Points
8 Normalized Variance Surface Based on 37 (v, a) Design Points
9 Normalized Variance Surface Based on 37 (v, a) Design Points
10 Normalized Variance Surface Based on 67 (v, a) t/2 Design Points
11 Normalized Variance Surface Based on 67 (v, a) t/2 Design Points
12 Normalized Variance Surface Based on 53 (v, a) t/2 Design Points
13 Normalized Variance Surface Based on 37 (v, a) t/2 Design Points
14 Arbitrary Driving Sequences
15 Mean Steady State CO2 Emission Rates Versus Speed
16 Distribution of CO2 Bag Error from the Surveillance Driving Sequence
17 Distribution of CO2 Bag Error (First 505 Seconds FTP)
18 Distribution of Error for Miles per Gallon Based on Bag Values from the Surveillance Driving Sequence
19 Distribution of Error for Miles per Gallon Based on Observed Bag Value Calculations (First 505 Seconds FTP)

LIST OF TABLES

1 Principal Components (7) of the Correlation Matrix for HC for 37 Modes
2 Ten Rotated Factors of the Correlation Matrix for HC for 37 Modes
3 Highly Loaded Modes by Factor Number
4 Bag Value Error Statistics - Surveillance Driving Sequence, 1020 Vehicles
5 Total Computed Variance of HC over Various Driving Sequences
6 Means and Variances of Calculated and Observed Bag Values
7 Comparative Statistics for CO2
8 Distribution of CO2 Bag Value Error (Observed - Calculated) from the First 505 Seconds of the Federal Test Procedure (FTP) and the Surveillance Driving Sequence (SDS)
9 Replicate Modal Analysis of CO2 for 61 Vehicles
10 Statistics for Miles/Gallon Error Based on Bag Values
11 Distribution of Miles/Gallon Error (Observed - Calculated) from the First 505 Seconds of the Federal Test Procedure (FTP) and the Surveillance Driving Sequence (SDS)

1. INTRODUCTION

Under U.S. Environmental Protection Agency Contract No. 68-01-0435, Calspan Corporation formulated a model for the prediction of motor vehicle exhaust emissions over an arbitrary driving sequence.* The work reported herein was performed under EPA Contract No. 68-03-0435 as a refinement and extension of the original model. Subsequent discussion will assume familiarity with the original model as presented in EPA-460/3-74-005, Automobile Exhaust Emission Modal Analysis Model (January 1974); however, wherever essential to understanding, details of the model will be repeated in the present report for the sake of clarity.
The impact of motor vehicle exhaust emissions on air quality in a given location depends on a number of factors: the emission characteristics of individual vehicles, the mix of vehicles of different types operating in the location, the numerical density of vehicles per mile or per unit of area, and the driving pattern in which the vehicles are employed. To assess the contribution of motor vehicles to air pollution, therefore, it is necessary to estimate traffic density, composition and flow characteristics, and to have some means for expressing these quantities in terms of pollution burden to the atmosphere.

The required traffic parameters can be estimated in a straightforward way. Traffic in the vicinity can be monitored and classified according to vehicle make, model, age, and other factors known to influence emissions. Moreover, speeds and accelerations prevailing along the traffic way in question can be measured and tabulated. Unless emissions can be expressed as functions of the applicable traffic parameters, however, it is not possible to assess vehicular contributions to air pollution.

* Paul Kunselman and H.T. McAdams, Automobile Exhaust Emission Modal Analysis Model, Calspan Report No. NA-5194-D-3 (July 1973). Paul Kunselman, H.T. McAdams, C.J. Domke, and Marcia Williams, Automobile Exhaust Emission Modal Analysis Model, Environmental Protection Agency Report No. EPA-460/3-74-005 (January 1974).

The emission tests used for certification of new light duty motor vehicles are based on a prescribed driving sequence by means of which vehicles can be compared according to a standard set of operating conditions. Though this concept of a standard driving sequence makes it possible to implement emission standards and to check compliance with these standards, the concept does not facilitate the prediction of vehicle emissions over an arbitrary driving sequence.
By breaking the standard sequence into segments (modes) having specified speeds and accelerations, however, and noting the emissions produced in each segment, it was postulated that these segments might be recombined appropriately to form other driving sequences of interest. Ultimately, it was hoped that this process might lead to a model for defining emissions as continuous functions of vehicle operating conditions and thus make it possible to approximate emissions over any driving sequence of interest.

As developed by Calspan under EPA Contract No. 68-01-0435, the original modal analysis prediction model was based on the concept of an instantaneous emission rate for each of the primary pollutants: carbon monoxide (CO), hydrocarbons (HC), and oxides of nitrogen (NOx). In this model, it was assumed that the instantaneous emission rate can be adequately defined as a function $\dot{e} = f(v, a)$ of instantaneous speed, $v$, and acceleration, $a$, for each vehicle. Since every point in time over a driving sequence has an associated instantaneous speed and acceleration, the total emission over the driving sequence can be obtained by appropriate integration of the emission rate function. Moreover, by virtue of the mathematical form of the model, it can be advantageously used to predict emissions from either homogeneous or nonhomogeneous groups of vehicles.

Initial experience with the modal analysis prediction model suggested that it be refined and extended with the following objectives in mind:

1) Investigate means to increase the computational efficiency of the model.
2) Determine whether modal testing requirements can be reduced without appreciable loss of information.
3) Define the accuracy and precision with which group emission predictions can be made from modal data.
4) Use the modal analysis approach to predict fuel economy over arbitrary driving sequences.

Each of these areas of investigation will be discussed in turn in subsequent sections of this report.
2. MODEL COMPUTATIONAL EFFICIENCY

Relative to the original formulation of the modal analysis emission model, a significant increase in computational efficiency can be achieved by a simplification of the method by which the instantaneous emission rate function, $\dot{e}(v, a)$, is integrated over a driving sequence. As background for this simplification, however, it will be instructive to review the essential features of the model.

2.1 ESSENTIAL FEATURES OF THE BASIC MODEL

Inputs to the model are based on the Surveillance Driving Sequence (SDS), in which emissions are measured over a variety of steady state and transient driving conditions. The acceleration and deceleration modes represented in the SDS consist of all possible combinations of the following five speeds: 0 mph, 15 mph, 30 mph, 45 mph, 60 mph. The average acceleration or deceleration rate observed for each mode in the Los Angeles basin is used during operation of 20 of the transient modes. In addition, 6 of the transient modes are repeated using accel/decel rates higher or lower than the average rate in order to determine the effect of accel/decel rate on emissions.

A difficulty presented by the use of the 37 discrete modes as inputs to a continuous driving sequence model is that, during much of the sequence, the vehicle may be operating at velocities and accelerations not included in the set of five steady state and 32 accel/decel modes. For example, a vehicle traveling at 23 mph is in neither the 15 mph nor the 30 mph steady state mode. To arrive at a continuous predictive model, one must be able to interpolate or otherwise estimate the appropriate emission rates for all combinations of speed and acceleration encountered in the driving sequence.

The primary feature of the model is a scheme whereby emissions from the 37 discrete modes can be expanded into a continuous function of time.
For this purpose, use is made of a regression function which can, for purposes of visualization, be represented as a "surface" in speed-acceleration space, as shown below.

[Figure: Emission Response Surface]

For any point $(v, a)$ in the speed-acceleration plane, there corresponds an instantaneous emission rate $\dot{e}(v, a)$. The surface can be represented by a mathematical equation of the form $\dot{e} = f(v, a)$, in which the function $f$ contains a number of adjustable constants. These constants can be selected to represent the emission characteristics of a particular automobile or can be selected to represent the mean emission characteristics of a collection of automobiles.

The mass of a particular pollutant emitted by an automobile is a cumulative, non-decreasing function of time, $e(t)$. The time derivative of this function yields the instantaneous emission rate as a function of time:

$$\dot{e}(t) = \frac{d[e(t)]}{dt} \tag{1}$$

In the modal analysis model, it is assumed that the instantaneous emission rate is a function of vehicle speed and acceleration, both of which are functions of time. Thus,

$$\dot{e}(t) = \dot{e}[v(t), a(t)] \tag{2}$$

and

$$e(T) = \int_0^T \dot{e}[v(t), a(t)]\,dt \tag{3}$$

gives the mass of pollutant given off by a vehicle in a driving sequence lasting $T$ seconds. Evaluation of the above integral requires (1) specifying the driving sequence in terms of $v(t)$ and $a(t)$, and (2) specifying the emission-rate function in terms of speed and acceleration.

In practice, a driving sequence is specified in terms of the speed prevailing at each of $n$ discrete, equally spaced points in time, as shown below.

[Figure: speed versus time, with speeds $v_1, v_2, v_3, v_4, \ldots$ specified at equally spaced times]

The integration of equation (3) is then approximated by the summation:

$$e(T) \approx \sum_{i=1}^{n-1} \dot{e}(v_i, a_i)\,\Delta t \tag{4}$$

where $v_i$ and $a_i$ are the speed and acceleration assigned to the $i$-th time increment, the acceleration being obtained from the difference between successive speeds divided by $\Delta t$, and $n\,\Delta t = T$.

The applicable emission-rate function is developed by application of a generalized version of multiple regression analysis.
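The summation in equation (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the speed assigned to each increment is taken as the mean of the two bounding speeds, the acceleration as their forward difference, and `example_rate` is a hypothetical emission-rate surface, not one fitted in this report.

```python
from typing import Callable, Sequence


def integrate_emission(speeds: Sequence[float], dt: float,
                       rate: Callable[[float, float], float]) -> float:
    """Approximate e(T) by summing rate(v_i, a_i) * dt over the time increments."""
    total = 0.0
    for v_now, v_next in zip(speeds, speeds[1:]):
        v_mid = 0.5 * (v_now + v_next)   # assumption: mean speed over the increment
        accel = (v_next - v_now) / dt    # forward-difference acceleration
        total += rate(v_mid, accel) * dt
    return total


def example_rate(v: float, a: float) -> float:
    """Hypothetical emission-rate surface, for illustration only."""
    return 0.01 + 0.002 * v + 0.005 * a * a


# Speeds (mph) at one-second spacing: a short acceleration followed by a cruise.
speeds_mph = [0.0, 5.0, 10.0, 15.0, 15.0, 15.0]
mass = integrate_emission(speeds_mph, 1.0, example_rate)
print(mass)
```

Any callable of speed and acceleration can be passed as `rate`, so the same routine serves whichever form of the emission-rate function is eventually adopted.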
As a starting point for development of the multiple regression equation for emission rate as a function of speed and acceleration, it will be instructive to consider first a steady-state emission rate function defined for constant speed (zero acceleration) only. It is assumed that this function can be expanded in the form

$$\dot{e}_S(v) = a_1 f_1(v) + a_2 f_2(v) + \cdots + a_k f_k(v) \tag{5}$$

where $a_1, a_2, \ldots, a_k$ are constants applicable to a specific automobile or group of automobiles, and $f_1(v), f_2(v), \ldots, f_k(v)$ are referred to as basis functions. It is emphasized that these functions can assume any form consistent with the data to be represented, the only requirement being that they be linearly independent and not contain any adjustable constants dependent on the data. The latter requirement assures that, for a given choice of basis functions, the function $\dot{e}_S(v)$ is completely defined by the model coefficients $a_1, a_2, \ldots, a_k$.

In a similar vein, an emission rate function $\dot{e}_A(v, a)$ can be postulated for non-steady-state operation in which $a \neq 0$. It is assumed that this function can be expanded as

$$\dot{e}_A(v, a) = b_1 g_1(v, a) + b_2 g_2(v, a) + \cdots + b_r g_r(v, a) \tag{6}$$

where $b_1, b_2, \ldots, b_r$ are constants applicable to a specific automobile or group of automobiles and the basis functions $g_1(v, a), g_2(v, a), \ldots, g_r(v, a)$ are, as before, linearly independent and free of any adjustable constants to be determined from the data.

As an extension of equations (5) and (6), it is logical to postulate that, by appropriate definition of basis functions, it should be possible to define an emission rate function $\dot{e}(v, a)$ applicable over the entire $(v, a)$-plane regardless of whether $a = 0$ or $a \neq 0$. Such a universally applicable equation might assume the form

$$\dot{e}(v, a) = c_1 u_1(v, a) + c_2 u_2(v, a) + \cdots + c_s u_s(v, a) \tag{7}$$

where $c_1, c_2, \ldots, c_s$ are constants applicable to a specified automobile or group of automobiles and the basis functions $u_1(v, a), u_2(v, a), \ldots, u_s(v, a)$ are linearly independent and contain no constants to be determined from the data.

In the original development of the modal analysis model, however, it was found advantageous to develop the instantaneous emission rate function $\dot{e}(v, a)$ as a composite function

$$\dot{e}(v, a) = h(a)\,\dot{e}_S(v) + [1 - h(a)]\,\dot{e}_A(v, a) \tag{8}$$

where $h(a)$ is a weighting function bounded in the interval $0 \le h(a) \le 1$.

As employed in the original form of the model, the function $h(a)$ was defined piecewise in terms of acceleration.

[Figure: piecewise definition and plot of $h(a)$ versus acceleration]

By specifying the two constants in this definition, the weightings of the two rate functions will vary between 0 and 1 in a continuous manner when the transition is made between accel/decel and steady state periods of driving.

Once sets of basis functions have been established for equations (5) and (6), the coefficients which define the instantaneous emission rate function could be determined by a straightforward application of least squares theory provided that instantaneous emission rates were known for a sufficient number of $(v, a)$-positions in the speed-acceleration plane. In reality, however, the data base for vehicle emissions does not contain any instantaneous emission rate observations for accel/decel modes; instead, the observations reported are the total amounts of pollutant collected over each mode, and it is possible to calculate only the average emission rate prevailing during the time in mode. In this connection, however, it can be shown that for a postulated form of the emission rate function, it is possible to deduce the applicable model coefficients from the modal average emission rates.
To illustrate this point, consider a situation in which the instantaneous emission rate can be adequately expressed as a linear combination of three basis functions $g_1(v, a)$, $g_2(v, a)$, and $g_3(v, a)$. Then,

$$\dot{e}(v, a) = b_1 g_1(v, a) + b_2 g_2(v, a) + b_3 g_3(v, a) \tag{9}$$

Consider a mode of time duration $T$. The average emission rate over time $T$ can be computed from equation (9) as

$$\langle \dot{e}(v, a) \rangle_T = \frac{1}{T} \int_0^T \dot{e}[v(t), a(t)]\,dt \tag{10}$$

and from the observed total emission over the mode as $\frac{1}{T}\,e(T)$, where $e(T)$ is the "bag value" for the mode and $v(t)$ and $a(t)$ are the speed vs time and acceleration vs time profiles for the mode in question. Then,

$$\langle \dot{e}(v, a) \rangle_T = \frac{1}{T}\,e(T) = \frac{1}{T} \int_0^T \dot{e}[v(t), a(t)]\,dt = \frac{1}{T} \int_0^T \left\{ b_1 g_1[v(t), a(t)] + b_2 g_2[v(t), a(t)] + b_3 g_3[v(t), a(t)] \right\} dt \tag{11}$$

Termwise integration and removal of the constants $b_1$, $b_2$ and $b_3$ from the integrand yields

$$\langle \dot{e}(v, a) \rangle_T = b_1 \left\{ \frac{1}{T} \int_0^T g_1[v(t), a(t)]\,dt \right\} + b_2 \left\{ \frac{1}{T} \int_0^T g_2[v(t), a(t)]\,dt \right\} + b_3 \left\{ \frac{1}{T} \int_0^T g_3[v(t), a(t)]\,dt \right\} \tag{12}$$

Note, however, that the bracketed expressions are just the time averages of the basis functions over the time duration of the mode. Thus, one can write

$$\langle \dot{e}(v, a) \rangle_T = b_1 \bar{g}_1 + b_2 \bar{g}_2 + b_3 \bar{g}_3$$

where $\bar{g}_1$, $\bar{g}_2$ and $\bar{g}_3$ are, respectively, the time averages for $g_1$, $g_2$ and $g_3$ over the mode in question. Since the total emissions for each mode are known, as well as the corresponding times in mode, the time averages $\bar{g}_1$, $\bar{g}_2$ and $\bar{g}_3$ can be computed for each mode and the coefficients $b_1$, $b_2$ and $b_3$ can be obtained through least squares regression analysis applied to the average emission rates as computed from modal bag values.

In the context of model refinement and extension, the modal analysis model as originally developed under EPA Contract No. 68-01-0435 should be viewed as a family or continuum of models. Though initial application of the modal analysis model concept employed a specific set of basis functions, the model in a broad sense is amenable to infinite variety in the choice of basis. Indeed, choice of basis may itself present an avenue for model simplification and for increased computational efficiency.
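The regression on modal averages described above can be sketched as follows. The three basis functions, the mode profiles, and the "true" coefficients are all hypothetical and serve only to show that coefficients can be recovered from bag values divided by time in mode; the least-squares step uses the normal equations with a small Gaussian-elimination solver, which is one convenient choice and not necessarily the procedure used in the original program.

```python
from typing import Callable, List, Sequence, Tuple

Basis = Callable[[float, float], float]
Profile = Sequence[Tuple[float, float]]  # (speed, acceleration) samples of one mode


def time_average(basis: Basis, profile: Profile) -> float:
    """Time average of a basis function over the equally spaced samples of a mode."""
    return sum(basis(v, a) for v, a in profile) / len(profile)


def solve_linear(A: List[List[float]], y: List[float]) -> List[float]:
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_coefficients(bases: Sequence[Basis], profiles: Sequence[Profile],
                     avg_rates: Sequence[float]) -> List[float]:
    """Least-squares coefficients from modal average emission rates."""
    # Design matrix: one row per mode, one column per time-averaged basis function.
    G = [[time_average(g, p) for g in bases] for p in profiles]
    r = len(bases)
    # Normal equations (G^T G) b = G^T y.
    GtG = [[sum(row[i] * row[j] for row in G) for j in range(r)] for i in range(r)]
    Gty = [sum(row[i] * y for row, y in zip(G, avg_rates)) for i in range(r)]
    return solve_linear(GtG, Gty)


# Hypothetical basis and synthetic modes sampled at one-second spacing.
bases = [lambda v, a: 1.0, lambda v, a: v, lambda v, a: a]
profiles = [
    [(10.0, 1.0), (11.0, 1.0), (12.0, 1.0)],
    [(30.0, -2.0), (28.0, -2.0)],
    [(45.0, 0.0), (45.0, 0.0)],
    [(20.0, 2.0), (22.0, 2.0), (24.0, 2.0)],
]
true_b = [0.5, 0.02, 0.3]
# Synthetic bag value = (average rate) x (time in mode).
bag_values = [len(p) * sum(b * time_average(g, p) for b, g in zip(true_b, bases))
              for p in profiles]
avg_rates = [bag / len(p) for bag, p in zip(bag_values, profiles)]
fitted_b = fit_coefficients(bases, profiles, avg_rates)
print(fitted_b)
```

With more modes than coefficients, as in the 37-mode data set, the same routine returns the least-squares compromise rather than an exact solution.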
Every attempt should be made to keep the number of basis functions to a minimum and to employ the simplest basis functions compatible with the data.

In this connection, it is of interest to review the reasoning by which the basis functions for the original model were derived. In the steady-state (zero acceleration) case, the emission rate is a function of speed only. For each of the three pollutants (CO, HC, NOx), steady-state emission rates were averaged over the 1020 vehicles constituting the data base, and these average emission rates were then plotted as a function of speed. These plots suggested that the steady state emission rate function $\dot{e}_S$ could be expressed as a quadratic function of speed:

$$\dot{e}_S(v) = s_1 + s_2 v + s_3 v^2 \tag{13}$$

where $s_1$, $s_2$ and $s_3$ are constants.

In the case of non-zero acceleration, it was assumed that the acceleration occurring at a given speed is a perturbation to the steady-state emission rate at this speed. This perturbation can be accounted for by expressing the coefficients $s_1$, $s_2$ and $s_3$ as functions of acceleration. If it is assumed that quadratic functions of acceleration represent good approximations to these coefficients, the coefficients can be expressed as follows:

$$\begin{aligned} s_1 &= s_1(a) = q_{11} + q_{12}\,a + q_{13}\,a^2 \\ s_2 &= s_2(a) = q_{21} + q_{22}\,a + q_{23}\,a^2 \\ s_3 &= s_3(a) = q_{31} + q_{32}\,a + q_{33}\,a^2 \end{aligned} \tag{14}$$

where the $q$'s are constants. The emission rate function used during times of non-zero acceleration, $\dot{e}_A$, can then be written in the form:

$$\dot{e}_A(v, a) = b_1 + b_2 v + b_3 a + b_4 a v + b_5 v^2 + b_6 a^2 + b_7 v^2 a + b_8 a^2 v + b_9 a^2 v^2 \tag{15}$$

where the $b$'s are constants and can be expressed in terms of the $q$'s. It is noted that if $a = 0$, equation (15) reduces to

$$\dot{e}_A(v, 0) = b_1 + b_2 v + b_5 v^2 \tag{16}$$

which has the identical form as the equation for $\dot{e}_S$. Thus, in principle, $\dot{e}_A$ could be used to determine emissions for both steady state and non-zero acceleration periods.
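The correspondence between the $q$ constants and the nine $b$ coefficients can be checked numerically. The sketch below assumes the constants are arranged as a 3 x 3 array in which row $i$ holds the constant, linear and quadratic acceleration terms of $s_i(a)$; that arrangement, the function names, and the numerical values are illustrative assumptions.

```python
from typing import List, Sequence


def rate_from_q(q: Sequence[Sequence[float]], v: float, a: float) -> float:
    """Quadratic in speed whose coefficients s1, s2, s3 are quadratics in acceleration."""
    s = [row[0] + row[1] * a + row[2] * a * a for row in q]
    return s[0] + s[1] * v + s[2] * v * v


def b_from_q(q: Sequence[Sequence[float]]) -> List[float]:
    """Collect like terms to obtain b1..b9 in the term order of equation (15)."""
    return [
        q[0][0],  # b1: constant
        q[1][0],  # b2: v
        q[0][1],  # b3: a
        q[1][1],  # b4: a v
        q[2][0],  # b5: v^2
        q[0][2],  # b6: a^2
        q[2][1],  # b7: v^2 a
        q[1][2],  # b8: a^2 v
        q[2][2],  # b9: a^2 v^2
    ]


def rate_from_b(b: Sequence[float], v: float, a: float) -> float:
    """Nine-term accel/decel emission-rate function of equation (15)."""
    terms = [1.0, v, a, a * v, v * v, a * a, v * v * a, a * a * v, a * a * v * v]
    return sum(bi * ti for bi, ti in zip(b, terms))


# Hypothetical constants, for illustration only.
q_example = [[0.10, 0.02, 0.004],
             [0.01, 0.003, 0.0005],
             [0.0002, 0.00004, 0.000006]]
b_example = b_from_q(q_example)
print(rate_from_q(q_example, 30.0, 1.5), rate_from_b(b_example, 30.0, 1.5))
```

Setting `a` to zero in `rate_from_b` leaves only the `b1`, `b2` and `b5` terms, which is the reduction stated in equation (16).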
As noted earlier in the discussion, however, it was found advantageous to express instantaneous emission rate as a composite function

$$\dot{e}(v, a) = h(a)\,\dot{e}_S(v) + [1 - h(a)]\,\dot{e}_A(v, a) \tag{17}$$

in which $\dot{e}_S(v)$ is determined independently of $\dot{e}_A(v, a)$. In this way, the model is provided with greater flexibility, especially in the vicinity of zero acceleration, since it has 12 rather than 9 adjustable coefficients for defining the instantaneous emission rates for each pollutant.

2.2 SIMPLIFICATION OF EMISSION INTEGRATION OVER DRIVING SEQUENCES

In the original version of the modal analysis model, computation of total emissions over a driving sequence of time-duration $T$ was achieved by performing the integration

$$e(T) = \int_0^T \dot{e}[v(t), a(t)]\,dt \tag{18}$$

for each vehicle or group of vehicles of interest. As will become apparent below, however, the integral can be reformulated in such a way that, for a particular driving sequence, a single integration suffices for all vehicles subjected to that sequence.

The composite emission function, as shown in equation (17), can be written in terms of the basis functions $f_i(v)$ and $g_j(v, a)$. Noting that

$$\dot{e}_S(v) = \sum_{i=1}^{k} a_i f_i(v) \tag{19}$$

and

$$\dot{e}_A(v, a) = \sum_{j=1}^{r} b_j g_j(v, a) \tag{20}$$

one can substitute (19) and (20) for $\dot{e}_S(v)$ and $\dot{e}_A(v, a)$ in (17) and integrate to obtain

$$e(T) = \int_0^T h[a(t)] \sum_{i=1}^{k} a_i f_i[v(t)]\,dt + \int_0^T \{1 - h[a(t)]\} \sum_{j=1}^{r} b_j g_j[v(t), a(t)]\,dt \tag{21}$$

In view of the fact that $a_i$ ($i = 1, 2, \ldots, k$) and $b_j$ ($j = 1, 2, \ldots, r$) are constants, (21) can be rewritten as

$$e(T) = \sum_{i=1}^{k} a_i \int_0^T h[a(t)]\, f_i[v(t)]\,dt + \sum_{j=1}^{r} b_j \int_0^T \{1 - h[a(t)]\}\, g_j[v(t), a(t)]\,dt \tag{22}$$

Note that (22) contains $k$ integrals of the form

$$\int_0^T h[a(t)]\, f_i[v(t)]\,dt, \quad i = 1, 2, \ldots, k \tag{23}$$

and $r$ integrals of the form

$$\int_0^T \{1 - h[a(t)]\}\, g_j[v(t), a(t)]\,dt, \quad j = 1, 2, \ldots, r \tag{24}$$

The integrands of (23) and (24) are just weighted forms of the basis functions and do not depend on the magnitudes of the coefficients $a_i$ and $b_j$.
Consequently, once these $k + r$ quantities have been computed for a given driving sequence, it is necessary to know only the applicable model coefficients $a_i$ and $b_j$ in order to compute emissions for a particular automobile or group of automobiles negotiating that driving sequence.

For the choice of basis functions employed in the original model, $k = 3$ and $r = 9$. Therefore, for each pollutant there are 12 integrals to be evaluated. These 12 quantities can be combined with the coefficients $a_i$ and $b_j$ to compute the mass of pollutant emitted by a particular automobile or group of automobiles in performing the specified driving sequence. Subroutine ESUM of the original model has been revised to integrate the weighted basis functions over a specified driving sequence and return the results to the main program, where the total emission is calculated. The revised versions of ESUM and the main program are given in Appendix II.

2.3 HYPSOMETRIC ANALYSIS OF DRIVING SEQUENCES

As shown above, computational efficiency of the model can be improved by noting that, for a given driving sequence, the integrated forms of the basis functions are invariant and do not need to be recalculated unless a different driving sequence is postulated. Further efforts to improve efficiency were aimed at a hypsometric characterization of driving sequences.

Hypsometry is a term used in geodesy to characterize the measurement of surface elevation. In particular, the hypsometric integral is a function used to quantify that fraction of a geographic area which exceeds a given threshold level, where the threshold level can be regarded as a continuous variable. As applied to the modal analysis model, the hypsometric integral would provide a characterization of the relative frequency of occurrence of various speed and acceleration levels.

In the original form of the model, the driving sequence is described by specifying the speed for each time increment (generally one second) in the sequence.
This specification, in turn, establishes the acceleration during each increment of time. It should be noted, however, that the computed contribution to emissions during a particular time increment depends only on the speed and acceleration prevailing during that time interval and is independent of the speed-time history of the vehicle. In short, a particular combination of speed and acceleration is regarded as making the same contribution to the pollutant output of a vehicle regardless of whether that speed-acceleration combination occurs early or late in the overall driving sequence. In view of this fact, it appeared feasible to describe the speed-time history of a driving sequence in terms of the joint frequency distribution of speed and acceleration.

It was further postulated that, for "typical" driving sequences (e.g., urban or rural), it might be possible to express the distribution functions in terms of a few adjustable parameters. For example, if speeds and accelerations for a particular sequence essentially were to obey a bivariate normal distribution, then specifying the means, variances and covariance of speed and acceleration would suffice to describe the distribution. A useful application of such parametric description of driving sequences might be in characterizing the various branches of a road network hypsometrically, so that pollution abatement studies aimed at optimizing routes in a network might be more amenable to analysis.

As far as computation of the total emissions $e(T)$ over a driving sequence of $T$ seconds is concerned, implications of the hypsometric analysis of speed and acceleration would be felt through the functions $v(t)$ and $a(t)$ in equation (22). In view of this fact, and in view of the readiness with which the weighted basis functions can be computed, no further development of the hypsometric description of driving sequences was pursued.
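The order-independence argument can be illustrated with a short sketch: summing the emission rate over the time-ordered samples gives the same total as summing over the joint frequency table of speed-acceleration pairs. The rate function and the sample values below are hypothetical, and exact tuple matching is used for binning purely for simplicity.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

Sample = Tuple[float, float]  # (speed, acceleration) for one time increment


def emission_by_time(samples: Sequence[Sample], dt: float,
                     rate: Callable[[float, float], float]) -> float:
    """Sum the emission rate over the samples in their time order."""
    return sum(rate(v, a) for v, a in samples) * dt


def emission_by_frequency(samples: Sequence[Sample], dt: float,
                          rate: Callable[[float, float], float]) -> float:
    """Sum over the joint frequency table of (speed, acceleration) pairs."""
    counts = Counter(samples)  # number of increments spent at each (v, a) pair
    return sum(n * rate(v, a) for (v, a), n in counts.items()) * dt


def example_rate(v: float, a: float) -> float:
    """Hypothetical emission-rate surface, for illustration only."""
    return 0.01 + 0.002 * v + 0.005 * a * a


route = [(0.0, 0.0), (5.0, 5.0), (10.0, 5.0), (15.0, 0.0), (15.0, 0.0),
         (10.0, -5.0), (5.0, -5.0), (0.0, 0.0)]
print(emission_by_time(route, 1.0, example_rate),
      emission_by_frequency(route, 1.0, example_rate))
```

Reordering `route` leaves both totals unchanged, which is the property that makes a frequency-distribution description of a driving sequence sufficient for the model.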
In reality, the values of the $k + r$ integrals in equations (23) and (24) constitute a complete description of the driving sequence so far as the model is concerned, and, within the limits of validity of the model, completely characterize the effect of the driving sequence or "route" on emissions. Similarly, the values of the $k$ coefficients $a_i$ ($i = 1, 2, \ldots, k$) and the $r$ coefficients $b_j$ ($j = 1, 2, \ldots, r$) completely characterize, again within the limits of model validity, the "vehicle effect" on emissions for that particular route.

2.4 VEHICLE AND DRIVING SEQUENCE AS VECTORS AFFECTING EMISSIONS

The relation between vehicle and driving sequence (route) effects on total emissions $e(T)$ can be expressed succinctly in vector notation. Let the values of the $k + r$ integrals in equations (23) and (24) be considered as components of a $(k + r)$-dimensional driving sequence vector

$$\underline{S} = \left( \int_0^T h f_1\,dt,\ \ldots,\ \int_0^T h f_k\,dt,\ \int_0^T (1 - h)\,g_1\,dt,\ \ldots,\ \int_0^T (1 - h)\,g_r\,dt \right) \tag{25}$$

Similarly, let the $k$ coefficients $a_i$ ($i = 1, 2, \ldots, k$) and the $r$ coefficients $b_j$ ($j = 1, 2, \ldots, r$) be considered as the components of a $(k + r)$-dimensional vehicle vector

$$\underline{V} = (a_1, a_2, \ldots, a_k, b_1, b_2, \ldots, b_r) \tag{26}$$

Then, the total emissions $e(T)$ for a particular vehicle operating according to a specified driving sequence can be expressed as the vector inner (dot) product

$$e(T) = \underline{S} \cdot \underline{V} \tag{27}$$

It is to be observed that when the driving sequence consists of a single mode, either steady-state or accel/decel, the dot product in equation (27) reduces to a computed estimate of the bag value for that mode. Also, it should be noted that the vehicle vector (26) can represent a group of vehicles rather than a single automobile.

The vector form of the model, as elucidated above, can be further systematized to consider the effects on emissions of various mixes of vehicles and various driving sequences (routes).
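The inner-product form of equation (27) lends itself to a compact sketch: the $k + r$ weighted basis sums are computed once per driving sequence and then combined with each vehicle's coefficients. The weighting function `h_placeholder`, the route samples and the vehicle coefficients below are hypothetical stand-ins and are not the definitions or values used in the report; the basis functions follow the quadratic and nine-term forms described earlier.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]  # (speed, acceleration) for one time increment


def sequence_vector(samples: Sequence[Sample], dt: float,
                    h: Callable[[float], float],
                    f_bases: Sequence[Callable[[float], float]],
                    g_bases: Sequence[Callable[[float, float], float]]) -> List[float]:
    """The k + r weighted basis integrals of (23) and (24), approximated by sums."""
    steady = [dt * sum(h(a) * f(v) for v, a in samples) for f in f_bases]
    transient = [dt * sum((1.0 - h(a)) * g(v, a) for v, a in samples) for g in g_bases]
    return steady + transient


def dot_emission(seq_vec: Sequence[float], vehicle_vec: Sequence[float]) -> float:
    """Total emission as the inner product of sequence and vehicle vectors."""
    return sum(s * c for s, c in zip(seq_vec, vehicle_vec))


def h_placeholder(a: float) -> float:
    """Hypothetical weighting in [0, 1]; not the report's definition of h(a)."""
    return max(0.0, 1.0 - abs(a))


f_bases = [lambda v: 1.0, lambda v: v, lambda v: v * v]
g_bases = [lambda v, a: 1.0, lambda v, a: v, lambda v, a: a,
           lambda v, a: a * v, lambda v, a: v * v, lambda v, a: a * a,
           lambda v, a: v * v * a, lambda v, a: a * a * v,
           lambda v, a: a * a * v * v]

route = [(10.0, 0.0), (12.0, 2.0), (14.0, 2.0), (15.0, 0.5), (15.0, 0.0)]
s_vec = sequence_vector(route, 1.0, h_placeholder, f_bases, g_bases)  # computed once

# Hypothetical vehicle vectors (a1..a3 followed by b1..b9).
vehicle_a = [0.1, 0.01, 0.0002, 0.12, 0.011, 0.03, 0.002, 0.0002, 0.01,
             0.00003, 0.0004, 0.000005]
vehicle_b = [0.2, 0.02, 0.0001, 0.25, 0.018, 0.05, 0.001, 0.0001, 0.02,
             0.00002, 0.0003, 0.000004]
emissions = [dot_emission(s_vec, veh) for veh in (vehicle_a, vehicle_b)]
print(emissions)
```

Stacking several sequence vectors and several vehicle vectors extends the same calculation to a table of emissions, one entry per route and vehicle mix.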
Let $\underline{s}_1, \underline{s}_2, \ldots, \underline{s}_p$ denote the sequence vectors for $p$ alternative driving sequences, and let $\underline{v}_1, \underline{v}_2, \ldots, \underline{v}_q$ denote the vehicle vectors for $q$ alternative mixes of vehicles. Then, if the vectors $\underline{s}_1, \underline{s}_2, \ldots, \underline{s}_p$ are considered as columns of a matrix $S$ and if the vectors $\underline{v}_1, \underline{v}_2, \ldots, \underline{v}_q$ are considered as columns of a matrix $V$, then one can write

$$E = S' V \tag{28}$$

where $S'$, the transpose of $S$, is a matrix of order $p \times (k + r)$, $V$ is a matrix of order $(k + r) \times q$, and $E$ is a matrix of order $p \times q$. The matrix $E$ consists of elements $e_{ij}$ ($i = 1, 2, \ldots, p$; $j = 1, 2, \ldots, q$) which provide estimates of the total emissions generated by the $j$th mix of vehicles operating according to the $i$th driving sequence.

3. MODAL TESTING REQUIREMENTS

As originally implemented, the modal analysis model employed three basis functions in the steady-state portion of the emission-rate equation and nine basis functions in the accel/decel portion of the equation. The fact that the resulting 12 regression coefficients are considerably fewer than the 37 modes used as data inputs to the model suggests that there is a certain amount of redundancy in the modal data. On the other hand, there are regions of the speed-acceleration plane not adequately represented by modal data, and this fact could occasion unwarranted imprecision in the performance of the model, particularly in those regions of the speed-acceleration plane where modes are sparse. A revised allocation of speeds and accelerations by modes, as well as a possible reduction in the number of modes, is therefore suggested, provided this reallocation and/or reduction does not adversely affect other aspects of the emission-measurement protocol.

Several techniques were employed to examine the implications of reallocation of modal test points in the speed-acceleration plane.
These included visual examination of the modal-distribution pattern, the computation of variance maps indicative of error propagation over the (v, a)-plane, and principal component analysis of modal contributions to the model performance.

Figure 1 is a plot of the average speeds and accelerations for the 32 acceleration/deceleration modes and the 5 steady-state modes which constitute inputs to the emission model as originally formulated. The sparse or empty regions of the speed-acceleration plane are clearly evident, particularly that portion of the plane between accelerations of -1 mph/sec and +1 mph/sec. As will become apparent later, the lack of information in this region of the plane tends to exaggerate the uncertainty of prediction in that region and is, at least in part, the reason that the steady-state and accel/decel portions of the model must be bridged in a rather arbitrary way in the original model.

[Figure 1. Average speeds and accelerations of the accel/decel and steady-state modes.]

Because a speed is specified for each second of time in each mode, it is possible to estimate the corresponding accelerations on a second-by-second basis and to plot acceleration versus speed profiles for each mode. Figure 2 is such a plot for mode 23, an acceleration mode, and Figure 3 is such a plot for mode 26, a deceleration mode. As is to be expected, these plots show that actual speeds and accelerations realized over short time increments in these modes span regions of the acceleration-speed plane not represented if only the modal averages are considered. This fact is made clear by Figure 4, which is a composite plot of second-by-second accelerations and speeds achieved when results of all 32 accel/decel modes are combined.
The plot suggests that many of the gaps shown in Figure 1 might be filled in if appropriate speeds and accelerations in Figure 4 can be regrouped and averaged to present a revised set of modes more advantageous as model inputs.

Consider the time plot of mode 23, as shown in Figure 5. A noticeable degree of asymmetry in this plot is evident. For example, the early part of the mode exhibits greater accelerations than the latter part of the mode, and this fact suggests that the mode might be divided into two parts so as to provide a model input to fill part of the gap presently existing in the low-acceleration region. A scheme for examining this concept is as follows. Compute average speed and acceleration for the first n seconds in mode and for the remaining N-n seconds in mode, where N is the total number of seconds in mode. Plot these two results as functions of n over the region $0 \le n \le N$, as shown in Figure 6. This plot provides a set of options for redefinition of the mode so as to span the speed-acceleration plane more adequately. By electing various options for redefinition of the modal inputs to the model, one can examine the consequences of this redefinition by means of the variance-function concept explained below.

[Figure 2. Acceleration versus Speed, Mode 23. Axes: acceleration (mph/sec) and speed (mph).]

[Figure 3. Acceleration versus Speed, Mode 26. Axes: acceleration (mph/sec) and speed (mph).]

[Figure 4. Acceleration versus Speed, Composite for All Accel/Decel Modes. Axes: acceleration (mph/sec) and speed (mph).]

[Figure 5. Speed and Acceleration versus Time (Mode 23). Horizontal axis: time (seconds).]

[Figure 6. Cumulative Averages for Speed and Acceleration. Curves for the first n points and for the last N-n points.]

3.2 VARIANCE-FUNCTION ASSESSMENT OF ERROR PROPAGATION

For a particular pollutant and for a particular vehicle or group of vehicles, the emission measured for each of the 5 steady-state and 32 accel/decel modes can be regarded as a random variable. In other words, the measurement of the modal bag values is subject to error, a fact well demonstrated by the inability to obtain the same emission mass measurements on repeated or replicate tests. Each measurement, therefore, can be regarded as being subject to a certain variance. This variance can be expected to propagate through the regression model to induce uncertainty in the emission estimates computed at every point in the (v, a)-plane. The magnitude of this uncertainty varies as a function of position in the (v, a)-plane and can thus be regarded as a variance function of speed and acceleration. Conceptually, this function can be viewed as a variance "surface" and can be graphically portrayed by means of variance contours.

The variance function can be computed if the basis functions of the emission-rate function are specified, if the locations of the modal input points in the (v, a)-plane are known, and if there is available an estimate of the error variance for each of the input-mode bag values.

The functional representation of the emission rate function used in the automobile emission model is given by the weighted composite of the accel/decel and steady-state instantaneous emission rate functions, $\dot{e}_A$ and $\dot{e}_S$ respectively:

$$\dot{e}(v, a) = \omega\, \dot{e}_S(v) + (1 - \omega)\, \dot{e}_A(v, a) \tag{29}$$

where $\omega$ is a weighting function dependent upon acceleration.

The accel/decel and steady-state instantaneous emission rate functions are expressed as linear combinations of basis functions of speed and acceleration. In general, the linear model which gives the true response, y, of a vehicle or group of vehicles is given by:

$$y = \beta_1 f_1 + \beta_2 f_2 + \beta_3 f_3 + \cdots + \varepsilon \tag{30}$$

where $\beta_i$, i = 1, 2, 3, ..., are constants, $f_1, f_2, f_3, \ldots$ are the basis functions of velocity and acceleration, and $\varepsilon$ is the random error. Since $\varepsilon$ is a random variable, the responses observed at each (v, a) point also constitute a random variable.
As a result, it is only possible to obtain from the observations an equation of the form:

$$\hat{y} = b_1 f_1 + b_2 f_2 + b_3 f_3 + \cdots \tag{31}$$

where $\hat{y}$ is an estimate of y and $b_i$, i = 1, 2, 3, ..., is an estimate of $\beta_i$. The estimated responses predicted by the model are currently based on measurements of bag values at 32 accel/decel and 5 steady-state average velocity/acceleration points of the Surveillance Driving Sequence.

A detailed explanation of the method of computing the variance function for regression estimates is given in Appendix II and will not be duplicated here. Suffice it to say that the variance function is controlled by three considerations:

1) The type of basis functions employed in the regression model,
2) The positions in the (v, a)-plane, called design points, at which modal emission measurements are taken, and
3) The magnitude of the error variance $\sigma^2$ at each design point.

For purposes of this analysis, $\sigma^2$ is regarded as constant over all design points.

The estimated emission response $\hat{y}$ as computed by the modal analysis model is a weighted combination of the steady-state estimate $\hat{y}_S$ and the accel/decel estimate $\hat{y}_A$:

$$\hat{y} = \omega\, \hat{y}_S + (1 - \omega)\, \hat{y}_A \tag{32}$$

Therefore, on the assumption that the errors involved in the two components of the estimate are statistically independent,

$$\mathrm{Var}(\hat{y}) = \omega^2\, \mathrm{Var}(\hat{y}_S) + (1 - \omega)^2\, \mathrm{Var}(\hat{y}_A) \tag{33}$$

In the following discussion, the variance function has been computed using this weighted combination of the steady-state and accel/decel portions.

$\mathrm{Var}(\hat{y})$ varies at different coordinates in (v, a)-space. At some points the response can be estimated with relatively little error; at other positions the error can be quite large. As shown in Appendix I, the variance in the estimated response at a point P in the (v, a)-plane is given by

$$\mathrm{Var}(\hat{y}) = \mathbf{x}\,(X'X)^{-1}\,\mathbf{x}'\,\sigma^2 \tag{34}$$

where $\mathbf{x}$ is a vector obtained by evaluating each of the basis functions at the particular point P and X'X is the matrix of the least-squares normal equations. Evaluated at every point P, (34) is therefore a variance function. By dividing both sides by $\sigma^2$, one can obtain the function in normalized form:

$$\mathrm{Var}(\hat{y})/\sigma^2 = \mathbf{x}\,(X'X)^{-1}\,\mathbf{x}' \tag{35}$$

This emission-rate variance function can be viewed as a response surface generated by evaluating the function at given increments over any region of interest.

The propagation of error over the (v, a)-space can be considered relative to the basis functions used and the design points chosen by examination of (35), the variance function in normalized form. The actual magnitude of the variance at any point can be examined by evaluating (34), which includes a scalar multiplication by the error variance $\sigma^2$.

The possibility of reducing or altering the set of modes without loss of information was to be investigated. To this end, the change in the variability of the emission rate function as a result of changing the modal design points was examined. Variance surfaces were generated using the normalized variance function so as to isolate the error introduced by changing the design points without introducing the actual error variance $\sigma^2$.

As a base for purposes of comparison, the variance surface using the average velocities and accelerations of the 32 accel/decel and 5 steady-state modes of the Surveillance Driving Sequence was generated. Figure 7 shows the locations of these initial design points as "dots" which have been labeled with their modal numbers. The resulting variances are contoured at various thresholds in Figures 8 and 9.
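The normalized variance function of equation (35) can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration: the three basis functions (1, v, a) are far simpler than the twelve used in the model, the four design points are invented, and the helper names `solve` and `normalized_variance` are hypothetical. The linear system is solved by Gauss-Jordan elimination so that only the Python standard library is needed.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular for this design and basis")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def normalized_variance(design_points, basis, point):
    """Equation (35): Var(y_hat) / sigma^2 = x (X'X)^-1 x' evaluated at `point`."""
    X = [[f(v, a) for f in basis] for v, a in design_points]
    p = len(basis)
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    x = [f(*point) for f in basis]
    z = solve(xtx, x)  # z = (X'X)^-1 x'
    return sum(xi * zi for xi, zi in zip(x, z))


# Hypothetical basis functions and design points (speed in mph, accel in mph/sec).
basis = [lambda v, a: 1.0, lambda v, a: v, lambda v, a: a]
design = [(20.0, -1.0), (20.0, 1.0), (40.0, -1.0), (40.0, 1.0)]

center = normalized_variance(design, basis, (30.0, 0.0))    # middle of the design
corner = normalized_variance(design, basis, (40.0, 1.0))    # at a design point
outside = normalized_variance(design, basis, (60.0, 0.0))   # extrapolation
```

For this toy design the normalized variance is 0.25 at the center, 0.75 at a design point, and 2.5 at the extrapolated point, which mirrors the behavior described in the text: the estimate is least certain where design points are sparse or absent.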
Although average velocities ranging from 0 mph to 60 mph and average accelerations ranging from approximately -3 mph/sec to +2.5 mph/sec are included, examination of Figure 7 reveals that the actual (v, a) points are irregularly located and do not appear to represent the entire region adequately. In particular, the region $-1.2 \le a \le 1.0$ is not well covered except for the steady-state modes. (It was due to this lack of information and the associated uncertainty involved in predicting emissions that the accel/decel and steady-state functions were weighted in the model.) Also not well represented are the regions in which velocity approaches 0 mph or 60 mph and the absolute value of the acceleration rate is large.

By dividing each of the accel/decel modes into subsets, it was possible to "fill in" regions which were poorly represented. The following strategies for decomposing the modes were investigated, where t is the mode duration, $v_1$ is the mode initial velocity, $v_2$ is the mode final velocity, and $\bar{v}$ is the mode average velocity.

1. 0 - t/2, t/2 - t
2. 0 - t/3, t/3 - t
3. 0 - 2t/3, 2t/3 - t
4. 0 - t/3, t/3 - t for decel modes; 0 - 2t/3, 2t/3 - t for accel modes
5. $v_1$ - $(v_1 + v_2)/2$, $(v_1 + v_2)/2$ - $v_2$
6. $v_1$ - $\bar{v}$, $\bar{v}$ - $v_2$
7. 0 - 4 sec., 4 sec. - t
8. 0 - (t-4) sec., (t-4) sec. - t

In each case, each segment was arbitrarily constrained to cover at least 4 seconds to allow for adequate data collection.
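The first strategy in the list above (a split at t/2) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually coded for the report: it assumes one speed sample per second starting at t = 0, takes the average acceleration of a segment as the speed change divided by the elapsed time, and lets the two segments share the sample at the split point. The function names and the speed schedule are hypothetical.

```python
def segment_average(speeds, start, end):
    """Average speed (mph) and acceleration (mph/sec) from second `start` to second `end`."""
    mean_speed = sum(speeds[start:end + 1]) / (end - start + 1)
    mean_accel = (speeds[end] - speeds[start]) / (end - start)
    return mean_speed, mean_accel


def split_mode(speeds, split, min_seconds=4):
    """Split one mode at `split` seconds into two (v, a) design points."""
    duration = len(speeds) - 1  # one sample per second, first sample at t = 0
    if split < min_seconds or duration - split < min_seconds:
        raise ValueError("each segment must cover at least %d seconds" % min_seconds)
    return (segment_average(speeds, 0, split),
            segment_average(speeds, split, duration))


# Hypothetical 12-second acceleration mode, with faster acceleration early on.
speeds = [0, 4, 8, 12, 16, 19, 22, 24, 26, 27, 28, 29, 30]
first, second = split_mode(speeds, split=6)  # strategy 1: split at t/2
```

With this schedule the first half averages about 11.6 mph at 3.67 mph/sec and the second half about 26.6 mph at 1.33 mph/sec, so a single mode yields one design point at high acceleration and another in the sparsely covered low-acceleration region.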
[Figure 7. Speed-Acceleration Test Design Points. Axes: acceleration (mph/sec) and speed (mph).]

[Figure 8. Normalized Variance Surface Based on 37 (v,a) Design Points. Contour symbols 1 through 8 denote thresholds 0.1000, 0.1500, 0.2000, 0.2500, 0.3000, 0.3500, 0.4000, and 0.4500. Axes: acceleration (mph/sec) and speed (mph).]

[Figure 9. Normalized Variance Surface Based on 37 (v,a) Design Points. Contour symbols A through H denote thresholds 0.4500, 0.5000, 0.7500, 1.0000, 2.5000, 5.0000, 7.5000, and 10.0000. Axes: acceleration (mph/sec) and speed (mph).]

The "stars" in Figure 7 are the design points resulting from using strategy #1, which simply divides each accel/decel mode into two subsets based on t/2; this procedure results in using 69 design points, 64 obtained from accel/decel modes and 5 from the steady-state modes. Figures 10 and 11 are threshold maps of the variance surface resulting from using these design points. The overall level of the variance is clearly lowered as a result.

In order to investigate the changes in the variance as a result of the reduction of the number of modes, a normalized variance surface was generated after certain modes had first been excluded. It was decided to drop 1/4 of the modes simply by excluding points in regions where there seemed to be (v, a) redundancy.
The modes excluded were 13, 22, 23, 25, 27, 28, 30, and 31. The variance map of the depleted design worsened, as expected. However, the t/2 expansion of the 24 modes used (53 design points, including 48 accel/decel and 5 steady-state) actually showed improvement over the full modal t/2 expansion in some regions. Figure 12 shows the resultant variance surface for thresholds less than 0.45. This surface is a definite improvement over using the initial modal (v, a) design points.

In a second strategy, one-half of the modes were dropped. Excluded were modes 4, 5, 6, 9, 13, 14, 15, 17, 19, 22, 23, 24, 27, 28, 30, and 31. The choice of design points in this instance was guided by the results of principal component analysis, to be discussed later in this report. Figure 13 shows the resultant variance surface using the t/2 expansion of the remaining 16 accel/decel modes and 5 steady-state modes for thresholds less than 0.45. This surface was generated using 37 points, as was the surface based on the original modal points. Comparison of Figures 8 and 13 clearly shows that an improvement in the normalized variability can be realized by appropriate choice of design points.
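The design-point counts quoted for the t/2 expansions can be checked with a line of arithmetic. The symbols n (number of design points) and m (number of accel/decel modes retained) are introduced here only for this check; the relation assumes, as the text states, that each retained accel/decel mode contributes two design points and each of the 5 steady-state modes contributes one.

```latex
\begin{align*}
n &= 2m + 5 \\
m = 32 &: \quad n = 64 + 5 = 69 \\
m = 24 &: \quad n = 48 + 5 = 53 \\
m = 16 &: \quad n = 32 + 5 = 37
\end{align*}
```

The last case has the same number of design points as the original 37 modal averages, which is what makes the comparison of Figures 8 and 13 a like-for-like one.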
[Figure 10. Normalized Variance Surface Based on 69 (v,a) t/2 Design Points. Contour symbols 1 through 8 denote thresholds 0.1000, 0.1500, 0.2000, 0.2500, 0.3000, 0.3500, 0.4000, and 0.4500. Axes: acceleration (mph/sec) and speed (mph).]

[Figure 11. Normalized Variance Surface Based on 69 (v,a) t/2 Design Points. Contour symbols A through H. Axes: acceleration (mph/sec) and speed (mph).]

[Figure 12. Normalized Variance Surface Based on 53 (v,a) t/2 Design Points. Contour symbols 1 through 8 denote thresholds 0.1000, 0.1500, 0.2000, 0.2500, 0.3000, 0.3500, 0.4000, and 0.4500.]

[Figure 13. Normalized Variance Surface Based on 37 (v,a) t/2 Design Points. Contour symbols 1 through 8 denote thresholds 0.1000, 0.1500, 0.2000, 0.2500, 0.3000, 0.3500, 0.4000, and 0.4500.]

3.3 FACTOR ANALYSIS OF MODAL DATA

The test data that comprise the input to the original modal emissions model are measurements of individual vehicle emissions given off in time periods called modes, during which the vehicle follows a given speed-time profile. In order to determine whether or not there is any degeneracy in the information being supplied by the various modes, these test data were examined using methods of factor analysis. Factor analysis is useful in analyzing the intercorrelations within a set of variables in order to identify fundamental and meaningful dimensions in the multivariate domain.
This "task of factor analysis is most frequently accomplished by first conducting a principal-components analysis and by then using the resulting principal factors as a set of reference axes for determining the simplest structure, or most easily interpretable set of factors for the domain in question."

Principal-components analysis is generally useful in determining the minimum number of independent dimensions needed to account for most of the variance in the original set of variables. In the present instance, this statement can be interpreted to mean that the variance among the 1020 vehicles in the data base, so far as emissions are concerned, can be explained by the car-to-car variability observed in the values of a certain number of linear combinations of the modal contributions. The number of these combinations required to account for some specified fraction of the total variance (say, 90%) is often referred to as the dimensionality of the space. The essential thrust of the analysis is to take cognizance of the fact that if two variables, such as two modal contributions to emissions, tend to vary in some related way as one goes from vehicle to vehicle, then there is essentially only one variable at work rather than the apparent two.

To achieve such insight, it is natural to examine the correlations among all pairs of modes for the 1020 vehicles in the data base. The result is a correlation matrix for each of the pollutants under consideration. The correlation matrices based on these 37 modes were determined for each of the three pollutants, HC, CO, and NOX (as well as for CO2, in connection with fuel-use studies to be discussed later in this report). These correlation matrices were then subjected to a principal-components analysis* in order to determine the eigenvalues ($\lambda$) and associated normalized eigenvectors ($\mathbf{v}$).

* Cooley, W. and Lohnes, P., Multivariate Data Analysis, Wiley, New York, 1971, p. 131.
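The notion of dimensionality described above can be sketched in a few lines. The sketch relies on one standard fact, that the eigenvalues of an n x n correlation matrix sum to n, so each eigenvalue divided by n is the fraction of total variance carried by its component. The function names and the five-element eigenvalue list are hypothetical; the only figure taken from the report is the first HC eigenvalue of 27.04 for 37 modes, listed in Table 1 as 73.08% of the variance.

```python
def percent_of_variance(eigenvalue, n_variables):
    """Percent of total variance for one component of an n x n correlation matrix."""
    return 100.0 * eigenvalue / n_variables


def components_needed(eigenvalues, fraction):
    """Smallest number of leading components that reaches `fraction` of the total."""
    total = sum(eigenvalues)
    running = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running >= fraction * total - 1e-12:  # small tolerance for rounding
            return count
    return len(eigenvalues)


# First HC eigenvalue from Table 1 of the report, 37 modal variables.
share = round(percent_of_variance(27.04, 37), 2)

# Hypothetical eigenvalues for a 5-variable problem (they sum to 5).
toy = [3.0, 1.0, 0.5, 0.3, 0.2]
dims_90 = components_needed(toy, 0.90)
dims_95 = components_needed(toy, 0.95)
```

Applied to the full set of 37 eigenvalues for a pollutant, `components_needed` would give the kind of count the report tabulates for the 90% and 95% levels.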
The factor coefficients or loadings were then derived by:

$$\mathbf{a}_j = \sqrt{\lambda_j}\;\mathbf{v}_j, \qquad j = 1, 2, \ldots, 37$$

where $\mathbf{a}_j$ and $\mathbf{v}_j$ are of dimension 37. The numbers of dimensions or modes needed to account for 90% and 95% of the variance for each pollutant are indicated below:

          90% of Variance    95% of Variance
HC               7                 13
CO               9                 15
CO2              9                 18
NOX             14                 21

For purposes of illustration, Table 1 gives the factor loadings for the first seven principal components derived from the correlation matrix for HC, together with their associated eigenvalues and the percent variance accounted for by these factors.

Besides using the principal-components solution to identify the dimensions of the domain, an attempt could be made to interpret the results. In general, the principal-components solution produces one general factor and p-1 bipolar factors (p is the number of common factors). The general factor is usually all positive (or negative) when the solution is based on a matrix of positive correlations. It could be argued that the first factor in Table 1 is perhaps a "speed" factor. The second factor is a bipolar factor and (except for the five steady-state modes) the modes of acceleration have negative loadings and those of deceleration have positive loadings; this factor could be considered to be an "accel/decel" factor.

Table 1
PRINCIPAL COMPONENTS (7) OF THE CORRELATION MATRIX FOR HC FOR 37 MODES

Columns: eigenvalue, % of variance, and loadings for variables (modes) 1 through 37. Only part of the table is legible; "?" marks an illegible digit.

First component (eigenvalue 27.04, 73.08% of variance), loadings for modes 1 through 27 in order:
-0.8762, -0.534?, -0.7764, -0.?313, -0.9188, -0.8445, -0.8?87, -0.8606, -0.?793,
-0.6226, -0.9240, -0.8980, -0.9282, -0.8564, -0.8939, -0.8081, -0.9365, -0.?762,
-0.9287, -0.8892, -0.8965, -0.8086, -0.?590, -0.9214, -0.7?96, -0.8185, -0.9048

A later component (eigenvalue 1.438, 3.89% of variance), loadings for modes 1 through 35 in order:
0.2557, 0.1431, 0.3774, 0.1968, 0.02738, -0.1831, -0.05325, -0.2348, -0.07?5?,
-0.1920, 0.01317, -0.2272, 0.05636, -0.2268, -0.007528, -0.2?00, 0.13?5, -0.1435,
0.09097, -0.1372, 0.05654, -0.3352, 0.2227, 0.01551, -0.4387, 0.07664, 0.05815,
-0.1197, 0.2264, -0.02214, -0.3183, 0.08707, 0.3972, 0.3077, 0.036??

In order to improve on the solution offered by the principal-components technique, factors were rotated to positions in which the factor pattern comes closer to criteria of simple structure. The purpose of analytic rotation schemes is to transform the principal components so as to obtain new variables which might be more readily interpreted and named. This rotation was performed on the matrix consisting of the 15 principal components of 37 variables or modes for each pollutant (15 factors accounted for at least 90% of the variance in the case of all pollutants). The "normal" varimax criterion was used for the orthogonal rotation of factors. This new set of rotated axes might be preferred for purposes of interpreting the basic dimensions of the domain measured by the 37 modes, because the new coefficients are more "simple" in the sense that a given variable tends to have a high coefficient for only one new axis and each factor has zero, or near zero, coefficients for at least some of the variables.*

Table 2 gives the derived rotated factors for the first 10 (HC) factors. The general factor has been destroyed and group factors have been produced. In the first factor, high negative weights are given to variables 5, 7, 9, 11, 13, 17, 19, 21, 24, 27, and 30. These modes, which are all highly correlated, are characterized by accelerations between 1 mph/sec and 2.5 mph/sec and by velocities ranging from about 28 mph to 53 mph. When variances based on the rate of emission (grams/sec) were calculated, these modes all showed relatively high variances.
These observations suggest that using any one of these modes could provide as much information as using all of them.

Table 3 gives the factor number for any mode which is weighted heavily in that factor. If more than one mode has high loadings within a factor, the factor number is listed for each mode, with the mode which is weighted most heavily being "starred." Examination of this table reveals that the eleven variables which had high loadings in the first factor for HC also have high loadings for the other two pollutants and for CO2. Again, this fact suggests that these modes provide redundant information. It should also be noted that for CO, CO2, and NOX, modes 6, 8, 10, 12, 14, 22, 25, and 28 all have high factor loadings in the second factor. These modes are characterized by average accelerations ranging from -1 mph/sec to -3 mph/sec and by average velocities from 24 mph to 47 mph. They also all have relatively low emission-rate variances. Most factors have high coefficients for only one mode. Some variables, such as 4, 15, and 29, have not been weighted heavily in any factor.

* Cooley and Lohnes, op. cit.

TABLE 2
TEN* ROTATED FACTORS OF THE CORRELATION MATRIX FOR HC FOR 37 MODES

                                FACTOR
    MODE     1     2     3     4     5     6     7     8     9    10
      1    -.68   .37  -.24   .20   .12   .08   .18  -.39   .00  -.10
      2    -.22   .24  -.22   .10   .91   .06   .10  -.06  -.01  -.02
      3    -.43   .49  -.24   .24   .12   .06   .11  -.61  -.01  -.02
      4    -.72   .39  -.30   .19   .14   .14   .14  -.21   .02  -.10
      5    -.82   .20  -.30   .13   .12   .20   .16  -.13  -.02  -.13
      6    -.47   .24  -.60   .14   .11   .18   .18  -.07  -.02  -.49
      7    -.86   .17  -.32   .05   .09   .10   .16  -.03  -.12  -.01
      8    -.65   .12  -.49   .06   .11   .17   .21  -.10  -.06  -.12
      9    -.83   .12  -.32   .08   .09   .16   .17  -.06  -.15  -.03
     10    -.37   .30  -.67   .13   .13   .09   .25  -.10  -.01  -.01
     11    -.87   .23  -.31   .10   .10   .08   .18  -.05  -.06  -.00
     12    -.47   .31  -.69   .10   .14   .09   .33  -.08  -.02   .06
     13    -.86   .23  -.29   .13   .10   .11   .16  -.12  -.07  -.03
     14    -.45   .25  -.74   .11   .12   .10   .10   .23  -.10  -.07
     15    -.42   .39  -.58   .27   .15   .13   .18  -.14  -.15  -.15
     16    -.31   .62  -.46   .24   .13   .05   .16  -.15  -.00   .04
     17    -.83   .28  -.29   .18   .11   .09   .16  -.19   .02  -.06
     18    -.40   .34  -.73   .19   .13   .11   .20  -.07   .03  -.15
     19    -.81   .27  -.33   .16   .11   .12   .16  -.13   .03  -.04
     20    -.39   .39  -.71   .16   .15   .13   .23  -.05   .06  -.05
     21    -.85   .24  -.29   .13   .08   .06   .18  -.07   .02   .00
     22    -.36   .23  -.47   .18   .10   .08   .72  -.09   .00   .00
     23    -.63   .27  -.21   .59   .10   .08   .18  -.19   .12  -.03
     24    -.85   .18  -.27   .29   .09   .11   .20  -.01   .03  -.06
     25    -.38   .12  -.41   .14   .08   .09   .77  -.04  -.02  -.08
     26    -.35   .39  -.43   .58   .13   .08   .21  -.00  -.28  -.06
     27    -.82   .16  -.23   .40   .08   .07   .21  -.06   .04  -.03
     28    -.40   .34  -.56   .39   .12   .12   .38  -.10   .02   .05
     29    -.65   .26  -.22   .59   .11   .07   .16  -.20   .06  -.04
     30    -.83   .13  -.23   .24   .07   .07   .17   .05   .15  -.00
     31    -.48   .14  -.58   .24   .08   .14   .39  -.05   .00  -.07
     32    -.31   .47  -.52   .46   .11   .07   .19  -.12  -.04  -.05
     33    -.16   .88  -.28   .15   .14   .05   .11  -.12  -.03  -.07
     34    -.37   .80  -.30   .11   .13   .14   .14  -.04   .02  -.01
     35    -.57   .49  -.31   .09   .15   .36   .16  -.01  -.07  -.07
     36    -.58   .19  -.27   .11   .08   .70   .14  -.05   .03  -.04
     37    -.64   .23  -.26   .07   .10   .41   .36  -.07  -.23  -.10

    * Rotation done on 15 factors.

Table 3
Highly Loaded Modes by Factor Number

    MODE (1 through 37)    FACTOR NUMBER:  HC   CO   CO2   NOX

    5 8 1 10 1 11 1 12. 3 1 3 1 3* 13 1 3 1 3 1 4 1 7 9. 4 1 4 1 15 2 2 14 6 9
    12 11 6 12. 1 2 9. 1 14. 2 1 2 1* 13, 2 1 2 5 1 10 1 2, 15 1 2 1 15. 2 7 1 2 1 2 8 3* 3 4* 4
    14 15 10 1 1 2 1 12, 2 1 2 1 2 1 2 7, 1 11 1 4 1 2 1 13. 2 4 1* 2 1 9. 2 8 S 6 3 3 15 11 3 12. 1 2 1 13. 2 1 2 1 2 5 1 1 8. 1 2 1 2 1 7, 2 1* 2 7* 4 9 6 14

The results of the principal-components analysis indicate that, for all pollutants, 14 modes are sufficient to account for 90% of the total variance. These 14 modes are not the same for all of the pollutants.
However, eleven of the modes seem to provide the same information for all pollutants, and eight modes provide the same information for three of the pollutants. In conclusion, it appears that test procedures could be modified so as to avoid running a vehicle through all 37 of the defined modes and still obtain the same amount of information about its emission response.

4. GROUP EMISSION PREDICTIONS

Individual vehicles represent a wide variation in model year, make, model, engine and drive train equipment, accumulated mileage, state of maintenance, attached pollution abatement devices, and geographic location. Because it is a mix of these diverse vehicles which determines the vehicular contribution to air pollution in a given vicinity, it is appropriate to aggregate vehicles into groups and to view the group as a composite emission source for various purposes of analysis. Accordingly, considerable interest centers on the accuracy and precision with which the modal analysis emission model can predict group emissions in a given driving sequence.

The characterization of a group of vehicles can be achieved by defining the emission rate function for the average vehicle within the group. Let

    $b_{ijk}$ = k'th coefficient in the emission rate function for the j'th vehicle within the group and i'th kind of pollutant.
    $N_g$ = number of vehicles in the group.
    $\bar{b}_{ik}$ = k'th coefficient in the emission rate function describing the average vehicle's i'th kind of pollutant response.

Then

$$ \bar{b}_{ik} = \frac{1}{N_g} \sum_{j=1}^{N_g} b_{ijk} $$

Thus, the group emission rate functions are determined by averaging the coefficients which make up the emission rate functions of each vehicle in the group. In this way, the group is viewed as consisting of $N_g$ "average" vehicles, each having identical emission characteristics.
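The coefficient averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not the report's program: the function names, the two-coefficient vehicles, and the `integrals` vector (standing in for basis functions already integrated over a driving sequence) are all hypothetical. It relies only on the property, stated in the report, that total emission is linear in the coefficients, so that N_g times the average vehicle's response equals the sum of the individual responses.

```python
from statistics import fmean


def average_coefficients(vehicle_coeffs):
    """Average each coefficient position across all vehicles in a group."""
    if not vehicle_coeffs:
        raise ValueError("group must contain at least one vehicle")
    width = len(vehicle_coeffs[0])
    if any(len(coeffs) != width for coeffs in vehicle_coeffs):
        raise ValueError("every vehicle needs the same number of coefficients")
    # zip(*...) regroups the data coefficient by coefficient instead of vehicle by vehicle.
    return [fmean(column) for column in zip(*vehicle_coeffs)]


def emission(coeffs, integrals):
    """Total emission as the inner product of coefficients and integrated basis functions."""
    return sum(b * c for b, c in zip(coeffs, integrals))


# Hypothetical group: three vehicles, two coefficients each (illustrative numbers only).
group = [[1.0, 0.2], [3.0, 0.4], [2.0, 0.6]]
# Hypothetical integrated basis functions for one driving sequence.
integrals = [10.0, 5.0]

group_average = average_coefficients(group)
group_total = len(group) * emission(group_average, integrals)
sum_of_vehicles = sum(emission(vehicle, integrals) for vehicle in group)
```

Under these assumptions `group_total` and `sum_of_vehicles` agree to rounding error, which is the sense in which the group can be treated as N_g identical "average" vehicles.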
The emission response of the group over any driving sequence can accordingly be determined by multiplying the response of this average vehicle by the number of vehicles in the group. Note that, once the emission response of the average vehicle has been characterized in terms of average regression coefficients, its total emission over any specified driving sequence can be obtained by appropriate integration of the emission rate function in exactly the same manner as for any other vehicle.

As was shown in Section 3.2 of this report, error propagation in the modal analysis emission model causes the emissions estimated for some regions of the (v, a)-plane to have lower variance than for certain other regions of the plane. A consequence of this fact is that the estimation capability of the model over an arbitrary driving sequence will depend on the relative amounts of time which that sequence devotes to regions of high or low variance. This is true both for individual vehicles and for groups of vehicles.

Our approach to an evaluation of the model for group emission prediction was as follows. First, a study was made of the extent of agreement between observed and computed emissions for the Surveillance Driving Sequence (SDS). Then, with this comparison in view as a "base case," a procedure was developed for relating the base case to arbitrary driving sequences which, as a result of differences in their distribution of velocities and accelerations, exhibit different degrees of variance in the emissions computed by the model.

4.1 MODEL PERFORMANCE FOR THE SDS BASE CASE

Two general questions are of interest in connection with the prediction of group emissions: accuracy and precision. Lack of accuracy is reflected as a bias or systematic error in the predicted results.
Lack of precision is the consequence of random errors in the prediction and is manifested in terms of variance in the predicted group emissions under repeated sampling and testing of the group. These two aspects of the group prediction question will be addressed below in connection with the performance of the model for the Surveillance Driving Sequence.

4.1.1 Accuracy of Group Emission Prediction

Because the "true" or population value for the mass of a pollutant emitted during a particular driving cycle can never be known, the question of accuracy can be resolved only in a relative sense. One possible approach to evaluating the accuracy of model prediction is to compare, for a particular driving sequence, bag values as computed by the model and bag values as actually observed in test.

This approach was employed in the initial implementation of the original model by comparing computed and observed bag values for the Surveillance Driving Sequence. These results were originally presented in Calspan Report No. NA-5194-D-3 and in EPA Report No. EPA-460/3-74-005. Relevant portions of these results are repeated herein as Table 4 for purposes of reference, because it is here proposed to view these results in a new light.

Table 4
BAG VALUE ERROR STATISTICS
SURVEILLANCE DRIVING SEQUENCE
1020 VEHICLES

    POLLUTANT                                  HC       CO       NOX
    OBSERVED BAG VALUE (gms), $\bar{O}$        53.5     625.0    48.2
    MEAN ERROR (gms), $\bar{R}$
    VARIANCE (gms)^2

experimenter can observe the difference in degree of corrosion for each pair and, as far as overall generalization is concerned, can circumvent the variability introduced by inhomogeneity of exposure conditions. If he wants to group the paired samples into classes according to soil type (clay, loam, cinders), he can restrict his inferences to these strata, again with the advantage of balanced comparisons within the strata.

It is proposed to examine the performance of the modal analysis emission model in this vein.
In this analysis, individual vehicles will play the role of exposure conditions, and homogeneous classes of vehicles will play the role of soil strata.

First, let us examine the hypothesis that there is no significant difference between the mean bag value as observed and as computed; that is, let us examine the hypothesis

$$ H_0:\ \mu_R = 0 $$

Because of the large sample size, we can use the u-test to test the hypothesis. The standard error of $\bar{R}$ is

$$ \sigma_{\bar{R}} = \frac{\sigma_R}{\sqrt{N}} $$

and u is defined as

$$ u = \frac{\bar{R}}{\sigma_{\bar{R}}} $$

As shown in Table 4, the hypothesis is rejected at the 0.01 level for all three pollutants. In this connection, however, a word of warning is in order. By pooling a sufficient quantity of data, it is possible to label as statistically significant an effect which may be of negligible engineering magnitude. More germane is the consideration that if the difference between two means is no greater than, say, 10% of their pooled mean, it may be of small consequence that this difference is declared to be statistically significant. The importance of the effect depends on its probable magnitude, and the mere act of declaring it to be statistically significant in no way augments its practical magnitude.

4.1.2 Precision of Group Emission Prediction

For each of the pollutants HC, CO and NOX, the relative importance of statistical and practical views of model performance can be considered in terms of confidence intervals. Let $\mu_R$ denote the expected or population mean value of the difference between calculated and observed bag values for a pollutant. The width of a confidence interval for $\mu_R$ depends on the dispersion of estimates for individual vehicles comprising the group and on the "size" of the confidence interval. In statistical terminology, the term "size" denotes the probability with which it can be asserted that the population mean falls between two prescribed values. In the following discussion, we shall assume a confidence interval of size 0.95 (95% confidence).
For 95% confidence, the half-width of the confidence interval is approximately $2\sigma_{\bar{R}}$ (more exactly, $1.96\,\sigma_{\bar{R}}$), and the confidence intervals for the three pollutants are approximately as follows:

    HC      $6.4 \le \mu_R \le 8.0$
    CO      $34 \le \mu_R \le 52$
    NOX     $-3.5 \le \mu_R \le -1.9$

Expressing the extreme limit of each interval as a percent of $\bar{O}$, the mean observed value for the pollutant in question, one obtains 15% for HC, 10% for CO, and 7.3% for NOX.

In short, for a group of 1020 highly heterogeneous vehicles, the bias for HC would not be expected to be greater than 15% of the mean value as actually observed by direct measurement of these 1020 vehicles. Similar figures of 10% and 7.3% apply for CO and NOX.

4.1.3 Sampling Considerations

It is evident that the dispersion of emissions for individual vehicles within a group depends on the degree of homogeneity of the group as far as such determinants as make, model, mileage, state of maintenance, and other factors are concerned. Because of this fact, the standard errors applicable to the mean emissions computed for the three pollutants for the group also depend on the homogeneity of the group. Consequently, the performance of the model in the estimation of group emissions depends strongly on sampling protocol.

Consider, for example, a population of vehicles having a certain mix of vehicle "types," as specified by make, model, mileage, and other factors which can be rationally employed to differentiate one vehicle from another. A random sample of N vehicles would produce a certain mix of vehicle types within the sample, not necessarily the mix existing in the population. A second sample would most likely produce a different mix of vehicle types and certainly a different set of vehicles than the first sample.
One sees, therefore, the influence of two sources of variability as far as the predictions of the model are concerned: vehicle-to-vehicle variability within types and variability in the proportionate weighting of types. The result is that a confidence interval based on a random sample from a nonhomogeneous population of vehicles can be expected to be considerably wider than for a case in which some of the sources of variability are controlled.

In this connection, consider the case in which stratified sampling is used to select N vehicles from the population. This procedure is a quite logical one in emission assessment, because it assures that the sample will contain the same relative proportions of different types of vehicles as does the population. Random sampling is then performed within each stratum to obtain the desired number of vehicles. In this type of sampling, vehicle-to-vehicle variation will be present, but the variation in proportions of the various strata will have been eliminated.

In conclusion, it is not possible to make overall generalizations about the ability of the model to estimate group emissions unless the nature of the group and the method by which it is sampled are taken into account.

4.2 MODEL PERFORMANCE FOR ARBITRARY DRIVING SEQUENCES

As noted in Section 4.1 of this report, the performance of the modal analysis emission model can be evaluated for the Surveillance Driving Sequence by direct comparison with observed results. No driving sequence other than the SDS and the FTP permits such a comparison, because bag values are not available for other sequences. To obtain such a comparison for an arbitrary driving sequence, it would be necessary to perform emission tests over that driving sequence as a "validation" of the model. It is possible, however, to compute, for an arbitrary driving sequence, the mean emissions for a group of vehicles and the variance of the emissions exhibited by individual vehicles comprising the group.
Thus it is possible to evaluate the precision of model performance for the group for an arbitrary driving sequence, but its accuracy must be judged according to results of the SDS base case.

4.2.1 Theoretical Background

The essence of the approach to precision analysis for the performance of the model in an arbitrary driving sequence resides in the simplification of the emission integration as detailed in Section 2.2 of this report. As background for this approach, however, it will be informative to review the underlying statistical theory.

Consider a set of random variables $X_1, X_2, \ldots, X_p$ and a linear combination of these variables

$$ Y = c_1 X_1 + c_2 X_2 + \cdots + c_p X_p $$

where $c_1, c_2, \ldots, c_p$ are constants. The variance of the random variable Y can be computed as

$$ \mathrm{Var}\,Y = \sum_{i=1}^{p} c_i^2\,\mathrm{Var}\,X_i + 2 \sum_{i<j} c_i c_j\,\mathrm{Cov}(X_i, X_j) $$

For example, for three variables,

$$ \mathrm{Var}\,Y = c_1^2\,\mathrm{Var}\,X_1 + c_2^2\,\mathrm{Var}\,X_2 + c_3^2\,\mathrm{Var}\,X_3 + 2 c_1 c_2\,\mathrm{Cov}(X_1, X_2) + 2 c_1 c_3\,\mathrm{Cov}(X_1, X_3) + 2 c_2 c_3\,\mathrm{Cov}(X_2, X_3) $$

In matrix notation, this result can be written

$$ \mathrm{Var}\,Y = \begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} \mathrm{Var}\,X_1 & \mathrm{Cov}(X_1, X_2) & \mathrm{Cov}(X_1, X_3) \\ \mathrm{Cov}(X_1, X_2) & \mathrm{Var}\,X_2 & \mathrm{Cov}(X_2, X_3) \\ \mathrm{Cov}(X_1, X_3) & \mathrm{Cov}(X_2, X_3) & \mathrm{Var}\,X_3 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} $$

or, in general,

$$ \mathrm{Var}\,Y = \mathbf{c}'\,S\,\mathbf{c} $$

where S is the variance-covariance matrix of the random variables $X_1, X_2, \ldots, X_p$, $\mathbf{c}$ is a column vector of the weighting coefficients, and $\mathbf{c}'$ is a row vector, the transpose of $\mathbf{c}$. It will be seen that equation (22), pertaining to the integrated basis functions, fits the definition above.

4.2.2 Variance Computations for Arbitrary Driving Sequences

It has been shown that, for an arbitrary driving sequence, the total emission of a pollutant can be written as

$$ e(T) = \sum_{i=1}^{3} a_i \int_0^T h[a(t)]\, f_i[v(t)]\, dt + \sum_{j=1}^{9} b_j \int_0^T \{1 - h[a(t)]\}\, g_j[v(t), a(t)]\, dt \qquad (36) $$

for the modal analysis emission model as originally developed with 12 basis functions. Moreover, it was shown that, for a given driving sequence, the 12 integrals need be computed only once for any group of vehicles, because the values of these integrals are constant for all vehicles in the group and depend only on the nature of the driving sequence.
On the other hand, each vehicle in the group gives rise to a different set of $a_i$ and $b_j$; consequently, these values can be considered as outcomes of random variables $A_i$, i = 1, 2, 3, and $B_j$, j = 1, 2, ..., 9. Thus the $A_i$ and $B_j$ play the role of the $X_i$ in Section 4.2.1 of this report. Similarly, the values of the 12 integrals in equation (36) play the role of the constants $c_i$ in Section 4.2.1. Denoting these integrals $c_1, c_2, c_3, d_1, d_2, \ldots, d_9$, one can then write

$$ \mathrm{Var}\,e(T) = \begin{bmatrix} c_1 & c_2 & c_3 & d_1 & \cdots & d_9 \end{bmatrix} \begin{bmatrix} \mathrm{Var}\,A_1 & \mathrm{Cov}(A_1, A_2) & \mathrm{Cov}(A_1, A_3) & \mathrm{Cov}(A_1, B_1) & \cdots & \mathrm{Cov}(A_1, B_9) \\ \mathrm{Cov}(A_1, A_2) & \mathrm{Var}\,A_2 & \mathrm{Cov}(A_2, A_3) & \mathrm{Cov}(A_2, B_1) & \cdots & \mathrm{Cov}(A_2, B_9) \\ \mathrm{Cov}(A_1, A_3) & \mathrm{Cov}(A_2, A_3) & \mathrm{Var}\,A_3 & \mathrm{Cov}(A_3, B_1) & \cdots & \mathrm{Cov}(A_3, B_9) \\ \mathrm{Cov}(A_1, B_1) & \mathrm{Cov}(A_2, B_1) & \mathrm{Cov}(A_3, B_1) & \mathrm{Var}\,B_1 & \cdots & \mathrm{Cov}(B_1, B_9) \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(A_1, B_9) & \mathrm{Cov}(A_2, B_9) & \mathrm{Cov}(A_3, B_9) & \mathrm{Cov}(B_1, B_9) & \cdots & \mathrm{Var}\,B_9 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ d_1 \\ \vdots \\ d_9 \end{bmatrix} $$

or, more succinctly, as

$$ \mathrm{Var}\,e(T) = \mathbf{c}'\,S\,\mathbf{c} $$

where S is the variance-covariance matrix of the coefficients of the model for the group of vehicles under consideration and $\mathbf{c}$ is a column vector of the basis functions as integrated over the driving sequence under consideration. In application, the variance-covariance matrix would be estimated from the N vehicles comprising the group of vehicles under consideration.

To illustrate this principle, four driving sequences were constructed with the intention of accounting for highway and city driving.

    Driving Sequence ID    Description
    DS1                    Highway driving with frequent changes in speed.
    DS2                    City driving with frequent changes in speed.
    DS3                    City driving with long periods of constant speed.
    DS4                    Constant-speed highway driving.

These driving sequences are depicted in Figure 14. Calculations of the total variance over a driving sequence were based on 1050 seconds. Therefore, the sequence shown for DS1 was repeated once and that shown for DS2 was repeated three times. A fifth driving sequence was taken as the first 505 seconds of the Federal Test Procedure. Results for these driving sequences are presented in Table 5 for HC only.
The results are based on all 1020 vehicles considered as a group.

FIGURE 14 ARBITRARY DRIVING SEQUENCES
[Four panels plotted against Time (sec): DS1 - Highway Driving; DS2 - City Driving; DS3 - Constant Speed City Driving; DS4 - Constant Speed Highway Driving]

Table 5
TOTAL COMPUTED VARIANCE OF HC OVER VARIOUS DRIVING SEQUENCES

    Driving Sequence    Time Duration (sec)    Variance (gm^2)
    SDS                       1054                 2172.6
    FTP                        505                  336.9
    DS1                       1050                 3992.7
    DS2                       1050                  906.4
    DS3                       1050                  817.5
    DS4                       1050                 5241.9

The table illustrates the fact that the variance of individual vehicle emissions, as computed by the model, depends on the nature of the driving sequence. In the case of the FTP, the low variance reflects, at least in part, the fact that the time duration of the sequence, 505 seconds, is considerably less than the time duration of the other sequences.

For a check on the validity of the variances as estimated from the variance-covariance matrix of the model coefficients, compare Table 5 with Table 6 below.

Table 6
MEANS AND VARIANCES OF CALCULATED AND OBSERVED BAG VALUES (GMS)

                                                        HC          CO          NOX
    FEDERAL TEST PROCEDURE          MODEL      MEAN     18.23       214.51      16.72
                                               VAR      336.81      23010.1     77.13
                                    OBSERVED   MEAN     21.05       223.69      17.22
                                               VAR      380.69      22760.7     81.62
    SURVEILLANCE DRIVING SEQUENCE   MODEL      MEAN     46.34       582.0       50.9
                                               VAR      2173.4      180900.0    699.1
                                    OBSERVED   MEAN     53.55       625.10      48.17
                                               VAR      2680.4      210610.2    647.3

In Table 6, the columns labeled "model" were obtained by computing, for each of the 1020 vehicles, the bag value as determined by application of the modal emission model. The quantities were then averaged to obtain the mean bag value for the model, and their variance was computed by the usual formula

$$ s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1} $$

where $x_i$ denotes the model-computed bag value for the i'th vehicle.
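The two routes to the vehicle-to-vehicle variance compared in Tables 5 and 6 can be illustrated with a small sketch. It assumes hypothetical data (four vehicles and three coefficients rather than the report's 1020 vehicles and 12 coefficients), and the helper names are illustrative. The first route evaluates the quadratic form c' S c from the variance-covariance matrix of the coefficients; the second computes each vehicle's total emission and applies the usual N - 1 variance formula. When S is estimated with the same N - 1 divisor, the two are algebraically identical.

```python
from statistics import variance


def covariance_matrix(rows):
    """Sample variance-covariance matrix (divisor N - 1) of coefficient vectors."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(row[k] for row in rows) / n for k in range(p)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def quadratic_form(c, s):
    """Evaluate c' S c for a vector c and a square matrix S."""
    p = len(c)
    return sum(c[i] * s[i][j] * c[j] for i in range(p) for j in range(p))


# Hypothetical data: 4 vehicles, 3 coefficients each (the report's model has 12).
coefficients = [
    [1.0, 0.50, 0.10],
    [1.4, 0.30, 0.20],
    [0.8, 0.60, 0.15],
    [1.1, 0.40, 0.05],
]
# Hypothetical integrated basis functions for one driving sequence.
integrals = [100.0, 20.0, 5.0]

# Route 1: variance from the variance-covariance matrix of the coefficients.
var_from_matrix = quadratic_form(integrals, covariance_matrix(coefficients))

# Route 2: compute each vehicle's total emission, then take the sample variance.
totals = [sum(c * b for c, b in zip(integrals, vehicle)) for vehicle in coefficients]
var_direct = variance(totals)

# The standard error of the group mean follows from the vehicle-to-vehicle variance.
standard_error = (var_from_matrix / len(coefficients)) ** 0.5
```

The practical advantage of the first route is that the matrix S is computed once per group; a new driving sequence then requires only a new vector of integrals, not a pass over every vehicle.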
In conclusion, it is noted that the variances of the bag values, as computed by the model, are comparable with the variances of the bag values as actually determined by test. Also, in view of the agreement between Table 5 and Table 6, a method is at hand for estimating the vehicle-to-vehicle variance within a group for any driving sequence. This capability, in turn, makes it practical to estimate the standard error of the group mean.

5. PREDICTION OF FUEL ECONOMY

Because the modal analysis emission model provides a means to estimate pollutant emission over any arbitrary driving sequence, it appeared feasible to employ the model to estimate fuel consumption by means of the carbon balance equation. In this connection, reference is made to work by M.E. Williams et al.* with regard to the FY 72 exhaust emission surveillance program.

The carbon balance equation relates the amount of fuel consumed per mile to the amount of carbon-containing emissions produced per mile. Using the output of the modal emissions model as input to this equation allows one to estimate the fuel consumption over any driving sequence. The carbon-containing emissions that must be supplied are carbon monoxide (CO), carbon dioxide (CO2), and hydrocarbons (HC).

The carbon balance method of calculating fuel economy in miles per gallon (mpg) is given as:

$$ \text{mpg} = \frac{\text{grams of carbon/gallon of fuel}}{\text{grams of carbon in exhaust/mile}} $$

The actual equation incorporated into the model to estimate miles per gallon is

$$ \text{mpg} = \frac{2423.0}{0.866\,(\mathrm{HC}) + 0.429\,(\mathrm{CO}) + 0.273\,(\mathrm{CO_2})} $$

where HC, CO, and CO2 emissions are estimated in terms of grams/mile.

Implementation of the formula required, first of all, appropriate formulation of the modal analysis emission model to predict CO2 emissions in addition to CO and HC. Then it was a straightforward matter to substitute these predicted quantities into the carbon-balance equation to obtain predictions of miles per gallon.

* M.E. Williams, J.T. White, L.A. Platte, and C.J. Domke, Automobile Exhaust Emission Surveillance - Analysis of the FY 72 Program, Report No. EPA-460/2-74-001, U.S. Environmental Protection Agency, Ann Arbor, Michigan (February 1974).

5.1 PREDICTION OF CO2

Since the ability of the modal analysis emission model to predict CO2 was not investigated under Contract Number 68-01-0435, it was necessary to examine the model's effectiveness in predicting CO2 emissions prior to using these estimates in the carbon balance equation.

In order to determine the form of the emission rate function that should be used to represent CO2 emissions, the average emission rate of the 1020 vehicles in the data base for each of the steady state modes was plotted versus speed. This curve is shown in Figure 15. On the basis of this figure, the assumption was made that the steady state and accel/decel emission rate functions for CO2 could be represented by the same weighted quadratic functions of speed and acceleration as those used for HC, CO, and NOX in the original formulation of the model.

By means of the composite emission rate function, the amount of CO2 emitted by each of the 1020 vehicles was estimated for the Surveillance Driving Sequence (SDS) and for the first 505 seconds of the Federal Test Procedure (FTP) driving sequence. These estimates are reported in Table 7, where they are compared with results as observed in the actual emissions tests. The notation in the table is as follows:

    $\bar{O}$ = observed mean bag value for CO2, in gms/mile, for 1020 vehicles
    $\bar{R}$ = difference between mean emissions predicted by the model and the observed mean bag value for CO2 (gms/mile)
    $\sigma$ = standard deviation of errors for individual vehicles

A visual appreciation of the distribution of the errors for individual vehicles is afforded by Figure 16 for the SDS and Figure 17 for the FTP. The occurrence frequencies on which these histograms are based are tabulated in Table 8.
Figure 15 MEAN STEADY STATE CO2 EMISSION RATES VS. SPEED
[Plot of mean steady-state CO2 emission rate against Speed (mph)]

Table 7
COMPARATIVE STATISTICS FOR CO2

    Statistic                                            Surveillance Driving Sequence    FTP (First 505 Sec) Driving Sequence
    $\bar{O}$                                                      4347.4                          1662.7
    $\bar{R}$                                                       270.6                           141.5
    $\sigma^2$                                                     356373                          108756
    $\sigma$                                                        597.0                           329.8
    $\sqrt{\bar{R}^2 + \sigma^2}$                                   655.4                           358.9
    $(\bar{R}/\bar{O}) \times 100\%$                                 6.22                            8.51
    $(\sigma/\bar{O}) \times 100\%$                                 13.73                            19.8
    $(\sqrt{\bar{R}^2 + \sigma^2}/\bar{O}) \times 100\%$            15.08                            21.6

Figure 16 DISTRIBUTION OF CO2 BAG ERROR FROM THE SURVEILLANCE DRIVING SEQUENCE
[Histogram of number of vehicles against Bag Error (grams); MEAN = 270.6 GMS, STD. DEV. = 597.0 GMS]

Figure 17 DISTRIBUTION OF CO2 BAG ERROR (FIRST 505 SECONDS FTP)
[Histogram of number of vehicles against Bag Error (grams)]

Table 8
DISTRIBUTION OF CO2 BAG VALUE ERROR (OBSERVED - CALCULATED) FROM THE FIRST 505 SEC OF THE FEDERAL TEST PROCEDURE (FTP) AND THE SURVEILLANCE DRIVING SEQUENCE (SDS)

                            NUMBER OF VEHICLES
    ERROR (GMS)             FTP       SDS
    -2400 to -2300            0         1
    -1800 to -1700            0
    -1700 to -1600            0
    -1600 to -1500            0
    -1300 to -1200            1         n
    -1200 to -1100            0         4
    -1100 to -1000            0         2
    -1000 to -900             1
    -900 to -800              0         7
    -800 to -700              3         8
    -700 to -600              2        11
    -600 to -500              2        11
    -500 to -400              3         n
    -400 to -300              7        16
    -300 to -200             20        32
    -200 to -100             38        61
    -100 to 0               143        98
    0 to 100                328       169
    100 to 200              226       153
    200 to 300               86       102
    300 to 400               40        68
    400 to 500               33        36
    500 to 600               29        28
    600 to 700               26        21
    700 to 800               13        20
    800 to 900                2        17
    900 to 1000               8        17
    1000 to 1100              2        14
    1100 to 1200              1        19
    1200 to 1300              1        1°
    1300 to 1400              0        12
    1400 to 1500              0         8
    1500 to 1600              1         °
    1600 to 1700              0        16
    1700 to 1800              0         p
    1800 to 1900              0         3
    1900 to 2000              0         5
    2000 to 2100              0         3
    2100 to 2200              1         3
    2300 to 2400              0         4
    2700 to 2800              0         2
    2900 to 3000              0         1
    3400 to 3500              1         1
    4200 to 4300              1         n
    4700 to 4800              1         n

Several observations can be made from Table 7 and from Figures 16 and 17 which are useful in judging the adequacy of the model. First, the mean or expected difference, $\bar{R}$, between the calculated and the observed values should be zero if the model is to be considered unbiased. The error distributions show that the bag error clusters around the average error $\bar{R}$, and this average for both driving sequences deviates from zero by only a few percentage points of the average measured bag values. Also, the root mean square error, which represents the combined systematic and random errors as represented respectively by $\bar{R}$ and $\sigma$, is largely dominated by the random error component.
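The statement that the root mean square error is dominated by its random component can be checked directly from the Table 7 entries. The short sketch below recomputes the root mean square error from the mean error and standard deviation given in the report and then works out what fraction of the squared error is due to bias; the function and variable names are illustrative only.

```python
import math


def rms_error(mean_error, std_dev):
    """Combine systematic (mean) and random (std. dev.) error into an RMS error."""
    return math.sqrt(mean_error ** 2 + std_dev ** 2)


# Mean error and standard deviation for CO2 as listed in Table 7.
sds_mean_error, sds_std_dev = 270.6, 597.0
ftp_mean_error, ftp_std_dev = 141.5, 329.8

sds_rms = rms_error(sds_mean_error, sds_std_dev)
ftp_rms = rms_error(ftp_mean_error, ftp_std_dev)

# Fraction of the squared RMS error contributed by the systematic component.
sds_bias_share = sds_mean_error ** 2 / sds_rms ** 2
ftp_bias_share = ftp_mean_error ** 2 / ftp_rms ** 2
```

Both recomputed RMS values agree with the tabulated 655.4 and 358.9 to within rounding, and in both sequences the bias accounts for less than one fifth of the squared error, with the remainder attributable to random error.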
In order to further judge what these measures indicate as to the predictive performance of the model, the results for CO2 were compared with the results obtained by using replicate data. Of the 1020 vehicles in the Surveillance Driving Sequence, 61 had been tested twice each. Thus there were available 61 replicate measurements from which could be obtained a measure of the repeatability of the test measurements themselves. Estimates of the mean, $\bar{X}$, standard deviation, $\sigma$, and relative or percent standard deviation, $\sigma/\bar{X}$, are given in Table 9 for the SDS, for the FTP, and for individual modes.

The percent standard deviation characterizes the repeatability of measurements. These values are 8.34% for the SDS and 9.65% for the FTP driving sequence. For the individual modes, the percent standard deviation ranged from 6% to about 40%. This large variability in the test measurements is reflected as errors in the determination of the regression coefficients, which in turn determine the error in estimating the instantaneous emission rate at any point in (v, a)-space. In view of the relatively large errors in the modal input data, the errors obtained for model performance do not appear unreasonable.

5.2 PREDICTION OF MILES PER GALLON

Prediction of fuel consumption in terms of miles per gallon is achieved by substituting the computed emissions of CO2, CO and HC into the carbon balance equation.
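The substitution just described amounts to a single evaluation of the carbon balance equation given in Section 5. The sketch below uses the constants from that equation; the function name and the sample emission rates are hypothetical and chosen only to show the order of magnitude of the result.

```python
def miles_per_gallon(hc, co, co2):
    """Carbon balance estimate of fuel economy from emissions in grams/mile."""
    # Carbon mass fractions weight each species' contribution to exhaust carbon.
    carbon_per_mile = 0.866 * hc + 0.429 * co + 0.273 * co2
    if carbon_per_mile <= 0:
        raise ValueError("carbon-containing emissions must be positive")
    # 2423.0 is the grams of carbon per gallon of fuel used in the report's equation.
    return 2423.0 / carbon_per_mile


# Hypothetical emission rates in grams/mile (not taken from the report's data).
mpg = miles_per_gallon(hc=3.0, co=40.0, co2=450.0)
```

With these illustrative inputs the estimate comes out at roughly 17 miles per gallon. Because CO2 carries most of the exhaust carbon, errors in the CO2 prediction affect the result far more than errors in HC or CO.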
Though direct measurements of miles per gallon were not available, it was possible to compute "observed" values by incorporating the observed values of CO2, CO and HC into the carbon balance equation. Miles per gallon figures, as based on the model outputs and as based on the observed bag values for the three emission products, could then be compared. The results of this comparison are given in Table 10.

Table 9
REPLICATE MODAL ANALYSIS OF CO2 FOR 61 VEHICLES

    MODE        TIME (sec)    $\bar{X}$ (gm/min)    $\sigma^2$ (gm/min)^2    $\sigma$ (gm/min)    $(\sigma/\bar{X}) \cdot 100\%$
      1            12            1004.87               39566.29                 198.91               19.79
      2            16             329.28                7501.51                  86.61               26.30
      3             8            1129.19              192329.25                 438.55               38.84
      4            11             693.08               14943.85                 122.24               17.64
      5            13             580.97                8654.38                  93.03               16.01
      6            12             194.23                1909.12                  43.69               22.50
      7            17             673.03               15976.41                 126.40               18.78
      8            12             217.70                2632.25                  51.31               23.57
      9            14             569.71                9127.54                  95.54               16.77
     10            30             202.64                2134.50                  46.20               22.80
     11            26             671.98               12228.70                 110.58               16.46
     12            21             229.06                4525.77                  67.27               29.37
     13            32             726.05               11218.22                 105.92               14.59
     14            23             190.92                1952.54                  44.19               23.14
     15             9             236.69                2042.98                  45.20               19.10
     16             8             569.40               18156.47                 134.75               23.66
     17            22             731.97               12361.45                 111.18               15.19
     18            16             201.06                3033.29                  55.08               27.39
     19            18             671.54               17670.24                 132.93               19.79
     20            19             245.55                6892.80                  83.03               33.81
     21            25             748.31               11888.24                 109.03               14.57
     22            28             223.31                2553.20                  50.53               22.63
     23            15             882.50               19518.84                 139.71               15.83
     24            25             582.99                7184.77                  84.76               14.54
     25            18             191.30                1578.71                  39.73               20.77
     26            10             333.03                7648.20                  87.45               26.26
     27            38             644.71               12949.74                 113.80               17.65
     28            35             216.89                2629.09                  51.27               23.64
     29            18             777.47               19903.79                 141.09               18.15
     30            21             601.46               16428.74                 128.17               21.31
     31            14             191.20                2318.55                  48.15               25.18
     32            13             316.58                4451.17                  66.72               21.07
     33            60              73.43                 170.64                  13.06               17.79
     34            60             387.26                3180.27                  56.39               14.56
     35            60             330.37                 560.51                  23.68                7.17
     36            60             357.10                 439.12                  20.96                5.87
     37            60             402.84                1397.43                  37.38                9.28
    FTP (gm)                      483.86                2178.55                  46.67                9.65
    SDS (gm)                      457.40                1453.77                  38.13                8.34
The applicable notation is as follows:

    $\bar{O}$ = mean value of miles per gallon for 1020 vehicles, as computed from bag values for CO2, CO and HC emissions
    $\bar{R}$ = mean difference between "observed" miles per gallon and miles per gallon as determined from model outputs for CO2, CO and HC emissions
    $\sigma_R$ = standard deviation of errors for individual vehicles

As was the case for CO2, the quantity $\bar{R}$ denotes a systematic error, whereas $\sigma_R$ denotes a random error.

Table 10
Statistics for Miles/Gallon Error Based on Bag Values

    Statistic                                              Surveillance Driving Sequence    FTP (First 505 Sec) Driving Sequence
    $\bar{O}$                                                       17.07                           16.44
    $\bar{R}$                                                       -1.00                           -1.30
    $\sigma_R^2$                                                    12.49                            6.63
    $\sigma_R$                                                       3.53                            2.58
    $\sqrt{\bar{R}^2 + \sigma_R^2}$                                  3.67                            2.88
    $(\bar{R}/\bar{O}) \times 100\%$                                -5.88                           -7.89
    $(\sigma_R/\bar{O}) \times 100\%$                               20.70                           15.67
    $(\sqrt{\bar{R}^2 + \sigma_R^2}/\bar{O}) \times 100\%$          21.52                           17.54

Figure 18 DISTRIBUTION OF ERROR FOR MILES PER GALLON BASED ON BAG VALUES FROM THE SURVEILLANCE DRIVING SEQUENCE
[Histogram of number of vehicles against Error (Miles per Gallon); MEAN = -1.0 (MPG), STD. DEV. = 3.53 (MPG)]

Figure 19 DISTRIBUTION OF ERROR FOR MILES PER GALLON BASED ON OBSERVED BAG VALUE CALCULATIONS (FIRST 505 SECONDS FTP)
[Histogram of number of vehicles against Error (Miles per Gallon); MEAN = -1.3, STD. DEV. = 2.58]

Table 11
DISTRIBUTION OF MILES/GALLON ERROR (OBSERVED - CALCULATED) FROM THE FIRST 505 SEC OF THE FEDERAL TEST PROCEDURE (FTP) AND THE SURVEILLANCE DRIVING SEQUENCE (SDS)

    ERROR (MPG): intervals 1 mpg wide, -14 to -13, -13 to -12, ..., 17 to 18, then 22 to 23 and 77 to 78

    NUMBER OF VEHICLES (FTP, SDS):
    0 0 0 4 3 4 5 13 27 25 32 50 84 263 296 139 37 18 5 4 5 0 1 0 1 0 0 1 1 1 1 0 1 0
    2 1 1 5 1 6 14 17 26 27 30 71 200 397 140 45 1? 8 6 o ^. 0 1 1 0 o 0 1 1 1 0 1 1 0 1

6. SUMMARY AND CONCLUSIONS

Refinements and extensions of the automobile exhaust modal analysis model, as originally reported in EPA-460/3-74-005, have been completed in four important areas:

    1) Increased computational efficiency
    2) Reduction of modal testing requirements
    3) Accuracy and precision of group emission predictions
    4) Prediction of fuel economy

These improvements broaden the capability of the model and provide increased opportunities for applying the results of standard emissions tests to new contexts.

Improved computational efficiency derives primarily from a simplification of the method by which the instantaneous emission function e(v, a) is integrated over a driving sequence.
For a particular driving sequence, it was shown that integration of the basis
functions over the sequence need be performed only once for all vehicles
subjected to that sequence.  Moreover, it was shown that vehicle factors and
driving sequence factors affecting emissions can be essentially separated as
specific vector quantities, the inner product of which yields the emissions
for the particular vehicle and driving sequence combination.

        Redefinition of modal testing requirements was examined by means of
variance-function analysis and principal-component analysis.  It was shown
that, although some redundancy exists in the modes as formulated, there are
also regions of the speed-acceleration plane not well represented by existing
modes.  It is indicated that the number of modes could be reduced without
serious loss of information but that modes should also be introduced to cover
the region of the (v, a)-plane in the vicinity of accelerations between
-1 mph/sec and +1 mph/sec.

        The accuracy and precision of the model in predicting group emissions
was assessed by (1) comparing model predictions with observed test values for
the SDS and FTP driving sequences, and (2) evolving a scheme for defining
model output variance for an arbitrary driving sequence.  Variances of the
model predictions for the SDS and FTP compare favorably with the
corresponding variances of observed bag values as actually determined by
test.  The scheme for computing model output variance for an arbitrary
driving sequence stems directly from the basis function integration
simplifications evolved in improving model computational efficiency.  In
particular, it is shown that if the variance-covariance matrix of the model
coefficients for the vehicles comprising a group is known, this information
can be adapted to defining the variance of total emissions over any driving
sequence.
It is necessary only to know, in addition to this variance-covariance matrix,
the integrated forms of the basis functions for the driving sequence under
consideration.

        Prediction of fuel economy by means of the model can be achieved by
developing an equation for the emission-rate surface for CO2 in addition to
CO and HC.  Then, by means of a carbon-balance equation, the total output of
carbon-containing emission products can be transformed into an estimate of
the amount of fuel which produced these emission products.  Accuracy and
precision of the miles-per-gallon predictions are considered to be limited
primarily by the errors in measuring the modal outputs of CO2, CO and HC.

-------
                                  APPENDIX I
                  VARIANCE FUNCTIONS FOR REGRESSION ESTIMATES

        Regression analysis is one of the basic tools employed in the
formulation of the modal analysis emission model.  As used in this context,
it is to be understood in a generalized way permitting a relatively wide
choice of form for the regression equation.  Of particular interest is
precision of the regression model over the treatment space; this precision
can be evaluated by means of the concept of variance functions as discussed
in this appendix.

        In the following discussion, the mathematical basis of generalized
regression analysis is presented, together with a discussion of variance
functions and their visualization as a variance surface by means of variance
maps.

1.      FOUNDATIONS OF GENERALIZED REGRESSION ANALYSIS

        Assume that a variable $\eta$, which shall be referred to as the
response variable, depends on k experimental variables
$x_1, x_2, \ldots, x_k$ and that a functional relation

$$\eta = f(x_1, x_2, \ldots, x_k) \tag{I-1}$$

exists.  Under certain conditions, the response equation (I-1) can be
expanded as

$$\eta = \beta_1 f_1(x_1, x_2, \ldots, x_k) + \beta_2 f_2(x_1, x_2, \ldots, x_k) + \beta_3 f_3(x_1, x_2, \ldots, x_k) + \cdots \tag{I-2}$$

where the $\beta_i$, i = 1, 2, 3, ... are constants to be determined.  The
functions $f_1, f_2, f_3, \ldots$ are of arbitrary form provided only that
they are linearly independent and do not involve the constants $\beta_i$.
Equation (I-2) is thus a linear function of the $\beta_i$, although the
$f_i$ may be nonlinear in $x_1, x_2, \ldots, x_k$.  Equation (I-2) is said
to be a linear model, and the functions $f_i$ may be regarded as basis
vectors spanning a vector space comprising a certain class of functions.

        Suppose, now, that the function (I-1) is to be estimated from
experimental observations.  The variables $x_1, x_2, \ldots, x_k$ constitute
a k-dimensional space called the x-space, and one may estimate (I-1) from
observations taken at n points in this space.  These n points constitute
what shall subsequently be referred to as the experimental design.  In
general, $\eta$ cannot be observed at these points because of error.
Rather, it is possible only to observe a variable y related to $\eta$ by

$$y = \eta + \epsilon \tag{I-3}$$

where $\epsilon$ is a random error.  Then (I-2) assumes the form

$$y = \beta_1 f_1(x_1, x_2, \ldots, x_k) + \beta_2 f_2(x_1, x_2, \ldots, x_k) + \beta_3 f_3(x_1, x_2, \ldots, x_k) + \cdots + \epsilon \tag{I-4}$$

Since $\epsilon$ is a random variable, the responses observed at each design
point also constitute a random variable.  As a result, it is possible only to
obtain from the observations an equation of the form

$$\hat{y} = b_1 f_1(x_1, x_2, \ldots, x_k) + b_2 f_2(x_1, x_2, \ldots, x_k) + \cdots + b_p f_p(x_1, x_2, \ldots, x_k) \tag{I-5}$$

where $\hat{y}$ is an estimate of $\eta$ and $b_i$, i = 1, 2, ..., p, is an
estimate of $\beta_i$.

        Clearly, two types of errors can affect the approximation of the
function (I-1).  First, if (I-1) is to be approximated by the linear model
(I-2), then (I-1) must belong to the class of functions spanned by the basis
functions $f_1, f_2, \ldots, f_p$.  Second, some means must be found for
minimizing or controlling the effect of the random errors $\epsilon$, since
these will affect the estimation of the $\beta_i$.  Generally, the form of
(I-1) is unknown at the outset, and the experimenter has the option of
assuming a set of basis functions according to experience or prior knowledge
concerning the system under study.
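As a small illustration of a linear model of this kind, the sketch below evaluates a response that is linear in its coefficients while the basis functions are nonlinear in the experimental variables. The particular basis set and the coefficient values are assumptions chosen only for illustration; they are not the basis used in the report.

```python
# Illustrative basis functions f_i(x1, x2). This set is an assumption made
# for the example; each term may be nonlinear in x1 and x2.
BASIS = [
    lambda x1, x2: 1.0,
    lambda x1, x2: x1,
    lambda x1, x2: x2,
    lambda x1, x2: x1 * x2,
    lambda x1, x2: x1 ** 2,
]


def response(beta, x1, x2, basis=BASIS):
    """Evaluate sum_i beta_i * f_i(x1, x2), a model linear in the beta_i."""
    if len(beta) != len(basis):
        raise ValueError("one coefficient is required per basis function")
    return sum(b * f(x1, x2) for b, f in zip(beta, basis))


# Hypothetical coefficients, used only to exercise the function.
beta = [2.0, 0.5, -1.0, 0.25, 0.1]
eta = response(beta, 3.0, 2.0)

# Linearity in the coefficients: doubling every beta_i doubles the response.
eta_doubled = response([2.0 * b for b in beta], 3.0, 2.0)
```

Because the coefficients enter linearly, averaging the coefficients of several fitted models gives the average of their responses at any point.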
For control of random errors, the theory of least squares is employed.

        In view of equation (I-5), each observation $y_i$ obtained in the
process of data collecting can be represented as

$$y_i = b_1 f_1(x_{1i}, x_{2i}, \ldots, x_{ki}) + b_2 f_2(x_{1i}, x_{2i}, \ldots, x_{ki}) + \cdots + b_p f_p(x_{1i}, x_{2i}, \ldots, x_{ki}) + e_i \tag{I-6}$$

where the $b_i$ are estimates of $\beta_i$ and the $e_i$ are random errors
as computed from

$$e_i = y_i - \hat{y}_i \tag{I-7}$$

At the outset, $\hat{y}$ is not known, and it is the object of the
least-squares algorithm to estimate $\hat{y}$ in such a way that

$$\sum_{i=1}^{n} e_i^2 = \text{a minimum}. \tag{I-8}$$

We proceed to display the theory of this algorithm.

        In matrix notation, equation (I-6) can be written as

$$y = Xb + e \tag{I-9}$$

where y and e are n-rowed column vectors (or n x 1 matrices), b is a
p-rowed column vector (or p x 1 matrix), and X is a matrix of dimension
n x p.  The set of points at which observations are made will be referred to
as the design region, and the set of functions $f_1, f_2, \ldots, f_p$ will
be referred to as the basis functions or simply the basis.

        For a two-variable case, the X-matrix is generated as shown below.

                                  X-MATRIX

$$
X = \begin{pmatrix}
f_1(x_{11}, x_{21}) & f_2(x_{11}, x_{21}) & \cdots & f_p(x_{11}, x_{21}) \\
f_1(x_{11}, x_{22}) & f_2(x_{11}, x_{22}) & \cdots & f_p(x_{11}, x_{22}) \\
\vdots & \vdots & & \vdots \\
f_1(x_{11}, x_{2M}) & f_2(x_{11}, x_{2M}) & \cdots & f_p(x_{11}, x_{2M}) \\
f_1(x_{12}, x_{21}) & f_2(x_{12}, x_{21}) & \cdots & f_p(x_{12}, x_{21}) \\
f_1(x_{12}, x_{22}) & f_2(x_{12}, x_{22}) & \cdots & f_p(x_{12}, x_{22}) \\
\vdots & \vdots & & \vdots \\
f_1(x_{12}, x_{2M}) & f_2(x_{12}, x_{2M}) & \cdots & f_p(x_{12}, x_{2M}) \\
\vdots & \vdots & & \vdots \\
f_1(x_{1K}, x_{21}) & f_2(x_{1K}, x_{21}) & \cdots & f_p(x_{1K}, x_{21}) \\
f_1(x_{1K}, x_{22}) & f_2(x_{1K}, x_{22}) & \cdots & f_p(x_{1K}, x_{22}) \\
\vdots & \vdots & & \vdots \\
f_1(x_{1K}, x_{2M}) & f_2(x_{1K}, x_{2M}) & \cdots & f_p(x_{1K}, x_{2M})
\end{pmatrix}
$$

The columns of the matrix are identified with the basis functions, the rows
with the design points.  In the example shown, $x_1$ assumes K distinct
values and $x_2$ assumes M distinct values, so that there are n = KM points
in the design.  Each basis function is evaluated at every point in the
design, and the resulting n x p array constitutes the X-matrix.  If the
array is rearranged, so that the rows become columns and the columns become
rows, the resulting matrix is the transpose of X.  The matrix X and its
transpose X' will be used extensively in the sequel.
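The construction of the X-matrix for a two-variable design can be sketched as follows. The levels and the four basis functions are hypothetical values chosen for illustration; the row ordering (second variable varying fastest) is an assumption taken from the layout of the matrix shown above.

```python
from itertools import product


def design_matrix(x1_levels, x2_levels, basis):
    """Build the n x p X-matrix: one row per design point, one column per
    basis function. Rows are ordered with x2 varying fastest."""
    return [[f(x1, x2) for f in basis]
            for x1, x2 in product(x1_levels, x2_levels)]


def transpose(matrix):
    """Return the transpose X' (rows become columns)."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical basis and levels, for illustration only.
basis = [
    lambda x1, x2: 1.0,
    lambda x1, x2: x1,
    lambda x1, x2: x2,
    lambda x1, x2: x1 * x2,
]
X = design_matrix([0.0, 1.0, 2.0], [10.0, 20.0], basis)  # K = 3, M = 2
Xt = transpose(X)
```

With K = 3 and M = 2 the design has n = KM = 6 rows, and the four basis functions give p = 4 columns.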
        Consider (I-9) and write the error vector e as

$$e = y - Xb \tag{I-10}$$

The sum of squares of the components $e_1, e_2, \ldots, e_n$ of the error
vector e can be written as

$$Q = e'e = (y - Xb)'(y - Xb) \tag{I-11}$$

where e' is the transpose of e and (y - Xb)' is the transpose of (y - Xb).
In extenso, (I-11) becomes (since n = KM)

$$\sum_{i=1}^{n} e_i^2 = \sum_{k=1}^{K} \sum_{m=1}^{M} e_{km}^2 = \sum_{k=1}^{K} \sum_{m=1}^{M} \Bigl[ y_{km} - \sum_{j=1}^{p} b_j f_j(x_{1k}, x_{2m}) \Bigr]^2 \tag{I-12}$$

By differentiating (I-12) with respect to the $b_j$, one obtains a set of p
equations of the form

$$\frac{\partial Q}{\partial b_j} = 0, \qquad j = 1, 2, \ldots, p \tag{I-13}$$

which can be solved for the $b_j$ to minimize Q.  The result can be
summarized succinctly in the form

$$X'X\,b = X'y \tag{I-14}$$

where X'X is a square matrix of order p.  Equation (I-14) provides the
so-called normal equations of least squares.  Then

$$b = (X'X)^{-1} X'y \tag{I-15}$$

is the formal solution minimizing the error sum of squares.

        It is of interest to investigate the statistical properties of the
least squares solution under certain assumptions.  For the error vector we
assume that

$$E(\epsilon) = 0, \qquad E(\epsilon\epsilon') = I\sigma^2 \tag{I-16}$$

where E denotes expectation and I is the identity matrix of order n.
Equations (I-16) are equivalent to the assumption that the errors are
uncorrelated, with mean zero and constant variance $\sigma^2$.

        From equation (I-15), it is clear that

$$b = CX'y \tag{I-17}$$

where $C = (X'X)^{-1}$.  Note that, for the design points, (I-4) can be
written as

$$y = X\beta + \epsilon \tag{I-18}$$

Substituting (I-18) into (I-17) one obtains

$$b = CX'(X\beta + \epsilon) = CX'X\beta + CX'\epsilon = \beta + CX'\epsilon \tag{I-19}$$

Since $\epsilon$ is a random vector, b is also a random vector.  Taking
expectations in (I-19), one sees that

$$E(b) = E(\beta + CX'\epsilon) = E(\beta) + E(CX'\epsilon) = \beta + CX'E(\epsilon) = \beta \tag{I-20}$$

Thus, the estimates provided by (I-15) are unbiased, provided the postulated
form of the function f is correct.  In the event of an incorrect choice of
model, the estimates of the coefficients will be biased to an extent
depending on the degree of discrepancy between the postulated and true
models.
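The normal equations can be solved numerically in a few lines. The sketch below forms X'X and X'y and solves for b by Gaussian elimination with partial pivoting; it is a minimal illustration using only the standard library, not the procedure used in the report's programs, and the data are hypothetical.

```python
def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [r] for row, r in zip(a, rhs)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular for this basis and design")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares(X, y):
    """Return b satisfying the normal equations X'X b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data lying exactly on y = 1 + 2x, with basis functions 1 and x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
b = least_squares(X, y)
```

Solving the linear system directly avoids forming the inverse of X'X explicitly, which is a common design choice when only b is needed.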
        Unbiasedness derives from the ability to substitute for y its
equivalent $X\beta + \epsilon$.  Suppose the model is inadequate and requires
additional basis functions so that the true model is

$$y = X\beta + X_1\beta_1 + \epsilon$$

where $X_1\beta_1$ denotes the additional terms in the expansion.  Then

$$b = CX'y = CX'(X\beta + X_1\beta_1 + \epsilon) = CX'X\beta + CX'X_1\beta_1 + CX'\epsilon$$

and

$$E(b) = \beta + CX'X_1\beta_1 = \beta + A\beta_1$$

The matrix $A = CX'X_1 = (X'X)^{-1}X'X_1$ is called the alias matrix.  Though
it is useful in indicating the extent of confounding among various
coefficients in the correct model, it is defined only in relation to an
alternative hypothesis.

        Consider, now, the covariance matrix of b.  Denoted V(b), the
covariance matrix is of order p x p and is given by

$$V(b) = E\{[b - E(b)][b - E(b)]'\} \tag{I-21}$$

But, from (I-20),

$$b - E(b) = b - \beta \tag{I-22}$$

and applying the results of (I-19) gives

$$b - E(b) = \beta + CX'\epsilon - \beta = CX'\epsilon \tag{I-23}$$

Therefore,

$$V(b) = E\{[CX'\epsilon][CX'\epsilon]'\} = E[CX'\epsilon\epsilon'XC'] \tag{I-24}$$

or, since C is symmetric,

$$V(b) = CX'E(\epsilon\epsilon')XC \tag{I-25}$$

But, by the assumption of (I-16), $E(\epsilon\epsilon') = I\sigma^2$.
Therefore,

$$V(b) = CX'XC\sigma^2 = C\sigma^2 \tag{I-26}$$

The results of (I-26) can be expressed as follows:

        (a)   The diagonal elements of the matrix $C = (X'X)^{-1}$, when
              multiplied by $\sigma^2$, the variance of the individual
              observations, provide the variances of the estimated
              coefficients in the model.

        (b)   The off-diagonal elements of C similarly provide the
              covariance between two estimated coefficients, $b_i$ and
              $b_j$.

2.      VARIANCE FUNCTIONS

        The variable y is often referred to as the response.  We wish to
study the variance to which the estimated response $\hat{y}$ is subject as
we consider different points in the x-space.  The necessary information can
be obtained by an extension of the above reasoning.

        Consider an arbitrary point $(x_1, x_2)$ in treatment space and
define a corresponding vector x as

$$x = [\,f_1(x_1, x_2),\; f_2(x_1, x_2),\; \ldots,\; f_p(x_1, x_2)\,] \tag{I-27}$$

Then the estimated response at that point is

$$\hat{y} = xb = b_1 f_1(x_1, x_2) + b_2 f_2(x_1, x_2) + \cdots + b_p f_p(x_1, x_2) \tag{I-28}$$

where b is a column vector of coefficients.
Then,

$$E(\hat{y}) = E(xb) = xE(b) = x\beta \tag{I-29}$$

and

$$\mathrm{Var}(\hat{y}) = E\{[\hat{y} - E(\hat{y})][\hat{y} - E(\hat{y})]'\} = E\{[xb - x\beta][xb - x\beta]'\} = E\{x(b - \beta)[x(b - \beta)]'\} = x\,E[(b - \beta)(b - \beta)']\,x' \tag{I-30}$$

But $E[(b - \beta)(b - \beta)'] = V(b)$.  Therefore,

$$\mathrm{Var}(\hat{y}) = x(X'X)^{-1}x'\sigma^2 \tag{I-31}$$

Equation (I-31) gives the variance of the estimated response at an arbitrary
point $(x_1, x_2)$ in the sampling plane.  Note that this variance depends
strongly on the form of the X matrix, which is determined both by the
location of the design points and the form of the basis functions.

        Equation (I-31) theoretically provides an estimate of the variance
of the estimated response at every point in the x-space.  In the event that
the x-space is two-dimensional, it is possible to display contour maps of
this variance.  The variance is computed at every point in an array of
points in the x-space plane, and these values are then thresholded and
displayed as a variance map.

        Figure I-1 provides a graphic presentation of such a variance
surface for a $3^2$ factorial design using the basis functions listed.
Similar techniques were applied to the modal analysis data in generating the
variance maps exhibiting the effect of reallocation of modes on the
precision of estimation of the model.  In those applications, the two
variables $x_1$ and $x_2$ were speed v and acceleration a.

     FIGURE I-1   VARIANCE SURFACE FOR $3^2$ FACTORIAL DESIGN

-------
                                 APPENDIX II
                        COMPUTER PROGRAM REVISIONS FOR
                       INCREASED COMPUTATIONAL EFFICIENCY

                              *MAIN PROGRAM I*

C
C     MAIN PROGRAM I DETERMINES THE AMOUNT OF EMISSIONS GIVEN OFF BY
C     INDIVIDUAL VEHICLES OVER A DRIVING SEQUENCE SPECIFIED BY ARRAY 'VVT'.
C
C     VTM(I)=>VELOCITY VS. TIME(IN ONE SECOND INTERVALS) OF THE SURVEIL-
C     -LANCE DRIVING SEQUENCE.VTM(I)=VELOCITY(MPH) AT TIME (I-1)SEC
C     (REAL*4)
C
C     VVT(I)=>VELOCITY VS. TIME(IN ONE SECOND INTERVALS) OF ANY DRIVING
C     SEQUENCE OVER WHICH EMISSIONS ARE TO BE CALCULATED.VVT(I)=VELOC-
C     -ITY AT TIME (I-1) SEC. (REAL*4)
C
C     AMTC(I,J)=> AMOUNT OF I'TH EMITTANT GIVEN OFF IN J'TH MODE.
C
C     DS(I)=DISTANCE(MILES)TRAVELED IN I'TH MODE.NOTE:STEADY STATE MODES
C     ARE 60 SEC IN DURATION.
C
C     FUNC(I)=> INTEGRATED BASIS FUNCTIONS CHARACTERIZING A
C     DRIVING SEQUENCE (REAL*8)
C
C
      DIMENSION ITAB(20,2),IDAT(4,19),RDAT(161,19),DS(37)
      DIMENSION VTM(1055),VVT(2000),AMTC(4,37)
      REAL*8 C(4),FUNC(12)
      REAL*8 AA(9,32),AS(3,5),BAD(4,12),XMPG,HCGPM,COGPM,CO2GPM
      DATA DS /.0602,.0741,.0201,.0705,.1360,.1268,.2163,.1716
     C,.2043,.3367,.3136,.1973,.3313,.2994,.0579,.0173,.1759,.1392,.1528
     C,.1304,.2654,.2634,.0737,.3134,.2362,.0444,.4009,.3293,.0886,.2599
     C,.1813,.0592,.0000,.2500,.5000,.7500,1.000/
      DEFINE FILE 99(75,3256,U,N1)
C
C     READ IN SURVEILLANCE DRIVING SEQUENCE
C
      PRINT 1003
 1003 FORMAT(1H0,'SURV. DRIVING SEQ.'//)
      DO 3000 I=1,100
      NX1=((I-1)*16)+1
      NX2=NX1+15
      READ(5,100)(VTM(K),K=NX1,NX2)
      PRINT 1002, (VTM(K),K=NX1,NX2)
 1002 FORMAT(1H0,16F8.0)
  100 FORMAT(16F5.0)
      IF(VTM(NX1).GT.99.0)GOTO3111
 3000 CONTINUE
 3111 CONTINUE
C
C     READ IN DRIVING SEQUENCE OVER WHICH EMISSIONS ARE TO BE CALCULATED
C
C     IN THIS EXAMPLE VVT=> FIRST 505 SEC. OF FTP
C
      PRINT 1004
 1004 FORMAT(1H0,'FTP DRIVING SEQ.'//)
      NPTS=506
      DO 1500 I=1,100
      NX1=((I-1)*16)+1
      NX2=NX1+15
      READ(5,100)(VVT(K),K=NX1,NX2)
      PRINT 1002,
-------
      DO 1000 IR=1,37
      DD=1.0
C
C     FOR A/D MODES CHANGE DATA FROM GRMS/MILE TO GRMS
C
      IF(IR.LE.32)DD=DS(IR)
      DO 1001 IC=1,4
      IW = ((IR-1)*4) + 13 + IC
      AMTC(IC,IR)=RDAT
-------
      SUBROUTINE ESUM(VVT,NT,FUNC,DIST)
C     ***************************************************************
C
C     SUBROUTINE ESUM INTEGRATES THE BASIS FUNCTIONS OVER THE
C     INPUTTED DRIVING SEQUENCE AND DETERMINES THE DISTANCE TRAVELED
C
C
C     VVT(I)=>VELOCITY VS. TIME HISTORY(DRIVING CYCLE) IN ONE SECOND
C     INTERVALS. VVT(I)=VELOCITY(MPH) AT THE I'TH SECOND.REAL
C
C     NT=>MAXIMUM NUMBER SECONDS IN DRIVING CYCLE+1 SECOND
C
C     FUNC(I)=> INTEGRATED BASIS FUNCTIONS CHARACTERIZING THE DRIVING
C     SEQUENCE (REAL*8)
C
C     DIST=DISTANCE(MILES)IN SPECIFIED DRIVING CYCLE,REAL
C
C     ****************************************************************
      DIMENSION VVT(NT)
      REAL*8 X(12),FUNC(12),DIS,AMIN,AMAX,A1,A2,HOA
      AMAX=1.0D0
      AMIN=-1.20D0
      A1=-1.0D0/AMIN
      A2=-1.0D0/AMAX
C
C     CLEAR FUNC ARRAY
C
      DO 1000 I=1,12
 1000 FUNC(I)=0.0D0
C
C     INTEGRATE AUTO'S EMISSION RATE FUNCTION OVER DRIVING CYCLE
C
      DIS=0.0D0
      NTT=NT-1
      DO 3000 IT=1,NTT
      KT=IT+1
      X(1)=1.0D0
      X(2) = DBLE((VVT(IT)+VVT(KT))/2.0)
      X(3)=DBLE(VVT
-------
      SUBROUTINE EDOT(AMTC,AA,AS,BAD)
C
C     SUBROUTINE EDOT COMPUTES THE COEFFICIENTS THAT SPECIFY AN AUTO'S
C     INSTANTANEOUS EMISSION RATE FUNCTIONS FOR HC,CO,NOX(ARRAY 'BAD'),
C     GIVEN THE AMOUNT OF EACH EMITTANT GIVEN OFF BY THE AUTO IN 32 A/D
C     MODES AND 5 STEADY STATE MODES(ARRAY 'AMTC'),AND THE BASIS
C     FUNCTION FACTOR ARRAYS(AA,AS).
C
C**THIS VERSION CALCULATES COEFFICIENTS FOR CO2 ALSO
C     THE DO 1000 LOOP CHANGED TO I=1,4
C
C     AMTC(I,J)=AMOUNT(GMS) OF THE I'TH EMITTANT GIVEN OFF BY THIS AUTO
C     IN THE J'TH MODE. I=1=>HC,I=2=>CO,I=3=>CO2,I=4=>NOX,
C     J=1,37 (32 A/D MODES, 5 STEADY STATE MODES). (REAL)
C
C     BAD(I,J)=J'TH COEFFICIENT OF THIS AUTO'S INSTANTANEOUS EMISSION
C     RATE FUNCTION FOR THE I'TH KIND OF EMITTANT.I=1=>HC,I=2=>CO,
C     I=3=>CO2,I=4=>NOX. (REAL*8)
C
C     AA=>BASIS FI
C     AA=BASIS FUNCTION FACTOR ARRAY FOR ACCEL/DECEL(CALCULATED BY SUBROU
C     -TINE SETUP).
C
C     AS=BASIS FUNCTION FACTOR ARRAY FOR STEADY STATE(CALCULATED BY
C     SUBROUTINE SETUP).
C
C     TM(I)=TIME(SEC) IN I'TH MODE.(REAL)
C
C     ****************************************************************
      DIMENSION TM(37),AMTC(4,37)
      REAL*8 AA(9,32),AS(3,5),BAD(4,12),SUM,YA(32),YS(5),B(3),X0,X1,X2
     C,A1,A2
      DATA TM/12.,16.,8.,11.,13.,12.,17.,12.,14.,3G.,26.,21.,32.,23.,9.,
     C8.,22.,16.,18.,19.,25.,28.,15.,25.,18.,10.,38.,35.,18.,21.,14.,13.
     C,60.,60.,60.,60.,60./
      NOBSA=32
      NOBSS=5
      NBFA=9
      NBFS=3
C
      DO 1000 IC=1,4
C
C     IC=1=>HC,IC=2=>CO,IC=3=>CO2,IC=4=>NOX
C
C     CALCULATE OBSERVED AVERAGE EMISSION RATES OVER 32 A/D MODES
C
      DO 1100 I=1,32
      A1=AMTC(IC,I)
      A2=TM(I)
      YA(I)=A1/A2
 1100 CONTINUE
C
C     CALCULATE COEFFICIENTS THAT SPECIFY A/D EMISSION RATE FUNCTIONS
C
      DO 1200 I=1,NBFA
      SUM=0.0D0
      DO 1250 J=1,NOBSA
      SUM=SUM+(AA(I,J)*YA(J))
 1250 CONTINUE
      BAD(IC,I)=SUM
 1200 CONTINUE
C
C     CALCULATE OBSERVED AVERAGE EMISSION RATES OVER 5 SS MODES
C
      DO 2000 I=33,37
      IP=I-32
      A1=AMTC(IC,I)
      A2=TM(I)
      YS(IP)=A1/A2
 2000 CONTINUE
C
C     CALCULATE COEFFICIENTS THAT SPECIFY SS EMISSION RATE FUNCTIONS
C
      DO 2001 I=1,NBFS
      SUM=0.0D0
      DO 2100 J=1,NOBSS
      SUM=SUM + (AS(I,J)*YS(J))
 2100 CONTINUE
      B(I)=SUM
 2001 CONTINUE
C
C     CHECK ON EXISTENCE OF NEGATIVE EMISSION RATES
C
      LOOP=0
      IF(B(3).EQ.0.0D0)GOTO2151
      X0=(B(2)**2)-(4.0D0*B(3)*B(1))
      IF(X0.LT.0.0D0)GOTO2153
      X0=DSQRT((B(2)**2)-(4.0D0*B(3)*B(1)))
      X1=(-B(2)+X0)/(2.0D0*B(3))
      X2=(-B(2)-X0)/(2.0D0*B(3))
      IF((X1.GT.0.0D0.AND.X1.LT.60.0D0).OR.(X2.GT.0.0D0.AND.X2.LT.60.0D0
     ;))LOOP=1
      GOTO2153
 2151 X0=-B(1)/B(2)
      IF(X0.GT.0.0D0.AND.X0.LT.60.0D0)LOOP=2
 2153 IF(LOOP.EQ.0)GOTO2154
C
C     IF LOOP=0=>NO NEGATIVE EMISSIONS FOR VELOCITIES BETWEEN 0,60
C     IF LOOP=1 OR 2=> NEGATIVE EMISSION RATES BETWEEN 0,60MPH.
C     CALL SUBROUTINE PAD TO FIND COEFFICIENTS WHICH DO NOT PRODUCE
C     NEGATIVE EMISSION RATES.
C
      CALL PAD(YS,B)
 2154 BAD(IC,10)=B(1)
      BAD(IC,11)=B(2)
      BAD(IC,12)=B(3)
 1000 CONTINUE
      RETURN
      END
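The steady-state check at the end of subroutine EDOT can be restated compactly. The sketch below is an illustrative Python paraphrase, not part of the report's listing: it tests whether the fitted steady-state rate b1 + b2*v + b3*v**2 has a root strictly between 0 and 60 mph, which is the condition the listing uses to decide whether subroutine PAD is called. The function name and the guard for a constant polynomial are assumptions added for the example.

```python
import math


def has_root_in_speed_range(b1, b2, b3, v_max=60.0):
    """Return True if b1 + b2*v + b3*v**2 has a root with 0 < v < v_max."""
    if b3 == 0.0:
        if b2 == 0.0:
            # Constant rate: no root. This guard is an addition; the
            # listing divides by B(2) without checking it.
            return False
        root = -b1 / b2
        return 0.0 < root < v_max
    discriminant = b2 * b2 - 4.0 * b3 * b1
    if discriminant < 0.0:
        return False  # no real roots, so the rate never crosses zero
    sq = math.sqrt(discriminant)
    roots = ((-b2 + sq) / (2.0 * b3), (-b2 - sq) / (2.0 * b3))
    return any(0.0 < r < v_max for r in roots)


# Hypothetical coefficient sets used only to exercise the function.
no_crossing = has_root_in_speed_range(1.0, 0.0, 0.01)
linear_crossing = has_root_in_speed_range(-1.0, 0.1, 0.0)
quadratic_crossing = has_root_in_speed_range(5.0, -0.6, 0.01)
```

A root inside the speed range signals that the fitted curve changes sign there. Like the listing, this test does not flag a curve that is negative over the whole range without crossing zero inside it.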


EPA-460/3-74-024
OCTOBER 1974
              AUTOMOBILE  EXHAUST
         EMISSION MODAL ANALYSIS
                    MODEL EXTENSION
                     AND  REFINEMENT
         U.S. ENVIRONMENTAL PROTECTION AGENCY
             Office of Air and Waste Management
         Office of Mobile Source Air Pollution Control
           Certification and Surveillance Division
                Ann Arbor, Michigan 48105

-------
                                   EPA-460/3-74-024
   AUTOMOBILE EXHAUST

EMISSION  MODAL ANALYSIS
      MODEL EXTENSION
       AND REFINEMENT
                   by

              H. T. McAdams

             Calspan Corporation
               P.O. Box 235
             4455 Genesee Street
            Buffalo, New York  14221


            Contract No. 68-03-0435


        EPA Project Officer: C.J.Domke


               Prepared for

    U.S. ENVIRONMENTAL PROTECTION AGENCY
        Office of Air and Waste Management
     Office of Mobile Source Air Pollution Control
       Certification and Surveillance Division
          Ann Arbor, Michigan 48105

               October 1974

-------
This report is issued by the Environmental Protection Agency to report
technical data of interest to a limited number of readers.  Copies are
available free of charge to Federal employees,  current contractors and
grantees, and nonprofit organizations - as supplies permit - from the
Air Pollution Technical Information Center, Environmental Protection
Agency, Research Triangle Park, North Carolina 27711; or, for a fee,
from the National Technical Information Service,  5285 Port Royal Road,
Springfield, Virginia 22161.
This report was furnished to the Environmental Protection Agency by
Calspan Corporation, in fulfillment of Contract No. 68-03-0435.
The contents of this report are reproduced herein as received from
Calspan Corporation. The opinions, findings, and conclusions  expressed
are those of the author and not necessarily those of the Environmental
Protection Agency.  Mention of company or product names is not to be
considered as an endorsement by the Environmental Protection Agency.
                     Publication No. EPA-460/3-74-024

-------
                                   ABSTRACT

          This report on modal analysis of automobile emissions was prepared
for the United States Environmental Protection Agency, Division of Certification
and Surveillance, Ann Arbor, Michigan, under EPA Contract No. 68-03-0435.  The
work reported herein constitutes a refinement and extension of a modal analysis
exhaust emission model originally developed under EPA Contract No. 68-01-0435.
This earlier effort was released as EPA-460/3-74-005, "Automobile Exhaust
Emission Modal Analysis Model".

          The modal analysis exhaust emission model makes it possible to calcu-
late the amounts of emission products emitted by individual vehicles or groups
of vehicles over an arbitrary driving sequence.  Refinements to the model permit
an improvement in computational efficiency and a reduction in input data require-
ments.  Extensions of the model include a scheme for computation of fuel usage
in terms of CO2, CO and HC output by means of a carbon-balance approach and a
procedure for more definitive assessment of the precision of the model in pre-
dicting group emissions.

-------
                               ACKNOWLEDGEMENTS
        Support and encouragement by the Environmental Protection Agency is
gratefully acknowledged.  Particular thanks go to C.J. Domke, M.E. Williams,
and L.A. Platte for their guidance and suggestions.

        A number of individuals at Calspan Corporation contributed signifi-
cantly to the work.  These included P.E. Yates, who performed much of the
computer analysis, and A.C. Keller, who provided valuable suggestions and
criticism.

        Finally, special recognition goes to Paul Kunselman, formerly of
Calspan, who was a prime mover in the development of the original model and
whose insight was instrumental in the follow-on refinements and improvements.

-------
                              TABLE OF CONTENTS


Section                                                               Page

   1     INTRODUCTION 	      1

   2     MODEL COMPUTATIONAL EFFICIENCY   	  	      4

         2.1   ESSENTIAL FEATURES OF THE BASIC MODEL  	      4

         2.2   SIMPLIFICATION OF EMISSION INTEGRATION OVER
               DRIVING SEQUENCES    ... 	     12

         2.3   HYPSOMETRIC ANALYSIS OF DRIVING SEQUENCES  .....     14

         2.4   VEHICLE AND DRIVING SEQUENCE AS VECTORS
               AFFECTING EMISSIONS    . 	     15

   3     MODAL TESTING REQUIREMENTS   	     17

   4     GROUP EMISSION PREDICTIONS   	     44

         4.1   MODEL PERFORMANCE FOR THE SDS BASE CASE	     45

               4.1.1   Accuracy of Group Emission Prediction  ...     46
               4.1.2   Precision of Group Emission Prediction ...     ^g

               4.1.3   Sampling Considerations  	     49

         4.2   MODEL PERFORMANCE FOR ARBITRARY DRIVING SEQUENCES  .     50

               4.2.1   Theoretical Background   	     50

               4.2.2   Variance Computations for Arbitrary
                       Driving Sequences  	     51

   5     PREDICTION OF FUEL ECONOMY   	     57

         5.1   PREDICTION OF CO2    	     58

         5.2   PREDICTION OF MILES PER GALLON   	     64

   6     SUMMARY AND CONCLUSIONS    	     71


APPENDIX  I -  VARIANCE FUNCTIONS FOR REGRESSION ESTIMATES  ....    I-1

APPENDIX II -  COMPUTER PROGRAM REVISIONS FOR INCREASED
               COMPUTATIONAL EFFICIENCY	II-1

-------
                              LIST OF FIGURES
Figure No.                           Title

   1       Average Speeds and Accelerations for Accel/Decel
           and Steady State Modes	     18

   2       Acceleration Versus Speed,  Mode 23   	  .....     20

   3       Acceleration Versus Speed,  Mode 26   	     21

   4       Acceleration Versus Speed,  Composite for All
           Accel/Decel Modes      .  .	  .     22

   5       Speed and Acceleration Versus Time,  Mode 23    	     23

   6       Cumulative Averages for Speed and Acceleration,  Mode 23.     24

   7       Speed/Acceleration Test Design Points  	     29

   8       Normalized Variance Surface Based on 37 ( v, a)
           Design Points    	     30

   9       Normalized Variance Surface Based on 37 ( v, a)
           Design Points	     31

  10       Normalized Variance Surface Based on 67 ( v, a)  t/2
           Design Points    	     33

  11       Normalized Variance Surface Based on 67 ( v, a)  t/2
           Design Points	     34

  12       Normalized Variance Surface Based on 53 ( v, a)  t/2
           Design Points	     35

  13       Normalized Variance Surface Based on 37 (v, a)  t/2
           Design Points    	     36

  14       Arbitrary Driving Sequences  ....  	     54

  15       Mean Steady State CO2  Emission Rates Versus Speed   ...     59

  16       Distribution of CO2 Bag Error from the  Surveillance
           Driving Sequence	     61

  17       Distribution of CO2 Bag Error (First  505 Seconds FTP)   .     62

  18       Distribution of Error  for Miles per  Gallon  Based on  Bag
           Values from the Surveillance Driving Sequence    ....     68

  19       Distribution of Error  for Miles per  Gallon  Based on
           Observed Bag Value Calculations (First  505  Seconds FTP).     69


-------
                               LIST OF TABLES


Table No.                                                             Page
    1      Principal Components (7) of the Correlation Matrix
           for HC for 37 Modes	    39

    2      Ten Rotated Factors of the Correlation Matrix for
           HC for 37 Modes     	    41

    3      Highly Loaded Modes by Factor Number  	    42

    4      Bag Value Error Statistics — Surveillance Driving
           Sequence, 1020 Vehicles	    46

    5      Total Computed Variance of HC over Various
           Driving Sequences	  .    55

    6      Means and Variances of Calculated and Observed
           Bag Values	    55

    7      Comparative Statistics for CO2  	    60

    8      Distribution of CO2 Bag Value Error (Observed - Calculated)
           from the First 505 Seconds of the Federal Test Procedure
           (FTP) and the Surveillance Driving Sequence (SDS) ....    63

    9      Replicate Modal Analysis of CO2 for 61 Vehicles   ....    65

   10      Statistics for Miles/Gallon Error Based on Bag Values .  .    67

   11      Distribution of Miles/Gallon Error (Observed- Calculated)
           from the First 505 Seconds of the Federal Test Procedure
           (FTP) and the Surveillance Driving Sequence (SDS)   ...    70

-------
1.      INTRODUCTION
        Under U.S. Environmental Protection Agency Contract No. 68-01-0435,
Calspan Corporation formulated a model for the prediction of motor vehicle
exhaust emissions over an arbitrary driving sequence.*  The work reported
herein was performed under EPA Contract No. 68-03-0435 as a refinement and
extension of the original model.  Subsequent discussion will assume
familiarity with the original model as presented in EPA-460/3-74-005,
Automobile Exhaust Modal Analysis Model (January 1974);** however, wherever
essential to understanding, details of the model will be repeated in the
present report for the sake of clarity.
        The impact of motor vehicle exhaust emissions on air quality in a
given location depends on a number of factors:  the emission characteristics
of individual vehicles, the mix of vehicles of different types operating in
the location, the numerical density of vehicles per mile or per unit of area,
and the driving pattern in which the vehicles are employed.  To assess the
contribution of motor vehicles to air pollution, therefore, it is necessary
to estimate traffic density, composition and flow characteristics, and to
have some means for expressing these quantities in terms of pollution burden
to the atmosphere.
        The required traffic parameters can be estimated in a straightforward
way.  Traffic in the vicinity can be monitored and classified according to
vehicle make, model, age, and other factors known to influence emissions.
Moreover, speeds and accelerations prevailing along the traffic way in ques-
tion can be measured and tabulated.  Unless emissions can be expressed as
functions of the applicable traffic parameters, however, it is not possible
to assess vehicular contributions to air pollution.
*
  Paul Kunselman and H.T. McAdams,  Automobile Exhaust Emission Modal Analysis
  Model,  Calspan Report No. NA-5194-D-3  (July 1973).
**
  Paul Kunselman, H.T. McAdams, C.J. Domke,  and Marcia Williams,  Automobile
  Exhaust Emission Modal Analysis Model,  Environmental Protection Agency
  Report No. EPA-460/3-74-005  (January 1974).

-------
        The emission tests used for certification of new light duty motor
vehicles are based on a prescribed driving sequence by means of which vehicles
can be compared according to a standard set of operating conditions.  Though
this concept of a standard driving sequence makes it possible to implement
emission standards and to check compliance with these standards, the concept
does not facilitate the prediction of vehicle emissions over an arbitrary
driving sequence.  By breaking the standard sequence into segments (modes) having
specified speeds and accelerations, however, and noting the emissions produced
in each segment, it was postulated that these segments might be recombined appro-
priately to form other driving sequences of interest.  Ultimately, it was hoped
that this process might lead to a model for defining emissions as continuous
functions of vehicle operating conditions and thus make it possible to approxi-
mate emissions over any driving sequence of interest.
        As developed by Calspan under EPA Contract No. 68-01-0435, the
original modal analysis prediction model was based on the concept of an
instantaneous emission rate for each of the primary pollutants carbon
monoxide (CO), hydrocarbons (HC), and oxides of nitrogen  (NOX).  In this
model, it was assumed that the instantaneous emission rate can be adequately
defined as a function $\dot{e} = f(v, a)$ of instantaneous speed, v, and
acceleration, a, for each vehicle.  Since every point in time over a driving sequence
has an associated instantaneous speed and acceleration, the total emission
over the driving sequence can be obtained by appropriate integration of the
emission rate function.  Moreover, by virtue of the mathematical form of
the model, it can be advantageously used to predict emissions  from either
homogeneous or nonhomogeneous groups of vehicles.
        Initial experience with the modal analysis prediction  model suggested
that it be refined and extended with the following objectives  in mind:

        1)    Investigate means to increase the computational
              efficiency of the model.
        2)    Determine whether modal testing requirements can
              be reduced without appreciable loss of information.

        3)    Define the accuracy and precision with which group
              emission predictions can be made from modal data.
        4)    Use the modal analysis approach to predict fuel
              economy over arbitrary driving sequences.
Each of these areas of investigation will be discussed in turn in subsequent
sections of this report.

-------
2.      MODEL COMPUTATIONAL EFFICIENCY
        Relative to the original formulation of the modal analysis emission
model, a significant increase in computational efficiency can be achieved by
a simplification of the method by which the instantaneous emission rate
function, $\dot{e}(v, a)$, is integrated over a driving sequence.  As background for
this simplification, however, it will be instructive to review the essential
features of the model.

2.1     ESSENTIAL FEATURES OF THE BASIC MODEL
        Inputs to the model are based on the Surveillance Driving Sequence
(SDS), in which emissions are measured over a variety of steady state and
transient driving conditions.  The acceleration and deceleration modes repre-
sented in the SDS consist of all possible combinations of the following five
speeds:  0 mph, 15 mph, 30 mph, 45 mph, 60 mph.  The average acceleration or
deceleration rate observed for each mode in the Los Angeles basin is used
during operation of 20 of the transient modes.  In addition, 6 of the tran-
sient modes are repeated using accel/decel rates higher or lower than the
average rate in order to determine the effect of accel/decel rate on emissions.
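The count of 20 average-rate transient modes follows from taking every ordered pair of distinct speeds among the five listed. The short Python sketch below simply enumerates those pairs; the names `SDS_SPEEDS_MPH` and `transient_modes` are illustrative and do not come from the original program.

```python
from itertools import permutations

# Cruise speeds (mph) named in the text for the Surveillance Driving Sequence.
SDS_SPEEDS_MPH = (0, 15, 30, 45, 60)


def transient_modes(speeds=SDS_SPEEDS_MPH):
    """Return every ordered (start, end) pair of distinct speeds."""
    return list(permutations(speeds, 2))


modes = transient_modes()
accel_modes = [(v0, v1) for v0, v1 in modes if v1 > v0]
decel_modes = [(v0, v1) for v0, v1 in modes if v1 < v0]

print(len(modes), len(accel_modes), len(decel_modes))  # prints: 20 10 10
```

Half of the 20 transitions are accelerations and half are decelerations, since each unordered pair of speeds is traversed once in each direction.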
        A difficulty presented by the use of the 37 discrete modes as inputs
to a continuous driving sequence model is that, during much of the sequence,
the vehicle may be operating at velocities and accelerations not included
in the set of five steady state and 32 accel/decel modes.  For example, a
vehicle traveling at 23 mph is neither in the 15 mph nor 30 mph steady state
mode.  To arrive at a continuous predictive model, one must be able to
interpolate or otherwise estimate the appropriate emission rates for all
combinations of speed and acceleration encountered in the driving sequence.
        The primary feature of the model is a scheme whereby emissions from
the 37 discrete modes can be expanded into a continuous function of time.
For this purpose, use is made of a regression function which can, for purposes
of visualization, be represented as a "surface" in speed-acceleration space
as shown below.

-------
                           [Figure: Emission Response Surface]
For any point (v, a) in the speed-acceleration plane, there corresponds an
instantaneous emission rate $\dot{e}(v, a)$.  The surface can be represented by a
mathematical equation of the form $\dot{e} = f(v, a)$, in which the function f
contains a number of adjustable constants.  These constants can be selected to
represent the emission characteristics of a particular automobile or can be
selected to represent the mean emission characteristics of a collection of
automobiles.
        The mass of a particular pollutant emitted by an automobile is a
cumulative, non-decreasing function of time, e(t).  The time derivative of
this function yields the instantaneous emission rate as a function of time:

             $$\dot{e}(t) = \frac{d[e(t)]}{dt}$$                              (1)

In the modal analysis model, it is assumed that the instantaneous emission
rate is a function of vehicle speed and acceleration, both of which are
functions of time.  Thus,

             $$\dot{e}(t) = \dot{e}[v(t), a(t)]$$                             (2)

and

             $$e(T) = \int_0^T \dot{e}[v(t), a(t)] \, dt$$                    (3)

gives the mass of pollutant given off by a vehicle in a driving sequence
lasting T seconds.  Evaluation of the above integral requires (1) specifying
the driving sequence in terms of v(t) and a(t), and (2) specifying
the emission-rate function in terms of speed and acceleration.

-------
        In practice, a driving sequence is specified in terms of the speed
prevailing at each of n discrete, equally spaced points in time, as shown
below.
[Figure: SPEED versus TIME for a driving sequence, with speeds v1, v2, v3, v4
marked at equally spaced points in time]
The integration of equation (3) is then approximated by the summation:
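             $$e(T) = \sum_{i=1}^{n-1} \dot{e}(\bar{v}_i, a_i) \, \Delta t$$      (4)

where

             $$\bar{v}_i = \frac{v_{i+1} + v_i}{2}$$

and

             $$a_i = \frac{v_{i+1} - v_i}{\Delta t}, \qquad n \, \Delta t = T$$
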
The applicable emission-rate function is developed by application of a
generalized version of multiple regression analysis.
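The summation that approximates the integral in equation (3) can be sketched in a few lines of Python. This is a minimal illustration, not the original program: it assumes that the speed used for each interval is the mean of the two endpoint speeds and that the acceleration is their finite difference, and `toy_rate` is a hypothetical rate function with invented constants.

```python
def total_emission(speeds, dt, rate):
    """Approximate e(T) by summing rate(v_i, a_i) * dt over successive intervals.

    speeds : speeds (mph) at equally spaced points in time
    dt     : spacing between points (seconds)
    rate   : callable (v, a) -> instantaneous emission rate (mass per second)
    """
    total = 0.0
    for v_now, v_next in zip(speeds, speeds[1:]):
        v_mid = 0.5 * (v_now + v_next)   # assumed: mean speed over the interval
        accel = (v_next - v_now) / dt    # assumed: finite-difference acceleration
        total += rate(v_mid, accel) * dt
    return total


def toy_rate(v, a):
    # Hypothetical rate function for illustration only; not fitted to any data.
    return 0.1 + 0.01 * v + 0.05 * a * a


trace = [0.0, 2.0, 4.0, 6.0, 6.0, 6.0]   # hypothetical one-second speed samples
print(total_emission(trace, 1.0, toy_rate))
```

Any callable of speed and acceleration can be passed as `rate`, so the same loop serves whichever form of the emission-rate function is eventually adopted.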
        As a starting point for development of the multiple regression
equation for emission rate as a function of speed and acceleration, it will be
instructive to consider first a steady-state emission rate function
defined for constant speed (zero acceleration) only.  It is assumed that
this function can be expanded in the form

             $$\dot{e}_S(v) = a_1 f_1(v) + a_2 f_2(v) + \cdots + a_k f_k(v)$$     (5)

where $a_1, a_2, \ldots, a_k$ are constants applicable to a specific automobile or
group of automobiles, and $f_1(v), f_2(v), \ldots, f_k(v)$ are referred to as basis
functions.  It is emphasized that these functions can assume any form
consistent with the data to be represented, the only requirement being that they
be linearly independent and not contain any adjustable constants dependent
on the data.  The latter requirement assures that, for a given choice of
basis functions, the function $\dot{e}_S(v)$ is completely defined by the model
coefficients $a_1, a_2, \ldots, a_k$.
        In a similar vein, an emission rate function $\dot{e}_A(v, a)$ can be
postulated for non-steady-state operation in which $a \neq 0$.  It is assumed that this
function can be expanded as

             $$\dot{e}_A(v, a) = b_1 g_1(v, a) + b_2 g_2(v, a) + \cdots + b_r g_r(v, a)$$     (6)

where $b_1, b_2, \ldots, b_r$ are constants applicable to a specific automobile or
group of automobiles and the basis functions $g_1(v, a), g_2(v, a), \ldots, g_r(v, a)$
are, as before, linearly independent and free of any adjustable constants to
be determined from the data.
        As an extension of equations (5) and (6), it is logical to postulate
that, by appropriate definition of basis functions, it should be possible to
define an emission rate function $\dot{e}(v, a)$ applicable over the entire
(v, a)-plane regardless of whether $a = 0$ or $a \neq 0$.  Such a universally applicable
equation might assume the form

             $$\dot{e}(v, a) = c_1 u_1(v, a) + c_2 u_2(v, a) + \cdots + c_s u_s(v, a)$$      (7)

where $c_1, c_2, \ldots, c_s$ are constants applicable to a specified automobile or
group of automobiles and the basis functions $u_1(v, a), u_2(v, a), \ldots, u_s(v, a)$
are linearly independent and contain no constants to be determined from the
data.  In the original development of the modal analysis model, however, it
was found advantageous to develop the instantaneous emission rate function
$\dot{e}(v, a)$ as a composite function

             $$\dot{e}(v, a) = h(a) \, \dot{e}_S(v) + [1 - h(a)] \, \dot{e}_A(v, a)$$         (8)

where h(a) is a weighting function bounded in the interval $0 \le h(a) \le 1$.
-------
As employed in the original form of the model, the function h(a) was defined
as follows:

[Equation: piecewise definition of h(a) in terms of the constants $\alpha_1$ and $\alpha_2$]

[Figure: h(a) versus Acceleration]

        By specifying the constants $\alpha_1$ and $\alpha_2$, the weightings of the two
rate functions will vary between 0 and 1 in a continuous manner when the
transition is made between accel/decel and steady state periods of driving.
        Once sets of basis functions have been established for equations (5)
and (6), the coefficients which define the instantaneous emission rate func-
tion could be determined by a straightforward application of least squares
theory provided that instantaneous emission rates were known for a sufficient
number of (v,a)-positions in the speed-acceleration plane.  In reality, how-
ever, the data base for vehicle emissions does not contain any instantaneous
emission rate observations for accel/decel modes; instead, the observations
reported are the total amounts of pollutant collected over each mode and it
is possible to calculate only the average emission rate prevailing during the
time in mode.  In this connection, however, it can be shown that for a
postulated form of the emission rate function, it is possible to deduce the
applicable model coefficients from the modal average emission rates.

-------
        To illustrate this point, consider a situation in which the
instantaneous emission rate can be adequately expressed as a linear combination
of three basis functions $g_1(v, a)$, $g_2(v, a)$, and $g_3(v, a)$.  Then,

             $$\dot{e}(v, a) = b_1 g_1(v, a) + b_2 g_2(v, a) + b_3 g_3(v, a)$$             (9)

Consider a mode of time duration T.  The average emission rate over time T
can be computed from equation (9) as

             $$\langle \dot{e}(v, a) \rangle_T = \frac{1}{T} \int_0^T \dot{e}[v(t), a(t)] \, dt$$       (10)

and from the observed total emission over the mode as $\frac{1}{T} e(T)$, where e(T)
is the "bag value" for the mode and v(t) and a(t) are the speed vs time
and acceleration vs time profiles for the mode in question.  Then,

             $$\langle \dot{e}(v, a) \rangle_T = \frac{1}{T} e(T) = \frac{1}{T} \int_0^T \dot{e}[v(t), a(t)] \, dt$$

             $$\langle \dot{e}(v, a) \rangle_T = \frac{1}{T} \int_0^T \left\{ b_1 g_1[v(t), a(t)] + b_2 g_2[v(t), a(t)] + b_3 g_3[v(t), a(t)] \right\} dt$$     (11)

Termwise integration and removal of the constants $b_1$, $b_2$ and $b_3$ from the
integrand yields

             $$\langle \dot{e}(v, a) \rangle_T = b_1 \left\{ \frac{1}{T} \int_0^T g_1[v(t), a(t)] \, dt \right\}
                  + b_2 \left\{ \frac{1}{T} \int_0^T g_2[v(t), a(t)] \, dt \right\}
                  + b_3 \left\{ \frac{1}{T} \int_0^T g_3[v(t), a(t)] \, dt \right\}$$                (12)

-------
Note, however, that the bracketed expressions are just the time averages of
the basis functions over the time duration of the mode.  Thus, one can write

             $$\frac{e(T)}{T} = b_1 \bar{g}_1 + b_2 \bar{g}_2 + b_3 \bar{g}_3$$

where $\bar{g}_1$, $\bar{g}_2$ and $\bar{g}_3$ are, respectively, the time averages for $g_1$, $g_2$ and $g_3$
over the mode in question.  Since the total emissions for each mode are known,
as well as the corresponding times in mode, the time averages $\bar{g}_1$, $\bar{g}_2$ and $\bar{g}_3$
can be computed for each mode and the coefficients $b_1$, $b_2$ and $b_3$ can be obtained
through least squares regression analysis applied to the average emission rates
as computed from modal bag values.
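The regression on modal averages can be illustrated with a small, self-contained Python sketch. Everything specific in it is hypothetical: the three basis functions, the coefficient values in `TRUE_B`, and the four speed traces are invented so that the fit can be checked against a known answer. The normal equations are solved by plain Gaussian elimination to keep the example free of external libraries; this stands in for, and is not claimed to be, the regression routine used in the original work.

```python
def interval_states(speeds, dt):
    """(mean speed, acceleration) for each interval of an equally spaced trace."""
    return [(0.5 * (v0 + v1), (v1 - v0) / dt) for v0, v1 in zip(speeds, speeds[1:])]


def basis_time_averages(speeds, dt, basis):
    """Time average of each basis function over one mode."""
    states = interval_states(speeds, dt)
    return [sum(g(v, a) for v, a in states) / len(states) for g in basis]


def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_coefficients(G, rates):
    """Least squares fit of rates ~ G b via the normal equations (G'G) b = G'y."""
    k = len(G[0])
    GtG = [[sum(row[i] * row[j] for row in G) for j in range(k)] for i in range(k)]
    Gty = [sum(row[i] * r for row, r in zip(G, rates)) for i in range(k)]
    return solve_linear(GtG, Gty)


# Hypothetical basis functions and "true" coefficients, used only to make
# synthetic bag values whose fit can be verified.
BASIS = [lambda v, a: 1.0, lambda v, a: v, lambda v, a: a * a]
TRUE_B = [0.2, 0.01, 0.05]
DT = 1.0
MODE_TRACES = [                       # hypothetical one-second speed traces (mph)
    [0.0, 5.0, 10.0, 15.0],
    [30.0, 30.0, 30.0, 30.0],
    [45.0, 40.0, 35.0, 30.0, 25.0],
    [0.0, 2.0, 4.0, 6.0, 8.0, 10.0],
]


def bag_value(speeds, dt):
    """Synthetic total emission for a mode, generated from TRUE_B."""
    return sum(sum(b * g(v, a) for b, g in zip(TRUE_B, BASIS)) * dt
               for v, a in interval_states(speeds, dt))


# One row of time-averaged basis functions and one average rate per mode.
G = [basis_time_averages(trace, DT, BASIS) for trace in MODE_TRACES]
avg_rates = [bag_value(trace, DT) / ((len(trace) - 1) * DT) for trace in MODE_TRACES]
b_hat = fit_coefficients(G, avg_rates)
print([round(b, 6) for b in b_hat])
```

Because the synthetic bag values are exactly linear in the coefficients, the fit recovers `TRUE_B` to within rounding error; with measured bag values the same steps would return least squares estimates instead.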
        In the context of model refinement and extension, the modal analysis
model as originally developed under EPA Contract No. 68-01-0435 should be
viewed as a family or continuum of models.  Though initial application of the
modal analysis model concept employed a specific set of basis functions, the
model in a broad sense is amenable to infinite variety in the choice of basis.
Indeed, choice of basis may itself present an avenue for model simplification
and for increased computational efficiency.  Every attempt should be made
to keep the number of basis functions to a minimum and to employ the simplest
basis functions compatible with the data.
        In this connection, it is of interest to review the reasoning by
which the basis functions for the original model were derived.  In the steady-
state (zero acceleration) case, the emission rate is a function of speed only.
For each of the three pollutants (CO, HC, NOX), steady-state emission rates
were averaged over the 1020 vehicles constituting the data base, and these
average emission rates were then plotted as a function of speed.  These plots
suggested that the steady state emission rate function $\dot{e}_S$ could be expressed
as a quadratic function of speed:

             $$\dot{e}_S(v) = s_1 + s_2 v + s_3 v^2$$                        (13)

where $s_1$, $s_2$ and $s_3$ are constants.

-------
        In the case of non-zero acceleration, it was assumed that the
acceleration occurring at a given speed is a perturbation to the steady-state emission
rate at this speed.  This perturbation can be accounted for by expressing the
coefficients $s_1$, $s_2$ and $s_3$ as functions of acceleration.  If it is assumed
that quadratic functions of acceleration represent good approximations to
these coefficients, the coefficients can be expressed as follows:

             $$s_1 = s_1(a) = q_{11} + q_{12} a + q_{13} a^2$$
             $$s_2 = s_2(a) = q_{21} + q_{22} a + q_{23} a^2$$                  (14)
             $$s_3 = s_3(a) = q_{31} + q_{32} a + q_{33} a^2$$

where the q's are constants.  The emission rate function used during times
of non-zero acceleration, $\dot{e}_A$, can then be written in the form:

             $$\dot{e}_A(v, a) = b_1 + b_2 v + b_3 a + b_4 a v + b_5 v^2 + b_6 a^2
                  + b_7 v^2 a + b_8 a^2 v + b_9 a^2 v^2$$                        (15)

where the b's are constants and can be expressed in terms of the q's.  It is
noted that if a = 0, equation (15) reduces to

             $$\dot{e}_A(v, 0) = b_1 + b_2 v + b_5 v^2$$                      (16)

which has the identical form as the equation for $\dot{e}_S$.  Thus, in principle,
$\dot{e}_A$ could be used to determine emissions for both steady state and non-zero
acceleration periods.  As noted earlier in the discussion, however, it was
found advantageous to express instantaneous emission rate as a composite
function

             $$\dot{e}(v, a) = h(a) \, \dot{e}_S(v) + [1 - h(a)] \, \dot{e}_A(v, a)$$         (17)

in which $\dot{e}_S(v)$ is determined independently of $\dot{e}_A(v, a)$.  In this way, the
model is provided with greater flexibility, especially in the vicinity of
zero acceleration, since it has 12 rather than 9 adjustable coefficients for
defining the instantaneous emission rates for each pollutant.
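The composite rate function described above can be sketched in Python as follows. The quadratic steady-state form and the nine-term accel/decel form follow the text. The weighting function `tent_weight`, however, is an assumption made only for this illustration (equal to 1 at zero acceleration and falling linearly to 0 at two cutoff accelerations); the original definition of h(a) should be substituted where it is available. All coefficient values are invented.

```python
def steady_rate(v, s):
    """Quadratic steady-state rate: s1 + s2*v + s3*v**2."""
    s1, s2, s3 = s
    return s1 + s2 * v + s3 * v * v


def accel_rate(v, a, b):
    """Nine-term accel/decel rate in powers of v and a up to second order each."""
    b1, b2, b3, b4, b5, b6, b7, b8, b9 = b
    return (b1 + b2 * v + b3 * a + b4 * a * v + b5 * v * v + b6 * a * a
            + b7 * v * v * a + b8 * a * a * v + b9 * a * a * v * v)


def tent_weight(a, alpha1=-1.0, alpha2=1.0):
    """ASSUMED weighting h(a), not taken from the report.

    Equals 1 at a = 0 and falls linearly to 0 at alpha1 < 0 and alpha2 > 0.
    """
    if a <= alpha1 or a >= alpha2:
        return 0.0
    return 1.0 - a / alpha1 if a < 0 else 1.0 - a / alpha2


def composite_rate(v, a, s, b, weight=tent_weight):
    """h(a) * steady-state rate + [1 - h(a)] * accel/decel rate."""
    h = weight(a)
    return h * steady_rate(v, s) + (1.0 - h) * accel_rate(v, a, b)


# Hypothetical coefficients, chosen only so the arithmetic is easy to follow.
s_coef = (1.0, 0.1, 0.01)
b_coef = (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)

print(accel_rate(2.0, 0.0, b_coef))              # only b1, b2 and b5 terms survive
print(composite_rate(2.0, 0.0, s_coef, b_coef))  # pure steady-state value at a = 0
print(composite_rate(2.0, 2.0, s_coef, b_coef))  # pure accel/decel value beyond cutoff
```

With any weighting of this kind the steady-state coefficients govern the model at zero acceleration and the nine accel/decel coefficients govern it outside the transition band, which is how the 12 coefficients are shared between the two regimes.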

-------
2.2     SIMPLIFICATION OF EMISSION INTEGRATION OVER DRIVING SEQUENCES
        In the original version of the modal analysis model, computation of
total emissions over a driving sequence of time-duration T was achieved by
performing the integration
             $$e(T) = \int_0^T \dot{e}[v(t), a(t)] \, dt$$                    (18)
for each vehicle or group of vehicles of interest.  As will become apparent
below, however, the integral can be reformulated in such a way that,  for a
particular driving sequence, a single integration suffices for all vehicles
subjected to that sequence.
        The composite emission function, as shown in equation (17), can be
written in terms of the basis functions $f_i(v)$ and $g_j(v, a)$.  Noting that

             $$\dot{e}_S(v) = \sum_{i=1}^{k} a_i f_i(v)$$                     (19)

and

             $$\dot{e}_A(v, a) = \sum_{j=1}^{r} b_j g_j(v, a)$$               (20)

one can substitute (19) and (20) for $\dot{e}_S(v)$ and $\dot{e}_A(v, a)$ in (17) and integrate
to obtain

             $$e(T) = \int_0^T h[a(t)] \sum_{i=1}^{k} a_i f_i[v(t)] \, dt
                  + \int_0^T \{1 - h[a(t)]\} \sum_{j=1}^{r} b_j g_j[v(t), a(t)] \, dt$$     (21)

In view of the fact that $a_i$ (i = 1, 2, ..., k) and $b_j$ (j = 1, 2, ..., r) are
constants, (21) can be rewritten as

             $$e(T) = \sum_{i=1}^{k} a_i \int_0^T h[a(t)] \, f_i[v(t)] \, dt
                  + \sum_{j=1}^{r} b_j \int_0^T \{1 - h[a(t)]\} \, g_j[v(t), a(t)] \, dt$$     (22)

Note that (22) contains k integrals of the form

             $$\int_0^T h[a(t)] \, f_i[v(t)] \, dt, \qquad i = 1, 2, \ldots, k$$        (23)

and r integrals of the form

             $$\int_0^T \{1 - h[a(t)]\} \, g_j[v(t), a(t)] \, dt, \qquad j = 1, 2, \ldots, r$$  (24)

The integrands of (23) and (24) are just weighted forms of the basis functions
and do not depend on the magnitudes of the coefficients $a_i$ and $b_j$.
Consequently, once these k+r quantities have been computed for a given driving
sequence, it is necessary to know only the applicable model coefficients $a_i$
and $b_j$ in order to compute emissions for a particular automobile or group
of automobiles negotiating that driving sequence.
        For the choice of basis functions employed in the original model,
k = 3 and r = 9.  Therefore, for each pollutant there are 12 integrals to be
evaluated.  These 12 quantities can be combined with the coefficients $a_i$ and
$b_j$ to compute the mass of pollutant emitted by a particular automobile or
group of automobiles in performing the specified driving sequence.  Subroutine
ESUM of the original model has been revised to integrate the weighted basis
functions over a specified driving sequence and return the results to the
main program where the total emission is calculated.  The revised versions
of ESUM and the main program are given in Appendix II.
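The saving described in this section can be sketched in Python as follows. The sketch is not the revised ESUM subroutine; it only illustrates the idea under stated assumptions: the 3 + 9 polynomial basis functions of the original model, a hypothetical tent-shaped weighting `tent_weight` standing in for h(a), an invented speed trace, and invented coefficient sets for two vehicles. The 12 weighted basis sums are computed once for the driving sequence and then combined with each vehicle's coefficients, and the result is checked against a direct second-by-second evaluation.

```python
import math

# Polynomial basis functions: k = 3 steady-state terms, r = 9 accel/decel terms.
F_BASIS = [lambda v: 1.0, lambda v: v, lambda v: v * v]
G_BASIS = [
    lambda v, a: 1.0, lambda v, a: v, lambda v, a: a,
    lambda v, a: a * v, lambda v, a: v * v, lambda v, a: a * a,
    lambda v, a: v * v * a, lambda v, a: a * a * v, lambda v, a: a * a * v * v,
]


def tent_weight(a, alpha1=-1.0, alpha2=1.0):
    """ASSUMED weighting h(a): 1 at a = 0, linear to 0 at alpha1 and alpha2."""
    if a <= alpha1 or a >= alpha2:
        return 0.0
    return 1.0 - a / alpha1 if a < 0 else 1.0 - a / alpha2


def interval_states(speeds, dt):
    """(mean speed, acceleration) for each interval of the speed trace."""
    return [(0.5 * (v0 + v1), (v1 - v0) / dt) for v0, v1 in zip(speeds, speeds[1:])]


def sequence_vector(speeds, dt, weight=tent_weight):
    """The k + r weighted basis sums for one driving sequence."""
    states = interval_states(speeds, dt)
    steady = [sum(weight(a) * f(v) for v, a in states) * dt for f in F_BASIS]
    transient = [sum((1.0 - weight(a)) * g(v, a) for v, a in states) * dt
                 for g in G_BASIS]
    return steady + transient


def emission(seq_vec, vehicle_vec):
    """Total emission as the sum of products of sequence sums and coefficients."""
    return sum(s * c for s, c in zip(seq_vec, vehicle_vec))


def direct_emission(speeds, dt, vehicle_vec, weight=tent_weight):
    """Reference calculation: evaluate the composite rate interval by interval."""
    a_coef, b_coef = vehicle_vec[:len(F_BASIS)], vehicle_vec[len(F_BASIS):]
    total = 0.0
    for v, a in interval_states(speeds, dt):
        h = weight(a)
        e_s = sum(c * f(v) for c, f in zip(a_coef, F_BASIS))
        e_a = sum(c * g(v, a) for c, g in zip(b_coef, G_BASIS))
        total += (h * e_s + (1.0 - h) * e_a) * dt
    return total


# Hypothetical driving sequence and vehicle coefficients (3 + 9 values each).
trace = [0.0, 3.0, 6.0, 9.0, 10.0, 10.5, 10.5, 8.0, 5.0]
vehicles = {
    "vehicle A": [0.10, 0.010, 0.0010, 0.10, 0.010, 0.050, 0.002, 0.001, 0.010,
                  0.0, 0.0, 0.0],
    "vehicle B": [0.20, 0.005, 0.0020, 0.15, 0.020, 0.030, 0.001, 0.002, 0.020,
                  0.0001, 0.0002, 0.00001],
}

seq_vec = sequence_vector(trace, 1.0)        # computed once per driving sequence
results = {name: emission(seq_vec, coeffs) for name, coeffs in vehicles.items()}
agree = all(math.isclose(results[name], direct_emission(trace, 1.0, coeffs),
                         rel_tol=1e-9)
            for name, coeffs in vehicles.items())
print(results, agree)
```

Adding a third vehicle requires only one more 12-term sum of products; the driving sequence itself is not traversed again.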

-------
2.3     HYPSOMETRIC ANALYSIS OF DRIVING SEQUENCES
        As shown above, computational efficiency of the model can be improved
by noting that, for a given driving sequence, the integrated forms of the
basis functions are invariant and do not need to be recalculated unless a
different driving sequence is postulated.  Further efforts to improve effi-
ciency were aimed at a hypsometric characterization of driving sequences.
        Hypsometry is a term used in geodesy to characterize the measurement
of surface elevation.  In particular, the hypsometric integral is a function
used to quantify that fraction of a geographic area which exceeds a given
threshold level, where the threshold level can be regarded as a continuous
variable.  As applied to the modal analysis model, the hypsometric integral
would provide a characterization of the relative frequency of occurrence of
various speed and acceleration levels.
        In the original form of the model, the driving sequence is described
by specifying the speed for each time increment (generally one second) in the
sequence.  This specification, in turn, establishes the acceleration during
each increment of time.  It should be noted, however, that the computed con-
tribution to emissions during a particular time increment depends only on the
speed and acceleration prevailing during that time interval and is independent
of the speed-time history of the vehicle.  In short, a particular combination
of speed and acceleration is regarded as making the same contribution to the
pollutant output of a vehicle regardless of whether that speed-acceleration
combination occurs early or late in the overall driving sequence.  In view
of this fact, it appeared feasible to describe the speed-time history of a
driving sequence in terms of the joint frequency distribution of speed and
acceleration.  It was further postulated that, for "typical" driving sequences
(e.g., urban or rural), it might be possible to express the distribution
functions in terms of a few adjustable parameters.  For example, if speeds and
accelerations for a particular sequence essentially were to obey a bivariate
normal distribution, then specifying the means, variances and covariance of
speed and acceleration would suffice to describe the distribution.  A useful
application of such parametric description of driving sequences might be in
characterizing the various branches of a road network hypsometrically, so
that pollution abatement studies aimed at optimizing routes in a network
might be more amenable to analysis.
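The order-independence argument can be made concrete with a short Python sketch. It tabulates how often each (speed, acceleration) combination occurs in a hypothetical trace and computes total emissions from those counts alone; `toy_rate` is an invented rate function, and exact (v, a) pairs are used as the "bins" so that the comparison with the time-ordered sum is exact apart from rounding.

```python
import math
from collections import Counter


def interval_states(speeds, dt):
    """(mean speed, acceleration) for each interval of the speed trace."""
    return [(0.5 * (v0 + v1), (v1 - v0) / dt) for v0, v1 in zip(speeds, speeds[1:])]


def joint_counts(speeds, dt):
    """Joint frequency of (speed, acceleration) combinations in the sequence."""
    return Counter(interval_states(speeds, dt))


def emission_from_counts(counts, dt, rate):
    """Total emission from the joint frequency table; time order is not used."""
    return sum(n * rate(v, a) * dt for (v, a), n in counts.items())


def emission_from_trace(speeds, dt, rate):
    """Total emission summed in time order, for comparison."""
    return sum(rate(v, a) * dt for v, a in interval_states(speeds, dt))


def toy_rate(v, a):
    # Hypothetical rate function for illustration only; not fitted to any data.
    return 0.1 + 0.01 * v + 0.05 * a * a


trace = [0.0, 2.0, 4.0, 4.0, 4.0, 2.0, 0.0, 2.0, 4.0, 4.0]   # hypothetical, 1 s steps
counts = joint_counts(trace, 1.0)
from_counts = emission_from_counts(counts, 1.0, toy_rate)
from_trace = emission_from_trace(trace, 1.0, toy_rate)
print(dict(counts))
print(from_counts, from_trace, math.isclose(from_counts, from_trace))
```

A parametric description such as a bivariate normal distribution would replace the table of counts with a fitted density, at the cost of some approximation error.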
        As far as computation of the total emissions e(T) over  a driving
sequence of T seconds is concerned, implications of the hypsometric analysis
of speed and acceleration would be felt through the functions v(t) and a(t)
in equation (22).  In view of this fact, and in view of the readiness with
which the weighted basis functions can be computed, no further development
of the hypsometric description of driving sequences was pursued.  In reality,
the values of the k+r  integrals in equations (23) and (24) constitute a com-
plete description of the driving sequence so far as the model is concerned, and,
within the limits of validity of the model, completely characterize the effect
of the driving sequence or "route" on emissions.  Similarly, the values of the
k coefficients $a_i$ (i = 1, 2, ..., k) and the r coefficients $b_j$ (j = 1, 2, ..., r)
completely characterize, again within the limits of model validity, the
"vehicle effect" on emissions for that particular route.

2.4     VEHICLE AND DRIVING SEQUENCE AS VECTORS AFFECTING EMISSIONS
        The relation between vehicle and driving sequence (route) effects on
total emissions e(T)  can be expressed succinctly in vector notation.
        Let the values of the k+r integrals in equations (23) and (24) be
considered as components of a (k+r)-dimensional driving sequence vector

             $$\underline{S} = \left( \int_0^T h f_1 \, dt, \ldots, \int_0^T h f_k \, dt,
                  \int_0^T (1 - h) g_1 \, dt, \ldots, \int_0^T (1 - h) g_r \, dt \right)$$     (25)

where the arguments of h, $f_i$ and $g_j$ are as written in (23) and (24).
Similarly, let the k coefficients $a_i$ (i = 1, 2, ..., k) and the r coefficients
$b_j$ (j = 1, 2, ..., r) be considered as the components of a (k+r)-dimensional
vehicle vector

             $$\underline{V} = (a_1, a_2, \ldots, a_k, b_1, b_2, \ldots, b_r)$$                (26)

Then, the total emissions e(T) for a particular vehicle operating according
to a specified driving sequence can be expressed as the vector inner (dot)
product

             $$e(T) = \underline{S} \cdot \underline{V}$$                               (27)
It is to be observed that when the driving sequence consists of a single mode,
either steady-state or accel/decel, the dot product in equation (27) reduces
to a computed estimate of the bag value for that mode.  Also, it should be
noted that the vehicle vector (26) can represent a group of vehicles rather
than a single automobile.
        The vector form of the model, as elucidated above, can be further
systematized to consider the effects on emissions of various mixes of vehicles
and various driving sequences (routes).  Let $\underline{s}_1, \underline{s}_2, \ldots, \underline{s}_p$ denote the sequence
vectors for p alternative driving sequences, and let $\underline{v}_1, \underline{v}_2, \ldots, \underline{v}_q$ denote the
vehicle vectors for q alternative mixes of vehicles.  Then, if the vectors
$\underline{s}_1, \underline{s}_2, \ldots, \underline{s}_p$ are considered as columns of a matrix S and if the vectors
$\underline{v}_1, \underline{v}_2, \ldots, \underline{v}_q$ are considered as columns of a matrix V, one can write

             $$E = S' V$$                                              (28)

where S' (the transpose of S) is a matrix of order p x (k+r), V is a matrix of
order (k+r) x q, and E is a matrix of order p x q.  The matrix E consists of elements
$e_{ij}$ (i = 1, 2, ..., p; j = 1, 2, ..., q) which provide estimates of the total
emissions generated by the jth mix of vehicles operating according to the ith
driving sequence.
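The matrix form amounts to taking the dot product of every sequence vector with every vehicle vector. A minimal Python sketch follows; for brevity it uses invented three-component vectors rather than the twelve components of the original model, and the function name `emission_matrix` is illustrative.

```python
def emission_matrix(sequence_vectors, vehicle_vectors):
    """E[i][j] = dot product of sequence vector i with vehicle vector j."""
    return [[sum(s * c for s, c in zip(seq, veh)) for veh in vehicle_vectors]
            for seq in sequence_vectors]


# Hypothetical data: p = 2 driving sequences, q = 3 vehicle mixes, and
# k + r = 3 components per vector (shortened for readability).
sequences = [
    [10.0, 200.0, 5.0],
    [20.0, 900.0, 2.0],
]
vehicle_mixes = [
    [0.1, 0.01, 0.5],
    [0.2, 0.02, 0.0],
    [0.0, 0.05, 1.0],
]

E = emission_matrix(sequences, vehicle_mixes)
for row in E:
    print([round(value, 6) for value in row])
```

Row i of the result collects the estimates for the ith driving sequence and column j those for the jth vehicle mix, so E has p rows and q columns.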

-------
3.      MODAL TESTING REQUIREMENTS
        As originally implemented, the modal analysis model employed three
basis functions in the steady-state portion of the emission-rate equation
and nine basis functions in the accel/decel portion of the equation.  The
fact that the resulting 12 regression coefficients are considerably fewer
than the 37 modes used as data inputs to the model suggests that there is
a certain amount of redundancy in the modal data.  On the other hand, there
are regions of the speed-acceleration plane not adequately represented by
modal data, and this fact could occasion unwarranted imprecision in the
performance of the model, particularly in those regions of the speed-
acceleration plane where modes are sparse.  A revised allocation of speeds
and accelerations by modes, as well as a possible reduction in the number
of modes, is therefore suggested, provided this reallocation and/or reduction
does not adversely affect other aspects of the emission-measurement protocol.
        Several techniques were employed to examine the implications of re-
allocation of modal test points in the speed-acceleration plane.  These
included visual examination of the modal-distribution pattern, the computation
of variance maps indicative of error propagation over the (v, a)-plane,
and principal component analysis of modal contributions to the model
performance.
        Figure 1 is a plot of the average speeds and accelerations for the
32 acceleration/deceleration modes and the 5 steady-state modes which consti-
tute inputs to the emission model as originally formulated.  The sparse or
empty regions of the speed-acceleration plane are clearly evident, particularly
that portion of the plane between accelerations of -1 mph/sec and +1 mph/sec.
As will become apparent later, the lack of information in this region of the
plane tends to exaggerate the uncertainty of prediction in that region and is,
at least in part, the reason that the steady-state and accel/decel portions
of the model must be bridged in a rather arbitrary way in the original model.
        In view of the fact that for each mode a speed is specified for each
second of time in mode, it is possible to estimate the corresponding
accelerations on a second-by-second basis and to plot acceleration versus
speed profiles for each mode.

[Figure 1: average speeds and accelerations for the accel/decel and steady-state modes]

-------
        Let us consider, however, the second-by-second schedules maintained
in the various modes and plot acceleration/speed profiles on this basis.
Figure 2 is such a plot for mode 23, an acceleration mode, and Figure 3
is such a plot for mode 26, a deceleration mode.  As is to be expected,
these plots show that actual speeds and accelerations realized over short
time increments in these modes span regions of the acceleration-speed plane
not represented if only the modal averages are considered.  This fact is
made clear by Figure 4, which is a composite plot of second-by-second
accelerations and speeds achieved when results of all 32 accel/decel modes
are combined.  The plot suggests that many of the gaps shown in Figure 1
might be filled in if appropriate speeds and accelerations in Figure 4 can
be regrouped and averaged to present a revised set of modes more advantageous
as model inputs.
        Consider the time plot of mode 23, as shown in Figure 5.  A noticeable
degree of asymmetry in this plot is evident.  For example, the early part of
the mode exhibits greater accelerations than the latter part of the mode,
and this fact suggests that the mode might be divided into two parts so as
to provide a model input to fill part of the gap presently existing in the
low-acceleration region.
        A scheme for examining this concept is as follows.  Compute average
speed and acceleration for the first n seconds in mode and for the remaining
N-n seconds in mode, where N is the total number of seconds in mode.  Plot
these two results as functions of n over the region $0 \le n \le N$, as shown in
Figure 6.  This plot provides a set of options for redefinition of the mode
so as to more adequately span the speed-acceleration plane.  By electing
various options for redefinition of the modal inputs to the model, one can
examine the consequences of this redefinition by means of the variance-function
concept explained below.
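The splitting scheme can be sketched in Python as follows. The speed trace is hypothetical (it is not the schedule of mode 23), interval speeds are taken as the mean of the endpoint speeds, and the function name `split_averages` is illustrative. Splits with an empty first or last part (n = 0 or n = N) are omitted because their averages are undefined.

```python
def split_averages(speeds, dt=1.0):
    """Average (speed, acceleration) over the first n and last N - n increments.

    Returns {n: ((v_first, a_first), (v_last, a_last))} for n = 1, ..., N - 1,
    where N is the number of time increments in the mode.
    """
    states = [(0.5 * (v0 + v1), (v1 - v0) / dt) for v0, v1 in zip(speeds, speeds[1:])]
    total = len(states)

    def mean_state(chunk):
        return (sum(v for v, _ in chunk) / len(chunk),
                sum(a for _, a in chunk) / len(chunk))

    return {n: (mean_state(states[:n]), mean_state(states[n:]))
            for n in range(1, total)}


# Hypothetical acceleration mode with stronger acceleration early in the mode.
trace = [0.0, 5.0, 9.0, 12.0, 14.0, 15.0]
options = split_averages(trace)
for n, (first, last) in options.items():
    print(n, first, last)
```

Each entry is one candidate redefinition of the mode: the first part supplies a higher-acceleration, lower-speed point and the last part a lower-acceleration, higher-speed point in the speed-acceleration plane.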

3.2     VARIANCE-FUNCTION ASSESSMENT OF ERROR PROPAGATION
        For a particular pollutant and for a particular vehicle or group
of vehicles, the emission measured for each of the 5 steady-state and 32
accel/decel modes can be regarded as a random variable.  In other words,


[Figure 2   ACCELERATION VERSUS SPEED MODE 23: second-by-second points, with
ACCELERATION (MPH/SEC) on the horizontal axis and SPEED (MPH) on the vertical axis]

-------
[Figure 3  ACCELERATION VERSUS SPEED MODE 26: second-by-second points, with
ACCELERATION (MPH/SEC) on the horizontal axis and SPEED (MPH) on the vertical axis]

-------
[Figure 4  ACCELERATION VERSUS SPEED COMPOSITE FOR ALL ACCEL/DECEL MODES:
second-by-second points, with ACCELERATION (MPH/SEC) on the horizontal axis and
SPEED (MPH) on the vertical axis]

-------
                        Speed and Acceleration versus  Time
                                    (Mode  23)
                               ttnttttitttitffti i! n iii'-H^i iifi^t'tp' Si
 8     9      10
Time  (seconds)
13
14
15

-------
     15
     14
13
12
11
                                        Last N -'h "points  (c.f., v- and a")
10
8
                                                        4* I i 111:111111
U
O
V)
               m
£K=
.£3.0

&
                 -^
          m
                            ^+*
        ^
        45
                                            i_t-j_j.4 (-i4_W4-J
                               FIGURE 6

           Cumulative Averages for Speed and Acceleration
                                     5      67     8

                                     First n Points  (c. f.
                                                               10
                                                           11    12
                                                                13
                                                                14
                                                               15
                                                         and

-------
the measurement of the modal bag values is subject to error, a fact well
demonstrated by the inability to obtain the same emission mass measurements
on repeated or replicate tests.  Each measurement, therefore, can be regarded
as being subject to a certain variance.  This variance can be expected to
propagate through the regression model to induce uncertainty in the emission
estimates computed at every point in the (v, a ) -plane.  The magnitude of
this uncertainty varies as a function of position in the (v, a ) -plane and
can thus be regarded as a variance function of speed and acceleration.
Conceptually, this function can be viewed as a variance "surface" and can be
graphically portrayed by means of variance contours.  The variance function
can be computed if the basis functions of the emission-rate function are
specified, if the locations of the modal input points in the (v, a ) -plane
are known, and if there is available an estimate of the error variance for
each of the input-mode bag values.
        The functional representation of the emission rate function used in
the automobile emission model is given by the weighted composite of the
accel/decel and steady-state instantaneous emission rate functions, \dot{e}_A
and \dot{e}_S respectively:

                \dot{e}(v, a) = \omega \dot{e}_S(v) + (1-\omega) \dot{e}_A(v, a)       (29)

where \omega is a weighting function dependent upon acceleration.
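
As an illustration of Equation (29), the following minimal Python sketch blends a steady-state rate function and an accel/decel rate function with an acceleration-dependent weight. The names `weight`, `composite_rate`, `e_steady`, and `e_accel`, the placeholder coefficients, and the particular shape of the weight (1 at a = 0, falling linearly to 0 at |a| = 1 mph/sec) are assumptions made for illustration only; the weighting function actually used in the model is not reproduced here.

```python
def weight(a, a_cut=1.0):
    """Hypothetical weighting function omega(a): 1 at a = 0, falling
    linearly to 0 at |a| = a_cut (mph/sec). Assumed shape, not the model's."""
    return max(0.0, 1.0 - abs(a) / a_cut)


def composite_rate(v, a, e_steady, e_accel):
    """Equation (29): weighted composite of the two rate functions."""
    w = weight(a)
    return w * e_steady(v) + (1.0 - w) * e_accel(v, a)


# Placeholder rate functions with illustrative coefficients only.
def e_steady(v):
    return 0.10 + 0.002 * v


def e_accel(v, a):
    return 0.10 + 0.002 * v + 0.05 * a * a


if __name__ == "__main__":
    for a in (0.0, 0.5, 2.0):
        print(a, composite_rate(30.0, a, e_steady, e_accel))
```

Under these assumptions the composite reduces to the steady-state function at a = 0 and to the accel/decel function once |a| reaches the cut-off.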
        The accel/decel and steady-state instantaneous emission rate
functions are expressed as linear combinations of basis functions of speed and
acceleration.  In general, the linear model which gives the true response,
y, of a vehicle or group of vehicles is given by:

                y = \beta_1 f_1 + \beta_2 f_2 + \beta_3 f_3 + \cdots + \epsilon       (30)

where \beta_i, i = 1, 2, 3, ..., are constants, f_1, f_2, f_3, ... are the basis
functions of velocity and acceleration, and \epsilon is the random error.
        Since \epsilon is a random variable, the responses observed at each (v, a)
point also constitute a random variable.  As a result, it is only possible to
obtain from the observations an equation of the form:

                \hat{y} = b_1 f_1 + b_2 f_2 + b_3 f_3 + \cdots                        (31)

where \hat{y} is an estimate of y and b_i, i = 1, 2, 3, ..., is an estimate of \beta_i.
The estimated responses predicted by the model are currently based on measure-
ments of bag values at 32 accel/decel and 5 steady state average velocity/
acceleration points of the Surveillance Driving Sequence.
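
The estimation step described above can be sketched as an ordinary least-squares fit. In the following Python fragment the six-term quadratic basis in `basis`, and the names `fit_coefficients` and `predict`, are hypothetical stand-ins chosen for illustration; the basis functions actually used by the model are not listed in this section.

```python
import numpy as np


def basis(v, a):
    """Hypothetical basis functions f_1..f_6 of speed v and acceleration a."""
    return np.array([1.0, v, a, v * a, v * v, a * a])


def fit_coefficients(points, y):
    """Least-squares estimates b_i of the constants beta_i.

    points: sequence of (v, a) design points; y: observed responses."""
    X = np.array([basis(v, a) for v, a in points])
    b, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return b


def predict(b, v, a):
    """Estimated response y_hat = b_1 f_1 + b_2 f_2 + ... at (v, a)."""
    return float(basis(v, a) @ b)


if __name__ == "__main__":
    # Illustrative design points and error-free responses.
    pts = [(v, a) for v in (0.0, 20.0, 40.0, 60.0) for a in (-2.0, 0.0, 2.0)]
    beta = np.array([1.0, 0.05, 0.3, 0.01, 0.001, 0.2])
    y = [basis(v, a) @ beta for v, a in pts]
    print(fit_coefficients(pts, y))
```

With error-free responses the fit recovers the constants exactly (to rounding); with measurement error present, the b_i scatter about the beta_i, which is the source of the variance discussed next.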
        A detailed explanation of the method of computing the variance func-
tion for regression estimates is given in Appendix II and will not be
duplicated here.  Suffice it to say that the variance function is controlled
by three considerations:

        1)    The type of basis functions employed in the
              regression model,
        2)    The positions in the (v, a)-plane, called
              design points, at which modal emission
              measurements are taken, and
        3)    The magnitude of the error variance \sigma^2 at each
              design point.

For purposes of this analysis, \sigma^2 is regarded as constant over all design
points.
        The estimated emission response \hat{y} as computed by the modal analysis
model is a weighted combination of the steady-state estimate \hat{y}_S and the
accel/decel estimate \hat{y}_A:

                \hat{y} = \omega \hat{y}_S + (1-\omega) \hat{y}_A                       (32)

Therefore, on the assumption that the errors involved in the two components of
the estimate are statistically independent,

                Var(\hat{y}) = \omega^2 Var(\hat{y}_S) + (1-\omega)^2 Var(\hat{y}_A)    (33)

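
A short numerical illustration of the variance of the weighted combination may be helpful. The function name `combined_variance` and the numbers used are hypothetical; the only assumption carried over from the text is the statistical independence of the two component errors.

```python
def combined_variance(w, var_s, var_a):
    """Variance of w*y_S + (1 - w)*y_A for independent component errors."""
    return w ** 2 * var_s + (1.0 - w) ** 2 * var_a


if __name__ == "__main__":
    # Illustrative values only: equal weights, component variances 0.2 and 0.4.
    print(combined_variance(0.5, 0.2, 0.4))
```

For a weight of 0.5 and component variances of 0.2 and 0.4 the result is 0.25(0.2) + 0.25(0.4) = 0.15, which is smaller than either component variance; at a weight of 1 or 0 the result reduces to the steady-state or accel/decel variance alone.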
In the following discussion, the variance function has been computed using
this weighted combination of the steady-state and accel/decel portions.
        Var(\hat{y}) varies at different coordinates in (v, a)-space.  At some
points the response can be estimated with relatively little error; at other
positions the error can be quite large.  As shown in Appendix I, the
variance in the estimated response at a point P in the (v, a)-plane is
given by

                Var(\hat{y}) = x (X'X)^{-1} x' \sigma^2                   (34)

where x is a vector obtained by evaluating each of the basis functions at
the particular point P and X'X is the matrix of the least squares normal
equations.  Therefore, taken over every point P, (34) is actually a variance
function.  By dividing both sides by \sigma^2, one can obtain the function in
normalized form:

                Var(\hat{y})/\sigma^2 = x (X'X)^{-1} x'                   (35)
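
Equations (34) and (35) can be evaluated directly once a basis and a set of design points are chosen. The sketch below is illustrative only: the quadratic `basis` and the grid of design points are hypothetical, not the basis functions or the 37 modal points of the model.

```python
import numpy as np


def basis(v, a):
    """Hypothetical basis functions of speed v and acceleration a."""
    return np.array([1.0, v, a, v * a, v * v, a * a])


def normalized_variance(points, v, a):
    """Equation (35): Var(y_hat)/sigma^2 = x (X'X)^-1 x' at the point (v, a)."""
    X = np.array([basis(vi, ai) for vi, ai in points])
    x = basis(v, a)
    # Solve (X'X) z = x' rather than forming the inverse explicitly.
    return float(x @ np.linalg.solve(X.T @ X, x))


def variance(points, v, a, sigma2):
    """Equation (34): the normalized variance scaled by the error variance."""
    return sigma2 * normalized_variance(points, v, a)


if __name__ == "__main__":
    # Illustrative design: a regular grid in the (v, a)-plane.
    pts = [(v, a) for v in (0.0, 20.0, 40.0, 60.0) for a in (-2.0, 0.0, 2.0)]
    print(normalized_variance(pts, 30.0, 0.0))   # inside the design region
    print(normalized_variance(pts, 70.0, 4.0))   # outside the design region
```

A useful check on any such computation is that the normalized variances summed over the design points themselves equal the number of basis functions (the trace of the least-squares projection matrix).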

This emission-rate variance function can be viewed as a response surface
generated by evaluating the function at given increments over any region of
interest.  The propagation of error over the (v,  a)-space can be considered
relative to the basis functions used and the design points chosen by examina-
tion of (35), the variance function in normalized form.  The actual magnitude
of the variance at any point can be examined by evaluating (34), which
includes a scalar multiplication by the error variance \sigma^2.
        The reduction in the number of modes or alteration thereof without
loss of information was to be investigated.  To this end, the change in the
variability of the emission rate function as a result of changing the modal
design points was examined.  Variance surfaces were generated using the
normalized variance function so as to isolate the error introduced by changing
the design points without introducing the actual error variance \sigma^2.
        As a base for purposes of comparison, the variance surface using the
average velocities and accelerations of the 32 accel/decel and 5 steady state
modes of the Surveillance Driving Sequence was generated.  Figure 7 shows the
locations of these initial design points as "dots" which have been labeled
with their modal numbers.  The resulting variances are contoured at various
thresholds in Figures 8 and 9.
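
The threshold maps of Figures 8 and 9 can be imitated by assigning a symbol to each grid value of the normalized variance. The following sketch uses the threshold values listed with Figure 8; the convention assumed here (symbol k marks a value at or below the k-th threshold and above the preceding one, with a blank above the last threshold) is a guess at how the maps were printed, and the names `symbol` and `render` are hypothetical.

```python
import bisect

# Threshold values and symbols as listed in the legend of Figure 8.
THRESHOLDS = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45]
SYMBOLS = "12345678"


def symbol(value):
    """Map one normalized-variance value to its map symbol (assumed convention)."""
    i = bisect.bisect_left(THRESHOLDS, value)
    return SYMBOLS[i] if i < len(SYMBOLS) else " "


def render(grid):
    """Turn rows of normalized-variance values into rows of map symbols."""
    return ["".join(symbol(x) for x in row) for row in grid]


if __name__ == "__main__":
    for line in render([[0.50, 0.42, 0.31, 0.22], [0.42, 0.31, 0.22, 0.12]]):
        print(line)
```

Each row of the grid would correspond to one acceleration value and each column to one speed value.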
        Although average velocities ranging from 0 mph to 60 mph and average
accelerations ranging from approximately -3 mph/sec to +2.5 mph/sec are
included, examination of Figure 7 reveals that the actual (v, a) points are
quite randomly located and do not appear to adequately represent the entire
region.  In particular, the region of -1.2 \leq a \leq 1.0 is not well covered
except for the steady-state modes.  (It was due to this lack of information
and the associated uncertainty involved in predicting emissions that the
accel/decel and steady-state functions were weighted in the model.)  Also
not well represented are the regions in which velocity approaches 0 mph or
60 mph and the absolute value of the acceleration rate is large.  By dividing
each of the accel/decel modes into subsets, it was possible to "fill in"
regions which were poorly represented.
        The following strategies for decomposing the modes were investigated.

        1.    0 - t/2,   t/2 - t               (t = mode duration)
        2.    0 - t/3,   t/3 - t
        3.    0 - 2t/3,  2t/3 - t
        4.    0 - t/3,   t/3 - t     for decel modes
              0 - 2t/3,  2t/3 - t    for accel modes
        5.    v_1 - (v_1+v_2)/2,  (v_1+v_2)/2 - v_2     (v_1 = mode initial velocity)
        6.    v_1 - \bar{v},  \bar{v} - v_2             (v_2 = mode final velocity)
                                                        (\bar{v} = mode average velocity)
        7.    0 - 4 sec.,  4 sec. - t
        8.    0 - (t-4) sec.,  (t-4) sec. - t

In each case, each segment was arbitrarily constrained to cover at least 4
seconds to allow for adequate data collection.
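
Strategy 1 can be sketched as follows. The fragment splits a mode's speed-time trace at t/2 and computes an average speed and an average acceleration for each half, enforcing the 4-second minimum noted above. The function names, the use of a time-weighted (trapezoidal) average for speed, and the end-point difference for acceleration are assumptions; the report does not state in this section how the segment averages were formed.

```python
import numpy as np


def design_point(times, speeds, t0, t1, min_len=4.0):
    """Average speed (mph) and average acceleration (mph/sec) on [t0, t1]."""
    if t1 - t0 < min_len:
        raise ValueError("segment shorter than the minimum length")
    t = np.asarray(times, dtype=float)
    v = np.asarray(speeds, dtype=float)
    inside = (t > t0) & (t < t1)
    ts = np.concatenate(([t0], t[inside], [t1]))
    vs = np.interp(ts, t, v)  # linear interpolation at the segment ends
    mean_v = np.sum(0.5 * (vs[1:] + vs[:-1]) * np.diff(ts)) / (t1 - t0)
    mean_a = (vs[-1] - vs[0]) / (t1 - t0)
    return float(mean_v), float(mean_a)


def split_at_half(times, speeds):
    """Strategy 1: design points for the segments 0 - t/2 and t/2 - t."""
    t0, t1 = float(times[0]), float(times[-1])
    mid = 0.5 * (t0 + t1)
    return (design_point(times, speeds, t0, mid),
            design_point(times, speeds, mid, t1))


if __name__ == "__main__":
    # Illustrative mode: constant 2.5 mph/sec acceleration for 12 seconds.
    times = list(range(13))
    speeds = [2.5 * t for t in times]
    print(split_at_half(times, speeds))
```

For a mode of constant acceleration the two halves share the same average acceleration but differ in average speed, so the split spreads the design points along the speed axis.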

                     FIGURE 7   SPEED-ACCELERATION TEST DESIGN POINTS
                     [axes: speed (1 to 60) versus Acceleration (MPH/sec)]

-------
           FIGURE 8    Normalized Variance Surface Based on 37 (v,a) Design Points
     THRESHOLDS
            0.1000        0.1500        0.2000        0.2500        0.3000
            1             2             3             4             5

            0.3500        0.4000        0.4500
            6             7             8

                         87777888       88777778
                      876555556666666666555555567
                    765444444444444444443333334568
                   8654433333333333333333222222334568
                 8765433322222233332222222222222334578
                   7654333222222222222222222111122233457
~ •  -             876544333322222222222222222111222234567
«    -              76554433333333333333222222222222334567
^.   -                8765444333333333333333332222222334567
jx    -                 876554444444444444443333333333344578
^   -                8665444333333333333333333222223334567
g    -             8765544333333333333333333332222233344567
•S-   -           87665444433333333344443333333333333333445678
«    -         87665554444444444445555554444444444444444455678
&    -       87766655555555666666666666666666665555555555566678
u  ,;0^.      88777777777788888              88888777777777788
u    -       87766655555566666667777777777766666655555555566778
              876655555444555555555555555555544444444444556678
                 77655444444444444444444444443333333334445678
                   76654444333333333333333333333333333445678
                    876554443333333333333333333222233344568
                      87655444444444444443333333222233345678
                       876655444444444444444333333333344567
                       7665544444444444444333333222223344568
                    876554444333333333333333222222223334568
                   8765544333333333333333322222222223344567
                 876544333333333333333322222222223344568
                87654333322222333333333322222233345678
                 7654333222233333333333333333344567
                   8654333333334444444444444445567
                    764443444455566666666666678
                      87655556677788888888888
                        8777788
  -4  0                           SPEED (MPH)                               70

-------
            FIGURE 9  Normalized Variance Surface Based on 37 (v,a) Design Points


        THRESHOLDS
               0.4500        0.5000        0.7500        1.0000        2.5000
               A             B             C             D             E

               5.0000        7.5000        10.0000
               F             G             H

  »4 —              HHGGFFFFFFFFFFFFFFFFFFFFFFGGGHH
                   HHGGFFFFFFFFFFFFFFFFFFFFFFFFFGGHH
                  HHGFFFFFFFFFFFFFFFFFFFFFFFFFFFFGGH
                 HGGFFFFEEEEEEEEEEEEEEEEEEEEEEFFFFGGH
                HGGFFFFEEEEEEEEEEEEEEEEEEEEEEEEEFFFGGHH
               HGGFFFEEEEEEEEEEEEEEEEEEEEEEEEEEEEFFFGGHH
             HHGGFFFEEEEEEDDDEEEEEEEEEEEEEEEEEEEEEFFFFGGH
           HHGGFFFEEEEEDOCCCCDDDDOODDDODDDCCDDDEEEEEFFFGGH
          HGGFFFFEEEEDCCCCCCCCCCCCCCCCCCCCCCCCCDOEEEEFFFGGHH
     - HHGGGFFFEEEEDDCCBAAAAAAAABBBBBBAAAAAAAABCCOEEEEFFFFGGHH
     -HGGFFFFEEEEEDCCAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDEEEEFFFFGGHH
     -GFFFFEEEEEDCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCOEEEEEFFFFGGGHH
     -FFFEEEEEDDCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCDEEEEEFFFFFGGGHH
     -FFEEEEEODCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCDEEEEEEFFFFFGGGHH
     -FEEEEEEDDCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDDEEEEEFFFFFGGGGHHH-
~   -FEEEEEEEDCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDDEEEEEFFFFFFGGGH-
u    -FFFEEEEEEDDCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDDEEEEEFFFFFFGGG-
«    -FFFFEEEEEEDDCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDDEEEEEFFFFFGGGG-
x    -FFFFFEEEEEEDOCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCDOEEEEEFFFFFGGGGH-
t    -FFFFEEEEEEDOCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCODE.EEEEEFFFFFGGG-
^   -FFEEEEEEODCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDOEEEEEEFFFFFFG-
o    -EEEEEEDDCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCODEEEEEEEFFFF-
v    -EEOOOCCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCDDEEEEEEEFF-
S    -OOCCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDOEEEEEEEE-
"  0  —ODCCCCBAAAAAAAAAAAAAAAAABBBBBBBBBBBBBAAAAAAAAAAAAAAAAABCCCCODDEEEEEEEEF-
o    -DCCCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDDEEEEEEEE-
<    -EODDCCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCODEEEEEEEEF-
     -EEEEEDDCCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCOOEEEEEEEFFF-
     -EEEEEEEODCGCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDOEEEEEEFFFFFF-
     -FFFEEEEEEOOCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDOEEEEEEFFFFFGG-
     -FFFFEEEEEEDDCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCDDEEEEEEFFFFFGGG-
     -FFFFFEEEEEEDOCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDEEEEEEFFFFFGGGG-
     -FFFFEEEEEEDDCCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCDEEEEEEFFFFFGGG-
     -FFEEEEEEEDDCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCODEEEEEEFFFFFGG-
     -FEEEEEEOOCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCOOEEEEEEFFFFFFGG-
     -EEEEEEDDCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDDEEEEEEFFFFFFGGGH-
     -EEEEEEOOCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCDEEEEEEFFFFFGGGGHHH -
     -FFEEEEEOCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCODEEEEEFFFFFGGGHHH
     -FFFFEEEEEDCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDOEEEEEFFFFGGGHH
     -GGFFFFEEEEDDCBAAAAAAAAAAAAAAAAAAAAAAAAAAABCCDDEEEEFFFFGGGHH
     -HHGGGFFFEEEEDCCAAAAAAAAAAAAAAAAAAAAAAABCCCDDEEEEEFFFGGGHH
         HGGGFFFEEEEDCCAAAAAAABCCCCCCCCCCCCCCCDDEEEEEFFFFGGHH
           HGGFFFEEEEDDCCCCCCCCCCODODDDDDDDDDDEEEEEFFFFGGHH
            HHGGFFFEEEEDDCCCDDODOEEEEEEEEEEEEEEEEEFFFFGGH
              HGGFFFEEEEEDOOEEEEEEEEEEEEEEEEEEEEEFFFGGHH
               HHGFFFEEEEEEEEEEEEEEEEEEEEEEEEEEFFFFGGHH
                 HGGFFFEEEEEEEEEEEEEEEEEEEEEEFFFFFGGHH
                  HGGFFFEEEEEEEEFFFFFFFFFFFFFFFFFGGH
  4  —             HGGFFFFFFFFFFFFFFFFFFFFFFFFFFGGH

      0                           SPEED (MPH)                                   70

-------
        The "stars" in Figure 7 are the design points resulting from using
strategy #1, which simply divides each accel/decel mode into two subsets
based on t/2; this procedure results in using 69 design points, 64 obtained
from accel/decel modes and 5 from the steady-state modes.   Figures 10 and 11
are threshold maps of the variance surface resulting from using these design
points.  The overall level of the variance was clearly lowered as
a result.
        In order to investigate the changes in the variance as a result of
the reduction of the number of modes, a normalized variance surface was
generated after certain modes had first been excluded.  It was decided to
drop 1/4 of the modes simply by excluding points in regions where there
seemed to be (v, a) redundancy.  The modes excluded were 13, 22, 23, 25, 27,
28, 30, and 31. The variance map of the depleted design worsened as expected.
However, the t/2 expansion of the 24 modes used (53 design points including 48
accel/decel and 5 steady state) actually showed improvement over the full
modal t/2 expansion in some regions.  Figure 12 shows the resultant variance
surface for thresholds less than 0.45.  This surface is a definite
improvement over using the initial modal (v, a) design points.
        In a second strategy, one-half of the modes were dropped.  Excluded
were modes 4, 5, 6, 9, 13, 14, 15, 17, 19, 22, 23, 24, 27, 28, 30, and 31.
The choice of design points in this instance was guided by the results of
principal component analysis, to be discussed later in this report.
Figure 13 shows the resultant variance surfaces using the t/2 expansion of
the remaining 16 accel/decel modes and 5 steady-state modes for thresholds
less than 0.45.  This surface was generated using 37 points as was the
surface based on the original modal points.  Comparison of Figures 8 and 13
clearly shows that an improvement in the normalized variability can be
realized by appropriate choice of design points.

3.3     FACTOR ANALYSIS OF MODAL DATA
        The test data that comprise the input to the original modal emissions
model are measurements of individual vehicle emissions given off in time

-------
    FIGURE  10   Normalized Variance Surface Based on 69 (v,a) t/2 Design Points
     THRESHOLDS
            0.1000        0.1500        0.2000        0.2500        0.3000
            1             2             3             4             5

            0.3500        0.4000        0.4500
            6             7             8

  +4 -433333445678
     -43333334456778
     -4333333344566778
     -44333333344556677788
     -44333233334445556667777888
     -443332223333444455556666677788
     -543332222333333444444555556667788
     -54333222222233333334444444455556678
     -54333222222222233333333333344444556678
     -54433322222222222222233333333333444556778
     -5443332222222222222222222222222333334455678
     -5443332222222222222222222222222222223334455678
     -5443333222222222222222222222222222222222333445678
     -54443332222222222222222222222221111112222222334455678
     -55443333222222222222222222222211111111111112222333445678
     -55443333222222222222222222222221111111111111111222233445678
     -55444333322222222222222222222222211111111111111111222333455678
^    -554443333222222222222222222222222221111111111111111222233445678
J    -6544433333222222222222222222222222222211111111111122222334456778
%    -44333322222222222222222222222222211111111111111111122222334455678
^    -44333222222222222222222222222222222222221111111112222223334455678
|    -654433332222222222222222333333333333222222222222222233334455678
~    -8766554443333333333344444444444444444444433333333333444556778
g    -   887665555555555555666666666666666666655555555555556678
£ 0 —       88777777777788888             88888777777777788
t    -    87766555555555566666667777777776666666555555555566778
^    - 87765544444444444444444455555555555444444444444444445556788
S    -76554433333322223333333333333333333333333333333333333344556678
o    -5544333222222222222222222222222222222222222222222222233344556778
     -54433322222211111111111112222222221111111111111112222223344556778
     -5544332222221111111111111111111111111111111111111112222334455678
     -765544333222222111111111111111111111111111111111112222334456678
     -76654433322222111111111111111111111111111111111112222334455678
     -766544332222211111111111111111111111111111111112222233445678
     -7765443322221111111111111111111111111111111122222334455678
     -8765443322211111111111111111111111111222222223334456678
     -876544332221111111111111111111222222222233334455678
     -876544332221111111111111122222222223333444556778
     -87654332222111111111112222222233333444556678
     -87654332222111111122222222333334445566778
     -87654332222221222222223333344455566788
     - 7654333222222222233333444555667788
     - 765443322222223333444455666778
     - 765443332223333444555667788
     - 86544333333344455666778
     - 86544333334455567788
     - 86544444445566788
     - 86554444556678
  -4 -875544555678

                                  SPEED (MPH)


-------
            FIGURE  11   Normalized Variance Surface Based on 69  (v,a)  t/2 Design Points


       THRESHOLDS                               .                        „  M     -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDDEEEE-
:-.     -AAAAAAA A AAAA AAAAAAAAAAAAAAAAAA AAA AAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDDDEE-
&     -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDDEE-
—     -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCD-
g     -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCD-
•H     -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCDDD-
£     -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCDDDEEE-
v     -CCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCDDEEEEEEE-

u     -CCCBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDDEEEEEEE-
      -BAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCODOEEEE-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCDD-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCC •
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCO-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDDD-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDDEE-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDOEEE-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCDDDEEEEEE-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCODDEEEEEEEE-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCDDDEEEEEEEEEEF-
      -AAAAAAA AAAAA AAAAAAAAAAAA AAAAAAAAAAAAA AAAAAAA AA'AAABCCCCDOOEEEEEEEEEEFFFF-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBCCCCDDDEEEEEEEEEEFFFFFFF-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCCDDDEEEEEEEEEEFFFFFFFFFG-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCCDDDDEEEEEEEEEEFFFFFFFFFGGGG-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABCCCCCCDDOOEEEEEEEEEEEFFFFFFFFGGGGGGH-
      -AAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBCCCCCDDDDEEEEEEEEEEEEFFFFFFFFGGGGGGHHHH-
      -AAAAAAAAAAAAAAAAAAAAAAAAAABBCCCCCCDDODEEEEEEEEEEEEFFFFFFFFFGGGGGHHHH
      -AAAAAAAAAAAAAAAAAAAAAAABBCCCCCCDDDDEEEEEEEEEEEEFFFFFFFFFGGGGGHHHH
      -AAAAAAAAAAAAAAAAAAAAABCCCCCCDDDDEEEEEEEEEEEEFFFFFFFFFGGGGGGHHH
      -AAAAAAAAAAAAAAAAAABBCCCCCDDDDEEEEEEEEEEEEFFFFFFFFFFGGGGGHHHH
   -4 —AAAAAAAAAAAAAAAABCCCCCDDDDEEEEEEEEEEEEEFFFFFFFFFGGGGGHHHH
      0                                                                       70
                                  SPEED (MPH)

-------
      FIGURE 12  Normalized Variance Surface Based on 53 (v,a) t/2 Design Points
     THRESHOLDS
            0.1000        0.1500        0.2000         0.2500         0.3000
            1             2             3              4              5

            0.3500        0.4000        0.4500
            6             7             8

  +4 -2111122233455678
     -2111112223344566788
     -211111122233445566788
     -211111122223334455667788
     -211111111222233344455667788
     -211111111122222333444455666778
     -2211111111112222233334444555667788
     -2211111111111222222233333444455566778
     -2211111111111111222222223333334445556778
     -2211111111111111112222222222233333444556678
     -2211111111111111111111222222222222333344455678
     -2211111111111111111111111112222222222233334455678
     -2221111111111111111111111111111111122222223334455678
     -2221111111111111111111111111111111111111222223334456678
     -222111111111111111111111111111111111111111122222334455678
^    -22211111111111111111111111111111111111111111122222334456678
%    -222111111111111111111222222222222111111111111122222333455678
«    -2222111111111111111222222222222222222211111111222222334455678
?    -22221111111111111122222222222222222222222222222222223334456678
I,    -22211111111111111111111222222222222222211111111122222223344556678
_    -32222111111111111111222222222222222222222222222222222223334455678
o    -5443332222222222222222223333333333333322222222222222333344556778
'•£    -8765544433333333333344444444444444444444433333333333444556678
£    -   8876655555555555556666666666666666666555555555555566788
•jj 0  —       88777777777788888             88888777777777788
-------
      FIGURE 13  Normalized Variance Surface Based on 37 (v,a) t/2 Design Points
      THRESHOLDS
             0.1000        0.1500        0.2000        0.2500        0.3000
             1             2             3             4             5

             0.3500        0.4000        0.4500
             6             7             8
                   888888
                 87666667788
                766555556667788
              876554444455566677888
             87654443334444555566677788
             76544333333334444555556667778
            8765433333333333344444455555667788
            865543333222333333344444444455556678
            876543333222223333333344444444445556678
            876544333222223333333333444444444444555678
             866543333222223333333334444444444444444556778
             87654433332223333333344444444444433333444455677
T    -       8765443333333333333444444444444444333333344455678
«    -        7655443333333333444444444444444444443333334445667
^    -        87654433333333344444455555555555444444444444455678
|    -         7655443333333444445555555555555555544444445556678
£    -       8765433322222223333334444444444444444444334444455677
c    -     87654433322222222233333333334444443333333333333444556678
.2    -   8766544333332222333333333334444444443333333333333444556678
«    -   87655
-------
periods called modes during which the vehicle follows a given speed-time
profile.  In order to determine whether or not there is any degeneracy in
the information being supplied by the various modes, these test data were
examined using methods of factor analysis.
        Factor analysis is useful in analyzing the intercorrelations within
a set of variables in order to identify fundamental and meaningful dimensions
in the multivariate domain.  This "task of factor analysis is most frequently
accomplished by first conducting a principal-components analysis and by then
using the resulting principal factors as a set of reference axes for
determining the simplest structure, or most easily interpretable set of factors
for the domain in question."*

*
 Cooley, W. and Lohnes, P.,  Multivariate Data Analysis,  Wiley, New York,
 1971, p. 131.

        Principal-components analysis is generally useful in determining the
minimum number of independent dimensions needed to account for most of the
variance in the original set of variables.  In the present instance, this
statement can be interpreted to mean that the variance among the 1020 vehicles
in the data base, so far as emissions are concerned, can be explained by the
car-to-car variability observed in the values of a certain number of linear
combinations of the modal contributions.  The number of these combinations
required to account for some specified fraction of the total variance -- say,
90% -- is often referred to as the dimensionality of the space.  The
essential thrust of the analysis is to take cognizance of the fact that if two
variables, such as two modal contributions to emissions, tend to vary in
some related way as one goes from vehicle to vehicle, then there is
essentially only one variable at work rather than the apparent two.
        To achieve such insight, it is heuristically logical to examine the
correlations among all pairs of modes for the 1020 vehicles in the data base.
The result is a correlation matrix for each of the pollutants under consideration.
The correlation matrices based on these 37 modes were determined for each
of the three pollutants, HC, CO, and NOX (as well as for CO2, in connection
with fuel-use studies to be discussed later in this report).  These correlation
matrices were then subjected to a principal-components* analysis in order to
determine the eigenvalues (\lambda) and associated normalized eigenvectors (v).
The factor coefficients or loadings were then derived by:

                a_j = \sqrt{\lambda_j} v_j,    j = 1, 2, ..., 37

where a_j and v_j are of dimension 37.  The numbers of dimensions or modes
needed to account for 90% and 95% of the variance for each pollutant are
indicated below:

                              90% of Variance     95% of Variance
              HC                    7                  13
              CO                    9                  15
              CO2                   9                  18
              NOX                  14                  21

For purposes of illustration, Table 1 gives the factor loadings for the first
seven principal components derived from the correlation matrix for HC, together
with their associated eigenvalues and the percent variance accounted for by
these factors.
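
The bookkeeping behind these figures can be sketched as follows. This is a minimal illustration, not the program used in the study. It relies on the fact that the eigenvalues of a correlation matrix of p standardized variables sum to p, and on the usual convention of scaling a normalized eigenvector by the square root of its eigenvalue to obtain loadings. The function names and the short eigenvalue list are assumptions introduced only for this example; the two eigenvalues passed to percent_variance are those quoted in Table 1.

```python
import math

def percent_variance(eigenvalue, n_variables):
    """Percent of total variance for one component of a correlation matrix."""
    return 100.0 * eigenvalue / n_variables

def components_needed(eigenvalues, fraction):
    """Smallest number of leading components whose share reaches `fraction`."""
    total = sum(eigenvalues)
    running = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running / total >= fraction:
            return count
    return len(eigenvalues)

def loadings(eigenvalue, eigenvector):
    """Scale a unit-length eigenvector by sqrt(eigenvalue) to get loadings."""
    scale = math.sqrt(eigenvalue)
    return [scale * v for v in eigenvector]

# Eigenvalues quoted in Table 1 for HC, with p = 37 modes.
print(round(percent_variance(27.04, 37), 2))
print(round(percent_variance(1.438, 37), 2))

# Hypothetical eigenvalues, for illustration only.
toy = [6.0, 2.5, 1.0, 0.5]
print(components_needed(toy, 0.90))
```

With the Table 1 eigenvalues the two percentages come out at 73.08 and 3.89, in agreement with the "% of Variance" entries of that table.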
        Besides using the principal-components solution to identify the
dimensions of the domain, an attempt could be made to interpret the results.
In general, the principal-components solution produces one general factor
and p-1 bipolar factors (p is the number of common factors).  The general
factor is usually all positive (or negative) when the solution is based on
a matrix of positive correlations.  It could be argued that the first factor
in Table 1 is perhaps a "speed" factor.  The second factor is a bipolar factor
and (except for the five steady-state modes) the modes of acceleration have
negative loadings and those of deceleration have positive loadings; this
factor could be considered to be an "accel/decel" factor.
        Table 1   Principal Components (7) of the Correlation Matrix for HC for 37 Modes
        (first two components; ? marks an illegible character, -- an illegible entry)

                              Component 1       Component 2
   Eigenvalue                    27.04             1.438
   % of Variance                 73.08             3.89

   Variable (mode)
         1                   -0.8762E+00        0.2557E+00
         2                   -0.534?E+00        0.1431E+00
         3                   -0.7764E+00        0.3774E+00
         4                   -0.?313E+00        0.1968E+00
         5                   -0.9188E+00        0.2738E-01
         6                   -0.8445E+00       -0.1831E+00
         7                   -0.8?87E+00       -0.5325E-01
         8                   -0.8606E+00       -0.2348E+00
         9                   -0.?793E+00       -0.7?5?E-01
        10                   -0.6226E+00       -0.1920E+00
        11                   -0.9240E+00        0.1317E-01
        12                   -0.8980E+00       -0.2272E+00
        13                   -0.9282E+00        0.5636E-01
        14                   -0.8564E+00       -0.2268E+00
        15                   -0.8939E+00       -0.7528E-02
        16                   -0.8081E+00        0.2?00E+00
        17                   -0.9365E+00        0.13?5E+00
        18                   -0.?762E+00       -0.1435E+00
        19                   -0.9287E+00        0.9097E-01
        20                   -0.8892E+00       -0.1372E+00
        21                   -0.8965E+00        0.5654E-01
        22                   -0.8086E+00       -0.3352E+00
        23                   -0.?590E+00        0.2227E+00
        24                   -0.9214E+00        0.1551E-01
        25                   -0.7?96E+00       -0.4387E+00
        26                   -0.8185E+00        0.7664E-01
        27                   -0.9048E+00        0.5815E-01
        28                       --            -0.1197E+00
        29                       --             0.2264E+00
        30                       --            -0.2214E-01
        31                       --            -0.3183E+00
        32                       --             0.8707E-01
        33                       --             0.3972E+00
        34                       --             0.3077E+00
        35                       --             0.36??E-01
        36                       --                --
        37                       --                --

        In order to improve on the solution offered by the principal-components
technique, factors were rotated to positions in which the factor pattern comes
closer to criteria of simple structure.  The purpose of analytic rotation
schemes is to transform the principal components so as to obtain new variables
which might be more readily interpreted and named.  This rotation was
performed on the matrix consisting of the 15 principal components of 37
variables or modes for each pollutant (15 factors accounted for at least 90%
of the variance in the case of all pollutants).  The "normal" varimax
criterion was used for the orthogonal rotation of factors.
        This new set of rotated axes might be preferred for purposes of
interpreting the basic dimensions of the domain measured by the 37 modes.
This is because the new coefficients are more "simple" in the sense that a
given variable tends to have a high coefficient for only one new axis and
each factor has zero, or near zero, coefficients for at least some of the
variables.*  Table 2 gives the derived rotated factors for the first 10 (HC)
factors.  The general factor has been destroyed and group factors have been
produced.  In the first factor, high negative weights are given to variables
5, 7, 9, 11, 13, 17, 19, 21, 24, 27, and 30.  These modes, which are all
highly correlated, are characterized by accelerations between 1 mph/sec and
2.5 mph/sec and by velocities ranging from about 28 mph to 53 mph.  When
variances based on the rate of emission (grams/sec) were calculated, these
modes all showed relatively high variances.  These observations suggest that
using any one of these modes could provide as much information as using all
of them.
        Table 3 gives the factor number for any mode which is weighted
heavily in that factor.  If more than one mode has high loadings within a
factor, the factor number is listed for each mode with the mode which is
weighted most heavily being "starred."  Examination of this table reveals
that the eleven variables which had high loadings in the first factor for
HC also have high loadings for the other two pollutants and for CO2.  Again,
this fact would suggest that these modes provide redundant information.
It should also be noted that for CO, CO2, and NOX, modes 6, 8, 10, 12, 14, 22,
25, and 28 all have high factor loadings in the second factor.  These modes
are characterized by average accelerations ranging from -1 mph/sec to
-3 mph/sec and by average velocities from 24 mph to 47 mph.  They also all
have relatively low emission-rate variances.  Most factors have high
coefficients for only one mode.  Some variables such as 4, 15, and 29 have
not been weighted heavily in any factor.

 * Cooley and Lohnes, op. cit.

TABLE 2   TEN* ROTATED FACTORS OF THE CORRELATION MATRIX FOR HC FOR 37 MODES

                                     FACTOR
 MODE      1      2      3      4      5      6      7      8      9     10
   1     -.68    .37   -.24    .20    .12    .06    .18   -.39    .00   -.10
   2     -.22    .24   -.22    .10    .91    .06    .10   -.06   -.01   -.02
   3     -.43    .49   -.24    .24    .12    .14    .11   -.61   -.01   -.02
   4     -.72    .39   -.30    .19    .14    .20    .14   -.21    .02   -.10
   5     -.82    .20   -.30    .13    .12    .18    .16   -.13   -.02   -.13
   6     -.47    .24   -.60    .14    .11    .10    .18   -.07   -.02   -.49
   7     -.86    .17   -.32    .05    .09    .17    .16   -.03   -.12   -.01
   8     -.65    .12   -.49    .06    .11    .16    .21   -.10   -.06   -.12
   9     -.83    .12   -.32    .08    .09    .09    .17   -.06   -.15   -.03
  10     -.37    .30   -.67    .13    .13    .08    .25   -.10   -.01   -.01
  11     -.87    .23   -.31    .10    .10    .09    .18   -.05   -.06   -.00
  12     -.47    .31   -.69    .10    .14    .11    .33   -.08   -.02    .06
  13     -.86    .23   -.29    .13    .10    .10    .16   -.12   -.07   -.03
  14     -.45    .25   -.74    .11    .12    .13    .10    .23   -.10   -.07
  15     -.42    .39   -.58    .27    .15    .05    .18   -.14   -.15   -.15
  16     -.31    .62   -.46    .24    .13    .09    .16   -.15   -.00    .04
  17     -.83    .28   -.29    .18    .11    .11    .16   -.19    .02   -.06
  18     -.40    .34   -.73    .19    .13    .12    .20   -.07    .03   -.15
  19     -.81    .27   -.33    .16    .11    .13    .16   -.13    .03   -.04
  20     -.39    .39   -.71    .16    .15    .06    .23   -.05    .06   -.05
  21     -.85    .24   -.29    .13    .08    .08    .18   -.07    .02    .00
  22     -.36    .23   -.47    .18    .10    .08    .72   -.09    .00    .00
  23     -.63    .27   -.21    .59    .10    .11    .18   -.19    .12   -.03
  24     -.85    .18   -.27    .29    .09    .09    .20   -.01    .03   -.06
  25     -.38    .12   -.41    .14    .08    .08    .77   -.04   -.02   -.08
  26     -.35    .39   -.43    .58    .13    .07    .21   -.00   -.28   -.06
  27     -.82    .16   -.23    .40    .08    .12    .21   -.06    .04   -.03
  28     -.40    .34   -.56    .39    .12    .07    .38   -.10    .02    .05
  29     -.65    .26   -.22    .59    .11    .07    .16   -.20    .06   -.04
  30     -.83    .13   -.23    .24    .07    .14    .17    .05    .15   -.00
  31     -.48    .14   -.58    .24    .08    .07    .39   -.05    .00   -.07
  32     -.31    .47   -.52    .46    .11    .05    .19   -.12   -.04   -.05
  33     -.16    .88   -.28    .15    .14    .14    .11   -.12   -.03   -.07
  34     -.37    .80   -.30    .11    .13    .36    .14   -.04    .02   -.01
  35     -.57    .49   -.31    .09    .15    .70    .16   -.01   -.07   -.07
  36     -.58    .19   -.27    .11    .08    .41    .14   -.05    .03   -.04
  37     -.64    .23   -.26    .07    .10    .18    .36   -.07   -.23   -.10

 * Rotation done on 15 factors.

Table 3   Highly Loaded Modes by Factor Number

                          FACTOR NUMBER
 MODE        HC          CO          CO2         NOX
   1         -           12          14          15
   2         5           11          15          11
   3         8           6           10          3
   4         -           -           1           -
   5         1           12*, 1      1           -
   6         10          2           2           -
   7         1           9*, 1       1           12*, 1
   8         11          14*, 2      12*, 2      2
   9         1           1           1           1
  10         12*, 3      2*          2*          13*, 2
  11         1*          1*          1           1
  12         3           13*, 2      2           2
  13         1           1           1           1
  14         3*          2           2           2
  15         -           5           -           -
  16         13          -           -           5
  17         1           1           7*, 1       1
  18         3           10          11          -
  19         1           1           1           1
  20         3           2, 15       4           -
  21         1           1           1           8*, 1
  22         -           2           2           2
  23         4           -           -           -
  24         1           1           1           1
  25         7           15*, 2      13*, 2      2*
  26         9*, 4       7           4           -
  27         1           1           1*          1
  28         -           2           2           7, 2
  29         4           -           -           -
  30         1           1           1           1*
  31         15          2           9*, 2       2
  32         -           8           8           7*
  33         2*          3*          5           4
  34         2           3           6           9
  35         14          -           -           6
  36         6           4*          3           -
  37         9           4           3*          14

        The results of the principal-components analysis indicate that for
all pollutants 14 modes are sufficient to account for 90% of the total
variance.  These 14 modes are not the same for all of the pollutants.
However, eleven of the modes seem to provide the same information for all
pollutants and eight modes provide the same information for three of the
pollutants.
        In conclusion, it appears that test procedures could be modified so
as to avoid running a vehicle through all 37 of the defined modes and still
obtain the same amount of information about its emission response.
4.      GROUP EMISSION PREDICTIONS
        Individual vehicles represent a wide variation in model year, make,
model, engine and drive train equipment, accumulated mileage, state of
maintenance, attached pollution abatement devices, and geographic location.
Inasmuch as it is a mix of these diverse vehicles which determines the
vehicular contribution to air pollution in a given vicinity, however, it is
appropriate to aggregate vehicles into groups and to view the group as a
composite emission source for various purposes of analysis.  Accordingly,
considerable interest centers on the accuracy and precision with which the
modal analysis emission model can predict group emissions in a given driving
sequence.
        The characterization of a group of vehicles can be achieved by
defining the emission rate function for the average vehicle within the group.
Let
        b_{ijk}       =  k'th coefficient in the emission rate function
                         for the j'th vehicle within the group and i'th
                         kind of pollutant.
        N_g           =  number of vehicles in the group.
        \bar{b}_{ik}  =  k'th coefficient in the emission rate function
                         describing the average vehicle's i'th kind of
                         pollutant response.
Then,

        \bar{b}_{ik} = \frac{1}{N_g} \sum_{j=1}^{N_g} b_{ijk}

Thus, the group emission rate functions are determined by averaging the
coefficients which make up the emission rate functions of each vehicle in
the group.  In this way, the group is viewed as consisting of N_g "average"
vehicles, each having identical emission characteristics.  The emission
response of the group over any driving sequence can accordingly be determined
by multiplying the response of this average vehicle by the number of vehicles
in the group.  Note that, once the emission response of the average vehicle
has been characterized in terms of average regression coefficients, its
total emission over any specified driving sequence can be obtained by
appropriate integration of the emission rate function in exactly the same
manner as for any other vehicle.
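
The averaging step can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in the study: each vehicle is represented by a short list of coefficients, and the emission over a driving sequence is taken to be the coefficients weighted by sequence-dependent integrals of the basis functions. The function names and all numerical values are hypothetical.

```python
def average_coefficients(vehicle_coefficients):
    """Average each coefficient position over all vehicles in the group."""
    n_vehicles = len(vehicle_coefficients)
    n_coefficients = len(vehicle_coefficients[0])
    return [
        sum(vehicle[k] for vehicle in vehicle_coefficients) / n_vehicles
        for k in range(n_coefficients)
    ]

def emission(coefficients, integrals):
    """Emission over a sequence: coefficients weighted by the sequence integrals."""
    return sum(b * c for b, c in zip(coefficients, integrals))

def group_emission(vehicle_coefficients, integrals):
    """Group total = N_g times the emission of the average vehicle."""
    mean_vehicle = average_coefficients(vehicle_coefficients)
    return len(vehicle_coefficients) * emission(mean_vehicle, integrals)

# Hypothetical three-vehicle group with three coefficients each.
group = [[1.0, 0.5, 0.25], [3.0, 1.5, 0.75], [2.0, 1.0, 0.5]]
integrals = [10.0, 4.0, 2.0]   # hypothetical integrated basis functions

print(average_coefficients(group))
print(group_emission(group, integrals))
print(sum(emission(vehicle, integrals) for vehicle in group))
```

Because the emission is linear in the coefficients, the last two printed values agree: the total for N_g average vehicles equals the sum over the individual vehicles.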
        As was shown in section 3.2 of this report, error propagation in the
modal analysis emission model causes the emissions estimated for some regions
of the (v, a)-plane to have lower variance than for certain other regions
of the plane.  A consequence of this fact is that the estimation capability
of the model over an arbitrary driving sequence will depend on the relative
amounts of time which that sequence devotes to regions of high or low
variance.  This fact is true for both individual vehicles and for groups of
vehicles.
        Our approach to an evaluation of the model for group emission predic-
tion was as follows.  First, a study was made of the extent of agreement
between observed and computed emissions for the Surveillance Driving Sequence
(SDS).   Then, with this comparison in view as a "base case," a procedure was
developed for relating the base case to arbitrary driving sequences which, as
a result of differences in their distribution of velocities and acceleration,
exhibit different degrees of variance in the emissions computed by the model.
4.1     MODEL PERFORMANCE FOR THE SDS BASE CASE
        Two general questions are of interest in connection with the predic-
tion of group emissions:  accuracy and precision.  Lack of accuracy is
reflected as a bias or systematic error in the predicted results.  Lack of
precision is the consequence of random errors in the prediction and is
manifested in terms of variance in the predicted group emissions under
repeated sampling and testing of the group.  These two aspects of the group
prediction question will be addressed below in connection with the perform-
ance of the model for the Surveillance Driving Sequence.
4.1.1   Accuracy of Group Emission Prediction
        Because the "true" or population value for the mass
of a pollutant emitted during a particular driving cycle can never be known,
the question of accuracy can be resolved only in a relative sense.   One
possible approach to evaluating the accuracy of model prediction is to com-
pare, for a particular driving sequence, bag values as computed by the model
and bag values as actually observed in test.
        The approach indicated above was employed in the initial implementa-
tion of the original model by comparing computed and observed bag values for
the Surveillance Driving Sequence.  These results were originally presented
in Calspan Report No. NA-5194-D-3 and in EPA Report No. EPA-460/3-74-005.
Relevant portions of these results are repeated herein as Table 4 for purposes
of reference, because it is here proposed to view these results in a new light.

                                  Table 4
                        BAG VALUE ERROR STATISTICS
                       SURVEILLANCE DRIVING SEQUENCE
                                1020 VEHICLES

                  OBSERVED BAG VALUE     MEAN ERROR      VARIANCE
   POLLUTANT          \bar{O}             \bar{R}
                       (gms)               (gms)          (gms)^2
      HC                53.5
      CO               625.0
      NOX               48.2

experimenter can observe the difference in degree of corrosion for each pair
and, as far as overall generalization is concerned, can circumvent the
variability introduced by inhomogeneity of exposure conditions.   If he wants
to group the paired samples into classes according to soil type  -- clay, loam,
cinders -- he can restrict his inferences to these strata, again with the
advantage of balanced comparisons within the strata.   It is proposed to
examine the performance of the modal analysis emission model in  this vein.
In this analysis, individual vehicles will play the role of exposure condi-
tions, and homogeneous classes of vehicles will play the role of soil strata.
        First, let us examine the hypothesis that there is no significant
difference between the mean bag value as observed and as computed -- that is,
let us examine the hypothesis:

                                H_0 : \mu_R = 0

Because of the large sample size, we can use the u-test to test the hypothesis.
The standard error of \bar{R} is

                                \sigma_{\bar{R}} = \frac{\sigma_R}{\sqrt{N}}

and u is defined as

                                u = \frac{\bar{R}}{\sigma_{\bar{R}}}

        As shown in Table 4,  the hypothesis is rejected at the 0.01 level for
all three pollutants.  In this connection, however, a word of warning is in
order.  By pooling a sufficient quantity of data, it is possible to label as
statistically significant an effect which may be of negligible engineering
magnitude.  More germane is the consideration that if the difference between
two means is no greater than -- say, 10% --of their pooled mean, it may be
of small consequence that this difference is declared to be statistically
significant.  The importance of the effect depends on its probable magnitude,
and the mere act of declaring it to be statistically significant in no way
augments its practical magnitude.
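
The test described above can be sketched as follows. This is a minimal illustration, not the computation performed for Table 4: the per-vehicle errors are hypothetical, the sample is far smaller than the large sample the u-test presumes, and the critical value 2.576 is the two-sided 0.01 point of the standard normal distribution. The function name is an assumption made for the example.

```python
import math

def u_statistic(errors):
    """Return (mean, standard error, u) for a list of per-vehicle errors."""
    n = len(errors)
    mean = sum(errors) / n
    variance = sum((e - mean) ** 2 for e in errors) / (n - 1)
    std_error = math.sqrt(variance / n)
    return mean, std_error, mean / std_error

# Hypothetical per-vehicle errors in grams; not data from the report.
errors = [8.0, 5.0, 9.0, 6.0, 7.0, 10.0, 4.0, 7.0]
mean, std_error, u = u_statistic(errors)

CRITICAL_U_01 = 2.576   # two-sided 0.01 level of the standard normal
print(mean, round(std_error, 3), round(u, 2), abs(u) > CRITICAL_U_01)
```

A large value of u establishes only that the mean error differs from zero; whether the difference matters in engineering terms is judged from its size relative to the observed mean.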

4.1.2   Precision of Group Emission Prediction
        For each of the pollutants HC, CO and NOX, the relative importance
of statistical and practical views of model performance can be considered
in terms of confidence intervals.  Let \mu_R denote the expected or population
mean value of the difference between calculated and observed bag values for
a pollutant.  The width of a confidence interval for \mu_R depends on the
dispersion of estimates for individual vehicles comprising the group and on
the "size" of the confidence interval.  In statistical terminology, the term
"size" denotes the probability with which it can be asserted that the
population mean falls between two prescribed values.  In the following
discussion, we shall assume a confidence interval of size 0.95 (95% confidence).
        For 95% confidence, the half-width of the confidence interval is
approximately 2\sigma_{\bar{R}} (more exactly 1.96\sigma_{\bar{R}}) and the
confidence intervals for the three pollutants are approximately as follows:

                   HC        6.4 \le \mu_R \le  8.0
                   CO         34 \le \mu_R \le   52
                   NOX      -3.5 \le \mu_R \le -1.9

As a percent of \bar{O}, the mean observed value for the pollutant in question,
one obtains as extremes:

                   \frac{8.0}{53.5} \times 100%  =  15% for HC

                   [...] \times 100%  =  10% for CO

and                \frac{3.5}{48.2} \times 100%  =  7.3% for NOX

In short, for a group of 1020 highly heterogeneous vehicles, the bias for
HC would not be expected to be greater than 15% of the mean values as
actually observed by direct measurement of these 1020 vehicles.  Similar
figures of 10% and 7.3% apply for CO and NOX.
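
The interval arithmetic can be sketched as follows. This is a minimal illustration only. The interval bounds for HC and NOX are those quoted in the text and the observed means are the SDS values of Table 6; the mean error of 7.2 and standard error of 0.4 passed to confidence_interval are hypothetical values chosen for the example, and the function names are assumptions.

```python
def confidence_interval(mean_error, std_error, z=1.96):
    """Two-sided interval mean_error +/- z * std_error (95% when z = 1.96)."""
    half_width = z * std_error
    return mean_error - half_width, mean_error + half_width

def extreme_percent(interval, observed_mean):
    """Largest absolute interval bound as a percent of the observed mean."""
    return 100.0 * max(abs(interval[0]), abs(interval[1])) / observed_mean

# Interval bounds quoted in the text; observed SDS means from Table 6.
print(round(extreme_percent((6.4, 8.0), 53.55)))        # HC
print(round(extreme_percent((-3.5, -1.9), 48.17), 1))   # NOX

# Hypothetical mean error and standard error, for illustration only.
low, high = confidence_interval(7.2, 0.4)
print(round(low, 2), round(high, 2))
```

For HC and NOX the sketch returns 15 and 7.3 percent, the figures stated in the text.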
4.1.3   Sampling Considerations
        It is evident that the dispersion of emissions for individual vehicles
within a group depends on the degree of homogeneity of the group as far as
such determinants as make, model, mileage, state of maintenance, and other
factors are concerned.  Because of this fact, the standard errors applicable
to the mean emissions computed for the three pollutants for the group also
depend on the homogeneity of the group.  Consequently, the performance of
the model in the estimation of group emissions depends strongly on sampling
protocol.
        Consider, for example, a population of vehicles having a certain mix
of vehicle "types," as specified by make, model, mileage, and other factors
which can be rationally employed to differentiate one vehicle from another.
A random sample of N vehicles would produce a certain mix of vehicle types
within the sample, not necessarily the mix existing in the population.  A
second sample would most likely produce a different mix of vehicle types
and certainly a set of different vehicles than the first sample.  One sees,
therefore, the influence of two sources of variability as far as the predic-
tions of the model are concerned:  vehicle-to-vehicle variability within types
and variability in proportionate weighting of types.  The result is that a
confidence interval based on a random sample from a nonhomogeneous population
of vehicles can be expected to be considerably wider than for a case in
which some of the sources of variability are controlled.
        In this connection, consider the case in which stratified sampling is
used to select N vehicles from the population.  This procedure is a quite
logical one in emission assessment, because it assures that the sample will
contain the same relative proportions of different types of vehicles as does
the population.  Random sampling is then performed within each stratum to
obtain the desired number of vehicles.  In this type of sampling, vehicle-to-
vehicle variation will be present but the variation in proportions of the
various strata will have been eliminated.
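
Proportional stratified sampling of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the population, the vehicle "types," the sample size and the function name are all hypothetical, and the simple rounding of each stratum's quota may leave the total slightly off the requested size when the proportions do not divide evenly.

```python
import random

def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Proportional stratified sample: random draws within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for vehicle in population:
        strata.setdefault(stratum_of(vehicle), []).append(vehicle)
    sample = []
    for members in strata.values():
        # Quota proportional to the stratum's share of the population.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: (vehicle id, type) pairs with a 60/30/10 mix.
population = (
    [(i, "type A") for i in range(60)]
    + [(i, "type B") for i in range(60, 90)]
    + [(i, "type C") for i in range(90, 100)]
)
sample = stratified_sample(population, lambda v: v[1], sample_size=20)

counts = {}
for _, kind in sample:
    counts[kind] = counts.get(kind, 0) + 1
print(counts)
```

Whatever the random seed, the sample reproduces the 60/30/10 mix of the population; only the particular vehicles drawn within each type change from sample to sample.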
        In conclusion, it is not possible to make overall generalizations
about the ability of the model to estimate group emissions unless the nature
of the group and the method by which it is sampled are taken into account.

4.2     MODEL PERFORMANCE FOR ARBITRARY DRIVING SEQUENCES
        As noted in Section 4.1 of this report, the performance of the modal
analysis emission model can be evaluated for the Surveillance Driving Sequence
by direct comparison with observed results.  No driving sequence other than
the SDS and the FTP permits such a comparison, because bag values are not
available for other sequences.  To obtain such a comparison for an arbitrary driving
sequence, it would be necessary to perform emission tests over that driving
sequence as a "validation" of the model.  It is possible, however, to compute,
for an arbitrary driving sequence, the mean emissions for a group of vehicles
and the variance of the emissions exhibited by individual vehicles comprising
the group.  Thus it is possible to evaluate the precision of model perform-
ance for the group for an arbitrary driving sequence, but its accuracy must
be judged according to results of the SDS base case.

4.2.1   Theoretical Background
        The essence of the approach to precision analysis for the performance
of the model in an arbitrary driving sequence resides in the simplification
of the emission integration as detailed in Section 2.2 of this report.  As
background for this approach, however, it will be informative to review the
underlying statistical theory.
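        Consider a set of random variables X_1, X_2, ..., X_p and a linear
combination of these variables

        Y = c_1 X_1 + c_2 X_2 + \cdots + c_p X_p

where c_1, c_2, ..., c_p are constants.  The variance of the random
variable Y can be computed as

        \mathrm{Var}\,Y = \sum_{i=1}^{p} c_i^2 \,\mathrm{Var}\,X_i
                  + 2 \sum_{i=1}^{p} \sum_{j=i+1}^{p} c_i c_j \,\mathrm{Cov}(X_i, X_j)

For example, for three variables,

        \mathrm{Var}\,Y = c_1^2 \,\mathrm{Var}\,X_1 + c_2^2 \,\mathrm{Var}\,X_2 + c_3^2 \,\mathrm{Var}\,X_3
                  + 2 c_1 c_2 \,\mathrm{Cov}(X_1, X_2) + 2 c_1 c_3 \,\mathrm{Cov}(X_1, X_3)
                  + 2 c_2 c_3 \,\mathrm{Cov}(X_2, X_3)

In matrix notation, this result can be written

        \mathrm{Var}\,Y =
          \begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix}
          \begin{bmatrix}
            \mathrm{Var}\,X_1      & \mathrm{Cov}(X_1, X_2) & \mathrm{Cov}(X_1, X_3) \\
            \mathrm{Cov}(X_1, X_2) & \mathrm{Var}\,X_2      & \mathrm{Cov}(X_2, X_3) \\
            \mathrm{Cov}(X_1, X_3) & \mathrm{Cov}(X_2, X_3) & \mathrm{Var}\,X_3
          \end{bmatrix}
          \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}

or, in general,

        \mathrm{Var}\,Y = c' S c

where S is the variance-covariance matrix of the random variables
X_1, X_2, ..., X_p, c is a column vector of the weighting coefficients
and c' is a row vector, the transpose of c.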
        It will be seen that equation (22), pertaining to the integrated
basis functions, fits the definition above.
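
The equivalence of the expanded sum and the matrix form can be checked numerically. The sketch below is a minimal illustration; the weights, the variance-covariance matrix and the function names are hypothetical and serve only to show that the two expressions give the same number.

```python
def variance_expanded(c, S):
    """Sum of c_i^2 Var X_i plus twice the c_i c_j Cov(X_i, X_j) terms, i < j."""
    p = len(c)
    total = sum(c[i] ** 2 * S[i][i] for i in range(p))
    for i in range(p):
        for j in range(i + 1, p):
            total += 2.0 * c[i] * c[j] * S[i][j]
    return total

def variance_matrix_form(c, S):
    """The same quantity written as the quadratic form c' S c."""
    p = len(c)
    return sum(c[i] * S[i][j] * c[j] for i in range(p) for j in range(p))

# Hypothetical weights and symmetric variance-covariance matrix.
c = [1.0, 2.0, 3.0]
S = [[4.0, 1.0, 0.5],
     [1.0, 9.0, 2.0],
     [0.5, 2.0, 16.0]]

print(variance_expanded(c, S), variance_matrix_form(c, S))
```

Both functions return 215.0 for these inputs: 184 from the three variance terms and 31 from the three covariance terms.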

4.2.2   Variance Computations for Arbitrary Driving Sequences
        It has been shown that, for an arbitrary driving sequence, the total
emission of a pollutant can be written as

        e(T) = \sum_{i=1}^{3} a_i \int_0^T h[a(t)] \, f_i[v(t)] \, dt
               + \sum_{j=1}^{9} b_j \int_0^T g_j[v(t), a(t)] \, dt              (36)

for the modal analysis emission model as originally developed with 12 basis
functions.  Moreover, it was shown that, for a given driving sequence, the
12 integrals need be computed only once for any group of vehicles because
the values of these integrals are constant for all vehicles in the group and
depend only on the nature of the driving sequence.  On the other hand, each
vehicle in the group gives rise to a different set of a_i and b_j;
consequently, these values can be considered as outcomes of random variables
A_i, i = 1, 2, 3 and B_j, j = 1, 2, ..., 9.  Thus the A_i and B_j play the
role of the X_i in Section 4.2.1 of this report.  Similarly, the values of
the 12 integrals in equation (36) play the role of the constants c_i in
Section 4.2.1.  Denoting these integrals c_1, c_2, c_3, d_1, d_2, ..., d_9,
one can then write

        \mathrm{Var}\,e(T) =
          \begin{bmatrix} c_1 & c_2 & c_3 & d_1 & \cdots & d_9 \end{bmatrix}
          \begin{bmatrix}
            \mathrm{Var}\,A_1      & \mathrm{Cov}(A_1, A_2) & \mathrm{Cov}(A_1, A_3) & \mathrm{Cov}(A_1, B_1) & \cdots & \mathrm{Cov}(A_1, B_9) \\
            \mathrm{Cov}(A_1, A_2) & \mathrm{Var}\,A_2      & \mathrm{Cov}(A_2, A_3) & \mathrm{Cov}(A_2, B_1) & \cdots & \mathrm{Cov}(A_2, B_9) \\
            \mathrm{Cov}(A_1, A_3) & \mathrm{Cov}(A_2, A_3) & \mathrm{Var}\,A_3      & \mathrm{Cov}(A_3, B_1) & \cdots & \mathrm{Cov}(A_3, B_9) \\
            \mathrm{Cov}(A_1, B_1) & \mathrm{Cov}(A_2, B_1) & \mathrm{Cov}(A_3, B_1) & \mathrm{Var}\,B_1      & \cdots & \mathrm{Cov}(B_1, B_9) \\
            \vdots                 & \vdots                 & \vdots                 & \vdots                 &        & \vdots \\
            \mathrm{Cov}(A_1, B_9) & \mathrm{Cov}(A_2, B_9) & \mathrm{Cov}(A_3, B_9) & \mathrm{Cov}(B_1, B_9) & \cdots & \mathrm{Var}\,B_9
          \end{bmatrix}
          \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ d_1 \\ \vdots \\ d_9 \end{bmatrix}

or, more succinctly, as

        \mathrm{Var}\,e(T) = c' S c

where S is the variance-covariance matrix of the coefficients of the model
for the group of vehicles under consideration and c is a column vector of
the integrated basis functions as integrated over the driving sequence under
consideration.  In application, the variance-covariance matrix would be
estimated from the N vehicles comprising the group of vehicles under
consideration.
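
Estimating the variance-covariance matrix from the vehicles in a group, and then applying the quadratic form, can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in the study: each hypothetical vehicle carries three coefficients rather than the 12 of the model, the sequence integrals are invented, and the function names are assumptions. The direct variance of the per-vehicle emissions is computed as a cross-check.

```python
def covariance_matrix(rows):
    """Sample variance-covariance matrix (divisor N - 1) of the columns of rows."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(row[k] for row in rows) / n for k in range(p)]
    return [
        [sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

def quadratic_form(c, S):
    """c' S c for a weight vector c and square matrix S."""
    return sum(c[i] * S[i][j] * c[j] for i in range(len(c)) for j in range(len(c)))

def sample_variance(values):
    """Usual sample variance with divisor N - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

# Hypothetical coefficients for four vehicles (three coefficients each,
# rather than the 12 used by the model) and hypothetical sequence integrals.
coefficients = [[1.0, 0.2, 3.0], [2.0, 0.1, 2.5], [1.5, 0.4, 3.5], [2.5, 0.3, 2.0]]
integrals = [10.0, 50.0, 4.0]

S = covariance_matrix(coefficients)
via_matrix = quadratic_form(integrals, S)
per_vehicle = [sum(c * x for c, x in zip(integrals, row)) for row in coefficients]
print(round(via_matrix, 6), round(sample_variance(per_vehicle), 6))
```

The two printed values agree, which is the point of the matrix form: once S has been estimated for a group, the variance for any new driving sequence requires only the integrals for that sequence, not a recomputation of every vehicle's emissions.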
        To illustrate this principle, four driving sequences were constructed
with the intention of accounting for highway and city driving.

             Driving Sequence ID                  Description
                   DS1                  Highway driving with frequent
                                        changes in speed.
                   DS2                  City driving with frequent
                                        changes in speed.
                   DS3                  City driving with long periods
                                        of constant speed.
                   DS4                  Constant-speed highway driving.


These driving sequences are depicted in Figure 14.  Calculations of the total
variance over a driving sequence were based on 1050 seconds.  Therefore, the
sequence shown for DS1 was repeated once and that shown for DS2 was repeated
three times.  A fifth driving sequence was taken as the first 505 seconds of
the Federal Test Procedure.  Results for these driving sequences are
presented in Table 5 for HC only.  The results are based on all 1020 vehicles
considered as a group.
                 FIGURE 14    ARBITRARY DRIVING SEQUENCES
         [Four panels plotted against Time (sec):  DS1 - Highway Driving;
          DS2 - City Driving;  DS3 - Constant Speed City Driving;
          DS4 - Constant Speed Highway Driving]

                                   Table 5
                        TOTAL COMPUTED VARIANCE OF HC
                        OVER VARIOUS DRIVING SEQUENCES
             Driving            Time Duration         Variance
             Sequence              (sec)               (gm^2)
              SDS                  1054                2172.6
              FTP                   505                 336.9
              DS1                  1050                3992.7
              DS2                  1050                 906.4
              DS3                  1050                 817.5
              DS4                  1050                5241.9

The table illustrates the fact that the variance of individual vehicle
emissions, as computed by the model, depends on the nature of the driving
sequence.  In the case of the FTP, the low variance reflects, at least in
part, the fact that the time duration of the sequence, 505 seconds, is
considerably less than the time duration of the other sequences.
        For a check on the validity of the variances as estimated from the
variance-covariance matrix of the model coefficients,  compare Table 5  with
Table 6 below.

                                   Table 6
                            MEANS AND VARIANCES OF
                    CALCULATED AND OBSERVED BAG VALUES (GMS)

                 FEDERAL TEST PROCEDURE               SURVEILLANCE DRIVING SEQUENCE
               MODEL            OBSERVED              MODEL              OBSERVED
           MEAN      VAR      MEAN      VAR       MEAN      VAR       MEAN       VAR
   HC      18.23    336.81    21.05    380.69     46.34    2173.4     53.55     2680.4
   CO     214.51  23010.1    223.69  22760.7     582.0   180900.0    625.10   210610.2
   NOX     16.72     77.13    17.22     81.62     50.9      699.1     48.17      647.3
In Table 6, the columns labeled "model" were obtained by computing, for each
of the 1020 vehicles, the bag value as determined by application of the modal
emission model.  The quantities were then averaged to obtain the mean bag
value for the model and their variance was computed by the usual formula

        s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}

where x_i denotes the model-computed bag value for the i'th vehicle.
        In conclusion, it is noted that the variances of the bag values, as
computed by the model, are comparable with the variances of the bag values
as actually determined by test.  Also, in view of the agreement between
Table 5 and Table 6, a method is at hand for estimating the vehicle-to-
vehicle variance within a group for any driving sequence.  This capability,
in turn, makes it practical to estimate the standard error of the group mean.
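
The last step can be sketched as follows. This is a minimal illustration that assumes the vehicles form a simple random sample, so that the standard error of the group mean is the square root of the vehicle-to-vehicle variance divided by the number of vehicles. The variance is the SDS value for HC from Table 5; the function name is an assumption made for the example.

```python
import math

def standard_error_of_mean(vehicle_variance, n_vehicles):
    """Standard error of the group mean from the vehicle-to-vehicle variance."""
    return math.sqrt(vehicle_variance / n_vehicles)

# HC variance for the SDS from Table 5 (2172.6 gm^2) and the 1020-vehicle group.
se_hc_sds = standard_error_of_mean(2172.6, 1020)
print(round(se_hc_sds, 2))
```

For the full 1020-vehicle group this gives a standard error of about 1.46 gm for the mean HC emission over the SDS; a smaller or more homogeneous group would call for its own variance estimate.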
5.      PREDICTION OF FUEL ECONOMY
        In view of the fact that the modal analysis emission model provides
a means to estimate pollutant emission over any arbitrary driving sequence,
it appeared feasible to employ the model to estimate fuel consumption by
means of the carbon balance equation.  In this connection, reference is made
to work by M.E. Williams et al. with regard to the FY 72 exhaust emission
surveillance program.*
        The carbon balance equation relates the amount of fuel consumed per
mile to the amount of carbon-containing emissions produced per mile.  Using
the output of the modal emissions model as input into this equation allows
one to estimate the fuel consumption over any driving sequence.  The carbon-
containing emissions that must be supplied are carbon monoxide (CO), carbon
dioxide (CO2), and hydrocarbons (HC).
        The carbon balance method of calculating fuel economy in miles per
gallon (mpg) is given as:

    $$ \text{mpg} = \frac{\text{grams of carbon/gallon of fuel}}{\text{grams of carbon in exhaust/mile}} $$

The actual equation incorporated into the model to estimate miles per gallon
is

    $$ \text{mpg} = \frac{2423.0}{0.866\,(\mathrm{HC}) + 0.429\,(\mathrm{CO}) + 0.273\,(\mathrm{CO_2})} $$

where HC, CO, and CO2 emissions are estimated in terms of grams/mile.*
Implementation of the formula required, first of all, appropriate
formulation of the modal analysis emission model to predict CO2 emissions in
addition to CO and HC.  Then it was a straightforward matter to substitute
these predicted quantities into the carbon-balance equation to obtain
predictions of miles per gallon.
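The carbon-balance equation above can be written as a short function.  The sketch below uses only the constants given in the equation; the function name and the sample emission values are hypothetical and serve only to illustrate the arithmetic.

```python
def carbon_balance_mpg(hc, co, co2):
    """Miles per gallon from HC, CO and CO2 emissions, all in grams/mile."""
    # Grams of carbon in the exhaust per mile, using the carbon weight
    # fractions that appear in the report's equation.
    carbon_per_mile = 0.866 * hc + 0.429 * co + 0.273 * co2
    if carbon_per_mile <= 0:
        raise ValueError("carbon emissions must be positive")
    # 2423.0 is the grams of carbon per gallon of fuel used in the equation.
    return 2423.0 / carbon_per_mile

# Hypothetical emissions for one vehicle (grams/mile), not measured data.
mpg = carbon_balance_mpg(hc=4.0, co=50.0, co2=450.0)
print(round(mpg, 2))
```

Because CO2 carries most of the exhaust carbon, the CO2 term dominates the denominator for typical inputs, which is why the accuracy of the CO2 prediction is examined first.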
 * M.E. Williams, J.T. White, L.A. Platte, and C.J. Domke, Automobile Exhaust
   Emission Surveillance - Analysis of the FY 72 Program, Report No.
   EPA-460/2-74-001, U.S. Environmental Protection Agency, Ann Arbor, Michigan
   (February 1974)
-------
5.1     PREDICTION OF CO2
        Since the ability of the modal analysis emission model to predict CO2
was not investigated under Contract Number 68-01-0435, it was necessary to
examine the model's effectiveness in predicting CO2 emissions prior to using
these estimates in the carbon balance equation.
        In order to determine the form of the emission rate function that
should be used to represent CO2 emissions, the average emission rate of the
1020 vehicles in the data base for each of the steady state modes was plotted
versus speed.  This curve is shown in Figure 15.  On the basis of this figure,
the assumption was made that the steady state and accel/decel emission rate
functions for CO2 could be represented by the same weighted quadratic
functions of speed and acceleration as those used for HC, CO, and NOx in the
original formulation of the model.
        By means of the composite emission rate function, the amount of CO2
emitted by each of the 1020 vehicles was estimated for the Surveillance
Driving Sequence (SDS) and for the first 505 seconds of the Federal Test
Procedure (FTP) driving sequence.  These estimates are reported in Table 7,
where they are compared with results as observed in the actual emissions
tests.  The notation in the table is as follows:

               $\bar{O}$   =  observed mean bag value for CO2, in gms/mile,
                              for 1020 vehicles
               $\bar{R}$   =  difference between mean emissions predicted
                              by the model and the observed mean bag value
                              for CO2 (gms/mile)
               $\sigma_R$  =  standard deviation of errors for individual
                              vehicles

A visual appreciation of the distribution of the errors for individual
vehicles is afforded by Figure 16 for the SDS and Figure 17 for the FTP.
The occurrence frequencies on which these histograms are based are tabulated
in Table 8.
-------
           Figure 15 MEAN STEADY STATE CO2 EMISSION RATES VS. SPEED
           (Plot.  Horizontal axis: Speed (mph), ticks at 15, 30, 45, 60.
           Vertical axis: mean CO2 emission rate, approximately 60 to 400.)
-------
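                                    Table 7

                        COMPARATIVE STATISTICS FOR CO2

                                               Surveillance     FTP (First 505 Sec)
  Statistic                                  Driving Sequence    Driving Sequence

  $\bar{O}$                                       4347.4             1662.7
  $\bar{R}$                                        270.6              141.5
  $\sigma_R^2$                                   356373             108756
  $\sigma_R$                                       597.0              329.8
  $\sqrt{\sigma_R^2 + \bar{R}^2}$                  655.4              358.9
  $(\bar{R}/\bar{O}) \times 100\%$                   6.22               8.51
  $(\sigma_R/\bar{O}) \times 100\%$                 13.73              19.8
  $(\sqrt{\sigma_R^2 + \bar{R}^2}/\bar{O}) \times 100\%$   15.08       21.6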
-------
                Figure 16   DISTRIBUTION OF CO2 BAG ERROR FROM THE SURVEILLANCE DRIVING SEQUENCE
                (Histogram.  Horizontal axis: Bag Error (grams), -1200 to 2400.
                Vertical axis: number of vehicles, 0 to 200.
                Annotations: MEAN = 270.6 GMS, STD. DEV. = 597.0 GMS.)
-------
                Figure 17   DISTRIBUTION OF CO2 BAG ERROR (FIRST 505 SECONDS FTP)
                (Histogram.  Horizontal axis: Bag Error (grams), -800 to 1000.
                Vertical axis: number of vehicles, 0 to 400.)
-------

                        Table 8
 DISTRIBUTION OF CO2 BAG VALUE ERROR (OBSERVED - CALCULATED)
FROM THE FIRST 505 SEC OF THE FEDERAL TEST PROCEDURE (FTP)
       AND THE SURVEILLANCE DRIVING SEQUENCES (SDS)

                                    NUMBER OF VEHICLES
  ERROR (GMS)                        FTP          SDS

 -2400 to -2300                        0            1
 -1800 to -1700                        0
 -1700 to -1600                        0
 -1600 to -1500                        0
 -1300 to -1200                        1       [illegible]
 -1200 to -1100                        0            4
 -1100 to -1000                        0            2
 -1000 to  -900                        1       [illegible]
  -900 to  -800                        0            7
  -800 to  -700                        3            8
  -700 to  -600                        2           11
  -600 to  -500                        2           11
  -500 to  -400                        3       [illegible]
  -400 to  -300                        7           16
  -300 to  -200                       20           32
  -200 to  -100                       38           61
  -100 to     0                      143           98
     0 to   100                      328          169
   100 to   200                      226          153
   200 to   300                       86          102
   300 to   400                       40           68
   400 to   500                       33           36
   500 to   600                       29           28
   600 to   700                       26           21
   700 to   800                       13           20
   800 to   900                        2           17
   900 to  1000                        8           17
  1000 to  1100                        2           14
  1100 to  1200                        1           19
  1200 to  1300                        1       [illegible]
  1300 to  1400                        0           12
  1400 to  1500                        0            8
  1500 to  1600                        1       [illegible]
  1600 to  1700                        0           16
  1700 to  1800                        0       [illegible]
  1800 to  1900                        0            3
  1900 to  2000                        0            5
  2000 to  2100                        0            3
  2100 to  2200                        1            3
  2300 to  2400                        0            4
  2700 to  2800                        0            2
  2900 to  3000                        0            1
  3400 to  3500                        1            1
  4200 to  4300                        1       [illegible]
  4700 to  4800                        1       [illegible]
-------
        Several observations can be made from Table 7 and from Figures 16
and 17 which are useful in judging the adequacy of the model.  First, the
mean or expected difference, $\bar{R}$, between the calculated and the
observed values should be zero if the model is to be considered unbiased.
The error distributions show that the bag error clusters around the average
error $\bar{R}$, and this average deviates from zero by only a few percentage
points of the average measured bag values (6.22% for the SDS and 8.51% for
the FTP, from Table 7).  Also, the root mean square error, which represents
the combined systematic and random errors as represented respectively by
$\bar{R}$ and $\sigma_R$, is largely dominated by the random error component.
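The relation between the systematic error, the random error and the root mean square error can be checked numerically.  The sketch below assumes the combination shown in Table 7, namely the square root of the sum of the squared standard deviation and the squared mean error; the function names are hypothetical, and whether the report's standard deviation uses N or N - 1 in the denominator is not stated, so N - 1 is assumed here.

```python
import math

def error_summary(errors):
    """Return (mean error, standard deviation) for per-vehicle bag errors."""
    n = len(errors)
    bias = sum(errors) / n
    # Assumption: N - 1 denominator, as in the variance formula of Section 4.
    variance = sum((e - bias) ** 2 for e in errors) / (n - 1)
    return bias, math.sqrt(variance)

def rms_error(bias, sigma):
    """Combined systematic and random error: sqrt(sigma^2 + bias^2)."""
    return math.sqrt(sigma ** 2 + bias ** 2)

# Values reported in Table 7 for CO2.
rms_sds = rms_error(bias=270.6, sigma=597.0)   # Table 7 lists 655.4
rms_ftp = rms_error(bias=141.5, sigma=329.8)   # Table 7 lists 358.9
# Share of the squared RMS error contributed by the random component (SDS).
random_share_sds = 597.0 ** 2 / rms_sds ** 2
print(round(rms_sds, 1), round(rms_ftp, 1), round(random_share_sds, 2))
```

For the SDS the random component accounts for roughly five sixths of the squared RMS error, which is consistent with the statement that the random error dominates.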
        In order to judge further what these measures indicate as to the
predictive performance of the model, the results for CO2 were compared with
the results obtained by using replicate data.  Of the 1020 vehicles in the
Surveillance Driving Sequence, 61 had been tested twice each.  Thus there
were available 61 replicate measurements from which could be obtained a
measure of the repeatability of the test measurements themselves.  Estimates
of the mean, $\bar{X}$, standard deviation, $\sigma$, and relative or percent
standard deviation, $\sigma/\bar{X}$, are given in Table 9 for the SDS, for
the FTP, and for individual modes.  The percent standard deviation
characterizes the repeatability of measurements.  These values are 8.34% for
the SDS and 9.65% for the FTP driving sequence.  For the individual modes,
the percent standard deviation ranged from 6% to about 40%.  This large
variability in the test measurements is reflected as errors in the
determination of the regression coefficients, which in turn determine the
error in estimating the instantaneous emission rate at any point in
(v, a)-space.  In view of the relatively large errors in the modal input
data, the errors obtained for model performance do not appear unreasonable.
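The percent standard deviation quoted above is simply the standard deviation divided by the mean and expressed in percent.  The short sketch below reproduces the SDS and FTP figures from the means and standard deviations listed in Table 9; the function name is hypothetical.

```python
def percent_standard_deviation(mean, std_dev):
    """Relative standard deviation, sigma / mean, expressed in percent."""
    if mean == 0:
        raise ValueError("mean must be nonzero")
    return 100.0 * std_dev / mean

# Means and standard deviations as listed in Table 9.
sds_percent = percent_standard_deviation(mean=457.40, std_dev=38.13)
ftp_percent = percent_standard_deviation(mean=483.86, std_dev=46.67)
print(round(sds_percent, 2), round(ftp_percent, 2))
```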

5.2     PREDICTION OF MILES PER GALLON
        Prediction of fuel consumption in terms of miles per gallon is
achieved by substituting the computed emissions of CO2, CO and HC into the
carbon balance equation.  Though direct measurements of miles per gallon
were not available, it was possible to compute "observed" values by
-------
                     Table 9   REPLICATE MODAL ANALYSIS OF CO2 FOR 61 VEHICLES


MODE      TIME (sec)   $\bar{X}$ (gm/min)   $\sigma^2$ (gm/min)^2   $\sigma$ (gm/min)   $(\sigma/\bar{X}) \times 100\%$

  1          12              1004.87           39566.29          198.91              19.79
  2          16               329.28             7501.51           86.61              26.30
  3           8              1129.19          192329.25          438.55              38.84
  4          11               693.08           14943.85          122.24              17.64
  5          13               580.97             8654.38           93.03              16.01
  6          12               194.23             1909.12           43.69              22.50
  7          17               673.03           15976.41          126.40              18.78
  8          12               217.70             2632.25           51.31              23.57
  9          14               569.71             9127.54           95.54              16.77
 10          30               202.64             2134.50           46.20              22.80
 11          26               671.98           12228.70          110.58              16.46
 12          21               229.06             4525.77           67.27              29.37
 13          32               726.05           11218.22          105.92              14.59
 14          23               190.92             1952.54           44.19              23.14
 15           9               236.69             2042.98           45.20              19.10
 16           8               569.40           18156.47          134.75              23.66
 17          22               731.97           12361.45          111.18              15.19
 18          16               201.06             3033.29           55.08              27.39
 19          18               671.54           17670.24          132.93              19.79
 20          19               245.55             6892.80           83.03              33.81
 21          25               748.31           11888.24          109.03              14.57
 22          28               223.31             2553.20           50.53              22.63
 23          15               882.50           19518.84          139.71              15.83
 24          25               582.99             7184.77           84.76              14.54
 25          18               191.30             1578.71           39.73              20.77
 26          10               333.03             7648.20           87.45              26.26
 27          38               644.71           12949.74          113.80              17.65
 28          35               216.89             2629.09           51.27              23.64
 29          18               777.47           19903.79          141.09              18.15
 30          21               601.46           16428.74          128.17              21.31
 31          14               191.20             2318.55           48.15              25.18
 32          13               316.58             4451.17           66.72              21.07
 33          60                73.43              170.64           13.06              17.79
 34          60               387.26             3180.27           56.39              14.56
 35          60               330.37              560.51           23.68               7.17
 36          60               357.10              439.12           20.96               5.87
 37          60               402.84             1397.43           37.38               9.28
FTP (gm)                      483.86             2178.55           46.67               9.65
SDS (gm)                      457.40             1453.77           38.13               8.34
-------
incorporating the observed values of CO2, CO and HC into the carbon balance
equation.  Miles per gallon figures, as based on the model outputs and as
based on the observed bag values for the three emission products, could then
be compared.
        The results of this comparison are given in Table 10.  The applicable
notation is as follows:

          $\bar{O}$   =   mean value of miles per gallon for 1020 vehicles,
                          as computed from bag values for CO2, CO and HC
                          emissions
          $\bar{R}$   =   mean difference between "observed" miles per
                          gallon and miles per gallon as determined from
                          model outputs for CO2, CO and HC emissions
          $\sigma_R$  =   standard deviation of errors for individual
                          vehicles

As was the case for CO2, the quantity $\bar{R}$ denotes a systematic error,
whereas $\sigma_R$ denotes a random error.

-------
                                  Table 10
              Statistics for Miles/Gallon Error Based on Bag Values

                                               Surveillance     FTP (First 505 Sec)
  Statistic                                  Driving Sequence    Driving Sequence

  $\bar{O}$                                        17.07              16.44
  $\bar{R}$                                        -1.00              -1.30
  $\sigma_R^2$                                     12.49               6.63
  $\sigma_R$                                        3.53               2.58
  $\sqrt{\sigma_R^2 + \bar{R}^2}$                   3.67               2.88
  $(\bar{R}/\bar{O}) \times 100\%$                 -5.88              -7.89
  $(\sigma_R/\bar{O}) \times 100\%$                20.70              15.67
  $(\sqrt{\sigma_R^2 + \bar{R}^2}/\bar{O}) \times 100\%$   21.52      17.54

-------
                Figure 18  DISTRIBUTION OF ERROR FOR MILES PER GALLON BASED ON
                           BAG VALUES FROM THE SURVEILLANCE DRIVING SEQUENCE
                (Histogram.  Horizontal axis: Error (Miles per Gallon), -12 to 12.
                Vertical axis: number of vehicles, 0 to 400.
                Annotations: MEAN = -1.0 (MPG), STD. DEV. = 3.53 (MPG).)

-------
             Figure 19 DISTRIBUTION OF ERROR FOR MILES PER GALLON BASED ON OBSERVED
                       BAG VALUE CALCULATIONS (FIRST 505 SECONDS FTP)
             (Histogram.  Horizontal axis: Error (Miles per Gallon), -14 to 18.
             Vertical axis: number of vehicles, 0 to 400.
             Annotations: MEAN = -1.3, STD. DEV. = 2.58.)

-------
                       Table 11

 DISTRIBUTION OF MILES/GALLON ERROR (OBSERVED - CALCULATED)
FROM THE FIRST 505 SEC OF THE FEDERAL TEST PROCEDURE (FTP)
       AND THE SURVEILLANCE DRIVING SEQUENCES (SDS)

                                   NUMBER OF VEHICLES
  ERROR (MPG)                       FTP          SDS

  -14 to -13                          0            2
  -13 to -12                          0            1
  -12 to -11                          0            1
  -11 to -10                          4            5
  -10 to  -9                          3            1
   -9 to  -8                          4            6
   -8 to  -7                          5           14
   -7 to  -6                         13           17
   -6 to  -5                         27           26
   -5 to  -4                         25           27
   -4 to  -3                         32           30
   -3 to  -2                         50           71
   -2 to  -1                         84          200
   -1 to   0                        263          397
    0 to   1                        296          140
    1 to   2                        139           45
    2 to   3                         37           1?
    3 to   4                         18            8
    4 to   5                          5            6
    5 to   6                          4            0
    6 to   7                          5            0
    7 to   8                          0            1
    8 to   9                          1            1
    9 to  10                          0            0
   10 to  11                          1            0
   11 to  12                          0            0
   12 to  13                          0            1
   13 to  14                          1            1
   14 to  15                          1            1
   15 to  16                          1            0
   16 to  17                          1            1
   17 to  18                          0            1
   22 to  23                          1            0
   77 to  78                          0            1

-------
6.      SUMMARY AND CONCLUSIONS
        Refinements and extensions of the automobile exhaust modal analysis
model, as originally reported in EPA-460/3-74-005, have been completed in four
important areas:

        1)   Increased computational efficiency
        2)   Reduction of modal testing requirements
        3)   Accuracy and precision of group emission predictions
        4)   Prediction of fuel economy

These improvements broaden the capability of the model and provide increased
opportunities for applying the results of standard emissions tests to new
contexts.
        Improved computational efficiency derives primarily from a simplifica-
tion of the method by which the instantaneous emission function  e(v, a) is
integrated over a driving sequence.  For a particular driving sequence, it
was shown that integration of the basis functions over the sequence need be
performed only once for all vehicles subjected to that sequence.  Moreover,
it was shown that vehicle factors and driving sequence factors affecting
emissions can be essentially separated as specific vector quantities, the
inner product of which yields the emissions for the particular vehicle
and driving sequence combination.
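The separation of vehicle factors and driving sequence factors can be illustrated with a small sketch.  The six-term quadratic basis in speed and acceleration, the one-second time step, the function names and all numerical values below are assumptions chosen for illustration; they are not the report's actual basis functions, weighting or coefficients.

```python
def basis(v, a):
    """Hypothetical quadratic basis in speed v and acceleration a."""
    return [1.0, v, a, v * v, v * a, a * a]

def integrate_basis(sequence, dt=1.0):
    """Integrate each basis function over a driving sequence of (v, a) samples.

    This is done once per driving sequence, independently of any vehicle.
    """
    totals = [0.0] * 6
    for v, a in sequence:
        for j, value in enumerate(basis(v, a)):
            totals[j] += value * dt
    return totals

def total_emissions(coefficients, integrated_basis):
    """Inner product of a vehicle's coefficient vector with the sequence vector."""
    return sum(c * g for c, g in zip(coefficients, integrated_basis))

# Hypothetical four-second driving sequence: (speed in mph, accel in mph/sec).
sequence = [(0.0, 0.0), (10.0, 2.0), (20.0, 0.0), (10.0, -2.0)]
sequence_vector = integrate_basis(sequence)

# Hypothetical regression coefficients for two vehicles.
vehicles = {
    "vehicle_a": [1.0, 0.1, 0.5, 0.001, 0.0, 0.05],
    "vehicle_b": [2.0, 0.05, 0.2, 0.002, 0.01, 0.10],
}
emissions = {name: total_emissions(c, sequence_vector)
             for name, c in vehicles.items()}
print(sequence_vector, emissions)
```

The driving sequence is traversed only once; each additional vehicle costs a single inner product, which is the source of the computational saving described above.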
        Redefinition of modal testing requirements was examined by means of
variance-function analysis and principal-component analysis.  It was shown
that, although some redundancy exists in the modes as formulated, there are
also regions of the speed-acceleration plane not well represented by existing
modes.  It is indicated that the number of modes could be reduced without
serious loss of information but that modes should also be introduced to
cover the region of the ( v, a)-plane in the vicinity of accelerations
between -1 mph/sec and +1 mph/sec.
        The accuracy and precision of the model in predicting group emissions
was assessed by  (1) comparing model predictions with observed test values
for the SDS and FTP driving sequences, and  (2) evolving a scheme for defining

-------
model output variance for an arbitrary driving sequence.   Variances of the
model predictions for the SDS and FTP compare favorably with the corresponding
variances of observed bag values as actually determined by test.  The scheme
for computing model output variance for an arbitrary driving sequence stems
directly from the basis function integration simplifications evolved in
improving model computational efficiency.  In particular, it is shown that
if the variance-covariance matrix of the model coefficients for the vehicles
comprising a group is known, this information can be adapted to defining the
variance of total emissions over any driving sequence.  It is necessary only
to know, in addition to this variance-covariance matrix, the integrated forms
of the basis functions for the driving sequence under consideration.
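A sketch of the variance computation follows.  It assumes the standard result that a linear combination g'b of random coefficients b with variance-covariance matrix S has variance g'Sg, where g is the vector of integrated basis functions; the two-term vector, the covariance values and the function names are hypothetical.

```python
import math

def quadratic_form(g, cov):
    """Variance of the inner product g'b when b has covariance matrix cov."""
    n = len(g)
    return sum(g[i] * cov[i][j] * g[j] for i in range(n) for j in range(n))

# Hypothetical integrated basis functions for some driving sequence.
g = [4.0, 40.0]
# Hypothetical variance-covariance matrix of the model coefficients
# across the vehicles of a group (symmetric).
cov = [[0.25, 0.01],
       [0.01, 0.004]]

emission_variance = quadratic_form(g, cov)
# Standard error of the group mean for a group of 1020 vehicles,
# assuming the vehicles are an independent sample.
standard_error = math.sqrt(emission_variance / 1020)
print(emission_variance, standard_error)
```

Only the vector g changes from one driving sequence to another; the covariance matrix is a property of the vehicle group.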
        Prediction of fuel economy by means of the model can be achieved by
developing an equation for the emission-rate surface for CO2 in addition to
CO and HC.  Then, by means of a carbon-balance equation, the total output of
carbon-containing emission products can be transformed into an estimate of
the amount of fuel which produced these emission products.  Accuracy and
precision of the miles-per-gallon predictions are considered to be limited
primarily by the errors in measuring the modal outputs of CO2, CO and HC.

-------
                                  APPENDIX I
                  VARIANCE FUNCTIONS FOR REGRESSION ESTIMATES
          Regression analysis is one of the basic tools employed in the
 formulation of the modal analysis emission model.   As used in this context,
 it is to be understood in a generalized way permitting a relatively wide
 choice of form for the regression equation.   Of particular interest is
 precision of the regression model over the treatment space;  this precision
 can be evaluated by means of the concept of variance functions as discussed
 in this appendix.

          In the following discussion,  the mathematical basis of generalized
 regression analysis is presented, together with a  discussion of variance
 functions and their visualization as a variance surface by means of variance
 maps.

 1.        FOUNDATIONS OF GENERALIZED REGRESSION ANALYSIS

          Assume that a variable $\eta$, which shall be referred to as the
response variable, depends on $k$ experimental variables $x_1, x_2, \ldots, x_k$
and that a functional relation

    $$ \eta = f(x_1, x_2, \ldots, x_k) \tag{I-1} $$

exists.  Under certain conditions, the response equation (I-1) can be expanded
as

    $$ \eta = \beta_1 f_1(x_1, x_2, \ldots, x_k) + \beta_2 f_2(x_1, x_2, \ldots, x_k) + \beta_3 f_3(x_1, x_2, \ldots, x_k) + \cdots \tag{I-2} $$

where the $\beta_i$, $i = 1, 2, 3, \ldots$ are constants to be determined.  The
functions $f_1, f_2, f_3, \ldots$ are of arbitrary form provided only that they
are linearly independent and do not involve the constants $\beta_i$.  Equation
(I-2) is thus a linear function of the $\beta_i$, although the $f_i$ may be
nonlinear in $x_1, x_2, \ldots, x_k$.  Equation (I-2) is said to be a linear
model, and the functions $f_i$ may be regarded as basis vectors spanning a
vector space comprising a certain class of functions.

-------
           Suppose, now, that the function (I-1) is to be estimated from
experimental observations.  The variables $x_1, x_2, \ldots, x_k$ constitute a
$k$-dimensional space called the x-space, and one may estimate (I-1) from
observations taken at $n$ points in this space.  These $n$ points constitute
what shall subsequently be referred to as the experimental design.  In
general, $\eta$ cannot be observed at these points because of error.  Rather,
it is possible only to observe a variable $y$ related to $\eta$ by

    $$ y = \eta + \epsilon \tag{I-3} $$

where $\epsilon$ is a random error.  Then (I-2) assumes the form

    $$ y = \beta_1 f_1(x_1, x_2, \ldots, x_k) + \beta_2 f_2(x_1, x_2, \ldots, x_k) + \cdots + \epsilon \tag{I-4} $$

Since $\epsilon$ is a random variable, the responses observed at each design
point also constitute a random variable.  As a result, it is possible only to
obtain from the observations an equation of the form

    $$ \hat{y} = b_1 f_1(x_1, x_2, \ldots, x_k) + b_2 f_2(x_1, x_2, \ldots, x_k) + \cdots + b_p f_p(x_1, x_2, \ldots, x_k) \tag{I-5} $$

where $\hat{y}$ is an estimate of $\eta$ and $b_i$, $i = 1, 2, \ldots, p$, is
an estimate of $\beta_i$.

          Clearly, two types of errors can affect the approximation of the
function (I-1).  First, if (I-1) is to be approximated by the linear model
(I-2), then (I-1) must belong to the class of functions spanned by the basis
functions $f_1, f_2, \ldots, f_p$.  Second, some means must be found for
minimizing or controlling the effect of the random errors $\epsilon$, since
these will affect the estimation of the $\beta_i$.  Generally, the form of
(I-1) is unknown at the outset, and the experimenter has the option of
assuming a set of basis functions according to experience or prior knowledge
concerning the system under study.  For control of random errors, the theory
of least squares is employed.

-------
          In view of equation (I-5), each observation $y_i$ obtained in the
process of data collecting can be represented as

    $$ y_i = b_1 f_1(x_{1i}, x_{2i}, \ldots, x_{ki}) + b_2 f_2(x_{1i}, x_{2i}, \ldots, x_{ki}) + \cdots + b_p f_p(x_{1i}, x_{2i}, \ldots, x_{ki}) + e_i \tag{I-6} $$

where the coefficients $b_1, b_2, \ldots, b_p$ are estimates of
$\beta_1, \beta_2, \ldots, \beta_p$ and the $e_i$ are random errors as
computed from

    $$ e_i = y_i - \hat{y}_i \tag{I-7} $$

At the outset, $\hat{y}$ is not known, and it is the object of the
least-squares algorithm to estimate $\hat{y}$ in such a way that

    $$ \sum_{i=1}^{n} e_i^2 = \text{a minimum}. \tag{I-8} $$

We proceed to display the theory of this algorithm.

          In matrix notation, equation (I-6) can be written as

    $$ \mathbf{y} = X\mathbf{b} + \mathbf{e} \tag{I-9} $$

where $\mathbf{y}$ and $\mathbf{e}$ are n-rowed column vectors (or n x 1
matrices), $\mathbf{b}$ is a p-rowed column vector (or p x 1 matrix), and $X$
is a matrix of dimension n x p.  The set of points at which observations are
made will be referred to as the design region, and the set of functions
$f_1, f_2, \ldots, f_p$ will be referred to as the basis functions or simply
the basis.

-------
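          For a two-variable case, the X-matrix is generated as shown
below.

                              X - MATRIX

    $$
    X = \begin{bmatrix}
    f_1(x_{11}, x_{21}) & f_2(x_{11}, x_{21}) & \cdots & f_p(x_{11}, x_{21}) \\
    f_1(x_{11}, x_{22}) & f_2(x_{11}, x_{22}) & \cdots & f_p(x_{11}, x_{22}) \\
    \vdots & \vdots & & \vdots \\
    f_1(x_{11}, x_{2M}) & f_2(x_{11}, x_{2M}) & \cdots & f_p(x_{11}, x_{2M}) \\
    f_1(x_{12}, x_{21}) & f_2(x_{12}, x_{21}) & \cdots & f_p(x_{12}, x_{21}) \\
    f_1(x_{12}, x_{22}) & f_2(x_{12}, x_{22}) & \cdots & f_p(x_{12}, x_{22}) \\
    \vdots & \vdots & & \vdots \\
    f_1(x_{12}, x_{2M}) & f_2(x_{12}, x_{2M}) & \cdots & f_p(x_{12}, x_{2M}) \\
    \vdots & \vdots & & \vdots \\
    f_1(x_{1K}, x_{21}) & f_2(x_{1K}, x_{21}) & \cdots & f_p(x_{1K}, x_{21}) \\
    f_1(x_{1K}, x_{22}) & f_2(x_{1K}, x_{22}) & \cdots & f_p(x_{1K}, x_{22}) \\
    \vdots & \vdots & & \vdots \\
    f_1(x_{1K}, x_{2M}) & f_2(x_{1K}, x_{2M}) & \cdots & f_p(x_{1K}, x_{2M})
    \end{bmatrix}
    $$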
          The columns of the matrix are identified with the basis functions,
the rows with the design points.  In the example shown, $x_1$ assumes K
distinct values and $x_2$ assumes M distinct values, so that there are n = KM
points in the design.  Each basis function is evaluated at every point in the
design, and the resulting n x p array constitutes the X-matrix.  If the array
is rearranged, so that the rows become columns and the columns become rows,
the resulting matrix is the transpose of X.  The matrix X and its transpose X'
will be used extensively in the sequel.
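The construction of the X-matrix for a two-variable design can be sketched as follows.  The four basis functions, the factor levels and the function names are hypothetical choices for illustration; only the row ordering (the second variable varying fastest) follows the layout described above.

```python
from itertools import product

def design_matrix(x1_levels, x2_levels, basis_functions):
    """Evaluate every basis function at every design point.

    Rows correspond to design points (x1 outer, x2 inner); columns
    correspond to basis functions, giving an n x p array with n = K * M.
    """
    return [[f(x1, x2) for f in basis_functions]
            for x1, x2 in product(x1_levels, x2_levels)]

def transpose(matrix):
    """Interchange rows and columns."""
    return [list(column) for column in zip(*matrix)]

# Hypothetical basis: constant, x1, x2 and the cross product x1 * x2.
basis_functions = [
    lambda x1, x2: 1.0,
    lambda x1, x2: x1,
    lambda x1, x2: x2,
    lambda x1, x2: x1 * x2,
]

# K = 2 levels of x1 and M = 3 levels of x2, so n = 6 design points.
X = design_matrix([0.0, 30.0], [-1.0, 0.0, 1.0], basis_functions)
X_transposed = transpose(X)
for row in X:
    print(row)
```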

          Consider (I-9) and write the error vector $\mathbf{e}$ as

    $$ \mathbf{e} = \mathbf{y} - X\mathbf{b} \tag{I-10} $$

-------
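The sum of squares of the components $e_1, e_2, \ldots, e_n$ of the error
vector $\mathbf{e}$ can be written as

    $$ Q = \mathbf{e}'\mathbf{e} = (\mathbf{y} - X\mathbf{b})'(\mathbf{y} - X\mathbf{b}) \tag{I-11} $$

where $\mathbf{e}'$ is the transpose of $\mathbf{e}$ and
$(\mathbf{y} - X\mathbf{b})'$ is the transpose of $(\mathbf{y} - X\mathbf{b})$.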


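           In extenso, (I-11) becomes (since n = KM)

    $$ \sum_{i=1}^{n} e_i^2 = \sum_{k=1}^{K} \sum_{m=1}^{M} e_{km}^2
       = \sum_{k=1}^{K} \sum_{m=1}^{M} \left[ y_{km} - b_1 f_1(x_{1k}, x_{2m}) - b_2 f_2(x_{1k}, x_{2m}) - \cdots - b_p f_p(x_{1k}, x_{2m}) \right]^2 \tag{I-12} $$

           By differentiating (I-12) with respect to the $b_i$, one obtains
a set of p equations of the form

    $$ \frac{\partial Q}{\partial b_i} = 0, \qquad i = 1, 2, \ldots, p \tag{I-13} $$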
which can be solved for the $b_i$ to minimize $Q$.  The result can be
summarized succinctly in the form

    $$ X'X\mathbf{b} = X'\mathbf{y} \tag{I-14} $$

where $X'X$ is a square matrix of order p.  Equation (I-14) provides the
so-called normal equations of least squares.  Then

    $$ \mathbf{b} = (X'X)^{-1} X'\mathbf{y} \tag{I-15} $$

is the formal solution minimizing the error sum of squares.
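The normal equations can be solved numerically without forming the inverse explicitly.  The sketch below, written in plain Python, builds X'X and X'y and solves the resulting system by Gaussian elimination with partial pivoting; the function names and the small straight-line data set are hypothetical, and a production analysis would normally rely on an established linear algebra library.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def solve(a, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular: basis functions are not "
                             "linearly independent on this design")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def least_squares(X, y):
    """Coefficients b satisfying the normal equations X'X b = X'y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * yi for x, yi in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)

# Hypothetical data lying exactly on y = 1 + 2x, with basis functions 1 and x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
b = least_squares(X, y)
print(b)
```

With error-free data on a straight line the solution recovers the intercept and slope; with noisy data the same call returns the coefficients that minimize Q.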

-------
          It is of interest to investigate the statistical properties of the
least squares solution under certain assumptions.  For the error vector
$\boldsymbol{\epsilon}$ we assume that

    $$ E(\boldsymbol{\epsilon}) = \mathbf{0}, \qquad E(\boldsymbol{\epsilon}\boldsymbol{\epsilon}') = I\sigma^2 \tag{I-16} $$

where $E$ denotes expectation and $I$ is the identity matrix of order n.
Equations (I-16) are equivalent to the assumption that the errors are
uncorrelated, with mean zero and constant variance $\sigma^2$.

          From equation (I-15), it is clear that

$$\mathbf{b} = CX'\mathbf{y} \tag{I-17}$$

where $C = (X'X)^{-1}$.  Note that, for the design points, (I-4) can be
written as

$$\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon} \tag{I-18}$$


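Substituting (I-18) into (I-17) one obtains

$$\mathbf{b} = CX'(X\boldsymbol{\beta} + \boldsymbol{\epsilon}) = CX'X\boldsymbol{\beta} + CX'\boldsymbol{\epsilon} = \boldsymbol{\beta} + CX'\boldsymbol{\epsilon} \tag{I-19}$$

Since $\boldsymbol{\epsilon}$ is a random vector, $\mathbf{b}$ is also a random vector.
Taking expectations in (I-19), one sees that

$$\begin{aligned} E(\mathbf{b}) &= E(\boldsymbol{\beta} + CX'\boldsymbol{\epsilon}) \\ &= E(\boldsymbol{\beta}) + E(CX'\boldsymbol{\epsilon}) \\ &= \boldsymbol{\beta} + CX'E(\boldsymbol{\epsilon}) = \boldsymbol{\beta} \end{aligned} \tag{I-20}$$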

-------
Thus, the estimates provided by (I-15) are unbiased, provided the postulated
form of the function f is correct.  In the event of an incorrect choice of
model, the estimates of the coefficients will be biased to an extent depending
on the degree of discrepancy between the postulated and true models.

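          Unbiasedness derives from the ability to substitute for $\mathbf{y}$ its
equivalent $X\boldsymbol{\beta} + \boldsymbol{\epsilon}$.  Suppose the model is inadequate and requires additional
basis functions so that the true model is

$$\mathbf{y} = X\boldsymbol{\beta} + X_1\boldsymbol{\beta}_1 + \boldsymbol{\epsilon}$$

where $X_1\boldsymbol{\beta}_1$ denotes the additional terms in the expansion.  Then

$$\mathbf{b} = CX'\mathbf{y} = CX'(X\boldsymbol{\beta} + X_1\boldsymbol{\beta}_1 + \boldsymbol{\epsilon}) = CX'X\boldsymbol{\beta} + CX'X_1\boldsymbol{\beta}_1 + CX'\boldsymbol{\epsilon}$$

and

$$E(\mathbf{b}) = \boldsymbol{\beta} + CX'X_1\boldsymbol{\beta}_1 = \boldsymbol{\beta} + A\boldsymbol{\beta}_1$$

The matrix $A = CX'X_1 = (X'X)^{-1}X'X_1$ is called the alias matrix.  Though it
is useful in indicating the extent of confounding among various coefficients
in the correct model, it is defined only in relation to an alternative hypothesis.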
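
As a small numerical illustration of the alias matrix, the following Python sketch fits a straight line at the three design points x = -1, 0, 1 when the true response also contains a quadratic term. The design, the coefficient values and all function names are assumptions chosen for illustration; they are not taken from the modal analysis model.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)]
            for row in a]


def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


points = [-1.0, 0.0, 1.0]            # assumed design points
X = [[1.0, x] for x in points]       # fitted model: b1 + b2*x
X1 = [[x * x] for x in points]       # omitted basis function: x**2

Xt = transpose(X)
C = inverse_2x2(matmul(Xt, X))       # C = (X'X)^-1
A = matmul(C, matmul(Xt, X1))        # alias matrix A = C X' X1

beta = [1.0, 2.0]                    # assumed true coefficients of 1 and x
beta1 = [3.0]                        # assumed true coefficient of x**2

# Noise-free observations from the true (quadratic) model.
y = [[beta[0] + beta[1] * x + beta1[0] * x * x] for x in points]

b = [row[0] for row in matmul(C, matmul(Xt, y))]               # b = C X' y
expected = [beta[j] + A[j][0] * beta1[0] for j in range(2)]    # beta + A beta1

print("alias matrix:", A)
print("fitted b:    ", b)
print("beta + A*b1: ", expected)
```

With these assumed values the intercept absorbs two-thirds of the omitted quadratic coefficient, while the slope is unaffected.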

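          Consider, now, the covariance matrix of $\mathbf{b}$.  Denoted $V(\mathbf{b})$, the
covariance matrix is of order $p \times p$ and is given by

$$V(\mathbf{b}) = E\,[\mathbf{b} - E(\mathbf{b})][\mathbf{b} - E(\mathbf{b})]' \tag{I-21}$$

But, from (I-20),

$$\mathbf{b} - E(\mathbf{b}) = \mathbf{b} - \boldsymbol{\beta} \tag{I-22}$$

and applying the results of (I-19) gives

$$\mathbf{b} - E(\mathbf{b}) = \boldsymbol{\beta} + CX'\boldsymbol{\epsilon} - \boldsymbol{\beta} = CX'\boldsymbol{\epsilon} \tag{I-23}$$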

-------
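Therefore,

$$V(\mathbf{b}) = E\,[CX'\boldsymbol{\epsilon}][CX'\boldsymbol{\epsilon}]' = E\,[CX'\boldsymbol{\epsilon}\boldsymbol{\epsilon}'XC'] \tag{I-24}$$

or, since $C$ is symmetric,

$$V(\mathbf{b}) = CX'E(\boldsymbol{\epsilon}\boldsymbol{\epsilon}')XC \tag{I-25}$$

But, by the assumption of (I-16),

$$E(\boldsymbol{\epsilon}\boldsymbol{\epsilon}') = I\sigma^2$$

Therefore,

$$V(\mathbf{b}) = CX'XC\sigma^2 = C\sigma^2 \tag{I-26}$$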
          The results of (I-26) can be expressed as follows:

     (a)  The diagonal elements of the matrix $C = (X'X)^{-1}$, when
          multiplied by $\sigma^2$, the variance of the individual
          observations, provide the variances of the estimated
          coefficients in the model.

     (b)  The off-diagonal elements of $C$ similarly provide the
          covariance between two estimated coefficients, $b_i$ and $b_j$.
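
The chain from the normal equations (I-14) to the coefficient covariances (I-26) can be traced numerically. The Python sketch below is illustrative only: the straight-line basis, the four observations, the value of sigma2 and all function names are assumptions, and the small Gauss-Jordan routine stands in for whatever matrix inversion the original programs used.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)]
            for row in a]


def invert(a):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def least_squares(X, y):
    """Return (b, C) with C = (X'X)^-1 and b = C X' y, as in (I-15), (I-17)."""
    Xt = transpose(X)
    C = invert(matmul(Xt, X))
    b = matmul(C, matmul(Xt, [[v] for v in y]))
    return [row[0] for row in b], C


# Assumed data: basis functions f1 = 1, f2 = x at four design points.
X = [[1.0, x] for x in (0.0, 1.0, 2.0, 3.0)]
y = [1.1, 2.9, 5.2, 6.8]
b, C = least_squares(X, y)

sigma2 = 0.04                                       # assumed observation variance
var_b = [C[j][j] * sigma2 for j in range(len(b))]   # result (a): variances
cov_b1_b2 = C[0][1] * sigma2                        # result (b): covariance

print("b =", b)
print("Var(b) =", var_b, " Cov(b1, b2) =", cov_b1_b2)
```

Because C depends only on the design matrix, the variances and covariances of the coefficients can be examined before any observations are taken; only the scale factor sigma2 comes from the data.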
 2.       VARIANCE FUNCTIONS
           The variable $y$ is often referred to as the response.  We wish to
 study the variance to which the estimated response $\hat{y}$ is subject as we
 consider different points in the x-space.  The necessary information can be
 obtained by an extension of the above reasoning.

           Consider an arbitrary point $(x_1, x_2)$ in treatment space and
 define a corresponding row vector $\mathbf{x}$ as

$$\mathbf{x} = [\,f_1(x_1, x_2),\; f_2(x_1, x_2),\; \ldots,\; f_p(x_1, x_2)\,]$$

-------
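 Then the estimated response at that point is

$$\hat{y} = \mathbf{x}\mathbf{b} = b_1 f_1(x_1, x_2) + b_2 f_2(x_1, x_2) + \cdots + b_p f_p(x_1, x_2)$$

 where $\mathbf{b}$ is a column vector of coefficients.
 Then,

$$E(\hat{y}) = E(\mathbf{x}\mathbf{b}) = \mathbf{x}E(\mathbf{b}) = \mathbf{x}\boldsymbol{\beta} \tag{I-29}$$

 and

$$\begin{aligned} \operatorname{Var}(\hat{y}) &= E\,[\hat{y} - E(\hat{y})][\hat{y} - E(\hat{y})]' \\ &= E\,[\mathbf{x}\mathbf{b} - \mathbf{x}\boldsymbol{\beta}][\mathbf{x}\mathbf{b} - \mathbf{x}\boldsymbol{\beta}]' \\ &= E\,[\mathbf{x}(\mathbf{b} - \boldsymbol{\beta})][\mathbf{x}(\mathbf{b} - \boldsymbol{\beta})]' \\ &= \mathbf{x}\,E\,[(\mathbf{b} - \boldsymbol{\beta})(\mathbf{b} - \boldsymbol{\beta})']\,\mathbf{x}' \end{aligned} \tag{I-30}$$

 But $E\,[(\mathbf{b} - \boldsymbol{\beta})(\mathbf{b} - \boldsymbol{\beta})'] = V(\mathbf{b})$.  Therefore,

$$\operatorname{Var}(\hat{y}) = \mathbf{x}(X'X)^{-1}\mathbf{x}'\sigma^2 \tag{I-31}$$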
          Equation (I-31) gives the variance of the estimated response at an
arbitrary point $(x_1, x_2)$ in the sampling plane.  Note that this variance
depends strongly on the form of the $X$ matrix, which is determined both
by the location of the design points and the form of the basis functions.

          Equation (I-31) theoretically provides an estimate of the variance
of the estimated response at every point in the x-space.  In the event that the
x-space is two-dimensional, it is possible to display contour maps of this
variance.  The variance is computed at every point in an array of points in the
x-space plane, and these values are then thresholded and displayed as a
variance map.  Figure I-1 provides a graphic presentation of such a variance
surface for a $3^2$ factorial design using the basis functions listed.  Similar
techniques were applied to the modal analysis data in generating the variance
maps exhibiting the effect of reallocation of modes on the precision of
estimation of the model.  In those applications, the two variables $x_1$ and
$x_2$ were speed $v$ and acceleration $a$.
             FIGURE I-1    VARIANCE SURFACE FOR $3^2$ FACTORIAL DESIGN
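
The thresholded variance map described above can be sketched in a few lines of Python. The basis functions behind the figure are not recoverable here, so a full quadratic in two variables is assumed; the coded levels -1, 0, 1, the grid, the thresholds, the plotting symbols and all function names are likewise illustrative assumptions. The quantity mapped is Var(y-hat)/sigma^2 = x (X'X)^-1 x' from (I-31).

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)]
            for row in a]


def invert(a):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def basis(x1, x2):
    """Assumed basis: full quadratic in two variables (not from the report)."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]


# 3^2 factorial design: every combination of three coded levels.
levels = (-1.0, 0.0, 1.0)
X = [basis(a, c) for a in levels for c in levels]
C = invert(matmul(transpose(X), X))          # C = (X'X)^-1


def variance_factor(x1, x2):
    """Var(y-hat)/sigma^2 at the point (x1, x2), i.e. x C x' from (I-31)."""
    x = basis(x1, x2)
    n = len(x)
    return sum(x[i] * C[i][j] * x[j] for i in range(n) for j in range(n))


def variance_map(grid, thresholds=(0.6, 0.8, 1.2), symbols=".:*#"):
    """Threshold the variance factor on a square grid; one string per row."""
    rows = []
    for x2 in reversed(grid):                # print the largest x2 first
        rows.append("".join(
            symbols[sum(variance_factor(x1, x2) > t for t in thresholds)]
            for x1 in grid))
    return rows


grid = [i * 0.5 - 1.5 for i in range(7)]     # -1.5, -1.0, ..., 1.5
for line in variance_map(grid):
    print(line)
```

Under these assumptions the variance factor is 5/9 at the center of the design and 29/36 at a corner, and it grows rapidly once the grid extends beyond the design points.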

-------
                           APPENDIX II

    COMPUTER PROGRAM REVISIONS FOR INCREASED COMPUTATIONAL EFFICIENCY

                       ***MAIN PROGRAM I***

C
C     MAIN PROGRAM I DETERMINES THE AMOUNT OF EMISSIONS GIVEN OFF BY
C     INDIVIDUAL VEHICLES OVER A DRIVING SEQUENCE SPECIFIED BY
C     ARRAY 'VVT'.
C
C     VTM(I)=>VELOCITY VS. TIME(IN ONE SECOND INTERVALS) OF THE
C     SURVEILLANCE DRIVING SEQUENCE.VTM(I)=VELOCITY(MPH) AT TIME
C     (I-1)SEC (REAL*4)
C
C     VVT(I)=>VELOCITY VS. TIME(IN ONE SECOND INTERVALS) OF ANY DRIVING
C     SEQUENCE OVER WHICH EMISSIONS ARE TO BE CALCULATED.VVT(I)=VELOCITY
C     AT TIME (I-1) SEC. (REAL*4)
C
C     AMTC(I,J)=> AMOUNT OF I'TH EMITTANT GIVEN OFF IN J'TH MODE.
C
C     DS(I)=DISTANCE(MILES)TRAVELED IN I'TH MODE.NOTE:STEADY STATE MODES
C     ARE 60 SEC IN DURATION.
C
C     FUNC(I)=> INTEGRATED BASIS FUNCTIONS CHARACTERIZING A
C               DRIVING SEQUENCE (REAL*8)
C
C

      DIMENSION ITAB(20,2),IDAT(4,19),RDAT(161,19),DS(37)
      DIMENSION VTM(1055),VVT(2000),AMTC(4,37)
      REAL*8 C(4),FUNC(12)
      REAL*8 AA(9,32),AS(3,5),BAD(4,12),XMPG,HCGPM,COGPM,C02GPM
      DATA DS/.0602,.0741,.0201,.0705,.1360,.1268,.2163,.1716
     C,.2043,.3367,.3136,.1973,.3313,.2994,.0579,.0173,.1759,.1392,.1528
     C,.1304,.2654,.2634,.0737,.3134,.2362,.0444,.4009,.3293,.0886,.2599
     C,.1813,.0592,.0000,.2500,.5000,.7500,1.000/
      DEFINE FILE 99(75,3256,U,N1)
C
C      READ IN SURVEILLANCE  DRIVING SEQUENCE
C
      PRINT 1003
 1003 FORMAT(1H0,'SURV. DRIVING SEQ.'//)
      DO 3000 I=1,100
      NX1=((I-1)*16)+1
      NX2=NX1+15
      READ(5,100)(VTM(K),K=NX1,NX2)
      PRINT 1002, (VTM(K),K=NX1,NX2)
 1002 FORMAT(1H0,16F8.0)
  100 FORMAT(16F5.0)
      IF(VTM(NX1).GT.99.0)GOTO3111
 3000 CONTINUE
 3111 CONTINUE
C
C     READ IN DRIVING SEQUENCE OVER WHICH EMMISSIONS ARE TO BE CALCULATE
C
C     IN THIS EXAMPLE VVT=> FIRST 505 SEC. OF FTP
C
      PRINT 1004
 1004 FORMAT(1H0,'FTP DRIVING SEQ.'//)
      NPTS=506
      DO 1500 I=1,100
      NX1=((I-1)*16)+1
      NX2=NX1+15
      READ(5,100)(VVT(K),K=NX1,NX2)
      PRINT 1002, 
-------
      DO 1000 IR=1,37
      DD=1.0
C
C  FOR A/D MODES CHANGE DATA FROM GRMS/MILE TO GRMS
C
      IF(IR.LE.32)DD=DS(IR)
      DO 1001 IC=1,4
      IW = ((IR-1)*4) + 13 + IC
      AMTC(IC,IR)=RDAT
-------
      SUBROUTINE ESUM(VVT,NT,FUNC,DIST)
C     ******************************************************************
C
C     SUBROUTINE ESUM INTEGRATES THE BASIS FUNCTIONS OVER THE
C     INPUTTED DRIVING SEQUENCE AND DETERMINES THE DISTANCE TRAVELED
C
C
C     VVT(I)=>VELOCITY VS. TIME HISTORY(DRIVING CYCLE) IN ONE SECOND
C             INTERVALS. VVT(I)=VELOCITY(MPH) AT THE I'TH SECOND.REAL*4
C
C     NT=>MAXIMUM NUMBER SECONDS IN DRIVING CYCLE+1 SECOND
C
C     FUNC(I)=> INTEGRATED BASIS FUNCTIONS CHARACTERIZING THE DRIVING
C            SEQUENCE (REAL*8)
C
C     DIST=DISTANCE(MILES)IN SPECIFIED DRIVING CYCLE,REAL*4
C
C     ******************************************************************
      DIMENSION VVT(NT)
      REAL*8 X(12),FUNC(12),DIS,AMIN,AMAX,A1,A2,HOA
      AMAX=1.0D0
      AMIN=-1.20D0
      A1=-1.0D0/AMIN
      A2=-1.0D0/AMAX
C
C  CLEAR FUNC ARRAY
C
      DO 1000 I=1,12
 1000 FUNC(I)=0.0D0
C
C     INTEGRATE AUTO'S EMISSION RATE FUNCTION OVER DRIVING CYCLE
C
      DIS=0.0D0
      NTT=NT-1
      DO 3000 IT=1,NTT
      KT=IT+1
      X(1)=1.0D0
      X(2)=DBLE((VVT(IT)+VVT(KT))/2.0)
      X(3)=DBLE(VVT
-------
      SUBROUTINE EDOT(AMTC,AA,AS,BAD)
C
C     SUBROUTINE EDOT COMPUTES THE COEFFICIENTS THAT SPECIFY AN AUTO'S
C     INSTANTANEOUS EMISSION RATE FUNCTIONS FOR HC,CO,NOX(ARRAY 'BAD'),
C     GIVEN THE AMOUNT OF EACH EMITTANT GIVEN OFF BY THE AUTO IN 32 A/D
C     MODES AND 5 STEADY STATE MODES(ARRAY 'AMTC'),AND THE BASIS
C     FUNCTION FACTOR ARRAYS(AA,AS).
C
C*****THIS VERSION CALCULATES COEFFICIENTS FOR CO2 ALSO
C     THE DO 1000 LOOP CHANGED TO I=1,4
C
C     AMTC(I,J)=AMOUNT(GMS) OF THE I'TH EMITTANT GIVEN OFF BY THIS AUTO
C     IN THE J'TH MODE.  I=1=>HC,I=2=>CO,I=3=>CO2,I=4=>NOX,
C     J=1,37 (32 A/D MODES, 5 STEADY STATE MODES). (REAL*4)
C
C     BAD(I,J)=J'TH COEFFICIENT OF THIS AUTO'S INSTANTANEOUS EMISSION
C     RATE FUNCTION FOR THE I'TH KIND OF EMITTANT.I=1=>HC,I=2=>CO,
C     I=3=>CO2,I=4=>NOX. (REAL*8)
C
C     AA=BASIS FUNCTION FACTOR ARRAY FOR ACEL/DECEL(CALCULATED BY
C     SUBROUTINE SETUP).
C
C     AS=BASIS FUNCTION FACTOR ARRAY FOR STEADY STATE(CALCULATED BY
C     SUBROUTINE SETUP).
C
C     TM(I)=TIME(SEC) IN I'TH MODE.(REAL*4)
C
C     ******************************************************************
      DIMENSION TM(37),AMTC(4,37)
      REAL*8 AA(9,32),AS(3,5),BAD(4,12),SUM,YA(32),YS(5),B(3),XO,X1,X2
     C,A1,A2
      DATA TM/12.,16.,8.,11.,13.,12.,17.,12.,14.,30.,26.,21.,32.,23.,9.,
     C8.,22.,16.,18.,19.,25.,28.,15.,25.,18.,10.,38.,35.,18.,21.,14.,13.
     C,60.,60.,60.,60.,60./
      NOBSA=32
      NOBSS=5
      NBFA=9
      NBFS=3
C
      DO 1000 IC=1,4
C
C     IC=1=>HC,IC=2=>CO,IC=3=>CO2,IC=4=>NOX
C
C     CALCULATE OBSERVED AVERAGE EMISSION RATES OVER 32 A/D MODES
C
      DO 1100 I=1,32
      A1=AMTC(IC,I)
      A2=TM(I)
      YA(I)=A1/A2
 1100 CONTINUE
C
C     CALCULATE COEFFICIENTS THAT SPECIFY A/D EMISSION RATE FUNCTIONS
C
      DO 1200 I=1,NBFA
      SUM=0.0D0
      DO 1250 J=1,NOBSA
      SUM=SUM+(AA(I,J)*YA(J))
 1250 CONTINUE
      BAD(IC,I)=SUM
 1200 CONTINUE
C
C     CALCULATE OBSERVED AVERAGE EMISSION RATES OVER 5 SS MODES
C
      DO 2000 I=33,37
      IP=I-32
      A1=AMTC(IC,I)
      A2=TM(I)
      YS(IP)=A1/A2
 2000 CONTINUE
C
C     CALCULATE COEFFICIENTS THAT SPECIFY SS EMISSION RATE FUNCTIONS
C
      DO 2001 I=1,NBFS
      SUM=0.0D0
      DO 2100 J=1,NOBSS
      SUM=SUM+(AS(I,J)*YS(J))
 2100 CONTINUE
      B(I)=SUM
 2001 CONTINUE

C
C     CHECK ON EXISTENCE OF NEGATIVE EMISSION RATES
C
      LOOP=0
      IF(B(3).EQ.0.0D0)GOTO2151
      XO=(B(2)**2)-(4.0D0*B(3)*B(1))
      IF(XO.LT.0.0D0)GOTO2153
      XO=DSQRT((B(2)**2)-(4.0D0*B(3)*B(1)))
      X1=(-B(2)+XO)/(2.0D0*B(3))
      X2=(-B(2)-XO)/(2.0D0*B(3))
      IF((X1.GT.0.0D0.AND.X1.LT.60.0D0).OR.(X2.GT.0.0D0.AND.X2.LT.60.0D0
     C))LOOP=1
      GOTO2153
 2151 XO=-B(1)/B(2)
      IF(XO.GT.0.0D0.AND.XO.LT.60.0D0)LOOP=2
 2153 IF(LOOP.EQ.0)GOTO2154
C
C     IF LOOP=0=>NO NEGATIVE EMISSIONS FOR VELOCITYS BETWEEN 0,60
C
C     IF LOOP=1 OR 2=> NEGATIVE EMISSION RATES BETWEEN 0,60MPH.
C
C     CALL SUBROUTINE PAD TO FIND COEFFICIENTS WHICH DO NOT PRODUCE
C     NEGATIVE EMISSION RATES.
C
      CALL PAD(YS,B)
 2154 BAD(IC,10)=B(1)
      BAD(IC,11)=B(2)
      BAD(IC,12)=B(3)
 1000 CONTINUE
      RETURN
      END
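
The check on negative steady-state emission rates in subroutine EDOT can be restated compactly. The Python sketch below mirrors the LOOP flag logic of the listing, assuming (from the root formulas in the listing) that B(1), B(2) and B(3) are the constant, linear and quadratic coefficients of a rate function in speed. The function name, the v_max parameter and the guard against B(2) = 0 are illustrative additions, not part of the original program.

```python
import math


def negative_rate_flag(b1, b2, b3, v_max=60.0):
    """Mimic the LOOP flag of EDOT for the rate b1 + b2*v + b3*v**2.

    Returns 0 if no root lies strictly between 0 and v_max,
            1 if a root of the quadratic lies in that interval,
            2 if the function is linear (b3 == 0) and its root does.
    """
    if b3 == 0.0:
        if b2 == 0.0:
            # Constant rate: no sign change. This guard is an addition;
            # the listing divides by B(2) without testing it.
            return 0
        root = -b1 / b2
        return 2 if 0.0 < root < v_max else 0

    disc = b2 * b2 - 4.0 * b3 * b1
    if disc < 0.0:
        return 0                      # no real roots, so no sign change
    s = math.sqrt(disc)
    r1 = (-b2 + s) / (2.0 * b3)
    r2 = (-b2 - s) / (2.0 * b3)
    return 1 if (0.0 < r1 < v_max or 0.0 < r2 < v_max) else 0


# Hypothetical coefficient sets, chosen only to exercise each branch.
print(negative_rate_flag(1.0, 0.0, 0.01))     # no real roots
print(negative_rate_flag(1.0, -0.1, 0.001))   # quadratic root inside 0-60
print(negative_rate_flag(3.0, -0.1, 0.0))     # linear root at 30 mph
```

In the listing, a nonzero flag leads to CALL PAD(YS,B), which replaces the coefficients with a set that does not produce negative rates; PAD itself is not reproduced in this sketch.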
-------