EPA/600/R-01-068
December 2001

Computational Methods for Sensitivity and Uncertainty Analysis for Environmental and Biological Models

By Sastry S. Isukapalli and Panos G. Georgopoulos
Computational Chemodynamics Laboratory
Environmental and Occupational Health Sciences Institute
Rutgers University and UMDNJ RWJ Medical School
170 Frelinghuysen Road, Piscataway, NJ 08854

Cooperative Agreement CR823717

Project Officer
Daewon W. Byun
Atmospheric Modeling Division
National Exposure Research Laboratory
Research Triangle Park, NC 27711

National Exposure Research Laboratory
Office of Research and Development
U.S. Environmental Protection Agency
Research Triangle Park, NC 27711

DISCLAIMER

This document has been reproduced from the best copy furnished by the sponsoring agency. It is being released in the interest of making available as much information as possible.

NOTICE

The U.S. Environmental Protection Agency through its Office of Research and Development partially funded and collaborated in the research described here under Cooperative Agreement CR823717 to the Environmental and Occupational Health Sciences Institute, Rutgers University. It has been subjected to the Agency's peer and administrative review and has been approved for publication as an EPA document.

ABSTRACT

Characterization of the uncertainty associated with transport-transformation models is often of critical importance, for example when environmental and biological models are employed in risk assessment. However, uncertainty analysis using conventional methods such as standard Monte Carlo or Latin Hypercube Sampling may not be efficient, or even feasible, for complex, computationally demanding models. This work introduces a computationally efficient alternative method for uncertainty propagation, the Stochastic Response Surface Method (SRSM).
The SRSM approximates uncertainties in model outputs through a series expansion in normal random variables (a polynomial chaos expansion). The unknown coefficients in the series expansions are calculated using a limited number of model simulations. The method is analogous to the approximation of a deterministic system by an algebraic response surface. Further improvements in the computational efficiency of the SRSM are accomplished by coupling the SRSM with ADIFOR, which automates the calculation of partial derivatives in numerical models coded in Fortran. The coupled method, SRSM-ADIFOR, uses the model outputs and their derivatives to calculate the unknown coefficients. The SRSM and the SRSM-ADIFOR are general methods, applicable to any model with random inputs. The SRSM has also been implemented as a black-box, web-based tool to facilitate its easy use.

The SRSM and the SRSM-ADIFOR have been applied to a set of environmental and biological models. In all the case studies, the SRSM required an order of magnitude fewer simulations than conventional methods, and the SRSM-ADIFOR required even fewer. In addition to their computational efficiency, these methods directly provide sensitivity information and the individual contributions of input uncertainties to output uncertainties; conventional methods require substantially larger numbers of simulations to provide such information. Thus, the SRSM and the SRSM-ADIFOR provide computationally efficient means for uncertainty and sensitivity analysis.

Finally, this research addresses uncertainties associated with model structure and resolution, with application to photochemical air quality modeling. A three-dimensional version of the regulatory Reactive Plume Model (RPM), RPM-3D, has been developed and applied to understand model uncertainty.

ACKNOWLEDGMENTS

• U.S.
EPA National Exposure Research Laboratory (NERL) (Cooperative Agreement CR823717)
• NJDEP-funded Ozone Research Center at EOHSI
• U.S. DOE Consortium for Risk Evaluation with Stakeholder Participation (CRESP) (Cooperative Agreement DE-FC01-95EW55084)

Contents

1 INTRODUCTION
  1.1 Uncertainty Analysis
  1.2 Transport-Transformation Models
  1.3 Limitations in Performing Uncertainty Analysis
  1.4 Objective of this Research
  1.5 Outline of the Document

2 BACKGROUND
  2.1 Types and Origins of Uncertainty in Transport-Transformation Models
    2.1.1 Natural Uncertainty and Variability
    2.1.2 Model Uncertainty
    2.1.3 Parametric/Data Uncertainty
  2.2 Reducible and Irreducible Uncertainty
  2.3 Approaches for Representation of Uncertainty
    2.3.1 Interval Mathematics
    2.3.2 Fuzzy Theory
    2.3.3 Probabilistic Analysis
  2.4 Sensitivity and Sensitivity/Uncertainty Analysis
  2.5 Conventional Sensitivity/Uncertainty Analysis Methods
    2.5.1 Sensitivity Testing Methods
    2.5.2 Analytical Methods
    2.5.3 Sampling Based Methods
    2.5.4 Computer Algebra Based Methods
  2.6 Need for Alternative Sensitivity/Uncertainty Analysis Methods

3 THE STOCHASTIC RESPONSE SURFACE METHOD: DEVELOPMENT AND IMPLEMENTATION
  3.1 Overview of the Method
  3.2 Step I: Representation of Stochastic Inputs
    3.2.1 Direct Transformations
    3.2.2 Transformation via Series Approximation
    3.2.3 Transformation of Empirical Distributions
    3.2.4 Transformation of Correlated Inputs
  3.3 Step II: Functional Approximation of Outputs
  3.4 Step III: Estimation of Parameters in the Functional Approximation
    3.4.1 Deterministic Equivalent Modeling Method/Probabilistic Collocation Method (DEMM/PCM)
    3.4.2 Efficient Collocation Method (ECM)
    3.4.3 Regression Based SRSM
  3.5 Step IV: Estimation of the Statistics of the Output Metrics
  3.6 Step V: Evaluation of Convergence, Accuracy and Efficiency of Approximation
  3.7 An Illustration of the Application of the SRSM
  3.8 Computational Implementation
    3.8.1 Implementation of the Conventional Uncertainty Propagation Methods
    3.8.2 Implementation of the Stochastic Response Surface Method
  3.9 Implementation of a Web Interface to the SRSM

4 COUPLING OF THE SRSM WITH SENSITIVITY ANALYSIS METHODS: DEVELOPMENT AND IMPLEMENTATION OF SRSM-ADIFOR
  4.1 Methods for Estimation of Partial Derivatives
  4.2 Automatic Differentiation using ADIFOR
  4.3 Coupling of SRSM and ADIFOR
    4.3.1 Selection of Sample Points
  4.4 An Illustration of the Application of the SRSM-ADIFOR
  4.5 Implementation of the SRSM-ADIFOR

5 CASE STUDIES FOR THE EVALUATION OF THE SRSM AND THE SRSM-ADIFOR
  5.1 Case Study I: A Zero Dimensional Physiological System: Uncertainty Analysis of a Physiologically Based PharmacoKinetic Model for Perchloroethylene
    5.1.1 Description of the Case Study
    5.1.2 Specification of Parameter Uncertainty
    5.1.3 Implementation of the PERC PBPK Model
    5.1.4 Results for the PBPK Case Study
  5.2 Case Study II: A Two-Dimensional Photochemical Air Quality Model: Uncertainty Analysis of the Reactive Plume Model (RPM-IV)
    5.2.1 Description of RPM-IV
    5.2.2 Uncertainty Analysis of RPM-IV
    5.2.3 Results and Discussion
  5.3 Case Study III: A Three-Dimensional Urban/Regional Scale Photochemical Air Quality Model: Uncertainty Analysis of the Urban Airshed Model (UAM-IV) for the New Jersey/Philadelphia Domain
    5.3.1 Uncertainties Associated with Biogenic Emission Estimates
    5.3.2 Uncertainty Analysis
    5.3.3 Estimation of Uncertainties
    5.3.4 Results of the Case Study
  5.4 Case Study IV: A Ground Water Model with Discontinuous Probability Distributions: Uncertainty Analysis of Tritium Contamination in a Landfill using the EPACMTP Model
    5.4.1 EPACMTP Model
    5.4.2 Data Sources for the Application of EPACMTP
    5.4.3 Implementation of the Monte Carlo Method in EPACMTP
    5.4.4 Uncertainty Analysis
    5.4.5 Results and Discussion

6 CHARACTERIZATION AND REDUCTION OF MODEL UNCERTAINTY: AN ATMOSPHERIC PLUME MODEL STUDY
  6.1 Introduction and Background
  6.2 Photochemical Air Pollution Modeling
    6.2.1 Mathematical and Numerical Formulation of PAQSMs
  6.3 Formulation of the Reactive Plume Model (RPM)
    6.3.1 Initial Physical Dimensions of the Plume
    6.3.2 RPM-IV Model Equations
    6.3.3 Limitations of the RPM-IV
  6.4 Formulation of the RPM-3D
  6.5 Case Studies
  6.6 Discussion

7 CONCLUSIONS AND DISCUSSION
  7.1 Development and Application of the SRSM
  7.2 Development and Application of the SRSM-ADIFOR
  7.3 Consideration of Uncertainties Beyond Parametric/Data Uncertainty
  7.4 Directions for Future Work
    7.4.1 Improvements to the Stochastic Response Surface Method
    7.4.2 Further Evaluation of the SRSM and the SRSM-ADIFOR
    7.4.3 Addressing Uncertainty Propagation under Constraints
    7.4.4 Random Processes and Random Fields
    7.4.5 Uncertainties Associated with Evaluation Data

List of Figures

1.1 Outline of the Document
2.1 Types of uncertainty in transport-transformation modeling
2.2 A schematic depiction of the propagation of uncertainties in transport-transformation models
3.1 Schematic depiction of the Stochastic Response Surface Method
3.2 Schematic depiction of the steps involved in the application of the conventional Monte Carlo method
3.3 Schematic depiction of the steps involved in the application of the Stochastic Response Surface Method
3.4 Application of the web based SRSM tool: specification of input uncertainties
3.5 Application of the web based SRSM tool: recommended sample points and the corresponding model outputs
3.6 Application of the web based SRSM tool: estimates of the pdfs of the outputs
4.1 Rationale for coupling of the SRSM with a sensitivity analysis method
4.2 Schematic illustration of the Automatic Differentiation method
4.3 FORTRAN derivative code generation using ADIFOR: an example
4.4 Schematic depiction of the steps involved in the application of the SRSM-ADIFOR
5.1 Schematic representation of a PBPK model for PERC
5.2 Evaluation of DEMM/PCM: uncertainty in CML (6 uncertain inputs)
5.3 Evaluation of DEMM/PCM: uncertainty in AUCA and AUCL (6 uncertain inputs)
5.4 Evaluation of ECM: uncertainty in AUCA and AUCL (6 uncertain inputs)
5.5 Evaluation of ECM: uncertainty in CML (6 uncertain inputs)
5.6 Evaluation of ECM: uncertainty in CML (11 uncertain inputs)
5.7 Evaluation of ECM: uncertainty in AUCA and AUCL (11 uncertain inputs)
5.8 Evaluation of SRSM and LHS: uncertainty in CML (11 uncertain inputs)
5.9 Evaluation of SRSM and LHS: uncertainty in AUCA and AUCL (11 uncertain inputs)
5.10 Evaluation of SRSM-ADIFOR: uncertainty in CML
5.11 Evaluation of SRSM-ADIFOR: uncertainty in AUCA and AUCL
5.12 Evaluation of ECM: uncertainty in downwind ozone concentrations
5.13 Evaluation of SRSM: uncertainty in downwind ozone concentrations
5.14 Evaluation of SRSM-ADIFOR: uncertainty in downwind ozone concentrations
5.15 Origins and types of uncertainty present in Photochemical Air Quality Simulation Models
5.16 The Philadelphia/New Jersey modeling domain used for uncertainty analysis
5.17 Probability distribution of the daily maximum ozone concentration
5.18 Probability distribution of the daily average ozone concentration
5.19 Probability distribution of the daily maximum 8 hr running average ozone concentration
5.20 Probability distribution of the pervasiveness of the ozone episode
5.21 Uncertainty in estimated maximum tritium concentration in a receptor well
5.22 Uncertainty in estimated time of occurrence of maximum tritium concentration in a receptor well
6.1 Schematic depiction of the evolution of an atmospheric plume
6.2 Schematic depiction of the entrainment and detrainment steps in the RPM
6.3 Discretization of the plume cross section by RPM-IV and RPM-3D
6.4 Dependence of plume average O3 concentration on horizontal resolution of the RPM-IV in a VOC dominant regime
6.5 Dependence of plume average O3 concentration on vertical resolution of the RPM-3D in a VOC dominant regime (4 horizontal cells)
6.6 Horizontal profile of the O3 concentration in the plume in a VOC dominant regime (6 horizontal cells, RPM-IV)
6.7 Vertical profile of the O3 concentration at the plume centerline (w.r.t. horizontal) in a VOC dominant regime (4 horizontal cells and 6 vertical cells, RPM-3D)
6.8 Dependence of plume average O3 concentration on horizontal resolution of the RPM-IV in a NOx dominant regime
6.9 Dependence of plume average O3 concentration on vertical resolution of the RPM-3D in a NOx dominant regime
6.10 Horizontal profile of the O3 concentration in the plume in a NOx dominant regime (6 horizontal cells, RPM-IV)
6.11 Vertical profile of the O3 concentration at the plume centerline (w.r.t.
horizontal) in a VOC limited (NOx dominant) regime (4 horizontal cells and 6 vertical cells, RPM-3D)

List of Tables

1.1 Transport-transformation components in biological and environmental models
2.1 Sources of uncertainty in transport-transformation models
2.2 Selected probability density functions useful in representing uncertainties in environmental and biological model inputs
2.3 A summary of sensitivity measures employed in sensitivity analysis
2.4 A representative list of packages available for automatic differentiation
3.1 Representation of common univariate distributions as functionals of normal random variables
5.1 Deterministic and uncertain parameters used in the uncertainty analysis of the PERC PBPK model
5.2 Contribution of individual parameters to the overall uncertainty in the UAM-IV
5.3 Uncertain parameters in the EPACMTP model
5.4 Deterministic parameters in the EPACMTP model

Chapter 1

INTRODUCTION

1.1 Uncertainty Analysis

Mechanistic modeling of physical systems is often complicated by the presence of uncertainties. Environmental and biological modeling, for example, entails uncertainties in the estimates of toxicant emissions and of transport and transformation parameters, which in turn affect the estimates of related health risks. These uncertainties are particularly important in the assessment of potential regulatory options, for example in selecting a strategy for the control of pollutant levels. Even though significant effort may be needed to incorporate uncertainties into the modeling process, doing so can provide information that aids decision making.

A systematic uncertainty analysis provides insight into the level of confidence in model estimates, and can aid in assessing how various possible model estimates should be weighed.
Further, it can lead to the identification of the key sources of uncertainty (such as data gaps) that merit further research, as well as of the sources of uncertainty that are not important with respect to a given response.

The purpose of quantitative uncertainty analysis is to use currently available information to quantify the degree of confidence in the existing data and models. The purpose is not to somehow "reduce" uncertainty: reduction in uncertainty can only come from gathering additional information and filling "data gaps". Even though the applicability of a model is limited by the model assumptions and the uncertainties in the evaluation data, understanding the judgments associated with the modeling process is more valuable than side-stepping the uncertainty analysis. In fact, it is precisely for problems where data are limited and where simplifying assumptions have been used that a quantitative uncertainty analysis can play an illuminating role, helping to identify how robust the conclusions about model results are, and to target data gathering efforts [70].

The following stages are involved in the uncertainty analysis of a model: (a) estimation of uncertainties in model inputs and parameters (characterization of input uncertainties), (b) estimation of the uncertainty in model outputs resulting from the uncertainty in model inputs and model parameters (uncertainty propagation), (c) characterization of uncertainties associated with different model structures and model formulations (characterization of model uncertainty), and (d) characterization of the uncertainties in model predictions resulting from uncertainties in the evaluation data.
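Stages (a) and (b) above can be sketched with a minimal Monte Carlo propagation loop; the toy model, the parameter names, and the lognormal distributions below are hypothetical placeholders, not taken from this report:

```python
import random
import statistics

def toy_model(emission_rate, decay_const):
    """Hypothetical steady-state concentration model: source over loss."""
    return emission_rate / decay_const

def propagate(n_samples=10_000, seed=42):
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Stage (a): characterize input uncertainty with assumed lognormals
        emission_rate = rng.lognormvariate(mu=0.0, sigma=0.3)
        decay_const = rng.lognormvariate(mu=-1.0, sigma=0.2)
        # Stage (b): propagate by running the model at each sampled point
        outputs.append(toy_model(emission_rate, decay_const))
    outputs.sort()
    return {
        "mean": statistics.fmean(outputs),
        "p05": outputs[int(0.05 * n_samples)],
        "p95": outputs[int(0.95 * n_samples)],
    }
```

The sorted outputs directly yield percentile bounds on the model prediction; the cost is that thousands of model runs may be needed, which is the limitation motivating the methods developed in this report.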
1.2 Transport-Transformation Models

A wide range of environmental and biological models involve the fate and transport of a chemical species that originates from a source, travels through a medium ("transport"), undergoes changes due to chemical or radioactive processes ("transformation"), and eventually comes in contact with a receptor (e.g., a human, or specifically an organ or a tissue). In this report, the term "transport-transformation models" refers to models of a wide range of biological, environmental, and engineering problems. These models are similar in structure: they are often described by ordinary or partial differential equations, and are based on the continuity equation and on a mass balance of chemical or radioactive species in a control volume. Even though the complexity of these models varies significantly, they are categorized together because the underlying physical and chemical processes, and the mathematical equations describing these systems, are similar.

The term "transport" refers to the movement of material through the surroundings as a result of associated flows and diffusive processes. The inhalation of chemicals, the movement of toxic substances along with the blood flow, and the absorption of pollutants by the layers of skin are examples of transport. The flow of gaseous pollutants along with the wind, and the flow of contaminants through and along with surface and ground water, are also examples of transport processes. Additionally, the diffusion of chemical species across a concentration gradient is a transport process.

The term "transformation" refers to the change of a species (physical, chemical, or biological). In biological systems, enzymatic reactions account for transformations. In the atmosphere, transformation can result from a series of nonlinear photochemical reactions. In groundwater systems, transformation can result from radioactive processes or from chemical reactions.
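A single well-mixed control volume combining a transport inflow term, a first-order transformation loss, and a constant source term can be sketched with a simple forward Euler integration; all rate values and the time step here are hypothetical:

```python
def simulate_concentration(c0=0.0, inflow=1.0, decay_rate=0.1,
                           source=0.5, dt=0.01, n_steps=1000):
    """Integrate dc/dt = F + R + S for one well-mixed control volume.

    F: net inflow of the species (a constant here, hypothetical)
    R: net production by transformation (first-order decay, R = -k*c)
    S: net injection by sources (a constant here, hypothetical)
    """
    c = c0
    for _ in range(n_steps):
        dcdt = inflow + (-decay_rate * c) + source
        c += dt * dcdt  # forward Euler step
    return c
```

With these values the concentration rises monotonically toward the steady state (inflow + source) / decay_rate = 15, where inflow, transformation loss, and source balance.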
Many transport-transformation models can be represented in a simplified form by the following equation:

    dc/dt = F + R + S

where dc/dt is the rate of change of the concentration of a species in the control volume under consideration, F is the net inflow of the species into the control volume, R is the net rate of production of the species due to chemical or radioactive processes, and S is the net rate of injection of the species into the control volume (also called the source/sink term). Table 1.1 presents examples of the components of some transport-transformation models of environmental and biological systems.

Table 1.1: Examples of transport-transformation components in biological and environmental models

  Model          Flow                        Reaction        Source/Sink
  Biological     Blood                       Enzymatic       Absorption into tissues
  Air Quality    Wind                        Photochemical   Emissions/Deposition
  Water Quality  Surface water/groundwater   Radioactive     Leachate

Uncertainty occurs in transport-transformation models due to a number of factors. The randomness inherent in natural systems, errors in the estimates of the transport properties of a species in a medium, and inaccurate estimates of transformation properties, such as reaction rates, all contribute to the overall uncertainty in these models. The main sources of uncertainty in transport-transformation systems are presented in more detail in Chapter 2.

1.3 Limitations in Performing Uncertainty Analysis

The main limitation in performing comprehensive uncertainty analyses of transport-transformation models is the associated cost and effort. The computer resources required for uncertainty propagation using conventional methods (discussed in the following paragraph) can sometimes be prohibitively expensive. Further, the incorporation of uncertainty associated with structural and formulation aspects of the models requires significant effort.
Finally, the data needs for characterizing input uncertainties are often substantial, and the necessary data are not always available.

Conventional methods for uncertainty propagation typically require several model runs that sample various combinations of input values. The number of model runs can sometimes be very large, on the order of many thousands, resulting in substantial computational demands. Estimating the uncertainties associated with model formulation, on the other hand, requires the development of several different models, each corresponding to a different mathematical formulation of the original physical system. The model results corresponding to all the possible combinations give an estimate of the range of the associated model uncertainty. Development and application of several alternative computational models can require substantial time and effort. Thus, the costs associated with uncertainty analysis may sometimes be prohibitively high, necessitating a large number of model simulations
In fact, such methods could facilitate the uncertainty analysis of complex, computationally demanding models where the application of traditional methods may not be feasible due to computational and time limitations. Additionally, there is a need for an extensive evaluation of the alternative methods with realistic case studies. Such an evaluation addresses the issues associated with the wider applicability of the methods. Further, the ease of use of any method is an important factor in its applicability. The extra effort required in understanding and applying a new method can sometimes offset other advantages that the method has to offer. Hence, attention must be paid to the development of auxiliary tools that facilitate the easy use of the new methods. Thus, the primary objective of this research can be stated as "the development of computationally efficient alternative methods for un- certainty propagation that are applicable to a wide range of transport- transformation models, and the development of auxiliary tools that facil- itate easy use of these methods." Another objective of this research is to study the uncertainties associated with model structure and formulation, which fall beyond the scope of input uncertainty propagation. Such a study provides help in the selection of the appropriate level of model detail. Further, in conjunction with uncertainty propagation, it can be used to identify the relative magnitudes of uncertainties associated with model inputs and model formulation. This provides information as to where available resources should be focused: filling data gaps versus refining the model used for the problem. These objectives are accomplished via the development of the Stochastic Response Surface Method (SRSM), and its evaluation for a range of transport-transformation models. Additionally, an easy to use, web-based interface is developed to facilitate black-box use of this method. 
Furthermore, the SRSM is coupled with an existing sensitivity analysis method, ADIFOR (Automatic Differentiation of FORtran), in order to further improve the computational efficiency of the SRSM. Finally, uncertainties associated with model structure and formulation are addressed in the framework of photochemical air quality modeling.

1.5 Outline of the Document

Chapter 1 has presented a brief introduction to the concepts of uncertainty, and the importance of uncertainty analysis in the context of transport-transformation models. It has also discussed the steps involved in a systematic uncertainty analysis, and the associated limitations. Additionally, Chapter 1 summarizes the objectives and presents a brief overview of this document (Figure 1.1).

Chapter 2 presents the relevant background information in the following order:

• classification of uncertainty into natural, model, and parametric/data uncertainties, and into reducible and irreducible uncertainties,
• approaches for representing parametric/data uncertainty, such as Interval Mathematics, Fuzzy Theory, and Probabilistic Analysis, and
• conventional methods for sensitivity and sensitivity/uncertainty analysis, classified here as sensitivity testing methods, analytical methods, sampling based methods, and computer algebra based methods.

Additionally, Chapter 2 presents several applications of the conventional methods to the uncertainty analysis of transport-transformation models. The merits and limitations of each method are discussed, and an argument is made for alternative methods that can be used for combined sensitivity and uncertainty analysis. This argument is made both from the perspective of reducing computational cost and from the perspective of obtaining sensitivity information.
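As a concrete instance of the sampling based methods surveyed in Chapter 2, the following is a minimal Latin Hypercube Sampling sketch on the unit hypercube (plain Python; in practice the design would then be mapped through the inverse CDFs of the actual input distributions):

```python
import random

def latin_hypercube(n_samples, n_dims, rng=None):
    """Generate an n_samples x n_dims Latin Hypercube design on [0, 1).

    Each dimension is split into n_samples equal strata; exactly one
    point is drawn from each stratum, and strata are paired randomly
    across dimensions by shuffling each column independently.
    """
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        # one uniform draw inside each of the n_samples strata
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)  # random pairing across dimensions
        columns.append(column)
    # transpose columns into a list of sample points
    return [tuple(col[i] for col in columns) for i in range(n_samples)]
```

Because every stratum of every input is sampled exactly once, the design covers each input range evenly, which is why LHS typically needs fewer model runs than simple random sampling for comparable coverage.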
Chapter 3 presents the Stochastic Response Surface Method (SRSM), which is developed here as a computationally efficient alternative method for the uncertainty analysis of numerical models. The formulation, implementation, and application of the SRSM are discussed in detail. Individual sections elaborate on the steps involved in the application of the SRSM: (a) representation of random inputs in a "standardized manner", (b) expression of model outputs in a parametric form (e.g., series expansions), (c) estimation of the parameters of the output approximation (e.g., coefficients in a series expansion), (d) calculation of the statistical properties of outputs, and (e) evaluation of the approximation. Chapter 3 also presents an example problem that illustrates the application of the SRSM, explaining all the steps in detail; case studies for the evaluation of the SRSM are presented in Chapter 5. Additionally, Chapter 3 presents a web interface for the application of the SRSM through a browser, without requiring additional effort in studying the mathematical details of the method.

Chapter 4 presents the coupling of the SRSM with the automatic differentiation tool ADIFOR (Automatic Differentiation of FORtran). ADIFOR is a computer algebra based method that reads the Fortran source code of a model and constructs code that calculates first order partial derivatives of model outputs with respect to model inputs or parameters. Chapter 4 first presents the rationale for coupling the SRSM with ADIFOR, followed by a description of ADIFOR. Then, the coupling of the SRSM with ADIFOR is presented, and the application of the coupled method, SRSM-ADIFOR, is illustrated with an example problem.

Chapter 5 presents a comprehensive evaluation of both the SRSM and the SRSM-ADIFOR methods.
This evaluation consists of four case studies, covering a wide range of transport-transformation models, as follows:

I: A zero-dimensional physiologically-based pharmacokinetic model.
II: A two-dimensional reactive atmospheric plume model.
III: A three-dimensional urban/regional scale photochemical air quality model.
IV: A one-dimensional ground water model with discontinuous probability density functions for inputs and parameters.

The SRSM is evaluated for all four models, whereas the SRSM-ADIFOR is evaluated for the first two. Each case study description contains information about the model and its application in exposure/risk assessment. Then, the uncertainties associated with the model are described, and the application of different uncertainty propagation methods is presented. The estimates from the uncertainty propagation methods are then presented graphically, in the form of probability density functions (pdfs), along with a brief discussion at the end.

Chapter 6 addresses the issues involved with model uncertainty, which arises due to uncertainties in model formulation and model structure. This chapter describes how such uncertainty can be addressed in the framework of photochemical air quality modeling. A study is presented involving a regulatory atmospheric photochemical plume model, the Reactive Plume Model (RPM). The development and implementation of the RPM-3D, a three-dimensional version of the RPM, is presented in this chapter. Subsequently, case studies are presented to compare the estimates of the RPM-3D and the RPM, in order to characterize uncertainties associated with model resolution.

Chapter 7 presents the conclusions of this research, and recommendations for future work.

Figure 1.1: Schematic depiction of the outline of this document. The shaded areas indicate the contributions of this research.

Chapter 2

BACKGROUND

2.1 Types and Origins of Uncertainty in Transport-Transformation Models

Uncertainties in transport-transformation models can be classified as Natural, Model, and Data uncertainties, depending on their origins and on how they can be addressed. They are briefly described in the following sections.

2.1.1 Natural Uncertainty and Variability

Environmental and biological systems are inherently stochastic due to unavoidable unpredictability (randomness). Some quantities are random even in principle, while other quantities that are precisely measurable are modeled as "random" as a practical matter (due to the cost and effort involved in continuous and precise measurement). For example, in air pollution systems, the turbulent atmosphere and unpredictable emission-related activities contribute to "natural uncertainty". In such cases, a precise estimation of system properties is not possible, and the uncertainty can be characterized through ensemble averages.
Emissions from a source, on the other hand, are sometimes modeled via a mean value and a "random error", since it is impractical to monitor the emissions continuously.

Further, some quantities vary over time, over space, or across individuals in a population; this is termed "variability". Variability is the heterogeneity between individual members of a population of some type, and is typically characterized through a frequency distribution. It is possible to interpret variability as uncertainty under certain conditions, since both can be addressed in terms of "frequency" distributions. However, the differences between uncertainty and variability have implications for decision making. For example, knowledge of the frequency distribution for variability can guide the identification of significant subpopulations that merit more focused study. In contrast, knowledge of uncertainty can aid in determining areas where additional research or alternative measurement techniques are needed to reduce uncertainty.

Figure 2.1: Types of uncertainty present in transport-transformation modeling applications and their interrelationships (adapted from [75]). The figure relates natural uncertainty, model uncertainty, input data uncertainty, and evaluation data uncertainty.

2.1.2 Model Uncertainty

Mathematical models are necessarily simplified representations of the phenomena being studied, and a key aspect of the modeling process is the judicious choice of model assumptions. The optimal mechanistic model provides the greatest simplification while still representing the processes affecting the phenomena of interest with adequate accuracy. Hence, the structure of the mathematical models employed to represent transport-transformation systems is often a key source of uncertainty. In addition to the significant approximations often inherent in modeling, sometimes competing models may be available.
Furthermore, the limited spatial or temporal resolution (e.g., numerical grid cell size) of many models is also a type of approximation that introduces uncertainty into model results. In this context, different sources of model uncertainty are summarized in the following:

Model Structure: Uncertainty arises when there are alternative sets of scientific or technical assumptions for developing a model. In such cases, if the competing models lead to similar conclusions, then one can be confident that the decision is robust in the face of uncertainty. If, however, alternative model formulations lead to different conclusions, further model evaluation might be required.

Model Detail: Models are often simplified for purposes of tractability. Such simplifications include assumptions that convert a complex nonlinear model to a simpler linear model in a parameter space of interest. Uncertainty in the predictions of simplified models can sometimes be characterized by comparing their predictions to those of more detailed, inclusive models.

Extrapolation: Models that are validated for one portion of the input space may be completely inappropriate for making predictions in other regions of the parameter space. For example, a dose-response model based on high-dose, short-duration animal tests may incur significant errors when applied to study low-dose, long-duration human exposures. Similarly, in atmospheric modeling, models that are evaluated only for base-case emission levels may introduce uncertainties when they are employed in the study of future scenarios that involve significantly different emission levels (also known as the AE caveat).

Model Resolution: In the application of numerical models, the selection of a spatial and/or temporal grid size often involves uncertainty. On one hand, there is a trade-off between computation time (hence cost) and prediction accuracy.
On the other hand, there is a trade-off between resolution and the validity of the governing equations of the model at such scales. Very often, a coarse grid resolution introduces approximations and uncertainties into model results; at the same time, a finer grid resolution need not necessarily result in more accurate predictions. For example, in photochemical grid modeling, the atmospheric diffusion equation (ADE) (see Equation 6.1) is valid for grid cell sizes between 2 km and 20 km [134]. In this case, a coarse-grid model ignores significant "sub-grid" detail, while a fine-grid model requires more computer resources and may even produce incorrect results, as the governing equation may no longer be appropriate. This type of uncertainty is sometimes dealt with through an appropriate selection of model domain parameter values, or by comparing results based on different grid sizes.

Model Boundaries: Any model has limited boundaries in terms of time, space, number of chemical species, types of pathways, and so on. The selection of a model boundary is itself a type of simplification. Within the boundary, the model may be an accurate representation, but other overlooked phenomena not included in the model may play a role in the scenario being modeled.

2.1.3 Parametric/Data Uncertainty

Uncertainties in model parameter estimates stem from a variety of sources. Even though many parameters could, at least in principle, be measured to any desired precision, there are often significant uncertainties associated with their estimates. Some uncertainties arise from measurement errors; these in turn can involve (a) random errors in analytic devices (e.g., the imprecision of continuous monitors that measure stack emissions), (b) systematic biases that occur due to imprecise calibration, or (c) inaccuracies in the assumptions used to infer the actual quantity of interest from the observed readings of a "surrogate" or "proxy" variable.
Other potential sources of uncertainty in parameter estimates include misclassification, estimation of parameters from a small sample, and estimation of parameters from non-representative samples. Further, uncertainty in model application arises from uncertainties associated with the measurement data used for model evaluation. Table 2.1 lists some of the sources of model and parametric uncertainty associated with the formulation and the application of transport-transformation models.

Table 2.1: Examples of the sources of uncertainty in the formulation and application of transport-transformation models

Uncertainty in Model Formulation (Structural Uncertainty):
• Simplifications in Conceptual Formulation
• Simplifications in Mathematical Formulation
• Ergodic-type Hypotheses
• Idealizations in Constitutive Relations
• Independence Hypotheses
• Spatial Averaging
• Temporal Averaging
• Process Decoupling
• Lumping of Parameters
• Discretization
• Numerical Algorithm/Operator Splitting
• Approximations in Computer Coding

Uncertainty in Model Application (Data/Parametric Uncertainty):
• Constitutive Parameter Selection
• Design/Structural Parameter Selection
• Input Data Development/Selection
  - Source Information
  - Meteorology (air quality models)
  - Hydrology (water quality models)
  - Initial and Boundary Conditions
• Operational Model Evaluation
  - Uncertainty in model estimates
  - Uncertainty in observations
  - Nonexistence of observations
• Response Interpretation
  - AE caveat and other implicit hypotheses

2.2 Reducible and Irreducible Uncertainty

Uncertainty associated with model formulation and application can also be classified as "reducible" and "irreducible". Natural uncertainty is "inherent", or irreducible, whereas data and model uncertainty contain both reducible and irreducible components. The irreducible uncertainty in data and models is generally a result of the presence of natural uncertainty.
Reducible uncertainty can be lowered, e.g., by better inventorying methods, improved instrumentation, improvements in model formulation, etc. Nevertheless, the distinction between reducible and irreducible model and data uncertainties is to a great extent a matter of convention, since it may not be feasible to eliminate the presence of an error (reducible uncertainty) in measurement or modeling beyond a certain level. Furthermore, what is perceived as irreducible natural uncertainty may be quantified, in a statistical sense and via mechanistic modeling, better than "artificial" reducible uncertainty. Irreducible modeling uncertainty reflects the "current" model formulation and may actually change when improved theories describing the phenomena under consideration become available. Also, the averaging processes involved in model formulation unavoidably "lump" together natural and modeling uncertainty, and only a quantification of this lumped uncertainty may be possible or desirable. Figure 2.1 depicts schematically the types of uncertainties present in transport-transformation models, and their interrelationships.

2.3 Approaches for Representation of Uncertainty

Various approaches for representing uncertainty in the context of different domains of applicability are presented by Klir [122], and are briefly summarized in the following:

• Classical set theory: Uncertainty is expressed by sets of mutually exclusive alternatives in situations where one alternative is desired. This includes diagnostic, predictive, and retrodictive uncertainties. Here, the uncertainty arises from the nonspecificity inherent in each set: large sets result in less specific predictions, retrodictions, etc., than smaller sets. Full specificity is obtained only when one alternative is possible.

• Probability theory: Uncertainty is expressed in terms of a measure on subsets of a universal set of alternatives (events).
The uncertainty measure is a function that, according to the situation, assigns a number between 0 and 1 to each subset of the universal set. This number, called the probability of the subset, expresses the likelihood that the desired unique alternative is in this subset. Here, the uncertainty arises from the conflict among the likelihood claims associated with those subsets of the universal set that each consist of exactly one alternative. Since these alternatives are mutually exclusive, nonzero probabilities assigned to two or more events conflict with one another, since only one of them can be the desired one.

• Fuzzy set theory: Fuzzy sets, like classical sets, are capable of expressing nonspecificity. In addition, they are also capable of expressing vagueness. Vagueness differs from nonspecificity in that it emerges from the imprecision of definitions, in particular definitions of linguistic terms. In fuzzy sets, membership is not a matter of affirmation or denial, but rather a matter of degree.

• Fuzzy measure theory: This theory considers a number of special classes of measures, each of which is characterized by a special property. Some of the measures used in this theory are plausibility and belief measures, and the classical probability measures. Fuzzy measure theory and fuzzy set theory differ significantly: in fuzzy set theory, the conditions for the membership of an element in a set are vague, whereas in fuzzy measure theory, the conditions are precise, but the information about an element is insufficient to determine whether it satisfies those conditions.

• Rough set theory: A rough set is an imprecise representation of a crisp set in terms of two subsets, a lower approximation and an upper approximation. Further, the approximations could themselves be imprecise or fuzzy.
Some of the most widely used uncertainty representation approaches in transport-transformation modeling are interval mathematics, fuzzy theory, and probabilistic analysis. These approaches are presented in the following sections.

2.3.1 Interval Mathematics

Interval mathematics is used to address data uncertainty that arises (a) from imprecise measurements, and (b) from the existence of several alternative methods, techniques, or theories to estimate model parameters. In many cases, it may not be possible to obtain the probabilities of different values of imprecision in data; in some cases only error bounds can be obtained. This is especially true in the case of conflicting theories for the estimation of model parameters, in the sense that "probabilities" cannot be assigned to the validity of one theory over another. In such cases, interval mathematics can be used for uncertainty estimation, as this method does not require information about the type of uncertainty in the parameters [5,24].

The objective of interval analysis is to estimate the bounds on various model outputs based on the bounds of the model inputs and parameters. In the interval mathematics approach, uncertain parameters are assumed to be "unknown but bounded", and each of them has upper and lower limits without a probability structure [178]; every uncertain parameter is described by an interval. If a parameter x_i of a model is known to lie between x_i − ε_i and x_i + ε_i, the interval representation of x_i is given by [x_i − ε_i, x_i + ε_i]. Correspondingly, the model estimates would also belong to another interval. Special arithmetic procedures for the calculation of functions of intervals used in this method are described in the literature [5,146,147,150].
For example, if two variables a and b are given by the intervals [a_l, a_u] and [b_l, b_u], where a_l ≤ a_u and b_l ≤ b_u, simple arithmetic operations are defined as follows:

    a + b = [a_l + b_l, a_u + b_u]
    a − b = [a_l − b_u, a_u − b_l]
    a · b = [min(a_l b_l, a_l b_u, a_u b_l, a_u b_u), max(a_l b_l, a_l b_u, a_u b_l, a_u b_u)]        (2.1)
    a / b = [a_l, a_u] · [1/b_u, 1/b_l],   0 ∉ [b_l, b_u]

Furthermore, a range of computational tools exists for performing interval arithmetic on arbitrary functions of variables that are specified by intervals [119,129]. Symbolic computation packages such as Mathematica [10] and Maple [91] support interval arithmetic. In addition, extensions to the FORTRAN language with libraries for interval arithmetic are also available [117,118]. Therefore, in principle, the ranges of uncertainty in any analytical or numerical (FORTRAN) model can be analyzed with existing tools.

Applications of interval analysis in the literature include the treatment of uncertainty in the optimal design of chemical plants [67], in the cost-benefit analysis of power distribution [178], and in decision evaluation [24]. Interval analysis has also been applied to estimate the uncertainty in displacements in structures due to uncertainties in external loads [126]. Dong et al. presented a methodology for the propagation of uncertainties using intervals [50]. Hurme et al. presented a review of semi-qualitative reasoning based on interval mathematics in relation to chemical and safety engineering [101]. Applications in transport-transformation modeling include the uncertainty analysis of groundwater flow models [51,138].

The primary advantage of interval mathematics is that it can address problems of uncertainty analysis that cannot be studied through probabilistic analysis. It may be useful for cases in which the probability distributions of the inputs are not known.
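The operations of Equation 2.1 translate directly into code. The class below is a minimal, hypothetical sketch (not one of the interval libraries cited above):

```python
# Minimal interval arithmetic sketch implementing the operations of
# Equation 2.1; a hypothetical helper, not a library API.

class Interval:
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        # Division is undefined when the divisor interval contains zero.
        assert not (other.lo <= 0.0 <= other.hi)
        return self * Interval(1.0 / other.hi, 1.0 / other.lo)

a = Interval(1.0, 2.0)
b = Interval(3.0, 4.0)
print((a + b).lo, (a + b).hi)   # 4.0 6.0
print((a * b).lo, (a * b).hi)   # 3.0 8.0
```

Note how multiplication must examine all four endpoint products, since the sign of the endpoints determines which product is extreme.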
However, this method does not provide adequate information about the nature of the output uncertainty, as all the uncertainties are forced into one arithmetic interval [131]. In particular, when the probability structure of the inputs is known, the application of interval analysis would ignore the available information, and hence is not recommended.

2.3.2 Fuzzy Theory

Fuzzy theory facilitates the uncertainty analysis of systems where uncertainty arises due to vagueness or "fuzziness" rather than due to randomness alone [63]. It is based on a superset of conventional (Boolean) logic that has been extended to handle the concept of partial truth: truth values between "completely true" and "completely false". It was introduced by Zadeh as a means to model the uncertainty of natural language [212]. Fuzzy theory uses the process of "fuzzification" as a methodology to generalize any specific theory from a crisp (discrete) to a continuous (fuzzy) form.

Classical set theory has a "crisp" definition of whether an element is a member of a set or not. However, certain attributes of systems cannot be ascribed to one set or another. For example, an attribute of a system may be specified as either "low" or "high"; in such a case, uncertainty arises out of the vagueness involved in the definition of that attribute. Classical set theory allows for only one value or the other, whereas fuzzy theory allows for a gradual degree of membership. This can be illustrated as follows. In classical set theory, the truth value of a statement is given by the membership function μ_A(x):

    μ_A(x) = 1 if x ∈ A;  0 if x ∉ A        (2.2)

On the other hand, fuzzy theory allows for a continuous value of μ_A between 0 and 1:

    μ_A(x) = 1 if x ∈ A;  0 if x ∉ A;  p, 0 < p < 1, if x partially belongs to A        (2.3)

In fuzzy theory, statements are described in terms of membership functions that are continuous and have the range [0, 1].
For example, given the measured value of a parameter, the membership function gives the "degree of truth" that the parameter is "high" or "low". Further, fuzzy logic is defined by the following set relationships:

    μ_A′(x) = 1.0 − μ_A(x)
    μ_{A∩B}(x) = min(μ_A(x), μ_B(x))        (2.4)
    μ_{A∪B}(x) = max(μ_A(x), μ_B(x))

Using fuzzy arithmetic, the grade of membership of a model output in one set can be calculated from the grade of membership of a parameter of interest in another set. Fuzzy theory can be considered a generalization of classical set theory: if the membership grades are restricted to 0 and 1, fuzzy theory reduces to classical set theory.

A vast literature is available on fuzzy theory, including extensive introductions to the concepts involved by Bezdek and by Smithson [15,185]. Fuzzy arithmetic and its applications are described by Klir et al., by Kauffmann et al., and by Puri et al. [116,123,161]. Kraslawski et al. applied fuzzy set theory to study uncertainty associated with incomplete and subjective information in process engineering [127]. Ayyub et al. used fuzzy theory in structural reliability assessment to study the uncertainty associated with ambiguity [9]. Juang et al. demonstrated the applicability of fuzzy theory to the modeling and analysis of non-random uncertainties in geotechnical engineering [112]. Further literature on industrial applications of fuzzy theory is available for the interested reader [169,211]. Specifically, fuzzy theory has been applied to characterize uncertainty in engineering design calculations [128,209], in quality control [162], in sludge application land selection [44], and in solute transport modeling [51,52,138]. However, drawbacks in its applicability to uncertainty analysis have been noted by some researchers [187,203].
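The relationships in Equation 2.4 can be sketched in a few lines of code. The membership functions below for the linguistic terms "high" and "low" are hypothetical, chosen only for illustration:

```python
# Sketch of the fuzzy set operations of Equation 2.4, using hypothetical
# linear membership functions for the linguistic terms "high" and "low"
# on a normalized reading x in [0, 1].

def mu_high(x):
    """Degree to which x is 'high' (ramps from 0 at x=0.5 to 1 at x=1)."""
    return max(0.0, min(1.0, (x - 0.5) / 0.5))

def mu_low(x):
    """Degree to which x is 'low' (ramps from 1 at x=0 to 0 at x=0.5)."""
    return max(0.0, min(1.0, (0.5 - x) / 0.5))

def complement(mu, x):              # mu_A'(x) = 1 - mu_A(x)
    return 1.0 - mu(x)

def intersection(mu_a, mu_b, x):    # min rule
    return min(mu_a(x), mu_b(x))

def union(mu_a, mu_b, x):           # max rule
    return max(mu_a(x), mu_b(x))

x = 0.75
print(mu_high(x))                   # 0.5: x is partially 'high'
print(complement(mu_high, x))       # 0.5
print(union(mu_low, mu_high, x))    # 0.5
```

If the membership values are restricted to 0 and 1, these operations reduce to the classical complement, intersection, and union, consistent with the reduction to classical set theory noted above.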
Fuzzy theory appears to be more suitable for qualitative reasoning, and for the classification of elements into fuzzy sets, than for the quantitative estimation of uncertainty.

2.3.3 Probabilistic Analysis

In the probabilistic approach, uncertainties are characterized by the probabilities associated with events. The probability of an event can be interpreted in terms of the frequency of occurrence of that event: when a large number of samples or experiments is considered, the probability of an event is defined as the ratio of the number of times the event occurs to the total number of samples or experiments. For example, the statement that the probability that a pollutant concentration c lies between c_1 and c_2 equals p means that, out of a large number of independent measurements of the concentration c under identical conditions, the number of times the value of c lies between c_1 and c_2 is roughly equal to the fraction p of the total number of samples.

Probabilistic analysis is the most widely used method for characterizing uncertainty in physical systems, especially when estimates of the probability distributions of uncertain parameters are available. This approach can describe uncertainty arising from stochastic disturbances, variability conditions, and risk considerations. In this approach, the uncertainties associated with model inputs are described by probability distributions, and the objective is to estimate the output probability distributions. This process comprises two stages:

• Probability encoding of inputs: This stage involves the determination of the probability distributions of the input parameters, incorporating random variations due to both natural variability (from, e.g., emissions or meteorology) and "errors." This is accomplished by using either statistical estimation techniques or expert judgment.
Statistical estimation techniques involve estimating probability distributions from available data or through the collection of a large number of representative samples. Techniques for estimating probability distributions from data are presented in the literature [36,100]. In cases where limited data are available, expert judgment provides the information about the input probability distribution. For example, a uniform distribution is selected when only a range of possible values for an input is available, with no information about which values are more likely to occur. Similarly, a normal distribution is typically used to describe unbiased measurement errors. Table 2.2 presents a list of some commonly used probability distributions in the uncertainty analysis of transport-transformation models.

• Propagation of uncertainty through models: Figure 2.2 depicts schematically the concept of uncertainty propagation: each point of the response surface (i.e., each calculated output value) of the model to changes in inputs "1" and "2" is characterized by a probability density function (pdf) that depends on the pdfs of the inputs. The techniques for uncertainty propagation are discussed in the following.
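The two stages above can be illustrated with plain Monte Carlo propagation through a deliberately simple model. The model and the input distributions below are hypothetical, chosen only to make the workflow concrete:

```python
# Illustration of the two stages of probabilistic analysis: inputs are
# encoded as probability distributions, then propagated through a
# (hypothetical) model to yield an output distribution.
import random
import statistics

random.seed(1)

def model(emission_rate, mixing_height):
    # Hypothetical stand-in for a transport-transformation model.
    return emission_rate / mixing_height

# Stage 1: probability encoding of the inputs.
n = 20000
emissions = [random.lognormvariate(0.0, 0.3) for _ in range(n)]  # multiplicative uncertainty
heights = [random.uniform(0.8, 1.2) for _ in range(n)]           # only a range is known

# Stage 2: propagation through the model.
outputs = [model(e, h) for e, h in zip(emissions, heights)]

print(statistics.mean(outputs), statistics.stdev(outputs))
```

The resulting list of outputs is an empirical sample of the output pdf, from which means, variances, and percentiles can be read off; the sampling-based methods of Section 2.5.3 refine exactly this brute-force scheme.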
A number of excellent texts on probabilistic analysis are available in the literature for the interested reader [73,155,194].

Figure 2.2: A schematic depiction of the propagation of uncertainties in transport-transformation models

Table 2.2: Selected probability density functions useful in representing uncertainties in environmental and biological model inputs

Uniform (parameters a, b):
  f(x) = 1/(b − a),  a ≤ x ≤ b
  Mean = (a + b)/2;  Var = (b − a)²/12

Normal (parameters μ, σ; σ > 0):
  f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²))
  Mean = μ;  Var = σ²;  Mode = μ

Lognormal (parameters μ, σ; σ > 0):
  f(x) = (1/(xσ√(2π))) exp(−(log(x) − μ)²/(2σ²)),  x > 0
  Mean = exp(μ + σ²/2);  Var = (exp(σ²) − 1) exp(2μ + σ²);  Mode = exp(μ − σ²)

Gamma (parameters a, b; a > 0, b > 0):
  f(x) = x^(a−1) e^(−x/b) / (Γ(a) b^a),  x > 0
  Mean = ab;  Var = ab²;  Mode = (a − 1)b

Exponential (parameter λ; λ > 0):
  f(x) = λ e^(−λx),  x > 0
  Mean = 1/λ;  Var = 1/λ²;  Mode = 0

Weibull (parameter a):
  f(x) = a x^(a−1) e^(−x^a),  x > 0
  Mean = Γ(1 + 1/a);  Var = Γ(1 + 2/a) − Γ(1 + 1/a)²;  Mode = (1 − 1/a)^(1/a),  a > 1

Extreme Value (standard form):
  f(x) = e^(−x − e^(−x))
  Mode = 0

2.4 Sensitivity and Sensitivity/Uncertainty Analysis

The aim of sensitivity analysis is to estimate the rate of change in the output of a model with respect to changes in model inputs. Such knowledge is important for (a) evaluating the applicability of the model, (b) determining parameters for which it is important to have more accurate values, and (c) understanding the behavior of the system being modeled. The choice of a sensitivity analysis method depends to a great extent on (a) the sensitivity measure employed, (b) the desired accuracy in the estimates of the sensitivity measure, and (c) the computational cost involved. In general, the meaning of the term "sensitivity analysis" depends greatly on the sensitivity measure that is used.
Table 2.3 presents some of the sensitivity measures that are often employed in the sensitivity analysis of a mathematical model of the form

    F(u, k) = 0        (2.5)

where k is a set of m parameters and u is a vector of n output variables.

Based on the choice of a sensitivity metric and the variation in the model parameters, sensitivity analysis methods can be broadly classified into the following categories:

• Variation of parameters or model formulation: In this approach, the model is run at a set of sample points (different combinations of the parameters of concern) or with straightforward changes in model structure (e.g., in model resolution). Sensitivity measures that are appropriate for this type of analysis include the response from arbitrary parameter variation, the normalized response, and extrema. Of these measures, the extreme values are often of critical importance in environmental applications.

• Domain-wide sensitivity analysis: Here, the analysis involves the study of the system behavior over the entire range of parameter variation, often taking the uncertainty in the parameter estimates into account.

• Local sensitivity analysis: Here, the focus is on estimates of model sensitivity to input and parameter variation in the vicinity of a sample point. This sensitivity is often characterized through gradients or partial derivatives at the sample point.

2.5 Conventional Sensitivity/Uncertainty Analysis Methods

Conventional methods for sensitivity analysis and uncertainty propagation can be broadly classified into four categories: (a) "sensitivity testing", (b) analytical methods, (c) sampling based methods, and (d) computer algebra based methods.

Table 2.3: A summary of sensitivity measures employed in sensitivity analysis, adapted from McRae et al., 1982 [143]

  Response from arbitrary parameter variation:  δu = u(k + δk) − u(k)
  Normalized response:  D_i = δu_i / u_i(k)
  Average response:  ū_i = ∫ u_i(k) dk / ∫ dk
  Expected value:  ⟨u_i(k)⟩ = ∫ u_i(k) P(k) dk
  Variance:  σ_i²(k) = ⟨u_i(k)²⟩ − ⟨u_i(k)⟩²
  Extrema:  max[u_i(k)], min[u_i(k)]
  Local gradient approximation:  δu = [S] δk,  S_ij = ∂u_i/∂k_j
  Normalized gradient:  S̃_ij = (k_j / u_i(k)) ∂u_i/∂k_j

Sensitivity testing involves studying the model response for a set of changes in model formulation and for selected model parameter combinations. Analytical methods involve either the differentiation of the model equations and the subsequent solution of a set of auxiliary sensitivity equations, or the reformulation of the original model using stochastic algebraic/differential equations. Sampling based methods, on the other hand, involve running the original model for a set of input/parameter combinations (sample points) and estimating the sensitivity/uncertainty from the model outputs at those points. Yet another sensitivity analysis approach is based on the direct manipulation of the computer code of the model, and is termed automatic differentiation. These methods are elaborated in the following.

2.5.1 Sensitivity Testing Methods

In this approach, the model is run for a set of sample points for the parameters of concern, or with straightforward changes in model structure (e.g., in model resolution). This approach is often used to evaluate the robustness of the model, by testing whether the model response changes significantly in relation to changes in model parameters and in the structural formulation of the model. The application of this approach is straightforward, and it has been widely employed in conjunction with transport-transformation modeling. Sistla et al. [183] and Al-Wali et al. [4] used this approach for the sensitivity analysis of urban air quality models. Roselle [168] used it to study the effect of biogenic emission uncertainties by performing model simulations at three biogenic emission estimate levels. Vieux et al. [199] and Vanderperk [198] have used a similar approach in water quality modeling.
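In its simplest form, sensitivity testing amounts to re-running the model at one-at-a-time perturbations of a base case and recording the response from arbitrary parameter variation (the first measure in Table 2.3). A minimal sketch, with a hypothetical algebraic model standing in for a transport-transformation model:

```python
# One-at-a-time sensitivity testing: evaluate a hypothetical model at a
# base point and at a perturbation of each parameter in turn.

def model(k):
    k1, k2 = k
    return k1 ** 2 + 0.1 * k2   # hypothetical stand-in model

base = [1.0, 2.0]
y0 = model(base)

responses = {}
for i, delta in enumerate([0.1, 0.1]):
    k = list(base)
    k[i] += delta
    # Response from arbitrary parameter variation: u(k + dk) - u(k)
    responses[i] = model(k) - y0

print(responses)   # parameter 0 dominates the response here
```

Even this crude screening ranks the parameters, but, as noted below, the result depends strongly on the chosen perturbations.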
The primary advantage of this approach is that it accommodates both qualitative and quantitative information regarding variation in the model. Its main disadvantage, however, is that detailed information about the uncertainties is difficult to obtain. Further, the sensitivity information obtained depends to a great extent on the choice of the sample points, especially when only a small number of simulations can be performed.

2.5.2 Analytical Methods

Some of the widely used analytical methods for sensitivity/uncertainty analysis are: (a) differential analysis methods, (b) the Green's function method, (c) the spectral based stochastic finite element method, and (d) the coupled and decoupled direct methods.

Differential Analysis Methods

Differential analysis methods include the Neumann expansion [3,190] and the perturbation method [3,191]. The Neumann expansion method involves finding the inverse of the model operator through an expansion of the model equations, and hence has limitations on the type of model equations it can address. The perturbation method involves the expansion of model outputs as a series in terms of small random perturbations in model parameters, and the subsequent solution for the series coefficients. The Neumann expansion and perturbation based methods have been applied in the uncertainty analysis of groundwater models [1,2,214], and in the design of structures [26,115,135]. The main limitation of these methods is the requirement that the perturbation terms be small. Further, these methods are in general difficult to apply in conjunction with the modeling of complex, nonlinear systems, as the model equations are often mathematically intractable.

Green's Function Method

In the Green's function method, the sensitivity equations of a model are obtained by differentiating the model equations. The sensitivity equations are then solved by constructing an auxiliary set of Green's functions.
This method minimizes the number of differential equations that must be solved for the sensitivities, replacing them with integrals that can be easily evaluated [53,54]. This approach has been applied to the sensitivity analysis of atmospheric chemistry models [103,201].

Spectral Based Stochastic Finite Element Method

This method relies on representing stochastic processes in terms of a series expansion, specifically the Karhunen-Loeve expansion [80,155]. For finite element problems, this approach results in a set of linear matrix equations with deterministic matrices multiplied by random vectors. The matrix equations are solved either using operator expansions or using Galerkin's method [200]. One of the main features of this method is the representation of random parameters in terms of orthogonal functions of a set of standardized random variables; this expansion is also known as the "polynomial chaos expansion", and it forms the basis for the development of the Stochastic Response Surface Method (SRSM). The polynomial chaos expansion and the SRSM are discussed in detail in Chapter 3.

Coupled/Decoupled Direct Method

The direct method involves the differentiation of the model equations and the subsequent solution of the resulting sensitivity equations. The sensitivity equations are solved either along with the original model equations (coupled direct method) [193] or separately (decoupled direct method) [58,160]. The decoupled method is advantageous both in terms of computational efficiency and in terms of the stability of the solution. This method has evolved into a standard module in commercial sensitivity analysis codes for chemical kinetics, such as CHEMKIN [40] and GCKP86 [163]. The decoupled method is also reported to be more efficient than the Green's function method [58].
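As an illustration of the coupled direct method, consider the simple first-order kinetic model du/dt = -k u, u(0) = u0. Differentiating with respect to k gives a sensitivity equation for s = ∂u/∂k, namely ds/dt = -k s - u with s(0) = 0, which is integrated alongside the model equation. The sketch below (a minimal forward-Euler integration, not a CHEMKIN-style implementation) checks the computed sensitivity against the analytic value:

```python
# Coupled direct method sketch for du/dt = -k*u, u(0) = u0.
# The sensitivity s = du/dk obeys ds/dt = -k*s - u, s(0) = 0, and is
# integrated together with the state equation (forward Euler here).
import math

k, u0 = 0.5, 1.0
dt, t_end = 1e-4, 2.0
n_steps = int(round(t_end / dt))

u, s = u0, 0.0
for _ in range(n_steps):
    du = -k * u
    ds = -k * s - u          # sensitivity equation, coupled to the state
    u += dt * du
    s += dt * ds

# Analytic check: u(t) = u0*exp(-k*t), so du/dk = -t*u0*exp(-k*t).
exact = -t_end * u0 * math.exp(-k * t_end)
print(s, exact)              # the two agree to O(dt)
```

A decoupled implementation would instead solve the state equation first and then integrate the sensitivity equation separately, reusing the stored state trajectory, which is the source of the efficiency advantage noted above.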
The analytical methods require access to the governing model equations, and may involve writing additional computer code for the solution of the auxiliary equations, which may be impractical and sometimes impossible. For example, reformulating an existing computational model developed by others could require prohibitive amounts of resources.

2.5.3 Sampling Based Methods

Sampling based methods do not require access to the model equations, or even to the model code. These methods involve running the model at a set of sample points and establishing a relationship between inputs and outputs using the model results at those points. Some of the widely used sampling based sensitivity/uncertainty analysis methods are: (a) Monte Carlo and Latin Hypercube Sampling methods, (b) the Fourier Amplitude Sensitivity Test (FAST), (c) reliability based methods, and (d) response surface methods.

Monte Carlo and Latin Hypercube Sampling Methods

Monte Carlo (MC) methods are the most widely used means for uncertainty analysis, with applications ranging from aerospace engineering [12] to zoology [28]. These methods involve random sampling from the distributions of the inputs and successive model runs until a statistically significant distribution of outputs is obtained. They can be used to solve problems with a physical probabilistic structure, such as uncertainty propagation in models or the solution of stochastic equations, or to solve non-probabilistic problems, such as finding the area under a curve [47,73]. Monte Carlo methods are also used in the solution of problems that can be modeled by a sequence of random steps that eventually converge to a desired solution; problems such as optimization and the simulation of the movement of fluid molecules are often addressed through Monte Carlo simulation [73].
For the interested reader, a wide range of literature describing the methodology, the tools, and the applicability of Monte Carlo methods is available [49,69,113,172,186].

Since these methods require a large number of samples (or model runs), their applicability is sometimes limited to simple models. In the case of computationally intensive models, the time and resources required by these methods could be prohibitively expensive.* A degree of computational efficiency is accomplished by the use of Modified Monte Carlo (MMC) methods that sample from the input distribution in an efficient manner, so that the number of necessary model runs is significantly reduced compared to the simple Monte Carlo method.

Latin Hypercube Sampling [104,141,188] is one such widely used variant of the standard Monte Carlo method. In this method, the range of probable values for each uncertain input parameter is divided into ordered segments of equal probability. Thus, the whole parameter space, consisting of all the uncertain parameters, is partitioned into cells having equal probability, and the cells are sampled in an "efficient" manner such that each parameter is sampled once from each of its possible segments. The advantage of this approach is that the random samples are generated from all the ranges of possible values, thus giving insight into the extremes of the probability distributions of the outputs. A detailed description of the implementation of Latin Hypercube Sampling is presented in Section 3.8.1.

*For example, in order to perform the uncertainty analysis of a grid-based photochemical model such as the Urban Airshed Model (UAM-IV), which requires about six hours of CPU time for the simulation of air quality in the vicinity of New Jersey (see the example in Section 5.3), a thousand samples would require more than eight months of computer time.
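The stratification just described can be sketched in a few lines. The function below is a minimal illustration in Python, for brevity; it is not the FORTRAN utility developed in this work:

```python
# Minimal Latin Hypercube Sampling sketch: each parameter's [0, 1)
# probability range is split into n equal-probability segments, one sample
# is drawn from inside each segment, and the segments are paired across
# parameters at random.
import random

def latin_hypercube(n_samples, n_params, seed=0):
    rng = random.Random(seed)
    columns = []
    for _ in range(n_params):
        # One uniform draw inside each of the n equal-probability segments.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)            # random pairing across parameters
        columns.append(col)
    return list(zip(*columns))      # one row per sample point

points = latin_hypercube(5, 2)
for p in points:
    print(p)
```

The values produced lie on the uniform probability scale [0, 1); for other marginal distributions, each column would be mapped through the corresponding inverse cumulative distribution function.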
Monte Carlo methods and Latin Hypercube Sampling have been applied in the probabilistic assessment of risk in physiologically based pharmacokinetic models, by considering uncertainty in the estimates of the model parameters [43,65,181]. These methods have also been applied in groundwater contamination models [14,86,138,152,167], in air pollution modeling [33,46,72,81,173,184], in process engineering [164,205], and in the reliability assessment of structures [8]. Guidelines for the application of Monte Carlo and Latin Hypercube Sampling methods in risk assessment are provided by the U.S. EPA [196,197]. Several computer packages containing routines for Monte Carlo and Latin Hypercube Sampling methods are reported in the literature [105,109]. The EPA-approved FORTRAN package for Latin Hypercube Sampling by Iman et al. [105] is available through the EPA's Exposure Models Library [195]. Commercial products based on spreadsheet software include Crystal Ball [45], @RISK [208], RISKMAN [157], and MonteCarlo [159]. This list is only representative of the available software tools. In the present work, a Latin Hypercube Sampling utility in FORTRAN has been developed as a part of the implementation of the Stochastic Response Surface Method, and is freely available to the research community.

Fourier Amplitude Sensitivity Test (FAST)

The Fourier Amplitude Sensitivity Test (FAST) is a method based on the Fourier transformation of uncertain model parameters into a frequency domain, thus reducing a multi-dimensional model to a one-dimensional one.
For a model with m parameters, k_1, k_2, ..., k_m, and n outputs, u_1, u_2, ..., u_n, such that

    u_i = f_i(k_1, k_2, ..., k_m),    i = 1, 2, ..., n,

the FAST method involves the transformation of the parameters into a frequency domain spanned by a scalar s, as follows:

    k_l = G_l(sin(w_l s)),    l = 1, 2, ..., m

The outputs then become periodic functions of s and are approximated by Fourier series, with coefficients given by integrals of the form

    A_j = (1/2π) ∫ u_i(s) cos(js) ds,    B_j = (1/2π) ∫ u_i(s) sin(js) ds

These integrals are evaluated by repeatedly sampling the parameter s, which corresponds to sampling in the multidimensional model parameter space. The details of the transformation of the model parameters into the frequency domain, and the subsequent sampling, are explained by Koda et al. [124], and a computational implementation of FAST is presented by McRae et al. [144].

FAST has been applied in the sensitivity/uncertainty analysis of atmospheric photochemical mechanisms [64], in conjunction with land surface processes in global climate models [39,140], and in the disposal of radioactive waste [121]. A review article by Helton [93] reports that Monte Carlo methods are more widely applicable than FAST.

Reliability Based Methods (FORM and SORM)

First- and second-order reliability methods (FORM and SORM, respectively) are approximation methods that estimate the probability of an event under consideration (typically termed "failure"). For example, these methods can provide the probability that a contaminant concentration exceeds a target level at a location (i.e., the probability of failure). In addition, these methods provide the contribution to the probability of failure from each input random variable, at no additional computational effort [89,114,142]. These methods are useful in the uncertainty analysis of models with a single failure criterion. For a model with random parameters, the objective of the reliability based approach is to estimate the probability of failure.
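A minimal numerical sketch of the FAST idea follows, assuming a toy two-parameter linear model and arbitrarily chosen frequencies (a rigorous implementation would use an incommensurate frequency set, as in Koda et al. and McRae et al.). Each parameter rides a search curve at its own frequency, and the first-harmonic Fourier amplitude at each parameter's frequency indicates that parameter's influence on the output:

```python
import math

def fast_amplitudes(model, freqs, n_s=1024):
    """Crude FAST sketch: vary each parameter along a search curve at its own
    frequency, then read off the first-harmonic Fourier amplitude at each
    parameter's frequency as a sensitivity indicator."""
    s_vals = [-math.pi + 2.0 * math.pi * m / n_s for m in range(n_s)]
    ys = []
    for s in s_vals:
        # triangle-wave search-curve transformation onto [0, 1] per parameter
        ks = [0.5 + math.asin(math.sin(w * s)) / math.pi for w in freqs]
        ys.append(model(ks))
    amps = []
    for w in freqs:
        a = 2.0 / n_s * sum(y * math.cos(w * s) for y, s in zip(ys, s_vals))
        b = 2.0 / n_s * sum(y * math.sin(w * s) for y, s in zip(ys, s_vals))
        amps.append(math.hypot(a, b))
    return amps

# toy model whose output is twice as sensitive to k2 as to k1
amps = fast_amplitudes(lambda k: k[0] + 2.0 * k[1], freqs=[5, 11])
```

For this linear toy model the amplitude at the second parameter's frequency comes out twice that at the first, mirroring the factor-of-two sensitivity.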
In the case of a pollutant concentration exceedance, for a model with random parameters X = (X_1, X_2, ..., X_n), the failure condition can be defined as

    g(X) = C_R - C(X) < 0,

where C_R is a pre-specified maximum permissible concentration at a location of interest. If the joint probability density function for the set X is given by f_X, then the probability of failure is given by the n-fold integral

    P_F = P{g(X) < 0} = P{C_R < C(X)} = ∫...∫_{g(x) < 0} f_X(x) dx,

where the integration is carried out over the failure domain. The evaluation of this integral becomes computationally demanding as the number of random variables (the dimension of the integration) increases; in fact, if m is the number of function calls of the integrand per dimension, and n is the dimension, the computation time grows as m^n [94]. In addition, since the value of the integrand is small, numerical inaccuracies can be considerably magnified when integrated over a multi-dimensional space [23].

FORM and SORM use analytical schemes to approximate the probability integral, through the following simple steps, as illustrated by Bjerager [20]:

• mapping the basic random variables X and the failure function g(X) into a vector of standardized and uncorrelated normal variates U, as X(U) and G(U) respectively,

• approximating the function G(U) by a tangent (FORM) or a paraboloid (SORM) at the failure point u* closest to the origin, and

• calculating the probability of failure as a simple function of u*.

These methods are reported to be computationally very efficient compared to Monte Carlo methods, especially for scenarios corresponding to low probabilities of failure [89]. Further, SORM is more accurate than FORM, but computationally more intensive, since it involves a higher order approximation.
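The FORM steps above can be sketched with the standard Hasofer-Lind-Rackwitz-Fiessler (HL-RF) iteration, applied here to a hypothetical linear concentration function already expressed in standard normal space; the model, the threshold C_R, and the finite-difference gradient are illustrative choices, not the report's implementation:

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def form_pf(g, n, iters=50, h=1e-6):
    """First-order reliability method via HL-RF iteration in standard normal
    space: locate the design point u* closest to the origin on g(u) = 0,
    then approximate Pf ~ Phi(-beta) with beta = ||u*||."""
    u = [0.0] * n
    for _ in range(iters):
        gu = g(u)
        # forward finite-difference gradient of the failure function
        grad = []
        for i in range(n):
            up = list(u)
            up[i] += h
            grad.append((g(up) - gu) / h)
        gnorm2 = sum(gi * gi for gi in grad)
        # HL-RF update: project onto the linearized limit state
        dot = sum(gi * ui for gi, ui in zip(grad, u))
        u = [gi * (dot - gu) / gnorm2 for gi in grad]
    beta = math.sqrt(sum(ui * ui for ui in u))
    return norm_cdf(-beta)

# failure when the (hypothetical) concentration C(u) = 1 + u1 + u2 exceeds C_R = 4
pf = form_pf(lambda u: 4.0 - (1.0 + u[0] + u[1]), n=2)
```

For this linear limit state the design point is (1.5, 1.5), giving beta = 3/sqrt(2) and a failure probability near 1.7%; for linear G(U) the FORM result is exact.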
These methods have been applied to problems of ground-water flow and contaminant transport [87-89,108] and to problems in safety assessment [38,142]. The main drawbacks of FORM and SORM are that the mapping of the failure function onto a standardized set, and the subsequent minimization of the function, involve significant computational effort for nonlinear black-box numerical models [79]. In addition, the simultaneous evaluation of probabilities corresponding to multiple failure criteria would involve significant additional effort. Furthermore, these methods impose some conditions on the joint distributions of the random parameters [89], thus limiting their applicability.

Response Surface Methods

The response surface methods consist of (i) screening to determine a subset of important model input parameters, (ii) making multiple runs of the computer model using specific values and pairings of these input parameters, and (iii) fitting a general polynomial model to the model data (using the method of least squares). The fitted response surface is then used as a replacement or proxy for the computer model, and all inferences related to sensitivity/uncertainty analysis for the original model are derived from this fitted model. This approach is sometimes termed a "secondary model technique" [66]. Box et al. [21,22] describe the adaptive nature of these methods, and the methodology for identifying the sampling points and for the subsequent analysis is presented in detail in the literature [57,68,120]. Helton [93] studied the applicability of response surface methods in the performance assessment of radioactive waste disposal. Response surface methods have also been used in conjunction with soil erosion modeling [32], with vegetative plant growth modeling [136], and with structural reliability problems [27].
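Step (iii), the least-squares polynomial fit, can be sketched for a single input; the quadratic "computer model" and the design points are hypothetical:

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y ~ c0 + c1*x + c2*x^2 via the normal equations
    (one input for illustration; this is step (iii) of the response-surface
    recipe)."""
    # design matrix columns: 1, x, x^2
    A = [[1.0, x, x * x] for x in xs]
    # normal equations A^T A c = A^T y, solved by Gaussian elimination
    ata = [[sum(A[k][i] * A[k][j] for k in range(len(A)))
            for j in range(3)] for i in range(3)]
    aty = [sum(A[k][i] * ys[k] for k in range(len(A))) for i in range(3)]
    M = [row[:] + [b] for row, b in zip(ata, aty)]
    for i in range(3):                      # forward elimination
        for j in range(i + 1, 3):
            f = M[j][i] / M[i][i]
            M[j] = [a - f * b for a, b in zip(M[j], M[i])]
    c = [0.0] * 3
    for i in (2, 1, 0):                     # back substitution
        c[i] = (M[i][3] - sum(M[i][j] * c[j] for j in range(i + 1, 3))) / M[i][i]
    return c

# "computer model" runs at chosen design points
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
c = fit_quadratic(xs, [2.0 + 3.0 * x - x * x for x in xs])
```

Once fitted, `c` defines the cheap proxy surface on which all further sampling for uncertainty analysis is carried out, instead of on the expensive model.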
2.5.4 Computer Algebra Based Methods

Table 2.4: A representative list of packages available for automatic differentiation

Package     Authors                Year  Language     Applications/Features
ADIC        Bischof et al. [19]    1997  C            3-D CFD grid generator [19]
ADIFOR      Bischof et al. [17]    1996  Fortran      A wide range of applications [29,98,102,107]
ADGEN       Worley et al. [210]    1989  Fortran      Radionuclide modeling [95]
ADOL-C      Griewank [84]          1996  C/C++        Groundwater flow [41]
ADOL-F      Shiriaev et al. [180]  1996  Fortran 90
ATOMFT      Chang et al. [31]      1994  Fortran      Solution of ODEs [31]
GRESS       Horwedel [97]          1993  Fortran      Radionuclide transport [96,151,156]
Odyssee     Rostaing et al. [170]  1991  Fortran      Process engineering [145]
PADRE2      Kubota [130]           1995  Fortran      Estimates rounding errors
PCOMP       Dobmann et al. [48]    1997  Fortran
SMS         Korelc [125]           1993  Mathematica  Automatic derivation of finite element formulae [125]
TEXPANDER   Ozaki et al. [153]     1992  Fortran      Optimization of structures [153]
Unnamed     Rich et al. [166]            Matlab

Computer algebra based methods involve the direct manipulation of the computer code, typically available in the form of a high-level language code (such as C or FORTRAN), and the estimation of the sensitivity and uncertainty of model outputs with respect to model inputs. These methods do not require information about the model structure or the model equations, and use mechanical, pattern-matching algorithms to generate a "derivative code" based on the model code. One of the main computer algebra based methods is automatic differentiation, which is sometimes also termed automated differentiation.

Automatic differentiation involves the direct manipulation of a model code to generate a corresponding "derivative calculating code". Given the source code, and information about which dependent and independent variables are of interest, automatic differentiation tools can generate derivative information without further user intervention.
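The flavor of automatic differentiation can be conveyed with a minimal forward-mode sketch using dual numbers; this uses operator overloading rather than the source-to-source transformation that tools such as ADIFOR perform, and the model function `f` is hypothetical:

```python
class Dual:
    """Minimal forward-mode automatic differentiation via dual numbers: each
    value carries its derivative through arithmetic, so derivative
    information is produced mechanically alongside the model output."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # product rule, applied statement by statement
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

def f(x):
    # any straight-line model code works unchanged: f(x) = 3x^2 + 2x + 1
    return 3 * x * x + 2 * x + 1

x = Dual(2.0, 1.0)   # seed the derivative dx/dx = 1
y = f(x)             # y carries both f(2) and f'(2)
```

The derivative emerges with machine-precision accuracy, with no finite-difference step size to tune, which is the property cited for AD tools in the text.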
This approach is essentially a "black-box" approach, as it does not require access to the model formulation or model equations. The main advantages of this approach are that partial derivatives can be calculated with the accuracy of the machine precision [17], and that it results in substantial computer time savings [102,110]. An overview of the methodology and applications of automatic differentiation is presented by Griewank [82,83]. In addition, various reviews of automatic differentiation in the fields of engineering design, process engineering, and numerical analysis can be found in the literature [13,34,106]. A large number of packages for automatic differentiation of model codes written in various programming languages are currently available. Table 2.4 lists a majority of the automatic differentiation tools currently available, along with some of their applications reported in the literature.

Two of the notable automatic differentiation tools that have been applied in conjunction with transport-transformation models are GRESS and ADIFOR. GRESS (GRadient Enhanced Software System) [97] is a Fortran preprocessor system that searches the model equations based on the appearance of the "=" symbol, and analyzes the functional dependence of the independent and dependent variables. The differentiation is carried out for each statement, and the results are stored numerically at each stage of the computation. This method has been applied to models describing the transport of radionuclides through groundwater [151,156]. It has also been applied to a model describing the atmospheric dispersion of radionuclides, AIRDOS [95,147]. The main limitation of GRESS is that the particulars of every operation performed in the code are stored in memory at each step. The interpretation overhead associated with this storage, and the potentially great memory demands, can sometimes limit its applicability [16].
ADIFOR (Automatic Differentiation of FORtran) [17] applies the rules of automatic differentiation and rewrites the original code, inserting statements for computing the first-order derivatives. The generated code can then be compiled just like a normal program, and can be executed to calculate derivatives in the model along with the model outputs. This approach avoids the limitation of GRESS with respect to the interpretation overhead.

In the area of photochemical air quality modeling, Carmichael et al. [29] applied the ADIFOR method to calculate the sensitivity of ozone levels to the initial concentrations (84 species) and all reaction rate constants (178 chemical reactions) for six different chemical regimes. Further, Hwang et al. [102] studied the applicability of automatic differentiation techniques to numerical advection schemes in air quality models. They calculated the sensitivity of concentration to a global perturbation of wind velocity in advection models and compared the results with those of other methods. Their results indicate that ADIFOR-generated code can produce exact sensitivity information up to the machine epsilon, and that the CPU requirements were lower for the ADIFOR method. ADIFOR has also been applied in several other research areas, such as groundwater modeling [204], weather modeling [18], biostatistics [107], and aeronautics [98]. The details of the ADIFOR approach and its application are presented in Chapter 4, where the coupling of the Stochastic Response Surface Method and ADIFOR is described.

2.6 Need for Alternative Sensitivity/Uncertainty Analysis Methods

Traditional sampling methods for sensitivity and uncertainty analysis, such as Monte Carlo and Latin Hypercube Sampling, require a substantial number of model runs to obtain a good approximation of the output pdfs, especially for cases involving several inputs.
On the other hand, analytical methods require information about the mathematical equations of a model, and are often restricted in their applicability to cases where the uncertainties are small. Therefore, there is a need for a computationally efficient method for uncertainty propagation that is robust and also applicable to a wide range of complex models. The following chapter describes the development of the Stochastic Response Surface Method (SRSM), a computationally efficient uncertainty analysis method developed as part of this work.

Chapter 3

THE STOCHASTIC RESPONSE SURFACE METHOD: DEVELOPMENT AND IMPLEMENTATION

3.1 Overview of the Method

The Stochastic Response Surface Method (SRSM), as the name suggests, can be viewed as an extension of the classical deterministic Response Surface Method (RSM), on which extensive literature is available [21,22,68,120]. The main difference between the SRSM and the RSM is that in the former the inputs are random variables, whereas in the latter the inputs are deterministic variables. The SRSM was developed as a part of this work with the objective of reducing the number of model simulations required for adequate estimation of uncertainty, as compared to conventional methods. This is accomplished by approximating both inputs and outputs of the uncertain system through series expansions of standard random variables; the series expansions of the outputs contain unknown coefficients which can be calculated from the results of a limited number of model simulations. Conceptually, the propagation of input uncertainty through a model using the SRSM consists of the following steps: (1) input uncertainties are expressed in terms of a set of standard random variables, (2) a functional form is assumed for selected outputs or output metrics, and (3) the parameters of the functional approximation are determined.
The SRSM is founded on the principle that random variables with well-behaved (square-integrable) pdfs can be represented as functions of independent random variables [25,47,80,191]. The above integrability requirement is usually satisfied by uncertain quantities of interest, so this method is applicable to a wide variety of transport-transformation models. Random variables with normal distributions, N(0,1), are often selected to represent input uncertainties due to the mathematical tractability of functions of these random variables [47,155]. In the present work these random variables are referred to as "Standard Random Variables" (srvs). Once the inputs are expressed as functions of the selected srvs, the output metrics can also be represented as functions of the same set of srvs. The minimum number of srvs needed to represent the inputs is defined as the "number of degrees of freedom" in input uncertainty. Since model outputs are deterministic functions of model inputs, they have at most the same number of degrees of freedom in uncertainty.

Consider a model y = F(x), where the set of random inputs is represented by the vector x, and a set of selected random outputs or output metrics is represented by the vector y. First, the vector of input random variables is expressed as a function of the form x = h(ξ), where ξ denotes the vector of the selected srvs. Then, a functional representation for the outputs, of the form y = f(ξ, a), where a denotes a parameter vector, is assumed.
The specific method employed in the estimation of the parameters of the approximation depends on the complexity of the model, which can be broadly categorized as follows:

• Models consisting of simple linear algebraic operators: the outputs can be directly obtained as analytical functions of the srvs, as y = F(h(ξ)); in such cases no functional form needs to be assumed for the outputs.

• Models with mathematically tractable equations (explicit or implicit systems): a form is assumed for the outputs (e.g., a series expansion), and the unknown coefficients are estimated by standard error minimization methods (e.g., Galerkin's method).

• Black-box models or complex numerical codes: the unknown coefficients are estimated using collocation methods, or regression methods if the collocation methods do not converge.

Figure 3.1 presents a schematic depiction of these alternatives.

[Figure 3.1: Schematic depiction of the Stochastic Response Surface Method]

3.2 Step I: Representation of Stochastic Inputs

The first step in the application of the SRSM is the representation of all the model inputs in terms of a set of "standardized random variables". This is analogous to the normalization process used in transforming deterministic variables. In this work, normal random variables are selected as srvs, as they have been extensively studied and their functions are typically well behaved [25,47,80,155]. Here, the srvs are selected from a set of independent, identically distributed (iid) normal random variables, {ξ_i}, i = 1, ..., n, where n is the number of independent inputs, and each ξ_i has zero mean and unit variance. When the input random variables are independent, the uncertainty in the ith model input is expressed directly as a function of the ith srv; i.e., a transformation of x_i to ξ_i is employed.
Such transformations are useful in the standardized representation of the random inputs, each of which could have very different distribution properties. One of two approaches may be taken to represent the uncertain inputs in terms of the selected srvs: (a) direct transformation of the inputs in terms of the srvs, and (b) series approximations in the srvs. Devroye [47] presents transformation techniques and approximations for a wide variety of random variables. Input random variables whose transformations are not found in the literature can be approximated either by algebraic manipulation or by series expansion techniques. These techniques are described in detail in the following sections.

Table 3.1: Representation of common univariate distributions as functionals of normal random variables

Distribution Type   Transformation*
Uniform (a,b)       a + (b - a)(1/2 + 1/2 erf(ξ/√2))
Normal (μ,σ)        μ + σξ
Lognormal (μ,σ)     exp(μ + σξ)
Gamma (a,b)         ab(ξ√(1/(9a)) + 1 - 1/(9a))^3
Exponential (λ)     -(1/λ) log(1/2 + 1/2 erf(ξ/√2))
Weibull (a)         y^(1/a)
Extreme Value       -log(y)

*ξ is normal (0,1) distributed and y is exponential (1) distributed

3.2.1 Direct Transformations

Direct transformations for expressing a given random variable as a function of another random variable are typically derived by employing the following relation [155]: If Y = h(X), where X is a random variable with the pdf f_X(x), then the pdf of Y, f_Y(y), is given by

    f_Y(y) = f_X(x_1)/|h'(x_1)| + ... + f_X(x_n)/|h'(x_n)| + ...    (3.1)

where x_1, ..., x_n, ... are the real roots of the equation y = h(x). Using the above equation, given f_Y(y) and f_X(x), the transformation Y = h(X) is identified by means of trial and error. Transformations are available in the literature for some common distributions. For example, if X is uniform on [0,1], then -(1/λ) log X is exponential (λ) distributed, and if X is standard normal, then X² follows a chi-square distribution, a special case of the gamma distribution.
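The Uniform and Exponential rows of Table 3.1 can be checked numerically; the sample size and the particular parameter choices (Uniform(2,4), unit-rate exponential) are arbitrary illustrations:

```python
import math
import random

def uniform_from_normal(xi, a, b):
    """Table 3.1-style transformation: map a standard normal srv to
    Uniform(a, b) through the normal cdf."""
    return a + (b - a) * (0.5 + 0.5 * math.erf(xi / math.sqrt(2.0)))

def exponential_from_normal(xi, lam):
    """Exponential(lam) from a standard normal srv, again via the cdf.
    Using 1-u instead of u avoids log(0); the distribution is identical."""
    u = 0.5 + 0.5 * math.erf(xi / math.sqrt(2.0))
    return -math.log(1.0 - u) / lam

rng = random.Random(1)
xs = [uniform_from_normal(rng.gauss(0, 1), 2.0, 4.0) for _ in range(20000)]
mean = sum(xs) / len(xs)
```

The transformed samples stay strictly inside (2, 4) and their mean converges to 3, as expected for Uniform(2, 4).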
Although a large number of transformations for distributions exist in the literature, they are not directly applicable for the case of normal random variables, for the following reasons: (a) these transformations are optimized for computational speed, as they are used as random number generators, and (b) a majority of these transformations are in terms of uniform random numbers. However, some transformations in terms of standard normal random variables are presented by Devroye [47]. In addition, some transformations were derived during the course of the development of the SRSM. Table 3.1 presents a list of transformations for some probability distributions commonly employed in transport-transformation modeling.

3.2.2 Transformation via Series Approximation

When a model input follows a non-standard distribution, a direct transformation in terms of standard random variables is often difficult to obtain. This may be true for some distributions for which direct transformations are not available in the literature and are difficult to derive. In such cases, an approximation of the random variable can be obtained as follows:

• the moments of the random variable are computed based on the given information (e.g., measurements or the given non-standard or empirical pdf),

• a series expansion in terms of an srv is assumed to represent the random variable,

• the moments of the corresponding series expansion are derived as functions of the unknown coefficients,

• the error between the derived moments and the calculated moments is minimized (e.g., through least squares), thus resulting in a set of equations in the unknown coefficients, and

• the process is repeated using a higher order approximation, until the desired accuracy is obtained.
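The moment-matching procedure above can be sketched for a second-order expansion x = a0 + a1 ξ + a2 (ξ² - 1), whose first three moments have closed forms; the coarse grid search stands in for a proper least-squares minimization, and the target moments here are constructed for illustration:

```python
import itertools

def expansion_moments(a0, a1, a2):
    """Closed-form mean, variance, and third central moment of
    x = a0 + a1*xi + a2*(xi^2 - 1), with xi ~ N(0,1)."""
    mean = a0
    var = a1 ** 2 + 2.0 * a2 ** 2
    third = 6.0 * a1 ** 2 * a2 + 8.0 * a2 ** 3
    return mean, var, third

def fit_by_moments(target, grid):
    """Pick expansion coefficients minimizing the squared moment error
    (a coarse stand-in for the least-squares step of Section 3.2.2)."""
    best, best_err = None, float("inf")
    for a1, a2 in itertools.product(grid, repeat=2):
        m = expansion_moments(target[0], a1, a2)
        err = sum((mi - ti) ** 2 for mi, ti in zip(m, target))
        if err < best_err:
            best, best_err = (target[0], a1, a2), err
    return best

# target moments chosen to be exactly those of a0=1, a1=1, a2=0.2
coeffs = fit_by_moments((1.0, 1.08, 1.264), [i / 10 for i in range(16)])
```

The last bullet of the procedure would then repeat this fit with a third-order expansion and compare the recovered pdfs for convergence.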
3.2.3 Transformation of Empirical Distributions

In the case of empirical distributions, the probability distribution of an input is specified in terms of a set of measurements (i.e., a sample of random data points, or a frequency histogram). In some cases the uncertainty in a model input x is specified by an empirical cumulative distribution function (cdf), in the form F_x(x) = g(x), where g(x) can be one of the following: (a) an algebraic function, (b) a look-up table, or (c) a numerical subroutine. In such cases, x can be approximated in terms of an srv ξ, as follows:

• Generate a sample of the srv ξ.

• Generate a uniform random variate z from ξ as z = 1/2 + 1/2 erf(ξ/√2) (z corresponds to Uniform(0,1)).

• For the calculated value of z, obtain the value of the input x for which F_x(x) equals the sample value of z.

• Repeat the above steps for additional samples.

In this process, x is sampled uniformly from all of its percentiles, and hence the samples represent the true distribution of x. These steps can be represented by the following transformation equation:

    x = F_x^(-1)(z), where z = 1/2 (1 + erf(ξ/√2)), and F_x(x) is the cdf of x    (3.2)

3.2.4 Transformation of Correlated Inputs

For cases in which the random inputs are correlated, the interdependence of a set of n correlated random variables is often described by a covariance matrix. The covariance matrix, Σ, is a symmetric matrix with the following elements:

    Σ_ij = E{(X_i - η_Xi)(X_j - η_Xj)},

where η_Xi and η_Xj are the means of X_i and X_j, respectively. The transformation process for n correlated random variables with zero mean and similar distribution functions is presented by Devroye [47], and is described briefly here: To generate a random vector A with n components, with mean 0 and covariance C, we start with a vector B comprising n iid unit normal random variables.
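Equation 3.2 can be sketched as follows; the five-point data set and the piecewise-linear inverse cdf (a stand-in for a "look-up table") are hypothetical:

```python
import math
import random

def sample_from_empirical_cdf(cdf_inverse, xi):
    """Sample an input with an empirical cdf from a standard normal srv:
    xi -> z = 1/2(1 + erf(xi/sqrt(2))), uniform on (0,1) -> x = Fx^{-1}(z)."""
    z = 0.5 + 0.5 * math.erf(xi / math.sqrt(2.0))
    return cdf_inverse(z)

# hypothetical "look-up table" cdf: piecewise-linear inverse built from
# sorted measurements (order statistics)
data = sorted([1.0, 2.0, 2.5, 4.0, 7.0])

def inv_cdf(z):
    """Inverse of the empirical cdf by linear interpolation."""
    pos = z * (len(data) - 1)
    i = min(int(pos), len(data) - 2)
    frac = pos - i
    return data[i] * (1 - frac) + data[i + 1] * frac

rng = random.Random(0)
xs = [sample_from_empirical_cdf(inv_cdf, rng.gauss(0, 1)) for _ in range(5000)]
```

Because z is uniform on (0,1), x is drawn uniformly across its percentiles, so the generated samples reproduce the empirical distribution of the data.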
Then, assuming a linear transformation A = HB, the following conditions are satisfied:

    E{B Bᵀ} = I
    E{A} = H E{B} = 0
    E{A Aᵀ} = H E{B Bᵀ} Hᵀ = H Hᵀ = C

The problem thus reduces to finding a nonsingular matrix H such that H Hᵀ = C. Since covariance matrices are positive semidefinite, one can always find such a nonsingular H. The above method is generalized here for the transformation of an arbitrary random vector X with a covariance matrix Σ, where the ith random variable, X_i, has a probability density function f_i(x), mean μ_i, and standard deviation σ_i.

A related case is that of inputs following a Dirichlet distribution, whose components can be obtained from independent chi-square (χ²) random variables. The χ² distribution with a degrees of freedom has the probability density function

    f(x; a) = x^(a/2 - 1) e^(-x/2) / (Γ(a/2) 2^(a/2)),  x > 0;    f(x; a) = 0, elsewhere

The mean of the χ² distribution is a, and is related to the mean of the ith component, X_(i,m), and the Dirichlet distribution parameter θ, as a_i = 2θX_(i,m). The ith component can now be directly obtained as:

    X_i = χ²_i / (χ²_1 + χ²_2 + ... + χ²_n),

where n is the number of components in the Dirichlet distribution and χ²_i denotes a χ² variate with a_i degrees of freedom.
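The construction A = HB can be sketched with a Cholesky factorization for H; the 2x2 covariance matrix and the sample size are illustrative choices:

```python
import math
import random

def cholesky(C):
    """Lower-triangular H with H H^T = C (C symmetric positive definite)."""
    n = len(C)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(H[i][k] * H[j][k] for k in range(j))
            if i == j:
                H[i][j] = math.sqrt(C[i][i] - s)
            else:
                H[i][j] = (C[i][j] - s) / H[j][j]
    return H

def correlated_normals(H, rng):
    """A = H B, where B is a vector of iid N(0,1) srvs."""
    B = [rng.gauss(0.0, 1.0) for _ in range(len(H))]
    return [sum(H[i][k] * B[k] for k in range(len(H))) for i in range(len(H))]

C = [[1.0, 0.8], [0.8, 1.0]]
H = cholesky(C)
rng = random.Random(0)
samples = [correlated_normals(H, rng) for _ in range(20000)]
```

The sample correlation of the generated pairs converges to the prescribed 0.8, confirming H Hᵀ = C.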
When normal random variables are used as srvs, an output can be approximated by a polynomial chaos expansion on the set {ξ_i}, i = 1, ..., n, given by:

    y = a_0 + Σ_{i1=1}^{n} a_{i1} Γ_1(ξ_{i1}) + Σ_{i1=1}^{n} Σ_{i2=1}^{i1} a_{i1i2} Γ_2(ξ_{i1}, ξ_{i2})
        + Σ_{i1=1}^{n} Σ_{i2=1}^{i1} Σ_{i3=1}^{i2} a_{i1i2i3} Γ_3(ξ_{i1}, ξ_{i2}, ξ_{i3}) + ...    (3.3)

where y is any output metric (or random output) of the model, the a_{i1...ip} are deterministic constants to be estimated, and the Γ_p(ξ_{i1}, ..., ξ_{ip}) are multi-dimensional Hermite polynomials of degree p, given by

    Γ_p(ξ_{i1}, ..., ξ_{ip}) = (-1)^p e^{(1/2) ξᵀξ} ∂^p/(∂ξ_{i1} ... ∂ξ_{ip}) e^{-(1/2) ξᵀξ},    (3.4)

where ξ is the vector of the p iid normal random variables {ξ_{ik}}, k = 1, ..., p, that are used to represent input uncertainty. Hermite polynomials on {ξ_i} are random variables, since they are functions of the random variables ξ_i. Furthermore, the Hermite polynomials defined on {ξ_i} are orthogonal with respect to an inner product defined as the expectation of the product of two random variables [80]; thus, E[Γ_p Γ_q] = 0 if Γ_p ≠ Γ_q. It is known that the set of multi-dimensional Hermite polynomials forms an orthogonal basis for the space of square-integrable pdfs, and that the polynomial chaos expansion is convergent in the mean-square sense [80]. In general, the accuracy of the approximation increases as the order of the polynomial chaos expansion increases. The order of the expansion can be selected to reflect accuracy needs and computational constraints.

For example, an uncertain model output U can be expressed through second and third order Hermite polynomial approximations, U_2 and U_3, as follows:

    U_2 = a_{0,2} + Σ_{i=1}^{n} a_{i,2} ξ_i + Σ_{i=1}^{n} a_{ii,2} (ξ_i² - 1)
          + Σ_{i=1}^{n-1} Σ_{j>i}^{n} a_{ij,2} ξ_i ξ_j    (3.5)

    U_3 = a_{0,3} + Σ_{i=1}^{n} a_{i,3} ξ_i + Σ_{i=1}^{n} a_{ii,3} (ξ_i² - 1) + Σ_{i=1}^{n} a_{iii,3} (ξ_i³ - 3ξ_i)
          + Σ_{i=1}^{n-1} Σ_{j>i}^{n} a_{ij,3} ξ_i ξ_j + Σ_{i=1}^{n} Σ_{j=1, j≠i}^{n} a_{ijj,3} (ξ_i ξ_j² - ξ_i)
          + Σ_{i=1}^{n-2} Σ_{j>i}^{n-1} Σ_{k>j}^{n} a_{ijk,3} ξ_i ξ_j ξ_k    (3.6)

where n is the number of srvs used to represent the uncertainty in the model inputs, and the a's are the coefficients to be estimated.
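The one-dimensional Hermite polynomials appearing in Equations 3.5 and 3.6, and their orthogonality under the standard normal measure, can be checked by brute-force Monte Carlo; the sample size and tolerances below are arbitrary:

```python
import random

# one-dimensional Hermite polynomials up to degree 3, the building blocks
# of the polynomial chaos expansions in Equations 3.5 and 3.6
H = [
    lambda x: 1.0,
    lambda x: x,
    lambda x: x * x - 1.0,
    lambda x: x ** 3 - 3.0 * x,
]

def expectation(f, n=200000, seed=0):
    """Monte Carlo estimate of E[f(xi)] for xi ~ N(0,1)."""
    rng = random.Random(seed)
    return sum(f(rng.gauss(0, 1)) for _ in range(n)) / n

# orthogonality: E[H_p(xi) H_q(xi)] = 0 for p != q ...
e12 = expectation(lambda x: H[1](x) * H[2](x))
# ... while E[H_2(xi)^2] is nonzero (its exact value is 2! = 2)
e22 = expectation(lambda x: H[2](x) * H[2](x))
```

This orthogonality is what allows each coefficient in the expansion to carry an independent contribution to the output variance.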
From the above equations, it can be seen that the numbers of unknowns to be determined for second and third order polynomial chaos expansions of dimension n, denoted by N_2 and N_3, respectively, are:

    N_2 = 1 + 2n + n(n - 1)/2    (3.7)

    N_3 = 1 + 3n + 3n(n - 1)/2 + n(n - 1)(n - 2)/6    (3.8)

3.4 Step III: Estimation of Parameters in the Functional Approximation

The unknown coefficients in the polynomial chaos expansion (the a's) can be estimated by one of the following methods, depending on the complexity of the model:

• For cases in which the model is invertible, the output metrics can be directly calculated as explicit functions of the srvs (the ξ_i's); the expressions for the input variables in terms of the ξ_i's are substituted into the inverted model to obtain explicit functions for the output metrics in terms of the ξ_i's. In such simple cases, there is no need to assume a functional form for the output metrics.

• For nonlinear operators with mathematically manipulable equations, the unknown coefficients in the polynomial chaos expansion can be determined by minimizing an appropriate norm of the residual, after substituting the transformed inputs into the model equations. Galerkin's method [200] is commonly used, with weight functions corresponding to the expectation of the random variables [191].

• For cases in which the model equations are not easy to manipulate, or when the model is of "black-box" type, the unknown coefficients can be obtained by a collocation method [191,200]. This method imposes the requirement that the estimates of the model outputs be exact at a set of selected collocation points, thus making the residual at those points equal to zero. The unknown coefficients are estimated by equating the model outputs and the corresponding polynomial chaos expansion at a set of collocation points in the parameter space; the number of collocation points should be equal to the number of unknown coefficients to be found.
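Equations 3.7 and 3.8 translate directly into code:

```python
def n_coeffs_2nd(n):
    """Number of unknown coefficients in a second-order polynomial chaos
    expansion of dimension n (Equation 3.7)."""
    return 1 + 2 * n + n * (n - 1) // 2

def n_coeffs_3rd(n):
    """Number of unknown coefficients in a third-order polynomial chaos
    expansion of dimension n (Equation 3.8)."""
    return 1 + 3 * n + 3 * n * (n - 1) // 2 + n * (n - 1) * (n - 2) // 6
```

For n = 6 these give 28 and 84 unknowns, the figures used in the DEMM/PCM discussion below, and for n = 2 the second-order count is the six unknowns of the two-dimensional example.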
Thus, for each output metric, a set of linear equations results with the coefficients as the unknowns; these equations can be readily solved using standard linear solvers.

3.4.1 Deterministic Equivalent Modeling Method/Probabilistic Collocation Method (DEMM/PCM)

The Deterministic Equivalent Modeling Method (DEMM) [191] incorporates the polynomial chaos expansion technique, along with symbolic computation methods and compiler technology, to provide a prototype language for automated uncertainty analysis. Depending on the complexity of the model, DEMM uses either Galerkin's method, to relate the input and output approximations for models with tractable equations, or the Probabilistic Collocation Method (PCM) [154,192], for the case of black-box models. In the latter case, the coefficients of the polynomial chaos expansion are obtained using the model outputs at selected collocation points. Then, the next order polynomial chaos expansion is employed, and the solution process is repeated. If the estimates of the pdfs of the output metrics of concern from the two expansions are approximately equal, the expansion is assumed to have converged, and the higher order approximation is used to estimate the pdfs of the output metrics.

In DEMM/PCM the collocation points are selected following the orthogonal collocation method suggested by Villadsen and Michelsen [200]. The collocation points correspond to the roots of the polynomial of one degree higher than the order of the polynomial chaos expansion. For one-dimensional problems, this method gives the same results as Galerkin's method [200], and hence is regarded as an "optimal method". This approach is adapted for multi-dimensional cases in DEMM [191].
For example, in order to solve for a two-dimensional second order polynomial chaos expansion, the roots of the third order Hermite polynomial, √3, -√3, and 0, are used; hence the possible collocation points are (0,0), (√3,0), (-√3,0), (0,√3), (0,-√3), (√3,√3), (√3,-√3), (-√3,√3), and (-√3,-√3). There are nine possible collocation points, but from Equations 3.5 and 3.6 with n = 2, taking terms only up to Γ_2, there are only six unknowns. Similarly, for higher dimensional systems and higher order approximations, the number of available collocation points is always greater than the number of collocation points needed, which introduces the problem of selecting the appropriate collocation points. In the absence of selection criteria at present, the collocation points are typically selected at random from the set of available points.

Limitations of DEMM/PCM

As noted above, the selection of the required collocation points in an efficient manner from the large number of possible points is not addressed in the present implementation of DEMM. In fact, as the number of degrees of freedom increases, the number of available collocation points increases exponentially. For example, for the case of n = 6, using Equations 3.7 and 3.8, the numbers of collocation points required for second and third order expansions are 28 and 84, respectively. Since the collocation points are selected from the combinations of the roots of the Hermite polynomial of one order higher, the numbers of collocation points available for second and third order expansions are 3^6 = 729 and 4^6 = 4096, respectively: there are three roots of the third order Hermite polynomial (used for obtaining collocation points for a second order expansion), and there are six variables, hence the number of possible collocation points is 3^6, and similarly 4^6 for a third order expansion.
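The candidate-point enumeration can be sketched as follows; only the second and third order cases discussed in the text are covered:

```python
import itertools
import math

def candidate_points(order, n):
    """All combinations of the roots of the next-higher-order Hermite
    polynomial, as in DEMM/PCM. For a second-order expansion the roots of
    He3(x) = x^3 - 3x, i.e. {0, +sqrt(3), -sqrt(3)}, are used; for a
    third-order expansion, the roots of He4(x) = x^4 - 6x^2 + 3."""
    if order == 2:
        roots = [0.0, math.sqrt(3.0), -math.sqrt(3.0)]
    elif order == 3:
        roots = [s * math.sqrt(3.0 + t * math.sqrt(6.0))
                 for s in (1.0, -1.0) for t in (1.0, -1.0)]
    else:
        raise ValueError("only orders 2 and 3 are sketched here")
    return list(itertools.product(roots, repeat=n))

pts2 = candidate_points(2, 6)   # 3^6 = 729 candidates for 28 unknowns
pts3 = candidate_points(3, 6)   # 4^6 = 4096 candidates for 84 unknowns
```

Note that the third-order candidate set contains no point at the origin, which is precisely the shortcoming discussed next.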
As the number of inputs and the order of expansion increase, the number of available collocation points increases exponentially. In principle, any choice of collocation points from the available ones should give adequate estimates of the polynomial chaos expansion coefficients. However, different combinations of collocation points may result in substantially different estimates of the pdfs of output metrics; this poses the problem of the optimal selection of collocation points. Furthermore, in a computational setting, some of the collocation points could lie outside the range of the algorithmic or numerical applicability of the model, and model results at these points cannot be obtained. These issues are addressed in the following section, where an algorithm for collocation point selection is described.

Another shortcoming of the standard collocation technique suggested in DEMM is that sometimes the roots of the Hermite polynomials do not correspond to high probability regions, and these regions are therefore not adequately represented in the polynomial chaos expansion. For example, for a third order approximation, the roots of the fourth order Hermite polynomial, namely √(3+√6), −√(3+√6), √(3−√6), and −√(3−√6), are used as the collocation points. However, the origin corresponds to the region of highest probability for a normal random variable. Hence, the exclusion of the origin as a collocation point could potentially lead to a poor approximation. The limitations of DEMM/PCM are shown in the context of a case study involving a human Physiologically Based Pharmacokinetic (PBPK) model, and the results are presented in Section 5.1.4.

3.4.2 Efficient Collocation Method (ECM)

This method addresses some limitations of DEMM/PCM.
Here, the collocation points for the estimation of the coefficients of the polynomial chaos expansion are selected based on a modification of the standard orthogonal collocation method of [191, 200]. The points are selected so that each standard normal random variable takes the value of either zero or one of the roots of the higher order Hermite polynomial. A simple heuristic technique is used to select the required number of points from the large number of potential candidates: for each term of the series expansion, a "corresponding" collocation point is selected. For example, the collocation point corresponding to the constant term is the origin; i.e., all the standard normal variables (ξ's) are set to zero. For terms involving only one variable, the collocation points are selected by setting all other ξ's to zero, and by letting the corresponding variable take as values the roots of the higher order Hermite polynomial. For terms involving two or more random variables, the values of the corresponding variables are set to the roots of the higher order polynomial, and so on. If more points "corresponding" to a set of terms are available than needed, the points that are closer to the origin are preferred, as they fall in regions of higher probability. Further, when there is still an unresolved choice, the collocation points are selected such that the overall distribution of the collocation points is more symmetric with respect to the origin. If a choice still remains after that, the collocation point is selected randomly.

The advantage of this method is that the behavior of the model is captured reasonably well at points corresponding to regions of high probability. Furthermore, singularities can be avoided in the resultant linear equations for the unknown coefficients, as the collocation points are selected to correspond to the terms of the polynomial chaos expansion.
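The core of this selection heuristic can be sketched as follows (an illustrative Python sketch, not the report's FORTRAN implementation; `ecm_points` is a hypothetical name, and ties are resolved here simply by distance to the origin rather than by the full symmetry rule):

```python
import itertools
import numpy as np

def ecm_points(n_vars, order, n_points):
    """Select collocation points: each srv takes the value 0 or a root of the
    Hermite polynomial one order higher; points closer to the origin (regions
    of higher probability) are preferred."""
    # Roots of the probabilists' Hermite polynomial He_{order+1}
    roots = np.polynomial.hermite_e.HermiteE.basis(order + 1).roots()
    values = np.concatenate(([0.0], roots))
    candidates = np.unique(
        np.array(list(itertools.product(values, repeat=n_vars))), axis=0)
    # Sort by distance to the origin and keep the required number of points
    by_probability = np.argsort(np.linalg.norm(candidates, axis=1), kind="stable")
    return candidates[by_probability][:n_points]

# Second order expansion in two srvs: six unknown coefficients, so six
# points are drawn from the candidates built on the roots 0 and +/-sqrt(3).
points = ecm_points(n_vars=2, order=2, n_points=6)
```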
Thus, the pdfs of the output metrics are likely to be approximated better than by a random choice of collocation points, while following a technique similar to the "orthogonal collocation" approach.

Applicability and Limitations of ECM

The ECM was applied to a case study involving a human Physiologically Based Pharmacokinetic (PBPK) model, which is presented in Section 5.1.4. This method resulted in accurate estimates of the output pdfs while requiring far fewer model simulations than standard Monte Carlo methods, as shown in Figures 5.4-5.7 in Section 5.1.4. The ECM was also applied to a more complex model, the two-dimensional atmospheric photochemical plume model, the Reactive Plume Model, version 4 (RPM-IV), as described in Section 5.2.3. For that case study, the ECM failed to converge for a third order polynomial chaos approximation. This is attributed to the inherent instability of collocation based approaches, as described by Atkinson [6].

3.4.3 Regression Based SRSM

Collocation methods are inherently unstable, especially with polynomial approximations of high orders, since the polynomial (curve or surface) is required to pass through all the collocation points; thus any one collocation point in the model space could significantly alter the behavior of the polynomial [6]. In this context, regression based methods provide a slightly more computationally expensive, but robust, means of estimating the coefficients of the functional approximation, because model results at more points are used, and the influence of each sample point is moderated by all the other sample points. This extension to the collocation method uses regression in conjunction with an improved input sampling scheme to estimate the unknown coefficients in the polynomial expansion.
In the regression based method, a set of sample points is selected in the same manner as in the ECM, as described earlier. The number of sample points selected must be higher than the number of unknown coefficients to be estimated; selecting a number of points equal to twice the number of coefficients is recommended for obtaining robust estimates, and is the approach used in the case studies presented in this work. The model outputs at the selected sample points are equated with the estimates from the series approximation, resulting in a set of linear equations with more equations than unknowns. This system of equations is then solved using the singular value decomposition method [158].

The regression method has the advantage that it is more robust than the collocation method, but it requires a higher number of model simulations to obtain estimates of output uncertainty. The advantages of this method are demonstrated by comparing its results with those of the ECM for the case study involving RPM-IV, an atmospheric photochemical plume model, described in Section 5.2.3. Based on preliminary case studies (presented in Chapter 5), the regression based method is found to be robust, and it resulted in good approximations of model outputs. Hence this method is recommended for use in general. The higher computational burden, compared to the ECM, is offset by the robustness this method provides in many cases. Throughout the rest of this document, the ECM method is denoted as "ECM" and the regression based method is denoted simply as "SRSM". Unless otherwise specified, the term SRSM in the case studies refers to the regression based SRSM.
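The regression step can be sketched as follows (illustrative Python rather than the report's FORTRAN; for simplicity the sample points are random standard normal draws instead of ECM-selected points, and the "model" is a stand-in that is itself a second order chaos expansion, so the fit recovers the coefficients exactly):

```python
import numpy as np

def chaos_basis(xi):
    """Second order Hermite chaos terms in three srvs."""
    x1, x2, x3 = xi
    return np.array([1.0, x1, x2, x3,
                     x1**2 - 1, x2**2 - 1, x3**2 - 1,
                     x1*x2, x2*x3, x1*x3])

rng = np.random.default_rng(0)
true_coeffs = np.array([2.0, 0.5, -0.3, 0.1, 0.2, 0.0, 0.05, 0.4, 0.0, -0.1])
model_output = lambda xi: true_coeffs @ chaos_basis(xi)   # stand-in model run

# Twice as many sample points as unknown coefficients, as recommended
samples = rng.standard_normal((20, 3))
A = np.array([chaos_basis(s) for s in samples])   # 20 equations, 10 unknowns
y = np.array([model_output(s) for s in samples])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)    # SVD-based least squares
```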
3.5 Step IV: Estimation of the Statistics of the Output Metrics

Once the coefficients used in the series expansion of the model outputs are estimated, the statistical properties of the outputs, such as density functions, moments, joint densities, joint moments, and correlations between two outputs or between an output and an input, can be readily calculated. One way of accomplishing this is through the generation of a large number of realizations of the srvs, and the calculation of the values of the inputs and outputs from the transformation equations. This results in a large number of samples of inputs and outputs, which can then be statistically analyzed using standard methods.

It must be noted that the calculation of the model inputs and outputs here involves only the evaluation of simple algebraic expressions, and does not involve model runs; hence substantial savings in computer time are accomplished. As an example, if the inputs Xi are represented as Xi = Fi(ξ1, ξ2, ..., ξn), and the outputs yj are estimated as yj = Gj(ξ1, ξ2, ..., ξn), then the following steps are involved in the estimation of the statistics of the inputs and outputs:

• generation of a large number of samples of (ξ1, ξ2, ..., ξn),
• calculation of the values of the input and output random variables from the samples,
• calculation of the moments using Equation 3.9, and
• calculation of density functions and joint density functions using the sample values.
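These steps can be sketched as follows (illustrative Python; F and G here are toy stand-ins for a fitted input transformation and output expansion, not the report's actual ones):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for an input transformation F (as from Table 3.1) and a
# fitted output expansion G; evaluating them is cheap algebra, not a model run.
F = lambda xi: np.exp(0.5 + 0.2 * xi[:, 0])              # an input X_1
G = lambda xi: 1.0 + xi[:, 0] + 0.3 * (xi[:, 1]**2 - 1)  # an output y_1

xi = rng.standard_normal((200_000, 2))  # many samples of the srvs
x1, y1 = F(xi), G(xi)

mean = y1.mean()                                    # sample moments, Eq. 3.9
var = ((y1 - mean)**2).mean()
skew = ((y1 - mean)**3).mean() / var**1.5
kurt = ((y1 - mean)**4).mean() / var**2
corr = ((x1 - x1.mean()) * (y1 - mean)).mean() / (x1.std() * y1.std())
# From here, density functions can be estimated by binning the samples.
```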
From a set of N samples, the moments of the distribution of an output y_i can be calculated as follows:

η_yi = Mean(y_i) = E{y_i} = (1/N) Σ_{j=1..N} y_{i,j}

σ²_yi = Var(y_i) = E{(y_i − η_yi)²} = (1/N) Σ_{j=1..N} (y_{i,j} − η_yi)²

γ_{1,yi} = Skew(y_i) = (1/σ³_yi) E{(y_i − η_yi)³} = (1/(N σ³_yi)) Σ_{j=1..N} (y_{i,j} − η_yi)³

γ_{2,yi} = Kurt(y_i) = (1/σ⁴_yi) E{(y_i − η_yi)⁴} = (1/(N σ⁴_yi)) Σ_{j=1..N} (y_{i,j} − η_yi)⁴    (3.9)

Further, the correlation coefficient of an input x_k and an output y_i can be calculated using the following:

r_{xk,yi} = E{(x_k − η_xk)(y_i − η_yi)} / (σ_xk σ_yi) = (1/(N σ_xk σ_yi)) Σ_{j=1..N} (x_{k,j} − η_xk)(y_{i,j} − η_yi)    (3.10)

Similarly, higher moments of outputs, or the correlation between two outputs, or between an input and an output, can be directly calculated.

3.6 Step V: Evaluation of Convergence, Accuracy and Efficiency of Approximation

Once the coefficients of the polynomial chaos expansion are obtained using one of the methods described above, the convergence of the approximation is determined through comparison with the results from a higher order approximation. The next order polynomial chaos expansion is used, and the process for the estimation of the unknown coefficients is repeated. If the estimates of the pdfs of the output metrics agree closely, the expansion is assumed to have converged, and the higher order approximation is used to calculate the pdfs of the output metrics. If the estimates differ significantly, yet another series approximation, of the next order, is used, and the entire process is repeated until convergence is reached.

In the present work, the accuracy of the approximation is evaluated by comparing the results from the SRSM with the results obtained from Monte Carlo analyses. This verification is performed to ensure that the approximation converges to the true distribution. The efficiency of the SRSM is further evaluated by comparing the number of simulations required for the SRSM with the number required for the Latin Hypercube Sampling method.
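A crude version of such a convergence check can be sketched as follows (illustrative Python; it compares successive-order results through their first four sample moments rather than full pdfs, and floors the denominator at 1 to avoid dividing by near-zero moments — both simplifications relative to the pdf comparison described in the text):

```python
import numpy as np

def moments(s):
    """First four sample moments: mean, variance, skewness, kurtosis."""
    m = s.mean()
    v = ((s - m)**2).mean()
    return np.array([m, v,
                     ((s - m)**3).mean() / v**1.5,
                     ((s - m)**4).mean() / v**2])

def converged(samples_lo, samples_hi, tol=0.05):
    """True if output samples from two successive-order expansions agree
    within tol (relative difference, denominator floored at 1)."""
    d = np.abs(moments(samples_lo) - moments(samples_hi))
    scale = np.maximum(np.abs(moments(samples_hi)), 1.0)
    return bool(np.max(d / scale) < tol)

rng = np.random.default_rng(0)
a = 2.0 + 0.5 * rng.standard_normal(200_000)   # stand-in order-p output
b = 2.0 + 0.5 * rng.standard_normal(200_000)   # stand-in order-(p+1) output
```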
3.7 An Illustration of the Application of the SRSM

As an illustration, consider a computational model with three independent random inputs X1, X2, and X3, and two outputs Y1 and Y2, where the input random variables are given by:

X1 = Uniform(p1, q1)
X2 = Lognormal(p2, q2)
X3 = Gamma(p3, q3)    (3.11)

The input random variables can be represented by three srvs, ξ1, ξ2, ξ3, using Table 3.1; for example,

X2 = exp(p2 + q2 ξ2)    (3.12)

where ξ1, ξ2, ξ3 are iid N(0,1) random variables. A second order polynomial chaos approximation for Y1 and Y2 in terms of ξ1, ξ2, ξ3 is given by:

Y1 = a0 + a1 ξ1 + a2 ξ2 + a3 ξ3 + a4 (ξ1² − 1) + a5 (ξ2² − 1) + a6 (ξ3² − 1) + a7 ξ1 ξ2 + a8 ξ2 ξ3 + a9 ξ1 ξ3

Y2 = b0 + b1 ξ1 + b2 ξ2 + b3 ξ3 + b4 (ξ1² − 1) + b5 (ξ2² − 1) + b6 (ξ3² − 1) + b7 ξ1 ξ2 + b8 ξ2 ξ3 + b9 ξ1 ξ3    (3.13)

In order to estimate the 10 unknown coefficients (for each output) in the above equation, the selection of a set of N sample points (about twice the number of coefficients, i.e., 20 in this case) is recommended for the regression based SRSM, in the form

(ξ1,1, ξ2,1, ξ3,1), (ξ1,2, ξ2,2, ξ3,2), ..., (ξ1,N, ξ2,N, ξ3,N)

These sample points can be readily generated using a simple algorithm developed as part of the SRSM implementation. The selection of the sample points is discussed in detail in Section 3.4.2, in the context of the ECM. These sample points correspond to the original model input samples, (x1,1, x2,1, x3,1) ... (x1,N, x2,N, x3,N), as follows:

x1,i = p1 + (q1 − p1) (1/2 + (1/2) erf(ξ1,i/√2))
x2,i = exp(p2 + q2 ξ2,i)
x3,i = F3(ξ3,i), the corresponding transformation for Gamma(p3, q3) given in Table 3.1    (3.14)

for i = 1 ... N. After obtaining the original model input sample points, the model simulation is performed at the points given by (x1,1, x2,1, x3,1) ... (x1,N, x2,N, x3,N). Then the outputs at these sample points, y1,1 ... y1,N and y2,1 ... y2,N, are used to calculate the coefficients a0 ... a9 and b0 ... b9 by solving the following linear equations through singular value decomposition.
Zᵀ [a0 b0; a1 b1; a2 b2; ...; a9 b9] = [y1,1 y2,1; y1,2 y2,2; ...; y1,N y2,N]    (3.15)

where

Z = [ 1         1         ...  1
      ξ1,1      ξ1,2      ...  ξ1,N
      ξ2,1      ξ2,2      ...  ξ2,N
      ξ3,1      ξ3,2      ...  ξ3,N
      ξ1,1²−1   ξ1,2²−1   ...  ξ1,N²−1
      ...       ...       ...  ...
      ξ1,1 ξ3,1 ξ1,2 ξ3,2 ...  ξ1,N ξ3,N ]    (3.16)

In the above equations, Z can be calculated from the values of the ξ's at each sample point, whereas y1,i and y2,i are the corresponding model outputs. Thus, the only unknowns in Equation 3.15, the coefficients ai's and bi's, can be readily estimated. Once the coefficients are estimated, the distributions of Y1 and Y2 are fully described by the polynomial chaos expansions shown in Equation 3.13.

The next step involves the determination of the statistical properties of the outputs, such as the individual moments, joint moments, correlations between the outputs, and correlations between inputs and outputs. This is done by numerically generating a large number of random samples of (ξ1,i, ξ2,i, ξ3,i), calculating the values of the model inputs and outputs for those samples, and computing the statistical properties from the values of the inputs and outputs as described in Section 3.5.

3.8 Computational Implementation

Figure 3.2: Schematic depiction of the steps involved in the application of the conventional Monte Carlo method

3.8.1 Implementation of the Conventional Uncertainty Propagation Methods

Modules for the application of the conventional Monte Carlo and Latin Hypercube Sampling methods were developed as part of this work. Though these modules are mainly used here for the evaluation of alternative methods, they can also be used in isolation.
As shown in Figure 3.2, the conventional Monte Carlo method involves the following steps: (a) obtaining random samples from the probability distributions of the inputs, (b) performing model simulations for the combinations of the sampled inputs, and (c) statistically analyzing the model outputs.

The random numbers for sampling the input distributions were generated depending on the case study considered. For all the cases, the normal random numbers were generated through the function GASDEV provided with the mathematical routines of Numerical Recipes [158]. Gamma random variables were generated by first generating a normal random number and then applying the transformation shown in Table 3.1. In the case of Dirichlet distributions, the independent random numbers were generated first, and the corresponding ratios were calculated as described in Section 3.2.4. In one specific case, the study involving a physiologically based pharmacokinetic model described in Section 5.1, the model was implemented on the SimuSolv modeling platform [56], and the Monte Carlo method involved sampling from random numbers generated by the pseudo-Gaussian random number generator function in SimuSolv, R-GAUSS.

In all the case studies described in this work, the original numerical models were modified to accept the uncertain model inputs and parameters from a file, and model simulations were performed at the sampled values of the inputs. The outputs of the simulations were then analyzed by "binning" them into a set of intervals; the bin count associated with an interval is the number of output realizations that fall in that interval over all the model simulations performed. The ratio of the bin count to the total number of samples gives the probability that the output belongs to that interval. The ratio of this probability to the interval size gives the probability density at the midpoint of the interval.
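The binning procedure just described can be sketched as follows (illustrative Python; the report's implementation is in FORTRAN):

```python
import numpy as np

def binned_pdf(samples, lo, hi, n_bins=100):
    """Estimate a probability density by binning: bin count / total samples
    gives the interval probability; dividing by the interval size gives the
    density at the interval midpoint."""
    edges = np.linspace(lo, hi, n_bins + 1)
    counts, _ = np.histogram(samples, bins=edges)
    width = edges[1] - edges[0]
    midpoints = edges[:-1] + width / 2
    return midpoints, counts / (len(samples) * width)

rng = np.random.default_rng(0)
mids, density = binned_pdf(rng.standard_normal(200_000), -5.0, 5.0)
```

For a standard normal sample, the estimated density near the origin approaches 1/√(2π) ≈ 0.399, and the densities times the interval size sum to (nearly) one.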
The probability density is then plotted against the range of values that the output can take, as presented in the case studies. Further, FORTRAN routines to calculate the mean, variance, and higher moments were also implemented as part of the SRSM; these were based on the equations presented in Section 3.5.

The Latin Hypercube Sampling (LHS) method was developed and implemented from basic principles, mainly because of the lack of a standardized routine readily usable on the Unix operating system. The EPA approved routine for LHS, developed by Iman et al. [105], was developed for the VAX/VMS operating system, while commercial packages such as Crystal Ball [45] and @RISK [208] for PCs have LHS routines embedded in the modeling platform (typically spreadsheet based), and thus cannot be used in conjunction with stand-alone numerical codes. The main aspects of the LHS implementation developed here are presented in the following.

In order to generate samples from the input distributions, the range of probable values for each uncertain input parameter is divided into M segments of equal probability. Thus, the whole parameter space, consisting of N parameters, is partitioned into M^N cells, each corresponding to an equal probability region of the input sampling space. For example, for the case of 3 input parameters and 4 segments, the parameter space is divided into 4 × 4 × 4 cells. The next step involves the generation of M samples from the M^N cells. This is accomplished as follows: first, a random sample is generated, and its cell number is calculated. The cell number indicates the segment number the sample belongs to with respect to each of the parameters. For example, a cell number (2,1,2) indicates that the sample lies in segment 2 with respect to the first parameter, segment 1 with respect to the second parameter, and segment 2 with respect to the third parameter.
At each successive step, a random sample is generated, and is accepted only if it does not agree with any previously accepted sample on any of the segment numbers. Thus, no two samples have any input corresponding to the same segment. The advantage of this approach is that the random samples are generated from all ranges of possible values, thus providing samples from the tails of the probability distributions (i.e., regions that have very low probabilities of occurrence, but correspond to extreme values). Further, for every M samples, each parameter is sampled from each of its M subranges.

The LHS FORTRAN subroutine developed here requires the following inputs: (a) the total number of samples to be generated, N_tot, (b) the number of random inputs, N, and (c) the input probability distributions, consisting of the distribution type (a number) and the distribution parameters (dependent on the distribution type). Further, the number of segments (M) into which the range of each input should be divided is also required as an input. The routine generates a set of M sample points in each iteration, until N_tot samples are generated. A typical value of M used in the case studies is about 50. A much higher value results in too small a range for each sampled input, requiring a large number of trial samples before an acceptable sample is generated, whereas much smaller values do not adequately represent all ranges of possible values, especially the tails of the distributions.

Figure 3.3: Schematic depiction of the steps involved in the application of the Stochastic Response Surface Method (select a set of srvs and transform the inputs in terms of the srvs; express the outputs as series in the srvs with unknown coefficients; generate a set of regression points; estimate the coefficients of the output approximation)
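The acceptance-based LHS scheme described above can be sketched as follows (illustrative Python; the report's subroutine is in FORTRAN, and production LHS codes usually permute segment indices rather than rejecting trial samples):

```python
import numpy as np

def lhs_iteration(n_params, M, rng):
    """Generate one set of M Latin Hypercube samples on (0,1)^n_params:
    a trial sample is rejected if it shares a segment number with any
    accepted sample, so each parameter ends up hitting each of its M
    equal-probability segments exactly once."""
    used = [set() for _ in range(n_params)]
    points = []
    while len(points) < M:
        u = rng.random(n_params)
        cell = tuple(int(ui * M) for ui in u)  # segment number per parameter
        if any(cell[k] in used[k] for k in range(n_params)):
            continue                           # shares a segment: reject
        for k in range(n_params):
            used[k].add(cell[k])
        points.append(u)
    return np.array(points)

rng = np.random.default_rng(0)
pts = lhs_iteration(3, 4, rng)  # 3 inputs, 4 segments: the 4 x 4 x 4 example
```

In use, each uniform coordinate would then be mapped to its input distribution through the corresponding inverse cumulative distribution function.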
3.8.2 Implementation of the Stochastic Response Surface Method

The steps involved in the application of the SRSM are presented in Figure 3.3. The implementation of the SRSM is based on the following routines:

• A symbolic calculation routine in Maple [91], to generate polynomial chaos expansions of arbitrary order in an arbitrary number of parameters.
• A FORTRAN subroutine to generate polynomial chaos expansions of up to fifth order for an arbitrary number of parameters. The extension to higher orders is straightforward, but was not pursued because in all the case studies considered, the third order expansion was the highest required. This routine was developed using the results from the Maple routine.
• A FORTRAN subroutine to generate the sample points based on the input distributions and the order of the polynomial chaos expansion selected.
• A set of FORTRAN utilities to sample distributions in the form of the srvs, to calculate the corresponding transformed random variables, and to calculate the numerical values of the polynomial chaos expansions.
• A FORTRAN subroutine to perform collocation or regression (depending on the scenario) on a set of sample points and the corresponding model outputs; these are used to estimate the coefficients of a polynomial chaos expansion.
• A FORTRAN subroutine to generate the probability density functions from the coefficients of a polynomial chaos expansion in the srvs, by obtaining a large number of samples of the srvs and calculating the statistics of the corresponding outputs.
• FORTRAN subroutines to calculate the moments, percentiles, and other statistical properties of both the inputs and the outputs.

3.9 Implementation of a Web Interface to the SRSM

All numerical routines associated with the development of the SRSM have been incorporated into a stand-alone application that can be used for uncertainty propagation with black-box models.
This application has also been implemented as a web based tool that can be run from remote web browsers. The interface was implemented using Common Gateway Interface (CGI) programming in the Perl language. An operational version can be accessed at http://www.ccl.rutgers.edu/srsm.html. The web server is located in the Computational Chemodynamics Laboratory at the Environmental and Occupational Health Sciences Institute, a joint project of the University of Medicine and Dentistry of New Jersey and Rutgers, The State University of New Jersey.

Here, the application of the SRSM to a simple black-box model is presented. In order to use the SRSM, first the number of inputs, their probability distributions, the number of outputs, and the desired order of the polynomial chaos expansion have to be specified, as shown in Figure 3.4. The program responds with the recommended sample points, and the model outputs at these sample points have to be entered, as shown in Figure 3.5. Additionally, a range for plotting the output probability density functions is also required as input. The program uses the model outputs at these sample points to calculate, and then display, the probability density functions for each of the outputs, as shown in Figure 3.6.
Figure 3.4: Application of the web based SRSM tool: specification of input uncertainties

Figure 3.5: Application of the web based SRSM tool: recommended sample points and the corresponding model outputs

Figure 3.6: Application of the web based SRSM tool: estimates of the pdfs of the outputs

Chapter 4

COUPLING OF THE SRSM WITH SENSITIVITY ANALYSIS METHODS: DEVELOPMENT AND IMPLEMENTATION OF SRSM-ADIFOR

The objective of this effort is to couple the Stochastic Response Surface Method (SRSM) with sensitivity analysis methods, to obtain savings in the computer resources required for uncertainty propagation. In the construction of an approximate model, sensitivity analysis methods can provide information that can help reduce the number of model simulations required to estimate the approximation parameters. Consequently, the use of sensitivity information in conjunction with uncertainty propagation methods can reduce the number of model simulations required to estimate uncertainties in model outputs.

The rationale for coupling the SRSM with a sensitivity analysis method is illustrated in Figure 4.1 for the simple case of approximating a curve in a 2-D Euclidean space using a set of sample points. Figure 4.1 (A) represents the model response, which can be approximated accurately using the five sample points, as shown in Figure 4.1 (B).
In contrast, using three sample points, only a poor approximation, as shown in Figure 4.1 (C), can be obtained. However, using the derivative information at the three sample points (tangents at these points), a good approximation, as shown in Figure 4.1 (D), can be obtained. As an extension of this reasoning to multi-dimensional systems, the partial derivatives of model outputs with respect to model inputs (i.e., tangents in the multi-dimensional space) can be used to reduce the number of sample points required to construct an adequately approximate model.

An overview of various sensitivity metrics is presented in Section 2.4, and the metrics are listed in Table 2.3. Of these sensitivity metrics, the derivative information can be directly utilized in constructing an approximate model. For example, consider a model z = f(x, y) for which the output z is to be approximated as a function of the inputs x and y through the expression:

z = a0 + a1 x + a2 y + a3 x² + a4 y² + a5 xy

In order to estimate the coefficients ai, the model outputs for z are needed at six sample points, requiring six model runs. This results in a set of six equations, with the left hand side consisting of the model outputs and the right hand side consisting of linear expressions in the coefficients.

Figure 4.1: Rationale for coupling of the SRSM with a sensitivity analysis method ((A) model response; (B) approximation with sufficient data; (C) approximation with fewer data; (D) approximation with fewer data and derivative information)
For the same case, if all the first order partial derivatives are available at each sample point, the number of model runs can be reduced to two: the outputs and derivative information at two points, (x1, y1) and (x2, y2), can be used to construct the following closed system of equations:

z(x1, y1) = a0 + a1 x1 + a2 y1 + a3 x1² + a4 y1² + a5 x1 y1
∂z/∂x at (x1, y1) = a1 + 2 a3 x1 + a5 y1
∂z/∂y at (x1, y1) = a2 + 2 a4 y1 + a5 x1
z(x2, y2) = a0 + a1 x2 + a2 y2 + a3 x2² + a4 y2² + a5 x2 y2
∂z/∂x at (x2, y2) = a1 + 2 a3 x2 + a5 y2
∂z/∂y at (x2, y2) = a2 + 2 a4 y2 + a5 x2

Similarly, for a model with M parameters, the number of model runs required to estimate the unknown coefficients decreases by a factor of M + 1, since the first order partial derivatives with respect to the M parameters provide M additional equations at each sample point, whereas the calculation of the model output at a given point provides only one equation.

4.1 Methods for Estimation of Partial Derivatives

The following approaches are commonly used to estimate derivatives for a model [16, 29]:

(a) Variational Equations: When the model equations are available, one can, in principle, compute the derivatives from these equations by directly differentiating them and performing the necessary algebraic manipulations. This approach results in an additional set of equations that can be solved either in a coupled manner [193] or in a decoupled manner [58]. For lower dimensional, linear models, these equations can be solved directly along with the model equations to estimate the derivatives. However, for nonlinear functions the derivatives are generally more complicated than the function itself. Further, if the original model has n outputs and m model parameters, then in order to estimate the partial derivatives of all the outputs with respect to all the parameters, the number of additional equations required would be m × n. Hence, this approach requires a considerable amount of effort for generating all the necessary model equations.
Further, this approach requires access to the original model formulation equations.

(b) "Brute-force" approach: The two commonly used techniques in this approach are:

Central differences: For the model represented by Equation 2.5, the partial derivative of u_i with respect to a parameter k_j at a point (k_{1,0}, ..., k_{j,0}, ..., k_{m,0}) is estimated by solving the model at the two points (k_{1,0}, ..., k_{j,0} − h, ..., k_{m,0}) and (k_{1,0}, ..., k_{j,0} + h, ..., k_{m,0}), with respective outputs u_{i,l} and u_{i,r}. The partial derivative is then given by

∂u_i/∂k_j ≈ (u_{i,r} − u_{i,l}) / (2h),  h → 0

One-sided differences: Here, the model is solved at the point (k_{1,0}, ..., k_{j,0}, ..., k_{m,0}), resulting in the output u_{i,0}, and either at the point (k_{1,0}, ..., k_{j,0} − h, ..., k_{m,0}) or at the point (k_{1,0}, ..., k_{j,0} + h, ..., k_{m,0}); the partial derivative is then estimated from the corresponding one-sided difference quotient, (u_{i,0} − u_{i,l})/h or (u_{i,r} − u_{i,0})/h.

Figure 4.2: Schematic illustration of the Automatic Differentiation method (adapted from Bischof et al., 1994 [16])

(d) Automated Differentiation: Automated differentiation, also known as automatic differentiation, relies on the fact that every function is executed on a computer as a sequence of elementary operations such as additions, multiplications, and elementary functions. By applying the chain rule,

d/dt f(g(t)) at t = t0  =  (∂f/∂s at s = g(t0)) · (dg/dt at t = t0),

over and over again to the composition of those elementary operations, one can compute, in a completely mechanical fashion, derivatives of f that are correct up to the machine precision [16]. This approach essentially consists of using a computational tool that can "understand" the model code and subsequently produce the code for estimating partial derivatives.
The main advantage of this method is that the partial derivatives can be estimated using a single model run; another advantage is that the method can be used in conjunction with complex numerical models that can be treated as "black-box" structures. The automatic differentiation method has been widely applied in the sensitivity analysis of transport/transformation modeling, as mentioned in Section 2.5.4. ADIFOR (Automatic Differentiation of FORtran) is used in this work for coupling with the SRSM. The following sections describe the ADIFOR methodology and the coupling with the SRSM.

4.2 Automatic Differentiation using ADIFOR

ADIFOR [17] transforms the model source code based on the specification of the dependent and independent variables, and produces source code that calculates the derivatives of the dependent variables with respect to the independent variables. Figure 4.2 presents a schematic depiction of the automatic differentiation procedure using ADIFOR.

Figure 4.3 presents an example of the transformation of source code using ADIFOR. Figure 4.3 (a) shows a segment of a sample Fortran program for which derivatives are desired. The dependent variable is y and the independent variables are x(1) through x(10). Figure 4.3 (b) shows the derivative code produced through the application of ADIFOR. Using the resultant code, the partial derivative of y with respect to an independent variable x(i) can be obtained through the corresponding element dy(i) of the array dy.

This example illustrates some key advantages of using ADIFOR over other methods: ADIFOR steps through conditional execution and iterative loops (e.g., the if and do statements in Figure 4.3 (a)), and ADIFOR can calculate derivatives for assignment and iteration statements (e.g., the statement y = x(i) * y * y).
Further, iterative loops and conditional branches are common in computer codes, while they do not appear in algebraic or differential equations. It must be noted that the value of the variable y depends to a great extent on the "sign" of the various x(i)'s, specifically of x(n); such situations are difficult to express in algebraic form, whereas they can be readily expressed in computer code. Since ADIFOR methodically follows the model code, all the complex logic of the model code is followed exactly by the derivative code.

    dimension x(10)
    y = 1.0
    do i = 1, n
       if (x(i) .gt. 0.0) then
          y = x(i) * y * y
       endif
    enddo

(a)

    dimension x(10)
    dimension dy(10), dx(10,10)
    do j = 1, 10
       dx(j,j) = 1.0
    enddo
    do j = 1, 10
       dy(j) = 0.0
    enddo
    y = 1.0
    do i = 1, n
       if (x(i) .gt. 0.0) then
          a = x(i)*y
          b = y*y
          c = a + y*x(i)
          do j = 1, 10
             dy(j) = b*dx(j,i) + c*dy(j)
          enddo
          y = a*y
       endif
    enddo

(b)

Figure 4.3: FORTRAN derivative code generation using ADIFOR: (a) sample code fragment and (b) ADIFOR-generated derivative code

In principle, this method can be easily extended to compute derivatives for computer code of arbitrary length and complexity. Furthermore, in the application of ADIFOR, the user only has to (a) specify the information about the dependent and independent variables, and (b) add statements to the derivative code to output the derivatives.

4.3 Coupling of SRSM and ADIFOR

The coupling of SRSM and ADIFOR follows the same steps as the SRSM with respect to input and output transformations. The coupled method, SRSM-ADIFOR, approximates uncertain model outputs in terms of a set of "standard random variables" (srvs), denoted by $\{\xi_i\}$. The following steps are involved in the application of the SRSM-ADIFOR:

• the model inputs are expressed as functions of selected srvs, as described in Section 3.2,

• an approximation for the model outputs is assumed in the same form as presented in Section 3.3:

$$y_r = a_{r,0} + \sum_{i_1=1}^{n} a_{r,i_1} \Gamma_1(\xi_{i_1}) + \sum_{i_1=1}^{n} \sum_{i_2=1}^{i_1} a_{r,i_1 i_2} \Gamma_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{n} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} a_{r,i_1 i_2 i_3} \Gamma_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \cdots$$  (4.1)

where $y_r$ is the rth output metric (or random output) of the model, the $a_{r,i_1\dots}$'s are deterministic constants to be estimated, and the $\Gamma_p(\xi_{i_1}, \dots, \xi_{i_p})$ are multi-dimensional Hermite polynomials of degree $p$, given by

$$\Gamma_p(\xi_{i_1}, \dots, \xi_{i_p}) = (-1)^p \, e^{\frac{1}{2}\boldsymbol{\xi}^{\mathrm{T}}\boldsymbol{\xi}} \, \frac{\partial^p}{\partial \xi_{i_1} \cdots \partial \xi_{i_p}} \, e^{-\frac{1}{2}\boldsymbol{\xi}^{\mathrm{T}}\boldsymbol{\xi}}$$  (4.2)

• correspondingly, the first order partial derivatives of the rth model output with respect to the jth srv, $\xi_j$, given by $\partial y_r / \partial \xi_j$, are expressed as

$$\frac{\partial y_r}{\partial \xi_j} = \sum_{i_1=1}^{n} a_{r,i_1} \frac{\partial \Gamma_1(\xi_{i_1})}{\partial \xi_j} + \sum_{i_1=1}^{n} \sum_{i_2=1}^{i_1} a_{r,i_1 i_2} \frac{\partial \Gamma_2(\xi_{i_1}, \xi_{i_2})}{\partial \xi_j} + \sum_{i_1=1}^{n} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} a_{r,i_1 i_2 i_3} \frac{\partial \Gamma_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3})}{\partial \xi_j} + \cdots$$  (4.3)

• the model derivative code is generated using the original model code and ADIFOR, and modified so that it can calculate the first order partial derivatives of model outputs with respect to the srvs,

• a set of sample points is selected using the sampling algorithm presented in Section 3.4.2; the number of sample points selected is smaller than the number of coefficients to be estimated, as shown in Equation 4.4,

• the model outputs and the partial derivatives with respect to the srvs are calculated at these sample points,

• outputs at these points are equated with the polynomial chaos approximation (Equation 4.1), resulting in a set of linear equations in the $a_k$'s, and partial derivatives at these points are equated with the corresponding approximation given by Equation 4.3,

• the resultant equations are solved using singular value decomposition, to obtain estimates of the $a_k$'s, as illustrated in Section 4.4.
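The steps above can be sketched end-to-end for a small case. The following Python illustration (assumptions: three srvs, a second-order expansion with K = 10 coefficients, and a synthetic quadratic "model" standing in for the ADIFOR-generated code) builds one value equation and three derivative equations per sample point and solves the stacked system by SVD-based least squares:

```python
# Hedged sketch of the SRSM-ADIFOR regression step; the "model" and its
# derivatives are synthetic stand-ins for ADIFOR-generated code output.

import numpy as np

rng = np.random.default_rng(0)

def basis(x):
    """Second-order polynomial chaos basis for 3 srvs."""
    x1, x2, x3 = x
    return np.array([1.0, x1, x2, x3,
                     x1**2 - 1, x2**2 - 1, x3**2 - 1,
                     x1*x2, x2*x3, x1*x3])

def basis_grad(x):
    """Partial derivatives of each basis term w.r.t. (x1, x2, x3)."""
    x1, x2, x3 = x
    return np.array([
        [0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
        [2*x1, 0, 0], [0, 2*x2, 0], [0, 0, 2*x3],
        [x2, x1, 0], [0, x3, x2], [x3, 0, x1],
    ], dtype=float)

a_true = rng.normal(size=10)               # "unknown" coefficients
model       = lambda x: a_true @ basis(x)
model_deriv = lambda x: a_true @ basis_grad(x)

# N = 2K/(M+1) = 5 sample points give 5*(3+1) = 20 equations
points = rng.normal(size=(5, 3))
rows, rhs = [], []
for x in points:
    rows.append(basis(x)); rhs.append(model(x))   # output equation
    g = basis_grad(x)
    for j in range(3):                            # one per derivative
        rows.append(g[:, j]); rhs.append(model_deriv(x)[j])

# SVD-based least squares (numpy's lstsq) recovers the coefficients
a_est, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
print(np.allclose(a_est, a_true))  # True
```

Only 5 runs of the derivative code are needed here, versus roughly 20 plain model runs for the regression-based SRSM alone, which is the efficiency gain the coupling is designed to deliver.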
4.3.1 Selection of Sample Points

In the application of the SRSM-ADIFOR to the uncertainty analysis of a model with M inputs and P outputs, for a given order of expansion, Equation 4.1 is used to calculate the number of coefficients to be estimated (say K). Thus, K coefficients need to be estimated for each output. The execution of the model derivative code at one sample point gives the model calculations for the outputs and M first order partial derivatives for each output. Thus, Equations 4.1 and 4.3, in conjunction with these calculations, result in M + 1 linear equations at each sample point.

Here, the number of recommended sample points is based on the rationale behind the regression based SRSM: the number of resulting equations should be higher than the number of coefficients estimated in order to obtain robust estimates of the coefficients. Here, the recommended number of equations is about twice the number of coefficients to be estimated. Since for each sample point the number of resultant equations is M + 1, the number of sample points required for the SRSM-ADIFOR, N, is approximately* given by

$$N \approx \frac{2K}{M+1}$$  (4.4)

*the number is approximate because 2K may not always be exactly divisible by M + 1

4.4 An Illustration of the Application of the SRSM-ADIFOR

This section presents an illustration of the SRSM-ADIFOR method by revisiting the example presented in Section 3.7 involving the application of the SRSM. The probability distributions of the three model inputs, X1, X2, and X3, are given by Equation 3.12, and the two model outputs, Y1 and Y2, are approximated by polynomial chaos expansions:

$$Y_1 = a_0 + a_1\xi_1 + a_2\xi_2 + a_3\xi_3 + a_4(\xi_1^2 - 1) + a_5(\xi_2^2 - 1) + a_6(\xi_3^2 - 1) + a_7\xi_1\xi_2 + a_8\xi_2\xi_3 + a_9\xi_1\xi_3$$
$$Y_2 = b_0 + b_1\xi_1 + b_2\xi_2 + b_3\xi_3 + b_4(\xi_1^2 - 1) + b_5(\xi_2^2 - 1) + b_6(\xi_3^2 - 1) + b_7\xi_1\xi_2 + b_8\xi_2\xi_3 + b_9\xi_1\xi_3$$  (4.5)

Clearly, the partial derivatives of Y1 and Y2 with respect to the srvs are given by:

$$\begin{aligned}
\partial Y_1/\partial \xi_1 &= a_1 + 2a_4\xi_1 + a_7\xi_2 + a_9\xi_3 &\qquad \partial Y_2/\partial \xi_1 &= b_1 + 2b_4\xi_1 + b_7\xi_2 + b_9\xi_3 \\
\partial Y_1/\partial \xi_2 &= a_2 + 2a_5\xi_2 + a_7\xi_1 + a_8\xi_3 &\qquad \partial Y_2/\partial \xi_2 &= b_2 + 2b_5\xi_2 + b_7\xi_1 + b_8\xi_3 \\
\partial Y_1/\partial \xi_3 &= a_3 + 2a_6\xi_3 + a_8\xi_2 + a_9\xi_1 &\qquad \partial Y_2/\partial \xi_3 &= b_3 + 2b_6\xi_3 + b_8\xi_2 + b_9\xi_1
\end{aligned}$$  (4.6)

Based on Equation 4.4, the number of sample points, N, required to estimate the 10 unknown coefficients equals 5. Thus, using the SRSM-ADIFOR, the uncertainty in the model outputs can be estimated by obtaining the model outputs at N = 5 sample points, compared to 20 sample points required in the application of the SRSM. The equations corresponding to the estimation of the unknown coefficients are given by:

$$\mathbf{Z}^{\mathrm{T}} \begin{bmatrix} a_0 & b_0 \\ a_1 & b_1 \\ \vdots & \vdots \\ a_9 & b_9 \end{bmatrix} = \begin{bmatrix} y_{1,1} & y_{2,1} \\ \partial y_1/\partial \xi_1|_1 & \partial y_2/\partial \xi_1|_1 \\ \partial y_1/\partial \xi_2|_1 & \partial y_2/\partial \xi_2|_1 \\ \partial y_1/\partial \xi_3|_1 & \partial y_2/\partial \xi_3|_1 \\ \vdots & \vdots \\ y_{1,N} & y_{2,N} \\ \partial y_1/\partial \xi_1|_N & \partial y_2/\partial \xi_1|_N \\ \partial y_1/\partial \xi_2|_N & \partial y_2/\partial \xi_2|_N \\ \partial y_1/\partial \xi_3|_N & \partial y_2/\partial \xi_3|_N \end{bmatrix}$$  (4.7)

where

$$\mathbf{Z} = \begin{bmatrix}
1 & 0 & 0 & 0 & \cdots & 1 & 0 & 0 & 0 \\
\xi_{1,1} & 1 & 0 & 0 & \cdots & \xi_{1,N} & 1 & 0 & 0 \\
\xi_{2,1} & 0 & 1 & 0 & \cdots & \xi_{2,N} & 0 & 1 & 0 \\
\xi_{3,1} & 0 & 0 & 1 & \cdots & \xi_{3,N} & 0 & 0 & 1 \\
\xi_{1,1}^2 - 1 & 2\xi_{1,1} & 0 & 0 & \cdots & \xi_{1,N}^2 - 1 & 2\xi_{1,N} & 0 & 0 \\
\xi_{2,1}^2 - 1 & 0 & 2\xi_{2,1} & 0 & \cdots & \xi_{2,N}^2 - 1 & 0 & 2\xi_{2,N} & 0 \\
\xi_{3,1}^2 - 1 & 0 & 0 & 2\xi_{3,1} & \cdots & \xi_{3,N}^2 - 1 & 0 & 0 & 2\xi_{3,N} \\
\xi_{1,1}\xi_{2,1} & \xi_{2,1} & \xi_{1,1} & 0 & \cdots & \xi_{1,N}\xi_{2,N} & \xi_{2,N} & \xi_{1,N} & 0 \\
\xi_{2,1}\xi_{3,1} & 0 & \xi_{3,1} & \xi_{2,1} & \cdots & \xi_{2,N}\xi_{3,N} & 0 & \xi_{3,N} & \xi_{2,N} \\
\xi_{1,1}\xi_{3,1} & \xi_{3,1} & 0 & \xi_{1,1} & \cdots & \xi_{1,N}\xi_{3,N} & \xi_{3,N} & 0 & \xi_{1,N}
\end{bmatrix}$$

and $\xi_{i,k}$ denotes the value of the ith srv at the kth sample point.

[Figure 4.4: Schematic depiction of the steps involved in the application of the SRSM-ADIFOR: a set of srvs is selected and the input distributions are transformed in terms of the srvs; the outputs are expressed as series in the srvs with unknown coefficients; ADIFOR is applied to the model to produce derivative code; a set of regression points is generated; and the coefficients of the output approximation are estimated, yielding the output distributions.]

4.5 Implementation of the SRSM-ADIFOR

The implementation of the SRSM-ADIFOR involves the modification of the model source code, by first processing the code using ADIFOR, and then making changes in the resultant code so that the derivatives with respect to the srvs can be calculated. Subsequently, the derivative code has to be coupled with a main routine that produces the derivatives of concern as program outputs. The calculated derivatives are then coupled with the SRSM to estimate the coefficients of the polynomial chaos approximations of the outputs. In order to accomplish this, a set of routines was developed to generate expressions for the derivatives of a polynomial chaos expansion, and to generate a matrix similar to the illustration shown in Equation 4.7. The subsequent procedure is similar to the application of the SRSM. Figure 4.4 presents the steps involved in the application of the SRSM-ADIFOR.

Chapter 5

CASE STUDIES FOR THE EVALUATION OF THE SRSM AND THE SRSM-ADIFOR

This chapter presents four case studies to evaluate the SRSM and the SRSM-ADIFOR. These models cover a range of applications, both from the perspective of model application (such as biological, air quality, and groundwater) and from the perspective of model complexity (ranging from a zero-dimensional model to a three-dimensional urban/regional scale photochemical model). The execution times of the models also vary from a fraction of a second to a few hours per model run. The main objective of these case studies is to evaluate the applicability of the SRSM and the SRSM-ADIFOR to a wide range of transport-transformation models, and potentially to numerical models used in other areas.
The case studies are presented here in the following order:

I: A zero-dimensional pharmacokinetic model.
II: A two-dimensional atmospheric plume model.
III: A three-dimensional urban/regional scale air quality model.
IV: A one-dimensional ground water model with discontinuous probability density functions.

The order corresponds both to the chronological evaluation of the SRSM method and to the associated complexity. The computational time for each model simulation in the first three cases ranges from a fraction of a second (case I), through a few seconds (case II), to approximately 6 hours (case III). The groundwater model required a few seconds per model simulation, but is presented at the end, since the complexity with respect to the specification of uncertainties in inputs is high.

Each case study description contains information about the model and its application in exposure/risk assessment. Then, the uncertainties associated with the model are described, and the application of different uncertainty propagation methods is presented. The estimates from the uncertainty propagation methods are then presented graphically, in the form of probability density functions (pdfs), along with a brief discussion at the end.

5.1 Case Study I: A Zero-Dimensional Physiological System

Uncertainty Analysis of a Physiologically Based Pharmacokinetic Model for Perchloroethylene

5.1.1 Description of the Case Study

There is often significant uncertainty associated with human Physiologically Based PharmacoKinetic (PBPK) model parameters, since human in vivo experimental data are not usually available for toxic chemicals. Thus many of the parameters in human PBPK models are generally estimated by in vitro experimentation and by inter-species scale-up of animal PBPK model parameters.
The uncertainty in human PBPK parameters includes a significant amount of natural variability, reflecting the interindividual variability inherent in human populations. It is desirable to estimate the effects that uncertainties (including variability) associated with model inputs and parameters have on output metrics such as internal and biologically effective doses.

Here, a PBPK model for perchloroethylene (PERC) [59] is considered for uncertainty analysis. Human exposure to PERC, and the subsequent metabolism of PERC in the body, is of concern since PERC is a potential carcinogen [59]. Further, PERC is used as a cleaning solvent in the dry cleaning industry, and hence human exposures to it are very common.

Three dose surrogates for PERC inhalation are considered in this study: (a) the area under the arterial blood concentration curve (AUCA), (b) the area under the venous blood concentration curve for the liver (AUCL), and (c) the cumulative amount metabolized in the liver (CML). CML is considered to be a better indicator of risk than simply the total uptake of PERC, since the health risks are thought to be associated with the metabolism of PERC in the body. AUCA and AUCL are indicators of the amount of time PERC resides in the arterial blood and liver, respectively, and thus are also considered as alternative indicators of risk.

The formulation of a PBPK model can be found in the literature [75, 171]. Figure 5.1 presents the schematic of the PERC PBPK model used in this case study; the human body is considered to consist of a set of compartments corresponding to different organs and tissues. These are modeled as a series of interconnected continuous stirred tank reactors (CSTRs).
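The CSTR idealization of a single tissue compartment can be sketched with a one-line mass balance; this Python illustration is not the PERC model itself, and the flow (Q), volume (V), and partition coefficient (P) values are invented for the sketch:

```python
# Minimal sketch (assumed, illustrative parameters): one well-mixed
# tissue compartment treated as a CSTR,
#   dC/dt = (Q/V) * (C_in - C/P),
# integrated with forward Euler. At steady state, C -> P * C_in.

def simulate_compartment(c_in, q=90.0, v=1.8, p=5.0, hours=24.0, dt=0.001):
    """Return tissue concentration after `hours` of constant arterial inflow."""
    c, t = 0.0, 0.0
    while t < hours:
        c += dt * (q / v) * (c_in - c / p)  # inflow minus venous outflow
        t += dt
    return c

c = simulate_compartment(c_in=0.01)
print(c)  # approaches the steady state P * C_in = 0.05
```

A full PBPK model couples several such balances (one per compartment in Figure 5.1) through the arterial and venous blood concentrations, plus a Michaelis-Menten metabolism term in the liver.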
The effect of uncertainty in the parameters of the PERC PBPK model on the three dose surrogates is analyzed using the following:

• Standard Monte Carlo approach,
• Latin Hypercube Sampling (LHS),
• Deterministic Equivalent Modeling Method/Probabilistic Collocation Method (DEMM/PCM),
• Stochastic Response Surface Method with the Efficient Collocation Method (ECM),
• Regression based SRSM, also referred to simply as SRSM, and
• Combined application of the Stochastic Response Surface Method and the Automatic Differentiation Method (SRSM-ADIFOR).

[Figure 5.1: Schematic representation of a PBPK model for PERC: lung, fat, liver, and rapidly and slowly perfused tissue compartments connected by arterial and venous blood flows (Q_alv, Q_fat, Q_sp, Q_rp, Q_liv), with inhaled and exhaled concentrations at the lungs and metabolism in the liver.]

The uncertainty analysis consisted of two stages. In the first stage, the effect of uncertainty in the metabolic constants (K_m and V_max) and the partition coefficients is considered (see Table 5.1). The uncertainty in these six parameters is described by independent lognormally distributed random variables. In the second stage, the uncertainty in compartmental volume proportions is included in the analysis. These five additional parameters are described by a set of mutually dependent Dirichlet distributed random variables.

In the first stage (6 parameters), the DEMM approach and the ECM are evaluated through comparison with the results from a standard Monte Carlo simulation. For the second stage (11 parameters), the ECM, the regression based SRSM, and the SRSM-ADIFOR are evaluated by comparing the results from these methods with the results from the standard Monte Carlo and LHS methods.

In an earlier study, Farrar et al. [65] examined the effect of PBPK model parameter uncertainty on the three dose surrogates, CML, AUCA, and AUCL.
The PBPK model parameter uncertainties used in the present work are adopted from their work, and are specified by the preferred values (PVs) and uncertainty factors (UFs) given in Table 5.1.

Table 5.1: Deterministic and uncertain parameters used in the uncertainty analysis of the PERC PBPK model

  Symbol        Description                                            Preferred Value   UF
  BW            body weight [kg]                                       70                --

  Partition Coefficients
  P_blood/air   blood:air partition coefficient                        12                1.7
  P_fat/blood   fat:blood partition coefficient                        102               2.15
  P_sp/blood    slowly perfused tissue:blood partition coefficient     2.66              11.0
  P_rp/blood    rapidly perfused tissue:blood partition coefficient    5.05              5.69
  P_liv/blood   liver:blood partition coefficient                      5.05              9.37

  Blood Flows
  Q_c           cardiac output [liters/hr]                             348               1.12
  Q_p           alveolar ventilation [liters/hr]                       288               1.50
  Q_fat         blood flow to fat [liters/hr]                          17.4              1.09
  Q_sp          blood flow to slowly perfused tissue [liters/hr]       87.0              1.04
  Q_rp          blood flow to rapidly perfused tissue [liters/hr]      153               1.25
  Q_liv         blood flow to liver [liters/hr]                        90.6              1.35

  Compartment Mass Fractions
  V_fat         mass of fat compartment [% of BW]                      23.0%             1.09
  V_sp          mass of slowly perfused tissue compartment [% of BW]   62.0%             1.04
  V_rp          mass of rapidly perfused tissue compartment [% of BW]  5.0%              1.25
  V_liv         mass of liver compartment [% of BW]                    2.6%              1.35

  Metabolic Constants
  K_m           Michaelis-Menten constant for metabolism [mg/liter]    1.47              12.3
  V_max,c       maximum rate of metabolism [mg/hr]                     0.33              2.84

The preferred values for all parameters other than the metabolic constants correspond to the nominal values in the PERC PBPK model [59], updated as suggested by Reitz and Nolan [165]. The preferred values of the metabolic constants are given by the geometric mean of the values suggested by Hattis et al. [90] and by Reitz and Nolan [165]. The uncertainty factor for a parameter is defined such that the interval from X_p/UF to X_p·UF is the 95% confidence interval, where X_p is the preferred value of parameter X.
5.1.2 Specification of Parameter Uncertainty

Partition coefficients and metabolic constants are assumed to be lognormally distributed, following Farrar et al. [65]. The pdf for a given lognormally distributed parameter is specified by assuming that the preferred value of the parameter corresponds to the median of the distribution. This is convenient, since the mean and standard deviation that specify the corresponding normal distribution are then given by $\ln X_p$ and $\ln(\mathrm{UF})/1.96$, respectively.

Compartment volume fractions are assumed to be specified by a Dirichlet distribution [65]. This implies that each compartment volume is specified by a $\chi^2$ distribution [111, 194], the pdf of which is given by:

$$f(x; \alpha) = \begin{cases} \dfrac{x^{\alpha/2 - 1}\, e^{-x/2}}{\Gamma(\alpha/2)\, 2^{\alpha/2}}, & x > 0 \\[2pt] 0, & \text{elsewhere} \end{cases}$$  (5.1)

The mean of the $\chi^2$ distribution is $\alpha$, and is related to the preferred value and the Dirichlet distribution parameter $\theta$ as follows:

$$E[X_i] = x_{ip}\,\theta$$

where $X_i$ is the ith compartment volume, $E[X_i]$ is its mean, and $x_{ip}$ is the preferred value of the fraction $X_i$, as given in Table 5.1. From the value of $\theta$ and the preferred values for each compartment volume fraction, the parameters of the pdfs of the compartment volumes can be obtained.

Compartment blood flows are assumed to be specified as deterministic functions of compartment volumes and preferred values of compartmental blood flows. In sleeping humans, the blood flow to compartment j is given by:

$$Q_j = Q_{j,p}\,\frac{V_j}{V_{j,p}}$$

where the notation $X_p$ for the preferred value of the parameter X is employed. The total cardiac output is given by

$$Q_c = Q_{liv} + Q_{fat} + Q_{sp} + Q_{rp},$$

and the total cardiac output in waking humans is given by:

$$Q_{cw} = Q_c + (Q_{cw,p} - Q_{c,p})\,q_{ce},$$

where $q_{ce}$ is a lognormally distributed random variable, the preferred value and uncertainty factor for which are given in Table 5.1. The form of the expression for $Q_{cw}$ ensures that $Q_{cw} > Q_c$ always.
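The lognormal specification above can be sketched directly; using the blood:air partition coefficient from Table 5.1 (PV = 12, UF = 1.7) as the example:

```python
# Sketch of the lognormal specification described above: the preferred
# value is taken as the median, so the underlying normal distribution has
# mean ln(Xp) and standard deviation ln(UF)/1.96.

import math

def lognormal_params(pv, uf):
    """Return (mu, sigma) of the underlying normal for a PV/UF pair."""
    return math.log(pv), math.log(uf) / 1.96

mu, sigma = lognormal_params(12.0, 1.7)

median = math.exp(mu)                # = PV
lo = math.exp(mu - 1.96 * sigma)     # 2.5th percentile = PV/UF
hi = math.exp(mu + 1.96 * sigma)     # 97.5th percentile = PV*UF
print(median, lo, hi)
```

By construction, the recovered 95% interval is exactly [PV/UF, PV·UF], matching the definition of the uncertainty factor given with Table 5.1.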
The pdf for $q_{ce}$ is constructed in a manner analogous to that of the other lognormally distributed parameters, as described above. The increase in cardiac output in waking humans is accounted for by an increase in blood flow to the slowly perfused compartment, as follows:

$$Q_{spw} = Q_{sp,p} + (Q_{cw,p} - Q_{c,p})\,q_{ce}.$$

The alveolar ventilation rate is a deterministic function of the cardiac output, and the preferred values for ventilation are given in Table 5.1. Alveolar ventilation rates in sleeping and waking humans are given by:

$$Q_p = Q_{p,p}\,\frac{Q_c}{Q_{c,p}}, \qquad Q_{pw} = Q_{pw,p}\,\frac{Q_{cw}}{Q_{cw,p}}$$

5.1.3 Implementation of the PERC PBPK Model

The PERC PBPK model has been implemented in SimuSolv [56], a software environment for numerical modeling and simulation. The PBPK model consists of a set of coupled ordinary differential equations, which describe the dynamic variation in the amount of PERC in each of the compartments shown in Figure 5.1, resulting from uptake, distribution, metabolism, and excretion processes. The random parameters in the model have been sampled for the simulations using the pseudo-Gaussian random number generator function in SimuSolv, RGAUSS. The representation of the pdfs of uncertain parameters in terms of standard Gaussian random variables, each with zero mean and unit variance, is described here. Daily values of the dose surrogates, CML, AUCA, and AUCL, are computed using the PBPK model to simulate a 24 hour inhalation exposure to 10 ppb of PERC, which is a typical high concentration in urban/industrial areas [99].
A lognormally distributed random variable X, with median $\mu_X$ and an uncertainty factor $\mathrm{UF}_X$, can be represented in terms of a standard normal random variable $\xi = N(0, 1)$, having zero mean and unit variance, as follows [194]:

$$X = \exp\left(\ln(\mu_X) + \frac{\ln(\mathrm{UF}_X)}{1.96}\,\xi\right)$$  (5.2)

A $\chi^2$ distributed random variable with parameter $\alpha$ can be approximated in terms of a standard normal random variable $\xi$ as follows [207]:

$$Y = \alpha\left(\xi\sqrt{\frac{2}{9\alpha}} + 1 - \frac{2}{9\alpha}\right)^3$$  (5.3)

5.1.4 Results for the PBPK Case Study

Stage I (Six Uncertain Parameters): Evaluation of DEMM/PCM and ECM

The effect of six uncertain PBPK model parameters on the dose surrogate pdfs is studied in the first stage of uncertainty analysis. The numbers of model simulations used for the DEMM/PCM and the ECM are equal, as these two methods differ only with respect to the selection of collocation points. Second and third order approximations are used for both the DEMM/PCM and the ECM. The numbers of simulations required for both these methods for the case of six uncertain inputs are 28 and 84, for 2nd and 3rd order approximations, respectively. Additionally, 10,000 Monte Carlo simulations are used in the first stage for the evaluation of the DEMM/PCM and ECM.

Figures 5.2 and 5.3 present the pdfs estimated from four different sets of collocation points selected from the same set of original 4096 points (DEMM Trials 1, 2, 3, and 4), for the dose surrogates CML, AUCA, and AUCL. They also present the pdfs estimated by the standard Monte Carlo approach. Here, the four DEMM trials involved the same number of sample points, but different samples. Further, the pdfs estimated by the Monte Carlo simulation are considered to be the "correct" pdfs. From these figures, it can be seen that the DEMM/PCM approach cannot consistently guarantee convergence of the approximation to the true probability density.
Since the DEMM/PCM method failed to produce consistent results for a simple model with independent input probability distributions, this method was not used in the other case studies, which involved more complex models or input distributions.

[Figure 5.2: Evaluation of DEMM/PCM: uncertainty in the cumulative amount of PERC metabolized (normalized by mass of liver), over a period of one day, resulting from 6 uncertain parameters; pdfs from DEMM Trials 1-4 compared with a 10,000-run Monte Carlo simulation.]

[Figure 5.3: Evaluation of DEMM/PCM: uncertainty in the area under the PERC concentration-time curve in (a) arterial blood (AUCA) and in (b) liver (AUCL), over a period of one day, resulting from 6 uncertain parameters; pdfs from DEMM Trials 1-4 compared with a 10,000-run Monte Carlo simulation.]

[Figures comparing the ECM (2nd order, 28 runs; 3rd order, 84 runs) with Monte Carlo simulations (10,000 runs for stage I; 100,000 runs for stage II), and the regression based SRSM (2nd order, 130 runs; 3rd order, 600 runs) with LHS (1,000 runs) and Monte Carlo (100,000 runs), for the dose surrogates: graphics not recoverable from this copy.]

[Figure 5.9: Evaluation of the SRSM (regression based) and LHS: uncertainty in (a) AUCA and (b) AUCL resulting from 11 uncertain parameters, compared with 100,000 Monte Carlo runs.]

[Figure: Evaluation of the SRSM-ADIFOR (3rd order, 70 runs) against the SRSM (3rd order, 600 runs), LHS (1,000 runs), and Monte Carlo (100,000 runs): graphic not recoverable from this copy.]

5.2 Case Study II: A Two-Dimensional Photochemical Air Quality Model

Uncertainty Analysis of the Reactive Plume Model (RPM-IV)

The Reactive Plume Model, version IV (RPM-IV), is a standard regulatory model used for calculating pollutant concentrations and establishing causal relationships between ambient pollutant concentrations and the emissions from point sources such as industrial stacks [148, 189]. It uses either point source emission estimates or initial plume concentrations as inputs, and calculates downwind concentrations as the plume expands.

RPM-IV is often applied for regulatory purposes to calculate the ozone (O3) concentrations at locations downwind of industrial point sources, since high ozone concentrations in the ambient environment lead to adverse health effects. Ozone is primarily formed in the atmosphere through a series of complex chemical reactions involving oxides of nitrogen (NOx) and volatile organic compounds (VOCs) in the presence of sunlight. Some of the major point sources of NOx and VOCs are industrial units, such as power plants (NOx sources) and refineries (VOC sources).
The application of RPM-IV helps in establishing a quantitative causal relationship between emissions and ambient pollutant concentrations, which is useful in assessing various control strategies for emission reductions.

However, there are significant uncertainties in developing estimates of the emissions from industrial sources. These uncertainties occur with respect to the amounts of emissions (e.g., total amounts of VOCs and NOx), and with respect to their chemical compositions (or speciations, i.e., the fractions of various chemicals within these groups of compounds). These uncertainties arise for a variety of reasons: for example, emission estimates are typically derived from hourly averages projected from annual or seasonal averages [37]. Since there can be significant variation in the load and operating conditions of an industrial unit, emission estimates and chemical compositions for specific days under consideration may differ significantly from the averages, resulting in significant uncertainties. Hence, an uncertainty analysis that takes into account the emission estimate uncertainties is useful for a better understanding of the effects of control strategies for emission reductions.

5.2.1 Description of RPM-IV

A brief description of the RPM-IV model is presented here; additional details are presented in Chapter 6, which deals with the model uncertainty associated with RPM-IV, and which presents an improved, three-dimensional version of the model.

RPM-IV simulates mechanistically the complex nonlinear photochemistry and dispersion processes occurring in an expanding plume. The nonlinear atmospheric gas phase photochemistry is described by the Carbon Bond IV (CB-IV) mechanism [206], which consists of a set of 95 chemical reactions among 35 surrogate chemical species corresponding to organic bonds and functional groups.
This model follows the tra- jectory of an expanding, moving plume and simulates its evolution. In this model, the pollutant mass is initially divided into cells containing equal amounts of pollu- tants. As the plume expands, the individual cells expand in volume and pollutant mass is transferred across cell boundaries in two phases: (a) an "entrainment" phase, where the expanding cell boundaries entrain the pollutants from other cells, and (b) a "detrainment" phase, where the pollutants diffuse across cell boundaries, due to concentration gradients. Further, the pollutants in each cell undergo chemical trans- formation governed by the Carbon Bond-IV mechanism. The equation describing the concentration changes within a cell is given by: dc) {dc)\ { 1 dwj\ ,• { 1 dhj\ ,• . . ~r = -r r-\uci~\i r1 ) uc) + F 5-4 dt \ dt J chem \WJ ds J \hJ ds ) where c* denotes the concentration of chemical species % in cell j. F- is the net flux into the cell, and is given by Fj = E* — Dj, where Ej is the entrainment as the cell expands, and Dj is the detrainment due to the diffusion of the species across the boundaries of the cell. Expressions for entrainment and detrainment are given by the following equations: n* = — (k—\ 2 fK ( c)+i~c) \ _ K (c)~c)-i 1 dy\ dy) y3-y3-1\ J \yi+1 - y3- J ^ [y3 - y3-2 where yj is the distance from the plume centerline to the far most side of the cell j, in the horizontal direction. An additional condition is imposed on the expansion of the cells: the cells are allowed to expand in a way such that the amount of an inert species remains constant within each cell, i.e., no net transfer of an inert species occurs from a cell. 
This condition results in the following equation for the expansion of the cells as the plume travels downwind:

$$E_i^j - D_i^j = 0 \qquad \text{(for an inert species } i\text{)}$$

RPM-IV was selected for the evaluation of the SRSM and SRSM-ADIFOR because it is sufficiently complex to represent a wide range of environmental models, and at the same time it is computationally feasible to perform a large number of Monte Carlo simulations with this model. Further, the complex nonlinear photochemistry of this model is employed by a number of photochemical models that are computationally very demanding. Thus, evaluation of the SRSM and SRSM-ADIFOR with RPM-IV could potentially serve as a preliminary test of the applicability of these methods to a wide range of complex photochemical models.

5.2.2 Uncertainty Analysis of RPM-IV

The present work studies the effect of uncertainty in emission estimates on predicted downwind secondary pollutant concentrations; here, the specific focus is on the uncertainties in the calculated ozone concentrations resulting from the uncertainties in the amounts and the chemical composition of the emissions. This study uses emission estimates and field data measured near the Marathon oil refinery at Robinson, Illinois, during June and July, 1977, by Sexton et al. [177]. Since there was no characterization of uncertainty, and since the focus of this work is to evaluate the applicability of the SRSM-ADIFOR, only representative probability distributions were assumed to describe the uncertainties in emissions. The following distributions for the emissions of VOCs and NOx are used: (a) the amounts of VOCs and NOx released are assumed to have normal distributions with a standard deviation of 20% of the mean value, and (b) the chemical compositions of VOCs and NOx are assumed to follow a Dirichlet distribution [194]. The Dirichlet distribution satisfies the condition that the sum of mole fractions is unity.
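The unit-sum property can be sketched by normalizing independently sampled amounts; this Python illustration is an assumed construction for demonstration only (the subgroup nominal fractions and the resampling of nonphysical negative draws are invented for the sketch):

```python
# Sketch (assumed form, following the normalization described above):
# mole fractions obtained by normalizing independent random amounts,
# so that the fractions always sum to one.

import random

random.seed(1)

def sample_mole_fractions(means, rel_sd=0.40):
    """Draw amounts x_i ~ Normal(mean_i, rel_sd*mean_i), then normalize."""
    while True:
        x = [random.gauss(m, rel_sd * m) for m in means]
        if all(v > 0 for v in x):       # resample nonphysical negatives
            total = sum(x)
            return [v / total for v in x]

# five hypothetical VOC subgroups with assumed nominal mole fractions
y = sample_mole_fractions([0.55, 0.20, 0.10, 0.10, 0.05])
print(round(sum(y), 12))  # 1.0
```

However the individual amounts fluctuate, the normalization guarantees a physically meaningful composition vector, which is the role the Dirichlet-type specification plays here.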
According to this distribution the mole fraction of the zth compound, is given by: Vi = (5-7) i= 1 where Xi is an independent random variable, representing the the amount (in moles) of the zth compound; here a normal distribution is assumed for all a^s, with a nominal standard deviation of 40% of mean value. In this case study, the VOC group consists of five subgroups (paraffins, ethylene, olefins, toluene, and xylene), and the NO^ group consists of NO and NO2. Thus, the total number of uncertain parameters for the model is nine (including two parameters representing the total amounts of VOCs and NOx). Thus, a total of 9 uncertain parameters with 7 degrees of freedom were used in this case study; these were represented by 9 srvs. The output metrics considered are the average ozone concentration in the plume for selected distances downwind from the source. The original RPM-IV model was implemented in Fortran and obtained from the EPA Exposure Models Library [195]. The derivative code was obtained using the ADIFOR system on RPM-IV code. For Monte Carlo, LHS, and the SRSM, the model was run at selected sample points, whereas for the SRSM-ADIFOR, the derivative model was run. 5.2.3 Results and Discussion From equations 3.5 and 3.6 for the number of unknown coefficients for a second and third order polynomial chaos expansions, for n=9, the number of coefficients to be ------- Chapter: V Section: 2 85 determined is 78 and 364 for second and third order approximations, respectively. For a regression based method, 300 model simulations were selected for a second order approximation, and 600 model simulations were selected for a third order approxima- tion, in order to facilitate regression with an adequate number of model outputs. The estimates of pdfs for these cases were evaluated with the results obtained from 10,000 Monte Carlo simulations and 10,000 Latin Hypercube sampling methods. 
For the SRSM-ADIFOR, 80 simulations were used for a third order approximation.

Two representative output metrics were considered here: (a) the ozone concentration at a downwind distance of 2 km, representative of near-source transport and transformation, and (b) the ozone concentration at a downwind distance of 20 km, representative of larger scale transport and transformation. The output pdfs were obtained first using second and third order approximations of the SRSM/ECM. Figure 5.12 shows the pdfs of the ozone concentration at downwind distances of 2 km and 20 km, as estimated by the ECM and by the conventional Monte Carlo method. Figure 5.13 shows the pdfs of the predicted ozone concentrations at downwind distances of 2 km and 20 km, as estimated by the regression based SRSM, traditional Monte Carlo, and Latin Hypercube Sampling.

As shown in Figures 5.12 and 5.13, although the regression based method required significantly fewer runs than the Monte Carlo method, its results agree very closely with the Monte Carlo results. On the other hand, the predictions of the ECM become inaccurate as the order of approximation increases, indicating a lack of robustness in the collocation method. This behavior is more prominent at large downwind distances, indicating that the collocation method may not converge when used with highly nonlinear models (the near-source behavior is expected to be less nonlinear than the far-source behavior). The regression based method, in contrast, produced similar estimates for both second and third order approximations and was consistent over the full range of downwind distances. These results indicate that, although the regression methods require a larger number of model simulations for uncertainty propagation than the collocation methods, their robustness makes them a more viable tool for uncertainty analysis of complex environmental models.
For the evaluation of the SRSM-ADIFOR method with RPM-IV, Figure 5.14 shows the uncertainty estimates obtained. As shown in the figure, the SRSM-ADIFOR gave close estimates with only 80 model runs, while 10,000 Latin Hypercube samples were not sufficient to achieve agreement with the results of the Monte Carlo method.

Figure 5.12: Evaluation of the ECM: uncertainty in the predicted ozone concentrations at (a) 2 km and (b) 20 km downwind from the source, comparing the Monte Carlo method (10,000 runs) with second order (78 runs) and third order (364 runs) ECM approximations [plot data not reproduced]

Figure 5.13: Evaluation of the SRSM (regression based): uncertainty in the predicted ozone concentrations at (a) 2 km and (b) 20 km downwind from the source, comparing Monte Carlo (10,000 runs), LHS (10,000 runs), and second order (91 runs) and third order (600 runs) SRSM approximations [plot data not reproduced]

Figure 5.14: Evaluation of the SRSM-ADIFOR: uncertainty in the predicted ozone concentrations, comparing Monte Carlo (10,000 runs), LHS (10,000 runs), SRSM third order (600 runs), and SRSM-ADIFOR third order (80 runs) [plot data not reproduced]
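The reason derivative information reduces the number of required model runs can be illustrated with a toy one-dimensional second order Hermite chaos expansion: each model run supplies both a value equation and a slope equation for the unknown coefficients, so fewer runs determine the same expansion. This is a hedged sketch, not the report's SRSM-ADIFOR implementation; the model function and sample points are hypothetical.

```python
def fit_hermite2(f, df):
    """Fit y ~ a0 + a1*xi + a2*(xi**2 - 1) (second order Hermite chaos in
    one srv) from model values and ADIFOR-style derivatives at xi = 0 and
    xi = 1.  Two runs with derivatives give four equations for three
    unknowns; the extra equation serves as a consistency check."""
    a1 = df(0.0)                  # d/dxi at 0:  a1
    a0 = f(1.0) - a1              # value at 1:  a0 + a1  (xi**2 - 1 = 0)
    a2 = (df(1.0) - a1) / 2.0     # d/dxi at 1:  a1 + 2*a2
    assert abs((a0 - a2) - f(0.0)) < 1e-9   # value at 0: a0 - a2
    return a0, a1, a2

# Hypothetical smooth model response and its exact derivative:
f  = lambda x: 1.0 + 2.0 * x + 3.0 * (x * x - 1.0)
df = lambda x: 2.0 + 6.0 * x

a0, a1, a2 = fit_hermite2(f, df)
mean     = a0                    # Hermite terms have zero mean
variance = a1**2 + 2.0 * a2**2   # E[psi1**2] = 1, E[psi2**2] = 2
```

Once the coefficients are known, the output mean and variance follow directly from the orthogonality of the Hermite basis, with no further model runs.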
5.3 Case Study III: A Three-Dimensional Urban/Regional Scale Photochemical Air Quality Model: Uncertainty Analysis of the Urban Airshed Model (UAM-IV) for the New Jersey/Philadelphia Domain

Photochemical modeling of ozone (O3) levels in the atmosphere is complicated by the fact that ozone is a secondary pollutant; it is formed through complex chemical reactions involving oxides of nitrogen (NO and NO2) and volatile organic compounds (VOCs). The modeling process is further complicated by the presence of natural, model, and data uncertainties, as shown in Figure 5.15. Uncertainties in the input data and parameters (such as emission estimates and reaction rates) become especially significant when photochemical models are used to evaluate control strategies for the reduction of ozone levels [42].

Biogenic emissions have a significant impact on atmospheric ozone concentrations, depending on the availability of NOx. Earlier studies on the effects of biogenic emissions on O3 concentrations indicate that O3 concentrations typically increase in response to increases in biogenic hydrocarbons. However, modeling studies suggest that in some extremely NOx-limited areas, increases in biogenic emissions decrease O3 concentrations [168]. Sensitivity tests in a case study show that local biogenic emissions are an important contributor to local O3 production, relative to anthropogenic hydrocarbons (AHCs) [137]. Isoprene is a main component of biogenic emissions, and affects the chemistry of the troposphere because its oxidation products are precursors for the photochemical production of ozone [11]. It is one of the most abundant phytogenic chemical species found in the ambient air [71], and accounts for a substantial portion of the atmospheric hydrocarbon load [85,179].
However, biogenic emission estimates are laden with large uncertainties [168]. In addition, there are significant uncertainties associated with the chemical mechanisms employed in photochemical modeling, such as uncertainty in the chemical reaction rates and uncertainties arising from the lumping of chemical species [7]. The focus of the present work is on the comparative evaluation of the effects of uncertainty in the biogenic emission estimates and of uncertainties in the reaction rate coefficients on the estimated ozone levels, with respect to isoprene (2-methyl-1,3-butadiene), since it is one of the major components of biogenic emissions. More specifically, the aim is to compare the effects of uncertainties associated with the emission rates of isoprene with the effects of uncertainties in the rates of reactions involving isoprene. Such uncertainty analysis could provide information useful in identifying where to focus resources: studying the reaction mechanisms in more detail versus improving the inventorying methods.

Figure 5.15: Origins and types of uncertainty present in Photochemical Air Quality Simulation Models (natural uncertainty: atmospheric turbulence, variability; model uncertainty: simplifications in model formulation, spatial and temporal averaging, lumping in chemical mechanisms; data uncertainty: emission estimate uncertainty, errors in measurement, insufficient data)

5.3.1 Uncertainties Associated with Biogenic Emission Estimates

There are significant uncertainties associated with biogenic emission estimates, and specifically with isoprene emission estimates. These uncertainties arise mainly from the following factors:

• variability in the vegetation: the amounts of biogenic emissions differ significantly from one plant species to another, and there is a wide range of plant species in an urban/regional scale modeling area.
• variability in emission rates: the emission rates depend on factors such as sunlight, temperature, and nitrogen and water availability, and exhibit significant spatial and temporal variability.

Currently employed models and associated inventories for estimating biogenic emissions typically cannot treat the above variability in adequate detail; hence, significant uncertainties occur in the emission estimates. Total U.S. emission estimates of biogenics are reported to range from 22 to 50 Tg/yr, depending upon the formulation of the emission rate factors [132]. Some studies report the uncertainty in the biogenic emission estimates of an inventory for the United States to be of the order of 300% [133,168]. In another study, the uncertainties in the emission estimates of isoprene in Europe are reported to be of the order of 500% [182].

In addition, there are significant uncertainties associated with the description of isoprene chemistry. Recently reported studies show that different photochemical mechanisms with different treatments of isoprene chemistry give significantly different estimates of ozone levels [215]. The main sources of uncertainty in existing chemical mechanisms are: (a) the lumping of various biogenic compounds, and (b) insufficient data to estimate the reaction rate parameters associated with biogenic compounds. It is reported in the literature that some reaction rate parameters associated with isoprene chemistry are uncertain up to a range of 150% [7].

5.3.2 Uncertainty Analysis

Considering the importance of biogenic emissions, and the uncertainties involved in their emission estimates and reaction rates, it is important to study how these uncertainties propagate through the model and affect the predicted O3 concentrations. For simple models, uncertainty propagation can be accomplished through conventional methods such as Monte Carlo methods.
However, grid-based photochemical models are typically very demanding computationally, making conventional Monte Carlo methods prohibitively expensive.

Due to the lack of computationally efficient uncertainty analysis methods, a number of studies reported in the literature on uncertainty in photochemical models involved running the model for selected sets of parameter or input values, thus obtaining estimates of the "bounds" of uncertainty in the model results. The main advantage of this approach is that it is simple and requires few model runs. However, this approach ("sensitivity/uncertainty" testing) does not provide insight into the distribution of the model results. Here, the regression based SRSM is applied to estimate the probability distributions of the estimated ozone concentrations.

As mentioned earlier, one goal of this approach is to characterize the individual contributions of uncertainties in chemistry and emissions to the ambient ozone levels, and to identify important contributors to uncertainty, so that resources can be focused on reducing uncertainties in those factors. Another goal is to obtain an approximate estimate of the range of uncertainty in the predicted ozone concentrations resulting from the above mentioned input uncertainties; such an estimate could potentially assist in answering some regulatory questions.

5.3.3 Estimation of Uncertainties

The present work studies the effects of uncertainty in the emission and reaction rates of isoprene on the ozone levels in the atmosphere. This is pursued as follows:

• a single cell ("box") model is used as a screening tool to identify parameters that could potentially affect the results of grid-based PAQSMs. The parameters considered include the initial concentration of isoprene and the rate constants for the reactions involving isoprene in the CB-IV chemical mechanism.
• approximate estimates of uncertainty in the parameters thus identified are obtained from the literature, and are then propagated through the Urban Airshed Model (UAM-IV) to study the urban/regional scale effects of these uncertainties.

More specifically, a photochemical box model is used to perform preliminary screening based on the effects of uncertainty in the isoprene reaction rate constants and the initial concentration of isoprene on the time-dependent ozone concentrations. The results from this exercise are used to identify key factors contributing to uncertainty in the application of a grid-based PAQSM, specifically the UAM-IV.

A box model consists of a single well mixed reactor cell, in which the evolution of chemical species depends only on time-dependent photochemical reaction rates. In the present work, the single cell option of the Reactive Plume Model (RPM-IV) is used for the screening runs; a single cell trajectory model is equivalent to a box model, since no transfer of material occurs from outside the cell. The RPM-IV was used here because the same chemical mechanism, the Carbon-Bond IV (CB-IV) mechanism [206], describes the chemistry in both RPM-IV and UAM-IV. The CB-IV mechanism consists of a set of 95 chemical reactions among 35 surrogate chemical species corresponding to organic bonds and functional groups.

The Urban Airshed Model, version IV [149], is a photochemical grid model that has been applied to several urban areas in the United States, Europe, and the Far East [61,75]. This model solves the atmospheric diffusion equation (ADE) [175] on a three dimensional grid covering the airshed of interest. Atmospheric chemistry is described by the CB-IV mechanism. The model requires meteorological inputs and emission information for both anthropogenic and biogenic emissions.
This case study considers the modeling of the Philadelphia-New Jersey region for July 19 and 20, 1991, when a severe ozone episode occurred in this region. The domain and the modeling grid (at a horizontal resolution of 5 km x 5 km) for the case study are shown in Figure 5.16.

The meteorological inputs were developed using the Diagnostic Wind Model (DWM) [55]; the anthropogenic emissions were obtained from State-provided inventories, processed using the Emission Processing System (EPS) [30]; and the biogenic emissions were obtained from the EPA's Biogenic Emissions Inventory System (BEIS) model [30]. These inputs were obtained from the input data of Georgopoulos et al. [75].

The procedure used for uncertainty analysis consisted of the following steps:

• Performing systematic screening tests to study the sensitivity of the box-model results with respect to selected reaction rate constants of isoprene, and the initial concentration of isoprene
• Identifying the factors that have clear effects on the predicted ozone concentration
• Obtaining approximate uncertainty estimates for these factors from the literature [7]

Figure 5.16: The Philadelphia/New Jersey modeling domain used for uncertainty analysis (county locations, UTM Zone 18)

[figure: probability density plots for July 19 and July 20; caption not legible in the source copy]

Figure 5.18: Probability distribution of the daily average ozone concentration

Here, 2.5 is used as a truncation factor to specify the probability distribution uniquely and to limit the range of values x could take.
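The truncation described above can be sketched by rejection sampling: normal draws falling outside the mean plus or minus 2.5 standard deviations are discarded and redrawn. This is a minimal illustration of one common way a truncation factor is applied; the mean and standard deviation used below are hypothetical.

```python
import random

def truncated_gauss(mean, sd, k=2.5, rng=random):
    """Sample a normal(mean, sd) variate, rejecting draws that fall
    outside mean +/- k*sd (k = 2.5 is the truncation factor in the text)."""
    while True:
        x = rng.gauss(mean, sd)
        if abs(x - mean) <= k * sd:
            return x

random.seed(7)
samples = [truncated_gauss(1.0, 0.4) for _ in range(1000)]
assert all(abs(s - 1.0) <= 2.5 * 0.4 for s in samples)
```

Truncation bounds every realization to a physically plausible range while leaving the bulk of the distribution essentially unchanged, since only about 1.2% of the probability mass of a normal lies beyond 2.5 standard deviations.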
Figure 5.19: Probability distribution of the daily maximum 8 hr running average ozone concentration

More specifically, the uncertainties in isoprene emission estimates, and the rate ... [text missing in the source copy] ... hour running average concentration, for the same case, whereas Figure 5.20 shows the uncertainty in the pervasiveness.

Figure 5.20: Uncertainty in the pervasiveness [plot data not reproduced]

Table 5.2: Contribution of uncertainties in the individual parameters to the overall uncertainty in the UAM-IV, as estimated by the SRSM

                        July 19                                July 20
Metric            Mean     OH       NO3      ISOP        Mean     OH       NO3      ISOP
Max (ppm)         0.143    0.0003   0.0000   0.0009      0.178    0.0003  -0.0001   0.0003
Ave (ppm)         0.062    0.0000   0.0000   0.0002      0.072   -0.0001   0.0000   0.0001
8hr-Ave (ppm)     0.121    0.0000  -0.0001   0.0005      0.149   -0.0003   0.0000   0.0001
Pervasiveness     341.71   6.32    -1.22     47.53       4075.0   31.30   -9.17     157.14

In summary, the following conclusions were reached:

• The daily average O3 concentration was not affected significantly by the input uncertainties considered.
• The daily maximum eight hour running average was affected mildly by the input uncertainties.
• The daily maximum hourly average was also affected mildly.
• The pervasiveness was affected significantly by the uncertainty in the inputs considered.
• The probability distributions of the output metrics differed significantly for the two episode days considered, implying that the uncertainty analysis method needs to be applied on a case by case basis.
• The variance in the concentration metrics was in all cases less than 1% of the mean value.
• The variance in the pervasiveness was of the order of 5%-10% of the mean value.

Table 5.2 summarizes the uncertainty estimates resulting from this analysis.
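The daily maximum 8-hour running average metric discussed above can be computed from hourly concentrations as follows; the hourly ozone series shown is a hypothetical example, not data from the case study.

```python
def max_running_average(hourly, window=8):
    """Daily maximum running average: the largest mean over any `window`
    consecutive hourly concentrations."""
    if len(hourly) < window:
        raise ValueError("need at least one full window of hourly values")
    return max(sum(hourly[i:i + window]) / window
               for i in range(len(hourly) - window + 1))

# Hypothetical hourly ozone concentrations (ppm) over 12 daylight hours:
ozone = [0.04, 0.05, 0.07, 0.09, 0.11, 0.12, 0.12, 0.11,
         0.10, 0.08, 0.06, 0.05]
peak = max_running_average(ozone)
assert abs(peak - 0.10) < 1e-9   # the window starting at hour 3 is largest
```

Because the 8-hour window averages over the diurnal peak, this metric responds more smoothly to input perturbations than the daily maximum hourly value, consistent with the mild sensitivity reported above.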
The coefficients in the table indicate the uncertainty contributed to each metric by the uncertainty in the individual parameter. The coefficients indicate that the uncertainty contributions are very small relative to the mean values of the metrics. Further, they indicate that the uncertainty in the isoprene emission rate contributes significantly to the overall uncertainty, and that the uncertainty in the rate of the NO3-isoprene reaction has an almost negligible effect. The uncertainty in the OH-isoprene reaction rate constant, however, contributes to an extent comparable to the isoprene emission uncertainty. The above analysis thus indicates that the isoprene emission rates and the rate constant of the isoprene-OH reaction should be studied in more detail.

This case study also demonstrates how the SRSM, while requiring a small number of simulations, can be used to estimate the uncertainties in the outputs of a complex model such as the UAM-IV. The execution time, about five days (20 model runs x 6 hours per model run), is much less than that of a Monte Carlo analysis involving, for example, 500 runs, which would require 125 days of computer time.

5.4 Case Study IV: A Ground Water Model with Discontinuous Probability Distributions: Uncertainty Analysis of Tritium Contamination in a Landfill using the EPACMTP Model

Characterization of exposure due to leakage of hazardous contaminants from a land disposal unit involves significant uncertainty due to inherent variability in the hydrogeologic properties of the site, and due to incomplete understanding of the processes involved in the transport of contaminants. Since exposure estimates are used in making policy decisions, it is important that any exposure characterization take into account the variability and uncertainty associated with the physical system and with the modeling process.
Here, a case study is presented on the characterization of uncertainty in the estimated tritium exposure at a receptor well. The main source of the contamination is a hypothetical landfill unit in the southern United States. The fate and transport model used in this work is the EPA's Composite Model for Leachate Migration and Transformation Products (EPACMTP). In the following sections, a brief description is given of the model used, the estimates of the model parameters, and the uncertainty analysis methods. The variability and uncertainty associated with the hydrogeologic parameters and with the physical properties of the landfill unit are considered in characterizing the uncertainty in the calculated tritium concentrations. The environmental metrics considered are the estimated maximum concentration of tritium in a "receptor" well and the estimated time of occurrence of the maximum concentration.

5.4.1 EPACMTP Model

The EPA's Composite Model for Leachate Migration and Transformation Products (EPACMTP) provides estimates of potential human exposure to hazardous chemicals leaching from land disposal facilities [62]. EPACMTP simulates the subsurface fate and transport of contaminants released from land disposal sites, and predicts the associated groundwater exposure in a domestic drinking water receptor well. This model is an improvement over the EPA's Composite Model for Landfills (EPACML) [60]. EPACML accounts for the first-order decay and sorption of chemicals, but disregards the formation and transport of transformation products; in addition, it can describe only uniform, unidirectional groundwater flow. EPACMTP, in contrast, can take into consideration: (i) chain decay reactions and the transport of daughter and grand-daughter products, (ii) the effects of water-table mounding on groundwater flow and contaminant migration, (iii) finite source as well as continuous source scenarios, and (iv) metals transport.
EPACMTP consists of two modules: an unsaturated zone module, the Finite Element and semi-analytical Contaminant Transport in the Unsaturated Zone (FECTUZ) module, and a saturated zone module, the Combined Analytical-Numerical SAturated Zone in 3-Dimensions (CANSAZ-3D) module. FECTUZ is a one-dimensional model that simulates vertically downward steady-state flow and contaminant transport through the unsaturated zone above an unconfined aquifer. CANSAZ-3D simulates 3-D steady-state groundwater flow and transient or steady state contaminant transport. EPACMTP currently uses a simplified 2-D version of CANSAZ-3D, and the modules are optimized for computational efficiency.

Table 5.3: Probability distributions used for the EPACMTP model parameters for the uncertainty analysis

Param   Dist.(a)  Distribution parameters(b)   Units    Description
Ks      LN        η = 0.343, σ = 0.989         cm/hr    Saturated hydraulic conductivity
K       U         min = 1500, max = 2114       m/yr     Hydraulic conductivity
α       LN        η = 0.019, σ = 0.012         1/cm     Moisture retention parameter
β       SB        η = 1.409, σ = 1.629         -        Moisture retention parameter
θr      SB        η = 0.068, σ = 0.071         -        Residual water content
%OM     SB        η = 0.105, σ = 5.88          -        Percentage of organic matter
d       LN        η = 0.09, σ = 4.0            cm       Particle diameter
αL      LN        η = 10.0, σ = 5.0            m        Longitudinal dispersivity
pH      LN        η = 6.5, σ = 0.75            -        Aquifer pH
foc     SB        η = 0.000432, σ = 0.0456     -        Fractional organic carbon content
R       N         η = 800, σ = 150             m        Radial distance of observation well
θ       U         min = 0°, max = 90°          -        Angle off-center
B       U         min = 30.0, max = 35.0       m        Aquifer thickness

(a) Distribution types: LN - Lognormal, U - Uniform, N - Normal, SB - Johnson SB.
(b) Distribution parameters: η - mean, σ - standard deviation, min - minimum value, max - maximum value.
5.4.2 Data Sources for the Application of EPACMTP

Data for the site characteristics, infiltration rates, the volume and area of the landfills, and the probability distributions of the hydrogeologic parameters were obtained from a review conducted by Hydrogeologic Inc. [62]. In this review, a number of different sources were used in the development of the site-based approach. Four data sets were selected to derive the regional characteristics of the important parameters for each sampled site:

• the OPPI survey of waste management units (EPA, 1986),
• the infiltration and recharge analysis performed by OSW,
• the USGS state-by-state inventory of groundwater resources, and
• the Hydrogeologic Database for Modeling (HGDB), developed from a survey of hazardous waste field sites in the U.S.

These data sets were used in conjunction with the soil mapping database provided by the Soil Conservation Service (SCS), data sets from the National Oceanic and Atmospheric Administration, and simulations from the HELP model [174]. The data for the hypothetical scenario in this case study were adapted from a data set provided by Hydrogeologic Inc.

5.4.3 Implementation of the Monte Carlo Method in EPACMTP

The Monte Carlo method requires that a probability distribution (or a frequency distribution) be provided for each input parameter that has associated uncertainty or variability. The method involves the repeated generation of pseudo-random values of the uncertain input variables (drawn from the known distributions and within the range of any imposed bounds), and the application of the model using these values to generate a set of model responses or outputs (for example, the receptor well concentration, Crw). These responses are then statistically analyzed to yield the probability distribution of the model output.
The various steps involved in the application of a Monte Carlo simulation are:

• selection of representative cumulative probability distribution functions for the relevant input variables,
• generation of pseudo-random numbers from the selected distributions, representing a possible set of values (a realization) of the input variables,
• checking whether the realization satisfies the constraints or bounds on the distributions (and resampling until an acceptable realization is obtained),
• application of the model to compute the derived inputs and outputs,
• repeated application of the above steps for a specified number of iterations, and
• statistical analysis of the series of output values generated.

Table 5.4: Deterministic parameters in the EPACMTP model for the case study

Param   Value       Units    Description
Aw      3.5x10^5    m^2      Area of the source
Xw      304.8       m        Length of the source
Yw      1143.0      m        Width of the source
Ir      0.381       m/yr     Areal recharge rate
T       17.5        °C       Aquifer temperature
λ       0.0564      1/yr     Radioactive decay coefficient
ts      20          yr       Source leaching duration
αT      8.0         m        Transverse dispersivity
dv      160.0       m        Vertical dispersivity
θs      0.45        -        Saturated water content
Dw      10.67       m        Depth to water table
S       0.0086      -        Hydraulic gradient
Rs      1.0         -        Retardation factor
ρb      1.65        g/cm^3   Bulk density

5.4.4 Uncertainty Analysis

This case study consists of site based landfill modeling of tritium contamination at a groundwater well, resulting from a hypothetical landfill unit in the southern United States. The landfill is a finite source, of 0.35 km^2 area, with a leaching duration of 20 years and a recharge rate of 0.381 m/yr. The receptor well is located at a distance of 1635 m from the landfill. The EPACMTP model is used to simulate the radioactive decay and transport of tritium through the saturated and unsaturated zones underlying the landfill. The data used here are adapted from data provided by Hydrogeologic Inc.
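The Monte Carlo procedure outlined in Section 5.4.3 can be sketched generically as follows. The toy model, sampler names, and bound check below are hypothetical stand-ins, not the EPACMTP inputs.

```python
import random
import statistics

def monte_carlo(model, samplers, bounds_ok, n_runs, seed=0):
    """Generic Monte Carlo loop: draw a realization from each input
    distribution, resample until it satisfies the imposed bounds, run the
    model, and collect the outputs for statistical analysis."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_runs):
        while True:
            realization = {name: draw(rng) for name, draw in samplers.items()}
            if bounds_ok(realization):
                break                      # acceptable realization found
        outputs.append(model(realization))
    return outputs

# Hypothetical stand-in for a fate-and-transport model: output scales with
# conductivity K and decays with receptor distance R (illustration only).
toy_model = lambda p: p["K"] / p["R"]
samplers  = {"K": lambda r: r.lognormvariate(0.0, 0.5),
             "R": lambda r: r.gauss(800.0, 150.0)}
bounds_ok = lambda p: p["R"] > 0.0         # distance must be positive

crw = monte_carlo(toy_model, samplers, bounds_ok, n_runs=500)
mean_crw = statistics.mean(crw)
```

The collected outputs (`crw` here) are then binned or ranked to estimate the output pdf, exactly as in the final step of the list above.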
The probability distributions of the site-specific parameters for the landfill at the Old Burial Ground are given in Table 5.3, and the values of the constant model parameters for EPACMTP are given in Table 5.4.

Figure 5.21: Uncertainty in the estimated maximum tritium concentration in a receptor well (SRSM with 350 runs versus Monte Carlo with 1000 and 5000 runs) [plot data not reproduced]

5.4.5 Results and Discussion

Figure 5.21 shows the uncertainty associated with the estimated maximum tritium concentration in the ground water resulting from leaching from the landfill unit. Figure 5.22 shows the uncertainty associated with the estimated time of occurrence of the maximum tritium concentration in the ground water at the well. The figures show the pdfs estimated by the Monte Carlo simulations and by the SRSM. Results of the SRSM with 350 model runs are compared with those of Monte Carlo simulations with 1000, 3000, and 5000 model runs. The results indicate that the SRSM shows close agreement with the Monte Carlo results, while requiring far fewer runs. Further, the SRSM offers certain other advantages, which are the focus of ongoing research:

• The relationships between inputs and outputs can be easily estimated, since both the input and output pdfs are expressed as series expansions in srvs. This means that the SRSM provides insight into the relative contribution of each uncertain input to each output, giving extensive sensitivity and uncertainty information. Conventional Monte Carlo methods do not provide such information.

• Correlations between outputs can be easily estimated, since all outputs are expressed in terms of the same set of srvs.

Figure 5.22: Uncertainty in the time of occurrence of the maximum tritium concentration [plot data not reproduced]
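The second advantage above can be illustrated with a toy example: when two outputs are expanded in the same standard random variables, their covariance follows directly from the expansion coefficients, with no additional model runs. The first-order expansions and coefficients below are hypothetical.

```python
import random

# Two outputs expanded in the same srvs xi1, xi2 (first-order terms only):
a = [2.0, 1.0, 0.5]       # Y1 = a0 + a1*xi1 + a2*xi2
b = [1.0, 0.8, -0.3]      # Y2 = b0 + b1*xi1 + b2*xi2

# With an orthonormal basis, the covariance is just the coefficient product:
cov_analytic = a[1] * b[1] + a[2] * b[2]

# Cross-check the analytic value by sampling the shared srvs:
rng = random.Random(3)
n = 200_000
s = 0.0
for _ in range(n):
    x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
    s += (a[1] * x1 + a[2] * x2) * (b[1] * x1 + b[2] * x2)
assert abs(s / n - cov_analytic) < 0.02   # sampling noise only
```

A conventional Monte Carlo analysis would need paired model runs to estimate the same covariance; here it drops out of the shared expansion.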
Chapter 6

CHARACTERIZATION AND REDUCTION OF MODEL UNCERTAINTY: AN ATMOSPHERIC PLUME MODEL STUDY

6.1 Introduction and Background

Uncertainty in the application of transport-transformation models arises not only from uncertainty in model inputs or parameters (i.e., parametric or data uncertainty), but also from uncertainty in model formulation (i.e., model uncertainty). As mentioned in Chapter 1, model uncertainty arises under several conditions, including the following: (a) when alternative sets of scientific or technical assumptions for developing a model exist (Model Structure), (b) when models are simplified for purposes of tractability (Model Detail), and (c) when a coarse numerical discretization is used to reduce the computational demands of the model (Model Resolution). Some of the major sources of model uncertainty in transport-transformation modeling are listed in Table 2.1.

When a model application involves both model and data uncertainties, it is important to identify the relative magnitudes of the uncertainties associated with the data and with the model formulation. Such a comparison is useful in focusing resources where most appropriate (e.g., filling data gaps versus refining the model).

The approach followed here for characterizing model uncertainty is based on the development of models corresponding to different model formulations and the subsequent comparison of the model results. The availability of alternative models, ranging from simplified to more detailed, aids in evaluating the applicability of the low resolution models: if the results of the low resolution models agree closely with those of the high resolution models, then the low resolution models are preferable, since they typically require fewer computational resources and less input data.
Furthermore, the availability of alternative models aids in characterizing the model uncertainty associated with the application of those models: the bounds on the model calculations corresponding to different model formulations provide an estimate of the uncertainty associated with the model formulation. For example, if the bounds are narrow, the uncertainty associated with the model formulation can be considered relatively small.

This chapter focuses on the model uncertainties associated with photochemical air quality modeling applications. Specifically, the uncertainties associated with model resolution and with assumptions of "well mixedness" are addressed here. A case study is presented involving an EPA regulatory atmospheric photochemical trajectory model, the Reactive Plume Model (RPM) [189], which is briefly described in Section 5.2 and in more detail in the following sections. As mentioned in Section 5.2, the RPM describes the evolution of a photochemical plume from "point sources" of emissions, such as the stacks of refineries and power plants.

The currently used version of the RPM, RPM-IV, lacks vertical resolution, as the model assumptions include uniform mixing in the vertical direction; this lack of resolution is highlighted in Figure 6.3. Here, two aspects of the model are studied: (a) the uncertainty associated with the assumption of a uniform vertical concentration profile, and (b) the uncertainty associated with the choice of horizontal resolution. In this process, the results of the RPM at varying horizontal resolutions are studied first. Then a "three-dimensional version" of the RPM, termed RPM-3D, is developed by incorporating vertical resolution into RPM-IV. The results from the RPM and RPM-3D are subsequently compared with each other in order to characterize the uncertainty associated with the vertical resolution in the RPM.
The relevant background information on photochemical modeling, required for a detailed understanding of the RPM, is presented here. That is followed by a description of the formulation of RPM-IV and RPM-3D.

6.2 Photochemical Air Pollution Modeling

The main objective of Photochemical Air Quality Simulation Models (PAQSMs) is to establish causal relationships among emission levels, meteorological conditions, and ambient air quality. These are numerical models that simulate the transport and chemical transformation of pollutants in the atmosphere. Air quality can be modeled via prognostic modeling, which is based on the fundamental physicochemical principles governing air pollution, or via diagnostic modeling, which is based on the statistical description of observed data. These models have been applied at the local scale (e.g., plume models) and at urban and regional scales (e.g., airshed models). The applications of PAQSMs range from the simulation of transport and transformation of chemicals over a few kilometers over a few hours (local scale) to over a few thousand kilometers over a few weeks (regional scale).

In the application of PAQSMs, one of the main pollutants of concern is ozone, as mentioned in Section 5.2. Ozone is a "secondary pollutant" formed through nonlinear chemical reactions between precursor species, oxides of nitrogen (NOx) and volatile organic compounds (VOCs), that are emitted by a wide variety of both anthropogenic and natural (biogenic) emissions sources. Certain meteorological conditions can significantly enhance the ambient concentrations of ozone in specific locales, leading to substantial health risks [75,176]. PAQSMs are employed to study the effectiveness of various alternative strategies for reductions in emissions with respect to reductions in ambient ozone levels.

6.2.1 Mathematical and Numerical Formulation of PAQSMs

Many PAQSMs are based on the fundamental description of atmospheric transport and chemical processes.
The atmospheric diffusion equation, given by

    ∂⟨c_i⟩/∂t + u_j ∂⟨c_i⟩/∂x_j = ∂/∂x_j ( K_jj ∂⟨c_i⟩/∂x_j ) + R_i(⟨c_1⟩, ..., ⟨c_n⟩) + S_i    (6.1)

is typically employed to describe the pollutant concentration in a selected control volume. This equation expresses the conservation of mass of each pollutant in a turbulent fluid in which chemical reactions occur. In the above equation, ⟨·⟩ represents the ensemble average, c_i is the concentration, R_i is the rate of formation of the ith species, K_jj is the turbulent dispersion coefficient (obtained by approximating turbulent transport with a gradient diffusion scheme), and S_i is the rate of emissions of the ith species into the control volume. The Einstein summation convention is used here with respect to the subscript j.

The assumptions leading to the atmospheric diffusion equation above include: (a) the first order closure approximation (i.e., turbulent fluxes are approximated by gradient-driven fluxes), (b) the neglect of molecular diffusion compared to turbulent dispersion, (c) incompressibility of the atmosphere, and (d) the approximation of the ensemble average of reaction rates by the reaction rates of the ensemble averages.

Air quality models can be broadly categorized into the following types:

Box models: These are the simplest of the numerical models for air quality modeling. The region to be modeled is treated as a single cell, or box, bounded by the ground at the bottom, some upper limit to the mixing at the top, and the east-west and north-south boundaries on the sides. Here, the pollutant concentrations in a volume of air, a "box", are spatially homogeneous and instantaneously mixed. Under these conditions, the pollutant concentrations can be described by a balance among the rates at which they are transported in and out of the volume, their rates of emissions, and the rates at which pollutants react chemically or decay.
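The box-model balance just described can be illustrated with a minimal sketch (the parameter values and the forward Euler scheme are illustrative assumptions, not taken from any particular PAQSM):

```python
# Minimal box-model sketch: one well-mixed cell with advective
# inflow/outflow, an emission source, and first-order chemical loss.
# All parameter values are illustrative assumptions.

def box_model(c0, c_in, Q, V, E, k, dt, n_steps):
    """Integrate dc/dt = (Q/V)*(c_in - c) + E/V - k*c with forward Euler.

    c0:   initial concentration in the box
    c_in: background concentration advected in
    Q:    volumetric flow rate through the box
    V:    box volume
    E:    emission rate into the box
    k:    first-order chemical loss rate constant
    """
    c = c0
    for _ in range(n_steps):
        c += dt * ((Q / V) * (c_in - c) + E / V - k * c)
    return c

# With no emissions and no chemistry, the box relaxes toward the
# background concentration on the flushing time scale V/Q.
c_final = box_model(c0=0.0, c_in=0.05, Q=10.0, V=1000.0, E=0.0, k=0.0,
                    dt=1.0, n_steps=5000)
```

The single ordinary differential equation is the spatially averaged limit of Equation 6.1, which is why box models cannot represent any within-region variation.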
Since these models lack spatial resolution, they cannot be used in situations where the meteorological or emissions patterns vary significantly across the modeling region.

Grid Models: Grid models employ a fixed Cartesian reference system and divide the modeling region into a two- or three-dimensional array of uniform grid cells. Horizontal dimensions of each cell usually measure on the order of kilometers, while vertical dimensions can vary from a few meters to a few hundred meters. One example of the grid models is the Urban Airshed Model (UAM), briefly described in Section 5.3.

Grid models as currently designed are generally not capable of resolving pollutant concentrations at the microscale, that is, at scales smaller than the size of a grid cell. Thus, important local effects, such as high pollutant concentrations in the immediate vicinity of point sources, tend to become obscured by the averaging of pollutant concentrations over the grid cell volume.

There are practical and theoretical limitations to the minimum grid cell size. Increasing the number of cells increases computing and data acquisition effort and costs. In addition, the choice of a grid cell size implies that the input data, such as the wind flows, turbulence, and emissions, are resolved to that scale. In practice, most applications employ grid cell sizes of a few kilometers. Further, the assumptions involved in the formulation of the atmospheric diffusion equation make it applicable for grid models with horizontal resolution coarser than 2 km [134]. These models are used to assess the effectiveness of emissions reductions at an urban/regional scale, in essence considering the domain-wide effects of emissions reductions. As these models assume uniform conditions within each grid cell, all the local scale processes are lumped into one average quantity.

Trajectory Models: The trajectory models use a moving coordinate approach to describe pollutant transport.
Here, the atmospheric diffusion equation is solved in an air parcel of interest, which is assumed to travel solely with the horizontal wind. These models give results only along the path traversed by the air parcel, and do not permit construction of the spatial and temporal variation of concentrations over the entire region. However, their utility is in the calculation of the extreme concentrations that are typically found in the plumes from point sources.

Since the numerical grids used in grid models are considerably large (length scales of a few kilometers), an understanding of the "sub-grid" effects is important in order to assess local scale effects. In order to describe the sub-grid phenomena, with specific emphasis on establishing causal relationships between emissions from point sources (such as industrial stacks) and the pollutant concentrations in the neighborhood of the point sources, plume models are employed. Plume models are trajectory models that describe the short term behavior of the plume of emissions from a point source. The typical time scales range from 1 hour to 24 hours, with length scales ranging from a few hundred meters to a few kilometers. Such detail cannot be achieved by increasing the resolution of grid-based models, since the model formulation of the grid-based models is not valid at finer resolutions (i.e., length scales less than 2 km). The formulation of the grid models and of the plume models is discussed extensively in the literature [75,77,139,175,213].
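As a simple illustration of the Gaussian framework that underlies the dispersion parameters used by such plume models, the classical steady-state Gaussian plume solution can be sketched (the power-law dispersion coefficients below are assumed for illustration and are not the regulatory Pasquill-Gifford values):

```python
import math

# Sketch of a steady-state Gaussian plume from an elevated point source.
# The sigma power laws are assumed illustrative forms, not the
# Pasquill-Gifford curves used in regulatory practice.

def gaussian_plume(q, u, x, y, z, h):
    """Concentration at downwind distance x, crosswind offset y, and
    height z, for a source of strength q at effective height h in a
    uniform wind u; includes the ground-reflection image term."""
    sigma_y = 0.08 * x ** 0.9   # assumed horizontal dispersion (m)
    sigma_z = 0.06 * x ** 0.9   # assumed vertical dispersion (m)
    lateral = math.exp(-y ** 2 / (2.0 * sigma_y ** 2))
    vertical = (math.exp(-(z - h) ** 2 / (2.0 * sigma_z ** 2)) +
                math.exp(-(z + h) ** 2 / (2.0 * sigma_z ** 2)))
    return q / (2.0 * math.pi * u * sigma_y * sigma_z) * lateral * vertical
```

For a fixed downwind distance, the concentration is symmetric about the plume centerline and decays with crosswind offset; this is the cross-sectional structure that the cell discretization of the RPM, described next, is designed to resolve.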
6.3 Formulation of the Reactive Plume Model (RPM)

The Reactive Plume Model, version IV (RPM-IV), is a standard regulatory model used for estimating pollutant concentrations in the atmosphere resulting from the emissions from point sources such as industrial stacks [148,189]. It uses either point source emission estimates or initial plume concentrations as inputs, and calculates downwind concentrations as the plume expands. RPM-IV makes use of a diffusion/reaction equation similar in form to the atmospheric diffusion equation (Equation 6.1), but with dispersion parameters that evolve with time. This model considers a control volume that moves downwind along the plume trajectory, and solves the diffusion/reaction equation by considering the nonlinear photochemistry and the diffusion of chemical species from one cell into another.

Figure 6.1: Schematic depiction of the evolution of an atmospheric plume (adapted from [189])

In this model, the plume is modeled by a set of "cells" consisting of equal masses of pollutants, by assuming a Gaussian distribution of the pollutant mass about the plume centerline. As the plume expands, the individual cells expand in volume correspondingly. The transfer of pollutant mass across cell boundaries is modeled in two phases: (a) an "entrainment" phase, where the expanding cell boundaries entrain the pollutants from other cells, and (b) a "detrainment" phase, where the pollutants diffuse across cell boundaries due to concentration gradients. Further, the cells are modeled to expand in a manner such that the amount of an inert species remains constant within each cell. The expansion of the boundaries and the governing equations for pollutant concentrations within each cell are described in the following section.
Figure 6.2: Schematic depiction of entrainment and detrainment steps simulated in the RPM (adapted from [189])

6.3.1 Initial Physical Dimensions of the Plume

The total width W(s) and the total depth H(s) of the photochemical plume can be estimated as functions of the horizontal and vertical dispersion parameters, as follows:

    W(s) = f_y σ_y(s)  and  H(s) = f_z σ_z(s),    (6.2)

where s is the total distance traveled by the plume up to the time t under consideration, and f_y and f_z are selected as cutoff values of the plume boundaries. Based on the Gaussian distribution of the initial plume mass [175], f_y and f_z determine the amount of pollutant mass that is included within the model boundaries, as functions of standard normal quantiles. A typical value of f_y = f_z = 4.0 implies that 95.4% of the total pollutant mass is included within the plume boundaries.

For a plume described by M cells in the horizontal direction, with only one cell in the vertical, at the start of the simulation each cell contains an equal amount of pollutant mass. Since the pollutant mass is assumed to follow a Gaussian distribution, the cell boundaries y_j, and hence the width of the jth cell, w_j = y_j - y_{j-1}, can be calculated using the recursive equation

    erf( y_j / (√2 σ_y) ) - erf( y_{j-1} / (√2 σ_y) ) = (2/M) erf( f_y / (2√2) ),    (6.3)

where w_j is the width of the jth cell [189].

6.3.2 RPM-IV Model Equations

Based on mass balance, the governing equation of the Reactive Plume Model is given by:

    dc_j^i/dt = (dc_j^i/dt)_chem - (1/w_j)(dw_j/ds) u c_j^i - (1/h_j)(dh_j/ds) u c_j^i + F_j^i    (6.4)
                      (A)                 (B)                      (C)                  (D)

Here, term (A) describes the change in pollutant concentrations due to chemical transformation, term (B) the dilution as the jth cell expands in the horizontal, term (C) the dilution as the jth cell expands in the vertical, and term (D) the net flux of the ith contaminant into the cell. In this equation, c_j^i denotes the concentration of chemical species i in cell j, u denotes the speed at which the cells travel downwind, and w_j and h_j denote the width and height of the jth cell, respectively. Additionally, F_j^i is the net flux into the cell, and is given by F_j^i = E_j^i - D_j^i, where E_j^i is the entrainment as the cell expands, and D_j^i is the detrainment due to the diffusion of the species across the boundaries of the cell. Figure 6.2 depicts the processes of entrainment and detrainment. Expressions for entrainment and detrainment are given by the following equations:

    E_j^i = (1/w_j) [ (dy_j/dt) c_{j+1}^i - (dy_{j-1}/dt) c_j^i ]    (6.5)

    D_j^i = ∂/∂y ( K ∂c^i/∂y ) ≈ (2/w_j) [ K_j (c_{j+1}^i - c_j^i)/(y_{j+1} - y_{j-1}) - K_{j-1} (c_j^i - c_{j-1}^i)/(y_j - y_{j-2}) ]    (6.6)

where y_j is the distance from the plume centerline to the far side of cell j, in the horizontal. An additional condition is imposed on the expansion of the cells: the cells are allowed to expand in a way such that the amount of an inert species remains constant within each cell, i.e., no net transfer of an inert species occurs from a cell. This condition results in the equation E_j^I - D_j^I = 0 for the expansion of the cells as the plume travels downwind, and leads to a recursive relationship for computing K_j:

    (dy_j/dt) c_{j+1}^I - (dy_{j-1}/dt) c_j^I = 2 K_j (c_{j+1}^I - c_j^I)/(y_{j+1} - y_{j-1}) - 2 K_{j-1} (c_j^I - c_{j-1}^I)/(y_j - y_{j-2}),    (6.7)

where c_j^I is the concentration of an inert species I in the jth cell. Further, the condition of zero concentration gradient at the plume centerline results in the equation:

    K_1 = ( c_2^I (y_2 - y_0) / (2 (c_2^I - c_1^I)) ) (dy_1/dt)    (6.8)

The values of K_j can then be recursively obtained using the above two equations.
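Because the initial cells are defined to hold equal shares of a Gaussian mass distribution, the cell boundaries can be located with the error function. The sketch below illustrates this (a self-contained, assumed calculation, not RPM code):

```python
import math

# Sketch: locate horizontal cell boundaries so that each of M cells
# holds an equal share of the Gaussian plume mass lying within the
# cutoff width f*sigma_y. Illustrative only, not RPM code.

def cell_boundaries(sigma_y, f, m):
    # mass fraction of the Gaussian contained inside the plume cutoff
    total = math.erf(f / (2.0 * math.sqrt(2.0)))
    # equal-mass targets for erf(y_j / (sqrt(2)*sigma_y)), j = 0..m
    targets = [total * (2.0 * j / m - 1.0) for j in range(m + 1)]

    def inv_erf(t):
        # bisection inverse of erf on the interval [-f/2, f/2]
        lo, hi = -f / 2.0, f / 2.0
        for _ in range(80):
            mid = 0.5 * (lo + hi)
            if math.erf(mid) < t:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    return [math.sqrt(2.0) * sigma_y * inv_erf(t) for t in targets]

# Four cells across a plume with sigma_y = 100 m and cutoff f = 4:
bounds = cell_boundaries(sigma_y=100.0, f=4.0, m=4)
widths = [b - a for a, b in zip(bounds, bounds[1:])]
```

The outer cells come out wider than the inner ones, reflecting the equal-mass (rather than equal-width) discretization of the plume cross section.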
Once the K_j's are computed, Equations 6.4, 6.5, and 6.6 can be used to compute the concentration in each cell at different times.

In RPM-IV, the nonlinear plume photochemistry is modeled through the Carbon Bond Mechanism, version IV (CB-IV) [206]. CB-IV describes the complex nonlinear gas-phase atmospheric photochemistry through a set of 95 reactions among 35 surrogate chemical species corresponding to organic bonds/functional groups [206]. This mechanism lumps similarly bonded carbon atoms, resulting in a condensed mechanism that is widely used in regulatory photochemical models.

6.3.3 Limitations of the RPM-IV

Figure 6.1 presents a schematic depiction of the physical structure of a plume as it evolves from a stack and travels downwind. The assumption of a uniform vertical concentration in the RPM (hence, only one "well mixed" cell in the vertical) is valid only at distances much greater than the touchdown distance X_TD from the point source. This poses a limitation, since the typical touchdown distance can vary from a few hundred meters to a few kilometers, depending on the meteorological conditions, the height of the stack, and the exit velocity at the plume source. When the value of X_TD is large, the well-mixedness assumption is not valid in the vicinity of the plume, thus lessening the main advantage of using a plume model to study "sub-grid" scale, local phenomena. In such cases, if the RPM is used with the uniform vertical mixing assumption, significant "errors" could result in the model calculations.

In order to address this limitation of RPM-IV, a corresponding "three-dimensional" version of the RPM, called RPM-3D, was developed as part of this work. RPM-3D simulates the evolution of a plume by dividing the plume cross-section into rectangular regions consisting of equal initial pollutant mass; the plume cross section is divided into columns of cells, as in the case of RPM-IV, and each column is further subdivided into rows.
Figure 6.3: Discretization of the plume cross section by RPM-IV (left) and RPM-3D (right)

Figure 6.3 illustrates the physical structure of the plume cross section in RPM-IV and RPM-3D. In the figure, R1 is the region containing the cell closest to the plume centerline, R2 contains the ground-level cell along the plume centerline in the horizontal, and R3 contains the centerline cells with respect to both the horizontal and the vertical. Here, R1 corresponds to the representation used by RPM-IV, whereas R2 and R3 correspond to representations used by RPM-3D. In typical cases, the extreme pollutant concentration occurs along the plume centerline, and the pollutant concentration decreases as the distance of the cells from the centerline increases. Hence, the extreme pollutant concentration (e.g., the concentration of ozone) calculated by RPM-IV is an average over the region R1, and this single averaged concentration is what is available for the estimation of exposure to ozone. However, as shown in the figure, the region R2 corresponds to the most likely ground level maximum ozone concentration (typically covering the region from the ground up to less than a hundred meters), whereas the region R3 corresponds to the maximum ozone concentration at the plume centerline. Clearly, RPM-3D provides estimates of pollutant concentrations at a detailed spatial resolution that is appropriate for estimating human exposure.

6.4 Formulation of the RPM-3D

The equations describing the transport of chemical species in RPM-IV are modified accordingly to accommodate the three-dimensional description of the plume. Thus, Equation 6.4 is modified as follows:

    dc_{j,k}^i/dt = (dc_{j,k}^i/dt)_chem - (1/w_{j,k})(dw_{j,k}/ds) u c_{j,k}^i - (1/h_{j,k})(dh_{j,k}/ds) u c_{j,k}^i + F_{Y,j,k}^i + F_{Z,j,k}^i    (6.9)

where c_{j,k}^i is the concentration of the ith species in the cell (j, k); j denotes the cell number in the horizontal direction, and k denotes the cell number in the vertical. F_{Y,j,k}^i and F_{Z,j,k}^i denote the net fluxes along the horizontal and vertical directions, respectively. The equations for the calculation of the fluxes are similar to Equations 6.5 and 6.6:

    F_{Y,j,k}^i = E_{Y,j,k}^i - D_{Y,j,k}^i,    F_{Z,j,k}^i = E_{Z,j,k}^i - D_{Z,j,k}^i    (6.10)

with the entrainment terms analogous to Equation 6.5, and the detrainment terms given by:

    D_{Y,j,k}^i ≈ (2/(y_{j,k} - y_{j-1,k})) [ K_{Y,j,k} (c_{j+1,k}^i - c_{j,k}^i)/(y_{j+1,k} - y_{j-1,k}) - K_{Y,j-1,k} (c_{j,k}^i - c_{j-1,k}^i)/(y_{j,k} - y_{j-2,k}) ]    (6.11)

    D_{Z,j,k}^i ≈ (2/(z_{j,k} - z_{j,k-1})) [ K_{Z,j,k} (c_{j,k+1}^i - c_{j,k}^i)/(z_{j,k+1} - z_{j,k-1}) - K_{Z,j,k-1} (c_{j,k}^i - c_{j,k-1}^i)/(z_{j,k} - z_{j,k-2}) ]    (6.12)

where y_{j,k} and z_{j,k} denote the distances from the plume centerlines to the far sides of cell (j, k), in the horizontal and vertical directions respectively. K_{Z,j,k} and K_{Y,j,k} are obtained using recursive relationships similar to Equation 6.7; the relationships follow from the conditions that (a) F_{Y,j,k}^I and F_{Z,j,k}^I are equal to zero for an inert species I, and (b) there is a zero concentration gradient at the plume centerline in both the horizontal and vertical directions.

RPM-IV uses the Gear algorithm [74] to solve the stiff set of equations describing the chemical reactions. In this method, the step size for each successive calculation is adjusted according to the local truncation errors. The same approach is followed in RPM-3D.

Figure 6.4: Dependence of plume average O3 concentration on horizontal resolution of the RPM-IV in a VOC dominant regime

6.5 Case Studies

The RPM-IV and RPM-3D are both applied to two significantly different case studies.
An attempt is made to understand the extent of the uncertainty associated with the model assumptions and model resolution by comparing the predictions of these two models.

The first case study uses emission estimates and field data measured near the Marathon Oil Refinery at Robinson, Illinois, during June and July 1977 [177]. This case study involves a primarily VOC dominant source (i.e., the volatile organic compound emissions are significantly higher than the NO/NO2 emissions). The meteorological data and initial plume concentrations were obtained from the work of Georgopoulos and Roy [76], which was based on the measurements reported by Sexton et al. [177].

Figure 6.4 shows the average ozone concentrations in the plume, calculated by RPM-IV, as a function of the plume evolution time for different horizontal resolutions, ranging from two cells to ten cells in the horizontal. Similarly, Figure 6.5 shows the average plume concentrations calculated by RPM-3D for different vertical resolutions with four cells in the horizontal. The calculations from RPM-IV, which has no vertical resolution, and RPM-3D with vertical resolution corresponding to two cells, four cells, and six cells, are shown in the figure. The results indicate that the different vertical and horizontal resolutions produce similar estimates of the average plume concentrations.

Figure 6.5: Dependence of plume average O3 concentration on vertical resolution of the RPM-3D in a VOC dominant regime (4 horizontal cells)

Figure 6.6: Horizontal profile of the O3 concentration in the plume in a VOC dominant regime (6 horizontal cells, RPM-IV)
Hence, it is sufficient to use a low-resolution two-dimensional model for studying the plume average concentrations in such cases.

Figure 6.6 shows the ozone concentrations within each horizontal cell calculated by RPM-IV when the plume cross section is divided into six cells. The plots indicate the average ozone concentrations in the different horizontal cells (indicated by "A", "B", and "C" in Figure 6.3 for RPM-IV). Similarly, Figure 6.7 shows the vertical profile of the ozone concentrations calculated by RPM-3D for the centerline horizontal cell when the plume is divided into four horizontal cells and each cell is in turn divided into six vertical cells. The plots indicate the average of the vertical cells along the horizontal centerline, and the individual vertical cells along that centerline. The cells are illustrated in Figure 6.3: the horizontal centerline cell is given by "A", and the corresponding vertical slices are "AZ" (outermost), "AY" (middle), and "AX" (centerline).

Figure 6.7: Vertical profile of the O3 concentration at the plume centerline (w.r.t. horizontal) in a VOC dominant regime (4 horizontal cells and 6 vertical cells, RPM-3D)

Examining the results, it is clear that by ignoring the vertical structure of a plume, RPM-IV approximates the highest possible ground level concentration by the average in the vertical. Hence, in a VOC dominant regime, this would clearly produce an over-estimation of the ground level ozone levels; conversely, in a NOx dominant regime, it would produce an under-estimation of the ground level ozone levels.
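The vertical-averaging effect just described can be illustrated numerically with a hypothetical vertical ozone profile (the numbers are illustrative, not model output): when ozone increases with height above the ground cell, the single well-mixed vertical cell over-states the ground-level value.

```python
# Hypothetical vertical ozone profile (ppm) for the centerline column,
# ordered from the ground cell to the plume top; values are illustrative.
profile = [0.080, 0.095, 0.105, 0.115, 0.120, 0.125]

well_mixed = sum(profile) / len(profile)  # single vertical cell (RPM-IV view)
ground_level = profile[0]                 # resolved ground cell (RPM-3D view)

bias = well_mixed - ground_level          # positive: ground level over-estimated
```

With the gradient reversed (higher ozone near the ground than aloft), the same averaging under-estimates the ground-level value, mirroring the NOx dominant behavior noted above.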
Figure 6.8: Dependence of plume average O3 concentration on horizontal resolution of the RPM-IV in a NOx dominant regime

The second case study considers a single point source in Mercer County, New Jersey. The simulation is performed for the period of July 7, 1988, 8:00 am to 10:00 am, corresponding to the occurrence of a severe ozone episode in the region. This case study involves a NO/NO2 dominant regime. Ambient pollutant concentrations were obtained from the results of an Urban Airshed Model (UAM-IV) simulation for the Philadelphia/New Jersey modeling domain for the time period of July 6 to July 8, 1988, described by Georgopoulos et al. [78]. The meteorological inputs were obtained from the model inputs of the UAM-IV simulation for the modeling domain [78]. The model results are analyzed in the same manner as described in the earlier case study involving a VOC dominant regime.

Figure 6.8 shows the average ozone concentrations in the plume, calculated by RPM-IV, as a function of the plume evolution time for different horizontal resolutions, ranging from two cells to ten cells in the horizontal. Similarly, Figure 6.9 shows the average plume concentrations calculated by RPM-3D for different vertical resolutions with four cells in the horizontal. The calculations from RPM-IV, which has no vertical resolution, and RPM-3D with vertical resolution corresponding to two cells, four cells, and six cells, are shown in the figure. The results indicate that the different vertical and horizontal resolutions produce similar estimates of the average plume concentrations. Hence, it is sufficient to use a low-resolution two-dimensional model for studying the plume average concentrations in such cases.
Figure 6.9: Dependence of plume average O3 concentration on vertical resolution of the RPM-3D in a NOx dominant regime

Figure 6.10: Horizontal profile of the O3 concentration in the plume in a NOx dominant regime (6 horizontal cells, RPM-IV)

Figure 6.10 shows the ozone concentrations within each horizontal cell calculated by RPM-IV when the plume cross section is divided into six cells. Similarly, Figure 6.11 shows the vertical profile of the ozone concentrations calculated by RPM-3D.

Such knowledge is very useful in building an "optimal model", one that produces outputs similar to those of more detailed models, but requires much less detailed inputs and far fewer computational resources. Further, results from models with varying detail provide an estimate of the range of model calculations, thus helping to characterize the uncertainty associated with model formulation and detail.

Chapter 7

CONCLUSIONS AND DISCUSSION

The application of transport-transformation models involves significant uncertainties that may have implications for the confidence in model estimates and for their use in the assessment of possible alternative scenarios in a regulatory framework, for example, with respect to control strategy selection for decreasing pollutant levels. Hence, it is important to address these uncertainties. However, the main limitations in performing comprehensive uncertainty analyses of transport-transformation models are the associated cost and effort, and the increased data needs for characterizing uncertainties in inputs.
Clearly, there is a need for methods that can reduce the cost and effort associated with uncertainty analysis of transport-transformation models.

7.1 Development and Application of the SRSM

One of the main contributions of this research is the development of computationally efficient methods for uncertainty propagation, specifically, the development of the Stochastic Response Surface Method (SRSM) (Chapter 3). The SRSM approximates the model inputs and the outputs through series expansions in "well behaved" standard random variables (srvs); the series expansions of the outputs contain unknown coefficients, which are calculated by a method that uses the results of a limited number of model simulations. The SRSM is based on the Stochastic Finite Element Method [80] and the Deterministic Equivalent Modeling Method [191].

The SRSM has been implemented as a modular and readily portable stand-alone tool that can be used by researchers as a black-box tool, without requiring familiarity with the mathematical details of the SRSM. In fact, the SRSM has been implemented as a web-based tool that can be accessed through a web browser, as shown towards the end of Chapter 3. This implementation includes modules for (a) the transformation of non-standard distributions into functions of the srvs, (b) the construction of polynomial chaos approximations for the model outputs, (c) the identification of sample points for model runs, (d) the estimation of the unknown coefficients in the approximations, and (e) the subsequent calculation of the statistical properties of the model outputs.

Further, mathematical formulae have been developed and presented for the extension of this method to deal with arbitrary empirical distributions (with correlated and uncorrelated inputs), and to estimate correlations between two model outputs, and between a model input and a model output.
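The core idea of the SRSM can be sketched for a toy model with one random input (the model, the input distribution, and the collocation points below are chosen for illustration; this is not the SRSM tool itself):

```python
import math, random

# Toy SRSM-style sketch: one uncertain input expressed through a
# standard normal srv xi, and a 2nd-order Hermite polynomial chaos
# expansion of the output,
#     y ~= a0 + a1*xi + a2*(xi**2 - 1),
# with the three coefficients fit from just three model runs.

def model(x):
    return x / (1.0 + x)                 # hypothetical deterministic model

def input_from_srv(xi):
    return math.exp(0.1 + 0.3 * xi)      # assumed lognormal model input

pts = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]  # roots of the 3rd Hermite polynomial
runs = [model(input_from_srv(xi)) for xi in pts]

def solve3(A, b):
    """Gauss-Jordan elimination (no pivoting; fine for this small system)."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(3):
        M[i] = [v / M[i][i] for v in M[i]]
        for j in range(3):
            if j != i:
                M[j] = [vj - M[j][i] * vi for vj, vi in zip(M[j], M[i])]
    return [M[i][3] for i in range(3)]

A = [[1.0, xi, xi * xi - 1.0] for xi in pts]
a0, a1, a2 = solve3(A, runs)

# Output statistics follow directly from the expansion coefficients:
pce_mean = a0
pce_var = a1 ** 2 + 2.0 * a2 ** 2
```

A crude Monte Carlo run of the same toy model gives a mean close to pce_mean, but at the cost of many thousands of model evaluations instead of three.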
The development and application of the SRSM meets the need for fast and accurate estimates of the uncertainties associated with model outputs, and also the need for identifying the inputs that contribute most to the output uncertainties. The main impact of this research is the facilitation of fast uncertainty analyses of complex, computationally intensive models, and the estimation of the relative contribution of each model input to the uncertainties in the outputs.

The SRSM has been evaluated for case studies involving the following models: (a) a biological model (a Physiologically Based PharmacoKinetic (PBPK) model, i.e., a lumped system described by a set of stiff ODEs) describing the uptake and metabolism of perchloroethylene in humans, (b) a two-dimensional atmospheric photochemical plume model that calculates pollutant concentrations downwind of point sources, the Reactive Plume Model (RPM), (c) a three-dimensional urban/regional scale air quality model, the Urban Airshed Model (UAM), version IV, and (d) a one-dimensional ground water model with discontinuous probability density functions, the EPA's Composite Model for Leachate Migration with Transformation Products (EPACMTP).

In the first two cases, the performance of the SRSM was significantly superior to both the Monte Carlo and the Latin Hypercube Sampling methods: the SRSM required up to an order of magnitude fewer simulations for estimating the uncertainties in the model outputs. In the third case study, involving the Urban Airshed Model, it was impractical to perform hundreds of Monte Carlo simulations, whereas the SRSM required just twenty simulations to estimate the output uncertainties; this demonstrates the potential of the SRSM to address problems that are beyond the scope of conventional methods (in terms of computational and time limitations).
In the last case study, involving the ground water model EPACMTP, the SRSM, as before, required significantly fewer model runs than the Monte Carlo method.

7.2 Development and Application of the SRSM-ADIFOR

In order to further improve the computational efficiency of the SRSM, this method was coupled with a state-of-the-art automated sensitivity analysis tool, ADIFOR (Automatic Differentiation of FORtran) [17]. ADIFOR produces fast and accurate estimates of first order partial derivatives of model outputs with respect to model inputs or any intermediate quantities in the model code. ADIFOR accomplishes this by rewriting the model code, producing code that calculates first order partial derivatives. Hence, the coupling of the SRSM and ADIFOR, where the series expansions of the SRSM are used in conjunction with the outputs and partial derivatives from the ADIFOR-generated code, provides the potential to substantially reduce computational demands.

The coupled method, SRSM-ADIFOR, was applied to two of the case studies used for evaluating the SRSM: a human PBPK model, and the Reactive Plume Model (RPM). The case studies indicate substantial computer time savings; up to two orders of magnitude reductions in the required number of model simulations compared to "conventional" methods were achieved. This demonstrates the advantages of combining sensitivity analysis methods with uncertainty propagation methods.

The application of the SRSM-ADIFOR involves processing the original model code using ADIFOR, and the subsequent use of the derivative code for uncertainty analysis. For this reason, developing a "black-box tool" for the combined SRSM-ADIFOR approach (as was done for the "stand-alone" SRSM) is not possible.
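ADIFOR itself is a source transformation tool for Fortran; the principle it relies on, forward-mode differentiation in which each intermediate value carries its derivative, can be sketched with a minimal dual-number class (an illustration of the idea only, unrelated to ADIFOR's actual generated code):

```python
import math

# Minimal forward-mode automatic differentiation using dual numbers:
# every intermediate value carries d(value)/d(input) alongside it, so
# derivatives are exact (no finite-difference truncation error).

class Dual:
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)  # product rule
    __rmul__ = __mul__

def exp(d):
    e = math.exp(d.val)
    return Dual(e, e * d.der)   # chain rule

# "Model code": y = x * exp(x); seeding der = 1.0 selects d/dx.
x = Dual(2.0, 1.0)
y = x * exp(x)
```

Running the "model" once yields both y.val and the exact derivative y.der = (1 + x) exp(x), which is the kind of by-product the SRSM-ADIFOR coupling exploits when fitting expansion coefficients.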
7.3 Consideration of Uncertainties Beyond Parametric/Data Uncertainty

In addition to the uncertainties associated with model inputs, there are often uncertainties associated with model formulation and structure. Model uncertainty arises mainly when competing models, ranging from simplified to more detailed, exist. When a model application involves both model and data uncertainties, it is important to identify the relative magnitudes of the uncertainties associated with data and model formulation. Such a comparison is useful in focusing resources where it is most appropriate (e.g., filling data gaps versus refining a model).

The case study involving the development of a three-dimensional version of the Reactive Plume Model (RPM-3D, Chapter 6) examines the issues associated with the characterization of uncertainties related to model formulation. Lack of sufficient model resolution could in fact result in model predictions that significantly over-estimate or under-estimate the quantities that are important for decision making. The case study stresses the need for comprehensive hierarchical models, for the systematic evaluation of simplified models.

7.4 Directions for Future Work

7.4.1 Improvements to the Stochastic Response Surface Method

In its present form, the SRSM includes modules for the transformation of probability distributions commonly used to represent uncertainties in the inputs of environmental and biological models. The transformations are listed in Table 3.1. However, this list is not exhaustive, since model inputs can follow one of several other distributions. In order to extend this list, the methods outlined for transforming random variables that follow empirical distributions (Section 3.2.3) and for transforming random variables that are correlated (Section 3.2.4) can be incorporated into the SRSM tool.
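The general recipe behind such transformations, X = F_inv(Phi(xi)) with xi a standard normal srv, Phi the standard normal CDF, and F the target CDF, can be sketched for an exponential input (an illustrative example, not the SRSM tool's code):

```python
import math, random

# Sketch of the inverse-CDF transformation that expresses a non-normal
# model input as a function of a standard normal srv xi:
#     X = F_inv(Phi(xi)),
# where Phi is the standard normal CDF and F is the target CDF.
# The exponential target distribution here is an illustrative choice.

def phi(xi):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(xi / math.sqrt(2.0)))

def exponential_from_srv(xi, lam):
    u = phi(xi)                       # uniform(0, 1) by the probability integral transform
    return -math.log(1.0 - u) / lam   # inverse CDF of Exponential(lam)

# Sampling xi ~ N(0, 1) and transforming reproduces the target distribution.
random.seed(1)
samples = [exponential_from_srv(random.gauss(0.0, 1.0), lam=2.0)
           for _ in range(200000)]
sample_mean = sum(samples) / len(samples)  # should approach 1/lam = 0.5
```

Expressing every input as a deterministic function of srvs in this way is what allows the polynomial chaos expansion of the outputs to be written in the srvs alone.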
The SRSM calculates probability density functions (pdfs) to quantify the uncertainties in model outputs. Additionally, several other statistical metrics, such as the cumulative distribution function, moments of the output metrics, and percentile estimates, can be calculated; this can be accomplished by following the techniques outlined in Section 3.5. Finally, the methods for calculating correlations between outputs, and between an input and an output, presented in Section 3.5, can also be incorporated into the SRSM tool. This would facilitate the identification of the individual contributions of model inputs to the uncertainties in model outputs, in a more rigorous manner than that presented in relation to the case study in Chapter 5 (Table 5.2). These improvements could further enhance the already wide applicability of the SRSM. Furthermore, the same improvements carry over to the SRSM-ADIFOR method and to the web-based SRSM tool.

7.4.2 Further evaluation of the SRSM and the SRSM-ADIFOR

Further rigorous, statistically based evaluation of the SRSM and the SRSM-ADIFOR, based not only on the pdfs of model outputs but also on the moments, cumulative distributions, percentiles, and correlation coefficients, would aid in the wider acceptance of these methods. Additionally, application of the SRSM and the SRSM-ADIFOR to more complex models could help identify areas where these methods can be further improved.

7.4.3 Addressing uncertainty propagation under constraints

The case study evaluating the SRSM with a groundwater model revealed certain limitations in applying the SRSM to models with discontinuous probability distributions. Transport-transformation models sometimes have constraints on the values the model inputs can assume; often, the constraints depend on the values of other model inputs.
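Once the expansion coefficients are known, statistics of this kind can be estimated cheaply by sampling the polynomial chaos surrogate instead of the original model. The sketch below uses illustrative, made-up coefficients for a one-dimensional second-order expansion; it is not drawn from any of the case studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative second-order polynomial chaos expansion
# y = a0 + a1*xi + a2*(xi^2 - 1) in a standard normal srv xi.
a0, a1, a2 = 3.0, 0.8, 0.2

# Sampling the surrogate is nearly free compared to a model run.
xi = rng.standard_normal(200_000)
y = a0 + a1 * xi + a2 * (xi**2 - 1.0)

mean = y.mean()                      # analytically equal to a0
var = y.var()                        # analytically a1^2 + 2*a2^2
p5, p95 = np.percentile(y, [5, 95])  # percentile estimates
cdf_at_3 = (y <= 3.0).mean()         # empirical CDF value at y = 3
```

For a Hermite expansion the mean and variance are also available in closed form from the coefficients (mean = a0, variance = a1² + 2·a2²), so the sampling step is only needed for percentiles and the full distribution.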
For instance, the sampled value of a random input x may affect the range from which another random input y is sampled. While many constraints are defined in terms of joint pdfs, in some cases the constraints follow a discontinuous pattern; for example, the diameter and porosity of a particle may not assume certain combinations in the EPACMTP model (Chapter 5). The Monte Carlo method can address such constraints by following the rules listed below:

• generate a sample point by randomly sampling all inputs from their respective distributions, ignoring the constraints,
• discard the sample point if any constraint is not satisfied, and repeat the above step until an acceptable sample point is obtained,
• run the model at the accepted sample point,
• repeat the above procedure until the required number of sample points is obtained, and
• statistically analyze the model outputs corresponding to all the sample points.

It must be noted that the samples are drawn from the given pdfs and also obey the constraints; that is an advantage of using Monte Carlo methods in a brute-force manner. In the SRSM, such an approach does not appear to be readily applicable, because the inputs are represented as algebraic functions of the srvs. This means that the inputs have pdfs that are continuous and well behaved (as opposed to the actual pdfs, which are discontinuous).
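The rules above amount to rejection sampling. A minimal sketch follows, with a hypothetical constraint and a hypothetical model standing in for the EPACMTP diameter/porosity restriction; both are illustrative only.

```python
import random

random.seed(1)

def constraint_ok(d, phi):
    """Hypothetical restriction on allowed (diameter, porosity) pairs."""
    return not (d > 0.8 and phi > 0.8)

def model(d, phi):
    """Hypothetical model output."""
    return d * phi

outputs = []
n_required = 1000
while len(outputs) < n_required:
    # Sample all inputs from their distributions, ignoring constraints ...
    d = random.uniform(0.0, 1.0)
    phi = random.uniform(0.0, 1.0)
    # ... discard the point if any constraint is violated ...
    if not constraint_ok(d, phi):
        continue
    # ... and run the model at the accepted sample point.
    outputs.append(model(d, phi))

# The accepted sample can now be analyzed statistically.
mean_output = sum(outputs) / len(outputs)
```

The accepted points are, by construction, drawn from the constrained joint distribution, at the cost of wasting the rejected draws; the cost is only acceptable here because no model run is spent on a rejected point.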
One possible approach that could be followed is suggested here:*

• express the inputs in terms of srvs, ignoring the constraints,
• approximate the outputs using the polynomial chaos expansions,
• generate the sample points for the srvs without considering the constraints,
• estimate the unknown coefficients in the series expansion through regression on the model outputs at these sample points, and
• calculate the statistical properties of the model outputs from the srvs using the methods listed in Section 3.5, in the process rejecting the combinations of srvs that do not obey the constraints.

One potential focus of future efforts could be the evaluation of this approach, in addition to the identification of other techniques for addressing uncertainty propagation under constraints.

*This approach has not been tested, and its applicability is still unknown.

7.4.4 Random processes and random fields

The SRSM, in its present form, addresses only random variables, i.e., random quantities that do not vary with time or space. From the perspective of the uncertainties that occur in nature, random variables are analogous to points in a multi-dimensional space. Random processes and random fields are generalizations of random variables to multi-dimensional spaces. A random process can be considered as a function of time in the random space, y(t, ζ), where t denotes the time dimension and ζ denotes the randomness. Here, a particular realization of y can be considered as a deterministic function y(t, ζᵢ), where ζᵢ denotes one realization of the various possible functions that the random process y can assume. Further, at a given time t₁, y reduces to a random variable.
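This notation can be made concrete with a small numerical sketch. The process itself (a sinusoid with a random amplitude) is purely illustrative and stands in for, e.g., a time-varying emission rate.

```python
import numpy as np

rng = np.random.default_rng(42)

t = np.linspace(0.0, 10.0, 101)   # time grid
n_realizations = 500

# A simple random process x(t, zeta): a sinusoid whose amplitude A is
# random. Each row of x is one realization x(t, zeta_i), i.e. a single
# deterministic time function.
A = rng.normal(loc=1.0, scale=0.2, size=(n_realizations, 1))
x = A * np.sin(t)                  # shape: (realizations, times)

x_single_function = x[0, :]        # x(t, zeta_0): one time function
x_at_t1 = x[:, 25]                 # x(t1, zeta): a random variable
x_single_number = x[0, 25]         # x(t1, zeta_0): a single number
```

Slicing the array along one axis or the other reproduces exactly the four cases distinguished in the summary of Papoulis [155].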
The relationship between random processes, random variables, deterministic functions, and deterministic variables can be summarized as follows [155]:

• x(t, ζ) is a stochastic process, or a family of time functions,
• x(t₁, ζ) is a random variable,
• x(t, ζᵢ) is a single time function, and
• x(t₁, ζᵢ) is a single number.

Similarly, random fields are random functions of spatial coordinates and time. Random processes and random fields are common in environmental and biological systems: for example, the flow rate of, or emissions from, a point source are random processes, while emissions from area sources and wind patterns over a domain are examples of random fields. Further research could focus on extending the SRSM so that it can address uncertainty propagation involving model inputs defined by random processes and random fields.

7.4.5 Uncertainties Associated with Evaluation Data

Characterization of the uncertainty associated with evaluation data is an important component of a comprehensive uncertainty analysis. The uncertainties in the evaluation data may in fact affect the selection of an appropriate model from a set of alternative models. Comparisons of parametric and evaluation data uncertainties can provide insight into where available resources should be focused (input data versus evaluation data). Various techniques (such as those used in the spatio-temporal analysis of environmental data, e.g., Vyas and Christakos [35,202]) could be explored for their potential to improve the characterization of uncertainties in the environmental databases that are used to evaluate transport-transformation models.

Bibliography

[1] A. E. Abdin and J. J. Kaluarachchi. Stochastic analysis of three-phase flow in heterogeneous porous media. 1. Spectral/perturbation approach. Water Resources Research, 33(7):1549-1558, 1997.
[2] A. E. Abdin and J. J. Kaluarachchi. Stochastic analysis of three-phase flow in heterogeneous porous media. 2. Numerical simulations.
Water Resources Research, 33(7):1559-1566, 1997.
[3] G. Adomian. Applied stochastic processes. In G. Adomian, editor, Stochastic System Analysis, pages 1-17. Academic Press, New York, 1980.
[4] K. I. Al-Wali and P. J. Samson. Preliminary sensitivity analysis of urban airshed model simulations to temporal and spatial availability of boundary layer wind measurements. Atmospheric Environment, 30(12-2):2027-2042, 1996.
[5] G. Alefeld and J. Herzberger. Introduction to Interval Computations. Academic Press, New York, 1983.
[6] K. E. Atkinson. An Introduction to Numerical Analysis, Second Edition. John Wiley & Sons, New York, 1988.
[7] R. Atkinson. Gas-phase tropospheric chemistry of organic compounds: A review. Atmos. Environ., 24A(1):1-41, 1990.
[8] B. M. Ayyub and Kwan-Ling Lai. Selective sampling in simulation-based reliability assessment. International Journal of Pressure Vessels & Piping, 46(2):229-249, 1991.
[9] B. M. Ayyub and Kwan-Ling Lai. Structural reliability assessment with ambiguity and vagueness in failure. Naval Engineers Journal, 104(3):21-35, 1992.
[10] T. B. Bahder. Mathematica for Scientists and Engineers. Addison-Wesley, Reading, MA, USA, 1994.
[11] D. Baldocchi, A. Guenther, P. Harley, et al. The fluxes and air chemistry of isoprene above a deciduous hardwood forest. Philosophical Transactions of the Royal Society of London, Series A: Physical Sciences & Engineering, 315(1696):279-296, 1995.
[12] T. S. Barrett. Probabilistic Design for Rotordynamic Simulations using Receptance-based Reanalysis. PhD thesis, Texas A&M University, 1996.
[13] J. F. M. Barthelemy and L. E. Hall. Automatic differentiation as a tool in engineering design. Structural Optimization, 9(2):76-82, 1995.
[14] L. R. Bauer and D. M. Hamby. Relative sensitivities of existing and novel model parameters in atmospheric tritium dose estimates. Radiation Protection Dosimetry, 37(4):253-260, 1991.
[15] J. C. Bezdek.
Fuzzy models — what are they, and why? Transactions on Fuzzy Systems, 1(1):1-6, 1993.
[16] C. Bischof, A. Carle, P. Khademi, and A. Mauer. The ADIFOR 2.0 System for the Automatic Differentiation of Fortran 77 Programs. Preprint MCS-P481-1194, Mathematics and Computer Science Division, Argonne National Laboratory, and CRPC-TR94491, Center for Research on Parallel Computation, Rice University, 1994.
[17] C. H. Bischof, P. Khademi, A. Mauer, and A. Carle. ADIFOR 2.0 - automatic differentiation of Fortran 77 programs. IEEE Computational Science & Engineering, 3(3):18-32, 1996.
[18] C. H. Bischof, G. Pusch, and R. Knoesel. Sensitivity analysis of the MM5 weather model using automatic differentiation. Computers in Physics, 6(10):605-612, 1996.
[19] C. H. Bischof, L. Roh, and A. J. Mauer-Oats. ADIC - an extensible automatic differentiation tool for ANSI-C. Software-Practice & Experience, 27(12):1427-1456, 1997.
[20] P. Bjerager. On computation methods for structural reliability analysis. Structural Safety, 9(2):79-96, 1990.
[21] G. E. P. Box and N. R. Draper. Empirical Model-Building and Response Surfaces. John Wiley & Sons, New York, 1987.
[22] G. E. P. Box, W. G. Hunter, and J. S. Hunter. Statistics for Experimenters: An Introduction to Design, Data Analysis and Model Building. John Wiley & Sons, New York, 1978.
[23] K. Breitung. Probability approximations by log likelihood maximization. Journal of Engineering Mechanics-ASCE, 117(3):457-477, 1991.
[24] R. P. Broadwater, H. E. Shaalan, and W. J. Fabrycky. Decision Evaluation with Interval Mathematics: A Power Distribution System Case Study. IEEE Transactions on Power Delivery, 9:59-65, 1994.
[25] W. Bryc. The Normal Distribution. LNS 100. Springer-Verlag, New York, 1995.
[26] W. Brzakala and W. Pula. Probabilistic analysis of foundation settlements. Computers & Geotechnics, 18(4):291-309, 1996.
[27] C. G. Bucher and U. Bourgund.
Fast and efficient response surface approach for structural reliability problems. Structural Safety, 7(1):57-66, 1990.
[28] G. R. Camilo. Food Web Analysis of Macrobenthic Riffle Communities (South Llano River, Texas). PhD thesis, Texas Tech University, 1992.
[29] G. R. Carmichael, A. Sandu, and F. A. Potra. Sensitivity analysis for atmospheric chemistry models via automatic differentiation. Atmospheric Environment, 31(3):475-489, 1997.
[30] M. C. Causley. User's Guide for the Urban Airshed Model. Volume IV: User's Manual for the Emissions Preprocessor System. (U.S. EPA-450/4-90-007D), U.S. Environmental Protection Agency, Research Triangle Park, NC, 1990.
[31] Y. F. Chang and G. Corliss. ATOMFT - solving ODEs and DAEs using Taylor series. Computers & Mathematics with Applications, 28(10-12):209-233, 1994.
[32] H. M. L. Chaves and M. A. Nearing. Uncertainty analysis of the WEPP soil erosion model. Transactions of the ASAE, 34(6):2437-2444, 1991.
[33] L. Chen, H. Rabitz, D. B. Considine, C. H. Jackman, and J. A. Shorter. Chemical reaction rate sensitivity and uncertainty in a two-dimensional middle atmospheric ozone model. Journal of Geophysical Research-Atmospheres, 102(D13):16201-16214, 1997.
[34] S. Chinchalkar. Application of automatic differentiation to problems in engineering analysis. Computer Methods in Applied Mechanics & Engineering, 118(1-2):197-207, 1994.
[35] G. Christakos and V. M. Vyas. A composite space-time study of ozone distribution over the eastern United States. Atmospheric Environment, 32(16):2845-2857, 1998.
[36] R. A. Christensen. Fitting Distributions to Data, Volume XI. Entropy Ltd., 1994.
[37] K. C. Chun. Uncertainty Data Base for Emissions-Estimation Parameters: Interim Report. ANL/EES-TM-328, Argonne National Laboratory, Argonne, Illinois, 1987.
[38] L. Cizelj, B. Mavko, and H. Riesch-Oppermann.
Application of first and second order reliability methods in the safety assessment of cracked steam generator tubing. Nuclear Engineering & Design, 147(3):359-368, 1994.
[39] D. C. Collins and R. Avissar. An evaluation with the Fourier amplitude sensitivity test (FAST) of which land-surface parameters are of greatest importance in atmospheric modeling. Journal of Climate, 7(5):681-703, 1994.
[40] M. E. Coltrin, R. J. Kee, and F. M. Rupley. Surface CHEMKIN: A general formalism and software for analyzing heterogeneous chemical kinetics at a gas-surface interface. International Journal of Chemical Kinetics, 23:1111, 1991.
[41] G. Corliss, A. Griewank, T. Robey, and S. Wright. Automatic differentiation applied to unsaturated flow — ADOL-C case study. Technical Memorandum ANL/MCS-TM-162, Mathematics and Computer Science Division, Argonne National Laboratory, April 1992.
[42] NRC (National Research Council). Rethinking the Ozone Problem in Urban and Regional Air Pollution. National Academy Press, Washington, DC, 1991.
[43] L. A. Cox. Reassessing benzene risks using internal doses and Monte-Carlo uncertainty analysis. Environmental Health Perspectives, 104(Suppl 6):1413-1429, 1996.
[44] E. L. Crump, T. L. Jacobs, and P. A. Vesilind. Fuzzy-set Approach for Optimizing Sludge Application Land Selection. Journal of Urban Planning and Development, 119:53-71, 1993.
[45] Decisioneering, Inc, Denver, CO. Crystal Ball 4.0 for Risk Analysis.
[46] R. G. Derwent. Treating uncertainty in models of the atmospheric chemistry of nitrogen compounds. Atmospheric Environment, 21(6):1445-1454, 1987.
[47] L. Devroye. Non-Uniform Random Variate Generation. Springer-Verlag, New York, 1986.
[48] M. Dobmann, M. Liepelt, and K. Schittkowski. Algorithm 746 - PCOMP - A Fortran code for automatic differentiation. ACM Transactions on Mathematical Software, 21(3):233-266, 1995.
[49] J. D. Doll and D. L. Freeman. Randomly Exact Methods. Science, 234:1356-1360, 1986.
[50] W. M. Dong, W. L. Chiang, and F. S. Wong. Propagation of uncertainties in deterministic systems. Computers & Structures, 26(3):415-423, 1987.
[51] C. H. Dou, W. Woldt, I. Bogardi, and M. Dahab. Steady state groundwater flow simulation with imprecise parameters. Water Resources Research, 31(11):2709-2719, 1995.
[52] C. H. Dou, W. Woldt, I. Bogardi, and M. Dahab. Numerical solute transport simulation using fuzzy sets approach. Journal of Contaminant Hydrology, 27(1-2):107-126, 1997.
[53] E. P. Dougherty and H. Rabitz. A computational algorithm for the Green's function method of sensitivity analysis in chemical kinetics. International Journal of Chemical Kinetics, 11:1237-1249, 1979.
[54] E. P. Dougherty, J. T. Hwang, and H. Rabitz. Further developments and applications of the Green's function method of sensitivity analysis in chemical kinetics. International Journal of Chemical Kinetics, 71(4):1794-1808, 1979.
[55] S. G. Douglas, R. C. Keller, and E. L. Carr. User's Guide for the Urban Airshed Model. Volume III: User's Manual for the Diagnostic Wind Model. (U.S. EPA-450/4-90-007C), U.S. Environmental Protection Agency, Research Triangle Park, NC, 1990.
[56] Dow Chemical Company, Midland, MI. SimuSolv Reference Guide - Volumes I and II, 1993.
[57] N. R. Draper and H. Smith. Applied Regression Analysis. John Wiley & Sons, New York, 1981.
[58] A. M. Dunker. The decoupled direct method for calculating sensitivity coefficients in chemical kinetics. Journal of Chemical Physics, 81(5):2385-2393, 1984.
[59] Environmental Protection Agency (EPA). Health Assessment Document for Tetrachloroethylene (Perchloroethylene). External Review Draft. EPA-600/8-82-005B, 1983.
[60] U.S. EPA. Background Document for EPA's Composite Model for Landfills (EPACML). Technical report, U.S. EPA, Office of Solid Waste, Washington, D.C., 1990. Background Document and User's Guide.
[61] U.S. EPA.
Guide for Regulatory Application of the Urban Airshed Model. (EPA-450/4-91-013), U.S. Environmental Protection Agency, Research Triangle Park, NC, 1991.
[62] U.S. EPA. EPA's Composite Model for Leachate Migration and Transformation Products (EPACMTP). Technical report, U.S. EPA, Office of Solid Waste, Washington, D.C., 1995. Background Document and User's Guide.
[63] G. W. Evans, W. Karwowski, and M. R. Wilhelm. An Introduction to Fuzzy Set Methodologies for Industrial and Systems Engineering. In G. W. Evans, W. Karwowski, and M. R. Wilhelm, editors, Applications of Fuzzy Set Methodologies in Industrial Engineering, pages 3-11. Elsevier, New York, 1986.
[64] A. H. Falls, G. J. McRae, and J. H. Seinfeld. Sensitivity and uncertainty analysis of reaction mechanisms for photochemical air pollution. International Journal of Chemical Kinetics, 10:1137-1162, 1979.
[65] D. Farrar, B. Allen, K. Crump, and A. Shipp. Evaluation of uncertainty in input parameters to pharmacokinetic models and the resulting uncertainty in output. Toxicology Letters, 49:371-385, 1989.
[66] V. V. Fedorov. Analysis and design of simulation experiments for the approximation of models. IIASA, WP-83-71, 1983.
[67] G. Fichtner, H. J. Reinhart, and D. W. T. Rippin. Design of Flexible Chemical Plants by the Application of Interval Mathematics. Computers and Chemical Engineering, 14:1311-1316, 1990.
[68] R. A. Fisher. The Design of Experiments. Hafner Press, New York, 1971.
[69] G. S. Fishman. Monte Carlo: Concepts, Algorithms, and Applications. Springer-Verlag, New York, 1996.
[70] H. C. Frey. Quantitative analysis of uncertainty and variability in environmental policy making. Available from: Directorate for Science and Policy Programs, American Association for the Advancement of Science, 1333 H Street, NW, Washington, DC, September 1992.
[71] J. D. Fuentes, D. Wang, H. H. Neumann, et al.
Ambient biogenic hydrocarbons and isoprene emissions from a mixed deciduous forest. Journal of Atmospheric Chemistry, 25(1):67-95, 1996.
[72] D. F. Gao, W. R. Stockwell, and J. B. Milford. Global uncertainty analysis of a regional-scale gas-phase chemical mechanism. Journal of Geophysical Research-Atmospheres, 101(D4):9107-9119, 1996.
[73] C. W. Gardiner. Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences. Springer-Verlag, New York, 1983.
[74] C. W. Gear. The automatic integration of ordinary differential equations. Communications of the ACM, 14:176-179, 1971.
[75] P. G. Georgopoulos. Regulatory Ozone Modeling: Status, Directions and Research Needs. Environmental Health Perspectives, 103(Supplement 2):107-132, 1995.
[76] P. G. Georgopoulos and A. Roy. Review and evaluation of the Reactive Plume Model (RPM-II) modifications. Ozone Research Center Technical Report ORC-TR93-01, Environmental and Occupational Health Sciences Institute, Piscataway, NJ, 1993.
[77] P. G. Georgopoulos and J. H. Seinfeld. Mathematical Modeling of Turbulent Reacting Plumes. A.R.B. Contract No. A0-044-32, California Institute of Technology, Pasadena, CA, 1986.
[78] P. G. Georgopoulos, S. Arunachalam, and S. W. Wang. Alternative Metrics for Assessing the Relative Effectiveness of NOx and VOC Emission Reductions in Controlling Ground-Level Ozone. J. Air & Waste Manage. Assoc., 47:838-850, 1997.
[79] R. Ghanem and D. Ghiocel. Comparative analysis of FORM/SORM and polynomial chaos expansions for highly nonlinear systems. Proceedings of Engineering Mechanics, 1:535-538, 1996.
[80] R. G. Ghanem and P. D. Spanos. Stochastic Finite Elements: A Spectral Approach. Springer-Verlag, New York, 1991.
[81] M. Gonzalez. Analysis of the effect of microscale turbulence on atmospheric chemical reactions by means of the pdf approach. Atmospheric Environment, 31(4):575-586, 1997.
[82] A. Griewank. On automatic differentiation.
In M. Iri and K. Tanabe, editors, Mathematical Programming: Recent Developments and Applications, pages 83-108. Kluwer Academic Publishers, 1989.
[83] A. Griewank and G. F. Corliss, editors. Automatic Differentiation of Algorithms: Theory, Implementation, and Application. SIAM, Philadelphia, Penn., 1991.
[84] A. Griewank, D. Juedes, and J. Utke. Algorithm 755 - ADOL-C - a package for the automatic differentiation of algorithms written in C/C++. ACM Transactions on Mathematical Software, 22(2):131-167, 1996.
[85] A. Guenther, P. Zimmerman, L. Klinger, et al. Estimates of regional natural volatile organic compound fluxes from enclosure and ambient measurements. Journal of Geophysical Research-Atmospheres, 101(D1):1345-1359, 1996.
[86] J. P. Gwo, L. E. Toran, M. D. Morris, and G. V. Wilson. Subsurface stormflow modeling with sensitivity analysis using a Latin-Hypercube Sampling technique. Ground Water, 34(5):811-818, 1996.
[87] M. M. Hamed and P. B. Bedient. On the performance of computational methods for the assessment of risk from ground-water contamination. Ground Water, 35(4):638-646, 1997.
[88] M. M. Hamed, P. B. Bedient, and C. N. Dawson. Probabilistic modeling of aquifer heterogeneity using reliability methods. Advances in Water Resources, 19(5):277-295, 1996.
[89] M. M. Hamed, J. P. Conte, and P. B. Bedient. Probabilistic screening tool for ground-water contaminant assessment. Journal of Environmental Engineering, 121(11):767-775, 1995.
[90] D. Hattis, S. Tuler, L. Finkelstien, and L. Zhi-Quan. A Pharmacokinetic/Mechanism-Based Analysis of the Carcinogenic Risk of Perchloroethylene. No. CTPID 86-7, Center for Technology, Policy and Industrial Development, Massachusetts Institute of Technology, 1986.
[91] A. Heck. Introduction to Maple. Springer-Verlag, New York, 1992.
[92] B. Heller. MACSYMA for Statisticians. John Wiley & Sons, New York, 1991.
[93] J. C. Helton.
Uncertainty and sensitivity analysis techniques for use in performance assessment for radioactive waste disposal, pages 327-367, 1993.
[94] M. Hohenbichler, S. Gollwitzer, W. Kruse, and R. Rackwitz. New light on first- and second-order reliability methods. Structural Safety, 4(4):267-284, 1987.
[95] J. E. Horwedel, R. J. Raridon, and R. Q. Wright. Sensitivity analysis of AIRDOS-EPA using ADGEN with matrix reduction algorithms. Technical Memorandum ORNL/TM-11373, Oak Ridge National Laboratory, Oak Ridge, Tenn., 1989.
[96] J. E. Horwedel, R. J. Raridon, and R. Q. Wright. Automated sensitivity analysis of an atmospheric dispersion model. Atmospheric Environment, 26A(9):1643-1649, 1992.
[97] J. E. Horwedel. GRESS: A preprocessor for sensitivity studies on Fortran programs. In A. Griewank and G. F. Corliss, editors, Automatic Differentiation of Algorithms: Theory, Implementation, and Application, pages 243-250. SIAM, Philadelphia, Penn., 1991.
[98] P. Hovland, C. Bischof, D. Spiegelman, and M. Casella. Efficient derivative codes through automatic differentiation and interface contraction - an application in biostatistics. SIAM Journal on Scientific Computing, 18(4):1056-1066, 1997.
[99] P. H. Howard, editor. Handbook of Fate and Exposure Data for Organic Chemicals, volume II. Lewis Publishers, 1990.
[100] P. J. Huber. Robust statistical procedures. SIAM, Philadelphia, 2nd edition, 1996.
[101] M. Hurme, M. Dohnal, and M. Jaervalaeinen. Qualitative Reasoning in Chemical and Safety Engineering. Computers and Chemical Engineering, 17:441-446, 1993.
[102] D. Hwang, D. W. Byun, and M. T. Odman. An automatic differentiation technique for sensitivity analysis of numerical advection schemes in air quality models. Atmospheric Environment, 31(6):879-888, 1997.
[103] J. T. Hwang, E. P. Dougherty, and H. Rabitz. The Green's function method of sensitivity analysis in chemical kinetics.
Journal of Chemical Physics, 69(11):5180-5191, 1978.
[104] R. L. Iman and W. J. Conover. Small sample sensitivity analysis techniques for computer models, with an application to risk assessment. Communications in Statistics, Part A: Theory and Methods, 17:1749-1842, 1980.
[105] R. L. Iman and M. J. Shortencarier. A FORTRAN 77 program and user's guide for the generation of Latin Hypercube and random samples for use with computer models. NUREG/CR-3624, SAND83-2365, Sandia National Laboratories, Albuquerque, NM, 1984.
[106] M. Iri. Roles of automatic differentiation in nonlinear analysis and high-quality computation. Nonlinear Analysis-Theory Methods & Applications, 30(7):4317-4328, 1997.
[107] J. C. Issac and R. K. Kapania. Aeroelastic sensitivity analysis of wings using automatic differentiation. AIAA Journal, 35(3):519-525, 1997.
[108] Y. S. Jang, N. Sitar, and A. D. Kiureghian. Reliability analysis of contaminant transport in saturated porous media. Water Resources Research, 30(8):2435-2448, 1994.
[109] P. H. M. Janssen, P. S. C. Heuberger, and R. Sanders. UNCSAM: A tool for automating sensitivity and uncertainty analysis. Environmental Software, 9(1):1-11, 1994.
[110] M. Jerosolimski and L. Levacher. New method for fast calculation of Jacobian matrices: automatic differentiation for power system simulation. IEEE Transactions on Power Systems, 9(2):700-706, 1994.
[111] N. L. Johnson and S. Kotz. Distributions in Statistics: Continuous Multivariate Distributions. John Wiley & Sons, New York, 1972.
[112] C. H. Juang, X. H. Huang, and D. J. Elton. Modelling and analysis of non-random uncertainties - fuzzy-set approach. International Journal for Numerical & Analytical Methods in Geomechanics, 16(5):335-350, 1992.
[113] M. H. Kalos and P. A. Whitlock. Monte Carlo Methods: Basics. John Wiley & Sons, New York, 1986.
[114] A. Karamchandani and C. A. Cornell.
Sensitivity estimation within first and second order reliability methods. Structural Safety, 11(2):95-107, 1992.
[115] L. S. Katafygiotis and C. Papadimitriou. Dynamic response variability of structures with uncertain properties. Earthquake Engineering & Structural Dynamics, 25(8):775-793, 1996.
[116] A. Kaufmann and M. M. Gupta. Introduction to Fuzzy Arithmetic: Theory and Applications. Van Nostrand Reinhold Company, New York, New York, 1985.
[117] R. B. Kearfott. Algorithm 763: INTERVAL_ARITHMETIC: A Fortran 90 module for an interval data type. ACM Transactions on Mathematical Software, 22(4):385-392, December 1996.
[118] R. B. Kearfott, M. Dawande, K. Du, and C. Hu. Algorithm 737: INTLIB: A portable Fortran-77 elementary function library. ACM Transactions on Mathematical Software, 20(4):447-459, December 1994.
[119] R. B. Kearfott and V. Kreinovich, editors. Applications of Interval Computations, volume 3 of Applied Optimization. Kluwer Academic Publishers, 1996.
[120] A. I. Khuri and J. A. Cornell. Response Surfaces: Design and Analyses. Marcel Dekker, New York, 1987.
[121] T. W. Kim, S. H. Chang, and B. H. Lee. Comparative study on uncertainty and sensitivity analysis and application to LOCA model. Reliability Engineering & System Safety, 21(1):1-26, 1988.
[122] G. J. Klir. The Many Faces of Uncertainty. In B. M. Ayyub and M. M. Gupta, editors, Uncertainty Modeling and Analysis: Theory and Applications, pages 3-19. Elsevier Science, 1994.
[123] G. J. Klir and B. Yuan. Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall, Englewood Cliffs, New Jersey, 1995.
[124] M. Koda, G. J. McRae, and J. H. Seinfeld. Automatic sensitivity analysis of kinetic mechanisms. International Journal of Chemical Kinetics, 11:427-444, 1979.
[125] J. Korelc. Automatic generation of finite-element code by simultaneous optimization of expressions. Theoretical Computer Science, 187(1-2):231-248, 1997.
[126] H. U.
Koyluoglu, A. S. Cakmak, and S. R. K. Nielsen. Interval algebra to deal with pattern loading and structural uncertainty. Journal of Engineering Mechanics, 121(11):1149-1157, 1995.
[127] A. Kraslawski. Review of applications of various types of uncertainty in chemical engineering. Chemical Engineering & Processing, 26(3):185-191, 1989.
[128] A. Kraslawski, T. Koiranen, and L. Nystrom. Concurrent Engineering: Robust Design in Fuzzy Environment. Computers and Chemical Engineering, 17S:S447-S452, 1993.
[129] V. Kreinovich, A. Lakeyev, J. Rohn, and P. Kahl, editors. Computational Complexity and Feasibility of Data Processing and Interval Computations, volume 10 of Applied Optimization. Kluwer Academic Publishers, 1997.
[130] K. Kubota. PADRE2, a FORTRAN precompiler yielding error estimates and second derivatives. In A. Griewank and G. F. Corliss, editors, Automatic Differentiation of Algorithms: Theory, Implementation, and Application, pages 251-262. SIAM, Philadelphia, Penn., 1991.
[131] S. Kutscher and J. Schulze. Some Aspects of Uncertain Modelling: Experiences in Applying Interval Mathematics to Practical Problems. In H. Bandemer, editor, Modelling Uncertain Data, pages 62-68. Akademie Verlag, Berlin, 1993.
[132] B. Lamb, D. Gay, H. Westberg, and T. Pierce. Biogenic hydrocarbon emission inventory for the U.S.A. using a simple forest canopy model. Atmos. Environ., 27A(11):1673-1690, 1993.
[133] B. Lamb, A. Guenther, D. Gay, and H. Westberg. National inventory of biogenic hydrocarbon emissions. Atmos. Environ., 21(8):1695-1705, 1987.
[134] R. G. Lamb. Note on the Application of K-Theory to Diffusion Problems Involving Nonlinear Chemical Reaction. Atmospheric Environment, 7:257-263, 1973.
[135] B. W. Lee and O. K. Lim. Application of the first-order perturbation method to optimal structural design. Structural Engineering & Mechanics, 4(4):425-436, 1996.
[136] J. T. Lim, H. J. Gold, G. G.
Wilkerson, and C. D. Raper. Monte Carlo/response surface strategy for sensitivity analysis: application to a dynamic model of vegetative plant growth. Applied Mathematical Modelling, 13(8):479-484, 1989.
[137] X. Lin, O. T. Melo, D. R. Hastie, et al. Case study of ozone production in a rural area of central Ontario. Atmos. Environ., 26A(2):311-324, 1992.
[138] T. S. Liou and H. D. Yeh. Conditional expectation for evaluation of risk groundwater flow and solute transport - one-dimensional analysis. Journal of Hydrology, 199(3-4):378-402, 1997.
[139] M. K. Liu, G. E. Moore, and H. Y. Holman. Survey of Plume Models for Atmospheric Application. Interim Report (EA-2243/1616-9), Systems Applications International, San Rafael, California, February 1982.
[140] Y. Q. Liu and R. Avissar. Sensitivity of shallow convective precipitation induced by land surface heterogeneities to dynamical and cloud microphysical parameters. Journal of Geophysical Research-Atmospheres, 101(D3):7477-7497, 1996.
[141] W. L. Loh. On Latin Hypercube Sampling. Annals of Statistics, 24(5):2058-2080, 1996.
[142] R. Lu, Y. Luo, and J. P. Conte. Reliability evaluation of reinforced concrete beam. Structural Safety, 14:277-298, 1994.
[143] G. J. McRae, W. R. Goodin, and J. H. Seinfeld. Mathematical Modeling of Photochemical Air Pollution. Technical report, Environmental Quality Laboratory, California Institute of Technology, Pasadena, CA, 1982. (EQL report no. 18).
[144] G. J. McRae, J. W. Tilden, and J. H. Seinfeld. Global sensitivity analysis - a computational implementation of the Fourier amplitude sensitivity test (FAST). Computers and Chemical Engineering, 6(1):15-25, 1982.
[145] C. Mischler, X. Joulia, E. Hassold, A. Galligo, and R. Esposito. Automatic differentiation applications to computer aided process engineering. Computers & Chemical Engineering, 19(Suppl):S779-S784, 1995.
[146] R. E. Moore. Interval Analysis.
Prentice-Hall, Englewood Cliffs, New Jersey, 1966.
[147] R. E. Moore. "Methods and Applications of Interval Analysis". SIAM, Philadelphia, 1979.
[148] E. R. Morris, E. C. Chang, S. B. Shepard, and M. P. Ligocki. "User's Guide to Version IV of the Reactive Plume Model (RPM-IV)". (SYSAPP-92/037), Systems Applications International, San Rafael, California, April 1992.
[149] E. R. Morris and T. C. Myers. "User's Guide for the Urban Airshed Model. Volume I: User's Guide for the UAM (CB-IV)". (EPA-450/4-90-007A), U.S. Environmental Protection Agency, 1990.
[150] A. Neumaier. "Interval Methods for Systems of Equations". Cambridge University Press, Cambridge, England, 1990.
[151] E. M. Oblow, F. G. Pin, and R. Q. Wright. Sensitivity analysis using computer calculus: A nuclear waste application. Nuclear Science & Engineering, 94(1):46-65, 1986.
[152] H. Osnes. Stochastic analysis of velocity spatial variability in bounded rectangular heterogeneous aquifers. Advances in Water Resources, 21(3):203-215, 1998.
[153] I. Ozaki and T. Terano. Applying an automatic differentiation technique to sensitivity analysis in design optimization problems. Finite Elements in Analysis & Design, 14(2-3):143-151, 1993.
[154] W. W. Pan, M. A. Tatang, G. J. McRae, and R. G. Prinn. Uncertainty analysis of direct radiative forcing by anthropogenic sulfate aerosols. Journal of Geophysical Research-Atmospheres, 102(D18):21915-21924, 1997.
[155] A. Papoulis. Probability, Random Variables, and Stochastic Processes. McGraw-Hill, New York, 1991.
[156] F. G. Pin, B. A. Worley, et al. An automated sensitivity analysis procedure for the performance assessment of nuclear waste isolation systems. Nuclear & Chemical Waste Management, 6(3-4):255-263, 1986.
[157] PLG, Inc., Newport Beach, CA. RISKMAN: Integrated Quantitative Risk Assessment Package.
[158] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling.
"Numerical Recipes (Fortran Version)". Cambridge University Press, New York, 1989.
[159] Primavera Systems, Inc., San Francisco, CA. Monte Carlo 3.0.
[160] G. J. Prokopakis. Decoupled direct method for the solution of ordinary boundary value problems. Applied Mathematical Modelling, 17(9):499-503, 1993.
[161] M. L. Puri and D. A. Ralescu. Fuzzy random variables. J. Math. Analysis and Applications, 114:409-422, 1986.
[162] S. Quin and G. E. O. Widera. "Application of Fuzzy Relational Modeling to Industrial Product Quality Control". Journal of Pressure Vessel Technology - Transactions of the ASME, 118:121-124, 1996.
[163] K. Radhakrishnan and D. A. Bittker. GCKP86 - an efficient code for general chemical kinetics and sensitivity analysis computations. Chemical and Physical Processes in Combustion, 4:1-46, 1986.
[164] M. E. Reed and W. B. Whiting. Sensitivity and uncertainty of process designs to thermodynamic model parameters: a Monte Carlo approach. Chemical Engineering Communications, 124:39-48, 1993.
[165] R. H. Reitz and R. J. Nolan. Physiological pharmacokinetic modeling for perchloroethylene dose adjustment. Unpublished data submitted to EPA Science Advisory Board, 1986.
[166] L. C. Rich and D. R. Hill. Automatic differentiation in MATLAB. Applied Numerical Mathematics, 9(1):33-43, 1992.
[167] Y. Rong, R. F. Wang, and R. Chou. Monte Carlo simulation for a groundwater mixing model in soil remediation of tetrachloroethylene. Journal of Soil Contamination, 7(1):87-102, 1998.
[168] S. J. Roselle. Effects of biogenic emission uncertainties on regional photochemical modeling of control strategies. Atmospheric Environment, 28(10):1757-1772, 1994.
[169] T. Ross. Fuzzy Logic with Engineering Applications. McGraw-Hill, 1995.
[170] N. Rostaing, S. Dalmas, and A. Galligo. Automatic differentiation in Odyssee. Tellus Series A-Dynamic Meteorology & Oceanography, 45A(5):558-568, 1993.
[171] A. Roy. Mechanistic Modeling of Transport and Metabolism in Physiological Systems.
PhD thesis, Rutgers, The State University of New Jersey, New Brunswick, NJ, 1997.
[172] R. Y. Rubinstein. Simulation and the Monte Carlo Method. John Wiley & Sons, New York, 1981.
[173] B. A. Schichtel and R. B. Husar. Regional simulation of atmospheric pollutants with the CAPITA Monte Carlo model. Journal of the Air & Waste Management Association, 47(3):331-343, 1997.
[174] P. R. Schroeder et al. "The hydrologic evaluation of landfill performance model (HELP): Volume I - User's guide for version I and Volume II - Documentation for version I". Technical report, U.S. EPA, Washington, D.C., 1984. EPA/530-SW-84-009.
[175] J. H. Seinfeld. "Atmospheric Chemistry and Physics of Air Pollution". John Wiley & Sons, 1986.
[176] J. H. Seinfeld. Ozone air quality models: A critical review. Journal of Air Pollution Control Association, 38(5):616-645, 1988.
[177] K. Sexton and H. Westberg. "Photochemical Ozone Formation from Petroleum Refinery Emissions". Atmospheric Environment, 17:467-475, 1983.
[178] H. E. Shaalan and R. P. Broadwater. "Using Interval Mathematics in Cost Benefit Analysis of Distribution Automation". Electric Power Systems Research, 27:145-152, 1993.
[179] T. D. Sharkey. Isoprene synthesis by plants and animals. Endeavour, 20(2):74-78, 1996.
[180] D. Shiriaev, A. Griewank, and J. Utke. A User Guide to ADOL-F: Automatic Differentiation of Fortran Codes. Institute of Scientific Computing, TU Dresden, Germany, 1996.
[181] T. W. Simon. Combining physiologically based pharmacokinetic modeling with Monte Carlo simulation to derive an acute inhalation guidance value for trichloroethylene. Regulatory Toxicology & Pharmacology, 26(3):257-270, 1997.
[182] D. Simpson, A. Guenther, C. N. Hewitt, and R. Steinbrecher. Biogenic emissions in Europe. 1. Estimates and uncertainties. Journal of Geophysical Research-Atmospheres, 100(D11):22875-22890, 1995.
[183] G. Sistla, S. T. Rao, and J. Godowitch.
Sensitivity analysis of a nested ozone air quality model. Proceedings of the AMS/AWMA Joint Conference on Applications of Air Pollution Modeling and Its Applications, New Orleans, LA, 1991.
[184] V. Sitaraman and R. B. Oza. Particle trajectory model of atmospheric dispersion on a parallel processing system. Environmental Software, 11(4):229-234, 1996.
[185] M. Smithson. Ignorance and Uncertainty: Emerging Paradigms. Springer-Verlag, New York, 1988.
[186] I. M. Sobol. A Primer for the Monte Carlo Method. CRC Press, 1994.
[187] D. J. Spiegelhalter. "A Statistical View of Uncertainty in Expert Systems". In W. A. Gale, editor, Artificial Intelligence and Statistics, pages 17-55. Addison-Wesley, Reading, 1986.
[188] M. Stein. Large sample properties of simulations using Latin Hypercube Sampling. Technometrics, 29(2):143-151, 1987.
[189] D. A. Stewart and M. Liu. "Development and Application of a Reactive Plume Model". Atmospheric Environment, 15:2377-2393, 1981.
[190] M. A. Tatang. "Combined Stochastic and Deterministic Approach in Solving Stochastic Differential Equations". Master's thesis, Carnegie Mellon University, 1992.
[191] M. A. Tatang. "Direct Incorporation of Uncertainty in Chemical and Environmental Engineering Systems". PhD thesis, Massachusetts Institute of Technology, 1995.
[192] M. A. Tatang, W. W. Pan, R. G. Prinn, and G. J. McRae. An efficient method for parametric uncertainty analysis of numerical geophysical models. Journal of Geophysical Research-Atmospheres, 102(D18):21925-21932, 1997.
[193] R. Tomovic and M. Vukobratovic. General Sensitivity Theory. Elsevier, New York, 1972.
[194] C. P. Tsokos. Probability Distributions: An Introduction to Probability Theory with Applications. Wadsworth Publishing Company, Belmont, CA, 1972.
[195] U. S. Environmental Protection Agency. Exposure Models Library (EML) and Integrated Model Evaluation System (IMES), March 1996.
EPA/600/C-92/002, developed for the Office of Research and Development, USEPA, by Versar, Inc., Columbia, MD.
[196] U. S. Environmental Protection Agency. Summary Report for the Workshop on Monte Carlo Analysis. EPA/630/R-96/001, September 1996.
[197] U. S. Environmental Protection Agency. Policy for Use of Probabilistic Analysis in Risk Assessment. EPA/630/R-97/001, March 1997.
[198] M. Vanderperk. Effect of model structure on the accuracy and uncertainty of results from water quality models. Hydrological Processes, 11(3):227-239, 1997.
[199] B. E. Vieux and S. Needham. Nonpoint-pollution model sensitivity to grid-cell size. Journal of Water Resources Planning & Management-ASCE, 119(2):141-157, 1993.
[200] J. Villadsen and M. L. Michelsen. Solution of Differential Equation Models by Polynomial Approximation. Prentice-Hall, Englewood Cliffs, New Jersey, 1978.
[201] L. Vuilleumier, R. A. Harley, and N. J. Brown. First- and second-order sensitivity analysis of a photochemically reactive system (a Green's function approach). Environmental Science & Technology, 31(4):1206-1217, 1997.
[202] V. M. Vyas and G. Christakos. Spatiotemporal analysis and mapping of sulfate deposition data over eastern USA. Atmospheric Environment, 31(21):3623-3633, 1997.
[203] P. Walley. "Statistical Reasoning with Imprecise Probabilities". Chapman & Hall, London, 1991.
[204] G. Whiffen, C. Shoemaker, C. H. Bischof, A. Ross, and A. Carle. Application of automatic differentiation to groundwater transport models. In A. Peters, editor, Computational Methods in Water Resources X, pages 173-182. Kluwer Academic Publishers, 1994.
[205] W. B. Whiting, T. M. Tong, and M. E. Reed. Effect of uncertainties in thermodynamic data and model parameters on calculated process performance. Industrial & Engineering Chemistry Research, 32(7):1367-1371, 1993.
[206] G. Z. Whitten, H. Hogo, and J. P. Killus.
"The Carbon Bond Mechanism: A Condensed Kinetic Mechanism for Photochemical Smog". Environmental Science and Technology, 14:690, 1980.
[207] E. B. Wilson and M. M. Hilferty. The distribution of chi-square. Proceedings of the National Academy of Sciences, 17:684-688, 1931.
[208] W. Winston. Simulation Modeling Using @RISK (User's Guide to @RISK software). Palisade Corporation - PAL #54, Newfield, New York, 1996.
[209] K. L. Wood, K. N. Otto, and E. K. Antonsson. "Engineering Design Calculations with Fuzzy Parameters". Fuzzy Sets and Systems, 52:1-20, 1992.
[210] B. A. Worley, F. G. Pin, J. E. Horwedel, and E. M. Oblow. ADGEN - ADjoint GENerator for computer models. Technical Report ORNL/TM-11037, Oak Ridge National Laboratory, 1989.
[211] J. Yen, R. Langari, and L. A. Zadeh. Industrial Applications of Fuzzy Logic and Intelligent Systems. IEEE Press, Piscataway, New Jersey, 1995.
[212] L. A. Zadeh. "Fuzzy Sets". Information and Control, 8:338-353, 1965.
[213] P. Zannetti. "Air Pollution Modeling". Van Nostrand Reinhold, 1990.
[214] D. X. Zhang. Numerical solutions to statistical moment equations of groundwater flow in nonstationary, bounded, heterogeneous media. Water Resources Research, 34(3):529-538, 1998.
[215] J. Zimmermann and D. Poppe. A supplement for the RADM2 chemical mechanism - the photooxidation of isoprene. Atmos. Environ., 30(8):1255-1269, 1996.