903R93006 CBP/TRS 76/93 June 1993 Incorporating Uncertainty Associated with Censored Water Quality Data in Parametric Trend Analysis TD 225 .C54 162 copy 2 Chesapeake Bay Program I Printed on recycled paper ------- Incorporating Uncertainty Associated with Censored Water Quality Data in Parametric Trend Analysis Chesapeake Bay Program June 1993 Produced under contract to the U.S. Environmental Protection Agency Contract No. 68-WO-0043 ri o y ^A R-c^ioiv in *j o. iJ-> •"• ""-^ , •v-r.ir o-^vl Center Tor .n-irv Printed by the U.S. Environmental Protection Agency for the Chesapeake Bay Program ------- ENDORSEMENT The Chesapeake Bay Program Monitoring Subcommittee has reviewed the assumptions and methods of data analysis used in this i ->port and finds them appropriate for the analysis conducted. The findings of this report are consistent with and supported by the analytical techniques employed. csc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page i ------- ------- TABLE OF CONTENTS EXECUTIVE SUMMARY vii I. INTRODUCTION 1 II. SUBSTITUTION METHODS 2 III. WEIGHTING METHOD 3 IV. CONCLUSIONS 13 V. RECOMMENDATIONS 14 VI-. REFERENCES .-_. . 16 APPENDIX 1: Annotated Bibliography 17 APPENDIX 2: Proof of upper bound on variance 25 esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page iii ------- ------- LIST OF FIGURES Figure 1. Sample time series of observed mean monthly orthophosphate (P04F) concentrations, showing detection limits Figure 2. Sample time series of observed mean monthly orthophosphate (P04F) concentrations with error bars. Below detection data prior to March 1985 set to 1/2 the prevailing detection limit Figure 3. Sample time series of observed mean monthly orthophosphate (P04F) concentrations, with additional censoring levels (ADL's) shown LIST OF TABLES Table 1. Trend estimates from weighted and non-weighted regressions with different levels of censoring, with below detection observations set to the detection limit 10 Table 2. Trend estimates from weighted and non-weighted regressions with different levels of censoring, with below detection observations set to one-half of the detection limit 11 Table 3. Trend estimates from weighted and non-weighted regressions with different levels of censoring, with below detection observations set to zero 12 esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page v ------- EXECUTIVE SUMMARY Water quality data are often collected in monitoring programs to servr as a basis for the estimation of trends. A problem arises in trend estimation when data series contain observations reported as below detection limit (BDL). The several published methods that deal with BDL observations are generally oriented towards obtaining the "best value" to substitute for the censored values. We suggest a weighted regression approach, as a modification of the usual substitution approach, which incorporates the fact that a detection limit is a measure of precision of the data. When the definition of a detection limit associated with a datum is unknown, a conservative lower bound (with a theoretical justification) for the precision of the observation is given. We illustrate this approach using water quality monitoring data collected by the Chesapeake Bay Program. The results show that weighted regression estimates of linear trends are much less sensitive to the method of substitution for BDL values than unweighted regression trend estimates. This is an important finding since the use of weighted approaches, by lessening the bias of simple substitution methods, may obviate the need for "best value" substitution methods which are more cumbersome to implement. The results also indicate that use of weighted regression in multiply censored data series eliminates the need to apply the highest detection limit to all data in the series (when data below this highest limit exist in the data series) in order to avoid trends due to changing detection limits. This is an important result since the latter practice which is generally used in (unweighted) trend estimation results in the loss of the advantage of improved detection limits. esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page vii ------- ------- I. INTRODUCTION Water quality data are often collected in monitoring programs to serve as a basis for the estimation of trends. Such time series of observations may be censored at one detection limit, or at several. Change-s in detection limits during the course of a monitoring program often occur as a result of intentional improvements to the level of detection or by switches in the chemical analysis methods that are used. Varying detection limits in a data series can also occur as a result of quality assurance procedures that specify periodic re- determinations of the detection limit. Although data can be censored at both lower and upper limits, below detection limit (BDL) observations have been the common problem we have encountered in water quality data from the Chesapeake Bay Program^. In general, seasonally low BDL nutrient concentrations may be typical of estuarine and marine monitoring situations. The several published methods that deal with BDL observations are generally oriented towards obtaining the "best value" to substitute for the censored values (1-9). In the second section, we give a brief description of these substitution methods. A more detailed review can be found in (4), which suggests that overall best performance is obtained by what are termed "robust methods" of substitution. The Monte Carlo studies by (6,1) support the claim. However, these substitution methods are not easily adapted to correlated data. This is a significant drawback, since time series of environmental data are usually correlated. The American Chemical Society Committee on Environmental Improvement (ACSCEI) defined the limit of detection (LOD) as "the lowest concentration level that can be determined to be statistically different from a blank" (10). They recommended that the LOD or detection limit be calculated as 3 times the standard deviation of blank determinations. A detection limit may re^er to an instrument detection limit (IDL) based on signal to noise ratios or a method detection limit (MOL) based on replicate determinations of a blank or field blanks (see 11). Once the detection limit is determined, it is used as a censoring threshold for subsequent measurements and the concentration levels observed below that limit are reported as BDL. A detection limit, defined as a multiple of the standard deviation of a number of measurements, thus contains information about the precision of observations above and below the detection limit. We present, in the third section, a new substitution approach that incorporates this fact. In some cases an analyst may be faced with data containing observations esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 1 ------- censored at one or more detection limits whose method of calculation is not known. For this case, we derive a conservative estimate of the variance associated with a censoring threshold. II. SUBSTITUTION METHODS The accompanying annotated bibliography (Appendix 1) gives a fairly extensive survey of the current state of methodology for dealing with BDL data. The consensus among the authors is that "bad data is better than no data". That is, if possible one should obtain the actual values for the observations." reported as BDL. However, this is usually not possible; indeed, the ACSCEI (10) recommends reporting below detects as "not de.tected" with the associated limit of detection appended as a s-eparate variable. Sin-ce values for BDL observations are necessary in many statistical situations, estimation methods for BDL data have been studied. The most commonly used method is the method of simple substitution. These methods consist of substituting a single value for an unknown BDL observation. Following the substitution, the usual statistical analysis is carried out as if the fill-in data is real data. The fill-in values may be one of the following: zero, the detection limit or half the detection limit. The consensus of the literature (1-9) is that this simple method generally leads to biased estimates. The performance of these simple substitutions in Monte Carlo studies (6,1) were generally poor. However, these methods are definitely the simplest to implement. A second class of methods are based on distributional assumptions. Under the assumption that the observations follow a certain distribution (such as log-normal) the parameters are estimated based on the observed concentrations above the detection limit and th~ percentage of BDL data. Either maximum likelihood or probability plotting methods can be used. The maximum likelihood method is more involved to implement than the plotting method. The plotting method is also complex when more than one detection limit is involved. These methods are described in (1,5). This is also the approach of tobit regression (8), in which maximum likelihood estimates of model parameters are obtained under a distributional assumption. A combination of the substitution method and the distributional assumption is recommended by (4). Under the distributional assumption, the first few (depending on how many BDL observations are in the data set) percentiles are estimated. These values are substituted for the BDL observations and the target analysis is carried out as usual. This method is esc.sA2.6/92 Incorporating Uncertainty in Parai. :tric Trend Analysis • Page 2 ------- computationally involved, especially when there are multiple detection limits; however, this approach consistently produced small estimation errors (4). The methods described in the literature so far concern only uncorrelated data. Extensions of these methods and their performances to correlated da,a have not been explored in the literature. These methods, even in cases in which it is possible to extend them to account for autocorrelation in the data, will be computationally intensive. III. WEIGHTING METHOD The treatment of BDL data suggested by in this report is novel. It takes advantage of one common feature of the definition of a detection limit: that of measurement uncertainty. Thus we note that a detection limit, in addition to being a threshold for data censoring, also contains information regarding variability in the data. We, therefore, propose to incorporate this information in analysis via weighting. We distinguish two situations: one in which the detection limit calculation method is known, and one in which it is unknown. Because detection limits are measures of uncertainty, it is clear that the varying detection limits in the data are an indication of heteroscedasticity. We postulate that the standard deviation of the tth observation is st given by s2t proportional to r2t where rt is a measure of the uncertainty in the tth observation. The quantity rt could simply be 1/3 of the detection limit (if the ACSCEI definition applies) or some other number which reflects a knowledge of the uncertainty in the data. In the case wheie the detection limit calculation is unknown, a conservative estimate of the variance can be arrived at as follows. When a datum is reported as below detection, its true value could lie anywhere between zero and the detection limit. An estimate for the datum is given by the mid-ooint of this interval. The uncertainty in this estimate depends on the exact distribution of the data, which is unknown. However, a upper bound exists for the variance of any distribution of data in this interval. Technically, this can be stated as follows: If X is any random variable taking values in an interval [a,b] then the variance of X can not exceed (b-a)2/4. (See Appendix 2 for the proof.) csc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 3 ------- Thus, in such a case, an estimate of the variance of the data below the detection limit is the square of the detection limit/4. This estimate of variance does not apply to data above detection in this case and other estimates of uncertainty might be explored. If we conclude that the detection limit or above conservative estimate appropriately represents the changing variance of multiply-censored data series, then the best estimation procedure for parametric trend estimation is weighted least squares, since observations can be weighted according to the uncertainty associated with them to obtain appropriate variance estimates (see 12). Thus the observations are weighted by the inverse of the estimates of their standard deviations. In our approach, the BDL data are replaced with y(t) - the value of 1/2 of the operant detection limit and observations are weighted" by w( t ) = ( l/rt) , (see 12, eq. 2.2, p. 11). In general, heteroscedasticity in the data can occur due to many factors, including those due to sampling variability. It is important to recognize the need for a composite measure of uncertainty due to all contributing factors. However, construction of such a model requires estimates of the individual sources of variability and an understanding of their inter-relationships, which may be difficult to obtain. An example Since the inception of the Chesapeake Bay Program Water Quality Monitoring Program, detection limits for orthophosphate concentrations sampled at mainstem monitoring stations successively declined due to modifications in the water chemistry analysis methods (Figure 1). In this example, we apply our method to a time series of orthophosphate concentrations measured from June 1984 through February 1991 at a monitoring station in Maryland waters near the Maryland/Virginia border. In the field sampling procedure, grab samples were taken at different depths in the water column at the station in order to characterize the depth profile of nutrient concentrations. These samples were taken at the surface, bottom and two mid- water depths which depended upon the density structure of the water column. We selected samples measured above the pycnocline (AP) because this series appeared to contain a downward trend and inspection of the data revealed no further significant autocorrelation in the data. CSC.SA2.6/.; Incorporating Uncertainty ~.i Parametric Trend Analysis • Page k ------- Figure 1. Sample time series of observed mean monthly orthophosphate (P04F) concentrations, showing detection limits. Below detection data only occurred prior to March 1985 in this data set and are shown set to the detection limit. The prevailing detection limit in each period is shown by the dashed line. CB5.3 AP P04F Mean Monthly Concentration BDL observations set to the detection limit P04F mg/l 0.014 0.013 0.012-j* 0.011 0.010 0.009 0.008 0.007- 0.006- O.OOB 0.004- I 0.003- 0.002- 0.001 0.000- I ^••••••f o 0 0 o oo I'l I | T-VT-f 1 J 0 F u c • n t b J U n ^T1 0 c t 8 4 B 0 F « b a 6 J U n O C t F « b a 7 J U n O C t F « b e 8 JO UC nt F « b e 8 JO UC nt FJ «U bn 9 0 O C t g 1 Month esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 5 ------- During the winter, the station was sampled once a month; in other months, it was sampled twice a month. To obtain a monthly series of observations, we averaged the within-month concentrations to obtain monthly values (see Figure 1). In this data set, within-month concentrations were either BDL or above detection at the same detection limit, therefore obviating the problem of estimating a monthly average from BDL and above-BDL data. In this plot, BDL data were set equal to the detection limit; the only BDL data in the series occurred prior to March 1985 and all these values were BDL. The monthly values and the range of uncertainty, r(t), associated with each monthly value in this sample data set is shown in Figure 2. In this plot, BDL data were set equal to 1/2 the detection limit. We calculated r(t) as 1/3 the prevailing detection limit (see Figure 1), since detection limits of orthophosphate were determined as 3 times the standard deviation of duplicate samples (C. Zimmermann, pers. comm.; N. Fristche). To investigate the level of censoring on our estimates, we created four additional data sets based on this original sample data. This was done by applying an artificial detection limit (ADL) to the data from March 1985 onward. We selected four ADL's to correspond to approximate additional censoring levels of 5% (ADL=0.00240) , 15% (ADL=0.00265), 30% (ADL=0.00315) and 50% (ADL=0.00420) of the data. Thus, each of these ADL's, applied individually, replaced the real detection limits of 0.0016 and 0.0006 that occurred from March 1985 onward. These artificial data sets retained the real detection limits of 0.012 and 0.007 prior to March 1985. These contrived detection limit series are shown in Figure 3 along with the observed mean monthly concentrations (BDL prior to March 1985 are shown set to 1/2 the detection limit). To examine the effect of the substitution choice for BDL data, we constructed three categories of sample data sets by setting Br>L observations equal to 0, to 1/2 the prevailing detection limit, or to the prevailing detection limit. To illustrate the effects of applying the weighting procedure, we modeled each sample series twice: (1) weighting each observation according to the prevailing detection limit (as shown in Figures 1 and 3) and (2) applying no weights (which computationally means applying the same weight) to the observations. Our rationale for weighting all observations, not only BDL data, in a time period according to the prevailing detection limit is based on the idea of data precision that is associated with detection limits (10): a datum of 0.03 determined with a detection limit of .03 has the same level of uncertainty of measurement as a value of 0.015 associated with a detection limit of 0.015 (100%; see 13, p.82). esc 3A2.6/92 Incorporating Uncer'ainty in Parametric Trend Analysis • Page 6 ------- Figure 2. Sample time series of observed mean monthly orthophosphate (P04F) concentrations with error bars. Below detection data prior to March 1985 set to 1/2 the prevailing detection limit. The error bars around each observation are equal to 1/3 the prevailing detection limit. CB5.3 AP P04F Mean Monthly Concentration with range of uncertainty about values BDL observations =1/2 detection limit P04F mg/l 0.014- 0.013 0.012- 0.011- 0.010- 0.009- 0.008- 0.007- 0.006- O.OOB- 0,004- 0.003- 0.002- 0.001- 0.000- J U n t t , t t + V H /, • • - Ut T j h . T ^ + +++++ ++++++++ + + + + •f 1 • ' p • I I • * • I • ' • i • • • | 1 • I | I i 1 | 1 I I | I 1 I | I I I | I i 1 p I I p I I | i i I | i 1 i p i i p 1 I | 1 1-rf i i i p i i [ OFJOFJOFJOFdOFJOFJOF c«uc*uc«uceuceuceuca tbntbntbntbntbntbntb e 4 8 6 8 e 8 7 e B B 9 g o g i Month esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 7 ------- Figure 3. Sample time series of observed mean monthly orthophosphate (P04F) concentrations, with additional censoring levels (ADL's) shown. ADL's used were 0.00240, 0.00265, 0.00315 and 0.0042 mg/1. Below detection data prior to March 1985 are shown set to the prevailing detection limit. The actual detection limits prior to March 1985 are shown by a single dashed line. CB5.3 AP P04F Mean Monthly Concentrations showing levels of ADLs applied March 1985 on P04F mg/l -- 0.014 0.013 0.012 0.011 0.010 0.008- 0.008- 0.007- 0.008- 0.000- 0.004- 0.003- 0.002- 0.001- 0.000- « 1 I 1 4 O 00° o &ooeeooo o o o 0 ° 0 n 0 0 o o o0 o oo o o » „ iftz^z-?io?>_irrri^_^_r^:i§j£:Qi::ij_5ip i 5" O ~*CT O JOFJOFJOFJOFJOFJOFJOF Jc«uc«uc«uc«uc«uceuc» itbntbntbntbntbntbntb 988 88 8 0 9 (B6 78 9 0 1 Month esc.sA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 8 ------- .We used the method described in (14) to model trends in each of the example series as a seasonal autoregression of concentration on time. While this method consists of two steps, we terminated our estimation after the first step since this gave the results of interest to us. Following the first step of this method, preliminary estimates of the trend and seasonal effects were obtained using ordinary least squares (OLS) regression. We accomplished the weighting procedure by multiplying y(t) and t by w(t) prior to the model step. For the second step, a model for the residuals would have been obtained using the Box-Jenkins methodology. Using the selected model for the error terms, revised estimates of trend and seasonal effects could then be obtained using generalized least squares. We followed this method prior to step two and made computations using the AUTOREG procedure of SAS/ETS Version 6. The trend estimates, their standard errors, and associated probability level obtained for each data set using weighted and unweighted (weights=l) regression estimation are tabulated in Tables 1-3 according to the BDL substitution method. Tables 1, 2 and 3 show the results from the unweighted and weighted regressions for the three substitution choices for BDL data and different censoring levels. These results show that the weighted regression estimates of trend coefficients were less sensitive to the value used for substitution than were the unweighted regression estimates. Estimates of trend coefficients from the weighted regressions ranged from -0.000013 to -0.000024; trend coefficients from unweighted regressions ranged from -0.000002 to -0.000038. Specifically, for the original data series, the trend estimates from the weighted regressions were -0.000021, -0.0000022 and -0.000023 whereas the corresponding unweighted regression estimates were -0.000019, - 0.000009 and -0.000038. Even at the level of 50% censored data, trend estimates from the weighted regression were comparatively more robust. Although the occurrence of BDL data introduced bias in the both weighted and unweighted trend estimates in each substitution method, the weighted regression method appeared more advantageous in another important aspect of trend modelling of environmental data with changing detection limits. By interpreting the detection limit as a measure of uncertainty associated with all data points, the practice of setting all data below the highest detection limit of censored data in the data set to the highest detection limit to avoid the estimation of spurious trends becomes unnecessary. This is illustrated by the stability of weighted trend estimates over censoring levels esc.SAZ.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 9 ------- Table 1. Trend estimates from weighted and non-weighted regressions with different levels of censoring, with below detection observations set to the detection limit. Estimated slope (b), standard error of the slope and associated probability (p) are indicated. Censoring level original + 5 % + 15 % + 30 % + 50 % b s .e . (b) P b s.e.(b) P b s.e. (b) P b s.e. (b) P b s.e. ( b) P Weighted -0.000023 0.000009 0.0243 -0.000023 0.000009 0.0208 -0.000024 0.000009 0.0177 -0.000023 0.000009 0.0135 -0.000022 0.000008 0.0068 Unweighted -0.000038 0.000009 0.0001 -0.000036 0.000009 0.0002 -0.000036 0.000009 0.0002 -0.000036 0.000009 0.0002 -0.000030 0.000008 0.0002 esc.SA2.6/92 Incor, oratin? Uncertainty in Parametric Trend Analysis • Page 10 ------- Table 2. Trend estimates from weighted and non-weighted regressions with different levels of censoring, with below detection observations set to one-half of the detection limit. Estimated slope (b), standard error of the slope and associated probability (p) are indicated. Censoring level original + 5 % + 15 % + 30 % + 50 % Weighted b s.e.(b) P b s.e . (b) P b s.e. (b) P b s.e. (b) P b s.e.(b) P -0. 0. 0. -0. 0. 0. -0. 0. 0. -0. 0. 0. -0. 0. 0. 000022 000010 0317 000020 000010 0552 000018 000011 . 0952 000019 000011 0936 000019 000011 0974 Unweighted -0 0 0 -0 0 0 -0 0 0 -0 0 0 -0 0 0 .000009 .000008 .3149 .000010 .000009 .2988 .000009 .000009 .3560 .000011 .000010 .2628 .000014 .000010 .1696 csc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 11 ------- Table 3. Trend estimates from weighted and non-weighted regressions with different levels of censoring, with below detection observations set to zero. Estimated slope (b), standard error of the slope and associated probability (p) are indicated. Censoring level original + 5 % + 15 % + 30 % + 50 % b s.e.(b) P b s.e . (b) P b s.e. ( b ) P b s.e.(b) P b s.e.(b) P Weighted -0.000021 0.000010 0.0433 -0.000017 0.000012 0.1481 -0.000013 0.000013 0.3169 -0.000015 0.000015 0.3067 -0.000016 0.000016 0.3319 Unweighted -0.000019 0.000011 0.0912 -0.000017 0.000012 0.1603 -0.000018 0.000013 0.1582 -0.000012 0.000014 0.3945 -0.000002 0.000015 0.9026 csc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 12 ------- within each substitution method. Using weighted regression therefore avoids the loss in improved data quality imposed by this latter practice. Use of weighted regression also improved other data aspects. When the data series contained a high percentage of BDL data, a significant amount of autocorrelaMon occurred in the model residuals. xhis autocorrelation was not nearly as prevalent in residuals from our weighted regressions as in the residuals from the unweighted regression. IV. CONCLUSIONS We conducted a comprehensive search of the literature on the treatment of BDL data in statistical water quality analysis. We found several useful articles written by Helsel, Gilliom, Conn and others on this topic. The accompanying annotated bibliography (Appendix 1) provides a more detailed summary of the articles. The articles we found discussed the computation of simple descriptive statistics only. These papers (except for 8 which concerns tobit regression) did not discuss trend estimation. The literature is sparse on how to deal with censored data when the data are autocorrelated. In this paper we proposed a weighting scheme which interprets the detection limit as a measure of uncertainty. The approach seems reasonable since detection limits are usually calculated as multiples of the standard deviation of blank determinations. For instances in which detection limit calculations are unknown, we present a weighting function that can be used which does not rely on distributional assumptions. A mathematical result shows that it is the most conservative estimate of uncertainty, over all possible distributions. The advantage of using weighted regression rather than unweighted regression in describing trends in our sample data set was threefold. First, weighted trend estimates were much less sensitive to the level of censoring in the data than were unweighted trenc! estimates. Second, weighted trend estimates were much less sensitive to the choice of data substitution for BDL data (i.e. 0, 1/2 the detection limit, or the detection limit) than were unweighted trend estimates. Finally, the complicating effects of autocorrelation weire less in the weighted regression models than in the unweighted regression models. The results also showed that the use of weighted regression eliminates the need to censor data to the maximum detection limit when dealing with multiply censored data series to avoid spurious trends. Thus, trend estimation with weighted csc.sA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 13 ------- regression preserves the effect of improved detection limits in data sets. Th.e performance of the estimation procedure based on the weighting scheme we have demonstrated could be further evaluated using Monte Carlo methods. Such a study would provide a good topic for further research. Questions that we have attempted to address in a general way could be moi.e extensively examined. For example, the answer to the question of how many BDL data are permissible in an analysis is affected by the range in concentration in the data relative to the censored data levels, the distribution of censored values in the data series (are they just at the beginning, or do they occur seasonally, etc.), and the amount of change in above detect data concentrations over time . An answer to the question concerning the best choice of substitution method is also data dependent and application dependent, even in the estimation of simple quantities such as mean, median, standard deviation etc. Relative performance of estimation methods described in (6) for the mean, standard deviation, median and interquartile range of actual river sample water quality data series of 25 observations depended on the level of censoring. Most of the 8 methods differed little at 20%, the lowest level of censoring they investigated. At higher levels of 40%, 60%, and 80%, assuming a uniform distribution (UN) or a log-normal distribution (LR) for BDL data produced the lowest and similar errors of estimation for the mean and standard deviation. For the estimation of the median and interquartile range of these data sets, the UN method was usually second best, but only greatly at 80% censoring of the data, to a maximum likelihood method (LM). Thus, the uncomplicated assumption of a uniform distribution for BDL data in most cases was quite adequate? for the estimation of a mean, this is equivalent to choosing 1/2 the detection limit for BDL data values. These results differed somewhat from those the same authors achieved on simulated data sets which were generated from specific types of distributions, most of which were skewed to the right, with long tails extending to higher concentrations (1). In these simulations, the' relative performance of LR , LM and UN methods varied among cases, with the UN method being only really poorer at the 80% censoring level for estimation of median and interquartile range. V. RECOMMENDATIONS It is difficult to advance a general recommendation concerning the accommodation of BDL data in trend analysis of Chesapeake Bay Program water quali.ty monitoring data which will perform well in all scenarios. We offer the following simple csc. SA.. 6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page ------- recommendations for trend estimation based on our results, those of papers we have consulted and the characteristics of CBP mainstem data we have examined. It is important to keep in mind that no substitution method for BDL data is completely accurate, and that the degree of inaccuracy imparted by substitution methods for BDL data to estimated statistics cannot be known with real data seri s. 1) Assume a uniform distribution of BDL data between the detection limit and zero. Substitute 1/2 the detection limit for BDL data. 2) Use weighted regression, applying a weighting factor that represents uncertainty in the data. esc.sA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 15 ------- VI. REFERENCES (1) Gilliom, R.J.; Helsel, D.R. Water Resour. Res., 1986,22,135- 146. (2) Gilliom, R.j.; Hirsch, R.M.; Gilroy, S. J. Environ. Sci. Technol., 1984, 18, 530-535. (3) Gleit, A. Environ. Sci. Technol., 1985, 19, 1201-1206. (4) Helsel, D.R. Environ. Sci. Technol., 1990, 24, 1766-1774. (5) Helsel, D.R.; Cohn, T.A. Water Resour. Res., 1988, 24, 1997- 2004. (6) Helsel, D.R.; Gilliom, R. J. Water Resour. Res., 1986, 147- 155. (7) Porter, P.S.; Ward, R.C.; Bell, H.F. Environ. Sci. Technol.,1988, 22, 856-861. (8) Robinson, P. M. Econometrica, 1982, 50, 27-41. (9) Travis, C. C.; Land, M. L. Environ. Sci. Technol., 1990, 24,961-962. (10) ACS Committee on Environmental Improvement, 1980, Anal. Chem., 52, 2242-2249. (11) ACS Committee on Environmental Improvement, 1983, Anal. Chem., 55, 2210-2218. (12) Carroll, R.J.; Ruppert, D. Transformation and weighting in Regression. Chapman and Hall: New York, New York, 1988. (13) Taylor, J.K. Quality Assurance of Chemical Measurements, Lewis Publishers, Inc.: Chelsea, Michigan, 1987. (14) Nagaraj, N. K.; Brunenmeister, S. L. In Proceedings of a Conference "New Perspectives in the Chesapeake system: A research and Management Partnership", Baltimore, MD. Chesapeake Bay Consortium Pub. No. 137, 355-369. csc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 16 ------- APPENDIX 1 Annotated Bibliography Gilliom, R. J. and Helsel, D. R. (1916) "Estimation of distributional parameters for censored trace level water quality data. 1. Estimation techniques" Water Resources Research, 22 (2):135-146. Several methods were evaluated for estimating distributional parameters for censored data sets using only the uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which "small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median and interquartile range. The eight methods we re: 1. (ZE) Censored observations were set equal to zero 2. (DL) Censored observations were set equal to the detection limit 3. (UN) Censored observations were assumed follow a uniform distribution between zero and detection limit. 4. (NR) Censored observations were assumed follow the zero-to- detection limit portion of a normal distribution which was fit to the uncensored observations using least squares regression. 5. (LR) Censored observations were assumed to follow zero-to- detection limit portion of the log-normal distribution fit to the uncensored data by least squares regression. The method is identical to NR, except that concentrations were log-transformed prior to the analysis. 6. (KM) Concentrations are assumed to be normally distributed with parameters estimated from the uncensored observations by the maximum likelihood method for a censored normal distribution. 7. (LM) Concentrations are assumed to be log-normally distributed with parameters estimated from the uncensored observations by the maximum likelihood method for a censored lognormal distribution. csc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 17 ------- 8. (DT) Censored observations are assumed to be zero and uncensored observations are assumed follow a lognormal distribution. Estimates of parameters of the overall delta distribution are obtained by computing maximum likelihood estimates of the uncensored lognormal portion using relationships between these and the overall delta distribution. The evaluation of the reliability of estimation methods was based on the root mean squared error (RMSE) computed from actual parameters underlying the distribution. The average RMSE, over all the parent distributions, was expressed as fractions of the true values. The methods were then ranked according to increasing value for the RMSE, for various parameter and censoring level combination. The best overall method was chosen by summing the ranks of RMSE's over all sample sizes, censoring levels, and parameters. The method with the smallest sum of ranks is declared to be the best. Note that some of the methods considered have certain distributional assumptions. If a priori knowledge of the true distribution of the data is known, one may simply consider the corresponding method. The paper also gives results from a Monte Carlo study where certain sample characteristics are used to classify the data as to what the parent distribution would be and apply the corresponding method. The paper concludes that the most robust estimation method for minimizing errors in estimates of the mean, median, standard deviation and interquartile range of censored data was the log probability regression method (LR). This method is based on the assumption that censored observations follow the zero-to- censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their normal scores. When method performance was evaluated separately for each distributional parameter, LR resulted in the lowest RMSEs for mean and standard deviation. The lognormal maximum likelihood estimator for censored data (LM) produced the lowest RMSE for median and interquartile range. The authors conclude that these two methods constitute the best procedures for their respe-ctive parameters. Gilliom, R. J., Hirsch, R. M., and Gilroy, E. J. (1984) "Effect of Censoring Trace-Level Water-Quality Data on Trend- Detection Capability" Environmental Science and Technology, 18(7): 530-535. esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 18 ------- This paper makes a strong case against reporting values as below detect values. The article decidedly argues that the data analyst is better off with an unreliable data value than to be given a value as simply below detect. The following are a few chosen paragraphs from the article highlighting its argument. Summary: Monte Carlo experiments were used to evalua'e whether trace-level water quality data that are routinely censored (not reported) contain valuable information for trend detection. Measurements are commonly censored if they fall below a level associated with some minimum acceptable level of reliability (detection limit). Trace-level organic data were simulated with best- and worst-case estimates of measurement uncertainty, various concentrations and degrees of linear trend, and different censoring rules. The resulting classes of data were subjected to a nonparametric statistical test for trend. For all classes of data evaluated, trends were most effectively detected in uncensored data as compared to censored data even when the data censored were highly unreliable. Thus, censoring data at any concentration level may eliminate valuable information. Whether or not valuable information for trend analysis is, in fact, eliminated by censoring of actual rather than simulated data depends on whether the analytical process is in statistical control and bias is predictable for a particular type of chemical analysis. The purpose of the report is to demonstrate that trace- level data that are routinely censored may contain valuable information despite their greater uncertainty, and that, given certain conditions, adopting a "no censoring" rule increases the data analyst's ability to detect the actual trend in data. This conclusion was reached by the Monte Carlo experiment reported in the article. Trace-level data were simulated with best- and worst-case estimates of measurement uncertainty, various degrees of linear trend, and different censoring rules. The resulting class of data were subjected to a nonparametr ic statistical test for trend. The effect of different degrees of censoring is shown by comparing the power of (effectiveness) of trend detection, as determined from Monte Carlo experiments, for different classes of data with different censoring rules. For all classes of data evaluated, trends were most effectively detected in uncensored data as compared to censored data even when the data were highly unreliable. Censoring data at any concentration level, especially given our lack of knowledge about prevailing concentrations, may eliminate valuable information. The more reliable the data censored, the greater the information lost and the more detrimental the effects of censoring. esc.sA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 19 ------- The results of the present study lend strong support to the argument, earlier advanced by others in Rhodes, R. C. I In "Environmetrics 81: Selected Papers"; SIAM Institute for Mathematics and Society: Philadelphia, 1981; pp 157-162] and generally followed in new ASTM standards, that data should not be routinely censored by laboratories. Uncensored data should always be retained in permanent records available to data users even if policy makers of a laboratory decide that some censoring or other form of qualification is necessary before general public release of data. Measurement data should not be discarded unless the lack of statistical control in the measurement process is clearly demonstrated. Gleit, Alan. (1985) "Estimation for Small Normal Data sets vith Detection Limits" Environmental Science and Technology, 19(12): 1201-1206. Summary: The paper identifies the detection limit problem as a problem of censoring. The problem of optimal substitution for the values reported as below detection limit is explored. The several commonly used methods are considered. Some new methods, such as using EM-algorithm to estimate the unavailable data; using some optimal linear functions of order statistics, etc. are also considered. The performance of these methods are compared, via a simulation study. The overall winner was the method which uses the expected values as the fill-in values. But it takes a few iterations to obtain the substitution for each below detection limit value. The steps involved in the iteration are: (i) Select any initial convenient guesses for the mean and variance. (ii) On the basis of our tentative current values for the mean and variance, calculate the expected values for th'e first p order statistics conditional on being less than L. (Note that the p is the number of below detects in the data and L is the detection limit.) (iii) Now that first iterates are available for the below detect values the initial guess for the mean and variance can be updated. (iv) If the current and updated values are close then STOP. Otherwise, go back to step (ii) using the updated versions as current values. csc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 20 ------- The procedure described above can be programmed in SAS with the .help of MACRO processing. In general, the number of iterations required will depend on the accuracy desired. The most significant conclusion of the article is that fill-in by constants (zero or the detection limit or half the detection limit) can be, in general, orders of magnitude worse. The methods discussed are under the assumption that the observations are independent. The change in the detection limit problem is not addressed. The fill-in by expectation method given in the paper assumes normality. The most significant drawback of the results, as far as its applicability to the CBP water quality monitoring data is concerned, is the assumption of independence. A considerable amount of theoretical work may be involved before the methods described in the paper may be applied to correlated data. Helsel, D. R. (1990) "Less than Obvious: Statistical Treatment of data below the detection limit" Environmental Science and Technology, 24(12): 1766-1774. This is by far the most comprehensive article, from a statistical stand point and therefore a must-read in this list. The paper discusses the most appropriate statistical procedures for handling data that have been reported as below detects. First, the appropriateness of various commonly used summary statistics in the presence of below detect data is discussed. It points out that mean and standard deviation may be quite sensitive to the deletion or addition of even one observation. Since environmental data are typically skewed, mean and standard deviation make poor measures of central value and variability. Alternative measures suggested in this article are median and interquartile range. Also the geometric mean, the mean of logarithms of the data also serves well since it estimates the median (in original units) when the logarithms are symmetric. The paper points out an important advantage of median and interquartile range when applied to censored data. The sample median is completely known when up to, but less than, 50% of the data were reported as below detects. Similarly, interquartile range for the sample is known when up to, but less than, 25% of the data is reported as below detects. Thus, no fix-ups are necessary to obtain these sample summary statistics. The paper considers three classes of estimation methods. esc.sA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 21 ------- 1. Substitution methods: These methods substitute a single value for the unseen below detect observations in the data. These, generally, lead to biased estimation of mean, median, standard deviation and interquartile range. 2. Distributional method: An assumption about the underlying distribution which produced the data is made. Giver the distribution, estimates of summary statistics that best match the observed concentrations above the detection limit and the percentage of below detects are obtained. Maximum likelihood estimation and method of probability plotting are two methods that can be used. The two methods are unbiased as sample size gets large, provided the assumption about the distribution is correct. 3. Robust methods: This is, in fact, a combination of methods 1 and 2t Under certain distributional assumption, the first few (depending on how many non-detects are in the data) percentiles are estimated. Then these percentiles are used in the place of the below detect data values in the ensuing calculations. The author reports, with references, that the robust methods have consistently produced small errors. The author strongly recommends the use of robust methods in practice. The implementation of the robust method in commonly available software (SAS, for example) is described here. The method is more involved if there are multiple detection limits present. The FORTRAN code to implement this procedure may be obtained from the author. The references cited by the author explore the problem under the assumption of independence. Thus, a study extending these ideas and exploring their usefulness for correlated data is called for. Helsel, D. R. and Conn, T. A. (1988) "Estimation of Descriptive Statistics for Multiply Censored Water Quality Data" Water Resources Research, 24(12): 1997-2004. This paper extends the work of Gilliom and Helsel (1986) to the situation when multiple detection limits are present. In addition to the eight methods in that paper, substitution of all below detects by a value of one half their detection limit; an extension of the log probability method (LR) to multiple detection limit situation, namely MR; an extension of the maximum likelihood method for censored data to multiple esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysi- • Page 22 ------- censoring limit situation, namely MM; and an adjusted maximum likelihood procedure [due to Tim Cohn discussed in an USGS Tech Report] namely, AM. The major conclusions of the paper are: 1. Estimates of descriptive statistics for multiply censored data may contain no more error than those for singly censored data, if efficient methods are used. 2. The AM and MR methods are found to be substantially better than any of the simple substitution methods. Helsel, D. R. and Gilliom, R. J. (1986) "Estimation of Distributional Parameters for Censored Trace Level Water Quality Data. 2. Verification and Applications" Water Resources Research, 22(2): 147-155. This is a companion paper to Gilliom and Helsel (1986); the approach is basically the same. Real data sets were used in place of simulated data sets. Uncensored data sets with more than 50 observations for suspended sediment, total phosphorus, total Kjeldahl nitrogen, and nitrate nitrogen concentrations were obtained from 313 stations of the U.S. Geological Survey's National Stream Quality Accounting Network (NASQAN). Most data sets were monthly samples taken during 1974-1981, resulting in 917 data sets having more than 50 observations and no censoring. The sample values from the uncensored data were used as the population parameters in the computation of RMSE. The LR method produced the lowest RMSE for the moment parameters. Porter, P. S., Ward, R. C., and Bell, H. F. (1988) "The Detection Limit" Environmental Science and Technology, 22(8): 856-861. This article argaes strongly that it is _mportant give the actual measurement and the margin of error rather than report it a below detect. They emphasize the loss of information due to the practice of reporting the value as below detect. esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 23 ------- Travis, Curtis, C. and Land, Miriam L. (1990) "Estimating the mean of data sets with nondetectable values" Environmental Science and Technology, 24(7): 961-962. This paper mentions the use of log-probit analysis to estimate the mean, when a possibly a very high percentage of data sets are censored. Several relevant references are listed. The technical description of the log-probit analysis is not available here. An example is available. csc.sA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page ------- APPENDIX 2 Proof of upper bound on variance In this appendix we give the pro f of Result 1. RESULT 1 Let X be a random variable taking values in the interval [a,b]. Then variance of X does not exceed (b-a)2/4. PROOF : Let Y=(X-a)/(b-a) . Since Var ( X)= ( b-a ) 2 Var(Y), it is enough to. show that 0 < Var(Y) < 1/4. Since Y takes values in the interval [0,1], Y2 also takes values in the interval [0,1] and we have that 0 j< E(Y2) <. E(Y) 1 1. Therefore , Var(Y) = E(Y2) - (E(Y) ]2 1 E(Y) - [E(Y) ]2 Now the conclusion follows noting that maximum of the function p(l-p), where p ranges in the interval [0,1], is 1/4. esc.SA2.6/92 Incorporating Uncertainty in Parametric Trend Analysis • Page 25 ------- ------- |