NATIONAL CENTER FOR ENVIRONMENTAL ECONOMICS

Issues and Challenges in Measuring Environmental Expenditures by U.S. Manufacturing: The Redevelopment of the PACE Survey

Randy Becker and Ron Shadbegian

NCEE Working Paper Series
Working Paper # 07-08
July, 2007

U.S. Environmental Protection Agency
National Center for Environmental Economics
1200 Pennsylvania Avenue, NW (MC 1809)
Washington, DC 20460
http://www.epa.gov/economics

DISCLAIMER

The views expressed in this paper are those of the author(s) and do not necessarily represent those of the U.S. Environmental Protection Agency. In addition, although the research described in this paper may have been funded entirely or in part by the U.S. Environmental Protection Agency, it has not been subjected to the Agency's required peer and policy review. No official Agency endorsement should be inferred.

Abstract

The Pollution Abatement Costs and Expenditures (PACE) survey is the most comprehensive source of information on U.S. manufacturing's capital expenditures and operating costs associated with pollution abatement. In 2003, the U.S. Environmental Protection Agency began a significant initiative to redevelop the survey, guided by the advice of a multi-disciplinary workgroup consisting of economists, engineers, survey design experts, and experienced data users, in addition to incorporating feedback from key manufacturing industries. This paper describes some of these redevelopment efforts. Issues discussed include the approach to developing the new survey instrument, methods used to evaluate (and improve) its performance, innovations in sampling, and the special development and role of outside expertise. The completely redesigned PACE survey was first administered in early 2006.
Keywords: Survey design, survey evaluation, sampling, environmental costs, manufacturing

Subject Matter Classifications: Costs of Pollution Control

Note that the opinions and conclusions expressed herein are those of the authors and do not necessarily represent the views of the U.S. Census Bureau or the U.S. Environmental Protection Agency. All empirical results have been reviewed to ensure that no confidential information is disclosed. Otherwise, this work has not undergone the review accorded official Census Bureau publications. The authors would like to thank Paul Hsen and Amanda Lee for their comments on an earlier draft. A version of this paper is forthcoming in Proceedings of the Third International Conference on Establishment Surveys.

1. Introduction

The Pollution Abatement Costs and Expenditures (PACE) survey is the most comprehensive source of information on U.S. manufacturing's capital expenditures and operating costs associated with pollution abatement. Administered by the U.S. Census Bureau, the PACE survey began in 1973, but was discontinued after 1994 for budgetary reasons. With guidance and financial support from the U.S. Environmental Protection Agency (EPA), a substantially new version of the PACE survey was administered for reference year 1999. For a number of reasons, the usefulness of the data from this particular survey is limited (Becker and Shadbegian 2005). In response, in late 2003, the EPA began a significant initiative to redevelop the survey, guided by the advice of a multi-disciplinary workgroup consisting of economists, engineers, survey design experts, and experienced data users, as well as incorporating feedback from key manufacturing industries.

This paper describes some of these efforts, focusing on particular measurement issues and challenges. Among these issues is determining what should be measured by such a survey.
This requires balancing the needs of data users with the ability of businesses to report information that they may not specifically track. Another obvious challenge is to design a survey instrument to adequately capture these difficult-to-report items. Here, we summarize the approach taken to develop the new survey instrument. The redeveloped survey also benefited from two novel evaluation exercises. In one, responses to a pretest survey were compared to estimates produced by engineers and economists during a visit to the establishment. In the other, responses to a larger pilot survey were compared to historical data, at both the industry and establishment levels. As we will describe, these exercises resulted in additional significant improvements in the survey instrument. Several significant innovations in sampling are also discussed. We end by noting the role that outside experts played in the redevelopment effort. The completely redesigned PACE survey was first administered in early 2006, for reference year 2005.

2. Developing the Survey Instrument and Instructions

After 22 years of continuous collection, the PACE survey was discontinued by the Census Bureau for budgetary reasons after the 1994 survey. With an unmet need for such data, the EPA decided to step in with the necessary funding for the PACE survey (Iovanna et al. 2003). With consultation from groups within the agency, the EPA introduced a substantially new version of the PACE survey, which was administered for reference year 1999. Concerned about respondents' ability to provide meaningful data on pollution abatement expenditures - a concern that has been expressed by many economists over the years - the U.S. Office of Management and Budget (OMB) approved the PACE survey for just this one year, pending a post-survey review of the quality of responses and the plausibility of the resulting published estimates.
The usefulness of the data from the 1999 PACE survey proved to be quite limited, for a number of reasons, not the least of which was loss of longitudinal comparability (Becker and Shadbegian 2005). If there was to be another PACE survey, it became clear that it would need to be different from the 1999 survey.

It is with this backdrop that, in late 2003, the EPA initiated a comprehensive review and redevelopment of the PACE survey, to be led in part by RTI International (under subcontract to ICF Consulting). This expansive initiative had numerous goals. Among them, experts and stakeholders outside of the EPA would be consulted frequently throughout. Another would be to restore the longitudinal consistency of the PACE data, while at the same time employing current terminology and structuring the survey in a manner consistent with establishments' ability to report the data. Serious attempts would also be made to address and overcome the concerns that have been raised about the PACE data. Over the years, numerous academic studies (including some by the authors) have cast suspicion on the quality of PACE data. The OMB, too, has expressed apprehension about the ability to collect accurate data on pollution abatement. In response, significant analyses to examine the validity of survey responses would be conducted. Issues and recommendations raised during a 2-day workshop on the PACE survey, funded by EPA, held by Resources for the Future in March 2000, and attended by over 40 experts from academia, government agencies, nongovernmental organizations, and industry, would also be considered (see Burtraw et al. 2001).

Here, we briefly outline some of RTI's efforts (see Gallaher et al. 2006 for more detail). RTI began with a historical review of the PACE survey, a review of the literature that has raised concerns about the PACE survey, and initial thoughts on its redevelopment (Ross et al. 2004).
An RTI economist and engineer also conducted four on-site interviews with establishments engaged in the production of pulp & paper, iron & steel, petroleum, and electricity. The purpose of these visits was: (1) to gain insight into the type of cost information that facilities compile that may, in turn, be used to calculate the costs associated with pollution abatement, (2) to determine the usefulness of these data for responding to the PACE survey, and (3) to solicit comments regarding the format, content, and clarity of the 1994 and 1999 versions of the PACE survey instrument. Consultations also began with the multi-disciplinary expert panel composed of economists (some with significant experience with PACE data and an interest in future PACE data), an environmental engineer, and a survey design expert, with participation from several others as well. In parallel, an EPA workgroup consisting of representatives from seven of its program offices was also consulted regarding the potential survey content.

From all of the above grew an early draft of a (new) PACE survey instrument and instructions. This was followed by 9 one-on-one interviews, conducted by an RTI economist and engineer, with 5 establishments and 4 industry trade associations (from the same four industries consulted at the outset), all of which had been sent the survey instrument and instructions beforehand. The valuable feedback obtained from these visits was discussed and debated over a series of meetings with the multi-disciplinary expert panel.

The end result of these efforts was a 2004 PACE survey instrument that would be the subject of a pretest and a pilot survey (discussed in the next section). Because of data users' need for longitudinal comparability, this 2004 survey is closest in spirit to the 1994 survey, particularly in its intended definition of pollution abatement costs.
However, data users argued for keeping one main feature of the 1999 survey in the 2004 survey (with some modification), namely the recognition of four distinct pollution abatement activities: treatment/capture, prevention, recycling, and disposal.[1] Because this is merely an additional partitioning of pollution abatement costs, relative to the 1994 survey, rather than a change in the scope of these costs, this should not impact historical comparability.

[1] Unlike 1999, environmental testing and monitoring, as well as certain administrative activities, are to be included in these four activities. Also, the concept of treatment/capture was (confusingly) called pollution abatement in 1999. Other definitional differences exist as well.

The 2004 survey still asks for costs by media (air, water, solid waste) and by type of cost (capital expenditure, labor costs, energy costs, materials & supplies, contract work & services, depreciation). However, the manner in which this is done is different from that of 1994. Instead of a matrix of type of cost by media, total pollution abatement operating costs would be asked as the sum of the 5 types of costs, and respondents would then be asked to report the percentage of that total attributable to each of the 4 activities, and the percentage of that total attributable to each of the 3 media. Likewise, in the case of capital expenditure, instead of a matrix of media by activity, total pollution abatement capital expenditures would be asked as the sum of the 4 activities, and respondents would then be asked to report the percentage of that total attributable to each of the 3 media. Interviews with facilities revealed that this structure is more consistent with their recordkeeping and their ability to respond. While this comes with some loss in data, data users agreed that the matrix approach was not worth the additional respondent burden and likely item non-response.
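The reporting structure described above (a dollar total built from cost types, then percentage splits by activity and by medium) can be illustrated with a minimal sketch. The key names, the tolerance on the percentage sums, and the function names below are assumptions made for illustration; they are not taken from the survey form or from any Census Bureau processing system. The five operating cost types are read here as labor, energy, materials & supplies, contract work & services, and depreciation.

```python
# Hypothetical key names; the actual form wording may differ.
COST_TYPES = ("labor", "energy", "materials_supplies",
              "contract_work_services", "depreciation")
ACTIVITIES = ("treatment_capture", "prevention", "recycling", "disposal")
MEDIA = ("air", "water", "solid_waste")


def check_shares(shares, expected_keys, tol=0.5):
    """Raise ValueError unless shares cover expected_keys and sum to about 100.

    The 0.5-point tolerance is an assumption, not a documented edit rule.
    """
    if set(shares) != set(expected_keys):
        raise ValueError(f"expected shares for exactly {expected_keys}")
    if abs(sum(shares.values()) - 100.0) > tol:
        raise ValueError("percentage shares must sum to 100")


def apportion_paoc(costs, activity_pct, media_pct):
    """Return total PAOC and the dollar amounts implied by each split."""
    if set(costs) != set(COST_TYPES):
        raise ValueError(f"expected costs for exactly {COST_TYPES}")
    check_shares(activity_pct, ACTIVITIES)
    check_shares(media_pct, MEDIA)
    total = sum(costs.values())
    by_activity = {a: total * activity_pct[a] / 100.0 for a in ACTIVITIES}
    by_medium = {m: total * media_pct[m] / 100.0 for m in MEDIA}
    return total, by_activity, by_medium
```

Because each split applies to the total only, the sketch cannot recover a cell such as labor costs for water pollution; this is the loss in data, relative to the 1994 matrix, that the text mentions.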
A further comparison of the 2004 survey with the 1994 and 1999 surveys is beyond the scope of this paper.

3. Evaluating the Performance of the Survey Instrument and Instructions

To assess the performance of this revised survey instrument and instructions, and to gain approval from the OMB for the administration of a full survey for reference year 2005 and beyond, two distinct evaluation exercises were conducted. In one, 18 establishments were recruited to respond to a pretest survey and their responses were compared to estimates produced by engineers and economists during a visit to the establishment. In another, responses to a much larger pilot survey were compared to historical data, both at the industry level and at the establishment level. Both of these activities resulted in additional revisions and refinements to the survey instrument and instructions. We now describe both of these evaluations in more depth.

3.1. The PACE Pretest Survey

In April 2005, OMB granted the EPA permission to conduct an innovative pretest of the 2004 PACE survey. Large- and medium-sized establishments in some of the most pollution-intensive sectors were asked to volunteer for the pretest, including those engaged in the production of pulp and paper, iron and steel, petroleum, electricity, chemicals, plastics, computers and electronic equipment, fabricated metal, and furniture. In the end, 18 establishments were recruited and given four weeks to respond to the survey. Each facility was then visited by an environmental engineer and an economist from RTI.

The purposes of these visits were multifold. Specifically, respondents were asked to provide feedback on the survey instrument and instructions, including their interpretation of key concepts. They were also asked to discuss the data sources and methodologies used to respond to the survey, including their ability to reliably identify and estimate environment-related costs apart from their total costs.
A walk-through of the facility was also conducted with company representatives, who were interviewed on the pollution abatement equipment and activities at the establishment. This information was subsequently used by RTI to develop independent (engineering) estimates of pollution abatement operating costs and capital expenditures. These estimates were then compared to the costs reported by the establishment on the pretest survey, lending insight into both the reportability of such data and the effectiveness of the survey instrument and instructions. For a definitive review of the findings from these on-site visits, an assessment of the pretest responses vis-a-vis the engineering estimates, and RTI's recommendations for improvements to the survey instrument and instructions, see Gallaher et al. (2006).

3.2. The 2004 PACE Pilot Survey

On July 29, 2005, a mandatory 2004 PACE pilot survey was mailed by the Census Bureau to 2,051 establishments. The primary purpose of this pilot survey was to evaluate whether there were any systematic issues with the survey instrument and/or the ability of establishments to respond; estimates would not be produced. Given this objective, establishments and industries with significant pollution abatement activity were purposely targeted. In particular, nearly 80% of this sample was allotted to 86 six-digit NAICS industries in 5 sectors known to have major pollution abatement expenditures:[2] Paper (NAICS 322), Petroleum (NAICS 324), Chemicals (NAICS 325), Primary metals (NAICS 331), and Electric power generation (NAICS 22111). The pilot sample was allocated to each of these 86 industries roughly in proportion to the number of establishments each had with 20 or more employees, while ensuring that each of these industries received a minimum of 10 survey forms and no more than 60.
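The allocation rule just described (roughly proportional to the number of establishments with 20 or more employees, with a floor of 10 forms and a cap of 60 per industry) can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure actually used: the function name and inputs are hypothetical, and the sketch makes a single pass without rebalancing, so the allocations need not sum exactly to the target once the floor and cap bind.

```python
def allocate_forms(establishments, total_forms, floor=10, cap=60):
    """Allocate survey forms to industries roughly in proportion to size.

    establishments: dict mapping an industry code to its count of
        establishments with 20 or more employees (hypothetical input).
    total_forms: number of forms to spread across these industries.
    floor, cap: minimum and maximum forms per industry (10 and 60 in the text).
    """
    frame_total = sum(establishments.values())
    allocation = {}
    for industry, count in establishments.items():
        proportional = total_forms * count / frame_total
        # Clip the proportional share into the [floor, cap] range.
        allocation[industry] = int(min(cap, max(floor, round(proportional))))
    return allocation
```

With counts of 1000, 100, and 10 establishments and 100 forms to allocate, the largest industry is held to the cap of 60 and the two smaller ones are raised to the floor of 10, which shows how the bounds pull the design away from strict proportionality.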
We also ensured that certain industries (e.g., pulp mills and petroleum refineries) were particularly well-represented, and that other "important" industries would have sufficient sample sizes to permit more robust analysis. Within an industry, larger establishments were sampled with higher probability, or, if the industry was subject to the screener (see below), establishments that claimed the largest expenditures were the first to be sampled.

[2] Three industries within these sectors were exempted for having too few establishments: NAICS 325221, 331311, and 331411.

The remaining 20% (or so) of the pilot's sample was allocated toward 6 sectors with substantial but more moderate pollution abatement expenditures: Mining (NAICS 212), Beverage & tobacco (NAICS 312), Leather (NAICS 316), Plastics and rubber products (NAICS 326), Nonmetallic minerals (NAICS 327), and Furniture (NAICS 337). Each of these sectors received between 60 and 112 survey forms and was sampled similarly to those in the "major" industries.

By early October 2005, 1,217 establishments (59.3%) had responded and were used in the analyses we conducted.[3] Here, we summarize some of our more salient findings.[4]

In terms of total pollution abatement operating costs (PAOC), the item non-response rate was just over 1%. Of those that did respond, the "inconsistency" rate - defined as the share of cases in which the total does not equal the sum of the components, or at least one of the components is missing - was 14.6%. The bulk of such cases can be (and were) remedied through rather straightforward, automated edit routines. The PAOC incidence rate - i.e., the percent of cases with non-zero PAOC - was 87.3%. It is difficult to imagine that any of these establishments in these industries would not have any PAOC. Upon further investigation, we surmised that some were being untruthful. Presumably this will always be (and has always been) the case.
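The two diagnostics defined above can be written down compactly. The sketch below is a minimal illustration, assuming each responding case is represented as a pair of a reported total and a list of reported components, with None marking a missing component; the function names and the rounding tolerance are hypothetical and are not drawn from the actual edit routines, which the text does not describe.

```python
def is_inconsistent(total, components, tol=0.5):
    """Flag a case whose total differs from the sum of its components,
    or that has at least one missing (None) component.

    The tolerance (to absorb rounding) is an assumption.
    """
    if any(c is None for c in components):
        return True
    return abs(total - sum(components)) > tol


def percent(flags):
    """Percent of True values in an iterable of booleans."""
    flags = list(flags)
    return 100.0 * sum(flags) / len(flags)


def inconsistency_rate(records):
    """Percent of responding cases flagged as inconsistent."""
    return percent(is_inconsistent(total, comps) for total, comps in records)


def incidence_rate(records):
    """Percent of responding cases reporting a non-zero total."""
    return percent(total != 0 for total, _ in records)
```

For four illustrative cases, one consistent, one with a missing component, one reporting all zeros, and one whose components do not add up, the inconsistency rate is 50% and the incidence rate is 75%.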
We further examined the issue by producing the incidence rates for the 15 largest PAOC industries in 1994 and their nearest NAICS counterparts in 2004.[5] Drops in incidence rates between 1994 and 2004 appear in some of these industries, most notably among pulp mills.

[3] The eventual response rate would be approximately 71%.

[4] Additional details are mostly contained in the following two mimeos by the authors, which are available upon request: "An Evaluation of the 2004 PACE Pilot Survey" (October 14, 2005) and "An Examination of Linked 1994 & 2004 PACE Establishments" (October 24, 2005).

[5] Together, these industries accounted for 52% of all PAOC in the manufacturing sector in 1994.

We also discovered an issue that particularly affected the electric utilities industry (NAICS 221112). We found that the addresses of some of those sampled, and/or the remarks made by respondents on the form, strongly suggested that auxiliary operations (headquarters, regional offices, etc.) were sampled, rather than facilities actually engaged in power generation. We think this explains most of the establishments in this industry that reported zero PAOC. This, and the precedent (prior to 1999) of excluding utilities and mining from the PACE survey, in part led us to eliminate these sectors from the scope of the PACE survey for 2005 and beyond. As before, the survey now focuses only on manufacturing industries.

Using pilot respondents with minimal or no inconsistencies in their reported PAOC, and with usable reported value of shipments (VS), we computed 2004 PAOC/VS ratios for the same 15 largest PAOC industries and compared these to the corresponding 1994 PAOC/VS ratios based on published aggregates.[6] The 2004 ratios are lower in all industries, sometimes substantially lower. These lower ratios do not appear to be driven by (low) outliers; more often than not, the median plant in each industry has an even lower ratio.
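The comparison between an industry-level PAOC/VS ratio and the ratio for the median plant can be illustrated with a short sketch. The text does not state how the 2004 industry ratios were formed from the pilot microdata; the sketch assumes a ratio of sums, and the function names and sample figures are hypothetical.

```python
from statistics import median


def aggregate_ratio(plants):
    """Industry-level ratio: summed PAOC over summed value of shipments.

    plants: list of (paoc, vs) pairs, one per establishment. Treating the
    industry ratio as a ratio of sums is an assumption of this sketch.
    """
    total_paoc = sum(paoc for paoc, _ in plants)
    total_vs = sum(vs for _, vs in plants)
    return total_paoc / total_vs


def median_plant_ratio(plants):
    """Median of the establishment-level PAOC/VS ratios."""
    return median(paoc / vs for paoc, vs in plants)


# Illustrative figures only: one plant with unusually high abatement costs.
example = [(10.0, 100.0), (2.0, 100.0), (1.0, 100.0)]
industry = aggregate_ratio(example)
typical = median_plant_ratio(example)
```

In this example the median plant's ratio (0.02) lies below the ratio of sums (about 0.043), because a single high-cost plant raises the aggregate. A median that sits below the aggregate is the pattern the text cites as evidence that the low 2004 ratios are not an artifact of a few low outliers.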
One possibility is that these lower ratios reflect actual changes in pollution abatement costs in these industries. Another possibility - and potential concern - is that establishments may be systematically excluding certain classes of expenditures, relative to what they were reporting in 1994. To explore this possibility, we examined distributions, incidence rates, and cost/VS ratios of PAOC by: type (salaries & wages, fuels & electricity, materials & supplies, contract work, etc.); activity (treatment, prevention, disposal, recycling); and medium (air, water, solid waste, multimedia). In terms of PAOC by type, we found that most industries experienced declines in expenditure ratios between 1994 and 2004 across all four types of cost. While there are dramatic cases in each of the four types of cost, declines appear to be largest and most pervasive in materials & supplies and in contract work, leasing, and other purchased services. We also found a particularly sharp decline in spending on water pollution abatement, relative to the other media.

[6] For full comparability, we excluded depreciation costs from the 1994 ratios, but included payments to governments. The 2004 industry-level statistics were based on as few as 5 establishments up to 28, with the median industry having 17.

In terms of total pollution abatement capital investment (PACI), the item non-response rate was under 1%. Of those that did respond, the "inconsistency" rate was 6.3%. As with PAOC, the bulk of such cases can be (and were) remedied through rather straightforward, automated edit routines. Because PACI occurs more irregularly than PAOC, even among heavily regulated establishments, it is more difficult to evaluate the nature and quality of these responses. Nonetheless, we found that pilot respondents had a PACI incidence rate of 54.7%.
Using respondents with minimal or no inconsistencies in their reported PACI, and with usable reported value of shipments (VS), we computed 2004 PACI/VS ratios for the 15 largest PAOC industries and compared these to the corresponding 1994 PACI/VS ratios based on published aggregates.[7] In all but two of these industries, the ratios are less than what they were in 1994, often dramatically so. Perhaps this is plausible. But, as with operating costs, it may be possible that establishments are excluding certain classes of capital expenditures.

We next examined distributions, incidence rates, and investment/VS ratios of PACI by activity and by medium. In terms of PACI by activity, we found a decrease in the proportion of PACI devoted to "end of line" techniques (treatment and disposal). Over this 10-year period, one might have anticipated this relative shift toward prevention-related capital investment. In terms of PACI by medium, we found that most industries experienced declines in expenditure ratios between 1994 and 2004 across all three media. While there are dramatic cases in each of the three media, declines appear to be largest and most pervasive in water and also in solid waste.

[7] Ideally, the denominator would be total establishment capital expenditure, but these numbers were not available. Based on our recommendation, total capital expenditure was added to the PACE survey of 2005 and beyond.

Concerned by the potential compositional differences between the 1994 and 2004 PACE samples - even within an industry (e.g., plant size, geography, product mix, etc.) - we turned to analyzing just establishments that were in both survey years. Because the 1994 and 2004 files have no establishment-level identifiers in common, a number of intermediate steps were necessary to link records longitudinally.
We then restricted our attention to those establishments that had actually responded to both surveys (and the 1994 Annual Survey of Manufactures) and had "usable" data in both years, leaving a sample of 444 establishments.[8]

With our sample of 444 establishments, for each expenditure category, we computed within-establishment "modified" percentage changes in (a) nominal dollar expenditures, (b) real dollar expenditures, and (c) ratio of expenditure to value of shipments.[9] We then examined the mean, median, 25th, and 75th percentiles of these various measures. Our findings are perhaps best summarized by Table 1, which shows the average modified percent change between 1994 and 2004 in various expenditure categories.

[8] Our mimeo from October 24, 2005 contains further details on our treatment and editing of the microdata, adjustments necessary to make the data longitudinally comparable, checks on the quality of the longitudinal match, and potential limitations and caveats regarding our analyses.

[9] "Modified" percent change = (X_2004 - X_1994) / (0.5 * (X_2004 + X_1994)). Real expenditures are calculated using the GDP implicit price deflator, as published in the August 2005 issue of the Survey of Current Business. The price deflator implies a price change of +20.87% between 1994 and 2004.

Table 1: Average Within-Establishment Change Between 1994 and 2004 (N=444)

                                                          Real            Expenditures
                                                          expenditures    to VS ratio
Total PAOC (less depreciation expenses)                   -21%            11%
  Salaries & wages                                        +28%            +35%
  Energy costs                                            +30%            +34%
  Materials & supplies                                    -26%            -19%
  Contract work & services
    (including government services)                       -46%            -38%
Depreciation expenses                                     9%              6%
Air                                                       +14%            +19%
Water (including gov't industrial sewage service)         -49%            -40%
Solid waste (including gov't collection/disposal)         -48%            -35%

We see that this set of establishments reported less total PAOC, lower materials & supplies, and much lower contract work & services in 2004 than in 1994. They also reported less water and solid waste PAOC.
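The "modified" percent change and the conversion to real dollars can be sketched as below. The formula and the +20.87% price change come from the paper's footnote on the modified percent change; the function names, the choice to express real values in 1994 dollars, and the treatment of the case where both years are zero are assumptions of this sketch.

```python
# Price change between 1994 and 2004 implied by the GDP implicit price
# deflator, as stated in the paper's footnote.
PRICE_CHANGE_1994_2004 = 0.2087


def modified_pct_change(x1994, x2004):
    """Modified percent change: the difference divided by the two-year average.

    Returns a value in percent. For non-negative inputs it is bounded
    between -200 and +200. Returning 0.0 when both years are zero is an
    assumption; the paper does not say how such cases were handled.
    """
    average = 0.5 * (x2004 + x1994)
    if average == 0:
        return 0.0
    return 100.0 * (x2004 - x1994) / average


def to_1994_dollars(x2004_nominal):
    """Deflate a nominal 2004 amount to 1994 dollars."""
    return x2004_nominal / (1.0 + PRICE_CHANGE_1994_2004)


def real_modified_pct_change(x1994, x2004_nominal):
    """Modified percent change in real (1994-dollar) expenditures."""
    return modified_pct_change(x1994, to_1994_dollars(x2004_nominal))
```

Dividing by the two-year average rather than the base year keeps the measure finite when an establishment reports zero in 1994, and makes increases and decreases symmetric: a category that drops to zero scores -200%, and one that appears from zero scores +200%. This matters for establishment-level data, where zeros in one of the two years are common.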
Costs that increased over this time period - both in real terms and as a share of total output - include salaries & wages, energy costs, and PAOC devoted to air emissions. These results are not inconsistent with what our earlier analyses had shown, though here we are obviously controlling for various aspects of the composition of the sample.

3.3. Subsequent Revisions and Additions to the Survey Instrument and Instructions

Unless we believe that these declines reflect real changes in cost intensity in these industries, these results may point to particular costs that were not reported the same way in 2004 as they had been in 1994. We were not alone in this assessment. We found the results compelling enough to undertake a thorough review and comparison of the 2004 and 1994 surveys. Our review identified numerous areas where we believe the 2004 pilot survey instrument and instructions are not as clear and explicit as those of the 1994 PACE. This led us to recommend substantial revisions and additions to the survey instrument and instructions that were subsequently incorporated in the 2005 PACE survey. The specific changes that were made are much too numerous to recount here; they can be seen most clearly by comparing the 2004 and 2005 surveys.[10] We will try to highlight some of the main changes.

In terms of the survey form itself, we added various cues regarding costs that should be included (and that may have been under-reported in the pilot). For example, bullets highlighting the need to report certain "incremental costs" were prominently added to the form, which also includes references to where in the instruction booklet one can find relevant definitions and examples. We felt that this would help improve the reporting of several items, especially PACI, materials & supplies, and energy costs.
We also added to the survey form some brief, additional detail on the types of items to be included in certain categories, most notably materials & supplies and contract work & services, both of which may have been under-reported in the pilot, and which may also directly explain the drop in spending on water and solid waste abatement. Among other changes, we made a point to add specific references throughout the form to indicate exactly where in the instruction booklet one can find relevant definitions, instructions, and examples. We felt that this would help respondents with particularly difficult concepts, such as "primary purpose" and "incremental costs", but also with identifying different types of pollution abatement costs, pollution abatement activities, and pollution media.

[10] A series of lengthy mimeos from October and November 2005 by one or both authors, addressed to the other experts on RTI's panel, speaks to many of our recommendations. Together with Cynthia Morgan of the EPA, the authors developed and incorporated further improvements to the survey instrument and instructions during the final round of revisions in early 2006.

In parallel, the survey instruction booklet was substantially revised and reorganized. We felt that a more logical, linear layout to the instructions, together with a more extensive table of contents, would make it much easier for respondents to navigate and find the information they need. And, as with the survey form, specific references were carefully added throughout this document to indicate exactly where one can find further relevant information, including related definitions and additional examples. Perhaps most critically, instructions were considerably expanded throughout.
In particular, explicit lists of items that ought to (and ought not to) be reported in costs were developed and added to the instructions, including those for capital expenditure by activity category (treatment, prevention, recycling, disposal), labor costs, energy costs, materials & supplies, and contract work & services. In addition, examples illustrating how to report particularly difficult-to-report costs were developed and judiciously added to the instructions. These include examples of reporting air pollution control devices, quantities of wastewater, quantities of solid waste, incremental PACI, PACI by medium, incremental labor costs, incremental materials & supplies costs, labor costs, incremental fuel costs, and estimated energy costs. In addition, several critical definitions and concepts were elaborated upon and refined, to more accurately reflect exactly what we hope to measure. Some of the existing illustrative examples were also refined.

The intent of these cumulative changes to the survey form and instructions is to prevent under-reporting - and misreporting, more generally - due to unclear, less-than-explicit, and less-than-complete instructions, which may have been an issue in the 2004 pilot survey, particularly in some of the noted areas.

In addition to the changes just summarized, our analyses of data from the pilot survey led to a number of other changes. In terms of depreciation expenses, whose inclusion in the PACE survey was conditional, we found that the level of item non-response was not surprising given the item's placement toward the end of the form and that the incidence rate was comparable to those of the 4 other types of operating costs. Furthermore, we found that depreciation expenses for the 15 largest PAOC industries were nontrivial, ranging from 14% to 42% of (recomputed) total PAOC, and not entirely dissimilar to what they had been in 1994.
Moreover, the implicit depreciation rates (i.e., depreciation expenses as a percent of the book value of pollution abatement capital) seemed entirely plausible (with perhaps a couple of exceptions), ranging from 0.7% to 7.8%, with most industries in the 4-5% range. We felt confident enough about the reportability of this item to recommend that it be retained and, critically, included with the other 4 types of operating costs to comprise total PAOC, which is consistent with the historical definition of PAOC. Similarly, we found that respondents appeared to be better able to respond to book value of pollution abatement capital than might have been expected initially - a fact confirmed by RTI's site visits. Subsequently, this item was retained (in a reworded form) and moved to the PACI section of the survey.

Meanwhile, we found the incidence rate of costs related to product redesign/reformulation to be very low and that the incidence does not appear to be widespread across industries, with relatively few of the industries in the sample having at least one establishment with such costs. By far, the major industry with the highest incidence rate was petroleum refineries, no doubt related to their production of reformulated gasoline. For the mean and median establishments in most industries, product redesign costs relative to PAOC [PACI] were fairly trivial, but there were instances where these costs were substantial - most notably, again, in petroleum refining. Given the extremely low and concentrated incidence of these expenditures, an argument could be made for the removal of this item. However, its removal may bias the reporting of "traditional" PAOC and PACI in certain important industries, as establishments may look for an outlet to report these large costs. It was decided that this item would be retained on the PACE survey, even though it will not be tabulated and published.

Our analyses did lead to other items being removed from the survey.
In particular, we found the incidence of the number and value of tradable SO2 and NOx permits, at under 2%, much too low to justify their continued collection.11 We also found that the questions asking the percentage of total PAOC [PACI] devoted toward hazardous pollutants yielded responses that strongly suggested that many establishments did not use the definition of hazardous pollutants that was provided. In particular, 0%, 1%, 2%, 3%, and 5% were common responses, but so were 100% and 50%. It was felt that these items were not critical enough to retain or to justify modifications aimed at improving responses. The multimedia category was also removed as a type of media. Besides being ambiguous (almost by definition), the site visits revealed that establishments could usually apportion costs to the 3 media.
Changes were also made to the Facility Information section of the survey. In particular, we saw no reason for the continued collection of production capacity and actual production in units. The responses seemed usable only in certain industries and even then the information was not necessarily easy to use. Instead, we believe total value of shipments, as defined on the ASM, to be sufficient for the purposes of the PACE survey. Similarly, we argued for limiting questions on establishment employment to just total employment, as defined on the ASM. And we added total capital expenditures to the form, to assist with the editing and imputation of PACI. Finally, we had these ASM-type questions moved to before the form skip, so that this information is collected from all establishments, including those that rightly or wrongly believe they have no environmental expenditures to report.
11 Meanwhile, the more traditional environmental permits & fees item had an incidence rate of 74% and revealed costs that were fairly significant (relative to PAOC). This permits & fees item was retained on the PACE survey.
In fulfillment of the terms of clearance, the analyses from the pretest and pilot surveys, together with a revised survey instrument and instruction booklet, were submitted to OMB in late November 2005. In early December 2005, the authors - together with others from EPA - provided an oral presentation of these materials to OMB, highlighting the ways in which the issues with the pilot survey instrument had been addressed. Satisfied, OMB offered their approval to conduct a full PACE survey for reference years 2005-2007. The 2005 PACE survey was mailed out in April 2006.
4. Making Effective Use of the Sample
Given the resources available, the sample size for the 2005 PACE survey had to be limited to approximately 20,400 of the over 350,000 manufacturing establishments in the United States. Decisions had to be made on how to best allocate this sample within and across the 473 six-digit NAICS manufacturing industries theoretically in scope to the survey and whether there were any "sample saving" measures that could be implemented with relatively little sacrifice.
Within an industry, larger establishments - or, more accurately, establishments suspected to have higher environmental expenditures - would be sampled more heavily, as is typical in such surveys. The screener survey (described below) aided in this effort, in industries that were in scope to the screener. Beyond that, it was decided that the approximately 242,000 manufacturing establishments with fewer than 20 employees would be exempt from sampling. With the exception of the surveys of 1973-1976 and 1999, this group has traditionally been excluded from PACE sampling, and since 1980 (with the exception of 1999) estimates have not accounted for this particular group. Becker and Shadbegian (2005) estimate that such establishments accounted for 3.0% of the environmental expenditure in the entire manufacturing sector in 1999.
We were also willing to sacrifice the ideal of producing expenditure estimates for each of the 473 six-digit NAICS manufacturing industries, in order to achieve higher quality estimates in the industries that remain.12 We decided that establishments and industries in NAICS 315 (Apparel manufacturing) would be out-of-scope to the 2005 PACE. With the exception of the 1999 survey, this industry subsector has traditionally been excluded from the PACE survey because of relatively negligible environmental expenditure. This fact is confirmed by the 1999 data. This removed from sampling 24 six-digit NAICS industries with over 13,000 establishments, approximately 3,300 of which had more than 20 employees.
12 As we already noted above in Section 3.2, we also decided not to include any non-manufacturing industries, such as those engaged in mining (NAICS 21) and electric power generation (NAICS 22111). These had been included for the very first time in the 1999 PACE survey, and they were also included in the 2004 pilot survey.
Rather than eliminate any additional industries, we instead sought opportunities to curtail industry detail, from the six-digit NAICS level up to the five-digit, four-digit, or three-digit NAICS level. We did so using three main guiding principles:
(1) An affected six-digit NAICS industry should have relatively small levels of environmental expenditures.
(2) An affected six-digit NAICS industry should have relatively low "intensity" of environmental expenditures, as measured by dollars of environmental expenditures per dollar of total value of shipments. Industries with intense expenditures are of interest to researchers, even if their aggregate expenditures are relatively low.
(3) The affected six-digit NAICS industries within the five-digit [four-digit, three-digit] NAICS should all be relatively homogeneous in terms of their intensity of environmental expenditures, in addition to having relatively low expenditure intensities.
If a three-digit, four-digit, or five-digit NAICS can satisfy these three conditions, it can be argued that not much information is lost by sacrificing the underlying six-digit NAICS detail.
Our analysis began with 1994 pollution abatement operating costs (PAOC) and 1994 value of shipments (VS) for each of the 448 four-digit SIC industries in scope to the 1994 PACE. We then used these data in conjunction with the 1997 SIC-NAICS bridge file (with VS-based weights) to convert the 1994 data to the NAICS basis.13 Table 2 shows the approximate distributions of PAOC intensity and PAOC (in millions of dollars) across six-digit NAICS industries.
Table 2: Distribution Across Six-Digit NAICS Industries (N=427)
           Min.   25%      33%      Med.     66%      75%      Max.
PAOC/VS    0      0.0016   0.0020   0.0028   0.0045   0.0056   0.0502
PAOC       0      4.5      6.6      12.2     19.6     28.9     2,842.3
13 Additional details regarding the analyses discussed in this section are contained in the following mimeo by the authors, which is available upon request: "Proposal for Tabulation, Industrial Stratification, and Industrial Prioritization in the 2005 PACE Survey" (December 14, 2005).
We first examined whether there are any three-digit NAICS industries in which four-, five-, and six-digit NAICS detail could potentially be sacrificed. We found that, while there are certainly three-digit NAICS industries with relatively low levels of PAOC and relatively low PAOC intensity, each of these had at least one above-median six-digit NAICS industry, in terms of its PAOC intensity. We therefore decided against "rolling back" any industrial detail to the three-digit NAICS level.
Next we examined whether there are any four-digit NAICS industries in which five- and six-digit NAICS detail could potentially be sacrificed.
Honoring our guiding principles from above, we identified four-digit NAICS industries which:
(1) had less than $52 million of PAOC [i.e., the bottom third of the distribution for four-digit NAICS industries],
(2) had a PAOC intensity of less than 0.0020 [the bottom third of the distribution for six-digit NAICS industries, as seen in the table above], and
(3) had no six-digit NAICS industries in the top two-thirds of the six-digit NAICS PAOC intensity distribution [i.e., all the component six-digit NAICS industries had a PAOC intensity of less than 0.0020].
There were 8 four-digit NAICS industries satisfying these conditions, and with multiple six-digit NAICS industries that can be sacrificed.14
Finally we examined whether there are any five-digit NAICS industries in which six-digit NAICS detail could potentially be sacrificed. The exercise conducted is similar to the one just described except that a cutoff of $22.5 million was used in (1) - i.e., the bottom third of the distribution for five-digit NAICS industries. There were 6 five-digit NAICS industries satisfying the three conditions, and with multiple six-digit NAICS industries that can be sacrificed.15
14 The industries are NAICS 3141, 3169, 3332, 3335, 3341, 3342, 3353, and 3379.
15 The industries are NAICS 31182, 31491, 33391, 33392, 33592, and 33993.
The net result of these rollbacks of industrial detail is a total of 412 industrial categories in the 2005 PACE publication. This represents a significant reduction in the published industrial detail relative to the 1999 PACE survey's 506 industries, while maintaining a roughly similar sample size to that survey. It also represents a reduction relative to the 1994 PACE survey's 428 industries, which also had a sample size that was 13% smaller than the 2005 PACE. We believe this reduction in industrial detail could occur without much sacrifice in the richness of the data and will result in better estimates.
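The three-part rollback screen described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually run for the survey: the `SixDigit` record and the `can_roll_up` function are hypothetical names, the parent's PAOC and intensity are assumed to be computed by summing over its six-digit components, and only the cutoffs ($52 million or $22.5 million, and 0.0020) come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SixDigit:
    """One six-digit NAICS industry (hypothetical record layout)."""
    naics: str
    paoc: float  # 1994 PAOC, millions of dollars
    vs: float    # 1994 value of shipments, millions of dollars


def can_roll_up(children: List[SixDigit], paoc_cutoff: float,
                intensity_cutoff: float = 0.0020) -> bool:
    """Return True if a parent industry passes all three screens."""
    total_paoc = sum(c.paoc for c in children)
    total_vs = sum(c.vs for c in children)
    # A rollback only saves detail if there are multiple components.
    if len(children) < 2 or total_vs <= 0:
        return False
    low_level = total_paoc < paoc_cutoff                      # condition (1)
    low_intensity = total_paoc / total_vs < intensity_cutoff  # condition (2)
    homogeneous = all(                                        # condition (3)
        c.vs > 0 and c.paoc / c.vs < intensity_cutoff for c in children
    )
    return low_level and low_intensity and homogeneous
```

For a four-digit parent one would call `can_roll_up(children, paoc_cutoff=52.0)`, and for a five-digit parent `can_roll_up(children, paoc_cutoff=22.5)`.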
Beyond this, we support the notion that the environmental expenditures of some manufacturing industries are of greater interest to policymakers and researchers than those of other industries. We therefore chose to prioritize industries - into high, medium, and low importance - and devote relatively more of the sample to industries of greater interest in order to achieve better estimates (i.e., relatively lower expected standard errors).
To prioritize industries, we employed the same two measures as we did above: PAOC and PAOC intensity. We began by mapping our 412 industries into the bivariate distribution of PAOC and PAOC intensity, by tertile. Table 3 shows the count of industries in each cell.
Table 3: Count of Industries (N=412)
                                        PAOC
PAOC intensity      a - top third   b - mid third   c - bottom third
A - top third       78              44              16
B - mid third       37              54              46
C - bottom third    23              39              75
It is rather easy to classify the 78 industries in Aa as High priority. Likewise, the 75 industries in Cc are easily classified as Low priority. Beyond that, designations of High and Low are somewhat more difficult - or at least more subjective. We personally see value in having better estimates (and relatively more observations) in industries that are in the top tier of PAOC intensity, and therefore designated the 44 industries in Ab as High priority. We did, however, reclassify as Medium priority the 4 industries in Ab that are "residual" industries - i.e., six-digit NAICS industries ending in a 9. Beyond that, we recognized a need for better estimates in industries in Ba and Ca because of their relative importance in the manufacturing-wide aggregate. However, we did not wish to include all 60 (37+23) of these industries in the High priority group. Instead, we classified as High priority just the 11 industries in Ba and Ca with more than $77 million in PAOC, which is roughly the top decile of PAOC in these 412 industries.
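As a rough illustration, the High priority designation just described can be written as a small rule. The function name `is_high_priority` and its arguments are hypothetical; the cell labels follow Table 3 (upper-case letter for the PAOC intensity tertile, lower-case letter for the PAOC tertile), and the $77 million threshold and the treatment of residual industries are taken from the text.

```python
def is_high_priority(cell: str, paoc_millions: float, naics: str) -> bool:
    """Apply the High priority rule to one industry.

    cell          -- Table 3 cell, e.g. "Ab"
    paoc_millions -- the industry's PAOC in millions of dollars
    naics         -- the industry's NAICS code as a string
    """
    if cell == "Aa":
        return True
    if cell == "Ab":
        # Residual industries (codes ending in 9) were moved to Medium.
        return not naics.endswith("9")
    if cell in ("Ba", "Ca"):
        # Only the largest PAOC industries in these cells were kept as High.
        return paoc_millions > 77.0
    return False
```

The NAICS codes one would pass in are whatever codes identify the 412 published categories; any specific codes used to try the function out are placeholders.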
This yielded 129 High priority industries (78+44-4+11), which we find accounted for about 80% of manufacturing-wide PAOC in 1994.
In terms of additional Low priority industries (beyond the 75 in Cc), we again see relatively more value in having better estimates in industries that are relatively more PAOC intensive. We therefore recommended relatively less allocation toward the 39 industries in Cb, for a total of 114 (75+39) Low priority industries. These industries accounted for less than 4% of manufacturing-wide PAOC in 1994. In turn, the remaining 169 Medium priority industries accounted for 16% of manufacturing-wide PAOC.
5. Innovations in Sampling
In recognition of the fact that pollution abatement expenditures are typically unevenly distributed across industries and oftentimes across establishments within industries (e.g., relative to production), innovations in sampling were also introduced into the PACE survey. In particular, the measure of size (MOS) used in PPS (probability proportional to size) sampling and weighting was allowed to vary by industry. And in industries with no satisfactory MOS and/or with low expected incidence of PACE expenditures, a screener was sent to establishments, in order to better target subsequent sampling. We now describe these efforts in more depth.
5.1. Industry-specific Measure of Size (MOS) for Sampling and Weighting
A challenge in drawing a sample for the PACE survey is that pollution abatement expenditures are not necessarily well correlated with total value of shipments (VS) - a measure of size (MOS) that is typically used in sampling and weighting in surveys such as this. We were asked by the survey's statisticians to explore this matter. Using establishment-level PAOC from the 1992 PACE, combined with data from the 1992 Census of Manufactures, our preliminary research showed that the correlation between PAOC and VS is just 0.4453.
Meanwhile, PAOC exhibits higher correlations with the cost of fuels (0.6789), machinery assets (0.6507), and cost of materials (0.4883). This high correlation with cost of fuels (CF) would make some sense, since fuel combustion is a highly polluting activity. Machinery assets (MA), meanwhile, is not only highly correlated with PAOC but, not surprisingly, also with CF. Regression analysis shows that CF and MA play independent roles in determining PAOC.
We also discovered that the best correlate with PAOC varied by industry. This led us to search for an industry-specific MOS to be used in sampling and weighting, for the 412 industries in scope to the survey.16 After initially considering a dozen different production-related variables from the Census of Manufactures, we decided to limit our focus to just three possible MOS: VS, CF, and cost of materials (CM).17
We began by linking data from 1992 PACE respondents to data they reported in the 1992 Census of Manufactures. This match yielded 13,567 establishments. Next, we computed the pairwise correlation statistics between PAOC and the three possible MOS, by 4-digit SIC industry. To reduce the influence of potential outliers, we did two things. First, we removed from our calculations the top two and bottom two observations within each industry in terms of the ratio of PAOC to the respective variable of interest (i.e., VS, CF, or CM). Second, we removed from our calculations the top observation within each industry in terms of PAOC as well as the top observation within each industry in terms of the variable of interest (which may in fact be the same observation and may have already been eliminated by the prior ratio restriction).
Because the PACE survey is now collected on a NAICS basis, we needed to convert the above correlation statistics from an SIC basis to NAICS.
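The two outlier restrictions and the correlation calculation for a single industry and a single candidate MOS can be sketched as follows. This is a minimal sketch, not the code used for the analysis: the function names are hypothetical, the candidate MOS values are assumed to be strictly positive so that the ratio is defined, and returning `None` when too few observations remain is our own convention.

```python
import math
from typing import List, Optional


def pearson(xs: List[float], ys: List[float]) -> Optional[float]:
    """Plain Pearson correlation; None if either series has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def trimmed_correlation(paoc: List[float], size: List[float]) -> Optional[float]:
    """Correlation of PAOC with a candidate MOS after dropping outliers.

    Assumes every value in `size` is strictly positive.
    """
    idx = list(range(len(paoc)))
    if not idx:
        return None
    # Restriction 1: drop the top two and bottom two PAOC/size ratios.
    by_ratio = sorted(idx, key=lambda i: paoc[i] / size[i])
    drop = set(by_ratio[:2]) | set(by_ratio[-2:])
    # Restriction 2: drop the largest PAOC and the largest size observation
    # (these may coincide, or may already have been dropped above).
    drop.add(max(idx, key=lambda i: paoc[i]))
    drop.add(max(idx, key=lambda i: size[i]))
    keep = [i for i in idx if i not in drop]
    if len(keep) < 3:
        return None
    return pearson([size[i] for i in keep], [paoc[i] for i in keep])
```

Calling `trimmed_correlation` once per industry for each of VS, CF, and CM would yield the three statistics that are compared when choosing a MOS.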
We did so using the SIC-NAICS concordance with weights based on 1997 value of shipments.18 We made the appropriate adjustments to the concordance and the respective weights so that the conversion yielded the 412 industry groupings discussed above in Section 4 (as opposed to all 449 in scope six-digit NAICS manufacturing industries). The resulting NAICS-based correlation statistics are an appropriate weighted average of the SIC-based statistics.
16 Additional details regarding the methodology and findings discussed in this section are contained in the following mimeo by the authors, which is available upon request: "An Industry-Specific Measure of Size for PACE Sampling and Weighting" (January 6, 2006).
17 We chose not to consider MA, as we had in preliminary research, because it is generally considered to be among the more poorly measured and edited variables in the Census of Manufactures. Also, this variable is somewhat unusual in that it is an accumulation of various investments measured in current (rather than constant) dollars. Therefore, two establishments with identical MA need not be of comparable size; one could in fact be considerably smaller but its capital investments occurred in more recent years. That this variable captures - to a certain extent - both size and/or vintage may perhaps explain why it is sometimes well correlated with PAOC, since environmental regulation is often targeted toward larger establishments and toward recent capital improvements (i.e., older capital equipment and establishments are often exempt from regulations). While intriguing, we felt that this relationship between PAOC and MA needs to be better researched before adopting it as a MOS, particularly for the purposes of weighting.
18 See http://www.census.gov/epcd/ec97brdg/INDXNAI3.HTM#31-33.
We note that some of our 412 industries have no correlation statistics because the relevant SIC(s) were out-of-scope to the 1992 PACE survey.
Likewise, correlation statistics for some of our 412 industries are based on just the portion of the industry that came from in scope SICs.
We then used the following criteria to assign a MOS to an industry. We note this set of criteria is somewhat "conservative" in that we use VS as the default MOS, unless there is compelling evidence to use an alternative measure. VS is, after all, what would be used for all industries in the absence of this exercise.
First, if an industry's correlation statistic is based on fewer than 10 plants (fewer than 4-6 plants, once outliers have been removed) we assigned VS as the MOS, regardless of which measure actually has the highest correlation with PAOC. We simply had no confidence in choosing an alternative other than VS based on so few observations.
Second, if an industry's best correlate is VS, we assigned VS as its MOS - this is uncontroversial. We will note however that VS is not necessarily well correlated with PAOC in all these cases. It is simply the best correlated of the three alternatives.
Third, if an industry's correlation between VS and PAOC is at least 0.7, we assigned VS as its MOS, regardless of which measure actually has the highest correlation with PAOC. We deemed 0.7 a fairly strong correlation, so that we saw no particular need to adopt one of the other two alternatives, especially since they are typically less well measured and less well edited than is VS. Adopting either of the other two variables would introduce some measurement error in these industries that (arguably) is not worth the apparent improvement in correlation.
Fourth, if an industry's correlation between VS and PAOC is less than 0.7 and one of the alternate measures (CF or CM) has a better correlation, we assigned the alternate MOS, but only if the industry's statistics are based on at least 15 plants (9-11, once outliers have been removed) and the improvement in the correlation when compared to that of VS is at least 0.1 points.
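The four criteria above amount to a short decision rule, sketched here under stated assumptions. The function name `assign_mos` and its inputs are hypothetical; when both CF and CM beat VS, the sketch assumes the better correlated of the two is the candidate; and any case the four criteria leave open (for example, an industry with 10 to 14 plants) simply falls back to VS in this simplified version.

```python
from typing import Dict


def assign_mos(n_plants: int, corr: Dict[str, float]) -> str:
    """Assign a measure of size to one industry.

    n_plants -- plants behind the correlation statistics (before trimming)
    corr     -- correlation of PAOC with each of "VS", "CF", "CM"
    """
    vs = corr["VS"]
    # Ties go to VS because it is listed first.
    best = max(("VS", "CF", "CM"), key=lambda m: corr[m])
    if n_plants < 10:   # first criterion: too few plants
        return "VS"
    if best == "VS":    # second criterion: VS is the best correlate
        return "VS"
    if vs >= 0.7:       # third criterion: VS is already strong
        return "VS"
    # Fourth criterion: enough plants and a gain of at least 0.1.
    if n_plants >= 15 and corr[best] - vs >= 0.1:
        return best
    return "VS"         # conservative default
```

Keeping VS as the final fallback mirrors the conservative stance taken in the text: an alternative is adopted only when the evidence for it is compelling.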
Given that CF and CM are typically less well measured and less well edited than is VS, we saw no particular need to adopt one of these alternative measures if the apparent improvement in correlation would be rather modest. Adopting either of the other two variables would introduce some measurement error in these industries that (arguably) is not worth the apparent benefit.19
Finally, we also adopted the alternate MOS when the improvement in the correlation is more modest (i.e., less than 0.1 points) if adopting that measure would result in a "relatively significant" improvement in the rank correlation (over VS).20 While the MOS's pairwise correlation is an important factor in both sampling and weighting, an improvement in the rank correlation would seem to be particularly beneficial during sampling (e.g., identifying cases with potentially large PAOC), particularly in industries that were not in scope to the screener.
19 For industries with 10-14 establishments, we chose a more conservative criterion: If an industry's correlation between VS and PAOC is less than 0.5 and one of the alternate measures (CF or CM) has a better correlation, we assigned the alternate MOS, but only if the improvement in the correlation when compared to that of VS is at least 0.3 points. The argument is the same as above, except that we demanded a larger change from a lower point in order to feel comfortable assigning either CF or CM as the MOS when so few observations are present.
20 No outliers were dropped in the computation of rank correlation statistics.
The net result of these criteria is that VS was assigned as the MOS in the vast majority of the 412 industries (n=331), followed by CF (n=56) and CM (n=25). The median industry here has a MOS that is correlated 0.698 with PAOC, with the 25th percentile at 0.561 and the 10th percentile at 0.446.
If we had simply chosen VS as the MOS for all industries, the median industry would have VS correlated 0.630 with PAOC, with the 25th percentile at 0.493 and the 10th percentile at 0.327. The improvements come from the 81 industries in which CF or CM was chosen as the MOS.
It is probably well worthwhile to update this analysis and our choice of industry-specific MOS once data from the new PACE survey are available.
5.2. PACE Screener
It was also decided that a PACE "screener" survey would be helpful in sampling. This short survey, mailed months prior to the anticipated mailout of the full 2005 PACE survey, collected some coarse information about an establishment's PAOC in 2004 and its anticipated PACI in 2005. In particular, each establishment was asked to check whether those two expenditures were in the range of $1,000-$25,000, $25,000-$100,000, or over $100,000. This information would then be used to stratify establishments within an industry into expenditure groups (in addition to non-respondents to the screener and non-mailed cases), with the intent of sampling those with more expenditures more heavily, in order to produce higher quality estimates. The screener also gathered information on the person at the establishment to be contacted regarding PACE, which presumably would reduce the response time on the full survey, if the establishment is sampled.
Two types of industries would be targeted by the screener: industries with low PAOC incidence rates and industries without a good MOS.21 (Some industries may fall into both groups.) To identify those in the first group, we used the 14,621 respondents to the 1994 PACE and computed the proportion of an industry's establishments that had non-zero PAOC, for each of the 436 four-digit SIC industries found in this sample. The median industry had an incidence rate of 87.5%. 127 industries (29%) had an incidence rate of 100%.
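The incidence-rate calculation just described is simple enough to sketch directly. The function name and the record layout (one `(industry, paoc)` pair per respondent) are assumptions made for illustration; the definition of incidence as the share of an industry's establishments with non-zero PAOC is from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def incidence_rates(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Share of establishments with non-zero PAOC, by industry.

    records -- (industry_code, paoc) pairs, one per survey respondent
    """
    total = defaultdict(int)
    nonzero = defaultdict(int)
    for industry, paoc in records:
        total[industry] += 1
        if paoc != 0:
            nonzero[industry] += 1
    return {industry: nonzero[industry] / total[industry] for industry in total}
```

Summary figures such as the median rate can then be obtained by applying `statistics.median` to the values of the returned dictionary.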
Because the PACE survey is now collected on a NAICS basis, we needed to convert the above incidence rates from an SIC basis to NAICS. We did so using the SIC-NAICS concordance with weights based on the number of establishments in 1997 classified in SIC-NAICS pairs.22 We made the appropriate adjustments to account for NAICS industries that are based, in part, on non-manufacturing SICs that were out-of-scope to the 1994 PACE survey. There were also some NAICS industries that are based entirely on non-manufacturing SICs that were out-of-scope to the 1994 PACE survey (e.g., electric utilities, mining, retail bakeries, etc.). Because we know nothing about their PAOC incidence rates, we chose to add all of them to the list of industries to be screened.
Of the six-digit NAICS industries in scope to the PACE survey, 103 had an incidence rate of 100%. The median again was 87.5%. While somewhat arbitrary, we flagged for screening those six-digit NAICS industries with an incidence rate of under 75%. Together with those industries for which we do not have incidence rates, this yielded 150 industries subject to the screener.
To this we added industries without a good MOS, and in particular, those industries in which the chosen MOS (see above) had a correlation with PAOC of less than 0.6, or in which the correlation statistic was based on relatively few observations. This criterion yielded 142 six-digit NAICS industries, of which 44 were previously flagged. Therefore, 248 six-digit NAICS industries were identified to receive the screener. These 248 industries have approximately 70,000 establishments with 20 or more employees.
21 Additional details regarding the methodology discussed in this section are contained in the following mimeo by the authors, which is available upon request: "NAICS Industries to Include in (and Exempt from) PACE Survey Screener" (March 9, 2005).
22 See http://www.census.gov/epcd/ec97brdg/INDXNAI3.HTM#31-33.
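The two screening criteria above, and the count of 248 industries, can be illustrated with a short sketch. The function name `needs_screener` is hypothetical, and because the text does not say how many observations count as "relatively few", that judgment is passed in as a caller-supplied flag rather than guessed at. Only the 75% and 0.6 thresholds and the three counts come from the text.

```python
from typing import Optional


def needs_screener(incidence: Optional[float], mos_corr: float,
                   few_obs: bool = False) -> bool:
    """Flag one six-digit NAICS industry for the screener.

    incidence -- PAOC incidence rate (0-1), or None if unknown
    mos_corr  -- correlation of the chosen MOS with PAOC
    few_obs   -- True if that correlation rests on relatively few observations
    """
    low_incidence = incidence is None or incidence < 0.75
    weak_mos = few_obs or mos_corr < 0.6
    return low_incidence or weak_mos


# Inclusion-exclusion check on the counts reported in the text.
flagged_incidence, flagged_mos, overlap = 150, 142, 44
total_screened = flagged_incidence + flagged_mos - overlap
```

Subtracting the 44 industries flagged under both criteria avoids double counting, which is how the two lists of 150 and 142 combine to 248.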
Within each of these industries, these establishments were ranked by their MOS, and all the largest establishments were sampled until 80% of the industry's MOS was covered. A random 1-in-10 sample was then taken of the remaining (smallest) establishments in the industry. All told, 29,064 establishments were mailed the screener in May 2005, and in July the screener was sent again to some of the most critical non-respondents. By September, an unweighted response rate of 69.4% had been achieved.23
23 Weighted response rates by industry as well as an analysis of screener responses are contained in the following mimeo by Stacey Cole: "Analysis of the PACE Screener" (September 28, 2005).
6. The Development and Role of Subject Matter Experts and Experienced Data Users
Finally, we wish to note the rather innovative use - throughout the redevelopment of the PACE survey - of outside experts with both subject matter expertise and extensive experience with historical PACE data.
One area where this expertise is valued is in helping develop editing and imputation methodology for the newly developed survey. Because the structure, content, and processing of the 2005 PACE survey are so very different from previous PACE surveys, editing and imputation routines must be developed from scratch. And because of complexities inherent in the environmental expenditures of businesses, the PACE survey poses unique challenges in both these areas - so much so that typical editing and imputation schemes may not always be appropriate.
For example, as noted in the previous section, environmental expenditures by U.S. manufacturers are sometimes only relatively weakly related to the size and industry of an establishment.
Instead, environmental expenditures are more closely linked to the degree of environmental regulation faced by the establishment, which in turn is a complex function of a plant's industry, size, pollution profile, location, vintage, specific technologies, fuel usage, specific input usage, investment patterns, political & economic influence, and so forth. So, for example, a rather large plant may have relatively small environmental expenditures if it is "grandfathered" from various environmental regulations (because of the vintage of its installed equipment) and/or is located in a relatively lax state or locale (perhaps because it is sparsely populated and/or relatively unpolluted). A deep understanding of environmental regulation, whom it affects, how it has been changing, and how it impacts their PACE-related costs can be tremendously helpful, not only in the development and evaluation of the survey instrument and instructions, as described in Sections 2 and 3, but also in developing editing and imputation specifications.
A great deal of this knowledge - regarding the nature of environmental expenditures - has been openly fostered over the past 25 years by the U.S. Census Bureau through its Center for Economic Studies (CES) and its Research Data Center (RDC) program, whereby confidential, historical, longitudinally-linked establishment-level microdata (from the Census of Manufactures, ASM, PACE, and a whole host of other business surveys) are made available to qualified social scientists at one of a number of secure facilities (currently 9) located across the United States. With these data, these research associates - mainly academic economists - produce research destined for academic journals and books. The Census Bureau's primary purpose in encouraging such research is to better understand the quality of its data through their intensive and extensive use in investigating real world phenomena.
What these researchers discover in the course of their research with the establishment-level microdata may suggest better methodologies for producing the published aggregate estimates. Another obvious byproduct is a network of past and present research associates with rare and often extensive knowledge of Census Bureau survey microdata from their years of research experience, including important knowledge of historical aspects of these data. Furthermore, they possess a special understanding of how the data relate (or should relate) to specific economic phenomena being measured as well as the data's place in the larger economic context.
It is this expertise that has been tapped into, by both the Census Bureau and EPA, for the development of the survey instrument, development of editing & imputation methodology, specification of tables to be published, and so forth.24 This has occurred through workshops and through continuing consultation with these experts throughout the redevelopment process. We consider this a perfect example of the use of the intellectual capital that the Census Bureau has purposefully cultivated over the years through its Center for Economic Studies and RDC network.
24 We will also note that both authors were RDC-based research associates, well before our current affiliations with the Census Bureau's CES and the EPA, respectively.
References
Becker, Randy A. and Ronald J. Shadbegian. "A Change of PACE: Comparing the 1994 and 1999 Pollution Abatement Costs and Expenditures Surveys," Journal of Economic and Social Measurement, 30(1), 63-95, 2005.
Burtraw, Dallas, Alan Krupnick, Richard Morgenstern, William Pizer, and Jhih-Shyang Shih. "Workshop Report: Pollution Abatement Costs and Expenditures (PACE) Survey Design for 2000 and Beyond," Resources for the Future Discussion Paper, 01-09, March 2001.
Gallaher, Michael P., Brian C. Murray, Rebecca L. Nicholson, and Martin T. Ross. Redesign of the Pollution Abatement Costs and Expenditures (PACE) Survey: Findings and Recommendations from the Pretest and Follow-up Visits. Research Triangle Park, NC: RTI International, December 2006.
Iovanna, Rich, Kelly Maguire, and Al McGartland. "The Pace of the PACE at the U.S. Environmental Protection Agency," Association of Environmental and Resource Economists Newsletter, 23(2), 21-24, November 2003.
Ross, Martin T., Michael P. Gallaher, Brian C. Murray, Wanda W. Throneburg, and Arik Levinson. PACE Survey: Background, Applications, and Data Quality Issues (Draft Report). Research Triangle Park, NC: RTI International, July 2004.