Office of Research and Development
Center for Public Health and Environmental Assessment
Public Health and Environmental Systems Division

United States Environmental Protection Agency
EPA/600/R-20/367
December 2020
www.epa.gov/ord

Environmental Quality Index 2006-2010
Technical Report

Public Health and Environmental Systems Division
Epidemiology Branch
Chapel Hill, NC

Acknowledgments

Project Personnel
Danelle T. Lobdell, United States Environmental Protection Agency (EPA), Office of Research and Development (ORD), Center for Public Health and Environmental Assessment (CPHEA)
Kristen M. Rappazzo, EPA, ORD, CPHEA
Stephanie DeFlorio-Barker, EPA, ORD, CPHEA
Alison K. Krajewski, Oak Ridge Institute for Science and Education (ORISE) Postdoctoral Grantee
Lynne C. Messer, Oregon Health and Science University-Portland State University School of Public Health, Support Contractor
Jyotsna S. Jagai, University of Illinois at Chicago, Support Contractor
Christine L. Gray, Duke University, ORISE Grantee
Monica P. Jimenez, Oak Ridge Associated Universities (ORAU) Student Services Contractor
Achal Patel, ORAU Student Services Contractor
Barbara Rosenbaum, General Dynamics Information Technology/Woolpert, Inc. (GDIT/Woolpert), Geographic Information Systems (GIS) Contractor Support
Steven Jett, GDIT/Woolpert, GIS Contractor Support

External Peer Reviewers
Sheryl Magzamen, Department of Environmental and Radiological Health Sciences at Colorado State University's College of Veterinary Medicine and Biomedical Sciences
Anne M.
Roubal, County Health Rankings & Roadmap at University of Wisconsin Population Health Institute
Ying Zhou, Infant Outcomes Monitoring, Research, and Prevention Branch, Division of Birth Defects and Infant Disorders, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention

Internal Peer Reviewers
Tom Brody, EPA Region 5
Linda Harwell, EPA ORD, Center for Environmental Measurement and Modeling

This document has been reviewed by the U.S. Environmental Protection Agency, Office of Research and Development, and approved for publication. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

Table of Contents
1.0 Overview of Report
2.0 Background
    Brief Overview of EQI 2000-2005
    EQI 2000-2005, Summary of Creation
3.0 Development of the EQI 2006-2010
    Overview
    Data Source Identification and Review
    Approach
    Summary of Activities
    Built-Environment Domain
    Summary of Changes to 2006-2010 data sources from original 2000-2005 EQI
    Variable Construction
    Approach
    Summary of Activities
    Changes to 2006-2010 variable construction from original 2000-2005 EQI
    Data Reduction and Index Construction
    Overall Approach
    Results
    Changes to 2006-2010 index construction from original 2000-2005 EQI
    Domain-Specific Index Description and Loadings on Overall EQI
4.0 Discussion
    Summary of changes made to 2006-2010 version compared with 2000-2005
    Strengths and Limitations
    Conclusion
5.0 References
Appendix I: List of References Related to 2000-2005 Environmental Quality Index
Appendix II: Identified Variables by Source for Each Domain
Appendix III: Changes in Variables from EQI 2000-2005 to EQI 2006-2010
Appendix IV: Table of Highly Correlated Variables for Each Domain
Appendix V: Sociodemographic and Built-Domain Valence Correction
Appendix VI: County Maps of Environmental Quality Index 2006-2010
Appendix VII: Quality Assurance

List of Tables
Table 1. Constructs for each environmental domain
Table 2. Sources of data for air, water, land, built-environment, and sociodemographic domains for use in the county Environmental Quality Index 2006-2010
Table 3. 2005 NATA variables included in EQI 2006-2010
Table 4. Air domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified
Table 5. Water domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified
Table 6. Land domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified
Table 7. Sociodemographic domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified
Table 8. Built-environment domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified
Table 9. Variable loadings, valence determination of variables - Air domain
Table 10. Variable loadings, valence determination of variables - Water domain
Table 11. Variable loadings, valence determination of variables - Land domain
Table 12. Valence corrected variable loadings, valence determination of variables - Sociodemographic domain
Table 13. Valence corrected variable loadings, valence determination of variables - Built domain
Table 14. Description of the domain indices contributing to the overall and rural-urban continuum codes (RUCCs) stratified Environmental Quality Index for 3143 U.S. counties (2006-2010)
Table 15. Loadings of the domain indices contributing to the overall and rural-urban continuum codes (RUCCs) stratified Environmental Quality Index for 3143 U.S. counties (2006-2010)

List of Figures
Figure 1. Conceptual environmental quality - Hazardous and beneficial aspects
Figure 2.
Principal component analysis for the Environmental Quality Index (EQI). All counties included with four rural-urban continuum codes (RUCCs)
Figure 3. Rural-urban continuum code (RUCC) stratification for all counties in the United States

List of Acronyms
ACRES      Assessment, Cleanup, and Redevelopment Exchange
AQS        Air Quality System
C-CAP      Coastal Change Analysis Program
CO         Carbon monoxide
CWA        Clean Water Act
EPA        United States Environmental Protection Agency
EQI        Environmental Quality Index
FARS       Fatality Analysis Reporting System
FBI UCR    Federal Bureau of Investigation Uniform Crime Report
FIPS       Federal Information Processing Standard
GIS        Geographic information systems
GTFS       General Transit Feed Specification
HAP        Hazardous air pollutant
HUD        Housing and Urban Development
LEHD       Longitudinal Employer-Household Dynamics
LQG        Large Quantity Generators
MRLC       Multi-Resolution Land Characteristics
MSHA       Mine Safety and Health Administration
NADP       National Atmospheric Deposition Program
NATA       National-Scale Air Toxics Assessment
NCOD       National Contaminant Occurrence Database
NGS        National Geochemical Survey
NLCD       National Land Cover Database
NO2        Nitrogen dioxide
NPDES      National Pollutant Discharge Elimination System
NPL        National Priorities List
NPUD       National Pesticide Use Database
NWI        National Walkability Index
PCA        Principal component analysis
PM         Particulate matter
PM10       Particulate matter below 10 micrometers (µm) in aerodynamic diameter
PM2.5      Particulate matter below 2.5 micrometers (µm) in aerodynamic diameter
PWS        Public water systems
RAD        REACH Address Database
RCRA       Resource Conservation and Recovery Act
ROE        Report on the Environment
RUCC       Rural-urban continuum code
SD         Standard deviation
SDWIS      Safe Drinking Water Information System
SO2        Sulfur dioxide
SSTS       Section Seven Tracking System
TIGER      Topologically Integrated Geographic Encoding and Referencing
TOD        Transit Oriented Development
TRI        Toxics Release Inventory
TSD        Treatment, Storage, and Disposal
U.S.       United States
USDA ERS   United States Department of Agriculture Economic Research Service
WATERS     Watershed Assessment, Tracking, and Environmental Results
WQS        Water quality standards

1.0 Overview of Report
An overall Environmental Quality Index (EQI), which represents multiple domains of the ambient environment (air, water, land, built, and sociodemographic) for all counties in the United States, was created for the period 2000-2005 [1]. It was developed to provide a better estimate of overall environmental quality and to improve the understanding of the relationship between environmental conditions and human health. This report describes the efforts to update the EQI for all counties in the United States for the 2006-2010 period.

The EQI was created for two main purposes: (1) as an indicator of ambient conditions/exposure in environmental health modeling and (2) as a covariate to adjust for ambient conditions in environmental models. However, with the public release of the EQI and the variables used to construct it, other uses may emerge. The methods applied provide a reproducible approach that capitalizes almost exclusively on publicly available data sources.

This report is written for audiences interested in the construction of the EQI and is technical in nature. The created variables, EQI, domain-specific indices, and EQI stratified by rural-urban continuum codes (RUCCs) are available publicly at the United States Environmental Protection Agency's (EPA's) Environmental Dataset Gateway. An interactive map of the EQI is also available at EPA's GeoPlatform.

2.0 Background
Conceptually, the EQI accounts for the multiple domains of the environment with which humans interact (see Figure 1). These domains include chemical, natural, built, and sociodemographic environments that have both positive and negative influences on health. People move in and out of these positive and negative influences.
Also, the positive and negative influences often are co-located.

Brief Overview of EQI 2000-2005
The EQI 2000-2005 was developed in four steps: (1) the five domains were identified, (2) data for each of the five domains were located and reviewed, (3) environmental variables were developed from the data sources, and (4) data were combined within each of the environmental domains, and the resulting domain indices were then used to create the overall EQI. The EQI relied on data sources that were mostly available to the public. Below is a summary of the creation of the county-level EQI 2000-2005. For more detailed technical information, see the technical report for EQI 2000-2005 [1], located at the Environmental Dataset Gateway.

EQI 2000-2005, Summary of Creation
Domain Identification. Based on three sources, (1) the Report on the Environment (ROE) [2], (2) literature review, and (3) experts, five environmental domains were identified and developed for the EQI: (1) air, (2) water, (3) land, (4) built, and (5) sociodemographic.

Data Source Identification and Review. Predetermined constructs were identified to represent each domain. Based on those constructs, data sources were explored to provide variables representing those constructs.

Air Domain: Three data types were considered: (1) monitoring data, (2) emissions data, and (3) modeled estimates representing two constructs: concentrations of either criteria air pollutants or hazardous air pollutants (toxics). Twelve data sources were identified, and seven were considered for the EQI. Two were used for the air domain of the EQI because they were the most complete.

Water Domain: Five broad data types within the water domain were identified: (1) modeled, (2) monitoring, (3) reported, (4) survey/study, and (5) miscellaneous data. Eighty data sources were identified.
Five were used for the water domain of the EQI, representing seven constructs: water quality, general water contamination, recreational water quality, domestic use, deposition, drought, and chemical contamination.

Land Domain: Land domain data sources were grouped into five constructs: (1) agriculture, (2) pesticides, (3) contaminants, (4) facilities, and (5) radon. Eighty sources were identified. Eleven were retained.

[Figure 1. Conceptual environmental quality - Hazardous and beneficial aspects.]

Built-Environment Domain: The built environment considered five data types: (1) traffic-related, (2) transit access, (3) pedestrian safety, (4) access to various business environments (such as the food, recreation, health care, and educational environments), and (5) the presence of subsidized housing. Twelve data sources were identified, and four were retained for the built-environment domain of the EQI for five constructs: (1) roads, (2) highway road safety, (3) public transit behavior, (4) business environments (physical activity, food, health care, and educational), and (5) subsidized housing.

Sociodemographic Domain: The sociodemographic domain is represented by crime and socioeconomic constructs. Only two data sources were identified for the sociodemographic domain of the EQI, one for each of the constructs.

Variable Construction. After researching and choosing data sources, variables were created to represent each of the five domains. New variables were created because raw data sources were not always appropriate for statistical analysis. The process for selecting and creating variables included making variables for each domain for each available year of data (2000-2005); looking for highly correlated variables that give the same information statistically, deciding which of them best represents the environmental domain, and removing the extra variables; looking for missing data; looking at the distribution and statistical properties of each variable and deciding how it should be scaled for analysis; and averaging variables from 2000-2005 for each county.

Data Reduction and Index Construction. After variables were created, they were combined into a single index (the EQI) using statistical methods. Each domain has its own index (air domain index, water domain index, etc.). Next, each of the domain-specific indices was used to create the overall EQI. The statistical process used to add these variables together is called principal component analysis (PCA). Figure 2 shows these steps.

[Figure 2. Principal component analysis for the Environmental Quality Index (EQI). All counties included with four rural-urban continuum codes (RUCCs). Principal components analysis (PCA) reduced multiple variables (air, water, land, built, and sociodemographic) into domain-specific indices for each RUCC stratum and overall. Domain-specific indices were combined using PCA to create the EQI for each RUCC stratum and overall. RUCC1 = metropolitan-urbanized; RUCC2 = nonmetropolitan-urbanized; RUCC3 = less urbanized; RUCC4 = thinly populated.]

Since the creation of the EQI 2000-2005, multiple studies have examined the relationship between overall environmental quality and health outcomes, including preterm birth [3], mortality [4], cancer incidence [5], asthma prevalence [6], physical inactivity and obesity [7], infant mortality [8], and pediatric multiple sclerosis [9]. A complete list of references related to EQI and health outcomes is shown in Appendix I.

3.0 Development of the EQI 2006-2010

Overview
The development of the EQI 2006-2010 followed mostly the same protocol as the EQI 2000-2005.
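As a rough illustration of the two-stage index construction carried over from that protocol (variables reduced to domain indices by PCA, then domain indices combined by PCA), the following is a minimal sketch in plain Python. It is not the report's code: the toy values, the dictionary `domains`, the function names, and the choice of power iteration on the correlation matrix to obtain the first principal component are all assumptions made for illustration.

```python
import math

def standardize(values):
    """Center a variable at 0 and scale it to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / sd for x in values]

def first_component(columns, iterations=200):
    """Return (loadings, scores) for the first principal component.

    columns: list of variables, each a list with one value per county.
    Power iteration on the correlation matrix is an illustrative choice.
    """
    z = [standardize(col) for col in columns]
    p, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    # Starting from the first unit vector makes variable 0 load positively;
    # the sign of a principal component is otherwise arbitrary.
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(v[j] * z[j][k] for j in range(p)) for k in range(n)]
    return v, scores

# Hypothetical toy data: six counties, two domains, made-up values.
domains = {
    "air": [[12.1, 9.8, 7.5, 11.0, 6.2, 8.9],             # e.g., PM2.5
            [0.045, 0.041, 0.036, 0.044, 0.033, 0.040]],  # e.g., ozone
    "water": [[3.0, 1.2, 0.4, 2.5, 0.9, 1.1],             # e.g., permits per 1000 km
              [10.0, 4.0, 2.0, 12.0, 1.0, 5.0]],          # e.g., percent in drought
}

# Stage 1: one index per domain (first-component scores of its variables).
domain_indices = {name: first_component(cols)[1] for name, cols in domains.items()}

# Stage 2: combine the domain indices into a single overall index.
eqi_loadings, eqi = first_component(list(domain_indices.values()))
```

In this sketch the same two stages would simply be repeated on each RUCC subset of counties to obtain stratified indices. Because the sign of a component is arbitrary, the direction of each index still has to be interpreted before it is read as "better" or "worse" environmental quality.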
The majority of constructs identified for each of the five domains in the EQI 2000-2005 were maintained as the basis for variable identification, with the exception of one deletion each in the water domain and land domain and constructs added to the water domain, land domain, sociodemographic domain, and built-environment domain. Most data sources remained unchanged. Principal components analysis was used to develop the indices. However, using lessons learned from the creation of the EQI 2000-2005, some modifications were adopted to improve the EQI 2006-2010; these modifications included exploring new data sources that were not available during EQI 2000-2005 development, assessment of all variables for continued inclusion in the EQI, and assessment of variables' valence within a domain and valence correction. This section outlines the development of the EQI 2006-2010 through (1) data source identification and review, (2) variable construction, and (3) data reduction and index construction.

Data Source Identification and Review

Approach

Data Selection
An index that comprehensively captures the total environment relating to human health requires numerous variables representing the full range of health-influencing exposures. From within each domain identified in the conceptual model (air, water, land, sociodemographic, and built environments), specific constructs or major areas were identified (Table 1). In general, the identified constructs from EQI 2000-2005 were maintained for the EQI 2006-2010. However, in the water domain, we removed the "recreational water quality" construct, as it only provided data for the 231 counties in the United States with beach recreational waters. Because of this low representation, the variables in this construct had extremely low loading values in the principal components analysis; therefore, they were removed in the 2006-2010 EQI. In addition, a dataset representing drinking water quality was identified, so we were able to include a "Drinking water quality" construct.

In the land domain, the "Contaminants" construct was eliminated. We eliminated these data because they were not the same quality as the rest of the data for the EQI. There was a lack of updated contaminants data and, because of the high correlation between this construct and constructs in other domains, contaminants of this type were better represented by water contaminant data. Also, in the land domain, a "Mining activity" construct was added.

The sociodemographic domain added two new constructs: (1) political character and (2) creative class representation. There was also a change in how educational attainment was represented in the 2006-2010 EQI. The education variable changed from percent of adults with greater than high school education in the 2000-2005 EQI to percent of adults with a college education in the 2006-2010 EQI; the college-education measure was adopted because it has more variability, as almost all citizens have a high school education at this time.

The built-environment domain added two new constructs: (1) walkability and (2) green space.

Table 1. Constructs for each environmental domain.
Domain              Constructs
Air                 Criteria air pollutants
                    Hazardous air pollutants
Water               Overall water quality
                    General water contamination
                    Domestic use
                    Atmospheric deposition
                    Drought
                    Chemical contamination
                    Drinking water quality (new 2006-2010)
Land                Agriculture
                    Pesticides
                    Facilities
                    Radon
                    Mining activity (new 2006-2010)
Sociodemographic    Socioeconomic
                    Crime
                    Political character (new 2006-2010)
                    Creative class representation (new 2006-2010)
Built Environment   Roads
                    Highway/road safety
                    Commuting behavior
                    Business environment
                    Housing environment
                    Walkability (new 2006-2010)
                    Green space (new 2006-2010)

Data sources were explored to identify variables that could be constructed to represent the identified constructs.
All data sources used for EQI 2000-2005 were reviewed for data updates, and a subsequent search was conducted to identify potential new data sources. We had solid representation of data for most domains, and we sought to ensure continuity and comparability for the 2006-2010 EQI. Still, our update required identification of new data sources to ensure representation of identified constructs. Because the team came to appreciate the limitations and knowledge gaps in data from the original EQI, the data source identification process was different for the 2006-2010 period than that undertaken for the original (2000-2005) EQI. For example, because of limitations in the National Geochemical Survey representing the geology construct in the land domain, we looked for alternative sources and are now using mines data in the land domain. In recognition of gaps, such as the absence of walkability in the built domain and absence of political climate in the sociodemographic domain, we sought additional data sources to represent the new constructs that we believed would represent more fully the environmental quality of a county. The details of the new data sources that were identified and included in the EQI 2006-2010 are included in the data source descriptions below.

Data Source Search
Once the desired constructs were identified, the research team conducted an extensive search for potential sources for data to represent those constructs.
In general, a broad approach to searching for data sources was undertaken to identify EPA and non-EPA domain-specific environmental data sources for all counties in the 50 states of the United States; summarize environmental data source availability, quality, spatial and temporal coverage, storage requirements, and acquisition steps; and obtain the identified data. Possible data sources were identified using Web-based search engines (e.g., Google), site-specific search engines (e.g., federal and state data sites), literature-reported data sources (e.g., PubMed, ScienceDirect, TOXNET), and personal communications from data owners. Data that were available at, or had the potential to be aggregated to, the U.S. county level were sought. Data were restricted to represent the years 2006-2010.

Data Quality and Coverage Assessment
Once potential data sources were identified, several criteria were used to assess sources for inclusion in the EQI. First, constructs representing each domain were identified. Data sources were evaluated as to whether variables could be developed to represent the construct. If a data source could provide variables for a construct in the domain, then data quality and data coverage were used to evaluate data sources for use in the EQI.

Data sources of the highest quality were sought. Quality was assessed in one or more of the following ways: through documentation and discussion with the data source managers; through data reports and internal documentation; by project investigators; and by the larger field of environmental research through its use and critique of the various data sources.

Data coverage, which included spatial and temporal components, was more challenging to achieve. Coverage for the entire United States, including Alaska and Hawaii, was one important spatial criterion. Often, it was relatively straightforward to identify high-quality data on a few individual locations or a small geographic area, but the EQI was developed to represent all counties (N=3143) in all 50 states. A second spatial criterion was county-level representation, so data had to be constructible at the county level for inclusion (e.g., as an average of point measures or census tract values). Temporally, ideal sources would have had annual data for the 2006-2010 period. At minimum, however, at least some data must have fallen within the 2006-2010 period or close to this time.

In theory, a "perfect" data source would have variable measurements at high temporal and spatial resolutions. In practice, data often met one but not both criteria, and evaluation of trade-off values was required, along with consideration of data quality. Unfortunately, some of the data sources used in EQI 2000-2005 did not have any updates for the 2006-2010 period. Redundant data sources that met the criteria for inclusion but were not selected were retained for use in sensitivity analyses.

Summary of Activities
Table 2 identifies the data sources that were acquired and used for the construction of the EQI and includes a description of each data source and the variables constructed from it.
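The county-level and temporal criteria described above (values constructible per county, ideally from annual data within 2006-2010) can be illustrated with a short sketch. This is a hedged example, not the procedure used for any particular source: the record layout `(fips, year, value)`, the function name `county_period_average`, and the choice to average point measures within each year before averaging across years are assumptions, and the sample values are made up.

```python
from collections import defaultdict
from statistics import mean

STUDY_YEARS = range(2006, 2011)  # 2006 through 2010 inclusive

def county_period_average(records):
    """Collapse point measurements to one value per county.

    records: iterable of (fips, year, value) tuples, e.g., monitor readings.
    Step 1 averages all measurements within a county and year;
    step 2 averages the available annual means across the study period.
    Records outside 2006-2010 are ignored.
    """
    by_county_year = defaultdict(list)
    for fips, year, value in records:
        if year in STUDY_YEARS:
            by_county_year[(fips, year)].append(value)

    annual_means = defaultdict(list)
    for (fips, _year), values in by_county_year.items():
        annual_means[fips].append(mean(values))

    return {fips: mean(values) for fips, values in annual_means.items()}

# Made-up readings for two counties identified by FIPS code.
sample = [
    ("37063", 2006, 10.0), ("37063", 2006, 12.0), ("37063", 2007, 9.0),
    ("37135", 2010, 8.0), ("37135", 2011, 99.0),  # the 2011 record is out of range
]
averages = county_period_average(sample)
```

Counties that never appear in the result have no data in the period, which is one way a coverage gap of the kind discussed above would surface when checking a source against the full county list.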
6 ------- Table 2 Sources of data for air, water, land, built-environment, and sociodemographic domains for use in the county Environmental Quality Index 20006-2010 Air Domain Source of Data Air Quality System (AOS 2006-2010) [10] Description Repository of ambient air quality data, including both criteria and hazardous air pollutants (HAPs) Variables* PM|0 - Particulate matter under 10 |jg in aerodynamic diameter (|jg/mJ 5-year average); PM,5 - Particulate matter under 2.5 |jg in aerodynamic diameter (|jg/ md 5-year average); NO,- Nitrogen dioxide (parts per billion [ppb] 5-year average); SO, - Sulfur dioxide (ppb 5-year average); 03 - Ozone (parts per million (ppm) 5-year average); CO - Carbon -monoxide (ppm 5-year average) EQI version 2000-2005 and updated 2006-2010 National-Scale Air Toxics Assessment (NATA 2005)[11] Estimates of HAP concentrations using emissions information from the National Emissions Inventory and meteorological data input into the Assessment System for Population Exposure Nationwide model Water Domain Source of Data Watershed Assessment, Tracking and Environmental Results Program Database (WATERS)[12] A_TeCA -1,1,2,2-tetrachloroethane (tons emitted per year); A_112TCA-1,1,2-trichloroethane (tons emitted per year); A_DBCP -1,2-dibromo-3-chloropropane (tons emitted per year); A_Acrylic_acid-Acrylic acid (tons emitted per year); A_Benzidine - Benzidine (tons 2000-2005 and 2006-2010 (used 2005 NATA only) emitted per year) emitted per year) emitted per year) Description Collection of EPA water assessments programs, including impairment, water quality standards, pollutant discharge permits, and beach violations ; A_Benzyl_CI - Benzyl chloride (tons ; A_Be - Beryllium compounds (tons ; A_DEHP - b/s-2-ethylhexyl phthalate (tons emitted per year); A_CCI4 - Carbon tetrachloride (tons emitted per year); A_CS - Carbon sulfide (tons emitted per year); A_CI - Chlorine; A_C6H5CI - Chlorobenzene (tons emitted peryear); A_chloroform - Chloroform (tons 
emitted per year); A_Chloroprene - Chloroprene (tons emitted per year); A_Cr - Chromium compounds (tons emitted peryear); A_Co - Cobalt compounds (tons emitted peryear); A_CN - Cyanide compounds (tons emitted peryear); A_DBP - Dibutylphthalate (tons emitted per year); A_EtCI - Ethyl chloride (tons emitted per year); A_EDB - Ethylene dibromide (tons emitted per year); A_EDC - Ethylene dichloride (tons emitted per year); A_Formaldehyde - Formaldehyde (tons emitted per year); A_Glycol_ethers - Glycol ethers (tons emitted per year); A_N2H2 - Hydrazine (tons emitted per year); A_HCI - Hydrochloric acid (tons emitted per year); AJsophorone - Isophorone (tons emitted per year); A_Mn - Manganese compounds (tons emitted peryear); A_MeBr - Methyl bromide (tons emitted peryear); A_MeCI - Methyl chloride (tons emitted per year); A_PH3 - Phosphine (tons emitted per year); A_PCBs - Polychlorinated biphenyls (tons emitted per year); A_ProCI2 - Propylene dichloride (tons emitted per year); A_Quinolin - Quinoline (tons emitted per year); A_C2HCI3 - Trichloroethylene (tons emitted per year); A_VyCI - Vinyl chloride (tons emitted peryear) Variables! ALLNPDESperKMJn - All NPDES permits per 1000 km of stream in county (permits per 1000 km stream length) EQI version 2000-2005 and updated 2006-2010 7 ------- Table 2. continued National Atmospheric Deposition Program (NADP 2006-2010) [13] Estimated Use of Water in the United States (2010)[14] Drought Monitor Data (2006-2010) [15] Samples both regulated and unregulated contaminants in public water supplies; maintained by EPA to satisfy statutory requirements for Safe Drinking Water Act County-level estimates of water withdrawals for domestic, agricultural, and industrial use calculated by the United States Geological Survey Geographic information systems raster files reporting weekly modeled drought conditions; a collaboration that includes the National Atmospheric and Oceanic Administration, the U.S. 
Department of Agriculture, and academic partners. Measures deposition ofvarious pollutants, such as calcium, sodium, potassium, and sulfate, from rainfall Safe Drinking Water Monitoring of public water systems for health-based Information System violations (SDWIS 2006-2010) [17]{United States Environmental Protection Agency (EPA), #966} Land Domain Source of Data Description National Pesticide Delineates state-level pesticide usage rates for cropland Use Database: applications; contains estimates for active ingredients, of 2009[18] which 68 are insecticides, and 22 are other pesticides CaAveJn - Calcium (Ca) precipitation weighted mean (mg/L); KAveJn - Potassium (K) precipitation weighted mean (mg/L); N03Ave - Nitrate (NO-J precipitation weighted mean (mg/L); ClAveJn - Chloride (CI) precipitation weighted mean (mg/L); S04_mean_ave - Sulfate (S04) precipitation weighted mean (mg/L); HgAve - Total mercury deposition (ng/M J Per_TotPopSS - Percent of population on self supply (percent); Per_PSWithSW - Percent of public supply population that is on surface water (percent) 2000-2005 and updated 2006-2010 AvgOfD3_ave - Percent of county drought - (D3-D4) (percent) W_As_ln -Arsenic (mg/L); W_Ba_ln - Barium (mg/L); W_Cd_ln - Cadmium (mg/L); W_Cr_ln - Chromium (total) (mg/L); W_CN_ln - Cyanide (mg/L); W_FL_ln - Fluoride (mg/L); W_HG_ln - Mercury (inorganic) (mg/L); W_N03_ln - Nitrate (as N) (mg/L); W_N02_ln - Nitrite (as N) (mg/L); W_SE_ln - Selenium (mg/L); W_Sb_ln -Antimony (mg/L); W_Endrin_ln - Endrin (|jg /L); W_ methoxychlorjn - Methoxychlor (ug/L); W_Dalapon_ln - Dalapon (|jg /L); W_DEHA_ln - Di(2-ethylhexyl)adipate (DEHA) (|jg /L); W_Simazine_ln - Simazine (|jg /L); W_DEHP_ln - Di(2-ethylhexyl) phthalate (DEHP)( pg /L); W_Picloram_ln - Picloram (|jg /L); W_Dinoseb_ln - Dinoseb (|jg /L); W_atrazine_ln - Atrazine (|jg /L); W_24D_ln - 2,4-D (2,4-Dichlorophenoxyacetic acid) (|jg /L); W_BenzoAP_ln - Benzo[a]pyrene (|jg /L); W_PCP_ln - Pentachlorophenol (|jg /L); W_PCB_ln 
- Polychlorinated biphenyls (PCBs) (|jg /L); W_DBCP_ln -1,2-Dibromo-3-chloropropane (DBCP) (|jg /L); W_EDB_ln - Ethylene dibromide (EDB) (|jg /L); W_xylenes_ln - Xylenes (Total)( |jg /L); W_Chlordane_ln - Chlordane (|jg /L); W_DCM_ln - Dichloromethane (methylene chloride) (|jg /L); W_ PDCBJn -1,4-Dichlorobenzene (p-dichlorobenzene) (|jg /L); W_111 trichlorane_ln -1,1,1 -Trichloroethane (|jg /L); W_Trichlorene_ln - Trichloroethylene (|jg /L); W_C2CI4_ln - Tetrachloroethylene (|jg /L); W_ benzenejn - Monochlorobenzene (chlorobenzene) (|jg /L); W_Toluene_ln - Toluene (|jg /L); W_ethylbenz_ln - Ethylbenzene (|jg /L); W_styrene_ln - Styrene (|jg /L); W_Alpha - Alpha Particles (Gross Alpha, excluding radon and uranium) (pCi/L); W_DCE_ln - cis-1,2- Dichloroethylene (|jg /L) Coliform_proportion_ln - Total coliform proportion (average number of violations*(population served/ county population) Variablest insecticidejn - Insecticide applied (lb); herbicidejn- 2000-2005 and updated 2006-2010 extreme 2000-2005 and updated 2006-2010 2000-2005 and 2006-2010 (not updated, used same variables from 2000-2005) 2006-2010 EQI version 2000-2005 and updated 2006-2010 Herbicides applied ( applied (lb) ; fungicidejn - Fungicides 8 ------- Table 2. 
continued 2007 Census of Agriculture Full Report[19] Summary of agricultural activity including number of farms by size and type, inventory and values for crops and livestock, and operator characteristics pct_manure_acres_ln - Manure, acres applied per county acres (percent); pct_nematode_acres_ln - Chemicals used to control nematodes, acres applied per county acres (percent); pct_disease_acres_ln - Chemicals used to control diseases in crops and orchards, acres applied per county acres (percent); pct_defoliate_acres_ln - Chemicals used to control growth, thin fruit, or defoliate, acres applied per county acres (percent); Pct_AU_ln - Animal units, animal units per county acres (percent); farms_per_acre_ln - Number of farms (number); pct_irrigated_acres_ln - Irrigated acres, acres irrigated per county acres (percent); pct_harvested_acres_ln - Harvested acres, acres harvested per county acres (percent) 2000-2005 and updated 2006-2010 EPA Geospatial Data Download Service (2006-2010)[20] Maintained by EPA and provides locations of and information on facilities throughout the United States; different datasets within this database are updated at different intervals, but most are updated monthly; no set spatial scale across datasets. Some provide addresses, some geocoded addresses, etc. 
Variables†: facilities_rate_ln - Log transformed rate of all facilities per county (proportion)
EQI version: 2000-2005 and updated 2006-2010

Source of Data: Map of Radon Zones[21]
Description: Identifies areas of the United States with the potential for elevated indoor radon levels; maintained by EPA
Variables†: Radon - Radon zone (ordinal value)
EQI version: 2000-2005 and 2006-2010 (not updated, used same variable from 2000-2005)

Source of Data: Mine Safety and Health Administration (MSHA) Mines Data Set (2006-2010)[22]
Description: Includes status of coal/metal/nonmetal mines under MSHA jurisdiction since 1970
Variables†: std_coal_prim_pop_ln - Primarily coal mines, mines per county population (proportion); std_metal_prim_pop_ln - Primarily metal mines, mines per county population (proportion); std_nonmetal_prim_pop_ln - Primarily nonmetal mines, mines per county population (proportion); std_sandandgravel_prim_pop_ln - Primarily sand and gravel mines, mines per county population (proportion); std_stone_prim_pop_ln - Primarily stone mines, mines per county population (proportion)
EQI version: 2006-2010

Source of Data: National Geochemical Survey[23]
Description: Geochemical data (arsenic, selenium, mercury, lead, zinc, magnesium, manganese, iron, etc.) for the United States based on stream sediment samples
EQI version: 2000-2005; not used in 2006-2010.
These data are represented in the water domain with the National Contaminant Occurrence Database (2006-2010) and the National Atmospheric Deposition Program (2006-2010)

Sociodemographic Domain

Source of Data: United States Census (2010)[24]
Description: County-level population and housing characteristics, including density, race, spatial distribution, education, socioeconomics, home and neighborhood features, and land use
Variables†: Pct_RenterOcc - Percent renter-occupied units (percent); Pct_Vacant_Housing - Percent vacant units (percent); Med_HH_Value - Median household value (dollars); ln_HH_Inc - Natural log transformed median household income (dollars); pct_fam_pov - Percent of families living below federal poverty level (percent); pct_BS - Percent of persons with bachelor's degree or higher, age 25+ (percent); pct_unemp_total - Percent of persons who are unemployed (percent); ln_Occs_Room - Natural log transformed number of occupants per room (count); GINI_est - Measure of income inequality (proportion)
EQI version: 2000-2005 and updated 2006-2010

Source of Data: Uniform Crime Reports (2006-2010)[25]
Description: County-level reports of violent crime
Variables†: ln_ViolAv - Natural log transformed violent crime rate (log of count of violent crimes / county population)
EQI version: 2000-2005 and updated 2006-2010

Source of Data: Dave Leip's Atlas of U.S. Presidential Elections (2008)[26]
Description: 2008 Election results
Variables†: DEM02008 - Percent county voting Democratic in 2008 (percent)
EQI version: 2006-2010

Source of Data: United States Department of Agriculture Economic Research Service Creative Class County Codes (2010)[27]
Description: An index of a county's share of population employed in occupations that require "thinking creatively"
Variables†: num_CreatClass - Percent county employed in a creative class (percent)
EQI version: 2006-2010

Built-Environment Domain

Source of Data: Dun and Bradstreet North American Industry Classification System codes (2008)[28]
Description: Description of physical activity environment (recreation facilities, parks, physical-fitness-related businesses), food environment (fast food restaurants, groceries, convenience stores), and education environment (schools, daycares, universities) per county
Variables†: al_pwn_gm_env_rate_ln - Natural log transformed rate of vice-related businesses per county (log of count of businesses / county population); ed_env_rate_ln - Natural log transformed rate of education-related businesses per county (log of count of businesses / county population); neg_food_rate_ln - Natural log transformed rate of negative food resources per county (log of count of businesses / county population); pos_food_rate_ln - Natural log transformed rate of positive food resources per county (log of count of businesses / county population); hc_env_rate_ln - Natural log transformed rate of health-care-related businesses per county (log of count of businesses / county population); rec_env_rate_ln - Natural log transformed rate of recreation-related businesses per county (log of count of businesses / county population); ss_env_rate_ln - Natural log transformed rate of social service agencies per county (log of count of businesses / county population); civic_env_rate_ln - Natural log transformed rate of civic-related businesses per county (log of count of businesses / county population)
EQI version: 2000-2005 and updated 2006-2010

Source of Data: Topologically Integrated Geographic Encoding and Referencing (2009)[29] and NAVTEQ map data[30]
Description: Road type and length per county; road types by county created by joining NAVTEQ map data to Topologically Integrated Geographic Encoding and Referencing (TIGER) county definitions
Variables†: SecondaryRoadProportion - Proportion of all roads that are secondary roads (proportion)
EQI version: 2000-2005 and updated 2006-2010

Source of Data: Fatality Analysis Reporting System (2006-2010)[31]
Description: Annual pedestrian-related fatality per 100,000 population; maintained by National Highway Safety Commission
Variables†: Ln_fatalities - Natural log transformed rate (count / county population) of fatal car crashes per county (log transformed count / county population)
EQI version: 2000-2005 and updated 2006-2010

Source of Data: Housing and Urban Development Data (2010)[32]
Description: Housing authority profiles provide general housing details (low-rent and subsidized/Section 8 housing); information updated by individual public housing agencies.
Variables†: total_units_ln - Natural log transformed rate of the sum of the following two variables (low_rent_units - Count of low-rent units per county [count] and section_eight_units - Count of section eight units per county [count]) (log of summation of units / county population)
EQI version: 2000-2005 and updated 2006-2010

Source of Data: United States Census (2010)[24]
Description: County-level population characteristics, including density, race, spatial distribution, education, socioeconomics, home and neighborhood features, and land use
Variables†: CommuteTime - Time it takes to travel from home to work (min); ln_PubTrans - Natural log of percent of county residents who report using public transportation (percent)
EQI version: 2006-2010

Source of Data: EnviroAtlas Green space dataset (2011, 2005-2011)[33]
Description: Description of 20 different land covers for National Land Cover Database (NLCD)[34] and 24 for Coastal Change Analysis Program (C-CAP)[35]; given as percent of county
Variables†: NINDEX_open - Percent of county land area classified as natural land cover and open space developed land cover (percent)
EQI version: 2006-2010

Source of Data: EPA's National Walkability Index (2010)[36]
Description: Characterizes every census block group walkability on a score from 0 to 20 based on four variables: (1) mix of employment types and occupied housing, (2) mix of employment types in a block group, (3) street intersection density, and (4) predicted commute mode split - proportion of workers in the block group who carpool
Variables†: sum_NWIBG - Walkability score (ordinal)
EQI version: 2006-2010

† Air domain: All variables are natural log transformed with the exceptions of A_edb, A_formaldehyde, O3, PM10, and PM2.5. Water, Land, and Built domains: Variables with _ln indicate natural log transformation. Sociodemographic domain: ln_ indicates natural log transformation.
Data sources highlighted in blue are new data sources added to 2006-2010 EQI. Data sources highlighted in green are data sources used in 2000-2005 EQI but are not included in 2006-2010 EQI.

Air Domain

Two constructs represent the air domain: (1) criteria air pollutants and (2) hazardous air pollutants (HAPs). The Air Quality System (AQS)[10] was used to construct variables for the criteria air pollutants, and the National-Scale Air Toxics Assessment (NATA) database[11] was used to construct variables for the HAPs. The AQS is a repository for criteria ambient air pollution data collected by federal, state, local, and tribal agencies from thousands of monitors for EPA's ambient air monitoring program across the United States.
Monitored pollutants include all criteria air pollutants, PM species, and approximately 60 ozone precursors. Major strengths of the AQS are that data are measured, rather than modeled, and these measurements are synchronized across the country. Monitors in the network and the reported data are audited regularly for accuracy and precision. However, most of the ambient air monitors are located in or near urban areas, leaving many U.S. counties without reported data. In addition, the AQS provides sparse and limited data collection for HAPs. The NATA database uses data from the National Emissions Inventory[37] to construct air dispersion models for estimating ambient concentrations of HAPs at the county and census-tract levels. Since 1996, the National Emissions Inventory data have been constructed every 3 years, providing annual estimates. The NATA databases contain estimated ambient concentrations for 177 to 180 of the 187 HAPs and use validated models that take meteorology and chemical dispersion into account. The methodology for estimating concentrations may change between assessments, but these modifications are well documented and justified. Although the ambient concentrations may be comparable over time, some differences between estimates are attributable to these minor methodological modifications. The temporal resolution of the assessments is adequate for the intended EQI, but, because of the 3-year release schedule, there are gaps in temporal coverage. NATA 2008 was not developed; thus, for EQI 2006-2010, NATA 2005 was used.

Water Domain

The water domain included six data sources: (1) the WATERS program database[12], (2) Estimated Use of Water in the United States
[14], (3) the National Atmospheric Deposition Program (NADP)[13], (4) the Drought Monitor Network[15], (5) the National Contaminant Occurrence Database (NCOD)[16], and (6) the Safe Drinking Water Information System (SDWIS)[17]. Using these six data sources, variables were created to represent seven constructs that describe the overall water environment. The seven constructs were (1) overall water quality, (2) general water contamination, (3) drinking water quality, (4) domestic use, (5) atmospheric deposition, (6) drought, and (7) chemical contamination. The Watershed Assessment, Tracking, and Environmental Results (WATERS) Program[12] database represents the surface water assessment programs under the Clean Water Act (CWA). A limitation of this data source is that data are maintained at the state level and reported to the federal system. Although all states report county-level data, there is little consistency in the temporal reporting and type of data reported across states. These data were first geocoded to a specific stream length in the National Hydrography Dataset[38] via the REACH Address Database (RAD)[39]. The geocoded WATERS program data were used to calculate human-exposure-related variables, such as percentage of stream length impaired for recreational use. This dataset is the only database maintaining information on EPA CWA regulations, which is a strength. The National Contaminant Occurrence Database (NCOD)[16] is a surveillance database maintained to satisfy the requirements of the Safe Drinking Water Act. This database includes information on contaminants in public water supplies that are not measured elsewhere. The survey is conducted every 6 years, and data are provided by public water suppliers. The data are limited, as they are provided by public water suppliers, and, therefore, spatial aggregation was needed to get county-level estimates. Estimated Use of Water in the United States[14],
which is modeled by the United States Geological Survey, provided county-level estimates of water withdrawals (an indication of water stress in a county) for domestic, irrigation, livestock, and industrial use. This dataset already is provided at the county level, which is a strength; however, it is limited, as the estimates are based on several different data sources. Two data sources provided information on meteorological impacts on water quality. The Drought Monitor Data[15] are modeled weekly drought conditions. Weekly coverage for the entire country is a strength of this dataset. The National Atmospheric Deposition Program (NADP)[13] provided weekly measures and national coverage of the deposition of various pollutants from rainfall using monitors around the country. Again, this database provided weekly information for the entire country; however, it was reported by monitors and required spatial aggregation to achieve county-level estimates. Drinking water quality data were gathered from the Safe Drinking Water Information System (SDWIS)[17], which is a repository maintained for compliance with federal regulations. This is a new data source to the water domain. SDWIS provides publicly available data based on requirements from the Safe Drinking Water Act. States are required to report basic information about the public water systems (PWS), violations, and enforcement information. The health-based violations provided in SDWIS are not measured elsewhere.
Of the SDWIS measures, only total coliform health-based violations were considered for inclusion in the 2006-2010 EQI, because the other contaminant categories had a high frequency of missing data for health-based violations (arsenic: 87.18%; ground water: 97.8%; inorganic chemicals: 97.04%; lead and copper: 90.87%; long-term enhanced surface water treatment rule 1 and 2: 87.69%; nitrates: 91.92%; radionuclides: 89.76%; disinfection and disinfectant by-products: 66.43%; surface water treatment: 90.84%; synthetic organics: 98.79%; and volatile organic chemicals: 98.5%). Average total coliform health-based violations were used to estimate the proportion of the county population affected by coliform violations between 2006 and 2010.

Land Domain

The land domain included five data sources representing five constructs: (1) Agriculture, (2) Pesticides, (3) Facilities, (4) Radon, and (5) Mining Activity. The data sources identified for this domain include the 2007 Census of Agriculture[19], the 2009 National Pesticide Use Database[18], the EPA Geospatial Data Download Service[20], the Map of Radon Zones[21], and the Mine Safety and Health Administration (MSHA) mines data[22]. The MSHA mines database is a data source new to EQI 2006-2010. Also, the National Geochemical Survey database used in EQI 2000-2005 was not used in EQI 2006-2010. The 2007 Census of Agriculture Full Report[19] was used to represent agricultural factors. Information on nonpesticide chemicals used in farming, animal units, harvested acreage, irrigated acreage, manure acreage, and proportion of farms was taken from the 2007 Census of Agriculture. The Census of Agriculture[19] data provided mostly farm-related summary characteristics and did not offer direct pesticide measures or probable exposure information. As a strictly environmental indicator, the Census of Agriculture was useful, but its ability to link to human health was somewhat limited.
Eight variables from the Census of Agriculture were included in the EQI. The 2009 National Pesticide Use Database (NPUD)[18] provides county-level rates of pesticide use. A limitation of the NPUD was its availability only for contiguous states. Pesticides were classified into three pesticide classes and then summed to estimate county-level pesticide use (in kilograms) for herbicides, fungicides, and insecticides. These three pesticide categories were included in the EQI. The industrial facilities data source, the EPA Geospatial Data Download Service[20], was used to find the following types of sites: Brownfield sites; Superfund sites; Toxic Release Inventory sites; pesticide-producing-location sites; large-quantity generator sites; and treatment, storage, and disposal sites. All facilities-related data were retained for inclusion in the EQI with extensive information on each facility for the years 2006-2010. The EPA Map of Radon Zones[21] assigned a radon potential level to each county in the United States. As the data source provided radon potential, not actual measurement, these data were limited. The three-level radon categorization masked important radon-level heterogeneity across the United States. Despite these limitations, the data sources provided land-related data not available elsewhere. The Mine Safety and Health Administration (MSHA) Mines Data Set[22] was used to create the mining activity construct. The MSHA dataset includes current and historical coal, metal, and nonmetal mines. The list included the status of each mine (Abandoned, Abandoned and Sealed, Active, Intermittent, New Mine, Nonproducing, Temporarily Idled) and the county in which the mine was located. The dataset does not include the size of each mine, so a mine may span two counties while only the county indicated by its official address is reported.
The National Geochemical Survey (NGS)[23], used in the 2000-2005 version of the EQI to determine the contaminant construct, was not included in the updated version. The NGS data provided the mean and standard deviations for multiple soil chemicals. However, these values were calculated from multiple surveys of soil samples collected over several years based on local agencies' interests and resources and, therefore, combined many varying sources of data. Because of high correlation between the NGS and both the National Contaminant Occurrence Database and the National Atmospheric Deposition Program, the decision was made to drop the NGS.

Sociodemographic Domain

The original sociodemographic domain included only two constructs: (1) socioeconomics and (2) crime. In an effort to better reflect each county's sociodemographic character, the updated sociodemographic domain for EQI 2006-2010 has four constructs: (1) Socioeconomic, (2) Crime, (3) County creative typology (new for EQI 2006-2010), and (4) County political valence (new for EQI 2006-2010). Because counties can be characterized as "working class" or "tech savvy," we added the creative typology to help capture these characteristics. Similarly, counties may be known for their political valence (e.g., a "red" county in a "blue" state); the percent voting Democratic in the 2008 election was added to capture this county characteristic. Only four data sources were identified and retained for the sociodemographic domain: (1) the United States Census Bureau[24], (2) the Federal Bureau of Investigation Uniform Crime Reports (FBI UCRs)[25], (3) the United States Department of Agriculture Economic Research Service (USDA ERS)[27], and (4) Dave Leip's Atlas of U.S. Presidential Elections (2008)[26]. The United States Census[24] reports county-level population and housing characteristics, including population density, race, spatial distribution, socioeconomic characteristics, home and neighborhood features, and land use.
One strength of this data source is its national coverage and consistency of data collection with standard methods. One weakness of this data source is its decennial collection. The FBI UCR[25] provides annual violent and property crime counts and rates for reporting areas. These data are a valuable source of crime exposure, but reporting is not mandatory and may vary by jurisdiction. The USDA ERS[27] creates a "creative class" index, derived from census data, to identify what proportion of the population may be employed in creative pursuits. This variable helps to characterize counties as being attractive to people in creative work (e.g., physicians, professors, architects). Because this variable is based on census data, it has the same strengths and weaknesses as the United States Census. Dave Leip's Atlas of U.S. Presidential Elections[26] tracks the political valence of the counties. Political valence tracks with a number of county-level attributes, such as provision of social supports and levels of school funding. Capturing this variability may be useful for differentiating counties from each other. One strength of Dave Leip's Atlas of U.S. Presidential Elections data source is its data quality, and one weakness of this data source is its infrequency of publication. Each of these data sources represents critical aspects of the human sociodemographic environment and is updated regularly and available at the county level for the entire country.

Built-Environment Domain

Built-environment data sources were identified for the following constructs: Business environment, Highway safety, Housing, Roads, Commuting practices, Walkability, and Green Space. For EQI 2006-2010, we added two new data constructs with new data sources: one representing green space and another estimating county walkability.
For the road construct, NAVTEQ road map data[30] were joined to Topologically Integrated Geographic Encoding and Referencing (TIGER)[29] county definitions to result in road types by county. NAVTEQ, whose underlying map database was based on first-hand observation of geographic features rather than on official government maps, is the majority supplier of road data for car navigation systems (around 85% of car makers). The TIGER files provide relatively uniform and nationwide coverage. From these files, county-specific proportions were characterized for various road types. Unfortunately, considerable heterogeneity may be lost; for instance, a tertiary road in Maryland may not be qualitatively equivalent to one located in Wyoming. The Fatality Analysis Reporting System[31] of the National Highway Safety Commission was retained as part of traffic safety because of its national coverage. The data are regularly updated and available from the Web site. A limitation of these data is that traffic fatalities result from diverse types of events (e.g., from road conditions or substance-involved fatalities), but this diversity is not captured well. North American Industry Classification System codes through Dun and Bradstreet[28] were used as the data source to estimate five different business environment topics: (1) physical activity, (2) food, (3) educational, (4) social, and (5) health care environments. These data are available as geocoded business addresses. Although these data have sometimes been criticized for inadequate spatial resolution (e.g., inaccurate geocoding to small units of aggregation, such as census tracts), they should be sufficient as a construct for county-level business environments of food, physical activity, and education. The Housing and Urban Development database[32] includes data on Section 8 and low-income housing.
These housing units are a feature of built environments associated with known and suspected health risks and disamenities. The EPA's National Walkability Index data[36] are the source of the walkability index. The index combines data from 2010 Census TIGER/Line shapefiles, 2010 Census Summary File 1, Census Longitudinal Employer-Household Dynamics (LEHD) 2010, InfoUSA 2011, NAVTEQ NAVSTREETS 2011, General Transit Feed Specification (GTFS) data for 228 transit agencies, and the Center for Transit Oriented Development (TOD) Database 2012 to produce a block group score, which was aggregated to the county level. The land cover data derive from the EPA's National Land Cover Database (NLCD)[34]. They represent land cover across the contiguous 48 states, circa 2011. Each 30-m by 30-m pixel has been classified using a standard land cover classification scheme, and some of these categories have been aggregated further according to procedures outlined in EPA's Report on the Environment[40]. Data originally were processed and compiled by the Multi-Resolution Land Characteristics Consortium (MRLC)[41], a United States federal interagency group, based on Landsat satellite imagery. These data are combined with NOAA's C-CAP land cover[35] county data to represent land cover for all 3143 counties.

Summary of Changes to 2006-2010 data sources from original 2000-2005 EQI

Air Domain - No changes to data sources.
Water Domain - One data source was added for 2006-2010 (SDWIS), and some variables developed from the WATERS database for 2000-2005 were not used in 2006-2010.
Land Domain - One data source was eliminated for 2006-2010 (National Geochemical Survey). One data source was added for 2006-2010:
  Mine Safety and Health Administration (MSHA) Mines Data Set (2006-2010)
Sociodemographic Domain - No data sources were eliminated for 2006-2010. Two data sources were added to the 2006-2010 EQI:
  USDA ERS Creative class data
  2008 Presidential Election results data
Built Domain - No data sources were eliminated for 2006-2010. Two data sources were added to the 2006-2010 EQI:
  EPA National Walkability data
  EPA NLCD + C-CAP data

Variable Construction

Approach

We followed the same approach in developing variables for EQI 2006-2010 that we used for EQI 2000-2005. Most variables throughout the different domains were identified previously and developed as part of the EQI 2000-2005, then were updated for the 2006-2010 period. For the newly added data sources, we developed new variables. We assessed whether each new variable needed to be standardized, either as a proportion of geographical space (e.g., road proportions) or as a rate per population (e.g., violent crimes per capita), for use in the EQI. Additionally, some data were not available for all counties and required spatial kriging to provide national coverage. Kriging is a geospatial technique that uses known data points to interpolate data at locations with unknown measurements[42]. The overall process for variable development for 2006-2010 was as follows.
(1) Update or identify and develop relevant variables within each domain for each available year (2006-2010)
(2) Assess collinearity among the variables within each domain and eliminate redundant variables
(3) Assess missing data and variability of each variable
(4) Assess normality of variables and transform as necessary
Appendix II lists all the variables included in the EQI for each of the five domains for 2006-2010 and includes notes about whether the variables were used in the previous version of the EQI or are newly created variables. Appendix III provides the variables that were used in EQI 2000-2005 but were not used in the EQI 2006-2010 update. The created variables are available publicly at EPA's Environmental Dataset Gateway.
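The transformation step in the process above can be illustrated with a short sketch. The report states elsewhere that, for variables containing zero values, half of the nonzero minimum value was added to all observations before log transformation. The function name and the sample values below are hypothetical and are not taken from the EQI code or data; this is only one plausible way to express that rule.

```python
import math

def log_transform_with_offset(values):
    """Natural-log transform a non-negative variable.

    If the variable contains zeros, half of the smallest nonzero value is
    added to every observation first (the rule described in the report).
    The function name and interface are illustrative assumptions.
    """
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    if any(v == 0 for v in values):
        nonzero = [v for v in values if v > 0]
        if not nonzero:
            raise ValueError("at least one nonzero value is required")
        offset = min(nonzero) / 2.0
    else:
        offset = 0.0  # no zeros, so no offset is needed
    return [math.log(v + offset) for v in values]

# Hypothetical county-level rates, including one county with a zero.
rates = [0.0, 0.02, 0.10, 0.50]
transformed = log_transform_with_offset(rates)
```

Adding the same offset to every observation, rather than only to the zeros, preserves the ordering of counties on the transformed scale.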
Identification and Construction of Variables from Data Sources

For each domain, all variables from EQI 2000-2005 were reviewed and assessed for continued inclusion in the EQI 2006-2010. Variables were created from selected data sources to represent the constructs. Variables were developed in a variety of ways, including kriging and standardization by area or population. Each domain section below provides the details of variable construction.

Assessing Variables

The data reduction method Principal Component Analysis (PCA) is based on the variability between variables[43]; therefore, collinearity of variables was assessed. This assessment was done by developing correlation matrices for each domain. Variables with any correlation coefficient >0.70 were examined; representative variables were chosen for each pair or group of highly correlated variables (Appendix IV). Ideally, developed variables would have measured or estimated values for each county of the United States. When this criterion was not met, or when a majority (>50%) of values were zero, the proportion of missing data and zero values was evaluated for variable inclusion. If a particular variable had information missing for many counties, the nature of the missing data was evaluated. When it was determined that the missing data could be interpreted as meaningful zeros (i.e., no measures were taken because that condition did not occur in that county), the missing values were set to zero. For instance, the counties with no reported public housing were set to zero because public housing is truly absent from some counties. When counties were missing data because reporting areas were centralized, and the missing values therefore could not be assumed to be true zeros, the data were spatially kriged when possible. For instance, crime was reported only for specific counties, even though it likely also occurred in counties other than those in which it was reported.
Therefore, crime rates were averaged spatially over adjacent counties to create an estimate for a county with no official reported crime. If the missing data could not be determined to be legitimate zeros, the data could not be reasonably kriged or averaged over geography, and the number of counties with missing data was too high (more than 50% of counties), the variable was not used in the EQI. In some instances, there may have been more than one data source that could represent a particular domain construct. In that case, the data source deemed to have better data quality and coverage was used. Finally, normality of variables was evaluated. A key assumption of PCA, the chosen data reduction technique, is that variables are normally distributed[43]. If data were nonnormal, transformations were applied (typically log transformation) to increase normality. For those variables with zero values, half of the nonzero minimum value was added to all observations before log transformation. When data were updated on an annual or regular basis, variable consistency (mean and standard deviation) was compared across each year of the 5-year period (2006-2010).

Summary of Activities

Domain-Specific Variable Descriptions

Air Domain

The air domain consists of two data sources, (1) the AQS[10] and (2) the NATA[11], representing criteria air pollutants and HAPs.

Criteria Air Pollutants

Daily concentration data from the EPA's AQS monitors (point scale) were downloaded for ozone, carbon monoxide (CO), sulfur dioxide (SO2), nitrogen dioxide (NO2), particulate matter under 10 µm in aerodynamic diameter (PM10), and particulate matter under 2.5 µm in aerodynamic diameter (PM2.5). Annual averages were calculated for each of the six pollutants at each monitor with data. These averages then were used in a kriging procedure to estimate annual concentration at each county's center point for each year from 2006 to 2010.
For the EQI spanning 2006 to 2010, a single average concentration was calculated from the annual average concentrations for each county from the kriged estimates. When indicated (i.e., lognormal distribution), half of the minimum nonzero value was added, and variables were log transformed.

Hazardous Air Pollutants (HAPs)

County-level concentration estimates from NATA were used for all HAPs included in the EQI. HAPs were selected for inclusion from the full NATA pollutant list. Using data from 2005, variables were evaluated for collinearity and variability. Variables with any correlation coefficient >0.70 were examined, and representative variables were chosen for each pair or group of highly correlated variables (see Appendix IV). Correlations were determined after assessing for missingness/zeros and assessing normality. The variable that was correlated with the most other variables was chosen. For example, if variable A was highly correlated with variables B, C, D, and E, but each of those was correlated with a lower number of variables, A was chosen as the representative variable. The nonchosen variables (B, C, D, and E) then were removed from consideration within other groupings. If the correlation group was isolated (i.e., no variables in it were associated with any variables outside the isolated group), then a representative variable was chosen without particular criteria. By the end, all variables remaining had correlation less than 0.7 with each other. All variables excluded were highly correlated with (represented by) at least one variable that was retained. Of the remaining variables, all missing values were set to zero, with the assumption that lack of an estimate for an area indicated low concern for contamination with a particular HAP, and the number of zero values was evaluated for each variable. Pollutants with more than 50% zero values were dropped. This process left 37 HAPs included in the EQI.
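The representative-variable selection described above can be sketched as a simple greedy procedure. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the absolute value of the Pearson correlation is compared with the 0.70 threshold, ties (including isolated groups, for which the report names no particular criterion) are broken by input order, and no variable is assumed to be constant.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def select_representatives(data, threshold=0.70):
    """Greedy selection of representative variables.

    data maps a variable name to its list of county values.
    Returns (kept, dropped), both in input order.
    """
    names = list(data)
    # linked[v] holds the variables whose |r| with v exceeds the threshold.
    linked = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(data[a], data[b])) > threshold:
                linked[a].add(b)
                linked[b].add(a)
    remaining = set(names)
    while remaining:
        # Count high correlations among variables still under consideration.
        degree = {v: len(linked[v] & remaining) for v in names if v in remaining}
        rep = max(degree, key=degree.get)  # ties resolved by input order
        if degree[rep] == 0:
            break  # no highly correlated pairs are left
        remaining -= linked[rep]  # drop the variables that rep represents
    kept = [v for v in names if v in remaining]
    dropped = [v for v in names if v not in remaining]
    return kept, dropped

# Hypothetical values for four variables across six counties.
example = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 4, 6, 8, 10, 12.5],
    "C": [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],
    "D": [3, 1, 4, 1, 5, 2],
}
kept, dropped = select_representatives(example)
```

In this toy input, A, B, and C are mutually highly correlated and D is not, so one variable stands in for the first three while D is retained. Every dropped variable is linked to a retained one, and no two retained variables exceed the threshold.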
When indicated (i.e., log-normal distribution), half of the minimum nonzero value was added, and variables were log transformed.

Table 3. 2005 NATA variables included in EQI 2006-2010

1,1,2,2-tetrachloroethane
1,1,2-trichloroethane
1,2-dibromo-3-chloropropane
1-3-dichloropropene
Acrylic acid
Benzidine
Benzyl chloride
Beryllium compounds
bis-2-ethylhexyl phthalate
Carbon tetrachloride
Carbonyl sulfide
Chlorine
Chlorobenzene
Chloroform
Chloroprene
Chromium compounds
Cobalt compounds
Cyanide compounds
Dibutylphthalate
Ethyl benzene
Ethyl chloride
Ethylene dibromide
Ethylene dichloride
Formaldehyde
Glycol ethers
Hydrazine
Hydrochloric acid
Isophorone
Manganese compounds
Methyl bromide
Methylene chloride
Phosphine
Polychlorinated biphenyls
Propylene dichloride
Quinoline
Trichloroethylene
Vinyl chloride

The air domain includes 43 variables representing criteria air pollutants and HAPs.

Water Domain

The water domain included six data sources: (1) the WATERS program database[12], (2) Estimated Use of Water in the United States[14], (3) the National Atmospheric Deposition Program (NADP)[13], (4) the Drought Monitor Network[15], (5) the National Contaminant Occurrence Database (NCOD)[16], and (6) the Safe Drinking Water Information System (SDWIS)[17]. Using these six data sources, variables were created to represent seven constructs that describe the overall water environment. The seven constructs were (1) overall water quality, (2) general water contamination, (3) drinking water quality, (4) domestic use, (5) atmospheric deposition, (6) drought, and (7) chemical contamination.

Overall Water Quality

Impairment and water quality standards (WQS) data were obtained from the most recent state-reported data collected under Sections 303(d) and 305(b) of the Clean Water Act (CWA)[44]. The CWA is administered at the state level, and data are reported voluntarily from the states to the federal level.
The dates of the reported data ranged from 2004 to 2010, as the federal reporting system maintains only the most recent data reported by each state. Under Section 305(b) of the CWA, states establish WQS for each hydrological feature based on the expected use (or uses) of these waters. Under Section 303(d) of the CWA, states assess whether waters are impaired (do not meet the standards) for the uses established in the WQS. This assessment is conducted biennially, and the states voluntarily report these data to the federal level.

County-level impaired stream length was estimated for the contiguous United States using impairment and WQS data (from the WATERS database). With the designated uses listed for each state, the WQS was classified into five broad categories of water use: (1) agriculture, (2) drinking water, (3) recreation, (4) wildlife, and (5) industry. Using geographic information systems (GIS), county-level percentages of impairment were calculated. WQS and impairment datasets were joined to the map layer of hydrologic features in EPA's RAD[39]. RAD is a replicate of the National Hydrography Dataset Plus[38] augmented for reporting water quality data. The defined broad water use categories were joined to the WQS data, and a table summarizing hydrologic features with multiple uses was created. WQS and impairment tables were assigned to features in the RAD using GIS Network and Event tools. These tools link tabular database information with linear or polygon features. Stream lengths were clipped by county boundaries to calculate percent impairment by county. Only linear water features were included in each category. Polygon features, such as lakes, were excluded because of the lack of well-defined county and state boundaries across water bodies. Next, county and state designations were linked with linear features in RAD.
Once all data were associated to linear hydrologic features, lengths were calculated for water features impaired for any use, drinking water use, or recreational use and for all stream lengths within a county. The final variable was a cumulative measure of the percent of water impaired for any use.

General Water Contamination

Water contamination can be caused by several sources. Unfortunately, EPA only has consistent data on the point sources of contamination in the form of the number of National Pollutant Discharge Elimination System (NPDES)[45] permits. Therefore, the number of permits in a county was used as a proxy for general water contamination. Using permit information in the WATERS database, 13 variables were calculated for the number of discharge permits in a county. Permits that were current during the period 2006-2010 were selected. The 10 variables that were calculated based on individual permit types had too much missing data; therefore, three composite variables were created for inclusion in the EQI. A composite variable was developed for the number of sewage permits per 1000 km of stream length in a county. The number of animal feeding operations and concentrated animal feeding operations NPDES permits, combined sewer overflow NPDES permits, and NPDES permits for sludge in each county were summed and divided by the total stream length in the county. Similarly, composite variables were calculated for industrial permits (combining the total of pretreatment NPDES permits, general facilities NPDES permits, and individual facilities NPDES permits) and stormwater permits (combining the total of general stormwater NPDES permits and industrial stormwater NPDES permits) by county per 1000 km of stream length. Preliminary analyses demonstrated low loadings for the grouped variables; therefore, only one variable was maintained: the total number of discharge permits per 1000 km of stream length in the county.
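As a worked illustration of the permit-density calculation, the following minimal sketch sums permit counts across types and scales by stream length. The function name, the permit-type keys, and the counts are hypothetical; returning None for a county with no stream length is an assumption, since the report does not say how such counties were handled.

```python
def permits_per_1000_km(permit_counts, stream_length_km):
    """Total discharge permits per 1000 km of stream length in a county.

    permit_counts: dict mapping a permit type to its count in the county.
    stream_length_km: total stream length in the county, in kilometers.
    """
    if stream_length_km <= 0:
        return None  # assumption: density is undefined without stream length
    total_permits = sum(permit_counts.values())
    return 1000.0 * total_permits / stream_length_km

# Hypothetical county with 12 permits along 480 km of streams.
county_permits = {"sewage": 5, "industrial": 4, "stormwater": 3}
density = permits_per_1000_km(county_permits, 480.0)  # 25 permits per 1000 km
```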
Drinking Water Quality

In the United States, drinking water quality is measured and maintained by the public water system (PWS) treating and distributing drinking water. Based on the Safe Drinking Water Act, states are required to report basic information about each PWS, violation information for each PWS, and enforcement information to the federal system. The SDWIS data are publicly available through the Fed Data Warehouse[17]. The basic information for the PWSs was merged with the violations reports, so that the county and city served and the violations were together in one report. When a PWS served multiple counties, the counties were separated so that the violations were counted in each county served by the PWS. Variables were created for each rule within the Safe Drinking Water Act, such as the Lead and Copper Rule. A time period average for each rule name violation by PWS was calculated as the frequency divided by the number of years in the time period of interest, in this case five (2006-2010). This time period average was then multiplied by the population served for each PWS, and these values were summed for the county to estimate the proportion of the population in the county affected by the violation. Most counties did not report violations for the majority of rules; therefore, only one constructed variable, calculated from violations of the Total Coliform Rule, provided sufficient variability to be included.

Domestic Use

Data from the Estimated Use of Water in the United States database[14] were used as a proxy for domestic water quality. If water is being withdrawn for competing uses (agriculture, industry, etc.), it will put stress on water supplies, which, in turn, will affect water quality. This database includes county-level estimates of water withdrawals for domestic, agricultural, and industrial use. Initially, 15 variables of water withdrawals for domestic, agricultural, and industrial use were developed.
These data are estimated every 5 years and were included in the EQI as averaged data for 2006 and 2010. Two variables were included in the EQI after evaluation for collinearity (four variables removed) and missing data (nine variables removed). The two variables were (1) the percent of population on self-supplied water supplies and (2) the percent of those on public water supplies that are on surface waters. For these variables, higher values are not necessarily a marker for poor water quality. The data were provided at the county level and normally distributed; therefore, no additional transformation was required.

Atmospheric Deposition

The atmospheric deposition of chemicals can affect water quality. The NADP dataset[13] provides measures for the concentration of nine chemicals in precipitation: (1) calcium, (2) magnesium, (3) potassium, (4) sodium, (5) ammonium, (6) nitrate, (7) chloride, (8) sulfate, and (9) mercury. Annual summary data from each monitoring site for each year 2006-2010 were kriged spatially to achieve national coverage and county-level estimates. The annual estimates for each pollutant then were averaged over the 5-year study period. The data for all pollutants, except sulfate, were skewed and, therefore, were natural log transformed to achieve normal distributions. Magnesium, sodium, and ammonium were removed as they were highly correlated with potassium, chloride, and nitrate, respectively.

Drought

Drought affects the concentration of pathogens and chemicals in water bodies and, therefore, can affect water quality. The Drought Monitor dataset[15] provides raster data on six possible drought status conditions for the entire United States on a weekly basis. The data were aggregated spatially to the county level to estimate the percentage of the county in each drought status condition. The weekly data were averaged to achieve annual estimates for 2006-2010 and, then, averaged to create a composite for the entire period.
From these data, the percentage of the county in extreme or exceptional drought (intensity levels D3 and D4, respectively) was used in the EQI. The remaining five drought status conditions were removed because all of the drought statuses were highly correlated.

Chemical Contamination

Chemical contamination of water supplies can directly affect human health. The NCOD dataset[16] provides data on 69 contaminants provided by public water supplies throughout the country for the period from 1998-2005. Data for all samples in a county for each contaminant were averaged over the entire period of the dataset, 1998-2005. More recent data were not available. The data also were natural log transformed to achieve normal distributions. Missing values were set to zero, with the assumption that lack of measurement for an area indicated low concern for contamination with that particular contaminant. Nine contaminants, (1) asbestos, (2) beryllium, (3) diquat, (4) endothall, (5) glyphosate, (6) dioxin, (7) radium, (8) beta particles, and (9) uranium, did not include data for enough counties (missing data) to be included in the EQI construction. Twenty-one variables were deleted because of high correlation with other contaminants: (1) lindane, (2) thallium, (3) toxaphene, (4) oxamyl, (5) alachlor, (6) 2,4,5-TP (Silvex), (7) hexachlorocyclopentadiene, (8) carbofuran, (9) heptachlor, (10) heptachlor epoxide, (11) hexachlorobenzene, (12) 1,2,4-trichlorobenzene, (13) 1,2-dichlorobenzene, (14) vinyl chloride, (15) 1,1-dichloroethylene, (16) trans-1,2-dichloroethylene, (17) 1,2-dichloroethane, (18) carbon tetrachloride, (19) 1,2-dichloropropane, (20) 1,1,2-trichloroethane, (21) benzene.

Land Domain

The land domain consisted of five data sources, representing five constructs: (1) agriculture, (2) pesticide use, (3) facilities, (4) radon zone, and (5) mining activity.
Agriculture

Information on nonpesticide chemicals used in farming, animal units, harvested acreage, irrigated acreage, manure acreage, and proportion of farms was taken from the 2007 Census of Agriculture[19]. Final acreage for each item then was divided by total acreage for each county to return a percentage (e.g., percentage of irrigated acres out of total acres in a county). In some cases, county-level acreage for items was suppressed. In these cases, estimates were imputed based on unaccounted-for and total state-level acreage. Known acreage was subtracted from total state acreage, leaving an "unassigned" total acreage for each state. This total number was divided by the total number of farms in counties with suppressed acreage to return an average acreage for each farm. This average acreage then was multiplied by the number of farms in each county with suppressed acreage to estimate acreage. Animal units were estimated by multiplying the number of livestock (cows, hogs, and poultry) by the animals per animal unit statistic[46] and, then, adding together all livestock categories for each county. Eight variables representing agriculture were included in the EQI.

Pesticide Use

Pesticide use for each county was estimated using county pesticide-use data from the 2009 National Pesticide Use Dataset[18]. Each pesticide was categorized into one of three categories: (1) herbicide, (2) fungicide, or (3) insecticide. The average weight (in kilograms) of each pesticide was calculated for the years available (2006-2009) for each county, then summed by pesticide type. If a county did not have information for one of the pesticide categories, the national average was used. Despite the choice of high spatial coverage, there are recognized uncertainties in estimating the geographic distribution of compounds applied to specific crops, as described by Baker et al. (2015)[47]. These three pesticide categories were included in the EQI.
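The suppressed-acreage imputation described for the agriculture construct can be sketched as follows. This is a minimal illustration for a single state and a single acreage item; the function name, the data layout (None marking a suppressed value), and the example numbers are hypothetical.

```python
def impute_suppressed_acreage(state_total, county_acres, county_farms):
    """Fill in suppressed county acreage from the state's unassigned acreage.

    state_total:  total acreage reported for the state.
    county_acres: dict of county -> acreage, with None where suppressed.
    county_farms: dict of county -> number of farms.
    """
    known = sum(a for a in county_acres.values() if a is not None)
    unassigned = state_total - known  # acreage not attributed to any county
    suppressed = [c for c, a in county_acres.items() if a is None]
    farms_in_suppressed = sum(county_farms[c] for c in suppressed)
    if farms_in_suppressed == 0:
        return dict(county_acres)  # nothing to impute
    acres_per_farm = unassigned / farms_in_suppressed
    return {
        c: (a if a is not None else acres_per_farm * county_farms[c])
        for c, a in county_acres.items()
    }

# Hypothetical state: 1000 acres in total, 600 of them in county A.
acres = {"A": 600.0, "B": None, "C": None}
farms = {"A": 30, "B": 10, "C": 30}
filled = impute_suppressed_acreage(1000.0, acres, farms)
# B receives 100 acres and C receives 300 acres (10 acres per farm).
```

By construction, the imputed county values sum to the state's unassigned acreage, so the state total is preserved.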
Pesticide variables were evaluated for normality and log transformed.

Facilities

Large facilities have the capacity to affect land quality. The facilities included in the land domain are those represented on the EPA Geospatial Data Download Service[20]. Because many counties had at least one, but no counties had all six of the facility types present, a composite facilities data variable was constructed by summing the count of any one of the six facilities types (Brownfield sites (n=1273)[48]; Superfund sites (n=719)[49]; Toxic Release Inventory sites (n=2671)[20]; pesticide-producing-location sites (n=2099)[50]; large-quantity generator sites (n=1963)[51]; and treatment, storage, and disposal sites (n=874)[52]) across the counties. Facilities were included in the count if they were identified during the 2006-2010 period. The count of facilities was divided by the county population, which produced a facilities rate. The facilities rate variable was assessed for normality and log transformed.

Radon Zone

The potential for elevated indoor radon levels was represented using the county score from the EPA Radon Zone map[21], which was available for 3142 counties (one county, Broomfield, Colorado, was missing). The EPA Radon Zone map identified areas of the United States with the potential for elevated indoor radon levels. Each United States county was assigned to one of three zones based on radon-level elevation potential.

Mines

Mines, like large facilities, have the capacity to affect land quality. The mines included in the land domain are those found in the MSHA dataset[22], which includes those mines under MSHA jurisdiction since 1970. Mines were included if they were active at any point before 2010 and were not abandoned and sealed after 2006. Those excluded most likely do not continue to pose any environmental impact. Any mines already represented in Superfund data were excluded.
Mines were separated by the five primary commodity types: (1) coal, (2) metal, (3) nonmetal, (4) sand and gravel, and (5) stone, and a county could have more than one type of mine. The counts of the mines were divided by the county population, producing a mine rate. Of the 3143 counties, 2904 had at least one mine. For counties with zero values for a given mine type, half of the minimum nonzero value for that mine type was added to the population-standardized variables. The mine variables were assessed for normality and log transformed.

Sociodemographic Domain

This domain was constructed to explore the sociodemographic features of counties in the United States. These features were used to approximate the social stress associated with residing in more deprived (low education, high unemployment, high violent crime, high poverty, etc.) or more affluent (high employment rates, low property crime, high proportion of college graduates, etc.) counties. This domain includes variables from the 2010 United States Census[24], the FBI Uniform Crime Reports (UCR)[25], the 2008 Presidential election results[26], and the United States Department of Agriculture Economic Research Service Creative Class data[27]. Because the sociodemographic domain is related to population density, by virtue of the data's collection and reporting, variables were developed as population rates (denominator: count of persons per county), rather than area-based rates (denominator: square miles per county).
Nine variables were obtained from the 2010 United States Census[24]. The nine variables were (1) percent earning a bachelor's degree or higher among persons aged 25 years or older; (2) percent persons unemployed; (3) percent of families living below the federal poverty line; (4) percent vacant housing units; (5) median household value; (6) median household income; (7) percent renter-occupied units; (8) count of occupants per room; and (9) the Gini coefficient, a marker of income inequality. Owing to the skewed nature of the household income and count of occupants per room data, these variables were log transformed for inclusion in the EQI. The sociodemographic domain contains a mix of positive and negative features; therefore, when the sociodemographic domain was constructed, positive variables were reverse-coded to ensure that a higher value of the sociodemographic domain index represents more adverse environmental conditions.

The area-level crime environment was represented using the FBI UCRs[25]. The first step in constructing crime data was to assign each jurisdiction or place to a county using the county Federal Information Processing Standards code[53]. In cases when a jurisdiction covered more than one county, the reported crime was assigned to both counties. Although this double assignment results in a slight inflation of crime reports for a state, there was no way to determine which county should receive the crime report. Further, if police or municipal jurisdictions crossed county lines, it is likely residents of both counties were "exposed" to the crime environment. Crime data were attributed to more than one county for approximately 15 counties. Second, because crime was reported for less than half the United States counties, crime data were kriged spatially and temporally to estimate values for counties with no reported crime.
The decision was made to krig these data because data reporting was voluntary, and it seemed unlikely that no crime occurred in the nonreported areas. Because zeros could not be reasonably assigned to the missing counties, the data were interpolated spatially and temporally instead. Based on experience with the 2000-2005 county-level EQI, and in acknowledgement that the correlation between the property and violent crime rates was very high (0.96), only log violent crime was included in the EQI.

The political climate of a county was represented by Leip's election map[26]. On this Web site, county-specific percents voting Republican or Democratic are reported. These data were downloaded for each county. The percent voting Democratic in the 2008 presidential election is included in the EQI. One county in Hawaii that had been an independent county unit, FIPS 15005, was subsumed by Maui for the presidential election data, so the same Democratic percentage was applied to county 15005 as to Maui.

One creative class variable was included in the 2006-2010 EQI. The creative class thesis (that towns need to attract engineers, architects, artists, and people in other creative occupations to compete in today's economy) may be particularly relevant to rural communities, which tend to lose much of their talent when young adults leave. The ERS creative class codes[27] indicate a county's share of population employed in occupations that require "thinking creatively." The index of percent employed in creative class occupations was included in the EQI.

Built Domain

Seven data sources were included in the built domain, representing (1) the subsidized housing environment, (2) traffic safety, (3) public transportation usage and commuting times, (4) road properties (road type and density), (5) the business and service environments (e.g., food, recreation), (6) county walkability, and (7) green space.
Housing Environment

The subsidized housing environment was represented by the Housing and Urban Development data[32]. These data provide a count of the low-rent and Section 8 housing in each housing authority data area. The housing authority areas correspond to cities, which were assigned county codes. Data were collected in 2010, but, because low-rent and Section 8 housing does not change substantially over time, these data were considered representative of the 2006-2010 period. The variables were summed to result in the count of any low-rent or Section 8 housing in each county. The rate of subsidized housing was constructed by dividing the count of subsidized housing units per county by the county population. The data were log transformed prior to inclusion in the EQI.

Traffic Safety

Traffic fatalities, an important feature and consequence of the built environment, were estimated using the Fatality Analysis Reporting System (FARS) data[31]. The FARS is a national census providing the National Highway Traffic Safety Administration yearly reports of fatal injuries suffered in motor vehicle crashes. Rates for the 2006-2010 counts of fatal crashes per county were constructed by dividing the count of county-level fatal crashes by the county-level population. Many counties had no fatal crashes. To accommodate the large number of meaningful zeros in the data, the log of this rate variable was used in the built domain of the EQI.

Public Transportation Usage and Commuting Time

The percent of county residents who use public transportation was estimated using the 2010 United States Census[24] variable in the EQI. For many counties, the percent of the population that reports using public transportation is near zero. Therefore, this variable was log transformed prior to its use in the built domain of the EQI. Also obtained from the United States Census was the average number of minutes employed persons spent on the commute home from work.
Road Properties

For the built-environment domain, characterizing the relative proportions of each county served by highways, secondary roads, and primary roads was of interest, as these types of roads confer different risks (related to speed and safety) and benefits (related to neighborhood walking or ease of transit). Road type for the year 2008 was approximated using the NAVTEQ road data[30] associated to TIGER county boundary[29] data. Three proportion variables were constructed by dividing the mileage of each road type (e.g., secondary roads) by the total road mileage in each county. The proportion of all roadways that were secondary roads was included.

Business and Service Environments

Businesses represent an important component of the built environment and can contribute to the risk and amenity landscape. Variables representing various built-environmental features were constructed using the proprietary 2008 Dun and Bradstreet data[28], which include commercial information on businesses and data on more than 195 million records. Eight rate variables were constructed by dividing the county-level count of a business type by the county-level population count. The eight variables included the (1) positive food environment, (2) negative food environment, (3) vice environment (alcohol, pawn, and gaming), (4) health care business environment, (5) recreation environment, (6) education environment, (7) social-service environment, and (8) civic-related environment. Note: Positive food environments included those that sold healthier foods, like grocery stores, sit-down restaurants, and organic shops, whereas the negative food environment included businesses like fast-food restaurants, convenience stores, and pretzel trucks. Although related, these two food environments comprise different businesses and are not 100% inversely correlated. Nonnormally distributed variables were log transformed, and all eight were included in the EQI.
Walkability

Walkability is an important feature of the built environment, and variability across walkability may help explain poor or good health. The National Walkability Index (NWI)[36] was used to determine walkability as a mode of travel for each county. The scores, ranging from zero to 20, are calculated using a weighted rank of four variables: (1) mix of employment types (such as office, retail, and service) and occupied housing, (2) mix of employment types in a block group (such as office, retail, and service), (3) street intersection density (pedestrian-oriented intersections), and (4) predicted commute mode split (the proportion of workers in the block group who carpool). A higher rank indicates an increased likelihood of walking being used as the mode of travel. The block group scores were added, and, then, a mean of the block group scores based on county population proportions was created. The county walkability scores ranged from 1.00 to 16.23.

Green Space

Exposure to green space also has been associated with improved health. The green space variable was created by EPA's EnviroAtlas[33] using National Land Cover Database (NLCD)[34] and Coastal Change Analysis Program[35] data. Three possible constructions were considered. (1) The NINDEX variable was created by EnviroAtlas as a natural land cover variable and includes barren land, forest, shrub/scrub, grassland, sedge, lichens, moss, and wetlands. (2) NINDEX_open is the NINDEX variable with developed open space, such as parks and golf courses, included. (3) The Richardson index[54] is based on a green space paper and includes the NINDEX and also developed open space, low intensity, and medium intensity. For the sake of dissemination outside academic communities and ease of data availability/construction, the 2006-2010 EQI used the NINDEX_open variable. The variables represented percentages of up to 24 possible land cover types.
To create a green space variable, five natural land cover groups, (1) barren land (rock/sand/clay/tundra/perennial ice), (2) forest, (3) shrubland/scrub land, (4) herbaceous, and (5) wetlands, were combined with land classified as developed open space, where impervious surfaces make up less than 20% of total cover and which includes recreational areas, such as grassy lawns, parks, and golf courses. This combined variable of natural land cover and developed open space gave a percentage of the county that had green space and ranged from 3.88% to 99.99%. The variable then was assessed for normality.

Changes to 2006-2010 variable construction from original 2000-2005 EQI

Air Domain

Variables eliminated from the 2006-2010 EQI

The following air variables were eliminated because of high collinearity to one or more variables.

Variable | Represented by
2-4-toluene diisocyanate | Ethylbenzene
2-chloroacetophenone | Benzyl chloride
2-nitropropane | Chloroprene
4-nitrophenol | Ethylbenzene
Acetophenone | Ethylbenzene
Acrolein | Ethylbenzene
Acrylonitrile | Chloroprene
Biphenyl | Ethylbenzene
Bromoform | Benzyl chloride
Cadmium compounds | Chromium compounds
Carbon disulfide | Ethylbenzene
Cresol cresylic acid | Ethylbenzene
Cumene | Ethylbenzene
Diesel engine emissions | Ethylbenzene
Dimethyl formamide | Ethyl chloride
Dimethyl phthalate | Ethylbenzene
Dimethyl sulfate | Benzyl chloride
Epichlorohydrin | Chloroprene
Ethyl acrylate | Chloroprene
Ethylene glycol | Ethylbenzene
Ethylene oxide | Ethylene dichloride
Ethylidene dichloride | Vinyl chloride
Hexachlorobenzene | Polychlorinated biphenyls
Hexachlorobutadiene | Chloroprene
Hexachlorocyclopentadiene | Chloroprene
Hexane | Ethylbenzene
Lead compounds | Chromium compounds
Mercury compounds | Ethylbenzene
Methanol | Ethylbenzene
Methyl chloride | Carbon tetrachloride
Methyl isobutyl ketone | Ethylbenzene
Methyl methacrylate | Ethylbenzene
Methylhydrazine | Benzyl chloride
MTBE | Ethylbenzene
Nitrobenzene | Chloroprene
n-n-dimethylaniline | Chloroprene
o-toluidine | Chloroprene
PAH/POM | Ethylbenzene
Propylene oxide | Chloroprene
Selenium compounds | Ethylbenzene
Styrene | Ethylbenzene
Tetrachloroethylene | Ethylbenzene
Toluene | Ethylbenzene
Triethylamine | Ethylbenzene
Vinyl acetate | Ethylbenzene
Vinylidene chloride | Ethylbenzene

Water Domain

New variables added to the 2006-2010 EQI

Total coliform health-based violations added

Variables removed in the recreational water construct

Number of days closed per event in county 2000-2005 - numDays_Close_Activity_tot
Number of days per contamination advisory event in county 2000-2005 - numDays_Cont_Activity_tot
Number of days per rain advisory event in county 2000-2005 - numDays_Rain_Activity_tot

Variables removed in the chemical contamination construct from the 2006-2010 EQI because of correlation with other variables

Beryllium - W_Be_ln (mg/L)
Lindane - W_Lindane_ln (mg/L)
Thallium - W_Tl_ln (mg/L) 1996
Toxaphene - W_Toxaphene_ln (µg/L)
Oxamyl (Vydate) - W_Oxamyl_ln (µg/L)
Alachlor - W_Alachlor_ln (µg/L)
2,4,5-TP (Silvex) - W_silvex_ln (µg/L)
Hexachlorocyclopentadiene - W_HCCPD_ln (ng/L)
Carbofuran - W_Carbofuran_ln (ng/L)
Heptachlor - W_Heptachlor_ln (ng/L)
Heptachlor Epoxide - W_Heptachlor_epox_ln (µg/L)
Hexachlorobenzene - W_HCB_ln (ng/L)
1,2,4-Trichlorobenzene - W_124TCIB_ln (µg/L)
1,2-Dichlorobenzene (o-Dichlorobenzene) - W_ODCB_ln (µg/L)
Vinyl Chloride - W_VCM_ln (ng/L)
1,1-Dichloroethylene - W_11DCE_ln (ng/L)
trans-1,2-Dichloroethylene - W_t12DCE_ln (ng/L)
1,2-Dichloroethane (Ethylene Dichloride) - W_EDC_ln (µg/L)
Carbon Tetrachloride - W_CCl4_ln (ng/L)
1,2-Dichloropropane - W_PDC_ln (ng/L)
1,1,2-Trichloroethane - W_112TCA_ln (ng/L)
Benzene - W_Cllbenz_ln (ng/L)

Land Domain

Variables eliminated from the 2006-2010 EQI

The following variables were eliminated because content was represented in the NCOD and NADP.
Mean level of arsenic
Mean level of selenium
Mean level of mercury
Mean level of lead
Mean level of zinc
Mean level of copper
Mean level of aluminum
Mean level of sodium
Mean level of magnesium
Mean level of phosphorus
Mean level of titanium
Mean level of calcium
Mean level of iron

New variables added to the 2006-2010 EQI

Primarily coal mines per county population
Primarily metal mines per county population
Primarily nonmetal mines per county population
Primarily sand and gravel mines per county population
Primarily stone mines per county population

Sociodemographic Domain

Variables eliminated from the 2006-2010 EQI

Percent management occupation - eliminated because content better covered in creative class index data
Housing built before 1939 - eliminated because of unclear association with health
Percent with no English - eliminated because of unclear association with health and increasing subjectivity

Variable substitutions for the 2006-2010 EQI

Percent bachelor's degree (>25 years old) substituted for percent greater than high school
Percent family poverty substituted for percent persons in poverty
Count of occupants per room replaced median number of rooms

New variables added to the 2006-2010 EQI

Percent of persons working in creative occupations
Percent of county that voted Democratic in the 2008 presidential election

Built Domain

Variables eliminated from the 2006-2010 EQI

Entertainment environment - eliminated because of unclear association with health
Transportation environment - eliminated because the data contained in this variable are better covered using other data sources

Variable substitutions for the 2006-2010 EQI

Percent secondary roads replaced percent primary roads

New variables added to the 2006-2010 EQI

Walkability score added
Proportion of county in green space added

Data Reduction and Index Construction

Overall Approach

After variable development, all the variables were combined into an index representing the overall environmental quality.
The specific tasks required for index construction were as follows.
1. Included all the variables from one domain in a PCA to empirically summarize that domain-specific environmental context (retaining the first component as the domain index) for each of the five domains.
2. Assessed the positive/negative direction (valence) of the variable loadings for each domain and, where loadings were not oriented so that a higher value on the index corresponded to worse environmental quality, corrected the valence.
3. Combined the five domain-specific indices in another PCA to empirically summarize the overall environmental context into one index of environmental quality, retaining the first component as the overall EQI.
4. Repeated the three previous steps for each of the four RUCC strata (e.g., RUCC stratum 1 air domain, RUCC stratum 2 air domain), such that each RUCC stratum had its own set of domain-specific indices, as well as its own overall index.

The EQI, the domain-specific indices, and the rural-urban stratified EQI data are available publicly at EPA's Environmental Dataset Gateway. Also, an interactive map of the EQI is available at EPA's GeoPlatform.

Principal Components Analysis (PCA)

PCA is a data reduction technique frequently used to create sociodemographic scales or indices for inclusion in statistical models [43, 55]. PCA analyzes total variance, and the loading represents the correlation between the variable and the component. PCA assumes no underlying latent variable structure but, rather, seeks to empirically summarize multiple possible domains. Three major goals of PCA are to
1. summarize the patterns of correlations among observed or measured variables,
2. provide an operational definition (in this case, a regression equation) for underlying processes by using observed or measured variables, and
3. reduce a large number of observed variables into a smaller number of factors or a single component.
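As an illustration only, the two-stage construction in the task list above can be sketched in Python with NumPy. This is a minimal sketch, not the code used to build the EQI: the function names (`first_component`, `standardize`, `build_eqi`) and the input layout (one array per domain, rows as counties) are assumptions made for the example. Valence correction (task 2) is left out, so the sign of each index here is arbitrary, and the sketch assumes no variable is constant across counties.

```python
import numpy as np


def first_component(X):
    """First principal component of the columns of X (rows = counties).

    Columns are z-scored first, so the PCA runs on the correlation matrix.
    Returns the unit-length loading vector and the component scores.
    Assumes no column is constant (a zero SD would produce NaN values).
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    loadings = eigvecs[:, -1]                # vector for the largest eigenvalue
    return loadings, Z @ loadings


def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1."""
    return (scores - scores.mean()) / scores.std(ddof=1)


def build_eqi(domains):
    """Hypothetical helper: domains maps a domain name to an
    (n_counties, n_variables) array for one set of counties."""
    domain_index = {}
    for name, X in domains.items():
        _, scores = first_component(X)       # stage 1: one PCA per domain
        domain_index[name] = standardize(scores)
    stacked = np.column_stack(list(domain_index.values()))
    _, overall = first_component(stacked)    # stage 2: PCA of the domain indices
    return standardize(overall), domain_index
```

To mirror task 4, the same function would be called once per RUCC stratum, passing only the counties that belong to that stratum.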
PCA was chosen for data reduction for several reasons. Production of an empirical summary of the various constituent components of the EQI was desired. Various data sources measured on multiple scales needed to be combined. PCA standardizes these measures prior to combining them, so the differing scales are less problematic. The variables also could not simply be added together, because for most variables no prior knowledge was available to indicate whether any one variable would prove to be more "influential" for environmental quality than another. PCA enables variable loadings to vary by their relative importance to the total component. This feature enabled exploration of variable loading differences for interpretation purposes.

The PCA steps included selecting the set of variables to be used, preparing the correlation matrices, extracting the set of components from the correlation matrix, determining the number of components observed, and interpreting the findings.

The sole modification to the PCA methodology in the county 2006-2010 EQI compared to that of the 2000-2005 EQI is "valence correction." We also have created a 2000-2005 valence-corrected version of the EQI. "Valence correction" refers to reorientation of PCA output for uniformity of interpretation of domain indices and uniformity in orientation of domain indices input into the second PCA for EQI construction. In this instance, we are defining valence as the departure from neutrality along a continuum; generally, we are interested in how attributes depart from neutrality in opposite directions. The PCA loadings are a function of the program's starting point, or seed, which is not easily manipulable.
Therefore, the loading valence needed to be corrected prior to the construction of the indices to ensure that higher values on a given index, and on the overall EQI, signify worse environmental quality [56, 57]. Domain and EQI indices are designed such that lower (more negative) values represent "better" quality and higher (more positive) values represent "worse" quality. Under this setup, health-beneficial variables should load negative in the PCA output (a "+" or "-" loading sign for a variable in the component variable loadings vector represents a positive or negative correlation between that variable and the component, respectively). Given that the first principal component was taken to represent domain or environmental quality and that the orientation of these indices was designated as going from better to worse quality (negative to positive index value), it was necessary to reverse the component variable loadings vector from a PCA output if a high proportion of the variables deemed beneficial loaded "+" and a high proportion of the variables deemed detrimental loaded "-" [55]. Determination of variables as beneficial or detrimental to human health across domains was done a priori based on literature evidence and content matter judgment.
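The reversal rule can be illustrated with a short Python sketch. It is not the procedure used for the EQI: the function name `valence_correct`, the boolean `detrimental` flags standing in for the a priori designations, the use of squared loadings as weights to quantify "a high proportion," and the 0.5 threshold are all assumptions made for the example.

```python
import numpy as np


def valence_correct(loadings, detrimental, threshold=0.5):
    """Flip a loading vector that is oriented 'higher = better'.

    loadings    : first-component loading vector for one PCA
    detrimental : True where a variable was designated harmful a priori,
                  False where it was designated beneficial
    threshold   : assumed cutoff for the share of squared loading
                  that points the wrong way
    """
    loadings = np.asarray(loadings, dtype=float)
    detrimental = np.asarray(detrimental, dtype=bool)
    # Squared loadings, rescaled to sum to 1, act as contribution weights.
    weight = loadings**2 / np.sum(loadings**2)
    # Wrong orientation: harmful variable loads "-" or beneficial loads "+".
    wrong_way = np.where(detrimental, loadings < 0, loadings > 0)
    share_wrong = float(weight[wrong_way].sum())
    flip = share_wrong > threshold
    # Multiplying by -1 keeps every magnitude and every relative sign intact.
    corrected = -loadings if flip else loadings
    return corrected, flip
```

For example, `valence_correct([-0.8, -0.5, 0.33], [True, True, False])` returns the vector with all signs reversed, because both harmful variables loaded negative and the beneficial one loaded positive.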
Reorientation of PCA-derived indices through multiplication of the component variable loadings vector by -1 preserves (1) the direction of the relationship among the variables for a given PCA (i.e., variables that loaded with the same signs will retain the same signs, and variables that loaded opposite to each other will retain opposite signs after reversal, so the pattern of correlations among the variables will remain intact); and (2) the magnitude of correlation among variables (reversal of loading signs does not impact the magnitude of the loading) [58]. The sum of squares of variable loadings in a PCA output equals 1, and, therefore, each squared variable loading can be viewed as a measure of the contribution of that variable toward the principal component (domain indices and EQI in this case), enabling estimation of the "correctness" of the orientation of the index. We used the squares of the variable loadings in a given PCA output, in combination with the aforementioned a priori designations of benefit or harm, to guide the choice of index reorientations.

PCA analyzes the total variance. Therefore, in the PCA correlation matrix, "1" is in the positive diagonal [55]. To construct the EQI, variables from each domain were entered into domain-specific PCAs. PCA produced variable loadings, which were roughly equivalent to the "weight" or contribution that each variable made toward explaining the total variance. The weights, however, need not sum to 1.0 because the loadings were for the total variance, rather than just the shared variance. The loading associated with each variable then was multiplied by its mean value for the given geography (county, for the EQI), and these weighted mean values were summed.

Rural-Urban Continuum

Both the domain-specific indices and the overall EQI were created for each county in the United States.
Recognizing that environments differ dramatically across the rural-urban continuum [59], the decision was made that the EQI would be most useful if it accommodated rural-urban environmental differences. The EQI was stratified by RUCCs. The RUCC is a nine-item categorization code of proximity to or influence of major metropolitan areas [60]. The nine categories were condensed into four, where RUCC1 represents metropolitan-urbanized (codes 1+2+3), RUCC2 nonmetropolitan-urbanized (codes 4+5), RUCC3 less urbanized (codes 6+7), and RUCC4 thinly populated, or rural (codes 8+9) (see Figure 3) [61-64]. For the 2006-2010 EQI, the 2013 RUCC was used. Both RUCC-stratified EQIs and an overall (nonstratified) EQI were constructed. Loadings on the stratified and nonstratified sets of indices were assessed to determine loading heterogeneity across counties. Because these loadings differed meaningfully by RUCC level, RUCC-stratified EQIs were constructed for each county.

Although it was possible to form as many independent linear combinations as there were variables in PCA, only the first principal component was retained. The first principal component was the unique linear combination that accounted for the largest possible proportion of the total variability in the component measures. Therefore, the first component from each of these domain-specific indices was retained (e.g., air index, water index). Domain-specific indices were then entered into another PCA, where the first component was retained as the EQI (Figure 2). This process was undertaken separately for each of the four RUCC strata.

Within each RUCC stratum, domain-specific variable loadings were evaluated based on the value of the variable loading and the variable's hypothesized relevance to health. For instance, although arsenic may occur at low frequency in many counties and, therefore, may have a relatively small component loading, it is an important health hazard when present.
Based on variable loading magnitude alone, dropping arsenic from an EQI may be a reasonable conclusion. However, it was retained for the EQI based on its relevance to human health.

The first principal component, the domain-specific EQI (e.g., air domain EQI), then was standardized to have a mean of 0 and standard deviation (SD) of 1 by dividing the index by the square of its eigenvalue. Each domain-specific index was then included in a second PCA procedure (Figure 2) to result in the overall EQI for each stratum of RUCC. For orientation to the results, low index scores (EQI and domain-specific) indicate higher environmental quality, and higher index scores (EQI and domain-specific) indicate lower environmental quality.

Figure 3. Rural-urban continuum code (RUCC) stratification for all counties in the United States. Map legend: metropolitan urbanized, non-metro urbanized, less urbanized, thinly populated.

Results

Description of Variables Comprising Environmental Quality Index Domains

Air Domain

Criteria air pollutants were distributed relatively evenly across the rural-urban gradient (Table 4). Some hazardous air pollutants varied in emissions across rural-urban strata; however, there was no discernible pattern for most. For example, 1,1,2-trichloroethane's highest levels were observed in the less urbanized stratum, whereas levels were similar across other strata, and emissions for manganese compounds were highest in the most metropolitan areas and then steadily decreased across more rural strata.

Table 4.
Air domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified

Each cell shows Mean (SD) [Range].

Variable | Units | Metropolitan-Urbanized (RUCC1 = 1,167) | Nonmetropolitan-Urbanized (RUCC2 = 306) | Less Urbanized (RUCC3 = 1,026) | Thinly Populated (RUCC4 = 644) | Total (3143)
Construct: Criteria Air Pollutants
PM10 | µg/m3 | 2.0E+01 (4.7E+00) [4.1E-01, 5.4E+01] | 1.95E+01 (5.07E+00) [6.00E+00, 6.60E+01] | 1.95E+01 (4.37E+00) [5.39E+00, 5.25E+01] | 1.89E+01 (4.88E+00) [4.01E-01, 3.42E+01] | 2.0E+01 (4.7E+00) [4.0E-01, 6.6E+01]
PM2.5 | µg/m3 | 1.1E+01 (2.1E+00) [4.1E+00, 2.4E+01] | 1.02E+01 (2.19E+00) [4.28E+00, 1.48E+01] | 9.99E+00 (2.20E+00) [3.35E+00, 1.80E+01] | 9.05E+00 (2.39E+00) [4.28E+00, 1.79E+01] | 1.0E+01 (2.3E+00) [3.3E+00, 2.4E+01]
Ozone | ppm | 4.5E-02 (4.4E-03) [2.2E-02, 5.9E-02] | 4.46E-02 (4.99E-03) [2.22E-02, 5.76E-02] | 4.47E+02 (3.99E+03) [2.99E-02, 5.72E-02] | 4.46E-02 (4.47E-03) [2.90E-02, 5.65E-02] | 4.5E-02 (4.4E-03) [2.2E-02, 5.9E-02]
Nitrogen oxide | ppb | 9.2E+00 (4.6E+00) [5.9E-01, 3.1E+01] | 7.93E+00 (3.93E+00) [5.92E-01, 2.81E+01] | 3.85E-01 (8.36E-02) [2.41E-01, 8.89E-01] | 6.65E+00 (4.37E+00) [5.91E-01, 2.84E+01] | 8.0E+00 (4.4E+00) [2.6E-01, 3.1E+01]
Sulfur dioxide | ppb | 2.2E+00 (1.5E+00) [7.3E-03, 9.7E+00] | 1.97E+00 (2.22E+00) [1.10E-02, 3.09E+01] | 7.53E+00 (4.00E+00) [2.65E-01, 2.84E-01] | 1.47E+00 (1.39E+00) [2.21E-02, 9.23E+00] | 1.9E+00 (1.5E+00) [7.3E-03, 3.1E+01]
Carbon monoxide | ppm | 3.9E-01 (8.2E-02) [2.5E-01, 8.7E-01] | 3.87E-01 (7.49E-02) [2.49E-01, 7.38E-01] | 4.32E-03 (4.91E-04) [3.90E-03, 8.19E-03] | 3.93E-01 (9.57E-02) [2.61E-01, 8.90E-01] | 3.9E-01 (8.5E-02) [2.4E-01, 8.9E-01]
Ethylene dibromide | Tons emitted | 5.5E-04 (3.1E-04) [5.5E-05, 2.0E-03] | 5.47E-04 (3.47E-04) [1.65E-04, 1.64E-03] | 5.50E-04 (3.14E-04) [1.65E-04, 1.79E-03] | 4.77E-04 (2.75E-04) [5.50E-05, 1.68E-03] | 5.4E-04 (3.1E-04) [5.5E-05, 2.0E-03]
Formaldehyde | Tons emitted | 1.9E+00 (6.0E-01) [2.1E-01, 5.6E+00] | 1.75E+00 (5.57E-01) [6.83E-01, 3.20E+00] | 1.79E+00 (5.80E-01) [6.25E-01, 3.86E+00] | 1.61E+00 (6.05E-01) [2.08E-01, 3.36E+00] | 1.8E+00 (6.0E-01) [2.1E-01, 5.6E+00]
1,1,2,2-Tetrachloroethane | Tons emitted | 4.4E-03 (7.5E-04) [1.3E-03, 1.4E-02] | 4.46E-03 (9.07E-04) [3.90E-03, 1.33E-02] | 1.39E-04 (2.79E-03) [1.76E-13, 8.10E-02] | 4.20E-03 (6.61E-04) [1.30E-03, 1.60E-02] | 4.4E-03 (6.7E-04) [1.3E-03, 1.6E-02]
1,1,2-Trichloroethane | Tons emitted | 4.0E-04 (6.6E-03) [1.8E-13, 2.1E-01] | 2.00E-05 (1.24E-04) [1.76E-13, 1.73E-03] | 5.25E-06 (9.53E-06) [1.95E-06, 1.87E-04] | 9.61E-05 (1.58E-03) [1.76E-03, 3.59E-02] | 2.1E-04 (4.4E-03) [1.8E-13, 2.1E-01]
1,2-Dibromo-3-chloropropane | Tons emitted | 5.2E-06 (7.3E-06) [6.5E-07, 9.1E-05] | 5.98E-06 (2.29E-05) [1.95E-06, 3.52E-04] | 8.41E-03 (2.26E-02) [5.00E-16, 3.75E-01] | 4.34E-06 (6.27E-06) [6.50E-07, 6.60E-05] | 5.1E-06 (1.0E-05) [6.5E-07, 3.5E-04]
1,2-Dichloropropane | Tons emitted | 1.1E-02 (3.4E-02) [5.0E-16, 4.9E-01] | 1.06E-02 (2.13E-02) [5.00E-16, 1.40E-01] | 6.41E-05 (5.31E-04) [3.00E-15, 1.01E-02] | 5.00E-03 (1.38E-02) [5.00E-16, 1.18E-01] | 9.1E-03 (2.6E-02) [5.0E-16, 4.9E-01]
Acrylic acid | Tons emitted | 1.4E-04 (2.4E-03) [3.0E-15, 7.2E-02] | 2.06E-04 (2.45E-03) [3.00E-15, 4.23E-02] | 3.43E-07 (7.89E-07) [1.46E-08, 7.29E-06] | 9.76E-05 (1.39E-03) [3.00E-15, 3.36E-02] | 1.1E-04 (1.8E-03) [3.0E-15, 7.2E-02]
Benzidine | Tons emitted | 3.3E-07 (1.2E-06) [4.9E-09, 3.6E-05] | 3.22E-07 (1.98E-06) [1.48E-08, 3.39E-05] | 1.26E-05 (2.92E-05) [4.69E-12, 3.90E-04] | 3.14E-07 (1.60E-06) [4.88E-09, 3.72E-05] | 3.3E-07 (1.3E-06) [4.9E-09, 3.7E-05]
Benzyl chloride | Tons emitted | 1.4E-05 (3.9E-05) [4.7E-12, 8.5E-04] | 1.40E-05 (4.08E-05) [4.69E-12, 4.20E-04] | 1.26E-05 (2.92E-05) [4.69E-12, 3.90E-04] | 1.10E-05 (4.97E-05) [4.69E-12, 1.16E-03] | 1.3E-05 (3.9E-05) [4.7E-12, 1.2E-03]
Beryllium compounds | Tons emitted | 4.4E-05 (4.4E-05) [7.5E-06, 7.7E-04] | 4.55E-05 (6.00E-05) [2.25E-05, 6.93E-04] | 4.66E-05 (8.23E-05) [2.25E-05, 1.56E-03] | 3.57E-05 (2.93E-05) [7.50E-06, 6.26E-04] | 4.3E-05 (5.9E-05) [7.5E-06, 1.6E-03]
bis-2-Ethylhexyl phthalate | Tons emitted | 8.4E-03 (1.9E-03) [2.6E-03, 6.3E-02] | 8.22E-03 (5.39E-04) [7.80E-03, 1.30E-02] | 8.31E-03 (1.77E-03) [7.80E-03, 4.36E-02] | 8.08E-03 (6.40E-04) [2.60E-03, 1.22E-02] | 8.3E-03 (1.6E-03) [2.6E-03, 6.3E-02]
Carbon tetrachloride | Tons emitted | 9.1E-01 (1.8E-02) [3.0E-01, 9.2E-01] | 9.11E-01 (3.75E-04) [9.11E-01, 9.15E-01] | 9.11E-01 (9.67E-04) [9.03E-01, 9.28E-01] | 9.06E-01 (5.36E-02) [3.01E-01, 9.27E-01] | 9.1E-01 (2.7E-02) [3.0E-01, 9.3E-01]
Carbonyl sulfide | Tons emitted | 1.8E-03 (1.1E-02) [5.0E-16, 1.6E-01] | 5.14E-03 (7.25E-02) [5.00E-16, 1.27E+00] | 9.25E-04 (4.94E-03) [5.00E-16, 7.78E-02] | 2.13E-03 (2.26E-02) [5.00E-16, 4.39E-01] | 1.9E-03 (2.6E-02) [5.0E-16, 1.35E+00]
Chlorine | Tons emitted | 2.4E-03 (1.9E-02) [3.4E-13, 5.6E-01] | 3.25E-03 (2.48E-02) [3.41E-13, 3.58E-01] | 1.57E-03 (9.72E-03) [3.41E-13, 1.76E-01] | 1.34E-03 (8.28E-03) [3.41E-13, 1.13E-01] | 2.0E-03 (1.6E-02) [3.4E-13, 5.6E-01]
Chlorobenzene | Tons emitted | 4.2E-03 (1.5E-02) [3.4E-11, 2.3E-01] | 3.40E-03 (1.17E-02) [2.77E-07, 1.63E-01] | 2.73E-03 (9.33E-03) [1.01E-10, 1.74E-01] | 1.60E-03 (5.08E-03) [3.36E-11, 5.42E-02] | 3.1E-03 (1.1E-02) [3.4E-11, 2.3E-01]
Chloroform | Tons emitted | 1.0E-01 (2.6E-02) [3.0E-02, 6.6E-01] | 9.77E-02 (1.61E-02) [8.85E-02, 2.02E-01] | 9.58E-02 (1.41E-02) [8.85E-02, 2.26E-01] | 9.36E-02 (1.31E-02) [2.95E-02, 2.11E-01] | 9.7E-02 (2.0E-02) [3.0E-02, 6.6E-01]
Chloroprene | Tons emitted | 1.9E-04 (3.1E-03) [1.6E-13, 8.8E-02] | 1.06E-03 (1.81E-02) [1.57E-13, 3.17E-01] | 2.05E-04 (5.31E-03) [1.57E-13, 1.69E-01] | 2.68E-05 (3.84E-04) [1.57E-13, 7.24E-03] | 2.4E-04 (6.7E-03) [1.6E-13, 3.2E-01]
Chromium compounds | Tons emitted | 4.1E-04 (7.0E-04) [2.1E-05, 6.6E-03] | 3.44E-04 (6.25E-04) [6.15E-05, 5.63E-03] | 3.28E-04 (7.70E-04) [6.15E-05, 1.04E-02] | 2.18E-04 (4.00E-04) [2.05E-05, 6.24E-03] | 3.4E-04 (6.5E-04) [2.1E-05, 1.0E-02]
Cobalt compounds | Tons emitted | 3.9E-05 (3.5E-04) [2.2E-14, 8.5E-03] | 2.66E-05 (1.12E-04) [2.20E-14, 1.66E-03] | 2.91E-05 (2.56E-04) [2.20E-14, 6.95E-03] | 3.80E-05 (2.92E-04) [2.20E-14, 4.67E-03] | 3.5E-05 (2.9E-04) [2.2E-14, 8.5E-03]
Cyanide compounds | Tons emitted | 2.5E-02 (6.1E-02) [8.1E-14, 1.4E+00] | 2.50E-02 (5.74E-02) [8.10E-14, 8.76E-01] | 1.76E-02 (2.15E-02) [8.10E-14, 2.54E-01] | 1.49E-02 (3.50E-02) [8.10E-14, 8.00E-01] | 2.1E-02 (4.6E-02) [8.1E-14, 1.4E+00]
Dibutylphthalate | Tons emitted | 3.5E-03 (5.3E-02) [1.3E-09, 1.7E+00] | 5.63E-03 (2.92E-02) [3.81E-08, 4.02E-01] | 2.21E-03 (1.38E-02) [7.18E-09, 2.19E-01] | 1.76E-03 (2.94E-02) [1.30E-09, 7.40E-01] | 2.9E-03 (3.7E-02) [1.3E-09, 1.7E+00]
Ethyl chloride | Tons emitted | 1.8E-03 (1.5E-02) [7.6E-09, 5.1E-01] | 1.18E-03 (1.67E-03) [4.97E-08, 1.31E-02] | 1.42E-03 (9.95E-03) [7.59E-09, 2.34E-01] | 8.36E-04 (1.88E-03) [7.59E-09, 2.93E-02] | 1.4E-03 (1.1E-02) [7.6E-09, 5.5E-01]
Ethyl benzene | Tons emitted | 7.7E-02 (1.2E-01) [3.5E-05, 1.9E+00] | 6.56E-02 (8.41E-02) [1.78E-04, 5.41E-01] | 5.88E-02 (8.87E-02) [2.49E-04, 8.86E-01] | 4.86E-02 (8.28E-02) [3.46E-05, 8.46E-01] | 6.4E-02 (1.0E-01) [3.5E-05, 1.9E+00]
Ethyl dichloride | Tons emitted | 4.2E-03 (2.5E-03) [9.0E-04, 3.9E-02] | 4.17E-03 (3.10E-03) [2.70E-03, 3.04E-02] | 4.30E-03 (4.07E-03) [2.70E-03, 7.73E-02] | 3.89E-03 (4.38E-03) [9.00E-04, 9.84E-02] | 4.2E-03 (3.6E-03) [9.0E-04, 9.8E-02]
Glycol ethers | Tons emitted | 3.4E-03 (1.4E-02) [1.8E-11, 2.5E-01] | 2.68E-03 (8.45E-03) [1.83E-11, 7.92E-02] | 3.59E-03 (1.55E-02) [1.83E-11, 2.66E-01] | 2.63E-03 (1.35E-02) [1.83E-11, 2.43E-01] | 3.2E-03 (1.4E-02) [1.8E-11, 2.7E-01]
Hydrazine | Tons emitted | 4.2E-06 (1.4E-05) [6.5E-08, 1.4E-04] | 4.60E-06 (1.46E-05) [1.95E-07, 1.21E-04] | 3.27E-06 (1.25E-05) [1.95E-07, 1.83E-04] | 3.34E-06 (1.67E-05) [6.50E-08, 2.80E-04] | 3.8E-06 (1.4E-05) [6.5E-08, 2.8E-04]
Hydrochloric acid | Tons emitted | 4.7E-01 (1.9E+00) [3.7E-06, 2.5E+01] | 2.08E-01 (1.04E+00) [7.72E-05, 1.16E+01] | 2.80E-01 (1.30E+00) [1.11E-05, 2.52E+01] | 1.96E-01 (1.09E+00) [3.69E-06, 2.15E+01] | 3.3E-01 (1.5E+00) [3.7E-06, 2.5E+01]
Isophorone | Tons emitted | 1.1E-04 (9.4E-04) [5.4E-14, 3.1E-02] | 1.31E-04 (8.65E-04) [5.40E-14, 1.46E-02] | 9.79E-05 (6.31E-04) [5.40E-14, 1.71E-02] | 4.55E-05 (1.63E-04) [5.40E-14, 2.45E-03] | 9.4E-05 (7.3E-04) [5.4E-14, 3.1E-02]
Manganese compounds | Tons emitted | 2.4E-03 (1.8E-02) [2.9E-04, 5.6E-01] | 2.21E-03 (1.19E-02) [8.70E-04, 2.03E-01] | 1.58E-03 (3.79E-03) [8.70E-04, 9.02E-02] | 1.49E-03 (3.39E-03) [2.90E-04, 6.50E-02] | 1.9E-03 (1.2E-02) [2.9E-04, 5.6E-01]
Methyl bromide | Tons emitted | 6.8E-02 (5.2E-02) [1.8E-02, 7.5E-01] | 6.38E-02 (3.00E-02) [5.25E-02, 2.90E-01] | 6.19E-02 (3.21E-02) [5.25E-02, 5.94E-01] | 5.77E-02 (1.66E-02) [1.75E-02, 2.22E-01] | 6.3E-02 (3.8E-02) [1.8E-02, 7.5E-01]
Methyl chloride | Tons emitted | 2.4E-01 (1.9E-01) [5.5E-02, 4.7E+00] | 2.31E-01 (1.29E-01) [1.65E-01, 1.64E+00] | 2.13E-01 (8.85E-02) [1.65E-01, 1.04E+00] | 1.96E-01 (6.98E-02) [5.50E-02, 1.01E+00] | 2.2E-01 (1.4E-01) [5.5E-02, 4.7E+00]
Phosphine | Tons emitted | 3.8E-05 (7.5E-05) [2.6E-13, 8.3E-04] | 3.72E-05 (6.85E-05) [2.64E-13, 4.70E-04] | 4.20E-05 (8.84E-05) [2.64E-13, 1.64E-03] | 4.33E-05 (1.23E-04) [2.64E-13, 2.59E-03] | 4.0E-05 (9.1E-05) [2.6E-13, 2.6E-03]
Polychlorinated biphenyls | Tons emitted | 3.8E-05 (1.1E-04) [2.1E-13, 3.7E-03] | 3.66E-05 (3.78E-05) [2.06E-13, 2.99E-04] | 3.14E-05 (3.47E-05) [2.06E-13, 4.21E-04] | 2.87E-05 (3.70E-05) [2.06E-13, 4.88E-04] | 3.4E-05 (7.4E-05) [2.1E-13, 3.7E-03]
Propylene dichloride | Tons emitted | 1.6E-03 (2.2E-03) [2.3E-04, 4.5E-02] | 1.21E-03 (1.06E-03) [6.90E-04, 7.98E-03] | 1.03E-03 (8.81E-04) [6.90E-04, 8.60E-03] | 9.74E-04 (8.25E-04) [2.30E-04, 7.00E-03] | 1.3E-03 (1.6E-03) [2.3E-04, 4.5E-02]
Quinoline | Tons emitted | 1.4E-04 (2.7E-04) [4.4E-07, 1.7E-03] | 1.51E-03 (3.27E-04) [1.32E-06, 2.06E-03] | 1.05E-04 (2.59E-04) [1.32E-06, 1.89E-03] | 5.10E-05 (1.49E-04) [4.40E-07, 1.25E-03] | 1.1E-04 (2.5E-04) [4.4E-07, 2.1E-03]
Trichloroethylene | Tons emitted | 5.2E-02 (4.9E-02) [2.5E-03, 7.6E-01] | 4.69E-02 (4.06E-02) [7.50E-03, 2.21E-01] | 4.45E-02 (4.13E-02) [7.50E-03, 2.84E-01] | 3.48E-02 (4.08E-02) [2.50E-03, 4.36E-01] | 4.5E-02 (3.1E-03) [2.8E-10, 7.0E-02]
Vinyl chloride | Tons emitted | 7.8E-04 (3.8E-03) [2.8E-10, 7.0E-02] | 5.35E-04 (1.87E-03) [2.84E-10, 2.35E-02] | 6.01E-04 (2.89E-03) [2.84E-10, 5.59E-02] | 4.55E-04 (2.64E-03) [2.84E-10, 4.77E-02] | 6.3E-04 (1.5E+00) [7.3E-03, 3.1E+01]

Water Domain

Variables included in the water domain (Table 5) suggest that urban counties were more likely to have impaired stream length (20%) compared with rural counties (9%). Additionally, urban counties had higher mercury deposition, chloride precipitation, sulfate precipitation, and the percentage of the county in drought status. Chemical contamination varied by urban-rural status depending on the chemical.

Table 5.
Water domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified

Each cell shows Mean (SD) [Range].

Variable | Units | Metropolitan-Urbanized (RUCC1 = 1,167) | Nonmetropolitan-Urbanized (RUCC2 = 306) | Less Urbanized (RUCC3 = 1,026) | Thinly Populated (RUCC4 = 644) | Total (3143)
Construct: Domestic Use
Percent pop. on self-supply | % | 4.47E+01 (4.29E+01) [0.00E+00, 1.00E+02] | 4.35E+01 (4.24E+01) [0.00E+00, 1.00E+02] | 3.26E+01 (4.13E+01) [0.00E+00, 1.00E+02] | 2.30E+01 (3.83E+01) [0.00E+00, 1.00E+02] | 3.62E+01 (4.23E+01) [0.00E+00, 1.00E+02]
Percent pop. on self supply that is surface water | % | 2.33E+01 (2.10E+01) [-2.62E-04, 1.00E+02] | 2.40E+01 (1.72E+01) [0.00E+00, 8.20E+01] | 2.99E+01 (2.10E+01) [-4.17E-02, 9.21E+01] | 3.38E+01 (2.46E+01) [-6.78E-02, 1.00E+02] | 2.77E+01 (2.18E+01) [-6.78E-02, 1.00E+02]
Construct: Overall Water Quality
Percent of stream length impaired | % | 1.97E+01 (2.35E+01) [1.00E-03, 1.56E+02] | 1.72E+01 (2.30E+01) [1.00E-03, 1.00E+02] | 1.28E+01 (1.84E+01) [1.00E-03, 1.08E+02] | 9.02E+00 (1.33E+01) [1.00E-03, 1.00E+02] | 1.50E+01 (2.05E+01) [1.00E-03, 1.56E+02]
Construct: General Water Contamination
NPDES permits per 1000 km of stream | proportion | 9.08E+01 (1.91E+02) [1.00E-03, 2.39E+03] | 3.44E+01 (3.90E+01) [1.00E-03, 2.97E+02] | 2.42E+01 (4.34E+01) [1.00E-03, 7.05E+02] | 1.20E+01 (2.40E+01) [1.00E-03, 3.55E+02] | 4.74E+01 (1.25E+02) [1.00E-03, 2.39E+03]
Construct: Atmospheric Deposition
Calcium precipitation weighted mean | mg/L | 1.63E-01 (9.69E-02) [1.22E-02, 5.94E-01] | 1.83E-01 (1.10E-01) [1.22E-02, 7.48E-01] | 2.03E-01 (1.20E-01) [3.80E-02, 1.06E+00] | 2.23E-01 (1.09E-01) [3.66E-02, 8.63E-01] | 1.90E-01 (1.11E-01) [1.22E-02, 1.06E+00]
Potassium precipitation weighted mean | mg/L | 2.57E-01 (3.63E-02) [1.22E-01, 4.91E-01] | 2.57E-01 (3.98E-02) [1.22E-01, 4.44E-01] | 2.67E-01 (5.60E-02) [1.68E-01, 1.01E+00] | 2.83E-01 (7.16E-02) [1.58E-01, 1.11E+00] | 2.66E-01 (5.31E-02) [1.22E-01, 1.11E+00]
Nitrate precipitation | mg/L | 7.34E-01 (2.11E-01) [0.00E+00, 1.13E+00] | 7.38E-01 (2.40E-01) [0.00E+00, 1.14E+00] | 7.44E-01 (2.03E-01) [1.93E-02, 1.14E+00] | 7.55E-01 (2.07E-01) [5.47E-03, 1.14E+00] | 7.42E-01 (2.10E-01) [0.00E+00, 1.14E+00]
Chloride precipitation weighted mean | mg/L | 2.98E-01 (2.44E-01) [3.47E-02, 1.91E+00] | 2.37E-01 (2.19E-01) [3.47E-02, 1.56E+00] | 2.22E-01 (1.79E-01) [6.94E-02, 2.15E+00] | 1.88E-01 (1.77E-01) [7.19E-02, 1.58E+00] | 2.44E-01 (2.13E-01) [3.47E-02, 2.15E+00]
Sulfate precipitation weighted mean | mg/L | 1.10E+00 (3.39E-01) [1.00E-01, 1.89E+00] | 1.05E+00 (3.78E-01) [1.00E-01, 1.96E+00] | 1.02E+00 (3.10E-01) [2.00E-01, 2.09E+00] | 9.26E-01 (2.76E-01) [2.03E-01, 1.92E+00] | 1.03E+00 (3.28E-01) [1.00E-01, 2.09E+00]
Total mercury deposition | ng/m2 | 9.44E+00 (2.59E+00) [2.81E-02, 1.84E+01] | 9.02E+00 (2.67E+00) [2.62E-02, 1.76E+01] | 9.29E+00 (2.66E+00) [3.62E-01, 1.55E+01] | 8.43E+00 (2.88E+00) [1.60E-01, 1.46E+01] | 9.15E+00 (2.71E+00) [2.62E-02, 1.84E+01]
Construct: Drought
Percent of county drought extreme | % | 4.16E+00 (7.38E+00) [0.00E+00, 4.52E+01] | 3.70E+00 (6.67E+00) [0.00E+00, 3.87E+01] | 3.76E+00 (6.51E+00) [0.00E+00, 4.82E+01] | 3.43E+00 (5.92E+00) [0.00E+00, 4.43E+01] | 3.84E+00 (6.75E+00) [0.00E+00, 4.82E+01]
Construct: Chemical Contamination
Arsenic | mg/L | 3.59E-03 (5.10E-03) [1.00E-03, 1.34E-01] | 3.61E-03 (3.53E-03) [1.00E-03, 3.90E-02] | 3.75E-03 (5.13E-03) [1.00E-03, 7.20E-02] | 2.67E-03 (3.24E-03) [1.00E-03, 3.10E-02] | 3.46E-03 (4.66E-03) [1.00E-03, 1.34E-01]
Barium | mg/L | 8.08E-02 (3.93E-01) [1.00E-02, 1.31E+01] | 8.34E-02 (2.37E-01) [1.00E-02, 3.98E+00] | 6.81E-02 (9.96E-02) [1.00E-02, 1.03E+00] | 4.84E-02 (7.72E-02) [1.00E-02, 6.70E-01] | 7.03E-02 (2.59E-01) [1.00E-02, 1.31E+01]
Cadmium | mg/L | 1.71E-03 (8.60E-04) [1.00E-03, 6.00E-03] | 1.66E-03 (7.77E-04) [1.00E-03, 7.00E-03] | 1.66E-03 (7.67E-04) [1.00E-03, 8.00E-03] | 1.45E-03 (6.96E-04) [1.00E-03, 7.00E-03] | 1.64E-03 (7.96E-04) [1.00E-03, 8.00E-03]
Chromium | mg/L | 6.21E-03 (7.16E-03) [1.00E-03, 1.46E-01] | 6.09E-03 (5.69E-03) [1.00E-03, 3.60E-02] | 6.27E-03 (7.48E-03) [1.00E-03, 5.60E-02] | 4.21E-03 (6.36E-03) [1.00E-03, 1.01E-01] | 5.81E-03 (7.02E-03) [1.00E-03, 1.46E-01]
Cyanide | mg/L | 1.51E-02 (2.85E-02) [1.00E-03, 2.67E-01] | 1.68E-02 (2.92E-02) [1.00E-03, 2.11E-01] | 1.57E-02 (3.18E-02) [1.00E-03, 3.39E-01] | 1.39E-02 (4.12E-02) [1.00E-03, 8.16E-01] | 1.52E-02 (3.26E-02) [1.00E-03, 8.16E-01]
Fluoride | mg/L | 1.16E+00 (7.81E+00) [2.00E-02, 1.50E+02] | 4.31E-01 (4.20E-01) [2.00E-02, 2.65E+00] | 4.83E-01 (6.44E-01) [2.00E-02, 8.71E+00] | 3.50E-01 (6.63E-01) [2.00E-02, 1.14E+01] | 7.02E-01 (4.80E+00) [2.00E-02, 1.50E+02]
Mercury (inorganic) | mg/L | 1.15E-03 (1.13E-03) [1.00E-03, 3.60E-02] | 1.08E-03 (2.74E-04) [1.00E-03, 2.00E-03] | 1.09E-03 (3.08E-04) [1.00E-03, 5.00E-03] | 1.08E-03 (3.44E-04) [1.00E-03, 7.00E-03] | 1.11E-03 (7.33E-04) [1.00E-03, 3.60E-02]
Nitrate | mg/L | 8.07E-01 (1.64E+00) [1.00E-02, 2.00E+01] | 6.59E-01 (1.19E+00) [1.00E-02, 1.46E+01] | 7.37E-01 (2.80E+00) [1.00E-02, 8.10E+01] | 6.22E-01 (2.01E+00) [1.00E-02, 3.28E+01] | 7.32E-01 (2.13E+00) [1.00E-02, 8.10E+01]
Nitrite | mg/L | 6.78E-02 (1.76E-01) [1.00E-02, 3.60E+00] | 6.70E-02 (1.39E-01) [1.00E-02, 1.90E+00] | 5.84E-02 (1.17E-01) [1.00E-02, 1.54E+00] | 5.18E-02 (1.71E-01) [1.00E-02, 3.41E+00] | 6.13E-02 (1.55E-01) [1.00E-02, 3.60E+00]
Selenium | mg/L | 4.19E-03 (5.46E-03) [1.00E-03, 9.50E-02] | 3.82E-03 (3.48E-03) [1.00E-03, 3.10E-02] | 3.96E-03 (4.21E-03) [1.00E-03, 3.10E-02] | 3.21E-03 (4.50E-03) [1.00E-03, 4.80E-02] | 3.88E-03 (4.72E-03) [1.00E-03, 9.50E-02]
Antimony | mg/L | 2.51E-03 (1.76E-03) [1.00E-03, 2.00E-02] | 2.50E-03 (1.59E-03) [1.00E-03, 7.00E-03] | 2.49E-03 (1.63E-03) [1.00E-03, 7.00E-03] | 2.00E-03 (1.44E-03) [1.00E-03, 7.00E-03] | 2.40E-03 (1.65E-03) [1.00E-03, 2.00E-02]
Endrin | mg/L | 8.05E-02 (2.01E-01) [1.00E-02, 1.01E+00] | 7.26E-02 (1.84E-01) [1.00E-02, 1.01E+00] | 7.86E-02 (2.03E-01) [1.00E-02, 1.01E+00] | 5.71E-02 (1.75E-01) [1.00E-02, 1.01E+00] | 7.43E-02 (1.95E-01) [1.00E-02, 1.01E+00]
Methoxychlor | µg/L | 6.90E-01 (1.97E+00) [1.00E-02, 1.00E+01] | 5.66E-01 (1.65E+00) [1.00E-02, 1.00E+01] | 4.06E-01 (1.23E+00) [1.00E-02, 9.65E+00] | 1.59E-01 (6.02E-01) [1.00E-02, 8.01E+00] | 4.76E-01 (1.52E+00) [1.00E-02, 1.00E+01]
Dalapon | µg/L | 7.28E+00 (2.27E+01) [8.00E-02, 1.00E+02] | 8.47E+00 (2.44E+01) [8.00E-02, 1.00E+02] | 8.47E+00 (2.50E+01) [8.00E-02, 1.00E+02] | 7.78E+00 (2.49E+01) [8.00E-02, 1.00E+02] | 7.89E+00 (2.41E+01) [8.00E-02, 1.00E+02]
Di(2-ethylhexyl) adipate | µg/L | 1.12E+01 (2.94E+02) [6.00E-02, 1.00E+04] | 3.15E+00 (9.77E+00) [6.00E-02, 5.01E+01] | 3.03E+00 (1.77E+01) [6.00E-02, 5.01E+02] | 1.30E+00 (5.69E+00) [6.00E-02, 5.01E+01] | 5.74E+00 (1.79E+02) [6.00E-02, 1.00E+04]
Simazine | µg/L | 2.25E-01 (3.12E-01) [5.00E-02, 4.89E+00] | 2.38E-01 (3.77E-01) [5.00E-02, 5.05E+00] | 2.19E-01 (2.77E-01) [5.00E-02, 1.85E+00] | 1.56E-01 (2.31E-01) [5.00E-02, 1.05E+00] | 2.10E-01 (2.94E-01) [5.00E-02, 5.05E+00]
Di(2-ethylhexyl) phthalate | µg/L | 8.55E-01 (1.26E+00) [8.00E-02, 9.41E+00] | 8.72E-01 (1.20E+00) [8.00E-02, 6.08E+00] | 7.87E-01 (1.29E+00) [8.00E-02, 1.59E+01] | 4.79E-01 (8.87E-01) [8.00E-02, 9.15E+00] | 7.57E-01 (1.21E+00) [8.00E-02, 1.59E+01]
Picloram | µg/L | 2.44E+00 (1.00E+01) [4.00E-02, 5.00E+01] | 3.64E+00 (1.25E+01) [4.00E-02, 1.00E+02] | 2.54E+00 (1.00E+01) [4.00E-02, 5.00E+01] | 1.22E+00 (6.36E+00) [4.00E-02, 5.00E+01] | 2.34E+00 (9.71E+00) [4.00E-02, 1.00E+02]
Dinoseb | µg/L | 2.94E-01 (4.19E-01) [8.00E-02, 3.08E+00] | 3.32E-01 (4.45E-01) [8.00E-02, 2.08E+00] | 2.92E-01 (4.64E-01) [8.00E-02, 9.08E+00] | 2.48E-01 (3.87E-01) [8.00E-02, 2.08E+00] | 2.88E-01 (4.31E-01) [8.00E-02, 9.08E+00]
Atrazine | µg/L | 2.05E-01 (3.12E-01) [3.00E-02, 2.53E+00] | 2.24E-01 (3.42E-01) [3.00E-02, 3.78E+00] | 2.73E-01 (2.37E+00) [3.00E-02, 7.53E+01] | 1.34E-01 (2.35E-01) [3.00E-02, 2.28E+00] | 2.15E-01 (1.37E+00) [3.00E-02, 7.53E+01]
2,4-Dichlorophenoxyacetic acid | µg/L | 1.40E-01 (1.08E-01) [9.00E-02, 2.51E+00] | 1.42E-01 (5.41E-02) [9.00E-02, 4.00E-01] | 1.42E-01 (2.27E-01) [9.00E-02, 7.19E+00] | 1.20E-01 (5.30E-02) [9.00E-02, 8.10E-01] | 1.37E-01 (1.49E-01) [9.00E-02, 7.19E+00]
Benzo[a]pyrene | µg/L | 4.78E-02 (5.40E-02) [1.00E-02, 3.47E-01] | 5.03E-02 (5.82E-02) [1.00E-02, 3.34E-01] | 5.33E-02 (5.93E-02) [1.00E-02, 3.10E-01] | 3.84E-02 (4.93E-02) [1.00E-02, 2.10E-01] | 4.79E-02 (5.56E-02) [1.00E-02, 3.47E-01]
Pentachlorophenol | µg/L | 7.84E-02 (1.63E-01) [1.00E-02, 1.71E+00] | 8.91E-02 (1.81E-01) [1.00E-02, 1.01E+00] | 8.82E-02 (1.76E-01) [1.00E-02, 1.01E+00] | 6.16E-02 (1.36E-01) [1.00E-02, 1.01E+00] | 7.92E-02 (1.65E-01) [1.00E-02, 1.71E+00]
Polychlorinated biphenyls | µg/L | 1.65E-01 (1.19E+00) [6.00E-02, 4.04E+01] | 1.13E-01 (1.24E-01) [6.00E-02, 1.06E+00] | 1.13E-01 (1.88E-01) [6.00E-02, 4.31E+00] | 8.13E-02 (6.53E-02) [6.00E-02, 1.06E+00] | 1.26E-01 (7.35E-01) [6.00E-02, 4.04E+01]
1,2-Dibromo-3-chloropropane | µg/L | 2.19E-02 (1.93E-02) [1.00E-02, 5.45E-01] | 2.01E-02 (9.92E-03) [1.00E-02, 3.00E-02] | 2.05E-02 (9.96E-03) [1.00E-02, 4.50E-02] | 1.86E-02 (9.86E-03) [1.00E-02, 3.00E-02] | 2.06E-02 (1.42E-02) [1.00E-02, 5.45E-01]
Ethylene dibromide | µg/L | 8.28E-02 (1.60E-01) [1.00E-02, 1.17E+00] | 7.14E-02 (1.39E-01) [1.00E-02, 5.10E-01] | 6.94E-02 (1.41E-01) [1.00E-02, 8.70E-01] | 8.19E-02 (1.59E-01) [1.00E-02, 5.10E-01] | 7.72E-02 (1.52E-01) [1.00E-02, 1.17E+00]
Xylenes | µg/L | 8.44E-01 (6.05E+00) [1.00E-01, 2.00E+02] | 8.60E-01 (3.26E+00) [1.00E-01, 5.08E+01] | 2.00E+00 (4.37E+01) [1.00E-01, 1.40E+03] | 2.01E+00 (3.94E+01) [1.00E-01, 1.00E+03] | 1.46E+00 (3.09E+01) [1.00E-01, 1.40E+03]
Chlordane | µg/L | 1.08E-01 (9.94E-02) [2.00E-02, 9.70E-01] | 1.17E-01 (9.62E-02) [2.00E-02, 2.76E-01] | 1.12E-01 (9.77E-02) [2.00E-02, 2.87E-01] | 8.43E-02 (9.23E-02) [2.00E-02, 2.20E-01] | 1.06E-01 (9.77E-02) [2.00E-02, 9.70E-01]
Dichloromethane | µg/L | 4.99E-01 (4.91E-01) [1.00E-01, 1.03E+01] | 4.90E-01 (2.67E-01) [1.00E-01, 1.98E+00] | 4.95E-01 (3.09E-01) [1.00E-01, 4.05E+00] | 4.29E-01 (5.13E-01) [1.00E-01, 1.18E+01] | 4.83E-01 (4.27E-01) [1.00E-01, 1.18E+01]
p-Dichlorobenzene | µg/L | 5.09E-01 (5.13E+00) [2.00E-02, 1.75E+02] | 3.72E-01 (2.41E-01) [2.00E-02, 1.54E+00] | 3.62E-01 (2.57E-01) [2.00E-02, 2.77E+00] | 3.11E-01 (3.55E-01) [2.00E-02, 6.02E+00] | 4.07E-01 (3.13E+00) [2.00E-02, 1.75E+02]
1,1,1-Trichloroethane | µg/L | 6.77E-01 (1.03E+01) [1.00E-02, 3.51E+02] | 7.94E-01 (7.15E+00) [1.00E-02, 1.25E+02] | 3.99E-01 (9.67E-01) [1.00E-02, 3.03E+01] | 3.03E-01 (2.51E-01) [1.00E-02, 2.16E+00] | 5.21E-01 (6.67E+00) [1.00E-02, 3.51E+02]
Trichloroethylene | µg/L | 4.39E-01 (4.89E-01) [2.00E-02, 6.50E+00] | 4.06E-01 (2.67E-01) [2.00E-02, 2.03E+00] | 4.00E-01 (2.70E-01) [2.00E-02, 3.75E+00] | 3.27E-01 (2.54E-01) [2.00E-02, 1.93E+00] | 4.00E-01 (3.67E-01) [2.00E-02, 6.50E+00]
Carbon tetrachloride | µg/L | 4.62E-01 (5.79E-01) [1.00E-02, 8.01E+00] | 4.13E-01 (3.76E-01) [1.00E-02, 5.12E+00] | 4.22E-01 (7.75E-01) [1.00E-02, 2.38E+01] | 3.26E-01 (2.96E-01) [1.00E-02, 4.34E+00] | 4.16E-01 (5.95E-01) [1.00E-02, 2.38E+01]
Benzene | µg/L | 4.92E-01 (3.48E-01) [1.10E-01, 4.24E+00] | 4.87E-01 (2.43E-01) [1.10E-01, 1.74E+00] | 4.94E-01 (2.47E-01) [1.10E-01, 3.24E+00] | 4.22E-01 (2.49E-01) [1.10E-01, 1.55E+00] | 4.78E-01 (2.90E-01) [1.10E-01, 4.24E+00]
Toluene | µg/L | 7.60E-01 (6.22E+00) [7.00E-02, 2.01E+02] | 2.59E+00 (2.27E+01) [7.00E-02, 3.34E+02] | 1.07E+00 (1.26E+01) [7.00E-02, 3.50E+02] | 4.43E-01 (1.34E+00) [7.00E-02, 3.37E+01] | 9.74E-01 (1.08E+01) [7.00E-02, 3.50E+02]
Ethylbenzene | µg/L | 5.00E-02 (0.00E+00) [5.00E-02, 5.00E-02] | 5.00E-02 (0.00E+00) [5.00E-02, 5.00E-02] | 5.00E-02 (0.00E+00) [5.00E-02, 5.00E-02] | 5.00E-02 (0.00E+00) [5.00E-02, 5.00E-02] | 5.00E-02 (0.00E+00) [5.00E-02, 5.00E-02]
Styrene | µg/L | 5.67E-01 (2.37E+00) [1.00E-01, 7.86E+01] | 4.91E-01 (3.40E-01) [1.00E-01, 3.58E+00] | 4.93E-01 (3.12E-01) [1.00E-01, 5.00E+00] | 4.14E-01 (2.73E-01) [1.00E-01, 2.80E+00] | 5.04E-01 (1.47E+00) [1.00E-01, 7.86E+01]
Alpha particles | pCi/L | 1.05E+00 (2.32E+00) [0.00E+00, 3.58E+01] | 1.24E+00 (3.42E+00) [0.00E+00, 5.15E+01] | 1.34E+00 (3.19E+00) [0.00E+00, 3.47E+01] | 7.33E-01 (2.01E+00) [0.00E+00, 1.81E+01] | 1.10E+00 (2.71E+00) [0.00E+00, 5.15E+01]
cis-1,2-Dichloroethylene | µg/L | 3.87E-01 (4.21E-01) [2.00E-02, 1.19E+01] | 4.02E-01 (3.53E-01) [2.00E-02, 5.19E+00] | 3.92E-01 (2.22E-01) [2.00E-02, 1.22E+00] | 3.28E-01 (2.53E-01) [2.00E-02, 2.09E+00] | 3.78E-01 (3.28E-01) [2.00E-02, 1.19E+01]
Construct: Drinking Water Quality
Total coliform proportion | Proportion | 1.20E-01 (3.55E-01) [1.00E-03, 4.93E+00] | 2.86E-01 (1.26E+00) [1.00E-03, 1.34E+01] | 2.03E-01 (8.82E-01) [1.00E-03, 1.84E+01] | 2.22E-01 (8.41E-01) [1.00E-03, 9.71E+00] | 1.84E-01 (7.76E-01) [1.00E-03, 1.84E+01]

Land Domain

In the land domain, the metropolitan-urbanized counties had lower agricultural-related variables (percent harvested and percent irrigated) than did nonmetropolitan-urbanized, less urban, and thinly populated counties (Table 6).
Pesticides and animal units showed no clear pattern of variation across the strata. For example, average pounds of herbicides applied were 58,700, 78,400, 75,100, and 61,500 for the most urban to the most rural strata, respectively. There was little variation in the distribution of radon zones across the urban/rural strata.

Table 6. Land domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified
Variable | Units | Metropolitan-Urbanized (RUCC1 = 1,167) Mean (SD) [Range] | Nonmetropolitan-Urbanized (RUCC2 = 306) Mean (SD) [Range] | Less Urbanized (RUCC3 = 1026) Mean (SD) [Range] | Thinly Populated (RUCC4 = 644) Mean (SD) [Range] | Total (3143) Mean (SD) [Range]
Construct: Agriculture
Farms per acre | Number | 1.53E-03 (1.10E-03) [2.34E-06, 7.87E-03] | 1.49E-03 (1.06E-03) [2.34E-06, 6.48E-03] | 1.34E-03 (1.03E-03) [2.34E-06, 5.95E-03] | 9.15E-04 (8.72E-04) [2.34E-06, 5.18E-03] | 1.34E-03 (1.05E-03) [2.34E-06, 7.87E-03]
Irrigated acreage | % | 2.20E+00 (6.72E+00) [3.62E-04, 7.42E+01] | 3.46E+00 (9.15E+00) [3.62E-04, 5.65E+01] | 3.45E+00 (8.73E+00) [3.62E-04, 7.14E+01] | 2.81E+00 (7.39E+00) [3.62E-04, 6.07E+01] | 2.86E+00 (7.83E+00) [3.62E-04, 7.42E+01]
Chemicals used to control nematodes, acres applied per county acres | % | 1.01E-02 (1.28E-02) [1.32E-06, 1.07E-01] | 1.14E-02 (1.54E-02) [1.32E-06, 1.30E-01] | 1.27E-02 (1.60E-02) [1.32E-06, 1.50E-01] | 8.75E-03 (1.08E-02) [1.32E-06, 9.63E-02] | 1.08E-02 (1.39E-02) [1.32E-06, 1.50E-01]
Manure, acres applied per county acres | % | 1.69E-02 (2.56E-02) [1.56E-06, 2.63E-01] | 2.10E-02 (2.71E-02) [1.56E-06, 1.68E-01] | 1.96E-02 (2.83E-02) [1.56E-06, 2.52E-01] | 1.12E-02 (1.78E-02) [1.56E-06, 1.54E-01] | 1.70E-02 (2.55E-02) [1.56E-06, 2.63E-01]
Chemicals used to control diseases in crops and orchards, acres applied per county acres |  | 1.48E-02 (2.62E-02) [8.78E-07, 2.25E-01] | 1.68E-02 (2.63E-02) [8.78E-07, 1.59E-01] | 1.86E-02 (3.06E-02) [8.78E-07, 2.60E-01] | 1.95E-02 (3.32E-02) [8.78E-07, 3.05E-01] | 1.72E-02 (2.93E-02) [8.78E-07, 3.05E-01]
Chemicals used to defoliate/control growth/thin fruit, acres applied per county acres |  | 1.46E-02 (2.91E-02) [8.49E-07, 3.84E-01] | 1.67E-02 (3.28E-02) [8.49E-07, 3.63E-01] | 1.91E-02 (3.37E-02) [8.49E-07, 4.15E-01] | 1.32E-02 (1.92E-02) [8.49E-07, 2.12E-01] | 1.60E-02 (2.95E-02) [8.49E-07, 4.15E-01]
Harvested acreage, acres harvested per county acres |  | 1.90E-01 (2.12E-01) [2.59E-05, 9.94E-01] | 2.47E-01 (2.50E-01) [2.59E-05, 9.16E-01] | 2.51E-01 (2.60E-01) [2.59E-05, 9.43E-01] | 2.18E-01 (2.25E-01) [2.59E-05, 9.21E-01] | 2.21E-01 (2.37E-01) [2.59E-05, 9.94E-01]
Animal units, animal units per county acres |  | 2.62E-04 (1.01E-03) [1.31E-08, 1.75E-02] | 1.11E-04 (2.08E-04) [1.31E-08, 2.36E-03] | 1.29E-04 (4.09E-04) [1.31E-08, 6.14E-03] | 1.32E-04 (5.43E-04) [1.31E-08, 6.75E-03] | 1.77E-04 (7.11E-04) [1.31E-08, 1.75E-02]
Construct: Pesticides
Fungicides, applied | Pounds | 2.66E+04 (2.00E+05) [3.75E-01, 5.17E+06] | 8.56E+03 (2.44E+04) [3.00E-01, 2.24E+05] | 6.37E+03 (1.74E+04) [2.00E-01, 2.37E+05] | 3.96E+03 (9.61E+03) [4.33E-01, 1.59E+05] | 1.36E+04 (1.23E+05) [2.00E-01, 5.17E+06]
Herbicides, applied | Pounds | 5.87E+04 (8.30E+04) [2.23E+00, 8.68E+05] | 7.84E+04 (9.32E+04) [7.00E-01, 6.17E+05] | 7.51E+04 (8.39E+04) [1.42E+01, 4.75E+05] | 6.15E+04 (7.00E+04) [2.00E-01, 4.28E+05] | 6.65E+04 (8.22E+04) [2.00E-01, 8.68E+05]
Insecticides, applied | Pounds | 9.61E+03 (3.23E+04) [2.00E-01, 5.72E+05] | 8.96E+03 (2.11E+04) [2.01E+01, 2.30E+05] | 8.11E+03 (1.42E+04) [1.85E+00, 2.57E+05] | 5.18E+03 (7.47E+03) [1.00E-01, 9.77E+04] | 8.15E+03 (2.26E+04) [1.00E-01, 5.72E+05]
Construct: Mines
Primarily coal mines, mines per county pop. |
Proportion | 1.11E-04 (7.38E-04) [6.25E-07, 1.25E-02] | 1.35E-04 (5.64E-04) [6.25E-07, 4.67E-03] | 4.05E-04 (2.18E-03) [6.25E-07, 2.82E-02] | 5.67E-04 (3.75E-03) [6.25E-07, 5.78E-02] | 3.03E-04 (2.17E-03) [6.25E-07, 5.78E-02]
Primarily metal mines, mines per county pop. | Proportion | 3.29E-05 (3.24E-04) [2.44E-07, 6.43E-03] | 4.14E-05 (2.19E-04) [2.44E-07, 2.54E-03] | 1.19E-04 (7.78E-04) [2.44E-07, 1.43E-02] | 5.18E-04 (3.84E-03) [2.44E-07, 7.41E-02] | 1.61E-04 (1.81E-03) [2.44E-07, 7.41E-02]
Primarily nonmetal mines, mines per county pop. | Proportion | 3.16E-05 (2.57E-04) [2.86E-07, 7.67E-03] | 3.08E-05 (7.09E-05) [2.86E-07, 6.35E-04] | 7.76E-05 (3.34E-04) [2.86E-07, 6.41E-03] | 1.43E-04 (8.15E-04) [2.86E-07, 1.66E-02] | 6.94E-05 (4.46E-04) [2.86E-07, 1.66E-02]
Primarily sand and gravel mines, mines per county pop. | Proportion | 1.40E-04 (3.49E-04) [2.00E-07, 6.87E-03] | 2.07E-04 (2.38E-04) [2.00E-07, 1.25E-03] | 3.47E-04 (4.78E-04) [2.00E-07, 4.43E-03] | 8.32E-04 (1.34E-03) [2.00E-07, 1.24E-02] | 3.56E-04 (7.49E-04) [2.00E-07, 1.24E-02]
Primarily stone mines, mines per county pop. | Proportion | 9.42E-05 (3.10E-04) [3.06E-07, 5.66E-03] | 1.12E-04 (1.78E-04) [3.06E-07, 1.95E-03] | 2.04E-04 (5.12E-04) [3.06E-07, 9.32E-03] | 3.40E-04 (1.32E-03) [3.06E-07, 2.42E-02] | 1.82E-04 (7.00E-04) [3.06E-07, 2.42E-02]
Construct: Radon
Radon | Ordinal | 2.02E+00 (8.14E-01) [0.00E+00, 3.00E+00] | 1.97E+00 (8.23E-01) [1.00E+00, 3.00E+00] | 2.03E+00 (8.24E-01) [1.00E+00, 3.00E+00] | 1.88E+00 (8.09E-01) [1.00E+00, 3.00E+00] | 1.99E+00 (8.19E-01) [0.00E+00, 3.00E+00]
Construct: Facilities
Facilities per county | Proportion | 3.69E-04 (2.82E-04) [5.60E-06, 3.22E-03] | 4.99E-04 (3.25E-04) [3.69E-05, 2.24E-03] | 5.60E-04 (4.63E-04) [5.60E-06, 6.65E-03] | 8.25E-04 (2.08E-03) [5.60E-06, 4.58E-02] | 5.38E-04 (1.01E-03) [5.60E-06, 4.58E-02]

Sociodemographic Domain
Socioeconomic variables included in the sociodemographic domain indicated that rural counties generally were more deprived than were more urban counties (Table 7), with both the lowest household income ($30,300) and lowest household value ($94,900). From the crime perspective, however, rural areas were at an advantage compared with more urban areas; the mean violent crime rate per county population for rural counties was 385.5 compared with 619.8 for the most urban counties.

Table 7. Sociodemographic domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified
Variable | Units | Metropolitan-Urbanized (RUCC1 = 1167) Mean (SD) [Range] | Nonmetropolitan-Urbanized (RUCC2 = 306) Mean (SD) [Range] | Less Urbanized (RUCC3 = 1026) Mean (SD) [Range] | Thinly Populated (RUCC4 = 644) Mean (SD) [Range] | OVERALL (n=3143) Mean (SD) [Range]
Sociodemographic Domain
Construct: Socioeconomic
Percent bachelor's degree | % | 15.1 (5.8) [2.6, 37.2] | 12.7 (4.6) [5.4, 34.7] | 10.5 (4.0) [3.0, 42.2] | 11.4 (4.6) [1.9, 36.1] | 12.6 (5.3) [1.9, 42.2]
Percent unemployed | % | 7.6 (2.5) [0, 27.5] | 8.1 (2.6) [2.2, 20.2] | 7.9 (3.4) [0.3, 26.3] | 6.7 (4.6) [0.0, 30.9] | 7.5 (3.6) [0, 30.9]
Percent families less than poverty level | % | 9.8 (4.5) [0, 39.6] | 11.9 (4.8) [3.1, 35.1] | 12.7 (5.8) [1.4, 44.9] | 11.9 (6.4) [0.0, 44.4] | 11.4 (5.5) [0, 44.9]
Percent vacant housing | % | 12.1 (6.5) [1.7, 60.1] | 14.8 (7.7) [5.5, 63.9] | 18.5 (9.3) [4.9, 68.0] | 25.8 (12.3) [7.2, 83.3] | 17.3 (10.3) [1.7, 83.3]
Median household value (X1000) | Dollar value | 175.4 (103.9) [0, 868k] | 135.4 (78.7) [57.0, 583.2k] | 106.6 (64.9) [18.6, 100.0k] | 94.9 (55.5) [29.7, 4965.6k] | 133.5 (88.4) [0, 1000k]
Household income (X1000) | Dollars | 82.6 (17.0) [67.0, 3217.9k] | 23.1 (9.7) [5.9, 76.7k] | 8.7 (4.9) [1.1, 30.7k] | 3.0 (2.4) [0.2, 15.4k] | 36.3k (109.9) [22, 321.8k]
Count of occupants per room | Count | 0.6 (0.6) [0.1, 6.1] | 0.6 (0.6) [0.1, 5.4] | 0.8 (1.2) [0.1, 20.2] | 0.9 (1.4) [0.1, 31.5] | 0.7 (1.0) [0.1, 31.5]
Percent renter-occupied housing | % | 28.0 (9.3) [8.7, 100] | 30.0 (6.3) [16.8, 51.0] | 26.2 (5.9) [11.3, 53.7] | 23.6 (7.0) [8.7, 71.4] | 26.7 (7.8) [8.7, 100]
Gini coefficient | Proportion | 0.43 (0.04) [0.3, 0.6] | 0.44 (0.03) [0.35, 0.54] | 0.4 (0.0) [0.3, 0.6] | 0.4 (0.0) [0.2, 0.6] | 0.43 (0.04) [0.21, 0.65]
Construct: Crime
Mean number of violent crimes per capita | Rate per county population | 619.8 (441.4) [22.6, 6628.6] | 472.3 (308.2) [19.52, 1735.0] | 446.7 (249.8) [7.3, 1710.7] | 385.5 (195.1) [69.9, 1420.1] | 500.9 (344.5) [7.3, 6628.6]
Construct: County typology
Creative class | % | 0.2 (0.1) [0, 0.51] | 0.2 (0.0) [0.1, 0.4] | 0.2 (0.0) [0.0, 0.5] | 0.15 (0.0) [0, 0.4] | 0.18 (0.06) [0, 0.51]
Construct: County political valence
Percent Democratic voters | % | 44.8 (13.7) [5.5, 92.5] | 43.9 (12.4) [12.5, 84.5] | 40.2 (12.9) [7.8, 88.7] | 36.4 (14.3) [4.9, 86.8] | 41.5 (13.8) [4.9, 92.5]
NOTE: Means calculated using nontransformed variables; k = 1000

Built Domain
The most urban counties had a higher rate of traffic fatalities, and their residents reported spending more time commuting, compared with more rural areas (Table 8).
Urban counties also had a higher walkability score but contained less green space and undeveloped area than rural counties.

Table 8. Built-environment domain variable means, standard deviations (SDs), and ranges - Overall and rural-urban continuum codes (RUCCs) stratified
Variable | Units | Metropolitan-Urbanized (RUCC1 = 1167) Mean (SD) [Range] | Nonmetropolitan-Urbanized (RUCC2 = 306) Mean (SD) [Range] | Less Urbanized (RUCC3 = 1026) Mean (SD) [Range] | Thinly Populated (RUCC4 = 644) Mean (SD) [Range] | OVERALL (n=3143) Mean (SD) [Range]
Built Domain
Construct: Business environment
Vice-related environment | Count/county population | 4.9e-4 (3.1e-4) [1.5e-5, 3.4e-3] | 5.8e-4 (2.9e-4) [6.3e-5, 1.8e-3] | 6.4e-4 (4.3e-4) [1.5e-5, 2.8e-3] | 8.9e-4 (8.9e-3) [1.5e-5, 7.2e-3] | 6.3e-4 (5.3e-4) [1.5e-5, 7.2e-3]
Civic-related environment | Count/county population | 2.9e-3 (9.4e-4) [2.5e-4, 8.4e-4] | 3.3e-3 (8.6e-4) [9.5e-4, 7.2e-3] | 3.8e-3 (1.1e-3) [5.9e-4, 6.5e-3] | 4.3e-3 (1.7e-3) [2.5e-4, 1.6e-2] | 3.5e-3 (1.3e-3) [2.5e-4, 1.6e-2]
Education-related environment | Count/county population | 1.2e-3 (4.2e-4) [1.8e-4, 4.5e-3] | 1.3e-3 (3.6e-4) [6.3e-4, 3.2e-3] | 1.5e-3 (6.0e-4) [5.9e-4, 6.5e-3] | 2.5e-3 (1.8e-3) [1.8e-4, 1.8e-2] | 1.6e-3 (1.0e-3) [1.8e-4, 1.8e-2]
Health care-related environment | Count/county population | 3.4e-3 (1.6e-3) [3.4e-3, 1.6e-3] | 3.7e-3 (1.1e-3) [1.0e-3, 1.1e-2] | 3.2e-3 (1.3e-3) [6.0e-4, 2.0e-2] | 2.8e-3 (1.4e-3) [1.0e-4, 9.1e-3] | 3.2e-3 (1.4e-3) [1.0e-4, 2.0e-2]
Negative food environment | Count/county population | 1.2e-3 (3.4e-4) [7.0e-5, 3.4e-3] | 1.4e-3 (3.8e-4) [6.4e-4, 4.3e-3] | 1.4e-3 (4.2e-4) [1.7e-4, 4.7e-3] | 1.3e-3 (8.5e-4) [7.0e-5, 1.3e-2] | 1.3e-3 (5.2e-4) [7.0e-5, 1.3e-2]
Positive food environment | Count/county population | 2.2e-3 (7.7e-7) [1.3e-4, 8.1e-3] | 2.3e-3 (8.5e-4) [1.0e-3, 7.8e-3] | 2.4e-3 (8.9e-4) [4.4e-4, 9.0e-3] | 2.9e-3 (1.7e-3) [1.3e-4, 2.0e-2] | 2.4e-3 (1.1e-3) [1.3e-4, 2.0e-2]
Recreation environment | Count/county population | 1.3e-3 (6.1e-4) [4.7e-5, 1.1e-2] | 1.6e-3 (8.5e-4) [3.0e-4, 8.8e-3] | 1.7e-3 (1.0e-3) [1.2e-4, 1.0e-2] | 2.2e-3 (1.9e-3) [4.7e-5, 1.8e-2] | 1.6e-3 (1.2e-3) [4.7e-5, 1.8e-2]
Social service-related environment | Count/county population | 1.5e-3 (5.9e-4) [9.2e-5, 5.1e-3] | 1.8e-3 (6.5e-4) [6.2e-4, 4.8e-3] | 1.8e-3 (7.8e-4) [3.0e-4, 5.2e-3] | 1.9e-3 (1.1e-3) [9.2e-5, 8.4e-3] | 1.7e-3 (8.0e-4) [9.2e-5, 8.4e-3]
Construct: Highway safety
Traffic fatality rate | Fatality count/county population | 23.2 (39.0) [1.0, 685.8] | 11.2 (6.5) [1.3, 59.6] | 5.7 (3.5) [1.0, 39.4] | 2.8 (1.8) [1.0, 14.0] | 12.1 (25.5) [1, 685.8]
Construct: Housing
Rate of low-rent + Section 8 housing | Unit count/county population | 0.2 (0.4) [0.0, 1.0] | 0.2 (0.4) [0.0, 1.0] | 0.4 (0.5) [0.0, 1.0] | 0.6 (0.5) [0.0, 1.0] | 0.4 (0.5) [0, 1]
Construct: Roads
Proportion of roads that are secondary | Secondary road mile/total road miles | 0.2 (0.1) [0.0, 0.5] | 0.1 (0.1) [0.0, 0.44] | 0.14 (0.1) [0.0, 0.4] | 0.1 (0.1) [0.1, 24.1] | (0.1) [0.2, 0.5]
Construct: Commuting practices
Residents who report using public transport |  | 1.8 (4.6) [0.1, 60.5] | 0.7 (1.2) [0.1, 12.8] | 0.7 (1.0) [0.1, 13.0] | 0.9 (1.2) [0.1, 24.1] | 1.2 (3.0) [0.1, 60.5]
Commute time | Minutes | 25.0 (5.1) [6.2, 60.5] | 20.7 (3.6) [12.3, 31.8] | 21.6 (5.0) [5.4, 38.5] | 21.4 (6.4) [4.3, 44.2] | 22.7 (5.5) [4.3, 44.2]
Construct: Walkability
Walkability score | Ordinal | 7.1 (2.3) [1.7, 16.2] | 6.6 (1.1) [4.1, 13.8] | 5.9 (1.1) [2.0, 10.5] | 5.3 (1.2) [1.0, 9.5] | 6.3 (1.8) [1.0, 16.2]
Construct: Green space
County land area classified as natural cover and open space | % | 61.5 (24.4) [3.9, 99.7] | 62.3 (28.0) [5.3, 99.8] | 63.2 (28.6) [6.9, 100.0] | 68.5 (27.7) [6.2, 100.0] | 63.5 (27.0) [3.9, 100.0]
NOTE: Means calculated using nontransformed variables
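
Tables 5 through 8 report, for each variable, a mean, standard deviation, and range overall and within each RUCC stratum. The following is a minimal sketch of that kind of summary using only the Python standard library. The toy records, the function name `summarize_by_stratum`, and the use of the sample standard deviation are assumptions made for illustration; they are not taken from the EQI data files or from the code used to build the report.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_stratum(records):
    """Return {label: (mean, sd, minimum, maximum)} per stratum and overall.

    records: iterable of (stratum_label, value) pairs, one per county.
    The sample standard deviation is used here; that choice is an assumption.
    """
    groups = defaultdict(list)
    for stratum, value in records:
        groups[stratum].append(value)
        groups["overall"].append(value)  # every county also counts overall
    summary = {}
    for label, values in groups.items():
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[label] = (mean(values), sd, min(values), max(values))
    return summary


# Hypothetical toy data: (RUCC stratum, commute time in minutes)
toy = [("RUCC1", 25.0), ("RUCC1", 27.0), ("RUCC4", 20.0), ("RUCC4", 22.0)]
result = summarize_by_stratum(toy)
for label, stats in result.items():
    print(label, stats)
```

Each tuple corresponds to one "Mean (SD) [Range]" cell; formatting to the scientific notation used in the tables is left out of the sketch.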
Variable Loadings on Environmental Quality Index Domains

Air Domain
The loadings for the variables comprising the air domain are displayed in Table 9. Each variable has been annotated with a "+" or "-" indicating the predicted direction of its loading. Because we want to ensure that higher values of the EQI are associated with worse environmental quality, those variables that we anticipate being associated with poor environmental quality are assigned a "+", indicating that more of the attribute would be a negative for health. All variables except for SO2 and benzidine (in certain strata) loaded as intended; loadings for SO2 and benzidine were relatively low. Most variables loaded consistently across rural-urban strata.

Table 9. Variable loadings, valence determination of variables - Air domain
1,1,2-Trichloroethane ( 0.0007 0.0016 0.0687 0.0272 0.0224 0.0402 0.0273 0.0728 0.0398 0.1306 0.1665 0.1652 0.1626 0.1514 0.0443 0.0718 0.0794 0.0738 0.0798 -0.0036 -0.0141 -0.0200 -0.0535 -0.0221 0.1345 0.1215 0.1458 0.1745 0.1513 0.1054 0.1191 0.1220 0.1204 0.1278 0.1410 0.1208 0.1478 0.1551 0.1475 0.1120 0.1181 0.1131 0.1042 0.1179 Metropolitan-Urbanized (RUCC1 = 1167) Nonmetropolitan-Urbanized (RUCC2 = 306) Less Urbanized (RUCC3 = 1026) Thinly Populated (RUCC4 = 644) OVERALL (n=3143) 0.1120 0.0443 0.1410 0.1654 0.1181 0.0718 0.1208 0.1508 0.1131 0.0794 0.1478 0.1583 0.1042 0.0738 0.1551 0.1648 0.1179 0.0798 0.1475 0.1616

Table 9. continued.
Air Domain | Metropolitan-Urbanized (RUCC1 = 1167) | Nonmetropolitan-Urbanized (RUCC2 = 306) | Less Urbanized (RUCC3 = 1026) | Thinly Populated (RUCC4 = 644) | OVERALL (n=3143)
1,2-Dibromo-3-chloropropane (+) | 0.0722 | 0.0657 | 0.0416 | 0.0879 | 0.0688
1,2-Dichloropropane (+) | 0.1069 | 0.1090 | 0.1095 | 0.1143 | 0.1129
Acrylic acid (+) | 0.1714 | 0.1785 | 0.1727 | 0.1422 | 0.1661
Benzidine (+) | -0.0031 | 0.0023 | -0.0058 | 0.0592 | 0.0135
Benzyl chloride (+) | 0.1976 | 0.1926 | 0.1968 | 0.1850 | 0.1917
Beryllium compounds (+) | 0.1761 | 0.1460 | 0.1343 | 0.1688 | 0.1557
bis-2-Ethylhexyl phthalate (+) | 0.1046 | 0.1343 | 0.0872 | 0.1654 | 0.1192
Carbon tetrachloride (+) | 0.0649 | 0.1127 | 0.0761 | 0.1272 | 0.0823
Carbonyl sulfide (+) | 0.1524 | 0.1322 | 0.1439 | 0.1664 | 0.1580
Chlorine (+) | 0.1791 | 0.1972 | 0.1877 | 0.1775 | 0.1866
Chlorobenzene (+) | 0.2065 | 0.1810 | 0.1998 | 0.1995 | 0.2014
Chloroform (+) | 0.1880 | 0.1674 | 0.1705 | 0.1713 | 0.1740
Chloroprene (+) | 0.1724 | 0.1560 | 0.1479 | 0.1443 | 0.1537
Chromium compounds (+) | 0.2012 | 0.2010 | 0.2010 | 0.1676 | 0.1904
Cobalt compounds (+) | 0.2120 | 0.2223 | 0.2093 | 0.1908 | 0.2081
Cyanide compounds (+) | 0.1722 | 0.1532 | 0.2033 | 0.1910 | 0.1825
Dibutylphthalate (+) | 0.1923 | 0.2087 | 0.2029 | 0.1988 | 0.2000
Ethyl chloride (+) | 0.1890 | 0.2047 | 0.1830 | 0.1946 | 0.1875
Ethyl benzene (+) | 0.2407 | 0.2313 | 0.2343 | 0.2138 | 0.2306
Ethyl dichloride (+) | 0.1275 | 0.1183 | 0.1299 | 0.1500 | 0.1344
Glycol ethers (+) | 0.1882 | 0.1987 | 0.1965 | 0.1673 | 0.1884
Hydrazine (+) | 0.1219 | 0.1434 | 0.1261 | 0.1186 | 0.1246
Hydrochloric acid (+) | 0.1910 | 0.1987 | 0.2066 | 0.1974 | 0.1994
Isophorone (+) | 0.1597 | 0.1775 | 0.1630 | 0.1667 | 0.1647
Manganese compounds (+) | 0.1229 | 0.1369 | 0.1358 | 0.1187 | 0.1250
Methyl bromide (+) | 0.1404 | 0.0889 | 0.1183 | 0.1355 | 0.1247
Methyl chloride (+) | 0.1931 | 0.1905 | 0.1887 | 0.1756 | 0.1825
Phosphine (+) | 0.0041 | 0.0014 | 0.0054 | 0.0439 | 0.0089
Polychlorinated biphenyls (+) | 0.0971 | 0.1004 | 0.0933 | 0.1288 | 0.1040
Propylene dichloride (+) | 0.1585 | 0.1529 | 0.1349 | 0.1254 | 0.1428
Quinoline (+) | 0.1805 | 0.1881 | 0.1915 | 0.1560 | 0.1799
Trichloroethylene (+) | 0.2283 | 0.2288 | 0.2296 | 0.1995 | 0.2210
Vinyl chloride (+) | 0.1781 | 0.1577 | 0.1696 | 0.1767 | 0.1770

Water Domain
The loadings for the variables that comprise the water domain are displayed in Table 10. Each variable has been annotated with a "+" or "-" indicating the predicted direction of its loading. Because we want to ensure that higher values of the EQI are associated with worse environmental quality, those variables that we anticipate being associated with poor environmental quality are assigned a "+", indicating that more of the attribute would be a negative for health. The variables in the drought, chemical contamination, and drinking water quality constructs loaded in the direction intended; however, some of the variables in the remaining constructs loaded in the direction opposite to that intended.

Table 10. Variable loadings, valence determination of variables - Water domain
Water Domain | Metropolitan-Urbanized (RUCC1 = 1167) | Nonmetropolitan-Urbanized (RUCC2 = 306) | Less Urbanized (RUCC3 = 1026) | Thinly Populated (RUCC4 = 644) | Total (All = 3143)
Construct: Domestic Use
Percent of population on self-supply (+) | 0.0028 | 0.0155 | 0.0203 | 0.0279 | 0.0096
Percent of public supply population on surface water ( | 0.0197 | 0.0155 | -0.0004 | 0.0251 | 0.0191
Construct: Overall Water Quality
Percent of stream length impaired in county (+) | 0.0142 | -0.0174 | -0.0053 | 0.0160 | 0.0111
Construct: General Water Contamination
ALL NPDES permits per 1000 km of stream (+) | -0.0161 | -0.0415 | -0.0225 | 0.0164 | -0.0009
Construct: Atmospheric Deposition
Calcium precipitation weighted mean (+) | 0.0378 | 0.0199 | 0.0347 | -0.0039 | 0.0206
Potassium precipitation weighted mean (+) | -0.0108 | -0.0236 | -0.0075 | -0.0291 | -0.0204
Nitrate precipitation weighted mean (+) | 0.0239 | 0.0014 | 0.0182 | 0.0009 | 0.0140
Chloride precipitation weighted mean (+) | -0.0408 | -0.0329 | -0.0457 | -0.0077 | -0.0278
Sulfate precipitation weighted mean (+) | -0.0162 | -0.0217 | -0.0086 | 0.0209 | -0.0035
Total mercury deposition (+) | -0.0730 | -0.0632 | -0.0596 | 0.0015 | -0.0462
Construct: Drought
Percent of county drought - extreme (+) | 0.0066 | 0.0179 | 0.0008 | 0.0142 | 0.0084
Construct: Chemical Contamination
Arsenic (+) | 0.1669 | 0.1674 | 0.1605 | 0.1584 | 0.1641
Barium (+) | 0.1673 | 0.1684 | 0.1609 | 0.1628 | 0.1655
Cadmium (+) | 0.1460 | 0.1475 | 0.1533 | 0.1615 | 0.1523
Chromium (+) | 0.1661 | 0.1658 | 0.1592 | 0.1596 | 0.1636
Cyanide (+) | 0.1369 | 0.1383 | 0.1181 | 0.1230 | 0.1291
Fluoride (+) | 0.1736 | 0.1770 | 0.1804 | 0.1729 | 0.1765
Mercury (inorganic) (+) | 0.0634 | 0.0494 | 0.0478 | 0.0614 | 0.0575
Nitrate (+) | 0.1666 | 0.1600 | 0.1485 | 0.1417 | 0.1565
Nitrite (+) | 0.1356 | 0.1322 | 0.1212 | 0.1231 | 0.1298
Selenium (+) | 0.1661 | 0.1740 | 0.1644 | 0.1626 | 0.1663
Antimony (+) | 0.1639 | 0.1541 | 0.1538 | 0.1586 | 0.1597
Endrin (+) | 0.1392 | 0.1369 | 0.1387 | 0.1480 | 0.1412
Methoxychlor (+) | 0.1670 | 0.1650 | 0.1676 | 0.1752 | 0.1690
Dalapon (+) | 0.1462 | 0.1444 | 0.1409 | 0.1473 | 0.1449
Di(2-ethylhexyl) adipate (+) | 0.1614 | 0.1576 | 0.1568 | 0.1624 | 0.1605
Simazine (+) | 0.1674 | 0.1635 | 0.1651 | 0.1666 | 0.1671
Di(2-ethylhexyl) phthalate (+) | 0.1682 | 0.1607 | 0.1594 | 0.1580 | 0.1638
Picloram (+) | 0.1344 | 0.1301 | 0.1308 | 0.1445 | 0.1350
Dinoseb (+) | 0.1599 | 0.1570 | 0.1550 | 0.1591 | 0.1584
Atrazine (+) | 0.1758 | 0.1747 | 0.1738 | 0.1763 | 0.1759
2,4-Dichlorophenoxyacetic acid (+) | 0.1612 | 0.1695 | 0.1565 | 0.1671 | 0.1623
Benzo[a]pyrene (+) | 0.1578 | 0.1510 | 0.1538 | 0.1589 | 0.1561
Pentachlorophenol (+) | 0.1652 | 0.1622 | 0.1689 | 0.1715 | 0.1674
Polychlorinated biphenyls (+) | 0.1244 | 0.1169 | 0.1081 | 0.1189 | 0.1185
1,2-Dibromo-3-chloropropane (+) | 0.1606 | 0.1552 | 0.1622 | 0.1631 | 0.1613
Ethylene dibromide (+) | 0.0947 | 0.1043 | 0.1051 | 0.1035 | 0.1000
Xylenes (+) | 0.1685 | 0.1654 | 0.1790 | 0.1816 | 0.1744
Chlordane (+) | 0.1734 | 0.1755 | 0.1755 | 0.1763 | 0.1751
Dichloromethane (+) | 0.1877 | 0.1950 | 0.1986 | 0.1900 | 0.1921
p-Dichlorobenzene (+) | 0.1814 | 0.1886 | 0.1807 | 0.1814 | 0.1820
1,1,1-Trichloroethane (+) | 0.1885 | 0.1917 | 0.1977 | 0.1906 | 0.1920
Trichloroethylene (+) | 0.1893 | 0.1954 | 0.1992 | 0.1914 | 0.1932
Carbon tetrachloride (+) | 0.1919 | 0.1968 | 0.2008 | 0.1926 | 0.1951
Benzene (+) | 0.1880 | 0.1957 | 0.2008 | 0.1901 | 0.1929
Toluene (+) | 0.1839 | 0.1736 | 0.1908 | 0.1876 | 0.1859
Styrene (+) | 0.1822 | 0.1927 | 0.1980 | 0.1905 | 0.1896
Alpha particles (+) | 0.0670 | 0.0537 | 0.0609 | 0.0771 | 0.0639
cis-1,2-Dichloroethylene (+) | 0.1892 | 0.1958 | 0.1998 | 0.1904 | 0.1930
Total coliform proportion (+) | 0.0084 | -0.0088 | 0.0008 | 0.0105 | 0.0067

Land Domain
The loadings for the variables that comprise the mines construct of the land domain varied by RUCC (Table 11), but loadings for the variables that comprise the other constructs (agriculture, pesticides, radon, and facilities) were consistent across RUCCs. Each variable again has been annotated with a "+" or "-" indicating the predicted direction of its loading, to ensure that higher values of the EQI represent worse environmental quality.

Table 11. Variable loadings, valence determination of variables - Land domain
Land Domain | Metropolitan-Urbanized (RUCC1 = 1167) | Nonmetropolitan-Urbanized (RUCC2 = 306) | Less Urbanized (RUCC3 = 1026) | Thinly Populated (RUCC4 = 644) | Total (All = 3143)
Construct: Agriculture
Farms per acre (+) | 0.3742 | 0.3148 | 0.3275 | 0.3501 | 0.3487
Irrigated acreage (+) | 0.2750 | 0.1364 | 0.1789 | 0.1720 | 0.2109
Chemicals used to control nematodes (+) | 0.3127 | 0.2753 | 0.2883 | 0.3297 | 0.3070
Manure (+) | 0.3701 | 0.3049 | 0.3174 | 0.3561 | 0.3483
Chemicals used to control diseases in crops and orchards (+) | 0.3589 | 0.3384 | 0.3302 | 0.3420 | 0.3479
Chemicals used to defoliate/control growth/thin fruit (+) | 0.2796 | 0.2486 | 0.2630 | 0.3209 | 0.2793
Harvested acreage (+) | 0.4173 | 0.3943 | 0.4039 | 0.4074 | 0.4156
Animal units (+) | 0.1876 | 0.1135 | 0.1118 | 0.1603 | 0.1479
Construct: Pesticides
Fungicides (+) | 0.1055 | 0.2088 | 0.2125 | 0.0972 | 0.1582
Herbicides (+) | 0.2007 | 0.3285 | 0.3177 | 0.2388 | 0.2742
Insecticides (+) | 0.1759 | 0.2893 | 0.2604 | 0.1676 | 0.2272
Construct: Mines
Primarily coal mines, mines per county population (+) | -0.0220 | -0.0497 | -0.0966 | -0.0583 | -0.0611
Primarily metal mines, mines per county population (+) | -0.0836 | -0.2283 | -0.1961 | -0.2172 | -0.1754
Primarily nonmetal mines, mines per county population (+) | 0.0076 | -0.0798 | -0.0904 | -0.0676 | -0.0521
Primarily sand and gravel mines, mines per county population (+) | 0.1181 | -0.0229 | -0.0341 | 0.0058 | 0.0270
Primarily stone mines, mines per county population (+) | 0.0740 | -0.0971 | -0.1101 | -0.1088 | -0.0515
Construct: Radon
Radon zone (+) | -0.0680 | -0.0838 | -0.0517 | -0.1475 | -0.0827
Construct: Facilities
Facilities (+) | 0.1389 | 0.2361 | 0.1930 | 0.1322 | 0.1598

Sociodemographic Domain
The loadings for the variables that comprise the sociodemographic domain varied by RUCC (Table 12), indicating some variables were more influential on the domain score in urban counties, whereas others exerted more of an effect in rural counties.
For instance, percent unemployed loaded on the RUCC 1 sociodemographic domain at 0.16 compared with its loading on the RUCC 4 sociodemographic domain of 0.44. Each variable has been annotated with a "+" or "-" indicating the predicted direction of its loading. Because we want to ensure that higher values of the EQI are associated with worse environmental quality, those variables that we anticipate being associated with poor environmental quality are assigned a "+", indicating that more of the attribute would be a negative for health. Most of the variables initially loaded in the direction opposite to that intended. The loadings are a function of the program's starting point, or seed, which is not easily manipulable. Therefore, the loading valence needed to be corrected prior to the construction of the indices to ensure that higher values on a given index, and on the overall EQI, signify worse environmental quality. One important item to note is that the patterns of association within the socioeconomic construct across RUCC levels were not consistent. For instance, percent Democratic voting in the 2008 election loaded negatively in the most urban counties (RUCC 1 and 2) but positively in the less urban counties (RUCC 3 and 4). Percent of individuals earning a bachelor's degree, percent unemployed, percent of families in poverty, median household value, and creative class are variables that loaded in a consistent direction across rural-urban strata. Appendix V provides the original and modified valence corrected variable loadings.

Table 12. Valence corrected variable loadings, valence determination of variables - Sociodemographic domain
Sociodemographic Domain | Metropolitan-Urbanized (RUCC1 = 1167) | Nonmetropolitan-Urbanized (RUCC2 = 306) | Less Urbanized (RUCC3 = 1026) | Thinly Populated (RUCC4 = 644) | OVERALL (n=3143)
Socioeconomic Construct
Percent bachelor's degree (-) | -0.4689 | -0.4621 | -0.4174 | -0.4416 | -0.4585
Percent unemployed (+) | 0.1625 | 0.3274 | 0.3546 | 0.4418 | 0.1269
Percent families less than poverty level (+) | 0.2591 | 0.4293 | 0.4737 | 0.4904 | 0.298
Percent vacant housing (+) | 0.2306 | -0.1331 | -0.0555 | -0.1381 | 0.1979
Median household value (-) | -0.4034 | -0.4002 | -0.3476 | -0.2216 | -0.4331
Household income (-) | -0.3700 | -0.0874 | -0.0640 | 0.2578 | -0.3824
Count of occupants per room (+) | 0.0055 | 0.1371 | 0.1116 | -0.0141 | 0.1085
Percent renter-occupied housing (+) | -0.1827 | 0.0141 | 0.1523 | 0.0603 | -0.1458
Gini coefficient (+) | -0.1162 | 0.1604 | 0.2725 | 0.2766 | 0.0118
Crime Construct
Log violent crime (+) | -0.0094 | 0.2386 | 0.2997 | 0.2012 | -0.0234
Creative class construct
Creative class (-) | -0.4668 | -0.4463 | -0.3829 | -0.2458 | -0.4833
2008 Political valence construct
Percent Democratic (-) | -0.2625 | -0.0929 | 0.0374 | 0.2313 | -0.211

Built Domain
Similar to the sociodemographic domain, the loadings for the variables that comprise the built domain varied by RUCC (Table 13), indicating some variables were more influential on the domain score in urban counties, whereas others exerted more of an effect in rural counties. Each variable again has been annotated with a "+" or "-" indicating the predicted direction of its loading, to ensure that higher values of the EQI represent worse environmental quality. Also, similar to the sociodemographic domain, many of the initial variable loadings were opposite to those intended. These loadings needed to be valence corrected prior to the construction of the indices to ensure that higher values on a given index, and on the overall EQI, signify worse environmental quality.
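
The idea behind valence correction can be illustrated with a short sketch. The sign of a principal component is arbitrary, so multiplying every loading on that component by -1 describes the same component with the opposite orientation. The code below assumes a simple rule, which is an illustration only: flip the whole loading vector when it disagrees, on balance, with the predicted "+"/"-" directions. The function name `orient_loadings`, the variable names, and the toy loadings are hypothetical, and the rule actually applied in the report is the one documented in Appendix V, which may differ.

```python
def orient_loadings(loadings, expected_sign):
    """Flip a whole loading vector if it disagrees with the predicted valence.

    loadings: dict mapping variable name -> loading on one component.
    expected_sign: dict mapping variable name -> +1 (more is worse)
        or -1 (more is better).
    Agreement is weighted by loading magnitude; that weighting is an
    assumption of this sketch, not a documented EQI rule.
    """
    agreement = sum(loadings[v] * expected_sign[v] for v in loadings)
    flip = -1.0 if agreement < 0 else 1.0
    return {v: flip * value for v, value in loadings.items()}


# Hypothetical initial loadings that came out with the reverse orientation
initial = {"pct_unemployed": -0.44, "pct_poverty": -0.49, "pct_bachelors": 0.44}
predicted = {"pct_unemployed": +1, "pct_poverty": +1, "pct_bachelors": -1}

corrected = orient_loadings(initial, predicted)
print(corrected)
```

Because the whole vector is multiplied by the same constant, the relative contribution of each variable is unchanged; only the direction in which the index reads is reversed.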
The business-related environments loaded consistently across RUCC levels, as did public transportation, commute time, and walkability score (Table 13). Appendix V provides the original and modified valence corrected variable loadings.

Table 13. Valence corrected variable loadings, valence determination of variables - Built domain
Built Domain | Metropolitan-Urbanized (RUCC1 = 1167) | Nonmetropolitan-Urbanized (RUCC2 = 306) | Less Urbanized (RUCC3 = 1026) | Thinly Populated (RUCC4 = 644) | OVERALL (n=3143)
Socioeconomic Construct
Vice-related environment (+) | -0.2676 | -0.0331 | -0.2724 | -0.2595 | -0.2930
Civic-related environment (-) | -0.1238 | -0.2057 | -0.1890 | -0.3102 | -0.3071
Education-related environment (-) | -0.2409 | -0.2626 | -0.3278 | -0.3285 | -0.3495
Health care-related environment (-) | -0.4189 | -0.3856 | -0.3179 | -0.2742 | -0.2798
Negative food environment (+) | -0.3239 | -0.2707 | -0.2306 | -0.1527 | -0.2280
Positive food environment (-) | -0.3405 | -0.2752 | -0.2660 | -0.2524 | -0.3179
Recreation environment (-) | -0.2354 | -0.3484 | -0.3212 | -0.3222 | -0.3590
Social service-related environment (-) | -0.3446 | -0.3503 | -0.3644 | -0.2793 | -0.3629
Highway safety construct
Traffic fatality rate (+) | -0.1978 | 0.2340 | 0.2197 | 0.2312 | 0.1751
Housing construct
Rate of low-rent + Section 8 housing (+) | 0.1230 | -0.0459 | -0.0697 | 0.0178 | -0.0581
Road construct
Proportion of secondary roads (+) | -0.0950 | 0.1319 | 0.1761 | 0.2054 | 0.1777
Commuting behavior construct
Commute time (+) | 0.1886 | 0.2808 | 0.3230 | 0.3546 | 0.3329
Public transportation (-) | -0.2253 | -0.1111 | -0.0777 | -0.0256 | -0.0463
Walkability construct
Walkability score (-) | -0.3516 | -0.3310 | -0.3542 | -0.3787 | -0.1585
Green space construct
Proportion green space (-) | 0.1065 | -0.0253 | 0.0418 | 0.1370 | 0.0451

Changes to 2006-2010 index construction from original 2000-2005 EQI

Valence Assignment
The sole modification to the PCA methodology in the county 2006-2010 EQI compared to that of the 2000-2005 EQI is "valence correction." We also have created a 2000-2005 valence corrected version of the EQI.
The loading pattern for the air domain, which comprises established pollutants, served as the reference for our index orientation. The vast majority of variables for the air domain loaded "+" for both the overall United States and across the rural-urban continuum. Thus, valence correction, where needed, oriented variables with known poor environmental attributes toward "+" loadings. Valence correction was applied only to the sociodemographic and built-environment domains. This is because only the sociodemographic and built domains had variables that were assigned as poor environmental attributes that loaded initially as "-". For instance, we were reasonably certain that a high percentage of unemployed residents per county (a variable in the sociodemographic domain) is anticipated to have deleterious effects (and, therefore, could be assigned a "+" loading sign based on our determined index orientation). Appendix V provides the modified loadings, when applicable, along with the rationale for valence correction.

Comparison of 2000-2005 EQI to the 2000-2005 valence corrected EQI
To assess the impact of valence correction, we computed Pearson and Spearman correlation coefficients between the non-valence-corrected and valence-corrected 2000-2005 EQI. For the overall EQI, both the Pearson and Spearman correlation coefficients were roughly 1. For RUCC1, both were 0.99. For RUCC2, the Pearson correlation coefficient was 0.99, whereas the Spearman correlation coefficient was 0.98. For RUCC3, the Pearson and Spearman correlation coefficients were -0.97 and -0.96, respectively. And, finally, for RUCC4, they were -0.97 and -0.97, respectively.

Comparison of 2000-2005 valence corrected EQI to the 2006-2010 EQI
We additionally computed Pearson and Spearman correlation coefficients between the valence corrected 2000-2005 EQI and the 2006-2010 EQI.
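
The Pearson and Spearman comparisons described in this section can be sketched as follows, using only the Python standard library. The function names and the five toy index values are hypothetical and are not EQI results; Spearman's coefficient is computed here as the Pearson correlation of average ranks, which is the standard definition when ties are present.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical index values for five counties; the second index is the
# first with its orientation reversed.
eqi_a = [-1.2, -0.3, 0.1, 0.8, 1.5]
eqi_b = [-v for v in eqi_a]
print(pearson(eqi_a, eqi_b), spearman(eqi_a, eqi_b))
```

In this toy case the second index is simply the first with its sign reversed, so both coefficients equal -1 up to floating-point rounding. A coefficient close to -1 is therefore what a pure reversal of orientation produces, whereas a coefficient of small magnitude indicates that the two indices rank counties differently.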
The domain-specific loadings for the overall EQI differed over the two time periods in terms of magnitude, rank, and direction. These differential loadings contributed to the relatively low correlation between the 2000-2005 and 2006-2010 periods. For the overall EQI, the Pearson and Spearman correlation coefficients were both 0.34. For RUCC1, they were -0.71 and -0.72, respectively. For RUCC2, the Pearson correlation coefficient was -0.35, whereas the Spearman correlation coefficient was -0.37. For RUCC3, the Pearson and Spearman correlation coefficients were 0.64 and 0.69, respectively. And, finally, for RUCC4, they were 0.57 and 0.59, respectively. The loadings may have differed over the two time periods because of the inputs that were included in the domains, valence correction procedures, and potential changes in environmental quality. It is for these reasons that we recommend the two indices not be compared over time.

Domain-Specific Index Description and Loadings on Overall EQI

The means, standard deviations, and ranges for each domain-specific index are presented in Table 14. As expected, each domain index had a mean of 0 and a standard deviation of 1. In examining the ranges of each RUCC-stratified index, the more negative the minimum, the better the environmental quality, whereas the larger the maximum value, the worse the environmental quality. In general, higher values of each domain's index were found in the more metropolitan areas, and the maximum values went down as counties became more thinly populated.

Table 14. Description of the domain indices contributing to the overall and rural-urban continuum codes (RUCCs) stratified Environmental Quality Index for 3143 U.S.
counties (2006-2010)

                                              Mean   Standard Deviation   Minimum   Maximum
All Counties (n=3143)
  Air Environment Index                  -4.39E-10                    1     -6.72      3.71
  Water Environment Index                -3.48E-12                    1     -1.46      2.05
  Land Environment Index                 -9.70E-10                    1     -4.54      1.84
  Built-Environment Index                 1.20E-09                    1     -4.71      5.66
  Sociodemographic Environment Index     -2.11E-11                    1     -5.13      2.76
Metropolitan-Urbanized (n=1167)
  Air Environment Index                  -2.20E-10                    1     -7.29      3.68
  Water Environment Index                -1.38E-09                    1     -1.48      1.93
  Land Environment Index                  1.28E-09                    1     -4.30      1.80
  Built-Environment Index                -1.93E-09                    1     -3.62      7.29
  Sociodemographic Environment Index     -7.23E-10                    1     -4.28      2.78
Non-Metropolitan-Urbanized (n=306)
  Air Environment Index                  -2.96E-09                    1     -2.92      2.37
  Water Environment Index                -1.59E-09                    1     -1.61      1.56
  Land Environment Index                 -2.11E-09                    1     -3.86      1.62
  Built-Environment Index                -2.34E-09                    1     -3.50      3.28
  Sociodemographic Environment Index     -1.45E-10                    1     -4.14      2.84
Less Urbanized (n=1026)
  Air Environment Index                   8.32E-10                    1     -2.67      3.31
  Water Environment Index                 2.94E-10                    1     -3.95      2.37
  Land Environment Index                  7.79E-10                    1     -3.88      1.61
  Built-Environment Index                 6.18E-10                    1     -3.22      3.77
  Sociodemographic Environment Index      7.34E-10                    1     -4.79      3.64
Thinly Populated (n=644)
  Air Environment Index                   1.40E-09                    1     -5.69      2.17
  Water Environment Index                 1.30E-10                    1     -1.21      1.96
  Land Environment Index                  5.36E-10                    1     -4.32      1.51
  Built-Environment Index                -4.06E-10                    1     -2.64      4.20
  Sociodemographic Environment Index     -1.17E-09                    1     -3.51      3.81

Description of Overall EQI

The pattern of association for the domain-specific loadings differed by rural-urban status (Table 15). In the most urban areas, RUCC1, the sociodemographic and built-environment domains were both influential, as indicated by their loading values (0.68 and 0.67, respectively), followed by the land domain (0.23). For the nonmetropolitan-urbanized areas (RUCC2), the sociodemographic and built domains loaded similarly on the overall EQI (0.58 and 0.53, respectively), followed more closely by the air domain. In all but the overall EQI, the water domain was least influential, based on its low PCA coefficients. In the most thinly populated counties, RUCC4, the water and land domains were characterized by the lowest loadings (0.13 and 0.14, respectively), whereas the built, sociodemographic, and air domains were the most influential (loadings of 0.60, 0.56, and 0.54, respectively). For the overall (unstratified) EQI, the built and the air domains loaded approximately equally, and, unlike the loadings observed on the RUCC-stratified EQIs, the sociodemographic domain was relatively unimportant to the overall quality.
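
The second-stage step described above, in which five domain indices are combined into one overall index through the first principal component, can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's implementation: the data are synthetic, the function names are hypothetical, and orienting the component so that the first (air) column loads positively is an assumed stand-in for the valence correction the report describes.

```python
import numpy as np

def first_principal_component(x):
    """Return unit-length loadings and scores of the first principal component."""
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)   # standardize each column
    corr = np.corrcoef(z, rowvar=False)                # correlation matrix of the columns
    eigenvalues, eigenvectors = np.linalg.eigh(corr)   # eigenvalues in ascending order
    loadings = eigenvectors[:, -1]                     # eigenvector of the largest eigenvalue
    return loadings, z @ loadings

def orient(loadings, scores, reference_column):
    """Flip signs so the reference column loads positive (assumed orientation rule)."""
    if loadings[reference_column] < 0:
        return -loadings, -scores
    return loadings, scores

# Synthetic stand-in for five domain indices (air, water, land, built, sociodemographic).
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
weights = np.array([0.8, 0.2, 0.4, 0.7, 0.6])          # hypothetical strengths
domains = latent[:, None] * weights + rng.normal(size=(500, 5))

loadings, scores = first_principal_component(domains)
loadings, scores = orient(loadings, scores, reference_column=0)
index = (scores - scores.mean()) / scores.std(ddof=1)  # rescale to mean 0, SD 1

print("loadings:", np.round(loadings, 4))
print("index mean and SD:", index.mean(), index.std(ddof=1))
```

Because floating-point arithmetic is inexact, the mean of such a standardized index is typically a very small nonzero number rather than exactly zero, which is consistent with the means on the order of 1E-09 to 1E-12 reported in Table 14.
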
Similar to the loadings for each domain, the loadings for each RUCC-stratified EQI were valence corrected to ensure that a higher EQI score corresponds to worse environmental quality. Appendix VI contains county mapping of the overall EQI 2006-2010 and RUCC-stratified domain-specific indices.

Table 15. Loadings of the domain indices contributing to the overall and rural-urban continuum codes (RUCCs) stratified Environmental Quality Index for 3143 U.S. counties (2006-2010)

                                     Coefficient/Loading   95% Confidence Interval
Overall (n=3143)
  Air Domain                                      0.6678   0.6238, 0.7118
  Water Domain                                    0.2209   0.0940, 0.3479
  Land Domain                                     0.3038   0.2054, 0.4021
  Built-Environment Domain                        0.6240   0.5582, 0.6898
  Sociodemographic Domain                        -0.1536   -0.2966, -0.0107
Metropolitan-Urbanized RUCC1 (n=1167)
  Air Domain                                     -0.1280   -0.2414, -0.0146
  Water Domain                                   -0.0906   -0.2522, 0.7010
  Land Domain                                     0.2340   0.0856, 0.3824
  Built-Environment Domain                        0.6730   0.6377, 0.7083
  Sociodemographic Domain                         0.6839   0.6476, 0.7201
Nonmetropolitan Urbanized Areas RUCC2 (n=306)
  Air Domain                                      0.4128   0.2771, 0.5484
  Water Domain                                   -0.2407   -0.4204, -0.0611
  Land Domain                                     0.3926   0.2514, 0.5337
  Built-Environment Domain                        0.5274   0.4136, 0.6414
  Sociodemographic Domain                         0.5825   0.4939, 0.6712
Less Urbanized Areas RUCC3 (n=1026)
  Air Domain                                      0.4785   0.4049, 0.5520
  Water Domain                                   -0.1569   -0.2693, -0.0445
  Land Domain                                     0.1769   0.0672, 0.2866
  Built-Environment Domain                        0.6370   0.5939, 0.6802
  Sociodemographic Domain                         0.5562   0.4939, 0.6184
Thinly Populated RUCC4 (n=644)
  Air Domain                                      0.5402   0.4809, 0.5994
  Water Domain                                    0.1323   0.0177, 0.2469
  Land Domain                                     0.1430   0.0233, 0.2627
  Built-Environment Domain                        0.5960   0.5469, 0.6450
  Sociodemographic Domain                         0.5612   0.5064, 0.6160

4.0 Discussion

This report describes the efforts to update the Environmental Quality Index (EQI) for all counties in the United States for the 2006-2010 period.
The EQI was created for two main purposes: (1) as an indicator of ambient conditions/exposure in environmental health modeling and (2) as a covariate to adjust for ambient conditions in environmental models. However, with the public release of the EQI and the variables used to construct it, other uses may emerge. The methods applied provide a reproducible approach that capitalizes almost exclusively on publicly available data sources. The EQI holds promise for improving environmental estimation in public health. The EQI describes the ambient county-level conditions to which residents are exposed, whether they are at home, at school, or at work, provided these multiple human activity spaces occur in the same county.

Since the creation of the EQI 2000-2005, multiple studies have examined the relationship between overall environmental quality and health outcomes, including preterm birth[3], mortality[4], cancer incidence[5], asthma prevalence[6], physical inactivity and obesity[7], infant mortality[8], and pediatric multiple sclerosis[9]. A complete list of references related to the EQI and health outcomes is provided in Appendix I. With the updated EQI 2006-2010, the hope is that the EQI can continue to help public health researchers investigate the cumulative impact of diverse constructs that typically are viewed in isolation.

Each of the domain-specific pieces of information that contributes to the EQI is also informative. Because most environmental health practice occurs on a domain-specific basis, this domain-specific information may be important to policymakers and environmental health practitioners. The domain-specific loadings to the EQI indicate which of the environmental domains accounts for the largest portion of the variability in the EQI; in essence, these loadings answer the question about which domain is making the biggest contribution to the total environment.
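
As a small worked example of reading loadings this way, the sketch below ranks the five domains by the absolute value of their loadings on the overall (n=3143) EQI, using the coefficients reported in Table 15. Treating the squared loadings as each domain's share of the component weights is an assumption made here for illustration; it relies on the loadings forming an approximately unit-length vector, which these five values do, and it is not a calculation described in the report.

```python
# Domain loadings on the overall EQI (n=3143), as reported in Table 15.
overall_loadings = {
    "Air": 0.6678,
    "Water": 0.2209,
    "Land": 0.3038,
    "Built-Environment": 0.6240,
    "Sociodemographic": -0.1536,
}

# Rank domains by the magnitude of their loading, largest first.
ranked = sorted(overall_loadings, key=lambda d: abs(overall_loadings[d]), reverse=True)

# Squared loadings of a unit-length vector sum to 1, so they can be read as shares
# (illustrative interpretation, labeled as an assumption in the text above).
total = sum(v * v for v in overall_loadings.values())
shares = {d: v * v / total for d, v in overall_loadings.items()}

for domain in ranked:
    print(f"{domain:18s} loading={overall_loadings[domain]:+.4f} share={shares[domain]:.3f}")
```
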
In addition, the variable loadings on each of the domains are also informative for the same reason.

The development of the EQI 2006-2010 followed mostly the same protocol as the EQI 2000-2005. Most of the constructs and the data sources identified for each of the five domains in the EQI 2000-2005 were maintained. Principal components analysis was used to develop the indices. However, using lessons learned from the creation of the EQI 2000-2005, some modifications were adopted to improve the EQI 2006-2010.

Summary of changes made to 2006-2010 version compared with 2000-2005

Modifications to the EQI 2006-2010 included exploring new data sources that were not available during EQI 2000-2005 development, assessing all variables for continued inclusion in the EQI, and assessing variables' valence within a domain and applying valence correction. Most constructs were carried over from the EQI 2000-2005 to the updated EQI 2006-2010; the exceptions were one deletion each in the water and land domains and constructs added to the water, land, sociodemographic, and built-environment domains. For data sources, we added seven new data sources and discontinued use of one data source. Lastly, we assessed the valence of each domain to ensure that the orientation of the PCA output would be uniform, both for interpretation of the domain indices and as input into the second PCA.

Strengths and Limitations

Because modifications were made to the updated EQI 2006-2010, direct comparisons between EQI 2000-2005 and EQI 2006-2010 should not be made. The two indices should not be examined as being continuous over time (e.g., if a study period covers 2004-2007, only one of the indices should be chosen, or the study population should be stratified by time period matched to the appropriate EQI).
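
The stratification suggested in the example above (a study period of 2004-2007 split by the matching EQI version) can be sketched as follows. The function and variable names are hypothetical, and the year boundaries simply restate the two EQI periods named in this report.

```python
from typing import Dict, List, Optional

# The two EQI periods named in this report.
EQI_PERIODS = {
    "EQI 2000-2005": (2000, 2005),
    "EQI 2006-2010": (2006, 2010),
}

def eqi_version_for_year(year: int) -> Optional[str]:
    """Return the EQI version whose period contains the year, or None if neither does."""
    for name, (start, end) in EQI_PERIODS.items():
        if start <= year <= end:
            return name
    return None

def stratify_by_eqi(years: List[int]) -> Dict[Optional[str], List[int]]:
    """Group study years by the EQI version that matches each year."""
    strata: Dict[Optional[str], List[int]] = {}
    for year in years:
        strata.setdefault(eqi_version_for_year(year), []).append(year)
    return strata

study_years = [2004, 2005, 2006, 2007]
strata = stratify_by_eqi(study_years)
print(strata)
```

Years outside both periods fall into a `None` stratum, which makes unmatched observations visible instead of silently assigning them to an index.
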
The EQI offers a comprehensive measure of environmental quality for all counties in the United States and comprises many of the best environmental measures currently available. The EQI can be used as an ambient exposure metric to help identify environmental issues related to community health. It provides information on overall environmental exposures faced in a community. In addition, because data sources were used for all U.S. counties, the EQI is comparable across counties to help identify areas of better and worse overall environmental quality. The development of domain-specific indices enables counties to assess the drivers of poor environmental quality in their county. Additionally, because it is comparable across counties, areas that are burdened most by poor environmental quality can be identified. Finally, the EQI can be used in a variety of environmental health research activities as a control variable to adjust for overall environmental exposure, while trying to isolate a specific effect. Such a control variable will provide better estimates of effects by reducing confounding by co-occurring environmental factors.

The EQI is a national-level index that potentially can provide a better understanding of how multiple environmental conditions affect U.S. counties. At its current county-level scale, the EQI may not reveal environmental injustices seen at the local community level. However, it does highlight those counties experiencing an increased burden of environmental impacts. Further, the EQI can contribute to environmental justice endeavors by describing the process by which EQI data were obtained and how the EQI was constructed and by indicating the Web sites containing available data that can be used to construct indices at different levels of aggregation. The EQI can be a tool for interested investigators to consider constructing local EQIs and adding relevant local-level data for more focused comparisons.
Use of the EQI as a measure of exposure assumes exposure to "environment" is consistent for all individuals, but the extent of individual environmental exposure was not assessable. The EQI was focused solely on the outside environment, which may not be the most relevant exposure in relation to human health and disease. Finally, population-level analyses offer little predictive utility for individual-level risk. Therefore, although the index may be useful for identifying less healthy county environments, it will not be useful for predicting individual-level adverse outcomes. The EQI was developed for research purposes and is not meant to be a diagnostic tool. The EQI would be useful to identify potential areas of concern for counties to target future research, but it should not be used for regulatory purposes.

Data

Data sources evaluated represented each of the five environmental domains. Each data source was reasonably well documented. Despite finding a considerable number of data sources applicable to each environmental domain, significant data gaps exist. The data used to create the index balanced quality measurement with geographic breadth of coverage. Therefore, the index does a solid job estimating the ambient environment but may be less useful for estimating specific environments (e.g., in a particular noncounty location in the United States at a specific time). Not all relevant environmental exposures necessarily were included in the index. Data inclusion was dependent on data collection and coverage; if relevant data were not being collected, the information was not captured in the EQI. Relatedly, in areas where little data collection occurs, the data may overrepresent the environmental profile of those areas. For example, a county that contains a National Park without data collected and a town with data collection will be represented solely by the town data, although that may be inaccurate for the entire county.
Conversely, environments with a wealth of environmental measurements, like urban areas, will be better estimated by the EQI.

Environmental data sources often are plagued by inadequate spatial and temporal coverage. Most of the data sources obtained for the EQI required spatial interpolation to achieve county-level estimates. For example, even with extensive air monitoring networks, the measured spatial coverage of the United States was incomplete, particularly in rural areas. Some types of measures were located disproportionately in urban areas (e.g., PM air pollution), whereas other sorts are found in rural areas (e.g., industrial livestock operations). The nonrandom distribution of environmental risk meant that virtually all interpolated data were inaccurate, impairing the assessment of how pollutants differentially impacted urban and rural areas.

From a human health perspective, probably the biggest limitation to existing environmental data sources is that data are collected with little thought given to potential health impacts. For instance, monitoring sites may collect relevant air pollutant data, but their location (e.g., air monitors located on top of buildings) is inappropriate for assessing the street-level values to which humans are exposed. Pesticide data, from the land domain, usually report pesticide sales in relation to crops and livestock, not application, handling, or disbursement. Even the United States Census, which is widely used in health research, primarily is collected for tax and political districting purposes. Some of the data sources identified have not been used in human health research and, as such, are a limitation. Regularly collected, high-quality data that consider probable human health impacts would make the task of assessing differential exposures considerably easier.

Environmental data also were collected rarely with adequate temporal frequency.
Although data on some parameters were collected on a consistent and frequent basis, the majority were not. Water data, for instance, were collected only sporadically in response to a particular query or based on regulatory statute. Within the sociodemographic domain, the complete United States Census was collected decennially, which limits investigators' capacity to explore temporal changes. Some characteristics of places can change rapidly, but, under current data collection schedules, these changes cannot be assessed. Initially, the EQI sought to estimate yearly measures. However, ultimately, only the 5-year (2006-2010) and 6-year (2000-2005) measures were created because of the lack of yearly data for some of the variables.

Many environmental parameters were compiled at a smaller unit of aggregation (e.g., for a municipality or city), and most were not maintained in a single source, such as a data repository. Although national repositories for some domains exist (e.g., water, air), often in response to federal regulations, no built-environment repository exists (for transit, walkability/physical activity, street connectivity, presence of sidewalks, or pedestrian lighting measures). Localities with limited funds may not be motivated or able to collect these data.

PCA Methodology

The use of PCA was not without limitations. Normality is an important assumption for PCA, and not all the data were distributed normally in their raw form. Many of the nonnormal variables were those with a substantial number of meaningful zeros (e.g., there were no public housing units contained within these counties). This "absence" of attribute is important information to convey, and, yet, it was problematic from a score-construction perspective. Although transforming the data improved their distribution, it reduced each variable's interpretability. A PCA-derived score also can be challenging to interpret. Outliers in the data also can be a limitation.
However, with 3143 counties and normality checks, this is less problematic in the EQI.

Despite these limitations, the use of PCA was also an important strength of this project. PCA provided a means to overcome one of the significant limitations in the field of environmental health and combine multiple environmental domains into one index of ambient environmental quality; the whole endeavor would not have been possible without this data reduction strategy. The resulting scale is standardized, which will facilitate its comparison to other scales constructed in different countries or at different units of aggregation. Further, it is the approach that has been used in other scale or score construction activities[65, 66].

Conclusion

The updated EQI 2006-2010 was constructed for all counties (n=3143) in the United States, incorporating data for five environmental domains, (1) air, (2) water, (3) land, (4) built, and (5) sociodemographic, and stratified by RUCCs. Mostly, the same reproducible approach used to create EQI 2000-2005 also was used to create EQI 2006-2010, with some noted changes that incorporate lessons learned from the first version. The EQI will be used as a measure in environmental health research. This broad-based effort acknowledges the many factors that together impact environmental quality and, more generally, recognizes that these factors work together to impact public health. Updates to the EQI for future years are planned, and the research team is actively creating a census tract version as a first step to explore other, finer spatial aggregations.

5.0 References

1. United States Environmental Protection Agency (EPA), Creating an Overall Environmental Quality Index - Technical Report. 2014. National Health and Environmental Effects Research Laboratory: Chapel Hill, NC.
2. United States Environmental Protection Agency (EPA), EPA's 2008 Report on the Environment. 2008: Washington, DC.
3.
Rappazzo, K.M., et al., The associations between environmental quality and preterm birth in the United States, 2000-2005: A cross-sectional analysis. Environ Health, 2015. 14: p. 50.
4. Jian, Y., et al., Associations between Environmental Quality and Mortality in the Contiguous United States, 2000-2005. Environ Health Perspect, 2017. 125(3): p. 355-362.
5. Jagai, J.S., et al., County-level cumulative environmental quality associated with cancer incidence. Cancer, 2017. 123(15): p. 2901-2908.
6. Gray, C.L., et al., Associations between environmental quality and adult asthma prevalence in medical claims data. Environ Res, 2018. 166: p. 529-536.
7. Gray, C.L., et al., The association between physical inactivity and obesity is modified by five domains of environmental quality in U.S. adults: A cross-sectional study. PLoS One, 2018. 13(8): p. e0203301.
8. Patel, A.P., et al., Associations between environmental quality and infant mortality in the United States, 2000-2005. Arch Public Health, 2018. 76: p. 60.
9. Lavery, A.M., et al., Examining the contributions of environmental quality to pediatric multiple sclerosis. Mult Scler Relat Disord, 2017. 18: p. 164-169.
10. United States Environmental Protection Agency (EPA). Air Quality System Data Mart. The Ambient Air Monitoring Program. 2010.
11. United States Environmental Protection Agency (EPA), National Air Toxics Assessments. 2005.
12. United States Environmental Protection Agency (EPA), Watershed Assessment, Tracking, and Environmental Results (WATERS). 2010.
13. National Atmospheric Deposition Program (NADP), National Atmospheric Deposition Program. 2010.
14. United States Geological Survey (USGS), Estimated Use of Water in the United States. 2010.
15. United States Drought Monitor (USDM), Drought Monitor Data Downloads. 2010.
16. United States Environmental Protection Agency (EPA), National Contaminant Occurrence Database (NCOD). 2005.
17. United States Environmental Protection Agency (EPA), Safe Drinking Water Information System. 2010.
18.
Stone, W.W., Estimated annual agricultural pesticide use for counties of the conterminous United States, 1992-2009. 2013, U.S. Geological Survey.
19. United States Department of Agriculture (USDA), 2007 Census of Agriculture full report. 2009.
20. United States Environmental Protection Agency (EPA), EPA Geospatial Data Download Service. 2017.
21. United States Environmental Protection Agency (EPA), Map of radon zones. 2017.
22. United States Department of Labor, Mine Safety and Health Administration (MSHA), Mines Data Set. 2017.
23. United States Geological Survey (USGS), National Geochemical Survey. 2006.
24. United States Census Bureau, American FactFinder. 2017.
25. Federal Bureau of Investigation (FBI), Uniform Crime Reports. 2014.
26. Leip, D., Dave Leip's Atlas of U.S. Presidential Elections. 2016.
27. United States Department of Agriculture (USDA), Economic Research Service (ERS) Creative Class County Codes. 2017.
28. Dun and Bradstreet, Dun and Bradstreet Products. 2017.
29. United States Census Bureau, Topologically Integrated Geographic Encoding and Referencing. 2017.
30. HERE. NAVTEQ traffic mapping. 2019 [cited 2019 April 2]; Available from: https://www.here.com/navteq.
31. National Highway Traffic Safety Administration (NHTSA), National Center for Statistics and Analysis, Fatality Analysis Reporting System (FARS). 2017.
32. United States Department of Housing and Urban Development (HUD), Multifamily Assistance and Section 8 Contracts Database. 2017.
33. United States Environmental Protection Agency (EPA), EnviroAtlas Green space dataset. 2017.
34. Homer, C., et al., Completion of the 2011 National Land Cover Database for the Conterminous United States - representing a decade of land cover change information. 2015. 81(5): p. 345-354.
35. National Oceanic and Atmospheric Administration (NOAA), Office for Coastal Management, Coastal Change Analysis Program (C-CAP) Regional Land Cover. 2017.
36. United States Environmental Protection Agency (EPA), National Walkability Index (NWI). 2017.
37. United States Environmental Protection Agency (EPA).
National Emissions Inventory. 2019 [cited 2019 April 2]; Available from: https://www.epa.gov/air-emissions-inventories/national-emissions-inventory-nei.
38. United States Geological Survey (USGS). National Hydrography Dataset. 2019 [cited 2019 April 2]; Available from: https://www.usgs.gov/core-science-systems/ngp/national-hydrography.
39. United States Environmental Protection Agency (EPA). Reach Address Database. 2010 [cited 2013 May 31]; Available from: http://www.epa.gov/waters/doc/rad/index.html.
40. United States Environmental Protection Agency (EPA). EPA Report on the Environment. 2019 [cited 2019 April 2]; Available from: https://www.epa.gov/report-environment.
41. Multi-Resolution Land Cover Characteristics (MRLC) Consortium. 2019 [cited 2019 April 2]; Available from: https://www.mrlc.gov/.
42. Cressie, N., The origins of kriging. Mathematical Geology, 1990. 22(3): p. 239-252.
43. Tabachnick, B.G., Fidell, L.S., Using Multivariate Statistics. 5th ed. 2007, Boston: Pearson Allyn and Bacon.
44. Clean Water Act of 1972.
45. United States Environmental Protection Agency (EPA). National Pollutant Discharge Elimination System (NPDES). December 12, 2018 [cited 2019 April 3]; Available from:
46. Kellogg, R.L., Lander, C.H., Moffitt, D.C., Gollehon, N., Manure Nutrients Relative to the Capacity of Cropland and Pastureland to Assimilate Nutrients: Spatial and Temporal Trends for the United States. 2000, United States Department of Agriculture.
47. Baker, N.T., Stone, W.W., Estimated annual agricultural pesticide use for counties of the conterminous United States, 2008-12. 2014: U.S. Department of the Interior, U.S. Geological Survey.
48. United States Environmental Protection Agency (EPA). Assessment, Cleanup, and Redevelopment Exchange (ACRES) Brownfield Sites. 2010 [cited 2010 August 26]; Available from: http://www.epa.gov/brownfields/.
49. United States Environmental Protection Agency (EPA). Superfund National Priorities List (NPL) Sites.
2010; Available from: http://www.epa.gov/superfund/sites/npl/index.htm.
50. United States Environmental Protection Agency (EPA). Section Seven Tracking System (SSTS) Pesticide Producing Site Locations. 2019 [cited 2019 April 3]; Available from:
51. United States Environmental Protection Agency (EPA). Resource Conservation and Recovery Act (RCRA) Large Quantity Generators (LQG). 2010 [cited 2010 August 26]; Available from: http://www.epa.gov/osw/hazard/generation/
52. United States Environmental Protection Agency (EPA). Resource Conservation and Recovery Act (RCRA) Treatment, Storage, and Disposal Facilities (TSD) and (RCRA) Corrective Action Facilities. 2010 [cited 2010 August 26]; Available from: http://www.epa.gov/osw/hazard/tsd/index.htm.
53. National Technical Information Service. Federal Information Processing Standards Publications (FIPS PUBS). [cited 2013 August 1]; Available from: http://www.nist.gov/itl/fips.cfm.
54. Richardson, E.A., et al., Green cities and health: a question of scale? J Epidemiol Community Health, 2012. 66(2): p. 160-5.
55. GBD 2015 Healthcare Access and Quality Collaborators, Healthcare Access and Quality Index based on mortality from causes amenable to personal health care in 195 countries and territories, 1990-2015: A novel analysis from the Global Burden of Disease Study 2015. Lancet, 2017. 390(10091): p. 231-266.
56. Friesen, C.E., Seliske, P., Papadopoulos, A., Using principal component analysis to identify priority neighbourhoods for health services delivery by ranking socioeconomic status. 2016. 8(2).
57. Vyas, S., Kumaranayake, L., Constructing socio-economic status indices: how to use principal components analysis. 2006. 21(6): p. 459-468.
58. Jolliffe, I.T., Cadima, J., Principal component analysis: a review and recent developments. Philos Trans A Math Phys Eng Sci, 2016. 374(2065): p. 20150202.
59. Hall, S.A., Kaufman, J.S., Ricketts, T.C., Defining urban and rural areas in U.S. epidemiologic studies. J Urban Health, 2006. 83(2): p. 162-75.
60.
United States Department of Agriculture (USDA). Measuring rurality: Rural-urban continuum codes. [cited 2019 April 3]; Available from: https://www.ers.usda.gov/data-products/rural-urban-continuum-codes/.
61. Langlois, P.H., et al., Occurrence of conotruncal heart birth defects in Texas: a comparison of urban/rural classifications. J Rural Health, 2010. 26(2): p. 164-74.
62. Langlois, P.H., et al., Urban versus rural residence and occurrence of septal heart defects in Texas. Birth Defects Res A Clin Mol Teratol, 2009. 85(9): p. 764-72.
63. Luben, T.J., et al., Urban-rural residence and the occurrence of neural tube defects in Texas, 1999-2003. Health Place, 2009. 15(3): p. 848-54.
64. Messer, L.C., et al., Urban-rural residence and the occurrence of cleft lip and cleft palate in Texas, 1999-2003. Ann Epidemiol, 2010. 20(1): p. 32-9.
65. Emerson, J., et al., 2012 Environmental Performance Index and Pilot Trend Environmental Performance Index - Full Report. 2012, Yale Center for Environmental Law and Policy: New Haven, CT.
66. Messer, L.C., et al., The development of a standardized neighborhood deprivation index. J Urban Health, 2006. 83(6): p. 1041-62.

Appendix I: List of References Related to 2000-2005 Environmental Quality Index

1. Lobdell DT, Jagai JS, Rappazzo K, Messer LC. (2011) Data sources for environmental assessment: determining availability, quality and utility. American Journal of Public Health Suppl 1:S277-85.
2. Jagai JS, Rosenbaum BJ, Pierson SM, Messer LC, Rappazzo K, Naumova EN, Lobdell DT. (2013) Putting Regulatory Data to Work at the Service of Public Health: Utilizing Data Collected Under the Clean Water Act. Water Quality, Exposure, and Health 5:117-125; DOI: 10.1007/s12403-013-0095-1.
3. Messer LC, Jagai JS, Rappazzo KM, Lobdell DT. (2014) Construction of an environmental quality index for public health research. Environmental Health 13:39; DOI: 10.1186/1476-069X-13-39.
4.
Rappazzo KM, Messer LC, Jagai JS, Gray CL, Grabich SC, Lobdell DT. (2015) The association between environmental quality and preterm birth in the United States, 2000-2005: a cross-sectional analysis. Environmental Health 14:50; DOI: 10.1186/s12940-015-0038-3.
5. Grabich SC, Horney J, Konrad C, Lobdell DT. (2015) Measuring the Storm: Methods of Quantifying Hurricane Exposure with Pregnancy Outcomes. Natural Hazards Review; DOI: 10.1061/(ASCE)NH.1527-6996.0000204.
6. Grabich SC, Rappazzo KM, Gray CL, Jagai JS, Jian Y, Messer LM, Lobdell DT. (2016) Additive interaction between heterogeneous environmental quality domains (air, water, land, sociodemographic, and built environment) on preterm birth. Frontiers in Public Health, http://dx.doi.org/10.3389/fpubh.2016.00232.
7. Jian Y, Messer LC, Jagai JS, Rappazzo KM, Gray CL, Grabich SC, Lobdell DT. (2017) The associations between environmental quality and mortality in the contiguous United States 2000-2005. Environmental Health Perspectives 125:355-362, http://dx.doi.org/10.1289/EHP119.
8. Jagai JS, Messer LC, Rappazzo KM, Gray CL, Grabich SC, Lobdell DT. (2017) County-level cumulative environmental quality associated with cancer incidence. Cancer, http://dx.doi.org/10.1002/cncr.30709.
9. Lavery AM, Waldman AT, Charles Casper T, Roalstad S, Candee M, Rose J, Belman A, Weinstock-Guttman B, Aaen G, Tillema JM, Rodriguez M, Ness J, Harris Y, Graves J, Krupp L, Benson L, Gorman M, Moodley M, Rensel M, Goyal M, Mar S, Chitnis T, Schreiner T, Lotze T, Greenberg B, Kahn I, Rubin J, Waubant E; U.S. Network of Pediatric MS Centers. (2017) Examining the contributions of environmental quality to pediatric multiple sclerosis. Multiple Sclerosis and Related Disorders 18:164-169, https://doi.org/10.1016/j.msard.2017.09.004.
10. Jian Y, Wu CYH, Gohlke JM. (2017) Effect modification by environmental quality on the association between heatwaves and mortality in Alabama, United States.
International Journal of Environmental Research and Public Health 14:1143, https://doi.org/10.3390/ijerph14101143.
11. Gray CL, Lobdell DT, Rappazzo KM, Jian Y, Jagai JS, Messer LC, Patel AP, DeFlorio-Barker SA, Lyttle C, Solway J, Rzhetsky A. (2018) Associations between environmental quality and adult asthma prevalence in medical claims data. Environmental Research 166:529-536, https://doi.org/10.1016/j.envres.2018.06.020.
12. Gray CL, Messer LC, Rappazzo KM, Jagai JS, Grabich SC, Lobdell DT. (2018) The association between physical inactivity and obesity is modified by five domains of environmental quality in U.S. adults: A cross-sectional study. PLoS One, https://doi.org/10.1371/journal.pone.0203301.
13. Patel AP, Jagai JS, Messer LC, Gray CL, Rappazzo KM, DeFlorio-Barker SA, Lobdell DT. (2018) Associations between environmental quality and infant mortality in the United States, 2000-2005. Archives of Public Health 76:60, https://doi.org/10.1186/s13690-018-0306-0.
14. Kosnik MB, Reif DM, Lobdell DT, Astell-Burt T, Feng X, Hader JD, Hoppin JA. (2019) Associations between access to healthcare, environmental quality, and end-stage renal disease survival time: proportional-hazards models of over 1,000,000 people over 14 years. PLoS One, https://doi.org/10.1371/journal.pone.0214094.
15. Jagai JS, Krajewski AK, Shaikh S, Lobdell DT, Sargis RM. (2020) Association between environmental quality and diabetes in the USA. Journal of Diabetes Investigation 11(2):315-324, https://doi.org/10.1111/jdi.13152.
16. Huang M, Xiao J, Nasca PC, Liu C, Lu Y, Lawrence WR, Wang L, Chen Q, Lin S. (2019) Do multiple environmental factors impact four cancers in women in the contiguous United States? Environmental Research 179:108782, https://doi.org/10.1016/j.envres.2019.108782.
17. Wang M, Wasserman E, Geyer N, Carroll RM, Zhao S, Zhang L, Hohl R, Lengerich EJ, McDonald AC.
(2020) Spatial patterns in prostate cancer-specific mortality in Pennsylvania using Pennsylvania Cancer registry data, 2004-2014.
18. Gearhart-Serna LM, Hoffman K, Devi GR. (2020) Environmental Quality and Invasive Breast Cancer. Cancer Epidemiology, Biomarkers & Prevention; DOI: 10.1158/1055-9965.EPI-19-1497.
19. Li X, Xiao J, Huang M, Liu T, Guo L, Zeng W, Chen Q, Zhang J, Ma W. (2020) Associations of county-level cumulative environmental quality with mortality of chronic obstructive pulmonary disease and mortality of tracheal, bronchus, and lung cancers. Science of the Total Environment 703:135523, https://doi.org/10.1016/j.scitotenv.2019.135523.

Appendix II: Identified Variables by Source for Each Domain

Variables by Data Source - Air Domain

AIR QUALITY SYSTEM (AQS)
Variable: Particulate Matter <10 micrometers in aerodynamic diameter (PM10) | Particulate Matter <2.5 micrometers in aerodynamic diameter (PM2.5) | Nitrogen Dioxide (NO2) | Sulfur Dioxide (SO2) | Ozone (O3) | Carbon Monoxide (CO)
Variable Name: ln_S02 | ln_NOx | ln_CO | PM25 | PM10 | 03
Counties/Monitors: 3143/1187 | 3143/1146 | 3143/303 | 3143/499 | 3143/575 | 3143/442
Variable Notes: µg/m3 | ppm, log transformed | ppb, log transformed | ppb | ppm, log transformed | µg/m3
EQI Version: 2000-2005; 2006-2010 (all six variables)

NATIONAL AIR TOXICS ASSESSMENT (NATA)
NOTES: WHEN DATA IS MISSING/NOT RECORDED, ZERO VALUES WERE DEEMED APPROPRIATE. MOST VARIABLES KEPT FOR EQI HAVE BEEN LOG TRANSFORMED. EQI 2006-2010 = NATA 2005. ALL VARIABLES REPORTED IN TONS EMITTED PER YEAR. UNLESS OTHERWISE NOTED, ALL VARIABLES ARE LOG TRANSFORMED. VARIABLES WERE DROPPED DUE TO INSUFFICIENT DATA (HIGH NUMBERS OF MISSING OR ZERO OBSERVATIONS) OR DUE TO HIGH CORRELATION WITH OTHER VARIABLES.
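The NATA notes say that zeros were treated as valid values and that most retained variables were log transformed, but they do not say how zeros were handled before taking logs. The following is a minimal sketch, assuming the half-minimum substitution that the Mine Safety and Health Administration notes later in this appendix describe for counties with zero mines. The function name and the sample county values are hypothetical and are not taken from the EQI processing code.

```python
import math


def log_transform(values):
    """Natural log of each value; zeros become half the smallest positive value.

    The half-minimum substitution is an assumption borrowed from the MSHA
    notes in this appendix, not a documented NATA processing step.
    """
    positives = [v for v in values if v > 0]
    if not positives:
        raise ValueError("at least one positive value is required")
    floor = min(positives) / 2.0
    return [math.log(v) if v > 0 else math.log(floor) for v in values]


# Hypothetical tons-per-year emissions keyed by county FIPS code.
emissions = {"01001": 0.0, "01003": 2.0, "01005": 8.0}
transformed = dict(zip(emissions, log_transform(list(emissions.values()))))
print(transformed)
```

Substituting a small positive value keeps every county in the analysis, at the cost of making the transformed zero depend on the smallest observed emission.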
Variable | Variable Name | Counties | EQI Version
1,1,2,2-tetrachloroethane | A_TeCA_ln | 3137 | 2000-2005; 2006-2010
1,1,2-trichloroethane | A_112TCA_ln | 3137 | 2000-2005; 2006-2010
1,2-dibromo-3-chloropropane | A_DBCP_ln | 3137 | 2000-2005; 2006-2010
1,3-dichloropropene | A_DCI_propene_ln | 3061 | 2006-2010
Acrylic acid | A_Acrylic_acid_ln | 3107 | 2000-2005; 2006-2010
Benzidine | A_Benzidine_ln | 3137 | 2000-2005; 2006-2010
Benzyl chloride | A_Benzyl_CI_ln | 3137 | 2000-2005; 2006-2010
Beryllium compounds | A_Be_ln | 3137 | 2000-2005; 2006-2010
bis-2-ethylhexyl phthalate | A_DEHP_ln | 3137 | 2000-2005; 2006-2010
Carbon tetrachloride | A_CCI4 | 3137 | 2000-2005; 2006-2010
Carbonyl sulfide | A_CylS_ln | 3137 | 2006-2010
Chlorine | A_CI_ln | 3137 | 2000-2005; 2006-2010
Chlorobenzene | A_C6H5CI_ln | 3137 | 2000-2005; 2006-2010
Chloroform | A_chloroform_ln | 3137 | 2000-2005; 2006-2010
Chloroprene | A_Chloroprene_ln | 3137 | 2000-2005; 2006-2010
Chromium compounds | A_Cr_ln | 3137 | 2000-2005; 2006-2010
Cobalt compounds | A_Co_ln | 3132 | 2006-2010
Cyanide compounds | A_CN_ln | 3137 | 2000-2005; 2006-2010
Dibutylphthalate | A_DBP_ln | 3137 | 2000-2005; 2006-2010
Ethyl chloride | A_EtCI_ln | 3136 | 2000-2005; 2006-2010
Ethylbenzene | A_Ebenzine | 3137 | 2006-2010
Ethylene dibromide | A_EDB | 3137 | 2000-2005; 2006-2010
Ethylene dichloride | A_EDC_ln | 3137 | 2000-2005; 2006-2010
Formaldehyde | A_Formaldehyde | 3137 | 2006-2010
Glycol ethers | A_Glycol_ethers_ln | 3057 | 2000-2005; 2006-2010
Hydrazine | A_N2H2_ln | 3137 | 2000-2005; 2006-2010
Hydrochloric acid | A_HCI_ln | 3137 | 2000-2005; 2006-2010
Isophorone | A_Isophorone_ln | 3131 | 2000-2005; 2006-2010
Manganese compounds | A_Mn_ln | 3137 | 2000-2005; 2006-2010
Methyl bromide | A_Me_Br_ln | 3137 | 2006-2010
Methylene chloride | A_MeCI2_ln | 3137 | 2000-2005; 2006-2010
Phosphine | A_PH3_ln | 3062 | 2000-2005; 2006-2010
Polychlorinated biphenyls | A_PCBs_ln | 3137 | 2000-2005; 2006-2010
Propylene dichloride | A_ProCI2_ln | 3137 | 2000-2005; 2006-2010
Quinoline | A_Quinolin_ln | 3137 | 2000-2005; 2006-2010
Trichloroethylene | A_C2HCI3_ln | 3137 | 2000-2005; 2006-2010
Vinyl chloride | A_VyCI_ln | 3137 | 2000-2005; 2006-2010

Variables by Data Source - Water Domain

WATERS PROGRAM DATABASE/REACH ADDRESS DATABASE
NOTES: THESE MEASURES WERE COMPUTED; LOTS OF MISSING DATA, SO SEVERAL VARIABLES CANNOT BE USED. VARIABLES CALCULATED USING REACH STREAM LENGTH DATABASE. DATA FOR 2006, 2008, AND 2010 WERE AVERAGED. DATA WAS UPDATED BASED ON 2010 FIPS CODES.
Variable | Variable Name | Counties | Variable Notes | EQI Version | Notes
Percent of stream length impaired in county | D303_Percent | 2513 | Calculated with REACH database information | 2000-2005; 2006-2010 |
All NPDES Permits grouped per 1000 km of stream length in county | ALLNPDESperKM | 3141 | Types of NPDES Permits | 2006-2010 | Grouped variable of Sewage Permits per 1000 km of Stream in County; Industrial Permits per 1000 km of Stream in County; Stormwater Permits per 1000 km of Stream in County

ESTIMATED USE OF WATER IN THE UNITED STATES
NOTES: THESE MEASURES WERE COMPUTED FOR 2005 AND 2010 DATA AND AVERAGED. USGS PROVIDES ESTIMATES AT COUNTY LEVEL, SO NO ADDITIONAL MANIPULATION REQUIRED.
Variable | Variable Name | Counties | Variable Notes
Percent of Population on Self Supply, 2005, 2010 | Per_TotPopSS | 3141 | Estimate provided at county level
Percent of Public Supply Population that is on Surface Water, 2005, 2010 | Per_PSWithSW | 3067 | Estimate provided at county level

NATIONAL ATMOSPHERIC DEPOSITION PROGRAM
NOTES: MEASURES PROVIDED AT VARIOUS MONITORING STATIONS. VALUES FOR 2006-2010 WERE KRIGED TO NATIONAL LEVEL COVERAGE. DATA FOR ALL YEARS WAS AVERAGED TOGETHER.
Variable | Variable Name | Counties | Variable Notes | EQI Version
Calcium (Ca) precipitation weighted mean (mg/L) | CaAve_ln | 3141 | Kriged & log transformed | 2000-2005; 2006-2010
Potassium (K) precipitation weighted mean (mg/L) | KAve_ln | 3141 | Kriged & log transformed | 2000-2005; 2006-2010
Nitrate (NO3) precipitation weighted mean (mg/L) | N03Ave | 3141 | Kriged - transformation not needed | 2000-2005; 2006-2010
Chloride (Cl) deposition | ClAve_ln | 3141 | Kriged & log transformed | 2000-2005; 2006-2010
Sulfate (SO4) deposition | S04Ave_ln | 3141 | Kriged & log transformed | 2000-2005; 2006-2010
Total Mercury deposition (ng/m2); use only values with A or B quality rating | HgAve | 3141 | Kriged - transformation not needed | 2000-2005; 2006-2010

DROUGHT MONITOR DATA
NOTES: RASTER DATA AGGREGATED TO THE COUNTY LEVEL. DATA FOR ALL YEARS 2006-2010 WAS AVERAGED TOGETHER.
Variable | Variable Name | Counties | EQI Version
Percent of county drought-extreme (D3-D4) | AvgOfD3_ave | 3141 | 2000-2005; 2006-2010

NATIONAL CONTAMINANT OCCURRENCE DATABASE (NCOD)
NOTES: WILL USE 6 YEAR REVIEW 2 (DATA COLLECTED BETWEEN 1998-2005). CALCULATE THE FOLLOWING VARIABLES FOR EACH CHEMICAL FOR EACH COUNTY (AGGREGATING ALL PWS IN COUNTY) FOR ALL YEARS COMBINED; MISSING FOR THOSE COUNTIES WITHOUT ANY DATA; DID NOT KEEP DETECTS.
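The aggregation these notes describe (pool every sample from all public water systems in a county across all years, average, and leave counties without data missing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the EQI processing code: the record layout, the function name, and the sample values are hypothetical, and the log step follows the "log transformed" wording in the variable notes for this source.

```python
import math
from collections import defaultdict


def county_log_means(samples, all_counties):
    """Return {fips: ln(mean concentration)}, with None for counties lacking data.

    samples: iterable of (county_fips, concentration) pairs pooled over all
    public water systems and all years (a hypothetical record layout).
    """
    pooled = defaultdict(list)
    for fips, concentration in samples:
        pooled[fips].append(concentration)

    result = {}
    for fips in all_counties:
        values = pooled.get(fips)
        if not values:
            result[fips] = None  # no samples: the variable stays missing
            continue
        mean = sum(values) / len(values)
        # ln is undefined for a zero mean; left missing here as an assumption.
        result[fips] = math.log(mean) if mean > 0 else None
    return result


# Hypothetical arsenic samples (mg/L) from systems in two of three counties.
samples = [("37063", 0.002), ("37063", 0.004), ("37135", 0.001)]
arsenic_ln = county_log_means(samples, ["37063", "37135", "37183"])
print(arsenic_ln)
```

Passing the full county list separately makes the missing counties explicit, which matches the county counts well below 3143 reported for this source.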
Variable | Variable Name (units) | Counties | Variable Notes | EQI Version
Arsenic - average | W_As_ln (mg/L) | 2017 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Barium - average | W_Ba_ln (mg/L) | 1990 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Chromium (total) - average | W_Cr_ln (mg/L) | 1989 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Fluoride - average | W_FL_ln (mg/L) | 2138 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Nitrate (as N) - average | W_N03_ln (mg/L) | | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Selenium - average | W_SE_ln (mg/L) | 1986 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Endrin - average | W_Endrin_ln (ug/L) | 1509 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Dalapon - average | W_Dalapon_ln (ug/L) | 1292 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Simazine - average | W_Simazine_ln (ug/L) | 1669 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Di(2-ethylhexyl) phthalate (DEHP) | | | |
Benzo[a]pyrene - average | W_BenzoAP_ln (ug/L) | 1430 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Polychlorinated biphenyls (PCBs) - average | W_PCB_ln (ug/L) | | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Ethylene dibromide (EDB) - average | W_EDB_ln (ug/L) | 1630 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Chlordane - average | W_Chlordane_ln (ug/L) | 1498 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
1,4-Dichlorobenzene (p-Dichlorobenzene) - average | W_PDCB_ln (ug/L) | 2165 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Trichloroethylene - average | W_Trichlorene_ln (ug/L) | 2250 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Monochlorobenzene | | | |
Cyanide - average | W_CN_ln (mg/L) | 1385 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Mercury (inorganic) - average | W_HG_ln (mg/L) | 2056 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Xylenes (Total) - average | W_xylenes_ln (ug/L) | 2203 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Cadmium - average | W_Cd_ln (mg/L) | 1991 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Nitrite (as N) - average | W_N02_ln (mg/L) | 1583 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Antimony - average | W_Sb_ln (mg/L) | 1994 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Methoxychlor - average | W_methoxychlor_ln (ug/L) | 1512 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Pentachlorophenol - average | W_PCP_ln (ug/L) | 1547 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
1,1,1-Trichloroethane - average | W_111 trichlorane_ln (ug/L) | 2238 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Tetrachloroethylene - average | W_C2CI4_ln (ug/L) | 224 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Di(2-ethylhexyl)adipate (DEHA) - average | W_DEHA_ln (ug/L) | 1456 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
1,2-Dibromo-3-chloropropane (DBCP) - average | W_DBCP_ln (ug/L) | 1652 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
2,4-D (2,4-Dichlorophenoxyacetic acid) - average | W_24D_ln (ug/L) | 1360 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Dichloromethane (Methylene chloride) - average | W_DCM_ln (ug/L) | 2245 | Average for all samples in county, log transformed | 2000-2005; 2006-2010
Alpha Particles (Gross Alpha, excl. Radon & U) - average | W_alpha (PCI/L) | 1243 | Average for all samples in county |

SAFE DRINKING WATER INFORMATION SYSTEM (SDWIS)
NOTES: CUMULATIVE COUNT OF VIOLATIONS FOR ALL PWS IN COUNTY FOR THE YEAR. DATA IS AVAILABLE ANNUALLY. DATA WERE COMPILED FOR 2006-2010.
Variable | Variable Name | Counties | EQI Version
Total Coliform, Proportion | Coliform_Sum | 2034 | 2006-2010

Variables by Source - Land Domain

2007 CENSUS OF AGRICULTURE
NOTES: ACRES OF CROP OR TREATMENT WERE DIVIDED BY TOTAL COUNTY ACRES TO GET PERCENTAGE OF ITEM PER COUNTY. SOME COUNTIES HAD SUPPRESSED ACREAGE DUE TO IDENTIFIABILITY ISSUES. FOR THESE, THE UNACCOUNTED-FOR ACREAGE FOR EACH STATE WAS CALCULATED (TOTAL STATE ACREAGE - LISTED COUNTY ACREAGE). THE ACREAGE WAS DIVIDED EQUALLY AMONG THE FARMS IN COUNTIES WITH SUPPRESSED INFORMATION. DATA FOR HAWAII AND ALASKA ARE NOT AVAILABLE. THESE DATA ARE REFRESHED EVERY 5 YEARS. THE NEXT AVAILABLE DATA IS FOR 2012.
Variable | Variable Name | Counties | Variable Notes | EQI Version
Commercial fertilizer, lime, and soil conditioners | pct_lime_acres | 3065 | | 2000-2005; 2006-2010
Manure | pct_manure_acres_ln | 2975 | | 2000-2005; 2006-2010
Chemicals used to control insects | pct_insecticide_acres | 3141 | | 2000-2005; 2006-2010
Chemicals used to control weeds, grass, or brush | pct_weed_acres | 3061 | | 2000-2005; 2006-2010
Chemicals used to control nematodes | pct_nematode_acres_ln | 1933 | | 2000-2005; 2006-2010
Chemicals used to control diseases in crops and orchards | pct_disease_acres_ln | 2530 | | 2000-2005; 2006-2010
Corn for grain (bushels) | pct_corn_acres | 2588 | | 2000-2005; 2006-2010
Soybeans for beans (bushels) | pct_soybean_acres | 2082 | | 2000-2005; 2006-2010
Potatoes (cwt) | Pct_potato_acres | 1565 | | 2000-2005; 2006-2010
Wheat for grain, all (bushels) | pct_wheat_acres | 2520 | | 2000-2005; 2006-2010
Chemicals used to control growth, thin fruit, or defoliate | pct_defoliate_acres_ln | 1980 | | 2000-2005; 2006-2010
Animal units | pd_au_ln | 3078 | 1 AU is equal to 0.94 cattle and calves, 5.88 hogs and pigs, 250 egg laying chickens, and 455 broiler chickens. | 2000-2005; 2006-2010
Number of farms | farms_per_acre_ln | 3039 | | 2000-2005; 2006-2010
Irrigated acres | pct_irrigated_acres_ln | 2815 | | 2000-2005; 2006-2010
Harvested acres | pct_harvest_acres | 2755 | | 2000-2005; 2006-2010

2009 NATIONAL PESTICIDE USE DATASET (NPUD)
NOTES: PESTICIDE CONCENTRATIONS WERE GROUPED BY CLASS AND ADDED TOGETHER TO GET CLASS-LEVEL ESTIMATES OF PESTICIDE APPLICATION. THESE DATA ARE REFRESHED EVERY 5 YEARS. THE NEXT AVAILABLE DATA IS FOR 2012.
Variable | Variable Name | Counties | EQI Version
Insecticides | insecticides_ln | 2761 | 2000-2005; 2006-2010
Herbicides | herbicides_ln | 2907 | 2000-2005; 2006-2010
Fungicides | fungicides_ln | 2256 | 2000-2005; 2006-2010

MAP OF RADON ZONE (EPA)
NOTES: THE EPA RADON ZONE MAP IDENTIFIES AREAS OF THE UNITED STATES WITH THE POTENTIAL FOR ELEVATED INDOOR RADON LEVELS. EACH UNITED STATES COUNTY (3142) IS ASSIGNED TO ONE OF THREE ZONES BASED ON RADON POTENTIAL. DATA YEARS UNAVAILABLE. PRESUMABLY, RADON IS A STABLE FEATURE, AND THE MAP IS NOT VARIABLE, BUT REFRESH DATES ARE NOT AVAILABLE. NO OTHER INFORMATION AVAILABLE IN DATA DOCUMENTATION.
Variable | Variable Name | Counties | Variable Notes | EQI Version
Radon zones | Radon_zone | 3142 | 3-level variable | 2000-2005; 2006-2010

SUPERFUND NATIONAL PRIORITIES LIST (NPL) SITES
NOTES: NPL SITE LOCATIONS AVAILABLE THROUGH THE EPA GEOSPATIAL DATA ACCESS PROJECT. SITES WERE INCLUDED IN THE COUNTS IF THEY WERE IDENTIFIED BETWEEN 2006-2010. PUBLISHED AUGUST 2016. START AND END DATES NOT AVAILABLE. DATA REFRESHED MONTHLY.
Variable | Variable Name | Counties | EQI Version | Notes
Count of Superfund National Priority List sites per county | sf_county_count | 719 | 2000-2005; 2006-2010 | Included as part of composite count variable

RESOURCE CONSERVATION AND RECOVERY ACT (RCRA) TREATMENT, STORAGE, AND DISPOSAL FACILITIES (TSD) AND RCRA CORRECTIVE ACTION FACILITIES
NOTES: RCRA TSD AND CORRECTIVE ACTION FACILITIES SITE LOCATIONS AVAILABLE THROUGH THE EPA GEOSPATIAL DATA ACCESS PROJECT. SITES WERE INCLUDED IN THE COUNTS IF THEY WERE IDENTIFIED BETWEEN 2006-2010. PUBLISHED AUGUST 2016. START AND END DATES NOT AVAILABLE.
DATA REFRESHED MONTHLY.
Variable | Variable Name | Counties | EQI Version | Notes
Count of RCRA TSD and corrective action facilities per county | rcra_tsd_count_by_fips | 874 | 2000-2005; 2006-2010 | Included as part of composite count variable

RESOURCE CONSERVATION AND RECOVERY ACT (RCRA) LARGE QUANTITY GENERATORS (LQG)
NOTES: RCRA LQG SITE LOCATIONS AVAILABLE THROUGH THE EPA GEOSPATIAL DATA ACCESS PROJECT. SITES WERE INCLUDED IN THE COUNTS IF THEY WERE IDENTIFIED BETWEEN 2006-2010. PUBLISHED AUGUST 2016. START AND END DATES NOT AVAILABLE. DATA REFRESHED MONTHLY.
Variable | Variable Name | Counties | EQI Version | Notes
Count of RCRA LQG facilities per county | rcralqg_count | 1963 | 2000-2005; 2006-2010 | Included as part of composite count variable

TOXIC RELEASE INVENTORY (TRI) SITES
NOTES: TRI SITES AVAILABLE THROUGH THE EPA GEOSPATIAL DATA ACCESS PROJECT. SITES WERE INCLUDED IN THE COUNTS IF THEY WERE IDENTIFIED BETWEEN 2006-2010. PUBLISHED AUGUST 2016. START AND END DATES NOT AVAILABLE. DATA REFRESHED MONTHLY.
Variable | Variable Name | Counties | EQI Version | Notes
Count of TRI sites per county | tri_county_count | 2671 | 2000-2005; 2006-2010 | Included as part of composite count variable

ASSESSMENT, CLEANUP, AND REDEVELOPMENT EXCHANGE (ACRES) BROWNFIELD SITES
NOTES: BROWNFIELD SITE LOCATIONS AVAILABLE THROUGH THE EPA GEOSPATIAL DATA ACCESS PROJECT. SITES WERE INCLUDED IN THE COUNTS IF THEY WERE IDENTIFIED BETWEEN 2006-2010. PUBLISHED AUGUST 2016. START AND END DATES NOT AVAILABLE. DATA REFRESHED MONTHLY.
Variable | Variable Name | Counties | EQI Version | Notes
Count of ACRES sites per county | acres_county_count | 1273 | 2000-2005; 2006-2010 | Included as part of composite count variable

SECTION SEVEN TRACKING SYSTEM (SSTS) PESTICIDE PRODUCING SITE LOCATIONS
NOTES: SSTS PESTICIDE-PRODUCING SITE LOCATIONS AVAILABLE THROUGH THE EPA GEOSPATIAL DATA ACCESS PROJECT. SITES WERE INCLUDED IN THE COUNTS IF THEY WERE IDENTIFIED BETWEEN 2006-2010. PUBLISHED AUGUST 2016.
START AND END DATES NOT AVAILABLE. DATA REFRESHED BUT NOT ANNUALLY.
Variable | Variable Name | Counties | EQI Version | Notes
Count of SSTS sites per county | ssts_county_count | 2099 | 2000-2005; 2006-2010 | Included as part of composite count variable

MINE SAFETY AND HEALTH ADMINISTRATION (MSHA)
NOTES: THE MINE DATASET LISTS ALL COAL AND METAL/NON-METAL MINES UNDER MSHA'S JURISDICTION SINCE 1/1/1970. IT INCLUDES SUCH INFORMATION AS THE CURRENT STATUS OF EACH MINE (ACTIVE, ABANDONED, NONPRODUCING, ETC.), THE CURRENT OWNER AND OPERATING COMPANY, COMMODITY CODES AND PHYSICAL ATTRIBUTES OF THE MINE. MINE ID IS THE UNIQUE KEY FOR THIS DATA (https://ARLWEB.MSHA.GOV/OPENGOVERNMENTDATA/OGIMSHA.ASP). DATA REFRESHED WEEKLY. COUNTIES WITH ZERO MINES WERE GIVEN A VALUE OF MINIMUM VALUE/2. THESE DATA WERE TRANSFORMED (LOG) TO ACCOUNT FOR THE LARGE NUMBER OF ZEROS AND TO RESULT IN NEARLY NORMALLY DISTRIBUTED DATA.
Variable | Variable Name | Counties | Variable Notes | EQI Version
Primarily coal mines, mines per county population | Std_coal_prim_pop_ln | 464 | See notes above | 2006-2010
Primarily metal mines, mines per county population | Std_coal_prim_pop_ln | 386 | See notes above | 2006-2010
Primarily nonmetal mines, mines per county population | Std_coal_prim_pop_ln | 1135 | See notes above | 2006-2010
Primarily sand and gravel mines, mines per county population | Std_coal_prim_pop_ln | 2342 | See notes above | 2006-2010
Primarily stone mines, mines per county population | Std_coal_prim_pop_ln | 1965 | See notes above | 2006-2010

Variables by Source - Sociodemographic Domain

UNITED STATES CENSUS SUMMARY FILES
NOTES: MANY, MANY MORE VARIABLES ARE AVAILABLE FROM THE UNITED STATES CENSUS THAN WILL BE DESCRIBED HERE. THE VARIABLES IDENTIFIED HERE ARE THOSE THAT WILL BE USED IN THE EQI AND NOT THE PLETHORA OF VARIABLES THAT COULD BE CONSTRUCTED. DATA ARE AVAILABLE FOR MULTIPLE UNITS OF GEOGRAPHIC AGGREGATION, INCLUDING THE COUNTY-LEVEL.
FULL POPULATION DATA ARE COLLECTED DECENNIALLY; SAMPLE DATA ARE COLLECTED MORE FREQUENTLY. DATA ARE AVAILABLE FOR DOWNLOAD FROM THE UNITED STATES CENSUS BUREAU WEB SITE.
Variable | Variable Name | Counties | EQI Version | Notes
Percent renter-occupied units | Pct_RenterOcc | 3143 | 2000-2005; 2006-2010 |
Percent vacant units | Pct_Vacant_Housing | 3143 | 2000-2005; 2006-2010 |
Median household value | med_hh_value | 3143 | 2000-2005; 2006-2010 |
Median household income | ln_HH_lnc | 3143 | 2000-2005; 2006-2010 |
Bachelor's degree or higher, percent of persons age 25 years+ | Pct_BS | 3143 | 2006-2010 | This variable replaced percent < HS
Percent of persons who are unemployed | Pct_Unemp_total | 3143 | 2000-2005; 2006-2010 |
Percent of families in poverty | Pct_Fam_Pov | 3143 | 2006-2010 | This variable replaced percent families in poverty
Occupants per Room | ln_Occs_Room | 3143 | 2006-2010 | This variable replaced number rooms / house
Measure of income inequality (proportion) | GINI_est | 3143 | 2006-2010 |

FBI UNIFORM CRIME REPORTS
NOTES: FBI UCR DATA WERE DOWNLOADED FOR EACH COUNTY IN EACH STATE FROM THE WEBSITE (HTTPS://WWW.UCRDATATOOL.GOV/). DATA ARE AVAILABLE BY YEAR AND BY CRIME TYPE (VIOLENT = MURDER AND NONNEGLIGENT MANSLAUGHTER, FORCIBLE RAPE, ROBBERY, AND AGGRAVATED ASSAULT; PROPERTY = BURGLARY, LARCENY-THEFT, AND MOTOR VEHICLE THEFT). DATA FROM 2006-2010 WERE TEMPORALLY AND SPATIALLY KRIGED FOR USE IN THE EQI. DATA REPORTING IS VOLUNTARY. DATA ARE AVAILABLE AT THE CITY AND COUNTY LEVELS, BUT MANY COUNTIES DO NOT REPORT THESE DATA. DATA ARE REPORTED FOR LAW ENFORCEMENT AGENCIES SERVING CITY JURISDICTIONS WITH POPULATIONS OF 10,000 OR MORE AND COUNTY AGENCIES OF 25,000 OR MORE. THEREFORE, DATA MAY NOT BE AVAILABLE FOR EACH JURISDICTION EACH YEAR. DATA ARE AVAILABLE FROM 1960 TO CURRENT YEAR. RATES WERE OBTAINED FROM THE FBI. THE VIOLENT CRIME RATE DATA WERE TRANSFORMED (LOG) TO ACCOUNT FOR THE LARGE NUMBER OF ZEROS AND TO RESULT IN NEARLY NORMALLY DISTRIBUTED DATA.
Variable | Variable Name | Counties | Variable Notes | EQI Version | Notes
Violent crime rate | ln_ViolAv | 3143 | Variable kriged to estimate values for counties with no reported violent crime data | 2000-2005; 2006-2010 |
Murder-manslaughter crime rate | murder_manslaughter_rate | 1062 | Variable kriged to estimate values for counties with no reported violent crime data | No | Constituent of violent crime rate
Rape crime rate | rape_rate | 1055 | Variable kriged to estimate values for counties with no reported violent crime data | No | Constituent of violent crime rate
Robbery crime rate | rob_rate | 1062 | Variable kriged to estimate values for counties with no reported violent crime data | No | Constituent of violent crime rate
Aggravated assault crime rate | agg_assault_rate | 1062 | Variable kriged to estimate values for counties with no reported violent crime data | No | Constituent of violent crime rate

UNITED STATES DEPARTMENT OF AGRICULTURE ECONOMIC RESEARCH SERVICE CREATIVE CLASS INDEX
NOTES: THE ECONOMIC RESEARCH SERVICE (ERS) CLASS CODES INDICATE A COUNTY'S SHARE OF POPULATION EMPLOYED IN OCCUPATIONS THAT REQUIRE "THINKING CREATIVELY." THIS SKILL ELEMENT IS DEFINED AS "DEVELOPING, DESIGNING, OR CREATING NEW APPLICATIONS, IDEAS, RELATIONSHIPS, SYSTEMS, OR PRODUCTS, INCLUDING ARTISTIC CONTRIBUTIONS." DATA ARE AVAILABLE FOR DOWNLOAD FROM THE USDA ERS WEBSITE.
Variable | Variable Name | Counties | EQI Version
Percent county employed in creative class | Num CreatClass | 3143 | 2006-2010

UNITED STATES ELECTION ATLAS
NOTES: THE POLITICAL CLIMATE OF A COUNTY WAS REPRESENTED BY THE DAVID LEIP ELECTION MAP. COUNTY-SPECIFIC PERCENTS VOTING REPUBLICAN OR DEMOCRATIC WERE REPORTED. THE PERCENT VOTING DEMOCRATIC IN THE 2008 PRESIDENTIAL ELECTION WAS INCLUDED IN THE EQI.
Variable | Variable Name | Counties | EQI Version
Percent county voting Democratic in 2008 | DEM02008 | 3143 | 2006-2010

Variables by Source - Built Domain

HOUSING AND URBAN DEVELOPMENT (HUD) DATA
NOTES: THESE DATA PROVIDE A COUNT OF THE LOW-RENT AND SECTION 8 HOUSING IN EACH HOUSING AUTHORITY AREA. THESE HOUSING AUTHORITY AREAS CORRESPOND TO CITIES, WHICH ARE THEN ASSIGNED FIPS CODES. COUNTIES WITHOUT HOUSING AUTHORITY CITIES ARE GIVEN A COUNT OF ZERO FOR LOW-RENT AND/OR SECTION-EIGHT HOUSING. THESE DATA WERE TRANSFORMED (LOG) TO ACCOUNT FOR THE LARGE NUMBER OF ZEROS AND TO RESULT IN NEARLY NORMALLY DISTRIBUTED DATA. DATA ARE REFRESHED FREQUENTLY, BUT UPDATE FREQUENCY NOT PROVIDED. HISTORIC DATA DOES NOT APPEAR TO BE AVAILABLE FROM WEB SITE. DATA WERE COLLECTED IN 2010, BUT, SINCE LOW-RENT AND SECTION 8 HOUSING DOES NOT CHANGE SUBSTANTIALLY OVER TIME, THESE DATA ARE CONSIDERED REPRESENTATIVE OF THE 2006-2010 TIME PERIOD. RATES FOR EACH VARIABLE CONSTRUCTED BY DIVIDING COUNT BY COUNTY POPULATION.
Variable | Variable Name | Counties | Variable Notes | EQI Version | Notes
Rate of low-rent + section 8 units in county | total_units_ln | 3143 | Variable transformed (log) to allow it to approximate normal distribution | 2000-2005; 2006-2010 | Zeros considered meaningful zeros (lack of public housing)
Count of low-rent units per county | low_rent_units | 2080 | Variable transformed (log) to allow it to approximate normal distribution | | Constituent of total unit rate
Count of section 8 units per county | section_eight_units | 2080 | Variable transformed (log) to allow it to approximate normal distribution | | Constituent of total unit rate

FATALITY ANALYSIS REPORTING SYSTEM (FARS) DATA
NOTES: THE FATALITY ANALYSIS REPORTING SYSTEM (FARS) IS A NATIONWIDE CENSUS PROVIDING THE NATIONAL HIGHWAY TRAFFIC SAFETY ADMINISTRATION YEARLY DATA REGARDING FATAL INJURIES SUFFERED IN MOTOR VEHICLE TRAFFIC CRASHES. FARS DATA ARE AVAILABLE FROM 1975 (HTTP://WWW.NHTSA.GOV/FARS).
RATES FOR THE COUNT OF FATAL CRASHES PER COUNTY FOR 2006-2010 WERE CONSTRUCTED BY DIVIDING COUNT BY COUNTY POPULATION. THESE DATA WERE TRANSFORMED (LOG) TO ACCOUNT FOR THE LARGE NUMBER OF ZEROS AND TO RESULT IN NEARLY NORMALLY DISTRIBUTED DATA. THESE DATA CAN BE UPDATED ANNUALLY.
Variable | Variable Name | Counties | Variable Notes | EQI Version
Rate of fatal car crashes per county | ln_fatalities | 3143 | Variable transformed (log) to allow it to approximate normal distribution | 2000-2005; 2006-2010

2010 UNITED STATES CENSUS SUMMARY FILES
NOTES: MANY, MANY MORE VARIABLES ARE AVAILABLE FROM THE UNITED STATES CENSUS THAN WILL BE DESCRIBED HERE. THE VARIABLES IDENTIFIED HERE ARE THOSE THAT WILL BE USED IN THE EQI AND NOT THE PLETHORA OF VARIABLES THAT COULD BE CONSTRUCTED. DATA ARE AVAILABLE FOR MULTIPLE UNITS OF GEOGRAPHIC AGGREGATION, INCLUDING THE COUNTY-LEVEL. FULL POPULATION DATA ARE COLLECTED DECENNIALLY; SAMPLE DATA ARE COLLECTED MORE FREQUENTLY. THESE DATA WERE TRANSFORMED (LOG) TO ACCOUNT FOR THE LARGE NUMBER OF ZEROS AND TO RESULT IN NEARLY NORMALLY DISTRIBUTED DATA. DATA ARE AVAILABLE FOR DOWNLOAD FROM THE UNITED STATES CENSUS BUREAU WEB SITE.
Variable | Variable Name | Counties | Variable Notes | EQI Version
Percent of county residents who report using public transportation | ln_PubTrans | 3143 | Variable transformed (log) to allow it to approximate normal distribution | 2000-2005; 2006-2010
Time it takes from home to go to work | CommuteTime | 3143 | Recorded in minutes | 2006-2010

TIGER FILES
NOTES: TOPOLOGICALLY INTEGRATED GEOGRAPHIC ENCODING AND REFERENCING PRODUCTS PROVIDE MAPS AND ROAD LAYERS WORLDWIDE, INCLUDING THE UNITED STATES. THESE DATA ARE UPDATED REGULARLY BUT DO NOT CHANGE SUBSTANTIALLY OVER TIME. THE DATA USED IN THE EQI ARE FROM 2009. DATA ARE AVAILABLE AT CENSUS GEOGRAPHY. FOR THE STREET TYPES, THE HIGHWAY AND SECONDARY AND LOCAL ROADS (TERTIARY ROADS) PER COUNTY PER STATE WERE DOWNLOADED.
PROPORTION OF EACH ROAD TYPE WAS CONSTRUCTED BY DIVIDING THE DISTANCE OF EACH ROAD TYPE BY THE TOTAL AMOUNT OF EACH ROAD.
Variable | Variable Name | Counties | EQI Version | Notes
Proportion of all roads that are secondary roads | SecondaryRoadProportion | 3143 | 2006-2010 | This single variable replaced proportion primary road and highways

DUN AND BRADSTREET
NOTES: DUN AND BRADSTREET COLLECT COMMERCIAL INFORMATION ON BUSINESS. ITS DATABASE CONTAINS MORE THAN 195 MILLION RECORDS AND IS PROPRIETARY. THE DATA ARE PUT THROUGH AN EXTENSIVE QUALITY ASSURANCE PROCESS, WHICH INCLUDES OVER 2000 SEPARATE AUTOMATED AND SEVERAL MANUAL CHECKS. DATA ARE UPDATED DAILY. RATES OF EACH TYPE OF BUSINESS IN 2008 WERE CALCULATED BY DIVIDING THE COUNTS OF EACH VARIABLE BY THE COUNTY POPULATION. THESE DATA WERE TRANSFORMED (LOG) TO ACCOUNT FOR THE LARGE NUMBER OF ZEROS AND TO RESULT IN NEARLY NORMALLY DISTRIBUTED DATA.
Variable | Variable Name | Counties | EQI Version
Rate of positive food environment businesses per county | pos_food_rate_ln | 3140 | 2000-2005; 2006-2010
Rate of negative food environment businesses per county | neg_food_rate_ln | 3117 | 2000-2005; 2006-2010
Rate of alcohol, pawn, gaming businesses per county | al_pwn_gm_env_rate_ln | 3039 | 2000-2005; 2006-2010
Rate of health care-related businesses per county | hc_env_rate_ln | 3119 | 2000-2005; 2006-2010
Rate of recreation-related businesses per county | rec_env_rate_ln | 3133 | 2000-2005; 2006-2010
Rate of education-related businesses per county | ed_env_rate_ln | 3141 | 2000-2005; 2006-2010
Rate of social-service-related businesses per county | ss_env_rate_ln | 3125 | 2000-2005; 2006-2010
Rate of civic-related businesses per county | civic_env_rate_ln | 3138 | 2006-2010

ENVIROATLAS LAND COVER CONTERMINOUS UNITED STATES (EPA)
NOTES: THIS ENVIROATLAS DATASET REPRESENTS THE PERCENTAGE OF LAND AREA THAT IS CLASSIFIED AS NATURAL, BARREN, FOREST, TUNDRA, SHRUBLAND, HERBACEOUS, WETLAND, WOODY WETLAND, EMERGENT WETLAND, ALL HUMAN LAND USE,
DEVELOPED, OPEN SPACE DEVELOPED, LOW-INTENSITY DEVELOPED, MEDIUM-INTENSITY DEVELOPED, HIGH-INTENSITY DEVELOPED, AGRICULTURAL, PASTURE/HAY, AND CULTIVATED CROP USING THE 2011 NATIONAL LAND COVER DATASET (NLCD) FOR EACH COUNTY IN THE CONTERMINOUS UNITED STATES. THIS DATASET WAS PRODUCED BY THE UNITED STATES EPA TO SUPPORT RESEARCH AND ONLINE MAPPING ACTIVITIES RELATED TO ENVIROATLAS. ENVIROATLAS (HTTPS://WWW.EPA.GOV/ENVIROATLAS) ENABLES THE USER TO INTERACT WITH A WEB-BASED, EASY-TO-USE, MAPPING APPLICATION TO VIEW AND ANALYZE MULTIPLE ECOSYSTEM SERVICES FOR THE CONTIGUOUS UNITED STATES. THE DATASET IS AVAILABLE AS DOWNLOADABLE DATA (HTTPS://EDG.EPA.GOV/DATA/PUBLIC/ORD/ENVIROATLAS) OR AS AN ENVIROATLAS MAP SERVICE. ADDITIONAL DESCRIPTIVE INFORMATION ABOUT EACH ATTRIBUTE IN THIS DATASET CAN BE FOUND IN ITS ASSOCIATED ENVIROATLAS FACT SHEET (HTTPS://WWW.EPA.GOV/ENVIROATLAS/ENVIROATLAS-FACT-SHEETS).
Variable | Variable Name | Counties | Variable Notes | EQI Version | Notes
Combined natural land cover and open space developed | NINDEX_open | 3109 | Green space composite variable | 2006-2010 |
Percentage of county land area that is classified as natural land cover | NINDEX | 3109 | Composite variable of barren, forest, tundra, shrubland, herbaceous, and wetland land cover | 2006-2010 | Included as part of green space composite variable
Percentage of county land area that is classified as barren land cover | pbar | 3109 | Vegetation accounts for <15% total cover | 2006-2010 | Included as part of green space composite variable
Percentage of county land area that is classified as forest land cover | pfor | 3109 | Composite variable of deciduous, evergreen, and mixed forests. Areas dominated by trees generally greater than 5-meters tall, and greater than 20% total vegetation cover | 2006-2010 | Included as part of green space composite variable
Percentage of county land area that is classified as tundra land cover | ptun | 3109 | Alaska only areas | 2006-2010 | Included as part of green space composite variable
Percentage of county land area that is classified as shrubland land cover | pshb | 3109 | Areas dominated by shrubs; less than 5-meters tall; shrub canopy greater than 20% of total vegetation | 2006-2010 | Included as part of green space composite variable
Percentage of county land area that is classified as herbaceous land cover | phrb | 3109 | Areas dominated by graminoid and herbaceous vegetation, usually greater than 80% of total vegetation | 2006-2010 | Included as part of green space composite variable
Percentage of county land area that is classified as wetland land cover | pwtl | 3109 | Composite variable of woody and emergent wetlands. | 2006-2010 | Included as part of green space composite variable
Percentage of county land area that is classified as woody wetland land cover | pwtlw | 3109 | Soil or substrate is periodically saturated with or covered with water, and forest or shrubland vegetation account for >20% vegetative cover | 2006-2010 | Included as part of green space composite variable
Percentage of county land area that is classified as emergent wetland land cover | pwtle | 3109 | Soil or substrate is periodically saturated with or covered with water, and perennial herbaceous vegetation accounts for >80% vegetative cover | No | Included as part of green space composite variable
Percentage of county land area that is classified as all human land use land cover | UINDEX | 3109 | Composite variable of developed and agricultural land cover | No | Does not meet definition of green space
Percentage of county land area that is classified as developed land cover | pdev | 3109 | All developed land cover | No | Does not meet definition of green space
Percentage of county land area that is classified as open space developed land cover | pdevo | 3109 | Mixture of some constructed materials but mostly vegetation; < 20% impervious surface | No | Included as part of green space composite variable
Percentage of county land area that is classified as low-intensity developed land cover | pdevl | 3109 | Mixture of constructed materials and vegetation; 20% to 49% impervious surface | No | Does not meet definition of green space
Percentage of county land area that is classified as medium-intensity developed land cover | pdevm | 3109 | Mixture of constructed materials and vegetation; 50% to 79% impervious surface | No | Does not meet definition of green space
Percentage of county land area that is classified as high-intensity developed land cover | pdevh | 3109 | Highly developed areas; 80% to 100% impervious surface | No | Does not meet definition of green space
Percentage of county land area that is classified as agricultural land cover | pagr | 3109 | Composite variable of pasture/hay and cultivated crop land cover | No | Does not meet definition of green space
Percentage of county land area that is classified as pasture/hay land cover | pagrp | 3109 | Grasses, legumes, or grass-legume mixtures for livestock grazing; production of seed or hay crops; pasture/hay vegetation accounts for >20% total vegetation | No | Does not meet definition of green space
Percentage of county land area that is classified as cultivated crop land cover | pagrc | 3109 | Production of annual crops; crop vegetation accounts for >20% total vegetation; includes land being actively tilled | No | Does not meet definition of green space

ENVIROATLAS LAND COVER ALASKA (EPA)
NOTES: THIS ENVIROATLAS DATASET REPRESENTS THE PERCENTAGE OF LAND AREA THAT IS CLASSIFIED AS NATURAL, BARREN, FOREST, TUNDRA, SHRUBLAND, HERBACEOUS, WETLAND, WOODY WETLAND, EMERGENT WETLAND, ALL HUMAN LAND USE, DEVELOPED, OPEN SPACE DEVELOPED, LOW-INTENSITY DEVELOPED, MEDIUM-INTENSITY DEVELOPED, HIGH-INTENSITY DEVELOPED, AGRICULTURAL, PASTURE/HAY, CULTIVATED CROP, AND PERENNIAL SNOW/ICE USING THE 2011 NATIONAL LAND COVER DATASET (NLCD) FOR EACH COUNTY IN ALASKA. THIS DATASET WAS PRODUCED BY THE UNITED STATES EPA TO SUPPORT RESEARCH AND ONLINE MAPPING ACTIVITIES RELATED TO ENVIROATLAS.
ENVIROATLAS (HTTPS://WWW.EPA.GOV/ENVIROATLAS) ENABLES THE USER TO INTERACT WITH A WEB-BASED, EASY-TO-USE, MAPPING APPLICATION TO VIEW AND ANALYZE MULTIPLE ECOSYSTEM SERVICES FOR THE CONTIGUOUS UNITED STATES. THE DATASET IS AVAILABLE AS DOWNLOADABLE DATA (HTTPS://EDG.EPA.GOV/DATA/PUBLIC/ORD/ENVIROATLAS) OR AS AN ENVIROATLAS MAP SERVICE. ADDITIONAL DESCRIPTIVE INFORMATION ABOUT EACH ATTRIBUTE IN THIS DATASET CAN BE FOUND IN ITS ASSOCIATED ENVIROATLAS FACT SHEET (HTTPS://WWW.EPA.GOV/ENVIROATLAS/ENVIROATLAS-FACT-SHEETS).

| Variable | Variable Name | Counties | Variable Notes | EQI Version | Notes |
|---|---|---|---|---|---|
| Combined natural land cover and open space developed | NINDEX_open | 29 | Green space composite variable | 2006-2010 | |
| Percentage of county land area that is classified as natural land cover | NINDEX | 29 | Composite variable of barren, forest, tundra, shrubland, herbaceous, and wetland land cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as barren land cover | pbar | 29 | Vegetation accounts for <15% total cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as forest land cover | pfor | 29 | Composite variable of deciduous, evergreen, and mixed forests. Areas dominated by trees generally greater than 5-meters tall, and greater than 20% total vegetation cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as tundra land cover | ptun | 29 | Alaska only areas; includes dwarf scrub, sedge/herbaceous, lichens, and moss land cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as shrubland land cover | pshb | 29 | Areas dominated by shrubs; less than 5-meters tall; shrub canopy greater than 20% of total vegetation | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as herbaceous land cover | phrb | 29 | Areas dominated by graminoid and herbaceous vegetation, usually greater than 80% of total vegetation | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as wetland land cover | pwtl | 29 | Composite variable of woody and emergent wetlands | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as woody wetland land cover | pwtlw | 29 | Soil or substrate is periodically saturated with or covered with water and forest or shrubland vegetation account for >20% vegetative cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as emergent wetland land cover | pwtle | 29 | Soil or substrate is periodically saturated with or covered with water, and perennial herbaceous vegetation accounts for >80% vegetative cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as all human land use land cover | UINDEX | 29 | Composite variable of developed and agricultural land cover | No | Does not meet definition of green space |
| Percentage of county land area that is classified as developed land cover | pdev | 29 | All developed land cover | No | Does not meet definition of green space |
| Percentage of county land area that is classified as open space developed land cover | pdevo | 29 | Mixture of some constructed materials but mostly vegetation; <20% impervious surface | No | Included as part of green space composite variable |
| Percentage of county land area that is classified as low-intensity developed land cover | pdevl | 29 | Mixture of constructed materials and vegetation; 20% to 49% impervious surface | No | Does not meet definition of green space |
| Percentage of county land area that is classified as medium-intensity developed land cover | pdevm | 29 | Mixture of constructed materials and vegetation; 50% to 79% impervious surface | No | Does not meet definition of green space |
| Percentage of county land area that is classified as high-intensity developed land cover | pdevh | 29 | Highly developed areas; 80% to 100% impervious surface | No | Does not meet definition of green space |
| Percentage of county land area that is classified as agricultural land cover | pagr | 29 | Composite variable of pasture/hay and cultivated crop land cover | No | Does not meet definition of green space |
| Percentage of county land area that is classified as pasture/hay land cover | pagrp | 29 | Grasses, legumes, or grass-legume mixtures for livestock grazing; production of seed or hay crops; pasture/hay vegetation accounts for >20% total vegetation | No | Does not meet definition of green space |
| Percentage of county land area that is classified as cultivated crop land cover | pagrc | 29 | Production of annual crops; crop vegetation accounts for >20% total vegetation; includes land being actively tilled | No | Does not meet definition of green space |
| Percentage of county land area that is classified as forest and woody wetland cover | Pfor90 | 29 | Composite variable of forest and woody wetland | No | Included as part of green space composite variable |
| Percentage of county land area that is classified as forest and emergent wetland cover | Pwetl95 | 29 | Composite of forest and emergent wetland | No | Included as part of green space composite variable |
| Percentage of county land area that is classified as perennial snow/ice | pice | 29 | Characterized by perennial cover of ice and/or snow, generally >25% total cover | No | Does not meet definition of green space |

ENVIROATLAS LAND COVER HAWAII (EPA)

NOTES: THIS ENVIROATLAS DATASET REPRESENTS THE PERCENTAGE OF LAND AREA THAT IS CLASSIFIED AS NATURAL, BARREN, FOREST, TUNDRA, SHRUBLAND, HERBACEOUS, WETLAND, WOODY WETLAND, EMERGENT WETLAND, ALL HUMAN LAND USE, DEVELOPED, OPEN SPACE DEVELOPED, LOW-INTENSITY DEVELOPED, MEDIUM-INTENSITY DEVELOPED, HIGH-INTENSITY DEVELOPED, AGRICULTURAL, PASTURE/HAY, AND CULTIVATED CROP LAND COVER USING THE ENVIROATLAS COMPOSITE OF THE 2005-2011 COASTAL CHANGE ANALYSIS PROGRAM (C-CAP) LAND COVER DATASET FOR EACH 12-DIGIT HYDROLOGIC UNIT CODE (HUC) IN HAWAII. THIS DATASET WAS PRODUCED BY THE UNITED STATES EPA TO SUPPORT RESEARCH AND ONLINE MAPPING ACTIVITIES RELATED TO ENVIROATLAS. ENVIROATLAS (HTTPS://WWW.EPA.GOV/ENVIROATLAS) ENABLES THE USER TO INTERACT WITH A WEB-BASED, EASY-TO-USE, MAPPING APPLICATION TO VIEW AND ANALYZE MULTIPLE ECOSYSTEM SERVICES FOR THE CONTIGUOUS UNITED STATES. THE DATASET IS AVAILABLE AS DOWNLOADABLE DATA (HTTPS://EDG.EPA.GOV/DATA/PUBLIC/ORD/ENVIROATLAS) OR AS AN ENVIROATLAS MAP SERVICE.
ADDITIONAL DESCRIPTIVE INFORMATION ABOUT EACH ATTRIBUTE IN THIS DATASET CAN BE FOUND IN ITS ASSOCIATED ENVIROATLAS FACT SHEET (HTTPS://WWW.EPA.GOV/ENVIROATLAS/ENVIROATLAS-FACT-SHEETS).

| Variable | Variable Name | Counties | Variable Notes | EQI Version | Notes |
|---|---|---|---|---|---|
| Combined natural land cover and open space developed | NINDEX_open | 5 | Green space composite variable | 2006-2010 | |
| Percentage of county land area that is classified as natural land cover | NINDEX | 5 | Composite variable of barren, forest, tundra, shrubland, herbaceous, and wetland land cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as barren land cover | pbar | 5 | Vegetation accounts for <15% total cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as forest land cover | pfor | 5 | Composite variable of deciduous, evergreen, and mixed forests. Areas dominated by trees generally greater than 5-meters tall, and greater than 20% total vegetation cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as tundra land cover | ptun | 5 | Alaska only areas | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as shrubland land cover | pshb | 5 | Areas dominated by shrubs; less than 5-meters tall; shrub canopy greater than 20% of total vegetation | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as herbaceous land cover | phrb | 5 | Areas dominated by graminoid and herbaceous vegetation, usually greater than 80% of total vegetation | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as wetland land cover | pwtl | 5 | Composite variable of woody and emergent wetlands | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as woody wetland land cover | pwtlw | 5 | Soil or substrate is periodically saturated with or covered with water and forest or shrubland vegetation account for >20% vegetative cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as emergent wetland land cover | pwtle | 5 | Soil or substrate is periodically saturated with or covered with water and perennial herbaceous vegetation accounts for >80% vegetative cover | 2006-2010 | Included as part of green space composite variable |
| Percentage of county land area that is classified as all human land use land cover | UINDEX | 5 | Composite variable of developed and agricultural land cover | No | Does not meet definition of green space |
| Percentage of county land area that is classified as developed land cover | pdev | 5 | All developed land cover | No | Does not meet definition of green space |
| Percentage of county land area that is classified as open space developed land cover | pdevo | 5 | Mixture of some constructed materials but mostly vegetation; <20% impervious surface | No | Included as part of green space composite variable |
| Percentage of county land area that is classified as low-intensity developed land cover | pdevl | 5 | Mixture of constructed materials and vegetation; 20% to 49% impervious surface | No | Does not meet definition of green space |
| Percentage of county land area that is classified as medium-intensity developed land cover | pdevm | 5 | Mixture of constructed materials and vegetation; 50% to 79% impervious surface | No | Does not meet definition of green space |
| Percentage of county land area that is classified as high-intensity developed land cover | pdevh | 5 | Highly developed areas; 80% to 100% impervious surface | No | Does not meet definition of green space |
| Percentage of county land area that is classified as agricultural land cover | pagr | 5 | Composite variable of pasture/hay and cultivated crop land cover | No | Does not meet definition of green space |
| Percentage of county land area that is classified as pasture/hay land cover | pagrp | 5 | Grasses, legumes, or grass-legume mixtures for livestock grazing; production of seed or hay crops; pasture/hay vegetation accounts for >20% total vegetation | No | Does not meet definition of green space |
| Percentage of county land area that is classified as cultivated crop land cover | pagrc | 5 | Production of annual crops; crop vegetation accounts for >20% total vegetation; includes land being actively tilled | No | Does not meet definition of green space |

NATIONAL WALKABILITY INDEX (EPA)

NOTES: THE NATIONAL WALKABILITY INDEX IS A NATIONWIDE GEOGRAPHIC DATA RESOURCE THAT RANKS BLOCK GROUPS ACCORDING TO THEIR RELATIVE WALKABILITY. THE NATIONAL DATASET INCLUDES WALKABILITY SCORES FOR ALL BLOCK GROUPS, AS WELL AS THE UNDERLYING ATTRIBUTES THAT ARE USED TO RANK THE BLOCK GROUPS. DATA ARE AVAILABLE FOR DOWNLOAD FROM THE EPA SMART GROWTH WEB SITE (HTTPS://WWW.EPA.GOV/SMARTGROWTH/SMART-LOCATION-MAPPING#WALKABILITY).

| Variable | Variable Name | Counties | Variable Notes | EQI Version | Notes |
|---|---|---|---|---|---|
| National walkability index score | Sum_NWIBG | 3143 | Scores were available at block group; county score created by adding block group scores, then taking mean of the block group scores based on county population proportions | 2006-2010 | |

Appendix III: Changes in Variables from EQI 2000-2005 to EQI 2006-2010

Table A: Variables Added

| Domain | Data Source | Variable | Variable Name | Notes |
|---|---|---|---|---|
| Water | Safe Drinking Water Information System (SDWIS) | Total coliform, proportion | Coliform_Sum | Added to drinking water quality construct |
| Land | Mine Safety and Health Administration (MSHA) Mines Data Set | Primarily coal mines, mines per county population | Std_coal_prim_pop_ln | Part of new mining activity construct |
| | | Primarily metal mines, mines per county population | Std_coal_prim_pop_ln | Part of new mining activity construct |
| | | Primarily nonmetal mines, mines per county population | Std_coal_prim_pop_ln | Part of new mining activity construct |
| | | Primarily sand and gravel mines, mines per county population | Std_coal_prim_pop_ln | Part of new mining activity construct |
| | | Primarily stone mines, mines per county population | Std_coal_prim_pop_ln | Part of new mining activity construct |
| Sociodemographic | United States Census | Measure of income inequality (proportion) | GINI_est | Added to socioeconomic construct |
| | United States Department of Agriculture Economic Research Service Creative Class | Percent county employed in creative class | Num_CreatClass | County creative typology construct |
| | United States Election Atlas | Percent county voting Democratic in 2008 | DEM02008 | County political valence construct |
| Built | TIGER Files | Proportion of all roads that are secondary roads | SecondaryRoadProportion | Replaced proportion primary road and highways |
| | EnviroAtlas Land Cover | Combined natural land cover and open space developed | NINDEX_open | Green Space construct |
| | National Walkability Index (EPA) | National walkability index score | Sum_NWIBG | Walkability construct |

Table B: Variables Changed

| Domain | Data Source | Variable | Variable Name | Variable Replaced | Variable Replaced Name |
|---|---|---|---|---|---|
| Sociodemographic | United States Census | Bachelor's degree or higher, percent of persons age 25 years+ | Pct_BS | Percent of persons with more than a high school education | Pct_hs_more |
| | | Percent of families in poverty | Pct_Fam_Pov | Percent of persons less than poverty level | Pct_pers_lt_pov |
| | | Occupants per room | ln_Occs_Room | Median number of rooms in residence | Med_rooms |

Table C: Variables Deleted

| Domain | Data Source | Variable | Variable Name | Reason Not Used |
|---|---|---|---|---|
| Land | National Geochemical Survey | Mean level of arsenic from sampled county sources | Mean_as_ln | Data quality |
| | | Mean level of selenium from sampled county sources | Mean_se_ln | Data quality |
| | | Mean level of mercury from sampled county sources | Mean_hg_ln | Data quality |
| | | Mean level of lead from sampled county sources | Mean_pb_ln | Data quality |
| | | Mean level of zinc from sampled county sources | Mean_zn_ln | Data quality |
| | | Mean level of copper from sampled county sources | Mean_cu_ln | Data quality |
| | | Mean level of aluminum from sampled county sources | Mean_al_pct | Data quality |
| | | Mean level of sodium from sampled county sources | Mean_na_pct | Data quality |
| | | Mean level of magnesium from sampled county sources | Mean_mg_pct_ln | Data quality |
| | | Mean level of titanium from sampled county sources | Mean_ti_pct_ln | Data quality |
| | | Mean level of calcium from sampled county sources | Mean_ca_pct_ln | Data quality |
| | | Mean level of manganese from sampled county sources | Mean_mn | Data quality |
| | | Mean level of iron from sampled county sources | Mean_fe_pct_ln | Data quality |
| | | Mean level of phosphorus from sampled county sources | mean_al_pct | Data quality |
| Built | Dun & Bradstreet | Rate of transportation-related businesses per county | rate_trans_env_log | Captured by public transportation, commuting times and roads |
| | | Rate of entertainment businesses per county | rate_ent_env_log | Dropped because there was no clear association with health |
| Built | TIGER files | Proportion of all roads that are highways | hwyprop | Both variables replaced with secondary roads |
| | | Proportion of all roads that are primary roads | primaryprop | Both variables replaced with secondary roads |
| Sociodemographic | United States Census | Percent of persons less than poverty level | pct_pers_lt_pov | Replaced with percent of families below poverty level |
| | | Percent of persons who do not speak English | pct_no_eng | |
| | | Percent of persons with more than high school education | pct_hs_more | Replaced with percent of persons with a bachelor's degree |
| | | Percent of persons who work outside their county of residence | work_out_co | |
| | | Median number of rooms in residence | med_rooms | Replaced with occupants per room |
| | | Percent of residences with more than 10 units | pct_mt_10units_log | |
| Water | Watershed Assessment, Tracking and Environmental Results Program Database/REACH Address Database | Sewage permits per 1000 km of stream in county | SEWAGENPDESperKM | Used group variable |
| | | Industrial permits per 1000 km of stream in county | INDNPDESperKM | Used group variable |
| | | Stormwater permits per 1000 km of stream in county | STORMNPDESperKM | Used group variable |
| | | Number of days closed per event in county 2002 | numDays_Close_Activity_2002 | Not enough counties |
| | | Number of days per contamination advisory event in county 2002 | numDays_Cont_Activity_2002 | Not enough counties |
| | | Number of days per rain advisory event in county 2002 | numDays_Rain_Activity_2002 | Not enough counties |
| Water | National Atmospheric Deposition Program | Magnesium (Mg) precipitation weighted mean (mg/L) | Mg_ln | Correlated |
| | | Sodium (Na) precipitation weighted mean (mg/L) | Na_ln | Correlated |
| | | Ammonium (NH4) precipitation weighted mean (mg/L) | NH4_mean | Correlated |
| Water | National Contaminant Occurrence Database | Beryllium - average (mg/L) | W_Be_ln | Zeros |
| | | Thallium - average (mg/L) | W_TI_ln | Correlated |
| | | Lindane - average (mg/L) | W_Lindane_ln | Correlated |
| | | Toxaphene - average (ug/L) | W_Toxaphene_ln | Correlated |
| | | Oxamyl (Vydate) - average (ug/L) | W_Oxamyl_ln | Correlated |
| | | Hexachlorocyclopentadiene - average (ug/L) | W_HCCPD_ln | Correlated |
| | | Carbofuran - average (ug/L) | W_Carbofuran_ln | Correlated |
| | | Alachlor - average (ug/L) | W_Alachlor_ln | Correlated |
| | | Heptachlor - average (ug/L) | W_Heptachlor_ln | Correlated |
| | | Heptachlor epoxide - average (ug/L) | W_Heptachlor_epox_ln | Correlated |
| | | 2,4,5-TP (Silvex) - average (ug/L) | W_silvex_ln | Correlated |
| | | Hexachlorobenzene - average (ug/L) | W_HCB_ln | Correlated |
| | | 1,2,4-Trichlorobenzene - average (ug/L) | W_124TCIB_ln | Correlated |
| | | 1,2-Dichlorobenzene (o-Dichlorobenzene) - average (ug/L) | W_ODCB_ln | Correlated |
| | | Vinyl chloride - average (ug/L) | W_VCM_ln | Correlated |
| | | Carbon Tetrachloride - average (ug/L) | W_CCI4_ln | Correlated |
| | | 1,1,2-Trichloroethane - average (ug/L) | W_112TCA_ln | Correlated |
| | | 1,1-Dichloroethylene - average (ug/L) | W_11DCE_ln | Correlated |
| | | trans-1,2-Dichloroethylene - average (ug/L) | W_t12DCE_ln | Correlated |
| | | 1,2-Dichloroethane (Ethylene Dichloride) - average (ug/L) | W_EDC_ln | Correlated |
| | | 1,2-Dichloropropane - average (ug/L) | W_PDC_ln | Correlated |
| | | Benzene - average (ug/L) | W_CI1benz_ln | Correlated |
| Air | National-Scale Air Toxics Assessment | 2,4-toluene diisocyanate | A_TDI_ln | Correlated |
| | | 2-chloroacetophenone | A_2Clacephen_ln | Correlated |
| | | 2-nitropropane | A_2NP_ln | Correlated |
| | | 4-nitrophenol | A_PNP_ln | Correlated |
| | | Acetonitrile | A_CH3CN_ln | Correlated |
| | | Acetophenone | A_Acetophenone_ln | Correlated |
| | | Acrolein | A_Aroclein_ln | Correlated |
| | | Acrylonitrile | A_C3H3N_ln | Correlated |
| | | Antimony compounds | A_Sb_ln | Correlated |
| | | Biphenyl | A_biphenyl_ln | Correlated |
| | | Bromoform | A_Bromoform_ln | Correlated |
| | | Cadmium compounds | A_Cd_ln | Correlated |
| | | Carbon disulfide | A_CS2_ln | Correlated |
| | | Carbon sulfide | A_CS_ln | Correlated |
| | | Cresol/cresylic acid | A_Cresol_ln | Correlated |
| | | Cumene | A_Cumene_ln | Correlated |
| | | Diesel engine emissions | A_Diesel_ln | Correlated |
| | | Dimethyl formamide | A_DMF_ln | Correlated |
| | | Dimethyl phthalates | A_Me2_phatalte_ln | Correlated |
| | | Dimethyl sulfate | A_Me2S04_ln | Correlated |
| | | Epichlorohydrin | A_ECH_ln | Correlated |
| | | Ethyl acrylate | A_Etacrylate_ln | Correlated |
| | | Ethylene glycol | A_EGLY_ln | Correlated |
| | | Ethylene oxide | A_EOx_ln | Correlated |
| | | Ethylidene dichloride | A_EdCI2_ln | Correlated |
| | | Hexachlorobenzene | A_HCB_ln | Correlated |
| | | Hexachlorobutadiene | A_HCBD_ln | Correlated |
| | | Hexachlorocyclopentadiene | A_HCCPD_ln | Correlated |
| | | Hexane | A_Hexane_ln | Correlated |
| | | Lead compounds | A_Pb_ln | Correlated |
| | | Mercury compounds | A_Hg_ln | Correlated |
| | | Methanol | A_MeOH_ln | Correlated |
| | | Methyl isobutyl ketone | A_MIBK_ln | Correlated |
| | | Methyl methacrylate | A_MMA_ln | Correlated |
| | | Methyl chloride | A_MeCI_ln | Correlated |
| | | Methylhydrazine | A_Mehydrazine_ln | Correlated |
| | | MTBE | A_MTBE_ln | Correlated |
| | | Nitrobenzene | A_nitrobenzene_ln | Correlated |
| | | N,N-dimethylaniline | A_DMA_ln | Correlated |
| | | o-toluidine | A_otoluidine_ln | Correlated |
| | | PAH/POM | A_PAHPOM_ln | Correlated |
| | | Pentachlorophenol | A_PCP_ln | Correlated |
| | | Phosphorus | A_P_ln | Correlated |
| | | Propylene oxide | A_ProO_ln | Correlated |
| | | Selenium compounds | A_Se_ln | Correlated |
| | | Styrene | A_Styrene_ln | Correlated |
| | | Tetrachloroethylene | A_CI4C2_ln | Correlated |
| | | Toluene | A_Toluene_ln | Correlated |
| | | Triethylamine | A_Et3N_ln | Correlated |
| | | Vinyl acetate | A_VyAc_ln | Correlated |
| | | Vinylidene chloride | A_11DCE_ln | Correlated |

Appendix IV: Table of Highly Correlated Variables for Each Domain

Air Domain

| Variable | Correlated Variable (Correlation Coefficient) | Variable Used to Represent Group |
|---|---|---|
| 1-1-1-trichloroethane | Methylene chloride (0.73); 1-4-dichlorobenzene (0.70) | Methylene chloride |
| Vinylidene chloride | Ethylbenzene (0.73); 2-2-4-trimethylpentane (0.72); Carbon disulfide (0.80); Cumene (0.72); Diesel engine emissions (0.71); Ethylene glycol (0.75); Hexane (0.74); Methanol (0.75); Methyl isobutyl ketone (0.71); MTBE (0.71); Naphthalene (0.71); Toluene (0.71); Xylenes (0.74) | Ethylbenzene |
| 2-2-4-trimethylpentane | Ethylbenzene (0.95); Vinylidene chloride (0.72); 4-4-methylenediphenyl diisocyanate (0.82); Acetophenone (0.75); Acrolein (0.74); Benzene (0.82); Biphenyl (0.76); 1-3-butadiene (0.83); Tetrachloroethylene (0.72); Cresol cresylic acid (0.71); Cumene (0.85); Diesel engine emissions (0.88); Ethylene glycol (0.86); Triethylamine (0.76); Hexane (0.92); Mercury compounds (0.82); Dimethyl phthalate (0.72); Methanol (0.85); Methyl isobutyl ketone (0.83); Methyl methacrylate (0.75); MTBE (0.79); Naphthalene (0.88); Pahpom (0.77); 4-nitrophenol (0.82); Propionaldehyde (0.73); Selenium compounds (0.76); Styrene (0.82); 2-4-toluene diisocyanate (0.72); Toluene (0.88); Vinyl acetate (0.78); Xylenes (0.95) | Ethylbenzene |
| 2-chloroacetophenone | Benzyl chloride (0.71); Bromoform (0.95); Methylhydrazine (0.96) | Benzyl chloride |
| 2-nitropropane | Chloroprene (0.70); Allyl chloride (0.76); n-n-dimethylaniline (0.77); 2-4-dinitrotoluene (0.74); Nitrobenzene (0.76); o-toluidine (0.72) | Chloroprene |
| 4-4-methylenediphenyl diisocyanate | Ethylbenzene (0.83); 2-2-4-trimethylpentane (0.82); Acetophenone (0.74); Acrolein (0.72); Benzene (0.73); Biphenyl (0.70); 1-3-butadiene (0.76); Cumene (0.84); Diesel engine emissions (0.75); Ethylene glycol (0.86); Triethylamine (0.79); Hexane (0.82); Mercury compounds (0.76); Dimethyl phthalate (0.76); Methanol (0.83); Methyl isobutyl ketone (0.82); Methyl methacrylate (0.77); MTBE (0.74); Naphthalene (0.79); Pahpom (0.72); Phenol (0.71); 4-nitrophenol (0.78); Selenium compounds (0.71); Styrene (0.79); 2-4-toluene diisocyanate (0.75); Toluene (0.77); Vinyl acetate (0.80); Xylenes (0.84) | Ethylbenzene |
| Acetophenone | Ethylbenzene (0.76); 2-2-4-trimethylpentane (0.75); 4-4-methylenediphenyl diisocyanate (0.74); Biphenyl (0.78); 1-3-butadiene (0.72); Cresol cresylic acid (0.76); Cumene (0.78); Ethylene glycol (0.78); Triethylamine (0.71); Hexane (0.74); Mercury compounds (0.75); Methanol (0.78); Methyl isobutyl ketone (0.76); MTBE (0.74); Naphthalene (0.76); Pahpom (0.76); Phenol (0.73); 4-nitrophenol (0.81); Selenium compounds (0.72); Toluene (0.70); Vinyl acetate (0.70); Xylenes (0.77) | Ethylbenzene |
| Acrolein | Ethylbenzene (0.77); 2-2-4-trimethylpentane (0.74); 4-4-methylenediphenyl diisocyanate (0.72); 1-3-butadiene (0.74); Cresol cresylic acid (0.81); Cumene (0.74); Ethylene glycol (0.76); Hexane (0.73); Methanol (0.76); Methyl isobutyl ketone (0.75); MTBE (0.73); Naphthalene (0.75); Pahpom (0.71); Propionaldehyde (0.75); Xylenes (0.77) | Ethylbenzene |
| Allyl chloride | Chloroprene (0.90); 2-nitropropane (0.76); Acetonitrile (0.81); n-n-dimethylaniline (0.96); Epichlorohydrin (0.85); Ethyl acrylate (0.78); Hexachlorobutadiene (0.73); Hexachlorocyclopentadiene (0.70); Nitrobenzene (0.96); o-toluidine (0.85); Propylene oxide (0.77); 1-2-4-trichlorobenzene (0.78) | Chloroprene |
| Arsenic compounds | Chromium compounds (0.80); Cadmium compounds (0.80); Lead compounds (0.74) | Chromium compounds |
| Benzene | Ethylbenzene (0.85); 2-2-4-trimethylpentane (0.82); 4-4-methylenediphenyl diisocyanate (0.73); 1-3-butadiene (0.90); Tetrachloroethylene (0.85); Cumene (0.77); Diesel engine emissions (0.76); Ethylene glycol (0.80); Hexane (0.81); Mercury compounds (0.71); Methanol (0.79); Methyl isobutyl ketone (0.74); MTBE (0.70); Naphthalene (0.80); 4-nitrophenol (0.74); Styrene (0.70); Toluene (0.96); Xylenes (0.85) | Ethylbenzene |
| Biphenyl | Ethylbenzene (0.75); 2-2-4-trimethylpentane (0.76); 4-4-methylenediphenyl diisocyanate (0.70); Acetophenone (0.78); 1-3-butadiene (0.70); Cresol cresylic acid (0.74); Cumene (0.77); Ethylene glycol (0.77); Hexane (0.73); Mercury compounds (0.76); Methanol (0.77); Methyl isobutyl ketone (0.74); MTBE (0.71); Naphthalene (0.77); Pahpom (0.80); Phenol (0.74); 4-nitrophenol (0.74); Selenium compounds (0.72); Toluene (0.71); Xylenes (0.76) | Ethylbenzene |
| Bromoform | Benzyl chloride (0.70); Methylhydrazine (0.94) | Benzyl chloride |
| 1-3-butadiene | Ethylbenzene (0.84); 2-2-4-trimethylpentane (0.83); 4-4-methylenediphenyl diisocyanate (0.76); Acetophenone (0.72); Acrolein (0.74); Benzene (0.90); Biphenyl (0.70); Tetrachloroethylene (0.74); Cresol cresylic acid (0.71); Cumene (0.80); Diesel engine emissions (0.72); Ethylene glycol (0.83); Triethylamine (0.74); Hexane (0.81); Mercury compounds (0.76); Methanol (0.81); Methyl isobutyl ketone (0.79); Methyl methacrylate (0.71); MTBE (0.73); Naphthalene (0.81); Pahpom (0.72); 4-nitrophenol (0.77); Selenium compounds (0.70); Styrene (0.73); 2-4-toluene diisocyanate (0.70); Toluene (0.94); Vinyl acetate (0.73); Xylenes (0.84) | Ethylbenzene |
| Acrylonitrile | Trichloroethylene (0.74) | Trichloroethylene |
| Cadmium compounds | Chromium compounds (0.71); Arsenic compounds (0.80) | Chromium compounds |
| Acetonitrile | Chloroprene (0.80); Allyl chloride (0.81); n-n-dimethylaniline (0.80); 2-4-dinitrotoluene (0.75); Epichlorohydrin (0.76); Nitrobenzene (0.79); o-toluidine (0.75); Propylene oxide (0.77) | Chloroprene |
| Tetrachloroethylene | Ethylbenzene (0.72); 2-2-4-trimethylpentane (0.72); Benzene (0.85); 1-3-butadiene (0.74); Naphthalene (0.73); Toluene (0.82); Xylenes (0.72) | Ethylbenzene |
| Cresol cresylic acid | Ethylbenzene (0.77); 2-2-4-trimethylpentane (0.71); Acetophenone (0.76); Acrolein (0.81); Biphenyl (0.74); 1-3-butadiene (0.71); Cumene (0.73); Ethylene glycol (0.75); Triethylamine (0.71); Mercury compounds (0.73); Methanol (0.74); Methyl isobutyl ketone (0.75); Naphthalene (0.78); Pahpom (0.76); Phenol (0.75); Propionaldehyde (0.71); Xylenes (0.78) | Ethylbenzene |
| Carbon disulfide | Ethylbenzene (0.72); Vinylidene chloride (0.80); Cumene (0.70); Ethylene glycol (0.74); Methanol (0.74); Methyl isobutyl ketone (0.73); Xylenes (0.72) | Ethylbenzene |
| Cumene | Ethylbenzene (0.87); Vinylidene chloride (0.72); 2-2-4-trimethylpentane (0.85); 4-4-methylenediphenyl diisocyanate (0.84); Acetophenone (0.78); Acrolein (0.74); Benzene (0.77); Biphenyl (0.77); 1-3-butadiene (0.80); Cresol cresylic acid (0.73); Carbon disulfide (0.70); Diesel engine emissions (0.77); Ethylene glycol (0.89); Triethylamine (0.82); Hexane (0.88); Mercury compounds (0.81); Dimethyl phthalate (0.74); Methanol (0.88); Methyl isobutyl ketone (0.86); Methyl methacrylate (0.76); MTBE (0.83); Naphthalene (0.84); Pahpom (0.79); Phenol (0.78); 4-nitrophenol (0.81); Selenium compounds (0.76); Styrene (0.81); 2-4-toluene diisocyanate (0.77); Toluene (0.81); Vinyl acetate (0.80); Xylenes (0.88) | Ethylbenzene |
| 1-4-dichlorobenzene | Methylene chloride (0.80); 1-1-1-trichloroethane (0.70) | Methylene chloride |
| Diesel engine emissions | Ethylbenzene (0.86); Vinylidene chloride (0.71); 2-2-4-trimethylpentane (0.88); 4-4-methylenediphenyl diisocyanate (0.75); Benzene (0.76); 1-3-butadiene (0.72); Cumene (0.77); Ethylene glycol (0.78); Triethylamine (0.70); Hexane (0.85); Mercury compounds (0.75); Methanol (0.78); Methyl isobutyl ketone (0.74); MTBE (0.73); Naphthalene (0.78); 4-nitrophenol (0.74); Selenium compounds (0.71); Styrene (0.74); 2-4-toluene diisocyanate (0.71); Toluene (0.78); Vinyl acetate (0.72); Xylenes (0.85) | Ethylbenzene |
| n-n-dimethylaniline | Chloroprene (0.92); 2-nitropropane (0.77); Allyl chloride (0.96); Acetonitrile (0.80); 2-4-dinitrotoluene (0.92); Epichlorohydrin (0.86); Ethyl acrylate (0.77); Hexachlorobutadiene (0.72); Hexachlorocyclopentadiene (0.72); Nitrobenzene (0.95); o-toluidine (0.86); Propylene oxide (0.78); 1-2-4-trichlorobenzene (0.78) | Chloroprene |
| Dimethyl formamide | Ethyl chloride (0.71) | Ethyl chloride |
| 2-4-dinitrotoluene | Chloroprene (0.88); 2-nitropropane (0.74); Allyl chloride (0.89); A_CH3CN (0.75); n-n-dimethylaniline (0.92); Epichlorohydrin (0.84); Ethyl acrylate (0.76); Hexachlorocyclopentadiene (0.70); Nitrobenzene (0.88); o-toluidine (0.86); Propylene oxide (0.70); 1-2-4-trichlorobenzene (0.76) | Chloroprene |
| Epichlorohydrin | Chloroprene (0.84); Allyl chloride (0.85); Acetonitrile (0.76); n-n-dimethylaniline (0.86); 2-4-dinitrotoluene (0.84); Ethyl acrylate (0.77); Nitrobenzene (0.81); o-toluidine (0.80); Propylene oxide (0.75); 1-2-4-trichlorobenzene (0.74) | Chloroprene |
| Ethylidene dichloride | Vinyl chloride (0.82) | Vinyl chloride |
| Ethylene glycol | Ethylbenzene (0.88); Vinylidene chloride (0.75); 2-2-4-trimethylpentane (0.86); 4-4-methylenediphenyl diisocyanate (0.86); Acetophenone (0.78); Acrolein (0.76); Benzene (0.80); Biphenyl (0.77); 1-3-butadiene (0.83); Cresol cresylic acid (0.75); Carbon disulfide (0.74); Cumene (0.89); Diesel engine emissions (0.78); Triethylamine (0.83); Hexane (0.87); Mercury compounds (0.84); Dimethyl phthalate (0.76); Methanol (0.93); Methyl isobutyl ketone (0.91); Methyl methacrylate (0.79); MTBE (0.81); Naphthalene (0.86); Pahpom (0.78); Phenol (0.75); 4-nitrophenol (0.83); Propionaldehyde (0.73); Selenium compounds (0.78); Styrene (0.81); 2-4-toluene diisocyanate (0.78); Toluene (0.84); Vinyl acetate (0.82); Xylenes (0.90) | Ethylbenzene |
| Ethylene oxide | Ethylene dichloride (0.72) | Ethylene dichloride |
| Triethylamine | Ethylbenzene (0.79); 2-2-4-trimethylpentane (0.76); 4-4-methylenediphenyl diisocyanate (0.79); Acetophenone (0.71); 1-3-butadiene (0.74); Cresol cresylic acid (0.71); Cumene (0.82); Diesel engine emissions (0.70); Ethylene glycol (0.83); Hexane (0.79); Mercury compounds (0.75); Methanol (0.80); Methyl isobutyl ketone (0.81); Methyl methacrylate (0.72); MTBE (0.70); Naphthalene (0.80); Pahpom (0.70); 4-nitrophenol (0.71); Styrene (0.73); 2-4-toluene diisocyanate (0.74); Toluene (0.74); Vinyl acetate (0.77); Xylenes (0.81) | Ethylbenzene |
| Ethyl acrylate | Chloroprene (0.80); Allyl chloride (0.78); n-n-dimethylaniline (0.77); 2-4-dinitrotoluene (0.76); Epichlorohydrin (0.77); Nitrobenzene (0.75); o-toluidine (0.76) | Chloroprene |
| Hexachlorobenzene | Polychlorinated biphenyls (0.83) | Polychlorinated biphenyls |
| Hexachlorobutadiene | Chloroprene (0.70); Allyl chloride (0.73); n-n-dimethylaniline (0.72); Hexachlorocyclopentadiene (0.93); Nitrobenzene (0.73) | Chloroprene |
| Hexachlorocyclopentadiene | Chloroprene (0.71); Allyl chloride (0.70); n-n-dimethylaniline (0.72); 2-4-dinitrotoluene (0.70); Hexachlorobutadiene (0.93) | Chloroprene |
| Hexane | Ethylbenzene (0.92); Vinylidene chloride (0.74); 2-2-4-trimethylpentane (0.92); 4-4-methylenediphenyl diisocyanate (0.82); Acetophenone (0.74); Acrolein (0.73); Benzene (0.81); Biphenyl (0.73); 1-3-butadiene (0.81); Cumene (0.88); Diesel engine emissions (0.85); Ethylene glycol (0.87); Triethylamine (0.79); Mercury compounds (0.80); Dimethyl phthalate (0.72); Methanol (0.87); Methyl isobutyl ketone (0.83); Methyl methacrylate (0.76); MTBE (0.81); Naphthalene (0.86); Pahpom (0.73); 4-nitrophenol (0.79); Selenium compounds (0.72); Styrene (0.80); 2-4-toluene diisocyanate (0.77); Toluene (0.85); Vinyl acetate (0.79); Xylenes (0.92) | Ethylbenzene |
| Hydrogen fluoride | Hydrochloric acid (0.91) | Hydrochloric acid |
| Mercury compounds | Ethylbenzene (0.82); 2-2-4-trimethylpentane (0.82); 4-4-methylenediphenyl diisocyanate (0.76); Acetophenone (0.75); Benzene (0.71); Biphenyl (0.76); 1-3-butadiene (0.76); Cresol cresylic acid (0.73); Cumene (0.81); Diesel engine emissions (0.75); Ethylene glycol (0.84); Triethylamine (0.75); Hexane (0.80); Methanol (0.82); Methyl isobutyl ketone (0.81); Methyl methacrylate (0.72); MTBE (0.74); Naphthalene (0.84); Pahpom (0.75); Phenol (0.72); 4-nitrophenol (0.80); Propionaldehyde (0.73); Selenium compounds (0.91); Styrene (0.74); Toluene (0.76); Vinyl acetate (0.76); Xylenes (0.82) | Ethylbenzene |
| Dimethyl phthalate | Ethylbenzene (0.73); 2-2-4-trimethylpentane (0.72); 4-4-methylenediphenyl diisocyanate (0.76); Cumene (0.74); Ethylene glycol (0.76); Hexane (0.72); Methanol (0.74); Methyl isobutyl ketone (0.73); Methyl methacrylate (0.76); Naphthalene (0.71); Styrene (0.75); Xylenes (0.74) | Ethylbenzene |
| Dimethyl sulfate | Benzyl chloride (0.90) | Benzyl chloride |
| Methyl chloride | Carbon tetrachloride (0.94) | Carbon tetrachloride |
| Methylhydrazine | Benzyl chloride (0.71); 2-chloroacetophenone (0.96); Bromoform (0.94) | Benzyl chloride |
| Methanol | Ethylbenzene (0.88); Vinylidene chloride (0.75); 2-2-4-trimethylpentane (0.85); 4-4-methylenediphenyl diisocyanate (0.83); Acetophenone (0.78); Acrolein (0.76); Benzene (0.79); Biphenyl (0.77); 1-3-butadiene (0.81); Cresol cresylic acid (0.74); Carbon disulfide (0.74); Cumene (0.88); Diesel engine emissions (0.78); Ethylene glycol (0.93); Triethylamine (0.80); Hexane (0.87); Mercury compounds (0.82); Dimethyl phthalate (0.74); Methyl isobutyl ketone (0.89); Methyl methacrylate (0.78); MTBE (0.82); Naphthalene (0.84); Pahpom (0.78); Phenol (0.76); 4-nitrophenol (0.82); Propionaldehyde (0.72); Selenium compounds (0.77); Styrene (0.81); 2-4-toluene diisocyanate (0.76); Toluene (0.82); Vinyl acetate (0.79); Xylenes (0.89) | Ethylbenzene |
| Methyl isobutyl ketone | Ethylbenzene (0.86); Vinylidene chloride (0.71); 2-2-4-trimethylpentane (0.83); 4-4-methylenediphenyl diisocyanate (0.82); Acetophenone (0.76); Acrolein (0.75); Benzene (0.74); Biphenyl (0.74); 1-3-butadiene (0.79); Cresol cresylic acid (0.75); Carbon disulfide (0.73); Cumene (0.86); Diesel engine emissions (0.74); Ethylene glycol (0.91); Triethylamine (0.81); Hexane (0.83); Mercury compounds (0.81); Dimethyl phthalate (0.73); Methanol (0.89); Methyl methacrylate (0.77); MTBE (0.81); Naphthalene (0.82); Pahpom (0.77); Phenol (0.78); 4-nitrophenol (0.79); Selenium compounds (0.76); Styrene (0.81); 2-4-toluene diisocyanate (0.77); Toluene (0.79); Vinyl acetate (0.76); Xylenes (0.89) | Ethylbenzene |
| Methyl methacrylate | Ethylbenzene (0.77); 2-2-4-trimethylpentane (0.75); 4-4-methylenediphenyl diisocyanate (0.77); 1-3-butadiene (0.71); Cumene (0.76); Ethylene glycol (0.79); Triethylamine (0.72); Hexane (0.76); Mercury compounds (0.72); Dimethyl phthalate (0.76); Methanol (0.78); Methyl isobutyl ketone (0.77); Naphthalene (0.74); 4-nitrophenol (0.72); Styrene (0.83); Toluene (0.71); Vinyl acetate (0.72); Xylenes (0.78) | Ethylbenzene |
Air Domain (continued)

Variable: MTBE. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.79; Vinylidene chloride 0.71; 2-2-4-trimethylpentane 0.79; 4-4-methylenediphenyl diisocyanate 0.74; Acetophenone 0.74; Acrolein 0.73; Benzene 0.70; Biphenyl 0.71; 1-3-butadiene 0.73; Cumene 0.83; Diesel engine emissions 0.73; Ethylene glycol 0.81; Triethylamine 0.70; Hexane 0.81; Mercury compounds 0.74; Methanol 0.82; Methyl isobutyl ketone 0.81; Naphthalene 0.78; Pahpom 0.71; Phenol 0.71; 4-nitrophenol 0.74; Selenium compounds 0.70; Styrene 0.72; 2-4-toluene diisocyanate 0.73; Toluene 0.73; Xylenes 0.79.

Variable: Naphthalene. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.87; Vinylidene chloride 0.71; 2-2-4-trimethylpentane 0.88; 4-4-methylenediphenyl diisocyanate 0.79; Acetophenone 0.76; Acrolein 0.75; Benzene 0.80; Biphenyl 0.77; 1-3-butadiene 0.81; Tetrachloroethylene 0.73; Cresol cresylic acid 0.78; Cumene 0.84; Diesel engine emissions 0.78; Ethylene glycol 0.86; Triethylamine 0.80; Hexane 0.86; Mercury compounds 0.84; Dimethyl phthalate 0.71; Methanol 0.84; Methyl isobutyl ketone 0.82; Methyl methacrylate 0.74; MTBE 0.78; Pahpom 0.84; Phenol 0.73; 4-nitrophenol 0.79; Propionaldehyde 0.74; Selenium compounds 0.77; Styrene 0.76; 2-4-toluene diisocyanate 0.70; Toluene 0.83; Vinyl acetate 0.78; Xylenes 0.88.

Variable: Nickel compounds. Variable used to represent group: Chromium compounds.
Correlated variables (correlation coefficient): Chromium compounds 0.79.

Variable: Nitrobenzene. Variable used to represent group: Chloroprene.
Correlated variables (correlation coefficient): Chloroprene 0.88; 2-nitropropane 0.76; Allyl chloride 0.96; Acetonitrile 0.79; n-n-dimethylaniline 0.95; 2-4-dinitrotoluene 0.88; Epichlorohydrin 0.81; Ethyl acrylate 0.75; Hexachlorobutadiene 0.70; o-toluidine 0.82; Propylene oxide 0.77; 1-2-4-trichlorobenzene 0.76.

Variable: o-toluidine. Variable used to represent group: Chloroprene.
Correlated variables (correlation coefficient): Chloroprene 0.84; 2-nitropropane 0.72; Allyl chloride 0.85; Acetonitrile 0.75; n-n-dimethylaniline 0.86; 2-4-dinitrotoluene 0.86; Epichlorohydrin 0.80; Ethyl acrylate 0.76; Nitrobenzene 0.82; Propylene oxide 0.77; 1-2-4-trichlorobenzene 0.76.

Variable: Pahpom. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.76; 2-2-4-trimethylpentane 0.77; 4-4-methylenediphenyl diisocyanate 0.72; Acetophenone 0.76; Acrolein 0.71; Biphenyl 0.80; 1-3-butadiene 0.72; Cresol cresylic acid 0.76; Cumene 0.79; Ethylene glycol 0.78; Triethylamine 0.70; Hexane 0.73; Mercury compounds 0.75; Methanol 0.78; Methyl isobutyl ketone 0.77; MTBE 0.71; Naphthalene 0.84; Phenol 0.79; 4-nitrophenol 0.76; Selenium compounds 0.72; Styrene 0.73; Xylenes 0.78.

Variable: Lead compounds. Variable used to represent group: Chromium compounds.
Correlated variables (correlation coefficient): Chromium compounds 0.74; Arsenic compounds 0.74.

Variable: Phenol. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.71; 4-4-methylenediphenyl diisocyanate 0.71; Acetophenone 0.73; Biphenyl 0.74; Cresol cresylic acid 0.75; Cumene 0.78; Ethylene glycol 0.75; Mercury compounds 0.72; Methanol 0.76; Methyl isobutyl ketone 0.78; MTBE 0.71; Naphthalene 0.73; Pahpom 0.79; Styrene 0.74; Xylenes 0.72.

Variable: 4-nitrophenol. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.81; 2-2-4-trimethylpentane 0.82; 4-4-methylenediphenyl diisocyanate 0.78; Acetophenone 0.81; Benzene 0.74; Biphenyl 0.74; 1-3-butadiene 0.77; Cumene 0.81; Diesel engine emissions 0.74; Ethylene glycol 0.83; Triethylamine 0.71; Hexane 0.79; Mercury compounds 0.80; Methanol 0.82; Methyl isobutyl ketone 0.79; Methyl methacrylate 0.72; MTBE 0.74; Naphthalene 0.79; Pahpom 0.76; Propionaldehyde 0.71; Selenium compounds 0.75; Styrene 0.75; 2-4-toluene diisocyanate 0.70; Toluene 0.77; Vinyl acetate 0.73; Xylenes 0.81.

Variable: Propylene oxide. Variable used to represent group: Chloroprene.
Correlated variables (correlation coefficient): Chloroprene 0.75; Allyl chloride 0.77; Acetonitrile 0.77; n-n-dimethylaniline 0.78; 2-4-dinitrotoluene 0.70; Epichlorohydrin 0.75; Nitrobenzene 0.77; o-toluidine 0.73.

Variable: Propionaldehyde. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.74; 2-2-4-trimethylpentane 0.73; Acrolein 0.75; Cresol cresylic acid 0.71; Ethylene glycol 0.73; Mercury compounds 0.73; Methanol 0.72; Naphthalene 0.74; 4-nitrophenol 0.71; Selenium compounds 0.70; Xylenes 0.73.

Variable: Selenium compounds. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.76; 2-2-4-trimethylpentane 0.76; 4-4-methylenediphenyl diisocyanate 0.71; Acetophenone 0.72; Biphenyl 0.72; 1-3-butadiene 0.70; Cumene 0.76; Diesel engine emissions 0.71; Ethylene glycol 0.78; Hexane 0.72; Mercury compounds 0.91; Methanol 0.77; Methyl isobutyl ketone 0.76; MTBE 0.70; Naphthalene 0.77; Pahpom 0.72; 4-nitrophenol 0.75; Propionaldehyde 0.70; Xylenes 0.77.

Variable: Styrene. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.82; 2-2-4-trimethylpentane 0.82; 4-4-methylenediphenyl diisocyanate 0.79; Benzene 0.70; 1-3-butadiene 0.73; Cumene 0.81; Diesel engine emissions 0.74; Ethylene glycol 0.81; Triethylamine 0.73; Hexane 0.80; Mercury compounds 0.74; Dimethyl phthalate 0.75; Methanol 0.81; Methyl isobutyl ketone 0.81; Methyl methacrylate 0.83; MTBE 0.72; Naphthalene 0.76; Pahpom 0.73; Phenol 0.74; 4-nitrophenol 0.75; Toluene 0.74; Vinyl acetate 0.73; Xylenes 0.83.

Variable: 1-2-4-trichlorobenzene. Variable used to represent group: Chloroprene.
Correlated variables (correlation coefficient): Chloroprene 0.70; Allyl chloride 0.78; n-n-dimethylaniline 0.78; 2-4-dinitrotoluene 0.76; Epichlorohydrin 0.74; Nitrobenzene 0.76; o-toluidine 0.74.

Variable: 2-4-toluene diisocyanate. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.77; 2-2-4-trimethylpentane 0.72; 4-4-methylenediphenyl diisocyanate 0.75; 1-3-butadiene 0.70; Cumene 0.77; Diesel engine emissions 0.71; Ethylene glycol 0.78; Triethylamine 0.74; Hexane 0.77; Methanol 0.76; Methyl isobutyl ketone 0.77; MTBE 0.73; Naphthalene 0.70; 4-nitrophenol 0.70; Toluene 0.71; Vinyl acetate 0.70; Xylenes 0.77.

Variable: Toluene. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.88; Vinylidene chloride 0.71; 2-2-4-trimethylpentane 0.88; 4-4-methylenediphenyl diisocyanate 0.77; Acetophenone 0.70; Benzene 0.96; Biphenyl 0.71; 1-3-butadiene 0.94; Tetrachloroethylene 0.82; Cumene 0.81; Diesel engine emissions 0.78; Ethylene glycol 0.84; Triethylamine 0.74; Hexane 0.85; Mercury compounds 0.76; Methanol 0.82; Methyl isobutyl ketone 0.79; Methyl methacrylate 0.71; MTBE 0.73; Naphthalene 0.83; 4-nitrophenol 0.77; Styrene 0.74; 2-4-toluene diisocyanate 0.71; Vinyl acetate 0.73; Xylenes 0.88.

Variable: Vinyl acetate. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.79; 2-2-4-trimethylpentane 0.78; 4-4-methylenediphenyl diisocyanate 0.80; Acetophenone 0.70; 1-3-butadiene 0.73; Cumene 0.80; Diesel engine emissions 0.72; Ethylene glycol 0.82; Triethylamine 0.77; Hexane 0.79; Mercury compounds 0.76; Methanol 0.79; Methyl isobutyl ketone 0.76; Methyl methacrylate 0.72; Naphthalene 0.78; 4-nitrophenol 0.73; Styrene 0.73; 2-4-toluene diisocyanate 0.70; Toluene 0.73; Xylenes 0.88.

Variable: Xylenes. Variable used to represent group: Ethylbenzene.
Correlated variables (correlation coefficient): Ethylbenzene 0.99; Vinylidene chloride 0.74; 2-2-4-trimethylpentane 0.95; 4-4-methylenediphenyl diisocyanate 0.84; Acetophenone 0.77; Acrolein 0.77; Benzene 0.85; Biphenyl 0.76; 1-3-butadiene 0.84; Tetrachloroethylene 0.72; Cresol cresylic acid 0.78; Carbon disulfide 0.72; Cumene 0.88; Diesel engine emissions 0.85; Ethylene glycol 0.90; Triethylamine 0.81; Hexane 0.92; Mercury compounds 0.82; Dimethyl phthalate 0.74; Methanol 0.89; Methyl isobutyl ketone 0.89; Methyl methacrylate 0.78; MTBE 0.79; Naphthalene 0.88; Pahpom 0.78; Phenol 0.72; 4-nitrophenol 0.81; Propionaldehyde 0.73; Selenium compounds 0.77; Styrene 0.83; 2-4-toluene diisocyanate 0.77; Toluene 0.88; Vinyl acetate 0.80.

Water Domain

Variable: Percent of county abnormally dry. Variable used to represent group: Percent of county drought - extreme.
Correlated variables (correlation coefficient): Percent of county without drought 0.94; Percent of county drought - moderate 0.94; Percent of county drought - severe 0.86; Percent of county drought - extreme 0.71.

Variable: Percent of county drought - moderate. Variable used to represent group: Percent of county drought - extreme.
Correlated variables (correlation coefficient): Percent of county without drought 0.94; Percent of county abnormally dry 0.94; Percent of county drought - severe 0.86; Percent of county drought - extreme 0.71.

Variable: Percent of county drought - severe. Variable used to represent group: Percent of county drought - extreme.
Correlated variables (correlation coefficient): Percent of county without drought 0.86; Percent of county abnormally dry 0.86; Percent of county drought - moderate 0.94; Percent of county drought - extreme 0.71.

Variable: Percent of county drought - exceptional. Variable used to represent group: Percent of county drought - extreme.
Correlated variables (correlation coefficient): Percent of county drought - moderate 0.94; Percent of county drought - severe 0.86; Percent of county drought - extreme 0.80.

Variable: Lindane - average. Variable used to represent group: Barium - average.
Correlated variables (correlation coefficient): Barium - average 0.75.

Variable: Thallium - average. Variable used to represent group: Cadmium - average.
Correlated variables (correlation coefficient): Cadmium - average 0.76.

Variable: Toxaphene - average. Variable used to represent group: Endrin - average.
Correlated variables (correlation coefficient): Endrin - average 0.80.

Variable: Oxamyl (Vydate) - average. Variable used to represent group: Dalapon - average.
Correlated variables (correlation coefficient): Dalapon - average 0.70.

Variable: Alachlor - average. Variable used to represent group: Simazine - average.
Correlated variables (correlation coefficient): Simazine - average 0.72.

Variable: 2,4,5-TP (Silvex) - average. Variable used to represent group: Picloram - average.
Correlated variables (correlation coefficient): Picloram - average 0.73.

Variable: Hexachlorocyclopentadiene - average. Variable used to represent group: Ethylene dibromide (EDB) - average.
Correlated variables (correlation coefficient): Ethylene dibromide (EDB) - average 0.80.

Variable: Carbofuran - average. Variable used to represent group: Chlordane - average.
Correlated variables (correlation coefficient): Chlordane - average 0.79.

Variable: Heptachlor - average. Variable used to represent group: Di(2-ethylhexyl) phthalate (DEHP) - average.
Correlated variables (correlation coefficient): Di(2-ethylhexyl) phthalate (DEHP) - average 0.77; Hexachlorobenzene - average 0.70; Heptachlor Epoxide - average 0.81.

Variable: Heptachlor Epoxide - average. Variable used to represent group: Di(2-ethylhexyl) phthalate (DEHP) - average.
Correlated variables (correlation coefficient): Di(2-ethylhexyl) phthalate (DEHP) - average 0.73; Hexachlorobenzene - average 0.74; Heptachlor - average 0.81.

Variable: Hexachlorobenzene - average. Variable used to represent group: Di(2-ethylhexyl) phthalate (DEHP) - average.
Correlated variables (correlation coefficient): Di(2-ethylhexyl) phthalate (DEHP) - average 0.77; Heptachlor - average 0.70; Heptachlor Epoxide - average 0.74.

Variable: 1,2,4-Trichlorobenzene - average. Variable used to represent group: Ethylbenzene - average.
Correlated variables (correlation coefficient): Ethylbenzene - average 0.77; Vinyl chloride - average 0.71; Benzene - average 0.82.

Variable: 1,2-Dichlorobenzene (o-Dichlorobenzene) - average. Variable used to represent group: Ethylbenzene - average.
Correlated variables (correlation coefficient): 1,2,4-Trichlorobenzene - detect 0.80; Ethylbenzene - average 0.77; Benzene - average 0.88.

Variable: Vinyl chloride - average. Variable used to represent group: Ethylbenzene - average.
Correlated variables (correlation coefficient): 1,2-Dichlorobenzene (o-Dichlorobenzene) - average 0.73; 1,2,4-Trichlorobenzene - detect 0.80; Ethylbenzene - average 0.77; Benzene - average 0.82.

Variable: Benzene - average. Variable used to represent group: Ethylbenzene - average.
Correlated variables (correlation coefficient): 1,2-Dichlorobenzene (o-Dichlorobenzene) - average 0.88; 1,2,4-Trichlorobenzene - detect 0.82; Ethylbenzene - average 0.72; Vinyl chloride - average 0.82.

Variable: 1,1-Dichloroethylene - average. Variable used to represent group: cis-1,2-Dichloroethylene - average.
Correlated variables (correlation coefficient): cis-1,2-Dichloroethylene - average 0.70; Dichloroethylene - average 0.70; cis-1,2-Dichloroethylene - average 0.81.

Variable: W_t12DCE_ln. Variable used to represent group: cis-1,2-Dichloroethylene - average.
Correlated variables (correlation coefficient): cis-1,2-Dichloroethylene - average 0.82; 1,1-Dichloroethylene - average 0.70; cis-1,2-Dichloroethylene - average 0.75.

Variable: cis-1,2-Dichloroethylene - average. Variable used to represent group: cis-1,2-Dichloroethylene - average.
Correlated variables (correlation coefficient): cis-1,2-Dichloroethylene - average 0.82; 1,1-Dichloroethylene - average 0.81; Dichloroethylene - average 0.75.

Variable: Carbon Tetrachloride - average. Variable used to represent group: 1,1,1-Trichloroethane - average.
Correlated variables (correlation coefficient): 1,1,1-Trichloroethane - average 0.71.

Variable: 1,2-Dichloropropane - average. Variable used to represent group: 1,4-Dichlorobenzene (p-Dichlorobenzene) - average.
Correlated variables (correlation coefficient): 1,4-Dichlorobenzene (p-Dichlorobenzene) - average 0.72.

Variable: 1,1,2-Trichloroethane - average. Variable used to represent group: Tetrachloroethylene - average.
Correlated variables (correlation coefficient): Tetrachloroethylene - average 0.80.

Land Domain

Variable: Mean manganese. Variable used to represent group: Mean iron percent.
Correlated variables (correlation coefficient): Mean iron percent 0.90.

Variable: Percent weed acres. Variable used to represent group: Percent harvested acres.
Correlated variables (correlation coefficient): Percent harvested acres 0.96; Percent lime acres 0.95.

Variable: Percent lime acres. Variable used to represent group: Percent harvested acres.
Correlated variables (correlation coefficient): Percent harvested acres 0.97; Percent weed acres 0.95.

Sociodemographic Domain

Variable: Property crime rate. Variable used to represent group: Violent crime rate.
Correlated variables (correlation coefficient): Violent crime rate 0.91.

Built Domain

Variable: Secondary road proportion. Variable used to represent group: Street proportion.
Correlated variables (correlation coefficient): Street proportion 0.94.

Appendix V: Sociodemographic and Built-Domain Valence Correction

Sociodemographic Overall

Variable | A priori characteristic | Loading (actual) | Match (expected versus observed) | Necessary to multiply vector of loadings by -1? | (Loading)^2 | Modified loadings
Percent bachelor's degree | Beneficial | 0.4585 |  | Yes | 0.2102 | -0.4585
Percent unemployed | Harmful | -0.1269 |  | Yes | 0.0161 | 0.1269
Percent vacant housing | Harmful | -0.1979 |  | Yes | 0.0392 | 0.1979
Household income | Beneficial | 0.3824 |  | Yes | 0.1462 | -0.3824
Percent renter-occupied housing | Harmful | 0.1458 | Yes | Yes | 0.0213 | -0.1458
Percent creative class | Beneficial | 0.4833 |  | Yes | 0.2336 | -0.4833
GINI coefficient | Harmful | -0.0118 |  | Yes | 0.0001 | 0.0118
Count of occupants per room | Harmful | -0.1085 |  | Yes | 0.0118 | 0.1085
Violent crime | Harmful | 0.0234 | Yes | Yes | 0.0005 | -0.0234
Median household value | Beneficial | 0.4331 |  | Yes | 0.1876 | -0.4331
Percent Democratic | Beneficial | 0.211 |  | Yes | 0.0445 | -0.211
Percent families less than poverty level | Harmful | -0.298 |  | Yes |  | 0.298

Sociodemographic RUCC 1

Variable | A priori characteristic | Loading (actual) | Match (expected versus observed) | Necessary to multiply vector of loadings by -1? | (Loading)^2 | Modified loadings
Percent bachelor's degree | Beneficial | 0.4689 |  | Yes | 0.2199 | -0.4689
Percent unemployed | Harmful | -0.1625 |  | Yes | 0.0264 | 0.1625
Percent vacant housing | Harmful | -0.2306 |  | Yes | 0.0532 | 0.2306
Household income | Beneficial | 0.3700 |  | Yes | 0.1369 | -0.3700
Percent renter-occupied housing | Harmful | 0.1827 | Yes | Yes | 0.0334 | -0.1827
Percent creative class | Beneficial | 0.4668 |  | Yes | 0.2179 | -0.4668
GINI coefficient | Harmful | 0.1162 | Yes | Yes | 0.0135 | -0.1162
Count of occupants per room | Harmful | -0.0055 |  | Yes | 0.0000 | 0.0055
Violent crime | Harmful | 0.0094 | Yes | Yes | 0.0001 | -0.0094
Median household value | Beneficial | 0.4034 |  | Yes | 0.1627 | -0.4034
Percent Democratic | Beneficial | 0.2625 |  | Yes |  | -0.2625
Percent families less than poverty level | Harmful | -0.2591 |  | Yes | 0.0671 | 0.2591

Sociodemographic RUCC 2

Variable | A priori characteristic | Loading (actual) | Match (expected versus observed) | Necessary to multiply vector of loadings by -1? | (Loading)^2 | Modified loadings
Percent bachelor's degree | Beneficial | 0.4621 |  | Yes |  | -0.4621
Percent unemployed | Harmful | -0.3274 |  | Yes | 0.1072 | 0.3274
Percent vacant housing | Harmful | 0.1331 | Yes | Yes | 0.0177 | -0.1331
Household income | Beneficial | 0.0874 |  | Yes | 0.0076 | -0.0874
Percent renter-occupied housing | Harmful | -0.0141 |  | Yes | 0.0002 | 0.0141
Percent creative class | Beneficial | 0.4463 |  | Yes | 0.1992 | -0.4463
GINI coefficient | Harmful | -0.1604 |  | Yes | 0.0257 | 0.1604
Count of occupants per room | Harmful | -0.1371 |  | Yes | 0.0188 | 0.1371
Violent crime | Harmful | -0.2386 |  | Yes | 0.0569 | 0.2386
Median household value | Beneficial | 0.4002 |  | Yes | 0.1602 | -0.4002
Percent Democratic | Beneficial | 0.0929 |  | Yes |  | -0.0929
Percent families less than poverty level | Harmful | -0.4293 |  | Yes | 0.1843 | 0.4293

Built (Overall)

Variable | A priori characteristic | Loading (expected sign) | Loading (actual) | Match (expected versus observed) | Necessary to multiply vector of loadings by -1? | (Loading)^2 | Modified loadings
Vice-related environment | Harmful | + | 0.2930 | Yes | Yes | 0.0858 | -0.2930
Civic-related environment | Beneficial | - | 0.3071 | No | Yes | 0.0943 | -0.3071
Education-related environment | Beneficial | - | 0.3495 | No | Yes | 0.1222 | -0.3495
Health care-related environment | Beneficial | - | 0.2798 | No | Yes | 0.0783 | -0.2798
Negative food environment | Harmful | + | 0.2280 | Yes | Yes | 0.0520 | -0.2280
Positive food environment | Beneficial | - | 0.3179 | No | Yes | 0.1011 | -0.3179
Recreation environment | Beneficial | - | 0.3590 | No | Yes | 0.1289 | -0.3590
Social service-related environment | Beneficial | - | 0.3629 | No | Yes | 0.1317 | -0.3629
Traffic fatality rate | Harmful | + | -0.1751 | No | Yes | 0.0307 | 0.1751
Rate of low-rent + Section 8 housing | Harmful | + | 0.0581 | Yes | Yes | 0.0034 | -0.0581
Proportion of secondary roads | Harmful | + | -0.1777 | No | Yes | 0.0316 | 0.1777
Commute time | Harmful | + | -0.3329 | No | Yes | 0.1108 | 0.3329
Public transportation | Beneficial | - | 0.0463 | No | Yes | 0.0021 | -0.0463
Walkability score | Beneficial | - | 0.1585 | No | Yes | 0.0251 | -0.1585
Proportion green space | Beneficial | - | -0.0451 | Yes | Yes | 0.0020 | 0.0451

Built RUCC 1

Variable | A priori characteristic | Loading (expected sign) | Loading (actual) | Match (expected versus observed) | Necessary to multiply vector of loadings by -1? | (Loading)^2 | Modified loadings
Vice-related environment | Harmful | + | 0.2676 | Yes | Yes | 0.0716 | -0.2676
Civic-related environment | Beneficial | - | 0.1238 | No | Yes | 0.0153 | -0.1238
Education-related environment | Beneficial | - | 0.2409 | No | Yes | 0.0580 | -0.2409
Health care-related environment | Beneficial | - | 0.4189 | No | Yes | 0.1755 | -0.4189
Negative food environment | Harmful | + | 0.3239 | Yes | Yes |  | -0.3239
Positive food environment | Beneficial | - | 0.3405 |  | Yes | 0.1159 | -0.3405
Recreation environment | Beneficial | - | 0.2354 |  | Yes | 0.0554 | -0.2354
Social service-related environment | Beneficial | - | 0.3446 |  | Yes | 0.1187 | -0.3446
Traffic fatality rate | Harmful | + | 0.1978 | Yes | Yes | 0.0391 | -0.1978
Rate of low-rent + Section 8 housing | Harmful | + | -0.1230 |  | Yes | 0.0151 | 0.1230
Proportion of secondary roads | Harmful | + | 0.0950 | Yes | Yes | 0.0090 | -0.0950
Commute time | Harmful | + |  |  | Yes | 0.0356 | 
Public transportation | Beneficial | - | 0.2253 |  | Yes | 0.0508 | -0.2253
Walkability score | Beneficial | - | 0.3516 |  | Yes | 0.1236 | -0.3516
Proportion green space | Beneficial | - | -0.1065 |  | Yes | 0.0113 | 0.1065

Built RUCC 2

Variable | A priori characteristic | Loading (expected sign) | Loading (actual) | Match (expected versus observed) | Necessary to multiply vector of loadings by -1? | (Loading)^2 | Modified loadings
Vice-related environment | Harmful | + | 0.0331 | Yes | Yes | 0.0331 | -0.0331
Civic-related environment | Beneficial | - | 0.2057 |  | Yes | 0.2057 | -0.2057
Education-related environment | Beneficial | - | 0.2626 |  | Yes | 0.2626 | -0.2626
Health care-related environment | Beneficial | - | 0.3856 |  | Yes | 0.3856 | -0.3856
Negative food environment | Harmful | + | 0.2707 | Yes | Yes | 0.2707 | -0.2707
Positive food environment | Beneficial | - | 0.2752 |  | Yes | 0.2752 | -0.2752
Recreation environment | Beneficial | - | 0.3484 |  | Yes | 0.3484 | -0.3484
Social service-related environment | Beneficial | - | 0.3503 |  | Yes | 0.3503 | -0.3503
Traffic fatality rate | Harmful | + | -0.2340 |  | Yes | -0.2340 | 0.2340
Rate of low-rent + Section 8 housing | Harmful | + | 0.0459 | Yes | Yes | 0.0459 | -0.0459
Proportion of secondary roads | Harmful | + | -0.1319 |  | Yes | -0.1319 | 0.1319
Commute time | Harmful | + |  |  | Yes |  | 
Public transportation | Beneficial | - | 0.1111 |  | Yes | 0.1111 | -0.1111
Walkability score | Beneficial | - | 0.3310 |  | Yes | 0.3310 | -0.3310
Proportion green space | Beneficial | - | 0.0253 |  | Yes | 0.0253 | -0.0253

Built RUCC 3

Variable | A priori characteristic | Loading (expected sign) | Loading (actual) | Match (expected versus observed) | Necessary to multiply vector of loadings by -1? | (Loading)^2 | Modified loadings
Vice-related environment | Harmful | + | 0.2724 | Yes | Yes | 0.0742 | -0.2724
Civic-related environment | Beneficial | - | 0.1890 |  | Yes | 0.0357 | -0.1890
Education-related environment | Beneficial | - | 0.3278 |  | Yes | 0.1074 | -0.3278
Health care-related environment | Beneficial | - | 0.3179 |  | Yes | 0.1011 | -0.3179
Negative food environment | Harmful | + | 0.2306 | Yes | Yes | 0.0532 | -0.2306
Positive food environment | Beneficial | - | 0.2660 |  | Yes | 0.0707 | -0.2660
Recreation environment | Beneficial | - | 0.3212 |  | Yes | 0.1032 | -0.3212
Social service-related environment | Beneficial | - | 0.3644 |  | Yes | 0.1328 | -0.3644
Traffic fatality rate | Harmful | + | -0.2197 |  | Yes | 0.0483 | 0.2197
Rate of low-rent + Section 8 housing | Harmful | + | 0.0697 |  | Yes | 0.0049 | -0.0697
Proportion of secondary roads | Harmful | + | -0.1761 |  |  | 0.0310 | 0.1761
Commute time | Harmful | + | -0.3230 |  |  | 0.1043 | 0.3230
Public transportation | Beneficial | - | 0.0777 |  |  | 0.0060 | -0.0777
Walkability score | Beneficial | - | 0.3542 |  |  | 0.1255 | -0.3542
Proportion green space | Beneficial | - | -0.0418 |  | Yes | 0.0017 | 0.0418

Built RUCC 4

Variable | A priori characteristic | Loading (expected sign) | Loading (actual) | Match (expected versus observed) | Necessary to multiply vector of loadings by -1? | (Loading)^2 | Modified loadings
Vice-related environment | Harmful | + | 0.2595 | Yes | Yes | 0.0673 | -0.2595
Civic-related environment | Beneficial | - | 0.3102 | No | Yes | 0.0962 | -0.3102
Education-related environment | Beneficial | - | 0.3285 | No | Yes | 0.1079 | -0.3285
Health care-related environment | Beneficial | - | 0.2742 | No | Yes | 0.0752 | -0.2742
Negative food environment | Harmful | + | 0.1527 | Yes | Yes | 0.0233 | -0.1527
Positive food environment | Beneficial | - | 0.2524 | No | Yes | 0.0637 | -0.2524
Recreation environment | Beneficial | - | 0.3222 | No | Yes | 0.1038 | -0.3222
Social service-related environment | Beneficial | - | 0.2793 | No | Yes | 0.0780 | -0.2793
Traffic fatality rate | Harmful | + | -0.2312 | No | Yes | 0.0535 | 0.2312
Rate of low-rent + Section 8 housing | Harmful | + | -0.0178 | No | Yes | 0.0003 | 0.0178
Proportion of secondary roads | Harmful | + | -0.2054 | No | Yes | 0.0422 | 0.2054
Commute time | Harmful | + | -0.3546 | No | Yes | 0.1257 | 0.3546
Public transportation | Beneficial | - | 0.0256 | No | Yes | 0.0007 | -0.0256
Walkability score | Beneficial | - | 0.3787 | No | Yes | 0.1434 | -0.3787
Proportion green space | Beneficial | - | -0.1370 | Yes | Yes | 0.0188 | 0.1370

Appendix VI: County Maps of Environmental Quality Index 2006-2010

[Map: Overall Environmental Quality Index by County 2006-2010. Legend (percentile): 0-5th, 5th-20th, 20th-40th, 40th-60th, 60th-80th, 80th-95th, 95th-100th]

[Map: Air Domain Index by County 2006-2010. Legend: percentile]

* For orientation to the maps, low index scores (EQI and domain-specific) indicate higher environmental quality, and higher index scores (EQI and
domain-specific) mean lower environmental quality.

[Map: Water Domain Index by County 2006-2010]

[Map: Land Domain Index by County 2006-2010]

[Map: Sociodemographic Domain Index by County 2006-2010]

[Map: Built Domain Index by County 2006-2010]

[Map: Overall Environmental Quality Index Stratified by Rural-Urban Continuum Codes by County 2006-2010]

[Map: Built Domain Index Stratified by Rural-Urban Continuum Codes by County 2006-2010]

[Map: Water Domain Index Stratified by Rural-Urban Continuum Codes by County 2006-2010]

[Map: Land Domain Index Stratified by Rural-Urban Continuum Codes by County 2006-2010]

[Map: Sociodemographic Domain Index Stratified by Rural-Urban Continuum Codes by County 2006-2010]

[Map: Built Domain Index Stratified by Rural-Urban Continuum Codes by County 2006-2010]

Legend for all maps (percentile): 0-5th, 5th-20th, 20th-40th, 40th-60th, 60th-80th, 80th-95th, 95th-100th. Stratified maps use four panels: RUCC1 = Metropolitan urbanized, RUCC2 = Non-metro urbanized, RUCC3 = Less urbanized, RUCC4 = Thinly populated.

Appendix VII: Quality Assurance

The approved Center for Public Health and Environmental Assessment, Public
Health and Environmental Systems Division, Quality Assurance Project Plan for this project is "Creating an Overall Environmental Quality Index," with Document Control Number IRP-NHEERL/HSD/EBB/DL/2008-01-QP-1-7. An internal EPA review of this report was conducted in April 2019. An external peer review was conducted in March 2020. The data sources used to create the EQI and the criteria used to select the data sources are mentioned in this report in the Development of the EQI 2006-2010 section. Information about uses of the EQI, as well as strengths and limitations of the EQI, is located within the Discussion section of the report.
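The valence correction tabulated in Appendix V can be illustrated with a short sketch. The tables record, for each variable, the a priori characteristic, the actual loading, and the modified loading after the vector of loadings is multiplied by -1; they do not state the rule used to decide when that multiplication is necessary. The Python below is a minimal sketch under a loudly stated assumption: the vector is flipped when the variables whose actual sign disagrees with the expected sign (Harmful = +, Beneficial = -) carry more than half of the total squared loading. The function name `valence_correct` and this decision rule are hypothetical; only the input values are taken from the Built (Overall) table.

```python
# Sketch only: Appendix V shows the outcome of the valence check, not the
# decision rule. The rule used here (flip when sign-mismatched variables carry
# more than half of the total squared loading) is an assumption.

EXPECTED_SIGN = {"Harmful": 1, "Beneficial": -1}


def valence_correct(rows):
    """Return (flip, modified) for rows of (variable, characteristic, loading)."""
    total = sum(loading ** 2 for _, _, loading in rows)
    # Squared loading carried by variables whose sign disagrees with expectation.
    mismatched = sum(
        loading ** 2
        for _, characteristic, loading in rows
        if loading * EXPECTED_SIGN[characteristic] < 0
    )
    flip = mismatched > total / 2
    factor = -1.0 if flip else 1.0
    modified = {name: factor * loading for name, _, loading in rows}
    return flip, modified


# Actual loadings copied from the Built (Overall) table in Appendix V.
built_overall = [
    ("Vice-related environment", "Harmful", 0.2930),
    ("Civic-related environment", "Beneficial", 0.3071),
    ("Education-related environment", "Beneficial", 0.3495),
    ("Health care-related environment", "Beneficial", 0.2798),
    ("Negative food environment", "Harmful", 0.2280),
    ("Positive food environment", "Beneficial", 0.3179),
    ("Recreation environment", "Beneficial", 0.3590),
    ("Social service-related environment", "Beneficial", 0.3629),
    ("Traffic fatality rate", "Harmful", -0.1751),
    ("Rate of low-rent + Section 8 housing", "Harmful", 0.0581),
    ("Proportion of secondary roads", "Harmful", -0.1777),
    ("Commute time", "Harmful", -0.3329),
    ("Public transportation", "Beneficial", 0.0463),
    ("Walkability score", "Beneficial", 0.1585),
    ("Proportion green space", "Beneficial", -0.0451),
]

flip, modified = valence_correct(built_overall)
print("Multiply vector of loadings by -1?", flip)
for name, value in modified.items():
    print(f"{name}: {value:+.4f}")
```

Because the whole vector is multiplied by the same factor, a few variables that matched their expected sign before the correction (for example, the vice-related environment) end up with the opposite sign afterward, which is consistent with the modified loadings shown in the tables.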