EPA/635/R-23/014
IRIS Assessment Protocol
www.epa.gov/iris

Protocol for the Ethylbenzene IRIS Assessment
(Preliminary Assessment Materials)
(CASRN 100-41-4)

February 2023

Integrated Risk Information System
Center for Public Health and Environmental Assessment
Office of Research and Development
U.S. Environmental Protection Agency
Washington, DC

DISCLAIMER

This document is a public comment draft for review purposes only. This information is distributed solely for the purpose of public comment. It has not been formally disseminated by EPA. It does not represent and should not be construed to represent any Agency determination or policy. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

This document is a draft for review purposes only and does not constitute Agency policy. DRAFT-DO NOT CITE OR QUOTE

CONTENTS

AUTHORS | CONTRIBUTORS | REVIEWERS
1. INTRODUCTION
2. SCOPING AND INITIAL PROBLEM FORMULATION
2.1. BACKGROUND
2.1.1. Physical and Chemical Properties
2.1.2. Sources, Production, and Uses
2.1.3. Environmental Fate and Transportation
2.1.4. Potential for Human Exposure and Populations with Potentially Greater Exposure
2.2. SCOPING AND PROBLEM FORMULATION SUMMARY
2.3. KEY SCIENCE ISSUES
3. OVERALL OBJECTIVES AND SPECIFIC AIMS
3.1. OBJECTIVES
3.2. SPECIFIC AIMS
4. LITERATURE SEARCH, SCREENING, AND INVENTORY
4.1. POPULATIONS, EXPOSURES, COMPARATORS, AND OUTCOMES (PECO) CRITERIA FOR THE SYSTEMATIC EVIDENCE MAP
4.2. SUPPLEMENTAL CONTENT SCREENING CRITERIA
4.3. LITERATURE SEARCH STRATEGIES
4.3.1. Database Search Term Development
4.3.2. Database Searches
4.3.3. Searching Other Sources
4.3.4. Non-Peer-Reviewed Data
4.4. LITERATURE SCREENING
4.4.1. Title and Abstract Screening
4.4.2. Full-Text Screening
4.4.3. Multiple Citations with the Same Data
4.4.4. Literature Flow Diagrams
4.5. LITERATURE INVENTORY
4.5.1. Studies That Meet Problem Formulation PECO Criteria
4.5.2. Organizational Approach for Supplemental Material
4.6. SUMMARY-LEVEL LITERATURE INVENTORIES
5. REFINE PROBLEM FORMULATION AND SPECIFY ASSESSMENT APPROACH
5.1. ASSESSMENT PECO CRITERIA
5.1.1. Other Exclusions Based on Full-Text Content
5.2. UNITS OF ANALYSES FOR DEVELOPING EVIDENCE SYNTHESIS AND INTEGRATION JUDGMENTS FOR HEALTH EFFECT CATEGORIES
6. STUDY EVALUATION (RISK OF BIAS AND SENSITIVITY)
6.1. STUDY EVALUATION OVERVIEW FOR HEALTH EFFECT STUDIES
6.2. EPIDEMIOLOGY STUDY EVALUATION
6.2.1. Epidemiological Study Evaluation Considerations Specific to Exposure Domain for Ethylbenzene
6.2.2. Exposure Assessment Approaches used in Epidemiology Studies of Ethylbenzene and Potential Misclassification
6.2.3. ADME and Notes Relevant to Biomarkers
6.2.4. Time Frames Represented by Exposure Assessments
6.2.5. Correlation Between BTEX Compounds and Potential Confounding
6.2.6. Exposure Domain Evaluation Levels
6.3. CONTROLLED HUMAN EXPOSURE STUDY EVALUATION
6.4. EXPERIMENTAL ANIMAL STUDY EVALUATION
6.5. IN VITRO AND OTHER MECHANISTIC STUDY EVALUATION
6.6. PHYSIOLOGICALLY BASED PHARMACOKINETIC (PBPK) MODEL DESCRIPTIVE SUMMARY AND EVALUATION
6.6.1. Pharmacokinetic (PK)/Physiologically Based Pharmacokinetic (PBPK) Model Descriptive Summary
6.6.2. Pharmacokinetic (PK)/Physiologically Based Pharmacokinetic (PBPK) Model Evaluation
6.6.3. Selection of the Appropriate Dose Metric
7. DATA EXTRACTION OF STUDY METHODS AND RESULTS
8. EVIDENCE SYNTHESIS AND INTEGRATION
8.1. EVIDENCE SYNTHESIS
8.2. EVIDENCE INTEGRATION
9. DOSE-RESPONSE ASSESSMENT: SELECTING STUDIES AND QUANTITATIVE ANALYSIS
9.1. OVERVIEW
9.2. SELECTING STUDIES FOR DOSE-RESPONSE ASSESSMENT
9.2.1. Hazard and MOA Considerations for Dose Response
9.3. CONDUCTING DOSE-RESPONSE ASSESSMENTS
9.3.1. Dose-Response Analysis in the Range of Observation
9.3.2. Extrapolation: Slope Factors and Unit Risk
9.3.3. Extrapolation: Reference Values
10. PROTOCOL HISTORY
REFERENCES
APPENDIX A. ELECTRONIC DATABASE SEARCH STRATEGIES
APPENDIX B. PROCESS FOR SEARCHING AND COLLECTING EVIDENCE FROM SELECTED OTHER RESOURCES

TABLES

Table 2-1. Predicted or experimental physicochemical properties of ethylbenzene
Table 2-2. EPA program and regional office interest in an updated ethylbenzene assessment
Table 4-1. Problem formulation populations, exposures, comparators, and outcomes (PECO) criteria for the ethylbenzene assessment
Table 4-2. Categories of potentially relevant supplemental material
Table 5-1. Assessment PECO criteria for the ethylbenzene assessment
Table 5-2. Human and animal endpoint grouping categories
Table 6-1. Information relevant to evaluation domains for epidemiology studies
Table 6-2. Questions to guide the development of criteria for each domain in epidemiology studies
Table 6-3. Estimates representing total individual-level exposure based on personal or residential monitoring
Table 6-4. Exposure to ethylbenzene in ambient air
Table 6-5. Domains, questions, and general considerations to guide the evaluation of animal toxicology studies
Table 6-6. Domains, questions, and general considerations to guide the evaluation of in vitro studies
Table 6-7. Example descriptive summary for a physiologically based pharmacokinetic (PBPK) model study
Table 6-8. Criteria for evaluating physiologically based pharmacokinetic (PBPK) models
Table 8-1. Generalized evidence profile table to show the relationship between evidence synthesis and evidence integration to reach judgment of the evidence for hazard
Table 8-2. Generalized evidence profile table to show the key findings and supporting rationale from mechanistic analyses
Table 8-3. Considerations that inform judgments of the certainty of the evidence for hazard for each unit of analysis
Table 8-4. Framework for evidence synthesis judgments from studies in humans
Table 8-5. Framework for evidence synthesis judgments from studies in animals
Table 8-6. Considerations that inform evidence integration judgments
Table 8-7. Framework for summary evidence integration judgments in the evidence integration narrative
Table 9-1. Attributes used to evaluate studies for derivation of toxicity values (in addition to the health effect category-specific evidence integration judgment)
Table 9-2. Example table used in assessment to show endpoint consideration judgments for POD derivation
Table 9-3. Specific example of presenting endpoints considered for dose-response modeling and derivation of points of departure
Table A-1. Database search strategy
Table B-1. Summary table for ethylbenzene other sources search results (12/2021)

FIGURES

Figure 1-1. IRIS systematic review problem formulation and method documents
Figure 4-1. Literature flow diagram for ethylbenzene
Figure 4-2. Inventory heatmap of PECO-relevant ethylbenzene human studies by study design and health system. An interactive version, which includes a list of citations with additional study details and summary of the results, is available here
Figure 4-3. Inventory heatmap of PECO-relevant ethylbenzene animal studies by study design and health system. An interactive version, which includes a list of citations with additional study details and summary of the results, is available here
Figure 4-4. Literature tag tree of the supplemental studies identified from the ethylbenzene literature searches. An interactive version, which includes a list of citations with additional study details and summary of the results, is available here
Figure 4-5. High throughput screening bioactivity data from the CompTox Chemicals Dashboard. An interactive version, which includes a list of citations with additional study details and summary of the results, is available here
Figure 6-1. Overview of IRIS study evaluation process. (a) An overview of the evaluation process. (b) The evaluation domains and definitions for ratings (i.e., domain and overall judgments, performed on an outcome-specific basis)

ABBREVIATIONS

AC50  activity concentration at 50%
ADME  absorption, distribution, metabolism, and excretion
AIC  Akaike's information criterion
ALT  alanine aminotransferase
AOP  adverse outcome pathway
AST  aspartate aminotransferase
atm  atmosphere
ATSDR  Agency for Toxic Substances and Disease Registry
BMC  benchmark concentration
BMCL  benchmark concentration lower confidence limit
BMD  benchmark dose
BMDL  benchmark dose lower confidence limit
BMDS  Benchmark Dose Software
BMR  benchmark response
BTEX  benzene, toluene, ethylbenzene, o-xylene, m-/p-xylene
BUN  blood urea nitrogen
BW  body weight
BW3/4  body weight scaling to the 3/4 power
CA  chromosomal aberration
CAA  Clean Air Act
CAS  Chemical Abstracts Service
CASRN  Chemical Abstracts Service registry number
CERCLA  Comprehensive Environmental Response, Compensation, and Liability Act
CHO  Chinese hamster ovary (cell line cells)
CI  confidence interval
CL  confidence limit
CMAQ  Community Multi-scale Air Quality model
CNS  central nervous system
COI  conflict of interest
CPAD  Chemical and Pollutant Assessment Division
CPHEA  Center for Public Health and Environmental Assessment
CYP450  cytochrome P450
DAF  dosimetric adjustment factor
DMSO  dimethylsulfoxide
DNA  deoxyribonucleic acid
EPA  Environmental Protection Agency
ER  extra risk
FDA  Food and Drug Administration
FEV1  forced expiratory volume of 1 second
GD  gestation day
GDH  glutamate dehydrogenase
GGT  γ-glutamyl transferase
GLP  Good Laboratory Practice
GSH  glutathione
GST  glutathione-S-transferase
HAP  hazardous air pollutant
HAWC  Health Assessment Workspace Collaborative
Hb/g-A  animal blood:gas partition coefficient
Hb/g-H  human blood:gas partition coefficient
HBCD  hexabromocyclododecane
HEC  human equivalent concentration
HED  human equivalent dose
HERO  Health and Environmental Research Online
i.p.  intraperitoneal
i.v.  intravenous
IAP  IRIS Assessment Plan
IARC  International Agency for Research on Cancer
IRIS  Integrated Risk Information System
IUR  inhalation unit risk
LC50  median lethal concentration
LD50  median lethal dose
LOAEL  lowest-observed-adverse-effect level
LOEL  lowest-observed-effect level
LUR  land use regression
MeSH  Medical Subject Headings
MLE  maximum likelihood estimation
MN  micronuclei
MNPCE  micronucleated polychromatic erythrocyte
MOA  mode of action
MTD  maximum tolerated dose
NCI  National Cancer Institute
NMD  normalized mean difference
NOAEL  no-observed-adverse-effect level
NOEL  no-observed-effect level
NTP  National Toxicology Program
NZW  New Zealand White (rabbit breed)
OAR  Office of Air and Radiation
OECD  Organisation for Economic Co-operation and Development
OLEM  Office of Land and Emergency Management
ORD  Office of Research and Development
OSF  oral slope factor
PBPK  physiologically based pharmacokinetic
PECO  populations, exposures, comparators, and outcomes
PK  pharmacokinetic
PND  postnatal day
POD  point of departure
POD[adj]  duration-adjusted POD
QSAR  quantitative structure-activity relationship
RD  relative deviation
RfC  inhalation reference concentration
RfD  oral reference dose
RGDR  regional gas dose ratio
RNA  ribonucleic acid
ROBINS I  Risk of Bias in Nonrandomized Studies of Interventions
SAR  structure-activity relationship
SCE  sister chromatid exchange
SD  standard deviation
SDH  sorbitol dehydrogenase
SE  standard error
SGOT  serum glutamic oxaloacetic transaminase, also known as AST
SGPT  serum glutamic pyruvic transaminase, also known as ALT
TIAB  title and abstract
TSCATS  Toxic Substances Control Act Test Submissions
TWA  time-weighted average
UF  uncertainty factor
UFA  animal-to-human uncertainty factor
UFD  database deficiencies uncertainty factor
UFH  human variation uncertainty factor
UFL  LOAEL-to-NOAEL uncertainty factor
UFS  subchronic-to-chronic uncertainty factor
WOS  Web of Science

AUTHORS | CONTRIBUTORS | REVIEWERS

Assessment Managers (EPA/ORD/CPHEA)
Laura Dishaw, Ph.D.
Paul G. Reinhart, Ph.D.

Assessment Team (EPA/ORD/CPHEA)
Timothy Anderson, Ph.D.
Christine Cai, Ph.D.
Ingrid Druwe, Ph.D.
Yu-Sheng Lin, Ph.D.
Anuradha Mudipalli, Ph.D.
Rebecca Nachman, Ph.D.
Rachel Shaffer, Ph.D.
John Stanek, Ph.D.
George Woodall, Ph.D.
Brittany Schulz, B.S. (Student Services Contractor, Oak Ridge Associated Universities [ORAU])

Executive Direction (EPA/ORD/CPHEA)
Wayne Cascio, M.D. (CPHEA Director)
V. Kay Holt, M.S. (CPHEA Deputy Director)
Samantha Jones, Ph.D. (CPHEA Associate Director)
Kristina Thayer, Ph.D. (CPAD Director)
Steve Dutton, Ph.D. (HEEAD Director)
Andrew Kraft, Ph.D. (CPAD Associate Director)
Ravi Subramaniam, Ph.D. (Acting CPAD Senior Advisor)
Paul White, Ph.D. (CPAD Senior Science Advisor)
Andrew Hotchkiss, Ph.D. (Branch Chief)
Janice Lee, Ph.D. (Branch Chief)
Elizabeth Radke-Farabaugh, Ph.D. (Branch Chief)
Viktor Morozov, Ph.D. (Branch Chief)
Garland Waleko, M.S. (Acting Branch Chief)

Contributors (EPA/ORD/CPHEA)
Michelle Angrish, Ph.D.
Andrew Shapiro

Production Team (EPA/ORD/CPHEA)
Maureen Johnson (CPHEA Webmaster)
Ryan Jones (HERO Director)
Dahnish Shams (Project Management Team)
Vicki Soto (Project Management Team)
Jessica Soto-Hernandez (Project Management Team)
Samuel Thacker (HERO Team)

1. INTRODUCTION

The Integrated Risk Information System (IRIS) Program is undertaking a reassessment of the health effects of ethylbenzene.
IRIS assessments provide high-quality, publicly available hazard identification and dose-response analyses on chemicals to which the public might be exposed. These assessments are not regulations but provide an important source of toxicity information used by the Environmental Protection Agency (EPA), state and local health agencies, tribes, other federal agencies, and international health organizations.

A draft IRIS assessment plan (IAP) for ethylbenzene was presented at a public science meeting on September 27-28, 2017 (https://sab.epa.gov/ords/sab/f?p=100:19:3574465722633) to seek input on the problem formulation components of the assessment plan. The 2017 IAP specified the EPA need for an ethylbenzene assessment, described the objectives and specific aims of the assessment, provided draft PECO (populations, exposures, comparators, and outcomes) criteria, and described areas of scientific complexity. However, in April 2019 the ethylbenzene assessment was suspended due to changes in how EPA identified priorities for the IRIS Program (April 2019 IRIS Program Outlook). In June 2021, the assessment work was restarted after interest was expressed by EPA's Office of Land and Emergency Management (OLEM), Office of Chemical Safety and Pollution Prevention (OCSPP), and Region 2. This assessment may also be used to support actions in other EPA Program and Regional Offices and can inform efforts to address ethylbenzene by tribes, states, and international health agencies (see Section 2.2).

This protocol document includes the IAP content, revised based on public input, and updated EPA scoping needs, and presents the methods for conducting the systematic review and dose-response analysis for the assessment. While the IAP describes what the assessment will cover, this protocol describes how the assessment will be conducted (see Figure 1-1). The methods described in this protocol are based on the Office of Research and Development (ORD) Staff Handbook for Developing Integrated Risk Information System (IRIS) Assessments (referred to as the "IRIS Handbook") (U.S. EPA, 2022).

[Figure 1-1 image not reproduced. Stage labels shown: Assessment Initiated; Scoping/Initial Problem Formulation; Specify Assessment Approach; Assessment Developed. Text box: "Systematic Review: A structured and documented process for transparent literature review using explicit, pre-specified scientific methods to identify, select, assess, and summarize the findings of similar but separate studies."]

Figure 1-1. IRIS systematic review problem formulation and method documents.

2. SCOPING AND INITIAL PROBLEM FORMULATION

2.1. BACKGROUND

Section 2.1 provides a brief overview of aspects of the physicochemical properties, human exposure, and environmental fate characteristics of ethylbenzene that might provide useful context for this protocol. This overview is not intended to provide a comprehensive description of the available information on these topics and is not recommended for use in decision-making. The reader is encouraged to refer to the source materials cited below, more recent publications on these topics, and authoritative reviews or assessments focused on these topics.

A previous assessment of ethylbenzene is available on the IRIS website (https://cfpub.epa.gov/ncea/iris2/chemicalLanding.cfm?substance_nmbr=51) (U.S. EPA, 1991b). An oral RfD of 1 × 10⁻¹ mg/kg-day was posted in 1987 based on hepatic and renal toxicity. An inhalation RfC of 1 mg/m³ was posted in 1991 based on developmental toxicity.
In 1988 the cancer weight of evidence for ethylbenzene was categorized as "Group D," that is, not classified concerning its potential to cause cancer in humans, due to a lack of animal and human data. Since then, several relevant studies on ethylbenzene toxicity have been completed and new data have become available.

2.1.1. Physical and Chemical Properties

Ethylbenzene is a colorless flammable liquid with a sweet, gasoline-like odor (ATSDR, 2010). Various physical and chemical properties are presented in Table 2-1 below.

Table 2-1. Predicted or experimental physicochemical properties of ethylbenzene

Characteristic or property (unit) | Value(a) | Reference
Chemical structure | [structure image not reproduced] | U.S. EPA (2021)
CASRN | 100-41-4 | U.S. EPA (2021)
Synonyms | 1-ethylbenzene, alpha-methyltoluene, ethylbenzol, phenylethane, EB | U.S. EPA (2021)
Color/form | colorless liquid | U.S. EPA (2021)
Molecular formula | C₆H₅CH₂CH₃ | U.S. EPA (2021)
Molecular weight (g/mol) | 106.168 | U.S. EPA (2021)
Density (g/cm³) | 0.879(b) | U.S. EPA (2021)
Boiling point (°C) | 136 | U.S. EPA (2021)
Melting point (°C) | -95.0 | U.S. EPA (2021)
Heat of formation (kJ/mol) | -12.55 | ANL (2021)
Log Kow | 3.15 | U.S. EPA (2021)
Koc (L/kg) | 170 | U.S. EPA (2021)
Henry's law constant (atm-m³/mol) | 7.88 × 10⁻³ | U.S. EPA (2021)
Solubility in water (mol/L) | 1.64 × 10⁻³ | U.S. EPA (2021)
Vapor pressure (mmHg) | 9.60 | U.S. EPA (2021)

1 ppm = 4.34 mg/m³ at 25 °C (ATSDR, 2010).
(a) When available, average experimental values are reported from the U.S. EPA (2021) Chemicals Dashboard (Ethylbenzene DTXSID3020596): https://comptox.epa.gov/dashboard/chemical/details/DTXSID3020596. Predicted values are provided when experimental values are not available but may be less reliable than experimental values.

2.1.2. Sources, Production, and Uses

Ethylbenzene can be found naturally in crude petroleum and in numerous man-made products for industrial and consumer use. Exposure to ethylbenzene can occur via releases to the air, water, and soil during the manufacturing process (ATSDR, 2010) and from burning fossil fuels (automobile exhaust and small gasoline engines). Ethylbenzene is produced by the alkylation of benzene with ethylene in the liquid phase or by vapor-phase reaction of benzene with dilute ethylene (Cannella, 2007; Welch et al., 2005; Ransley, 1984; Clayton and Clayton, 1981). Newer methods employ synthetic zeolites for alkylation in the liquid phase or narrow-pore synthetic zeolites in the vapor phase (Welch et al., 2005). Other methods include dehydrogenation of naphthenes, preparation from acetophenone, separation from mixed xylenes via fractionation, reaction of ethylmagnesium bromide and chlorobenzene, extraction from coal oil, and recovery from benzene-toluene-xylene (BTX) processing (Clayton and Clayton, 1981; Welch et al., 2005; Ransley, 1984).

Ethylbenzene can be found in a variety of products including gasoline, paints, inks, varnishes, pesticides, carpet glues, tobacco products, and automobile products. The majority of produced ethylbenzene is used in the production of styrene (ATSDR, 2010).

2.1.3. Environmental Fate and Transportation

Ethylbenzene is widespread in the environment and is detected in air, water, and soil, but it is not considered to be highly persistent. In the air, it is removed by reaction with photochemically generated hydroxyl radicals, with a half-life of approximately 1-2 days. Ethylbenzene undergoes biodegradation under aerobic conditions and indirect photolysis in soil and water.
Volatilization from water and soil surfaces is expected to be an important environmental fate process for ethylbenzene based on the vapor pressure and Henry's law constant. On the basis of the soil adsorption coefficient (Koc), ethylbenzene is expected to possess moderate mobility (ATSDR, 2010).

2.1.4. Potential for Human Exposure and Populations with Potentially Greater Exposure

The general population is exposed to ethylbenzene through inhalation of contaminated air, ingestion of contaminated drinking water and foods, and dermal contact with contaminated soil and water. The predominant exposure route for the general population is inhalation of air contaminated by automobile exhaust. Additionally, the general population can be exposed to ethylbenzene from use of consumer products containing ethylbenzene (e.g., gasoline, paints, varnishes, inks, solvents, pesticides, coatings, and tobacco smoke) (ATSDR, 2010).

Populations with potentially greater exposure to ethylbenzene include people living near facilities that manufacture, contain, or use ethylbenzene (e.g., petroleum refineries, hazardous waste disposal sites, chemical plants) and people working or residing in high-traffic areas. People who obtain their drinking water from residential wells downstream from uncontrolled landfills, leaking underground storage tanks, and hazardous waste sites that are contaminated with ethylbenzene could potentially have greater oral and dermal exposure. Populations that may experience exposures greater than those of the general population include individuals employed in the petroleum refinery industry; the paint, solvents, and inks industry; and styrene-producing industries, as well as those involved in the manufacture of ethylbenzene and products that contain ethylbenzene (ATSDR, 2010).

2.2. SCOPING AND PROBLEM FORMULATION SUMMARY

The IAP for ethylbenzene was released in September 2017 (U.S. EPA, 2017b).
On September 27-28, 2017, the IAP was discussed at a Science Advisory Board Chemical Assessment Advisory Committee (SAB CAAC) meeting (https://sab.epa.gov/ords/sab/f?p=100:19:3574465722633) in which EPA sought input from the scientific community and interested parties.¹ This protocol considers input received on the 2017 IAP. However, in 2019 the ethylbenzene assessment was suspended due to changes in how EPA identified priorities for the IRIS Program (April 2019 IRIS Program Outlook). In 2021 the assessment work was restarted after it was nominated by EPA's Office of Land and Emergency Management (OLEM) and Region 2 as a priority need (see Table 2-2). Interest was also expressed by the Office of Chemical Safety and Pollution Prevention (OCSPP) because ethylbenzene is on the TSCA Work Plan list.

¹ Dissemination of scoping and problem formulation activities for public comment in IAPs began in 2017 as part of the IRIS Program's implementation of systematic review. However, there were prior problem formulation efforts on ethylbenzene that informed the IAP. Earlier scoping and problem formulation materials were released in July 2014 (U.S. EPA, 2014b) and presented at a public science meeting on September 3, 2014 (https://www.epa.gov/iris/iris-bimonthly-public-meeting-sep-2014).

Table 2-2. EPA program and regional office interest in an updated ethylbenzene assessment

Program or regional office | Oral | Inhalation | Statutes/regulations | Anticipated uses/interest
OLEM | ✓ | ✓ | CERCLA | Ethylbenzene has been identified as a contaminant of concern at numerous contaminated waste sites. CERCLA authorizes EPA to conduct short- or long-term cleanups at Superfund sites and later recover cleanup costs from potentially responsible parties. Ethylbenzene toxicological information may be used to make risk determinations for response actions (e.g., short-term removals, long-term remedial response actions, RCRA Corrective Action).
Region 2 | ✓ | ✓ | CERCLA | Region 2 contains 106 Superfund sites with ethylbenzene contamination. These include landfills, oil refineries, trucking facilities, former manufacturing facilities, and federal facilities.
OCSPP | ✓ | ✓ | TSCA | Ethylbenzene was identified on the 2014 update of the TSCA Work Plan for Chemical Assessments.

CERCLA = Comprehensive Environmental Response, Compensation, and Liability Act; OCSPP = Office of Chemical Safety and Pollution Prevention; OLEM = Office of Land and Emergency Management; RCRA = Resource Conservation and Recovery Act; TSCA = Toxic Substances Control Act.

2.3. KEY SCIENCE ISSUES

The 2017 IAP for ethylbenzene identified several key science issues that would require additional review and focus that were not covered in the previous assessment (U.S. EPA, 1991b). These key science issues continue to be of interest to EPA, as reflected in this protocol, in developing the ethylbenzene IRIS assessment:

• Interspecies difference in the pharmacokinetics of ethylbenzene. While there is evidence suggesting that ethylbenzene metabolism is critical to understanding its toxic effects, interspecies differences in the pharmacokinetics of ethylbenzene including metabolic biotransformation have been noted. Thus, one may need to apply toxicokinetic and dosimetry modeling (possibly including PBPK modeling) to account for interspecies differences, as appropriate.
• The selection of appropriate dose metrics to inform the toxicity assessment and human relevance for cancer and noncancer hazards observed in experimental systems (e.g., rat renal toxicity and tumors, mouse lung toxicity and tumors).

• Mechanisms of neurotoxicity including ototoxicity.
  o Reversibility, persistence, or potential for progression of the neurobehavioral or ototoxic effects after humans are removed from ethylbenzene exposure.
  o The relevance of ototoxicity to humans at lower exposure levels.

3. OVERALL OBJECTIVES AND SPECIFIC AIMS

3.1. OBJECTIVES

The overall objective of this assessment is to identify adverse health effects of ethylbenzene exposure and characterize exposure-response relationships for these effects to support development of toxicity values. This assessment will use systematic review methods to evaluate the epidemiological and toxicological literature, including consideration of relevant mechanistic evidence for ethylbenzene. The assessment methods described in this protocol utilize EPA guidelines.²

² EPA guideline documents: http://www.epa.gov/iris/basic-information-about-integrated-risk-information-system#guidance/.

3.2. SPECIFIC AIMS

• Develop a systematic evidence map (SEM) to identify epidemiological (i.e., human), toxicological (i.e., experimental animal), and supplemental literature pertinent to characterizing the health effects of exposure to ethylbenzene. The PECO criteria used to develop the SEM (referred to as "problem formulation PECO") are intended to identify the amount and type of evidence available to address a particular topic and serve as a useful scoping tool for health effects assessments (Thayer et al., 2022; NASEM, 2021; Wolffe et al., 2019).

• Supplemental material content includes: mechanistic studies, including in vivo, in vitro, ex vivo, or in silico models; nonmammalian model systems; pharmacokinetic and absorption, distribution, metabolism, and excretion (ADME) studies; pharmacokinetic (PK) or physiologically based pharmacokinetic (PBPK) models; exposure characteristics (no health outcome); data pertinent to identifying susceptible populations; mixture studies; non-PECO routes of exposure; case studies; records with no original data; conference abstracts; and errata.

• Use the results of the SEM to (1) develop PECO criteria for the assessment (referred to as "assessment PECO"); (2) define the unit(s) of analysis at the level of endpoint or health outcome for hazard characterization; and (3) identify priority analyses of supplemental material to address the specific aims, uncertainties in hazard characterization, susceptibility, and dose-response analysis.

• Conduct study evaluations (risk of bias and sensitivity) for individual epidemiological and toxicological studies that meet assessment PECO criteria.

• Conduct a scientific and technical review for PBPK models considered for use in the assessment. If a PBPK or PK model is selected for use, the most reliable dose metric will be applied based on analyses of the available dose metrics and the outcomes to which they are being applied.

• Conduct data extraction (summarizing study methods and results) from epidemiological and animal toxicological studies that meet the assessment PECO criteria.
• For each evidence stream, and for each unit of analysis, use a structured framework to develop and describe the certainty of evidence across studies and the supporting rationale ("evidence synthesis"). Depending on the specific health endpoint or outcome, mechanistic information and precursor events may be included in a unit of analysis.
• For each health effect category, use a structured framework to develop and describe weight of evidence judgments across evidence streams and the supporting rationale for those judgments ("evidence integration"). The evidence integration analysis presents inferences and conclusions on human relevance of findings in animals, cross-evidence stream coherence, potentially susceptible populations and lifestages, biological plausibility, and other critical inferences supported by mechanistic, ADME, or PK/PBPK analyses.
• For each health effect category, summarize evidence synthesis (certainty of evidence) and evidence integration (weight of evidence) conclusions in an evidence profile table.
• As supported by the currently available evidence, derive chronic and subchronic inhalation reference concentrations (RfCs) and reference doses (RfDs) and organ- or system-specific RfCs and RfDs. Apply pharmacokinetic and dosimetry modeling (possibly including PBPK modeling) to account for interspecies differences, as appropriate. Derive an inhalation unit risk (IUR) and oral cancer slope factor (OSF) as appropriate. Characterize confidence in any toxicity values that are derived.
• Characterize uncertainties and identify key data gaps and research needs, such as limitations of the evidence database, and consideration of dose relevance and pharmacokinetic differences when extrapolating findings from higher dose animal studies to lower levels of human exposure.

4. LITERATURE SEARCH, SCREENING, AND INVENTORY

The literature search and screening processes described in this section were used to develop an SEM using the problem formulation PECO (see Section 4.1) and supplemental screening criteria (see Section 4.2) to guide the inclusion of studies. The resulting inventory of studies identified in the SEM was used to develop assessment PECO criteria and identify priority analyses of supplemental material (described in Section 5). The initial literature search and all subsequent literature search updates use the same literature search and screening process, and therefore the literature inventory is continually updated with new studies as the assessment progresses.

4.1. POPULATIONS, EXPOSURES, COMPARATORS, AND OUTCOMES (PECO) CRITERIA FOR THE SYSTEMATIC EVIDENCE MAP

PECO criteria are used to focus the assessment question(s), search terms, and inclusion criteria. To meet the PECO criteria, a study must meet all PECO elements. The problem formulation PECO criteria used to develop the SEM were intentionally broad to identify all the available evidence in humans and animal models.

Table 4-1. Problem formulation populations, exposures, comparators, and outcomes (PECO) criteria for the ethylbenzene assessment

PECO element: Populations
Evidence:
Human: All populations and life stages (e.g., children, general population, occupational, or high exposure from an environmental source). The following study designs will be considered most informative: controlled exposure, cohort, case-control, cross-sectional, and ecological.
Note: Case reports and case series will be tracked during study screening but are not the primary focus of this assessment. They may be retrieved for full-text review and subsequent evidence synthesis if no or few more informative study designs are available. Case reports also can be used as supportive information to establish biological plausibility for some target organs and health outcomes.
Animal: Nonhuman mammalian animal species (whole organism) of any life stage (including preconception, in utero, lactation, peripubertal, and adult stages).

PECO element: Exposures
Evidence:
Human: Exposure to ethylbenzene (CASRN 100-41-4), including occupational exposures, alone or as a mixture by any route. Measures of metabolites used to estimate exposures to ethylbenzene.
Animal: Exposure to ethylbenzene (CASRN 100-41-4) alone by the oral or inhalation route. Studies employing chronic exposures will be considered the most informative. Studies involving exposures to mixtures will be included only if they include a group with exposure to ethylbenzene alone.

PECO element: Comparators
Evidence:
Human: Any comparison or reference group exposed to lower levels of ethylbenzene, no exposure to ethylbenzene, or to ethylbenzene for shorter periods of time.
Animal: Quantitative exposure vs. lower or no exposure with concurrent vehicle control group.

PECO element: Outcomes
Evidence:
All health outcomes (both cancer and noncancer). In general, endpoints related to clinical diagnostic criteria, disease outcomes, histopathological examination, or other apical/phenotypic outcomes will be prioritized for evidence synthesis over outcomes such as biochemical measures.

Notes: Studies meeting PECO criteria may also contain supplemental mechanistic content that describes biological or chemical events associated with phenotypic effects. When this occurs, these studies are also tagged as having supplemental mechanistic information. This typically happens during full-text review.
Full-text retrieval is performed for studies of transgenic model systems that meet E and C criteria because they may present phenotypic information in wildtype animals that meets P and O criteria but is not reported in the abstract.
CASRN = Chemical Abstracts Service registry number.

4.2. SUPPLEMENTAL CONTENT SCREENING CRITERIA

During the literature screening process, studies containing information potentially relevant to the specific aims of the assessment are tagged as supplemental material by category. Some studies could emerge as being critically important to the assessment and may need to be evaluated and summarized at the individual study level (e.g., certain cancer MOA or ADME studies), or might be helpful to provide context (e.g., provide hazard evidence from routes or durations of exposure not meeting the assessment PECO), or might not be cited at all in the assessment (e.g., individual studies that contribute to a well-established scientific conclusion). Because it is often difficult to assess the impact of individual studies tagged as supplemental material on assessment conclusions at the screening stage, the tagging structure, described in Table 4-2, allows for easy retrieval later in the assessment process.

Table 4-2. Categories of potentially relevant supplemental material

Pharmacokinetics data potentially informative to assessment analyses

Category (tag): Classical pharmacokinetic (PK) or physiologically based pharmacokinetic (PBPK) model studies
Description:
Classical Pharmacokinetic or Dosimetry Model Studies: Classical PK or dosimetry modeling usually divides the body into just one or two compartments, which are not specified by physiology, where movement of a chemical into, between, and out of the compartments is quantified empirically by fitting model parameters to ADME (absorption, distribution, metabolism, and excretion) data. This category is for papers that provide detailed descriptions of PK models but are not PBPK models.
• The data are typically the concentration time course in blood or plasma after inhalation exposure, but other exposure routes (i.e., oral and/or intravenous administration) can be described.
Physiologically Based Pharmacokinetic or Mechanistic Dosimetry Model Studies: PBPK models represent the body as various compartments (e.g., liver, lung, slowly perfused tissue, richly perfused tissue) to quantify the movement of chemicals or particles into and out of the body (compartments) by defined routes of exposure, metabolism, and elimination, and thereby estimate concentrations in blood or target tissues.
• A defining characteristic is that key parameters are determined from a substance's physicochemical parameters (e.g., particle size and distribution, octanol-water partition coefficient) and physiological parameters (e.g., ventilation rate, tissue volumes).
Typical assessment use: PBPK and PK model studies are included in the assessment and evaluated for possible use in conducting quantitative extrapolations. PBPK/PK models are categorized as supplemental material with the expectation that each one will be evaluated for applicability to address assessment extrapolation needs and technical conduct. Specialized expertise is required for their evaluation.
Standard operating procedures for PBPK/PK model evaluation and the identification, organization, and evaluation of ADME studies are outlined in An Umbrella Quality Assurance Project Plan (QAPP) for PBPK models (U.S. EPA, 2018b).

Category (tag): Pharmacokinetic (ADME)
Description: Pharmacokinetic (ADME) studies are primarily controlled experiments, where defined exposures usually occur by intravenous, oral, inhalation, or dermal routes, and the concentration of particles, a chemical, or its metabolites in blood or serum, other body tissues, or excreta are then measured.
• These data are used to estimate the amount absorbed (A), distributed (D), metabolized (M), and/or excreted (E).
• ADME data can also be collected from human subjects who have had environmental or workplace exposures that are not quantified or fully defined.
• ADME data, especially metabolism and tissue partition coefficient information, can be generated using in vitro model systems. Although in vitro data may not be as definitive as in vivo data, these studies should also be tracked as ADME. For large evidence bases it may be appropriate to separately track the in vitro ADME studies.
*Studies describing environmental fate and transport or metabolism in bacteria or model systems that are not applicable to humans or animals should not be tagged.
Typical assessment use: ADME studies are inventoried and prioritized for possible inclusion in an ADME synthesis section on the chemical's PK properties and for conducting quantitative adjustments or extrapolations (e.g., animal-to-human). Specialized expertise in PK is necessary for inventory and prioritization. Standard operating procedures for PBPK/PK model evaluation and the identification, organization, and evaluation of ADME studies are outlined in An Umbrella Quality Assurance Project Plan (QAPP) for PBPK models (U.S. EPA, 2018b).

Supplemental evidence potentially informative to assessment analyses

Category (tag): Mechanistic (cancer); Mechanistic (noncancer)
Description: Studies that do not meet PECO criteria but report measurements that inform the biological or chemical events associated with phenotypic effects related to a health outcome. Experimental design may include in vitro, in vivo (by various routes of exposure; includes all transgenic models), ex vivo, and in silico studies in mammalian and nonmammalian model systems. Studies using New Approach Methodologies (NAMs; e.g., in vitro high throughput testing strategies, read-across applications) are also categorized here. Studies where the chemical is used as a laboratory reagent (e.g., as a chemical probe used to measure antibody response) generally should not be tagged. Mechanistic evidence can also help identify factors contributing to susceptibility; these studies should also be tagged "susceptible populations."
[Notes: During screening, especially at the title and abstract (TIAB) level, it may not be readily apparent for studies that meet P, E, and C criteria if the endpoint(s) in a study are best classified as phenotypic or mechanistic with respect to the O criteria. In these cases, the study should be screened as "unclear" during TIAB screening, and a determination made based on full-text review (in consultation with a content expert as needed). Full-text retrieval is performed for studies of transgenic model systems that meet E and C criteria to determine if they include phenotypic information in wildtype animals that meets P and O criteria but is not reported in the abstract.]
Typical assessment use: Prioritized studies of mechanistic endpoints are described in the mechanistic synthesis sections; subsets of the most informative studies may become part of the units of analysis.
Mechanistic evidence can provide support for the relevance of animal effects to humans and biological plausibility for evidence integration judgments (including MOA analyses, e.g., using the MOA framework in the US EPA Cancer Guidelines (2005a)).

Category (tag): Non-PECO animal model
Description: Studies reporting outcomes in animal models that meet the outcome criteria but do not meet the population criteria in the PECO. Depending on the endpoints measured in these studies, they can also provide mechanistic information (in these cases studies should also be tagged "mechanistic endpoints"). This categorization generally does not apply to studies that use species with limited human health relevance (e.g., ecotoxicity-focused studies are typically excluded).

Category (tag): Non-PECO route of exposure
Description: Epidemiological or animal studies that use a non-PECO route of exposure, e.g., injection studies or dermal studies if the dermal route is not part of the exposure criteria. This categorization generally does not apply to epidemiological studies where the exposure route is unclear; such studies are considered to meet PECO criteria if the relevant route(s) of exposure are plausible, with exposure being more thoroughly evaluated at later steps.

Typical assessment use (non-PECO animal model and non-PECO route of exposure): Studies of non-PECO animals, exposures, or durations can be summarized to inform evaluations of consistency (e.g., across species or routes or durations), coherence, or adversity; subsets of the most informative studies may be included in the unit of analysis. These studies may also be used to inform evidence integration judgments of biological plausibility and/or MOA analyses and thus may be summarized as part of the mechanistic evidence synthesis.

Category (tag): Susceptible population
Description: Studies that help to identify potentially susceptible subgroups, including studies on the influence of intrinsic factors such as sex, lifestage, or genotype on toxicity, as well as some other factors (e.g., health status). These are often co-tagged with other supplemental material categories, such as mechanistic or ADME. Studies meeting PECO criteria that also address susceptibility should be co-tagged as supplemental.
*Susceptibility based on most extrinsic factors, such as increased risk for exposure due to residential proximity to exposure sources, is not considered an indicator of susceptible populations for the purposes of IRIS assessments.
Typical assessment use: Provides information on factors that might predispose sensitive populations or lifestages to a higher risk of adverse health effects following exposure to the chemical. This information is summarized during evidence integration for each health effect and is considered during dose-response, where it can directly impact modeling decisions.

Background information potentially useful to problem formulation and protocol development (These studies fall outside the scope of IRIS assessment analyses)

Category (tag): Human exposure and biomonitoring (no health outcome)
Description: Information regarding exposure monitoring methods and reporting that are unrelated to health outcomes, but which provide information on the following: methods for measuring human exposure, biomonitoring (e.g., detection of chemical in blood, urine, hair), defining exposure sources, or modeled estimates of exposure (e.g., in occupational settings). Studies that compare exposure levels to a reference value, risk threshold, or assessment points of departure are also included in this category. Studies related to environmental fate and transport are typically tagged as background materials unless otherwise described in the assessment-specific protocol.
*Assessment teams may want to subtag studies that describe or predict exposure levels versus those that present exposure assessment methods.
Typical assessment use: This information may be useful for developing exposure criteria for study evaluation or refining problem formulation decisions. Notably, providing an assessment of typical human exposures (e.g., sources, levels) falls outside the scope of an IRIS assessment.

Category (tag): Mixture study
Description: Mixture studies use methods that do not allow investigation of the health effects of exposure to the chemical of interest by itself (e.g., animal studies that lack exposure to chemical of interest alone or epidemiology studies that do not evaluate associations of the chemical of interest with relevant health outcome(s)).
*Methods used to assess investigation of the exposure by itself may not be clear from the abstract, in particular for epidemiology studies. When unclear, the study is advanced to full-text review to determine eligibility.
Typical assessment use: Mixture studies are tracked to help inform cumulative risk analyses, which may provide useful context for risk assessment but fall outside the scope of an IRIS assessment.

Category (tag): Case reports or case series
Description: Human studies that present an investigation of a single exposed individual or group of <3 subjects that describe health outcomes after exposure but lack a comparison group (i.e., do not meet the "C" in the PECO) and typically do not include reliable exposure estimates.
Typical assessment use: Tracking case studies can facilitate awareness of potential human health issues missed by other types of studies during problem formulation.

Reference materials

Category (tag): Records with no original data
Description: Records that do not contain original data, such as other agency assessments, informative scientific literature reviews, editorials, or commentaries.
Typical assessment use: Studies that are tracked for potential use in identifying missing studies, background information, or current scientific opinions (e.g., hypothesized MOAs).

Category (tag): Posters or conference abstracts
Description: Records that do not contain sufficient documentation to support study evaluation and data extraction.

4.3. LITERATURE SEARCH STRATEGIES

4.3.1. Database Search Term Development

Literature search strategies are developed using key terms and words related to the PECO criteria. Development of the search strategy for each topic area is conducted by identifying relevant search terms through the following approaches: (1) reviewing PubMed's Medical Subject Headings (MeSH) for relevant and appropriate terms, (2) extracting key terminology from relevant reviews and a set of previously identified primary data studies known to be relevant to the topic ("test set"), and (3) reviewing search strategies presented in other reviews. Relevant subject headings and text words are crafted into a search strategy designed to maximize the sensitivity and specificity of the search results. The search strategy is run, and the results are assessed to ensure that all previously identified relevant primary studies are retrieved in the search. The database search terms focused only on the chemical name (and synonyms or trade names) with no additional limits. Because each database has its own search architecture, the resulting search strategy is tailored to account for each database's unique search functionality.

4.3.2. Database Searches

Searches are not restricted by publication date and no language restrictions are applied. The detailed search strategies are presented in Appendix A.
Literature searches are conducted using EPA's Health and Environmental Research Online (HERO) database.³ The following databases are searched as described in the IRIS Handbook (U.S. EPA, 2022):

• PubMed (National Library of Medicine)
• Web of Science (Thomson Reuters)
• Toxline (National Library of Medicine) - Searched through December 2019, after which Toxline content was moved to PubMed (National Library of Medicine) products.
• Toxic Substances Control Act Test Submissions (TSCATS) database

The literature searches are updated throughout the assessment's development and review process to identify newly published literature. During this period, studies are screened according to both the problem formulation and assessment PECO criteria. Thus, the literature inventory is updated during the process of developing the draft assessment. The last full literature search update is conducted several months prior to the planned release of the draft document for public comment. Studies identified after peer review begins are only considered for inclusion if they are directly relevant to the assessment PECO criteria and are expected to fundamentally alter the draft assessment conclusions.

³ Health and Environmental Research Online: https://hero.epa.gov/hero/.

4.3.3. Searching Other Sources

The literature search strategy described above was designed to be broad but, as with any search strategy, studies can be missed [e.g., cases where the specific chemical is not mentioned in title, abstract, or keyword content; ability to capture "gray" literature (studies not reported in the peer-reviewed literature) that is not indexed in the databases listed above].
Thus, in addition to the database searches, the sources below are used to identify studies that could have been missed based on the database search. Searching of these resources occurs during preparation of the initial literature inventory when assembling the SEM. After preparation of the initial literature inventory, references can be identified during public comment periods, by technical consultants, and during peer review. Records that appear to meet the problem formulation PECO criteria are uploaded into screening software, annotated with respect to source of the record, and screened using the methods described in Section 4.4. Appendix B describes the specific methods and results for searching the sources below. Searching of these sources is summarized to include the source type or name, the search string (when applicable), number of results present within the resource, and the URL (uniform resource locator, when available and applicable). The list of other sources consulted includes:

• Manual review (at the title level) of the reference lists in studies screened as meeting problem formulation PECO after full-text review.
• Manual review (at the title level) of the reference lists from other publicly available final or draft assessments from non-EPA agencies (e.g., ATSDR [Agency for Toxic Substances and Disease Registry] Toxicological Profile) or published journal reviews specifically focused on human health. Reviews can be identified from the database search or from the resources listed in Appendix B.
• European Chemicals Agency (ECHA) registration dossiers to identify data submitted by registrants (http://echa.europa.eu/information-on-chemicals/information-from-existing-substances-regulation).
• EPA ChemView database (U.S. EPA, 2019a) to identify unpublished studies, information submitted to EPA under Toxic Substances Control Act (TSCA) Section 4 (chemical testing results), Section 8(d) (health and safety studies), Section 8(e) (substantial risk of injury to health or the environment notices), and FYI (For Your Information, voluntary documents). Other databases accessible via ChemView include EPA's High Production Volume (HPV) Challenge database and the Toxic Release Inventory database.
• The National Toxicology Program (NTP) database of study results and research projects (https://ntp.niehs.nih.gov/results/index.html).
• The Organization for Economic Cooperation and Development (OECD) Screening Information DataSet (SIDS) High Production Volume Chemicals (https://www.echemportal.org/echemportal/substance-search).
• The EPA CompTox (Computational Toxicology Program) Chemical Dashboard (U.S. EPA, 2019b) to retrieve a summary of any ToxCast or Tox21 high throughput screening information. These data can be used to generate mechanistic insight, predict outcome using appropriate models, and potentially inform dose-response modeling. Their importance for outcome prediction and dose-response modeling depends on the context, size, and quality of retrieved results and the lack of availability of other data typically used for these purposes.
• Review of the list of references in the ECOTOX database for the chemical(s) of interest.
• References identified during public comment periods, by technical consultants, and during peer review.

4.3.4. Non-Peer-Reviewed Data

IRIS assessments rely mainly on publicly accessible, peer-reviewed studies.
However, it is possible that unpublished data directly relevant to the PECO may be identified during assessment development. In these instances, the EPA will try to get permission to make the data publicly available (e.g., in HERO); data that cannot be made publicly available are not used in IRIS assessments. In addition, on rare occasions where unpublished data would be used to support key assessment decisions (e.g., deriving a toxicity value), EPA may obtain external peer review if the owners of the data are willing to have the study details and results made publicly accessible, or if an unpublished report is publicly accessible (or submitted to EPA in a nonconfidential manner) (U.S. EPA, 2015). This independent, contractor-driven peer review would include an evaluation of the study similar to that for peer review of a journal publication. The contractor would identify and typically select three scientists knowledgeable in scientific disciplines relevant to the topic as potential peer reviewers. Persons invited to serve as peer reviewers would be screened for conflict of interest. In most instances, the peer review would be conducted by letter review. The study and its related information, if used in the IRIS assessment, would become publicly available. In the assessment, EPA would acknowledge that the document underwent external peer review managed by the EPA, and the names of the peer reviewers would be identified. In certain cases, IRIS will assess the utility of a data analysis of accessible raw data (with descriptive methods) that has undergone rigorous quality assurance/quality control review (e.g., ToxCast/Tox21 data, results of NTP studies not yet published) but that has not yet undergone external peer review. Unpublished data from personal author communication can supplement a peer-reviewed study as long as the information is made publicly available.
If such ancillary information is acquired, it is documented in the Health Assessment Workspace Collaborative (HAWC) or HERO project page (depending on the nature of the information received).

4.4. LITERATURE SCREENING

Records identified from the literature searches are housed in HERO. After deduplication in HERO, records are imported into SWIFT Review software (Howard et al., 2016) to identify those references most likely to be applicable to a human health assessment. Briefly, SWIFT Review has preset literature search strategies ("filters") developed and applied by information specialists to separate studies more likely to be useful for identifying human health content from those that likely are not (e.g., analytical methods). The filters function like a typical search strategy in which studies are tagged as belonging to a certain filter if the terms in the filter literature search strategy appear in the title, abstract, keyword, or medical subject headings (MeSH) fields. The applied SWIFT Review filters focused on lines of evidence: human, animal models for human health, and in vitro studies. The details of the search strategies that underlie the filters are available online (https://www.sciome.com/swift-review/searchstrategies/). Studies not retrieved using these filters are not considered further. Studies that included one or more of the search terms in the title, abstract, keyword, or MeSH fields are exported as a RIS (Research Information System) file for title and abstract (TIAB) and full-text screening in DistillerSR (Evidence Partners; https://distillercer.com/products/distillersr-systematic-review-software/), as described below.
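The filter behavior described above (a record is tagged when any term from a filter's strategy appears in its title, abstract, keyword, or MeSH fields, and untagged records are dropped) can be illustrated with a minimal sketch. This is not the SWIFT Review implementation: the `Record` class, the `FILTERS` term lists, and the function names are hypothetical, and the real strategies are far longer than the few terms shown.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical bibliographic record with the four searched fields."""
    title: str
    abstract: str = ""
    keywords: list = field(default_factory=list)
    mesh: list = field(default_factory=list)


# Hypothetical, heavily shortened term lists (assumption, for illustration only).
FILTERS = {
    "human": ["human", "humans", "worker", "workers", "occupational", "cohort"],
    "animal": ["rat", "rats", "mouse", "mice"],
    "in_vitro": ["in vitro", "cell line", "cell lines"],
}


def searchable_text(record):
    """Join title, abstract, keyword, and MeSH content into one lowercase string."""
    parts = [record.title, record.abstract, *record.keywords, *record.mesh]
    return " ".join(parts).lower()


def tag_record(record, filters=FILTERS):
    """Return the names of all filters with at least one whole-word term match."""
    text = searchable_text(record)
    tags = set()
    for name, terms in filters.items():
        if any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms):
            tags.add(name)
    return tags


def prioritize(records, filters=FILTERS):
    """Keep only records retrieved by at least one evidence-stream filter."""
    return [r for r in records if tag_record(r, filters)]


records = [
    Record("Hearing loss in rats after inhalation of ethylbenzene"),
    Record("Gas chromatography method for ethylbenzene in water"),
    Record("Ethylbenzene exposure in a petrochemical cohort",
           mesh=["Occupational Exposure", "Humans"]),
]
kept = prioritize(records)
```

Whole-word matching is a design choice of this sketch: without it, a term such as "rat" would also match "chromatography" and pull an analytical-methods record into the animal evidence stream.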
The impact of application of the SWIFT evidence stream filters on the number of studies for TIAB screening is presented in Figure 4-1.

4.4.1. Title and Abstract Screening

The studies prioritized by SWIFT Review are imported into DistillerSR software for TIAB screening by two independent reviewers. Reviewers complete a structured form asking whether a study meets PECO criteria or contains potentially relevant supplemental material. Studies considered relevant or "unclear" based on meeting all PECO criteria at the TIAB level are considered for inclusion and advanced to full-text screening. Any screening conflicts are resolved by discussion between the primary screeners with consultation by a third reviewer, if needed. For citations with no abstract, articles are initially screened based on the following: title relevance (title should indicate clear relevance), and page length (articles two pages in length or less are assumed to be conference reports, editorials, or letters). Eligibility status of non-English studies is assessed using the same approach with online translation tools or engagement with a native speaker.

4.4.2. Full-Text Screening

Full-text references are sought through EPA's HERO database for studies screened as meeting problem formulation PECO criteria, potentially relevant supplemental material, or "unclear" based on TIAB screening. Full-text screening occurs in DistillerSR. Full-text copies of these citations are retrieved, stored in the HERO database, and independently assessed by two screeners using a structured form in DistillerSR to confirm eligibility. Screening conflicts are resolved by discussion among the primary screeners with consultation by a third reviewer or technical advisor (as needed to resolve any remaining disagreements). Rationales for excluding citations are documented, e.g., study did not meet problem formulation PECO, full text not available. Approaches for language translation include online translation tools or engagement of a native speaker. Fee-based translation services for non-English studies are typically reserved for studies that are anticipated as being useful for toxicity value derivation. Conflicts between screeners in applying the supplemental material tags are resolved similarly, erring on the side of over-tagging. Note that more granular sub-tagging of supplemental material occurs during preparation of the literature inventory as described in Section 4.5.2.

4.4.3. Multiple Citations with the Same Data

When there are multiple citations using the same or overlapping data, all citations are included, with one selected for use as the primary citation; the others are considered as secondary publications with annotation in HAWC and HERO indicating their relationship to the primary citation during data extraction. For epidemiology studies, the primary citation is generally the one with the longest follow-up, the largest number of cases, or the most recent publication date. For animal studies, the primary citation is typically the one with the longest duration of exposure, the largest sample size, or with the outcome(s) most informative to the problem formulation PECO. For both epidemiology and animal studies, the assessments include relevant data from all citations of the study, although if the same data are reported in more than one citation, the data are only extracted once (see Section 7).
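As an illustration of the primary citation rule for epidemiology studies, the sketch below ranks overlapping publications by follow-up, then number of cases, then publication year. The protocol lists these three considerations without ranking them, so the strict ordering used here is an assumption, as are the `EpiCitation` class, its field names, and the example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpiCitation:
    """Hypothetical summary of one publication from an epidemiology study."""
    ref_id: int            # illustrative identifier, not a real HERO ID
    followup_years: float
    n_cases: int
    pub_year: int


def split_primary_secondary(citations):
    """Pick one primary citation and list the rest as secondary.

    Assumed tie-break order: longest follow-up, then most cases,
    then most recent publication year.
    """
    primary = max(
        citations,
        key=lambda c: (c.followup_years, c.n_cases, c.pub_year),
    )
    secondary = [c for c in citations if c is not primary]
    return primary, secondary


# Made-up example: three publications reporting on the same cohort.
overlapping = [
    EpiCitation(ref_id=1, followup_years=10.0, n_cases=120, pub_year=2005),
    EpiCitation(ref_id=2, followup_years=15.0, n_cases=150, pub_year=2012),
    EpiCitation(ref_id=3, followup_years=15.0, n_cases=150, pub_year=2015),
]
primary, secondary = split_primary_secondary(overlapping)
```

In practice the choice is a reviewer judgment; a sort key like this would at most pre-rank candidates, and the secondary citations remain included and annotated rather than discarded.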
For corrections, retractions, and other companion documents to the included citations, a similar approach to annotation is taken and the most recently published data are incorporated into the assessments.

4.4.4. Literature Flow Diagrams

The results of the screening process are posted on the project page for the assessment in the HERO database (https://heronet.epa.gov/heronet/index.cfm/project/page/project_id/59). Results for SEM screening against the problem formulation PECO are also summarized in a literature flow diagram (see Figure 4-1) and interactive HAWC literature tag trees (see Figure 4-4).

Ethylbenzene Literature Searches (up to January 2022)
  PubMed (n = 3242)
  WOS (n = 1742)
  Toxline [a] (n = 2676)
  TSCATS [a] (n = 245)
  Other Strategies [b] (n = 80)

7273 Records from the Initial April 2019 Literature Search
  SWIFT Review software applied to identify potentially relevant records based on evidence stream and health outcome tags (n = 1266)

712 Records from Literature Search Updates
  April 2020 (n = 211)
  November 2020 (n = 209)
  January 2022 (n = 292)

Title and Abstract in DistillerSR (n = 1716 after duplicate removal)
  Excluded (n = 684): Not relevant to PECO and not considered supplemental

Full-Text (n = 1032)
  Excluded (n = 212)
    Not relevant to PECO (n = 167)
    Unable to obtain full text (n = 45)

Studies Meeting PECO (n = 112)
  • Human health effect records (n = 52)
  • Animal health effect records (n = 60)

Tagged as Supplemental Material (n = 708 [c])
  PBPK (n = 32)
  ADME (n = 82)
  Mechanistic (cancer) (n = 29)
  Mechanistic (noncancer) (n = 84)
  Non-PECO animal model (n = 10)
  Non-PECO route of exposure (n = 38)
  Susceptible population (n = 10)
  Human exposure and biomonitoring (n = 166)
  Mixture studies (n = 113)
  Case report or case series (n = 6)
  Records with no original data (n = 201)
  Posters or conference abstracts (n = 23)
  Other supplemental (n = 29)

Figure 4-1. Literature flow diagram for ethylbenzene.
[a] Toxline and TSCATS only included in Apr 2019 search.
[b] Other strategies include the following sources of gray literature (ToxVal, CEBS, ECHA, ChemView, and OECD SIDS); Jan 2022 = 3; Nov 2020 = 77.
[c] Indicates the total number of unique citations that were identified; because some citations are given multiple tags, the sum of the individual supplemental material tags is greater than the total number of unique citations.

4.5. LITERATURE INVENTORY

During full-text-level screening, citations that meet problem formulation PECO criteria are categorized by evidence type (human or animal) or category of supplemental information (e.g., mechanistic, PBPK, ADME). Next, study design details for citations that meet problem formulation PECO criteria are summarized as described in Section 4.5.1. A more granular tagging of supplemental material may be conducted as described in Section 4.5.2. The results of this categorization and tagging are referred to as the literature inventory and are the key analysis output of the SEM.

4.5.1. Studies That Meet Problem Formulation PECO Criteria

Human and animal studies that met problem formulation PECO criteria after full-text review are briefly summarized using DistillerSR Hierarchical Data Extraction (HDE) forms to create literature inventories, which are used to display the extent and nature of the available evidence. Data extraction details for the literature inventory are presented in Section 7.
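The record counts reported in Figure 4-1 can be cross-checked with simple arithmetic. The sketch below transcribes the counts from the figure and tests the relationships they appear to satisfy; the variable names and the choice of checks are illustrative assumptions, not part of the protocol.

```python
# Counts transcribed from Figure 4-1 (variable names are illustrative).
sources = {"PubMed": 3242, "WOS": 1742, "Toxline": 2676,
           "TSCATS": 245, "Other strategies": 80}
initial_search = 7273
updates = {"April 2020": 211, "November 2020": 209, "January 2022": 292}
swift_prioritized = 1266
tiab_screened = 1716            # after duplicate removal
tiab_excluded = 684
full_text = 1032
full_text_excluded = {"Not relevant to PECO": 167,
                      "Unable to obtain full text": 45}
meeting_peco = {"Human": 52, "Animal": 60}
supplemental_unique = 708
supplemental_tags = {
    "PBPK": 32, "ADME": 82, "Mechanistic (cancer)": 29,
    "Mechanistic (noncancer)": 84, "Non-PECO animal model": 10,
    "Non-PECO route of exposure": 38, "Susceptible population": 10,
    "Human exposure and biomonitoring": 166, "Mixture studies": 113,
    "Case report or case series": 6, "Records with no original data": 201,
    "Posters or conference abstracts": 23, "Other supplemental": 29,
}

checks = {
    # Records by source equal the initial search plus the three updates.
    "sources": sum(sources.values()) == initial_search + sum(updates.values()),
    # TIAB records are at most the prioritized records plus the updates.
    "tiab_bound": tiab_screened <= swift_prioritized + sum(updates.values()),
    # Every TIAB record is either excluded or advanced to full text.
    "tiab_split": tiab_screened == tiab_excluded + full_text,
    # Full-text records split into PECO, supplemental, and excluded.
    "full_text_split": full_text == (sum(meeting_peco.values())
                                     + supplemental_unique
                                     + sum(full_text_excluded.values())),
    # Tags can repeat across citations, so their sum is at least 708.
    "tag_sum": sum(supplemental_tags.values()) >= supplemental_unique,
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'CHECK'}")
```

The individual supplemental tags sum to 823, which exceeds the 708 unique citations because a single citation can carry several tags, as footnote c of the figure notes.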
These study summaries are exported from DistillerSR in Excel format and imported into Tableau software (https://www.tableau.com/) to create interactive literature inventory visualizations. The literature inventories are used to inform the assessment PECO criteria and evaluation plan. More detail on the process of summarizing studies is presented in Section 7 (Data Extraction of Study Methods and Results).

4.5.2. Organizational Approach for Supplemental Material

The results of the supplemental material tagging conducted in DistillerSR are imported into the literature review module in HAWC, where more granular sub-tagging within a type of supplemental material content category may be conducted if determined to be useful to support assessment conclusions. A single study can have multiple tags. The degree of sub-tagging depends on the extent of content for a given type of supplemental material and the needs of the assessment with respect to developing human health hazard conclusions and deriving toxicity values. Typically, more granular tagging is most useful for supplemental content classified as mechanistic, ADME, PK/PBPK models, routes of administration not meeting the PECO, and nonmammalian model studies. Tagging judgments in HAWC are made by one assessment team member and confirmed by another member of the assessment team during preparation of the draft assessment. The overall approach for supplemental material content was previously described in Section 4.2.

4.6. SUMMARY-LEVEL LITERATURE INVENTORIES

During TIAB or full-text-level screening, citations tagged based on problem formulation PECO eligibility were further categorized based on features such as evidence type (i.e., human, animal), health outcome(s), and/or endpoint measure(s) included in the citation. Literature inventories for PECO-relevant citations were created to develop summary-level, sortable lists that include some basic study design information (e.g., study population, exposure information such as doses administered or biomarkers analyzed, age/life stage [4] of exposure, endpoints examined). These literature inventories facilitate subsequent review of individual studies or sets of studies by topic-specific experts. The summary results are presented in Figures 4-2 and 4-3 for human and animal studies, respectively. An interactive version of these figures, including additional study design details and a high-level summary of the results, is available here.

[4] Age/life stage of chemical exposure is considered according to EPA's Guidance on Selecting Age Groups for Monitoring and Assessing Childhood Exposures to Environmental Contaminants and EPA's A Framework for Assessing Health Risk of Environmental Exposures to Children.

[Figure 4-2 is an interactive heatmap titled "Human Studies Examining Exposure to Ethylbenzene by Study Design and Health System." Rows are health systems; columns are study designs (case-control, cohort, controlled trial, cross-sectional). The grand totals by study design are 12, 14, 6, and 21, with 52 distinct references overall. Column totals, row totals, and grand totals indicate total numbers of distinct references.]

Figure 4-2. Inventory heatmap of PECO-relevant ethylbenzene human studies by study design and health system. An interactive version, which includes a list of citations with additional study details and summary of the results, is available here.
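The note that heatmap totals count distinct references explains why a row or column total can be smaller than the sum of its cells: one reference may report endpoints in several health systems. A minimal sketch of that counting rule follows; the rows are hypothetical placeholders, not studies from the inventory, and the function is an illustration rather than the Tableau logic used for the figures.

```python
from collections import defaultdict

# Hypothetical inventory rows: (reference, health system, study design).
rows = [
    ("Study A", "Respiratory", "Cohort"),
    ("Study A", "Immune", "Cohort"),
    ("Study B", "Respiratory", "Cross-Sectional"),
    ("Study C", "Cancer", "Case-Control"),
    ("Study C", "Cancer", "Case-Control"),  # second endpoint, same reference
]


def distinct_reference_counts(rows):
    """Count distinct references per cell, per row, per column, and overall."""
    cell_refs = defaultdict(set)
    row_refs = defaultdict(set)
    col_refs = defaultdict(set)
    all_refs = set()
    for ref, system, design in rows:
        cell_refs[(system, design)].add(ref)
        row_refs[system].add(ref)
        col_refs[design].add(ref)
        all_refs.add(ref)

    def sizes(groups):
        return {key: len(refs) for key, refs in groups.items()}

    return sizes(cell_refs), sizes(row_refs), sizes(col_refs), len(all_refs)


cells, row_totals, col_totals, grand_total = distinct_reference_counts(rows)
```

Here the "Cohort" column contains two cells with one reference each, but its column total is 1 because both cells come from the same reference.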
[Figure 4-3 is an interactive heatmap titled "Toxicological Studies Examining Exposure to Ethylbenzene by Study Design and Health System." Rows are health systems; columns are study designs (acute, short-term, subchronic, chronic, developmental, reproductive). The grand totals by study design are 16, 23, 9, 10, 10, and 4, with 60 distinct references overall. Column totals, row totals, and grand totals indicate total numbers of distinct references.]

Figure 4-3. Inventory heatmap of PECO-relevant ethylbenzene animal studies by study design and health system. An interactive version, which includes a list of citations with additional study details and summary of the results, is available here.

HAWC literature trees are created for citations that are tagged as "potentially relevant supplemental material" during screening, including mechanistic studies (e.g., in vitro or in silico models), ADME studies, and studies on endpoints or routes of exposure that do not meet the specific PECO criteria but that may still be relevant to the research question(s). Here, the objective is to create an inventory of citations that can be tracked and further summarized as needed, for example by model system, key characteristic [e.g., of carcinogens; Smith et al. (2016)], mechanistic endpoint, or key event, to support analyses of critical mechanistic questions that arise at various stages of the systematic review (see Section 9.2 for a description of the process for determining the specific questions and pertinent mechanistic studies to be analyzed). ADME data and related information can be critical to the next steps of prioritizing or evaluating individual PECO-specific studies and are reviewed by subject-matter experts early in the assessment process.
A literature tree of the supplemental material identified from the literature searches (as of 1/2022) is presented in Figure 4-4.

[Figure 4-4 is a HAWC tag tree with the root "Supplemental Studies" and the following branches: PBPK; ADME/PK; Mechanistic (cancer); Mechanistic (non-cancer); Non-PECO animal model; Non-PECO route of exposure; Susceptible population; Human exposure and biomonitoring; Mixture study; Case reports or case series; Record with no original data; Posters or conference abstracts; Other supplemental.]

Figure 4-4. Literature tag tree of the supplemental studies identified from the ethylbenzene literature searches. An interactive version, which includes a list of citations with additional study details and summary of the results, is available here.

A single active high throughput screening assay was reported for ethylbenzene on the CompTox Chemicals Dashboard (U.S. EPA, 2019b). The ToxCast summary plot is shown in Figure 4-5 and an interactive version can be found online (https://comptox.epa.gov/dashboard/chemical/invitrodb/DTXSID3020596).

[Figure 4-5 is the dashboard's "Bioactivity - TOXCAST Summary" plot. The details shown for the active assay are:
  AC50 (µM): 51.144
  Assay Endpoint Name: TOX21_RXR_BLA_Agonist_ratio
  Gene Symbol: RXRA
  Organism: human
  Tissue: kidney
  Assay Format Type: cell-based
  Biological Process Target: regulation of transcription factor activity
  Analysis Direction: positive
  Intended Target Family: dna binding]

Figure 4-5. High throughput screening bioactivity data from the CompTox Chemicals Dashboard. An interactive version, which includes a list of citations with additional study details and summary of the results, is available here.

5. REFINE PROBLEM FORMULATION AND SPECIFY ASSESSMENT APPROACH

5.1. ASSESSMENT PECO CRITERIA

The primary purpose of this step is to provide further specification to the assessment methods based on characterization of the extent and nature of the evidence identified from the literature inventory. This includes refinements to PECO criteria, defining the unit(s) of analysis for health endpoints/outcomes during evidence synthesis, and presenting analysis approaches for mechanistic, ADME, or other types of supplemental material content. A unit of analysis is an outcome or group of related outcomes within a health effect category that are considered together during evidence synthesis (see Section 8). In some assessments, the units of analysis may include predefined categories of mechanistic evidence (e.g., biomarkers or precursors relating to other outcomes within the unit of analysis, evidence that provides support for grouping together biologically linked endpoints into a unit of analysis).

Table 5-1. Assessment PECO criteria for the ethylbenzene assessment

PECO element | Evidence
Populations | Human: Any population and lifestage (occupational or general population, including children and other sensitive populations). Animal: Nonhuman mammalian animal species (whole organism) of any lifestage (including preconception, in utero, lactation, peripubertal, and adult stages). Studies of transgenic animals are tracked as mechanistic studies under "potentially relevant supplemental material."
Exposures | Human: Exposure to ethylbenzene (CASRN 100-41-4), including occupational exposures, alone or as a mixture by any route. Measures of metabolites used to estimate exposures to ethylbenzene. Animal: Exposure to ethylbenzene (CASRN 100-41-4) alone by the oral or inhalation route. Studies employing chronic exposures will be considered the most informative. Studies involving exposures to mixtures will be included only if they include a group with exposure to ethylbenzene alone.
Comparators | Human: Any comparison or reference group exposed to lower levels of ethylbenzene, no exposure to ethylbenzene, or to ethylbenzene for shorter periods of time. Animal: Quantitative exposure vs. lower or no exposure with concurrent vehicle control group.
Outcomes | Health outcomes: Cancer, cardiovascular, developmental, general toxicity (systemic/whole body), hematologic, hepatic, immune/lymphatic, metabolic, nervous system/auditory, renal/urinary, reproductive, respiratory system, thyroid (endocrine). In general, endpoints related to clinical diagnostic criteria, disease outcomes, histopathological examination, or other apical/phenotypic outcomes will be prioritized for evidence synthesis over outcomes such as biochemical measures.

Underlined text shows modifications in the assessment PECO criteria compared with the problem formulation PECO criteria.
CASRN = Chemical Abstract Service registry number.

5.1.1. Other Exclusions Based on Full-Text Content

In addition to failure to meet PECO criteria (described above), epidemiological and toxicological studies may be excluded at the full-text level due to critical reporting limitations. Reporting limitations can be identified during full-text screening but are more commonly identified during subsequent phases of the assessment (e.g., literature inventory, study evaluation). Regardless of when the limitation is identified, exclusions based on full-text content are documented at the level of full-text exclusions in literature flow diagrams with a rationale of "critical reporting limitation."

A similar approach is taken for in vitro studies that are prioritized for focused analysis during assessment development (i.e., the critical reporting deficiency may preclude them from consideration).
Critical reporting information for different study types is summarized below. For each piece of information, if the information can be inferred (when not directly stated) for an exposure/endpoint combination, the study should be included.

Epidemiology studies
• Sample size
• Exposure characterization and/or measurement method
• Outcome ascertainment method
• Study design

Animal studies
• Species
• Test article name
• Levels and duration of exposure
• Route of exposure
• Quantitative or qualitative (e.g., photomicrographs; author-reported lack of an effect on the outcome) results for at least one endpoint of interest

In vitro studies prioritized for focused analysis
• Cell/tissue type(s) or test system
• Test article name
• Concentration and duration of treatment
• Quantitative or qualitative results for at least one endpoint of interest

5.2. UNITS OF ANALYSES FOR DEVELOPING EVIDENCE SYNTHESIS AND INTEGRATION JUDGMENTS FOR HEALTH EFFECT CATEGORIES

The planned units of analysis based on outcomes identified in the assessment PECO are summarized in Table 5-2. General considerations for defining the units of analysis are presented in the IRIS Handbook (U.S. EPA, 2022). Each unit of analysis is initially synthesized and judged separately within an evidence stream (see Section 8.1). Depending on the specific health endpoint or outcome, PK data, mechanistic information, and other supporting evidence (e.g., from studies of non-PECO routes of exposure) may be included in a unit of analysis. Evidence integration judgments focus on the stronger within-evidence-stream synthesis conclusions when multiple units of analysis are synthesized.
The evidence synthesis judgments are used alongside other key considerations (i.e., human relevance of findings in animal evidence, coherence across evidence streams, information on susceptible populations or lifestages, and other critical inferences that draw on mechanistic evidence) to draw an overall evidence integration judgment for each health effect category or more granular health outcome grouping (see Section 8.2).

Table 5-2. Human and animal endpoint grouping categories. Units of analysis for evidence synthesis that inform evidence integration for the ethylbenzene assessment are listed by relevant human health effect category [a] (each bullet represents a unit of analysis).

Cancer
  Human evidence
  • Lifetime cancer risk
  • Tumors and precancerous lesions
  Animal evidence
  • Tumors and precancerous lesions

Cardiovascular
  Human evidence
  • Heart disease
  • Blood pressure, vascular dilation, and pulse
  Animal evidence
  • Heart weight
  • Histopathology

Developmental
  Human evidence
  • Birth defects
  • Birth weight
  • Preeclampsia
  • Age at use of academic support services
  Animal evidence
  • Offspring mortality/survival
  • Body weight, body weight change
  • Developmental milestones (e.g., eye opening, incisor eruption, pinna detachment)
  • Skeletal and visceral malformations/variations

Hematologic
  Human evidence
  • Red blood cells, hematocrit or hemoglobin, cell volume
  • Blood platelets, reticulocytes
  Animal evidence
  • Red blood cells, hematocrit or hemoglobin, cell volume
  • Blood platelets, reticulocytes
  • Blood biochemical measures (e.g., sodium, calcium)

Hepatic
  Human evidence
  • Serum or liver enzymes (e.g., ALT, AST)
  Animal evidence
  • Liver weight and histopathology
  • Serum or liver enzymes (e.g., ALT, AST)
  • Liver tissue biochemical markers/biochemistry

Immune/lymphatic
  Human evidence
  • Asthma incidence/severity
  • Respiratory infection
  • Immune cell counts
  • Inflammation (c-reactive protein)
  Animal evidence
  • Immune organ weight/histopathology
  • Immune cell counts

Metabolic
  Human evidence
  • Serum glucose, insulin; A1C
  Animal evidence
  • Serum glucose

Nervous system/auditory
  Human evidence
  • Neurodevelopmental disorders (e.g., autism, learning disabilities)
  • Neurobehavioral function (e.g., reaction time, emotional changes)
  • Headache, fatigue, sensory irritation
  • Hearing loss, tinnitus
  Animal evidence
  • Brain weight/histopathology
  • Functional observational battery, including motor activity and reflex responses
  • Learning and memory
  • Seizures/tremors
  • Neurotransmitters
  • Histopathology (hair cell loss)
  • Auditory function (e.g., MER, auditory threshold)

Renal/urinary
  Human evidence
  • No studies
  Animal evidence
  • Organ weight/histopathology
  • Blood and urine biomarkers (e.g., BUN, CREA, CK)
  • Urinalysis measures (e.g., specific gravity, protein)

Reproductive
  Note: Evidence synthesis and integration conclusions in the assessment are developed separately for male and female reproductive effects.
  Human evidence
  • Menstrual disorders
  Animal evidence
  • Reproductive organ weight/histopathology
  • Reproductive hormones
  • Puberty onset
  • Fertility and pregnancy outcomes (e.g., sperm measures, estrous cyclicity, litter size, gestation length, mating/fertility index)
  • Dam body weight/body weight gain

Respiratory
  Human evidence
  • Measures of respiratory function (e.g., FEV, FVC, MEF)
  • Acute respiratory symptoms (e.g., wheezing, irritation, shortness of breath)
  Animal evidence
  • Respiratory organ weight/histopathology
  • Respiratory irritation
  • Respiratory rate

Thyroid (endocrine)
  Human evidence
  • No studies
  Animal evidence
  • Hormone levels
  • Histopathology
  • Thyroid weight

General toxicity (systemic/whole body)
  Human evidence
  • Sick building syndrome
  • Worker health status
  • Adverse health symptoms (e.g., fatigue, nausea)
  Animal evidence
  • Mortality and clinical observations (e.g., lethargy, weakness, labored breathing) [b]
  • Growth and body weight [b]
  • Food consumption

ALT = alanine aminotransferase; AST = aspartate aminotransferase; A1C = glycated hemoglobin; BUN = blood urea nitrogen; CREA = creatinine; CK = creatine kinase; FEV = forced expiratory volume; FVC = forced vital capacity; MEF = maximal expiratory flow; MER = middle ear reflex.
[a] Based on the currently available evidence base, other health outcomes will not be formally evaluated in this assessment. However, short summaries of the evidence might be included for context. These decisions may be reevaluated if literature search updates identify additional data that may warrant further evaluation.
[b] Effects in dams/pups or animals exposed only during development will be discussed in the developmental and reproductive sections.

6. STUDY EVALUATION (RISK OF BIAS AND SENSITIVITY)

The general approach for evaluating primary health effect studies that meet PECO is described in Section 6.1.
Instructional and informational materials for study evaluations are available at https://hawcprd.epa.gov/assessment/100000039/. The approach is conceptually the same for epidemiology, controlled human exposure, animal toxicology, and in vitro studies, but the application specifics differ; thus, they are described separately in Sections 6.2, 6.3, and 6.4, respectively. Any physiologically based pharmacokinetic (PBPK) models used in the assessment are evaluated using methods described in the Quality Assurance Project Plan for PBPK models (U.S. EPA, 2018b), which is summarized below (see Section 6.6).

6.1. STUDY EVALUATION OVERVIEW FOR HEALTH EFFECT STUDIES

The IRIS Program uses a domain-based approach to evaluate studies. Key concerns for the review of epidemiology and animal toxicology studies are potential bias (factors that affect the magnitude or direction of an effect) and insensitivity (factors that limit the ability of a study to detect a true effect; low sensitivity is a bias toward the null when an effect exists). The study evaluations are aimed at discerning the expected magnitude of any identified limitations (focusing on limitations that could substantively change a result), considering the expected direction of the bias. The study evaluation approach is designed to address a range of study designs, health effects, and chemicals. The general approach for reaching an overall judgment regarding confidence in the reliability of the results is illustrated in Figure 6-1.
(a) Individual evaluation domains

Epidemiology
• Exposure measurement
• Outcome ascertainment
• Participant selection
• Confounding
• Analysis
• Selective reporting
• Sensitivity

Animal
• Allocation
• Observational bias/blinding
• Confounding
• Attrition
• Chemical administration and characterization
• Endpoint measurement
• Results presentation
• Selective reporting
• Sensitivity

In vitro
• Observational bias/blinding
• Variable control
• Selective reporting
• Chemical administration and characterization
• Endpoint measurement
• Results presentation
• Sensitivity

(b) Domain level judgments and overall study rating

Domain judgments

Judgment | Interpretation
Good | Appropriate study conduct relating to the domain and minor deficiencies not expected to influence results.
Adequate | A study that may have some limitations relating to the domain, but they are not likely to be severe or to have a notable impact on results.
Deficient | Identified biases or deficiencies interpreted as likely to have had a notable impact on the results or prevent reliable interpretation of study findings.
Critically Deficient | A serious flaw identified that makes the observed effect(s) uninterpretable. Studies with a critical deficiency are considered "uninformative" overall.

Overall study rating for an outcome

Rating | Interpretation
High | No notable deficiencies or concerns identified; potential for bias unlikely or minimal; sensitive methodology.
Medium | Possible deficiencies or concerns noted but they are unlikely to have a significant impact on results.
Low | Deficiencies or concerns were noted, and the potential for substantive bias or inadequate sensitivity could have a significant impact on the study results or their interpretation.
Uninformative | Serious flaw(s) makes study results uninterpretable but may be used to highlight possible research gaps.

Figure 6-1. Overview of IRIS study evaluation process. (a) An overview of the evaluation process. (b) The evaluation domains and definitions for ratings (i.e., domain and overall judgments, performed on an outcome-specific basis).

To calibrate the assessment-specific considerations, the study evaluation process includes a pilot phase to assess and refine the evaluation process. Following this pilot, at least two reviewers independently evaluate studies to identify characteristics that bear on the informativeness (i.e., validity and sensitivity) of the results. The independent reviewers use structured web-forms for study evaluation housed within the EPA's version of HAWC (https://hawcprd.epa.gov/assessment/100000039/) to record separate judgments for each domain and the overall study for each outcome and unit of analysis, to reach consensus between reviewers, and when necessary, resolve differences by discussion between the reviewers or consultation with additional independent reviewers. As reviewers examine a group of studies, additional chemical-specific knowledge or methodological concerns could emerge, and a second pass of all pertinent studies might become necessary.

In general, considerations for reviewing a study with regard to its conduct for specific health outcomes are based on considerations presented in the IRIS Handbook (U.S. EPA, 2022) and use of existing guideline documents when available, including EPA guidelines for carcinogenicity, neurotoxicity, reproductive toxicity, and developmental toxicity (U.S. EPA, 2005a, 1998, 1996, 1991a).
Authors might be queried to obtain critical information, particularly that involving missing key study design or results information that or additional analyses that could address potential study limitations. During study evaluation, the decision on whether to seek missing information focuses on information that could result in a reevaluation of the overall study confidence for an outcome. Outreach to study authors is documented in HAWC and considered unsuccessful if researchers do not respond to an email or phone request within one month of the attempt to contact Only information or data that can be made publicly available (e.g., within HAWC or HERO) will be considered. When evaluating studies that examine more than one outcome, the evaluation process is explicitly conducted at the individual outcome level within the study. Thus, the same study may have different outcome domain judgments for different outcomes. These measures could still be grouped for evidence synthesis. During review, for each evaluation domain, reviewers reach a consensus judgment of good, adequate, deficient, not reported, or critically deficient. If a consensus is not reached, a third reviewer performs conflict resolution. It is important to emphasize that evaluations are performed in the context of the study's utility for identifying individual hazards. Limitations specific to the usability of the study for dose-response analysis are useful to note and applicable to selecting studies for that purpose (see Section 9), but they do not contribute to the study confidence classifications. These four categories are applied to each evaluation domain for each outcome considered within a study, as follows: • Good represents a judgment that the study was conducted appropriately in relation to the evaluation domain, and any minor deficiencies noted are not expected to influence the study results or interpretation of the study findings. 
• Adequate indicates a judgment that methodological limitations related to the evaluation domain are (or are likely to be) present, but those limitations are unlikely to be severe or to notably impact the study results or interpretation of the study findings.
• Deficient denotes identified biases or deficiencies interpreted as likely to have had a notable impact on the results, or that limit interpretation of the study findings.
• Not reported indicates the information necessary to evaluate the domain question was not available in the study. Depending on the expected impact, the domain may be interpreted as adequate or deficient for the purposes of the study confidence rating.
• Critically deficient reflects a judgment that the study conduct relating to the evaluation domain introduced a serious flaw that is interpreted to be the primary driver of any observed effect(s) or makes the study uninterpretable. Studies with critically deficient judgments in any evaluation domain are almost always classified as overall uninformative for the relevant outcome(s).

Once the evaluation domains are rated, the identified strengths and limitations are considered collectively to reach a study confidence classification of high, medium, or low confidence, or uninformative for each specific health outcome(s). This classification is based on the reviewer judgments across the evaluation domains and considers the likely impact that the noted deficiencies in bias and sensitivity have on the outcome-specific results. There are no predefined weights for the domains, and the reviewers are responsible for applying expert judgment to make this determination.
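One way to picture how domain judgments feed into, but do not mechanically determine, the overall classification is the short sketch below. It tallies the domain ratings for one study and outcome and raises two flags taken from the definitions above: a critically deficient domain (almost always uninformative) and not-reported domains that still need to be interpreted as adequate or deficient. The function name, field names, and example ratings are hypothetical, and the sketch deliberately stops short of assigning a confidence level because the protocol sets no predefined domain weights.

```python
from collections import Counter
from typing import Dict


def summarize_domain_judgments(judgments: Dict[str, str]) -> Dict[str, object]:
    """Tally domain ratings for one study/outcome and flag items for reviewers.

    This does NOT assign high/medium/low confidence: that determination is
    left to expert judgment across domains.
    """
    counts = Counter(judgments.values())
    return {
        "counts": dict(counts),
        # Any critically deficient domain: almost always rated uninformative.
        "likely_uninformative": counts["critically deficient"] > 0,
        # Not-reported domains must be interpreted as adequate or deficient.
        "needs_interpretation": sorted(
            domain for domain, rating in judgments.items() if rating == "not reported"
        ),
    }


# Hypothetical consensus judgments for a single study and outcome.
example = {
    "exposure measurement": "adequate",
    "outcome ascertainment": "good",
    "confounding": "critically deficient",
    "selective reporting": "not reported",
}
summary = summarize_domain_judgments(example)
print(summary["likely_uninformative"], summary["needs_interpretation"])
```

Keeping the output to flags rather than a computed score mirrors the text: the tally supports the reviewers' discussion but does not replace it.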
The study confidence classifications, which reflect a consensus judgment between reviewers, are defined as follows:

• High confidence: No notable deficiencies or concerns were identified; the potential for bias is unlikely or minimal, and the study used sensitive methodology. High confidence studies generally reflect judgments of good across all or most evaluation domains.
• Medium confidence: Possible deficiencies or concerns were identified, but the limitations are unlikely to have a significant impact on the study results or their interpretation. Generally, medium confidence studies include adequate or good judgments across most domains, with the impact of any identified limitation not being judged as severe.
• Low confidence: Deficiencies or concerns are identified, and the potential for bias or inadequate sensitivity is expected to have a significant impact on the study results or their interpretation. Typically, low confidence studies have a deficient evaluation for one or more domains, although some medium confidence studies might have a deficient rating in domain(s) considered to have less influence on the magnitude or direction of effect estimates. Low confidence results are given less weight compared with high or medium confidence results during evidence synthesis and integration (see Sections 7 and 8) and are generally not used as the primary sources of information for hazard identification or derivation of toxicity values unless they are the only studies available (in which case, this significant uncertainty would be emphasized during dose-response analysis). Studies rated low confidence only because of sensitivity concerns are asterisked or otherwise noted because they often require additional consideration during evidence synthesis. Effects observed in studies that are biased toward the null may increase confidence in the results, assuming the study is otherwise well conducted (see Section 8).
• Uninformative: Serious flaw(s) are judged to make the study results uninterpretable for use in the assessment. Studies with critically deficient judgments in any evaluation domain are almost always rated uninformative. Studies with multiple deficient judgments across domains may also be considered uninformative. Given that the findings of interest are considered uninterpretable based on the identified flaws (see above definition of critically deficient) and do not provide information of use to assessment interpretations, these studies have no impact on evidence synthesis or integration judgments and are not usable for dose-response analyses but may be used to highlight research gaps.

As previously noted, study evaluation determinations reached by each reviewer and the consensus judgment between reviewers are recorded in HAWC. Final study evaluations housed in HAWC are made available when the draft is publicly released. The study confidence classifications and their rationales are carried forward and considered as part of evidence synthesis (see Section 11) to help interpret the results across studies.

Critically deficient and Uninformative ratings are uncommon; these ratings are reserved for critical flaws where the study findings are truly uninterpretable due to identified biases. The most frequent situation where they are used for epidemiology studies is when potential confounding has not been considered using any method (e.g., adjustment, stratification, restriction), including unadjusted correlation coefficients or means in cases/controls in a heterogeneous population where confounding is likely.

6.2. EPIDEMIOLOGY STUDY EVALUATION

Evaluation of epidemiology studies of health effects to assess risk of bias and study sensitivity is conducted for the following domains: exposure measurement, outcome ascertainment, participant selection, potential confounding, analysis, study sensitivity, and selective reporting. Bias can result in false positives and negatives (i.e., Types I and II errors), whereas study sensitivity is typically concerned with identifying the latter.

The principles and framework used for evaluating epidemiology studies are adapted from the principles in the Cochrane Risk of Bias in Nonrandomized Studies of Interventions [ROBINS-I; Sterne et al. (2016)] but modified to address environmental and occupational exposures. The types of information that may be the focus of those criteria are listed in Table 6-1. Core and prompting questions, presented in Table 6-2, are used to collect information to guide evaluation of each domain. Core questions represent key concepts while the prompting questions help the reviewer focus on relevant details under each key domain. Exposure- and outcome-specific criteria to use during study evaluation are developed using the core and prompting questions and refined during a pilot phase with engagement from topic-specific experts. The protocol may also be adjusted in the early phases of the study evaluation process if corrections are identified based on initial literature reviews. Exposure and confounding domain considerations specific to ethylbenzene are presented in Sections 6.2.1 to 6.2.6.

Table 6-1. Information relevant to evaluation domains for epidemiology studies

Domain: Types of information that may need to be collected or are important for evaluating the domain

Exposure measurement: Source(s) of exposure (e.g., consumer products, occupational, an industrial accident) and source(s) of exposure data, blinding to outcome, level of detail for job history data, when measurements were taken, type of biomarker(s), assay information, reliability data from repeated-measures studies, validation studies.

Outcome ascertainment: Source of outcome (effect) measure, blinding to exposure status or level, how measured/classified, incident vs. prevalent disease, evidence from validation studies, prevalence (or distribution summary statistics for continuous measures).

Participant selection: Study design, where and when was the study conducted, and who was included? Recruitment process, exclusion and inclusion criteria, type of controls, total eligible, comparison between participants and nonparticipants (or followed and not followed), and final analysis group. Does the study include potential susceptible populations or life stages (see discussion in Section 9)?

Confounding: Background research on key confounders for specific populations or settings; participant characteristic data, by group; strategy/approach for consideration of potential confounding; strength of associations between exposure and potential confounders and between potential confounders and outcome; and degree of exposure to the confounder in the population.

Analysis: Extent (and if applicable, treatment) of missing data for exposure, outcome, and confounders; approach to modeling; classification of exposure and outcome variables (continuous vs. categorical); testing of assumptions; sample size for specific analyses; and relevant sensitivity analyses.

Sensitivity: What are the ages of participants (e.g., not too young in studies of pubertal development)? What is the length of follow-up (for outcomes with long latency periods)? Choice of referent group, the exposure range, and the level of exposure contrast between groups (i.e., the extent to which the "unexposed group" is truly unexposed, and the prevalence of exposure in the group designated as "exposed").

Selective reporting: Are results presented with adequate detail for all the endpoints and exposure measures reported in the methods section, and are they relevant to the PECO? Are results presented for the full sample as well as for specified subgroups? Were stratified analyses (effect modification) motivated by a specific hypothesis?

Table 6-2. Questions to guide the development of criteria for each domain in epidemiology studies

Columns: Domain and core question; Prompting questions; Follow-up questions; Considerations that apply to most exposures and outcomes

Domain: Exposure measurement
Core question: Does the exposure measure reliably distinguish between levels of exposure in a time window considered most relevant for a causal effect with respect to the development of the outcome?
Prompting questions:
For all:
• Does the exposure measure capture the variability in exposure among the participants, considering intensity, frequency, and duration of exposure?
• Does the exposure measure reflect a relevant time window? If not, can the relationship between measures in this time and the relevant time window be estimated reliably?
• Is the exposure measurement likely to be affected by a knowledge of the outcome?
• Is the exposure measurement likely to be affected by the presence of the outcome (i.e., reverse causality)?
For case-control studies of occupational exposures:
• Is exposure based on a comprehensive job history describing tasks, setting, time period, and use of specific materials?
For biomarkers of exposure, general population:
• Is a standard assay used? What are the intra- and inter-assay coefficients of variation? Is the assay likely to be affected by contamination? Are values less than the limit of detection dealt with adequately?
• What exposure time period is reflected by the biomarker? If the half-life is short, what is the correlation between serial measurements of exposure?
Follow-up questions: Is the degree of exposure misclassification likely to vary by exposure level? If the correlation between exposure measurements is moderate, is there an adequate statistical approach to ameliorate variability in measurements? If there is a concern about the potential for bias, what is the predicted direction or distortion of the bias on the effect estimate (if there is enough information)?
Considerations that apply to most exposures and outcomes: These considerations require customization to the exposure and outcome (relevant timing of exposure).
Good
• Valid exposure assessment methods used, which represent the etiologically relevant time period of interest.
• Exposure misclassification is expected to be minimal.
Adequate
• Valid exposure assessment methods used, which represent the etiologically relevant time period of interest.
• Exposure misclassification may exist but is not expected to greatly change the effect estimate.
Deficient
• Valid exposure assessment methods used, which represent the etiologically relevant time period of interest. Specific knowledge about the exposure and outcome raises concerns about reverse causality, but there is uncertainty whether it is influencing the effect estimate.
• Exposed groups are expected to contain a notable proportion of unexposed or minimally exposed individuals, the method did not capture important temporal or spatial variation, or there is other evidence of exposure misclassification that would be expected to notably change the effect estimate.
Critically deficient
• Exposure measurement does not characterize the etiologically relevant time period of exposure or is not valid.
• There is evidence that reverse causality is very likely to account for the observed association.
• Exposure measurement was not independent of outcome status.

Domain: Outcome ascertainment
Core question: Does the outcome measure reliably distinguish the presence or absence (or degree of severity) of the outcome?
Prompting questions:
For all:
• Is outcome ascertainment likely to be affected by knowledge of, or presence of, exposure (e.g., consider access to health care, if based on self-reported history of diagnosis)?
For case-control studies:
• Is the comparison group without the outcome (e.g., controls in a case-control study) based on objective criteria with little or no likelihood of inclusion of people with the disease?
For mortality measures:
• How well does cause-of-death data reflect occurrence of the disease in an individual? How well do mortality data reflect incidence of the disease?
For diagnosis of disease measures:
• Is the diagnosis based on standard clinical criteria? If it is based on self-report of the diagnosis, what is the validity of this measure?
For laboratory-based measures (e.g., hormone levels):
• Is a standard assay used? Does the assay have an acceptable level of inter-assay variability? Is the sensitivity of the assay appropriate for the outcome measure in this study population?
Follow-up questions: Is there a concern that any outcome misclassification is nondifferential, differential, or both? What is the predicted direction or distortion of the bias on the effect estimate (if there is enough information)?
Considerations that apply to most exposures and outcomes: These considerations require customization to the outcome.
Good
• High certainty in the outcome definition (i.e., specificity and sensitivity), minimal concerns with respect to misclassification.
• Assessment instrument is validated in a population comparable to the one from which the study group was selected.
Adequate
• Moderate confidence that outcome definition was specific and sensitive, some uncertainty with respect to misclassification but not expected to greatly change the effect estimate.
• Assessment instrument is validated but not necessarily in a population comparable to the study group.
Deficient
• Outcome definition was not specific or sensitive.
• Uncertainty regarding validity of assessment instrument.
Critically deficient
• Invalid/insensitive marker of outcome.
• Outcome ascertainment is very likely to be affected by knowledge of, or presence of, exposure.
Note: Lack of blinding should not be automatically construed to be critically deficient.

Domain: Participant selection
Core question: Is there evidence that selection into or out of the study (or analysis sample) is jointly related to exposure and to outcome?
Prompting questions:
For longitudinal cohort:
• Did participants volunteer for the cohort based on knowledge of exposure and/or preclinical disease symptoms? Was entry into the cohort or continuation in the cohort related to exposure and outcome?
For occupational cohort:
• Did entry into the cohort begin with the start of the exposure?
• Was follow-up or outcome assessment incomplete, and if so, was follow-up related to both exposure and outcome status?
• Could exposure produce symptoms that would result in a change in work assignment/work status ("healthy worker survivor effect")?
For case-control study:
• Were controls representative of population and time periods from which cases were drawn?
• Are hospital controls selected from a group whose reason for admission is independent of exposure?
• Could recruitment strategies, eligibility criteria, or participation rates result in differential participation relating to both disease and exposure?
For population-based survey:
• Was recruitment based on advertisement to people with knowledge of exposure, outcome, and hypothesis?
Follow-up questions: Are differences in participant enrollment and follow-up evaluated to assess bias? If there is a concern about the potential for bias, what is the predicted direction or distortion of the bias on the effect estimate (if there is enough information)? Are appropriate analyses performed to address changing exposures over time in relation to symptoms? Is there a comparison of participants and nonparticipants to address whether differential selection is likely?
Considerations that apply to most exposures and outcomes: These considerations may require customization to the outcome. This could include determining what study designs effectively allow analyses of associations appropriate to the outcome measures (e.g., design to capture incident vs. prevalent cases, design to capture early pregnancy loss).
Good
• Minimal concern for selection bias based on description of recruitment process (e.g., selection of comparison population, population based random sample selection, recruitment from sampling frame including current and previous employees).
• Exclusion and inclusion criteria are specified and do not induce bias.
• Participation rate is reported at all steps of study (e.g., initial enrollment, follow-up, selection into analysis sample). If rate is not high, there is appropriate rationale for why it is unlikely to be related to exposure (e.g., comparison between participants and nonparticipants or other available information indicates differential selection is not likely).
Adequate
• Enough of a description of the recruitment process to be comfortable that there is no serious risk of bias.
• Inclusion and exclusion criteria are specified and do not induce bias.
• Participation rate is incompletely reported but available information indicates participation is unlikely to be related to exposure.
Deficient
• Little information on recruitment process, selection strategy, sampling framework and/or participation or aspects of these processes raise the potential for bias (e.g., healthy worker effect, survivor bias).
Critically deficient
• Aspects of the processes for recruitment, selection strategy, sampling framework, or participation result in concern that selection bias resulted in a large impact on effect estimates (e.g., convenience sample with no information about recruitment and selection, cases and controls are recruited from different sources with different likelihood of exposure, recruitment materials stated outcome of interest, and potential participants are aware of or are concerned about specific exposures).

Domain: Confounding
Core question: Is confounding of the effect of the exposure likely?
Prompting questions:
Is confounding adequately addressed by considerations in:
• Participant selection (matching or restriction)?
• Accurate information on potential confounders and statistical adjustment procedures?
• Lack of association between confounder and outcome, or confounder and exposure in the study?
• Information from other sources?
Is the assessment of confounders based on a thoughtful review of published literature, potential relationships (e.g., as can be gained through directed acyclic graphing), and minimizing potential overcontrol (e.g., inclusion of a variable on the pathway between exposure and outcome)?
Follow-up questions: If there is a concern about the potential for bias, what is the predicted direction or distortion of the bias on the effect estimate (if there is enough information)?
Considerations that apply to most exposures and outcomes: These considerations require customization to the exposure and outcome, but this may be limited to identifying key covariates.
Good
• Conveys strategy for identifying key confounders. This may include a priori biological considerations, published literature, causal diagrams, or statistical analyses; with recognition that not all "risk factors" are confounders.
• Inclusion of potential confounders in statistical models not based solely on statistical significance criteria (e.g., p < 0.05 from stepwise regression).
• Does not include variables in the models that are likely to be influential colliders or intermediates on the causal pathway.
• Key confounders are evaluated appropriately and considered to be unlikely sources of substantial confounding. This often will include
  o Presenting the distribution of potential confounders by levels of the exposure of interest and/or the outcomes of interest (with amount of missing data noted),
  o Consideration that potential confounders are rare among the study population or are expected to be poorly correlated with exposure of interest,
  o Consideration of the most relevant functional forms of potential confounders,
  o Examination of the potential impact of measurement error or missing data on confounder adjustment, and
  o Presenting a progression of model results with adjustments for different potential confounders, if warranted.
Adequate
• Similar to good but may not have included all key confounders, or less detail may be available on the evaluation of confounders (e.g., subbullets in good). It is possible that residual confounding could explain part of the observed effect, but concern is minimal.
Deficient
• Does not include variables in the models that are likely to be influential colliders or intermediates on the causal pathway.
And any of the following:
• The potential for bias to explain some of the results is high based on an inability to rule out residual confounding, such as a lack of demonstration that key confounders of the exposure-outcome relationships are considered;
• Descriptive information on key confounders (e.g., their relationship relative to the outcomes and exposure levels) is not presented; or
• Strategy of evaluating confounding is unclear or is not recommended (e.g., only based on statistical significance criteria or stepwise regression [forward or backward elimination]).
Critically deficient
• Includes variables in the models that are colliders and/or intermediates in the causal pathway, indicating that substantial bias is likely from this adjustment or
• Confounding is likely present and not accounted for, indicating that all of the results are most likely due to bias.

Domain: Analysis
Core question: Does the analysis strategy and presentation convey the necessary familiarity with the data and assumptions?
Prompting questions:
• Are missing outcome, exposure, and covariate data recognized, and if necessary, accounted for in the analysis?
• Does the analysis appropriately consider variable distributions and modeling assumptions?
• Does the analysis appropriately consider subgroups of interest (e.g., based on variability in exposure level or duration or susceptibility)?
• Is an appropriate analysis used for the study design?
• Is effect modification considered, based on considerations developed a priori?
• Does the study include additional analyses addressing potential biases or limitations (i.e., sensitivity analyses)?
Follow-up questions: If there is a concern about the potential for bias, what is the predicted direction or distortion of the bias on the effect estimate (if there is enough information)?
Considerations that apply to most exposures and outcomes: These considerations may require customization to the outcome. This could include the optimal characterization of the outcome variable and ideal statistical test (e.g., Cox regression).
Good
• Use of an optimal characterization of the outcome variable.
• Quantitative results are presented (effect estimates and confidence limits or variability in estimates) (i.e., not presented only as a p-value or "significant"/"not significant").
• Descriptive information about outcome and exposure is provided (where applicable).
• Amount of missing data is noted and addressed appropriately (discussion of selection issues: missing at random vs. differential).
• Where applicable, for exposure, includes the limit of detection (LOD) and percentage below the LOD, and decision to use log transformation.
• Includes analyses that address robustness of findings, e.g., examination of exposure-response (explicit consideration of nonlinear possibilities, quadratic, spline, or threshold/ceiling effects included, when feasible); relevant sensitivity analyses; effect modification examined based only on a priori rationale with sufficient numbers.
• No deficiencies in analysis evident. Discussion of some details may be absent (e.g., examination of outliers).
Adequate
Same as good, except:
• Descriptive information about exposure is provided (where applicable) but may be incomplete; might not have discussed missing data, cutpoints, or shape of distribution.
• Includes analyses that address robustness of findings (examples in good), but some important analyses are not performed.
Deficient
• Does not conduct analysis using optimal characterization of the outcome variable.
• Descriptive information about exposure levels is not provided (where applicable).
• Effect estimate and p-value are presented, without standard error or confidence interval.
• Results are presented as statistically "significant"/"not significant."
Critically deficient
• Results of analyses of effect modification are examined without clear a priori rationale and without providing main/principal effects (e.g., presentation only of statistically significant interactions that were not hypothesis driven).
• Analysis methods are not appropriate for design or data of the study.

Domain: Selective reporting
Core question: Is there reason to be concerned about selective reporting?
Prompting questions:
• Are results provided for all the primary analyses described in the methods section?
• Is there appropriate justification for restricting the amount and type of results that are shown?
• Are only statistically significant results presented?
Follow-up questions: If there is a concern about the potential for bias, what is the predicted direction or distortion of the bias on the effect estimate (if there is enough information)?
Considerations that apply to most exposures and outcomes: These considerations generally do not require customization and may have fewer than four levels.
Good
• The results reported by study authors are consistent with the primary and secondary analyses described in a registered protocol or methods paper.
Adequate
• The authors described their primary (and secondary) analyses in the methods section and results are reported for all primary analyses.
Deficient
• Concerns are raised based on previous publications, a methods paper, or a registered protocol indicating that analyses are planned or conducted that are not reported, or that hypotheses originally considered to be secondary are represented as primary in the reviewed paper.
• Only subgroup analyses are reported suggesting that results for the entire group are omitted.
• Only statistically significant results are reported.

Domain: Sensitivity
Core question: Is there a concern that sensitivity of the study is not adequate to detect an effect?
Prompting questions:
• Is the exposure range adequate to detect associations and exposure-response relationships?
• Was the appropriate population included?
• Was the length of follow-up adequate? Is the time/age of outcome ascertainment optimal given the interval of exposure and the health outcome?
• Are there other aspects related to risk of bias or otherwise that raise concerns about sensitivity?
Considerations that apply to most exposures and outcomes: These considerations may require customization to the exposure and outcome. Depending on the needs of the assessment, there may be fewer than four rating levels. Some study features that affect study sensitivity may have already been included in the other evaluation domains; these should be noted in this domain again, along with any features that have not been addressed elsewhere so that the rating provides an overall summary of factors that may impact sensitivity. When determining the overall study confidence rating, the evaluator should be conscious that a limitation could contribute to multiple domains and not double-penalize the study.
Some considerations include:
Good
• The range of exposure levels provides sufficient variability in exposure distribution and/or sufficient range or contrasts (e.g., across groups or exposure categories) to detect associations or exposure-response relationships that may be present.
• The population was exposed to levels expected to have an impact on response.
• The study population was at risk of developing the outcomes of interest (e.g., ages, life stage, sex).
• The timing of outcome ascertainment was appropriate given expected latency for outcome development (i.e., adequate follow-up interval).
• There was evidence of sufficient statistical power (which may include formal power calculations) to observe an effect if it exists.
• No other concerns raised regarding study sensitivity (e.g., no evidence that results would be attenuated enough to preclude detection of an adverse health effect).
Adequate
• Same considerations as good, except:
  o Issues are identified that could reduce sensitivity, but they are unlikely to impact the overall findings of the study.
Deficient
• Concerns were raised about the issues described for good that are expected to notably decrease the sensitivity of the study to detect associations for the outcome (i.e., reasonably high likelihood of a false null result).
• Note: Deficient sensitivity indicates that null findings should be interpreted with caution and may not represent a lack of association.
Critically deficient
• Severe concerns were raised about the sensitivity of the study such that any observed association is uninterpretable (e.g., exposure gradients/contrasts that precluded an ability to distinguish exposure levels between study participants).

6.2.1. Epidemiological Study Evaluation Considerations Specific to Exposure Domain for Ethylbenzene

Ethylbenzene is present in solvents, inks, paint, pesticides, and other household products, and concentrations indoors are typically higher than levels measured outdoors. Traffic emissions, escape of vapors at gas stations or car repair garages, car and truck idling in parking lots and border crossings, and emissions from the petrochemical industry are primary contributors to ethylbenzene concentrations in ambient air. While ethylbenzene from ambient air contributes to indoor levels, variability of ethylbenzene levels in residences is primarily due to indoor sources, such as the presence of a smoker in the home (Wallace et al., 1987) or product use (Adgate et al., 2004), as well as housing characteristics, such as attached garages and ventilation (Sexton et al., 2007). Because there are unique sources both indoors and outdoors, individual-level exposure assessments for health effects studies ideally would capture contributions from time at home, school, or work, and in transit.

6.2.2. Exposure Assessment Approaches Used in Epidemiology Studies of Ethylbenzene and Potential Misclassification

A few of the epidemiology studies in the ethylbenzene inventory characterized individual exposures using personal monitoring over a few days.
Most of these studies measured average concentrations in the home over a period of days to a few weeks during one or more seasons. Because indoor levels typically are higher than outdoor concentrations and people typically spend the majority of their time indoors, measurements of exposure levels in the home are likely to adequately characterize personal exposure. A comparison of exposure estimates in children or nonsmoking adults in Minnesota using personal sampling and a time-weighted model, based on indoor measurements in their homes and schools (or work) and outdoors at school (or community), found that the model with only the home measurements was comparable to the model containing all microenvironments in explaining the variation in the personal exposure measurements (Adgate et al., 2004; Sexton et al., 2004a). The degree to which indoor residential measurements explain personal exposure likely depends on the local and meteorological characteristics in different locations. Within communities, variation in indoor aromatic VOC concentrations is primarily due to variability between residences and between seasons, with much lower variability due to variation between cities or measurement error (Tia et al., 2012). Within a residence, concentrations measured in different rooms (e.g., living room, bedroom) are highly correlated (Wallace et al., 1991). Therefore, to characterize average indoor exposure to ethylbenzene over longer time frames (e.g., the previous year), sampling from at least one room would be adequate, but multiple sampling periods in different seasons would provide an estimate with less exposure misclassification compared with estimates based on measurements during one season.

Exposure estimates based on outdoor concentrations or air quality models capture a small portion of an individual's average exposure. Studies have demonstrated that estimates based on ambient exposures are an underestimate of an individual's personal exposure (Sexton et al., 2004b), and, similarly, increasing evidence suggests the importance of indoor sources (Konkle et al., 2020). However, health effect studies may be able to identify associations with ambient ethylbenzene exposure using methods to characterize the spatial or temporal variation in communities, primarily due to traffic and industrial point sources. Annual exposure estimates based on land use regression (LUR) that capture finer scale concentration gradients across a community are expected to result in less exposure misclassification compared with methods based on measurements from central site monitors accounting for the relative distance to subjects' homes (Mukerjee et al., 2009; Aguilera et al., 2008). However, the use of exposure estimates from LUR models in epidemiology studies of air pollution can introduce measurement error, with attenuated effect estimates and inflated variance, if the spatial variation within a community has not been adequately characterized (Basagana et al., 2013). Publications reporting studies of ambient ethylbenzene exposure should describe the approach to model development and present information about the sources of ethylbenzene emissions and their impact on spatial variation. Still, due to concerns about misclassification from ambient exposure assessment approaches, these studies will be unable to reach the "good" rating in the exposure domain.
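As an illustration of the time-weighted microenvironment model mentioned earlier in this section, the following minimal Python sketch computes a time-weighted average concentration from concentrations measured in several microenvironments. The function name, the microenvironment labels, and all numeric values are hypothetical and are not taken from the cited studies; the sketch only shows the arithmetic and why a home-only estimate can approximate the full model when most time is spent at home.

```python
from typing import Dict


def time_weighted_exposure(concentrations: Dict[str, float],
                           hours: Dict[str, float]) -> float:
    """Return the time-weighted average concentration across microenvironments.

    concentrations: average concentration per microenvironment (e.g., ug/m3)
    hours: time spent in each microenvironment over the same period
    """
    if set(concentrations) != set(hours):
        raise ValueError("each microenvironment needs a concentration and a duration")
    total_hours = sum(hours.values())
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    weighted_sum = sum(concentrations[m] * hours[m] for m in concentrations)
    return weighted_sum / total_hours


# Hypothetical daily values (ug/m3 and hours per day), for illustration only.
conc = {"home": 4.0, "school_or_work": 2.0, "outdoors": 1.0}
time_spent = {"home": 16.0, "school_or_work": 6.0, "outdoors": 2.0}

twa_all = time_weighted_exposure(conc, time_spent)
twa_home_only = time_weighted_exposure({"home": conc["home"]},
                                       {"home": time_spent["home"]})
print(f"All microenvironments: {twa_all:.2f} ug/m3")
print(f"Home only:             {twa_home_only:.2f} ug/m3")
```

In this made-up example the home measurement dominates the weighted average because two-thirds of the day is spent at home, which is the intuition behind treating residential measurements as a reasonable proxy for personal exposure.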
Some of the studies of ambient exposure in the ethylbenzene inventory used annual average exposure estimates for each census tract generated by regional air quality models based on the National Emissions Inventory (e.g., National Air Toxics Assessment [NATA] data) (https://www.epa.gov/national-air-toxics-assessment/2014-nata-assessment-results). NATA estimates are based on the National Emissions Inventory (NEI) for a specific year, which uses empirical and engineering factors, not measurements, but the models account for spatial variation incorporating secondary formation and decay, pollutant dispersion, meteorology, population activity data, and several sources of exposure. NATA is a screening-level tool to look at annual population exposures. NATA has been found to underpredict concentrations of many VOCs due to missing and underestimated emission sources and other reasons (U.S. EPA, 2010a). For ethylbenzene, a comparison of annual average concentrations at 242 specific sites estimated using the model outputs from 2005 and monitoring data found a median model-to-monitor ratio of 0.471 with 85% of the modeled estimates underestimating those based on monitoring data. Twenty percent of the modeled values were within 30% of the values based on monitors and 41% of the modeled values were within a factor of 2 of those based on monitoring data (https://www3.epa.gov/ttn/chief/conference/eil9/sessionl/oommen.pdf). These analyses indicate that exposure misclassification may be a concern for individual exposure estimates based on the 2005 (and previous) NATA models. Other sources of exposure misclassification in epidemiology studies that use NATA estimates include the use of exposure assignments at the census tract level (not individual level) and the use of annual average concentration estimates for only one year (i.e., 1996 or 2005).
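The model-to-monitor statistics quoted above (median ratio, share of sites underestimated, share within 30%, share within a factor of 2) can be computed as in the following Python sketch. The function name, the exact cutoffs used to define "within 30%" and "within a factor of 2," and the input values are assumptions made for illustration; they are not the data or the method of the cited comparison.

```python
from statistics import median
from typing import Dict, Sequence


def model_to_monitor_summary(modeled: Sequence[float],
                             monitored: Sequence[float]) -> Dict[str, float]:
    """Summarize agreement between modeled and monitored site concentrations."""
    if len(modeled) != len(monitored) or len(modeled) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(obs <= 0 for obs in monitored):
        raise ValueError("monitored concentrations must be positive")
    ratios = [mod / obs for mod, obs in zip(modeled, monitored)]
    n = len(ratios)
    return {
        "median_ratio": median(ratios),
        # Ratio below 1 means the model underestimates the monitor.
        "frac_underestimated": sum(r < 1.0 for r in ratios) / n,
        # Assumed definition: ratio between 0.7 and 1.3.
        "frac_within_30pct": sum(0.7 <= r <= 1.3 for r in ratios) / n,
        # Assumed definition: ratio between 0.5 and 2.0.
        "frac_within_factor_2": sum(0.5 <= r <= 2.0 for r in ratios) / n,
    }


# Hypothetical annual averages at five sites (same units), illustration only.
modeled_values = [0.2, 0.5, 0.9, 1.2, 0.4]
monitored_values = [1.0, 1.0, 1.0, 1.0, 1.0]
summary = model_to_monitor_summary(modeled_values, monitored_values)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A median ratio well below 1 combined with a high fraction of underestimated sites, as reported for the 2005 model outputs, points to systematic underprediction rather than random scatter.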
Exposure estimates in a few of the epidemiology studies of ethylbenzene exposure were derived using the Community Multi-scale Air Quality (CMAQ) model, which also uses emissions data, as well as meteorological and atmospheric chemistry inputs. CMAQ models concentrations over large regions using a 36-km horizontal resolution domain but has also been used to model concentrations at a finer resolution (i.e., 1 km). Exposure estimates based on a grid size of 36 km would have limited spatial resolution, and therefore exposure misclassification would be of greater concern.

6.2.3. ADME and Notes Relevant to Biomarkers

Similar to many VOCs, ethylbenzene is rapidly distributed in the body and can undergo metabolism prior to elimination, either unchanged as the parent compound in exhaled breath or as metabolic derivatives in urine. Thus, ethylbenzene is generally not persistent in the body: the half-life in blood is less than a half-hour (ATSDR, 2010). A complex multiexponential elimination curve for ethylbenzene was measured in the blood of four individuals after a six-hour exposure to a mixture of VOCs, including ethylbenzene. While declines after exposure ended were rapid during the first hour, subsequent decline slowed and a three-compartment model appeared to be the best fit to the data (Ashley and Prah, 1997). Although bioaccumulation may occur, the concentration in blood primarily signifies recent exposure levels and is not considered a relevant exposure measure for chronic disease (e.g., prevalent cardiovascular disease). Analyses of matched blood values and personal air measurements of BTEX compounds (benzene, toluene, ethylbenzene, o-xylene, m-/p-xylene) have found relatively low correlations, possibly due to mistiming of the air sampling or other unknown factors (Su et al., 2011; Sexton et al., 2005). Because of rapid clearance, blood concentrations would reflect exposures occurring just prior to a blood draw.

In contrast to blood biomarkers, urinary biomarkers of VOCs have delayed clearance and therefore may be representative of exposures in the period of hours to days (Heinrich-Ramm et al., 2000). Therefore, urinary biomarkers are preferable to blood biomarkers to assess daily exposures to VOCs potentially relevant to chronic health outcomes, though it is important to adjust for kidney function when using urinary measures (Heinrich-Ramm et al., 2000). However, it should be noted that the primary measurable metabolites for ethylbenzene (mandelic acid and phenylglyoxylic acid) are not specific to ethylbenzene and are also derived from styrene, which is commonly detected in conjunction with ethylbenzene (Capella et al., 2019). As such, the use of urinary biomarkers should be restricted to cases where substantial co-exposure to styrene can be ruled out.

Overall, in comparison to outdoor or indoor air measurement alone, the use of biomarkers can account for exposures from multiple routes and sources and may have smaller variance ratios than air measurements (Lin et al., 2005). They may also better capture the growing importance of exposure to VOCs from volatile chemical products (McDonald et al., 2018), which may not be accounted for in traditional ambient exposure models.

6.2.4. Time Frames Represented by Exposure Assessments

The time frame represented by the exposure estimates should correspond to the period in which the health outcomes were expected to have developed.
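To illustrate why the time frame of a biomarker matters, the short Python sketch below shows how quickly a blood concentration would decline under a single-exponential approximation. This is a deliberate simplification: Section 6.2.3 notes that the measured elimination curve is multiexponential, with slower later phases. The 0.5-hour half-life is an assumed value chosen to match the "less than a half-hour" statement in Section 6.2.3, and the function name is hypothetical.

```python
def fraction_remaining(hours_since_exposure: float, half_life_hours: float) -> float:
    """Fraction of the initial blood concentration left under first-order decay."""
    if hours_since_exposure < 0 or half_life_hours <= 0:
        raise ValueError("time must be non-negative and half-life positive")
    return 0.5 ** (hours_since_exposure / half_life_hours)


# Assumed half-life in hours (illustrative upper bound for the rapid phase).
ASSUMED_HALF_LIFE_H = 0.5

for t in (0.5, 1.0, 6.0, 24.0):
    remaining = fraction_remaining(t, ASSUMED_HALF_LIFE_H)
    print(f"{t:>4} h after exposure ends: {remaining:.2e} of the initial blood level")
```

Under this simplification, almost none of the initial blood level remains a few hours after exposure ends, which is consistent with treating blood concentrations as markers of very recent exposure only.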
Indoor exposure assessments representing a period of week(s) in more than one season could reasonably characterize average exposure over the previous year and would be relevant to immune-related or other symptoms (e.g., asthma, wheezing illness, allergy symptoms, sensory irritation) occurring over the previous several weeks to a year. Daily sampling is best, but periodic sampling on a less than daily basis may be sufficient depending on the variability in air concentrations. Developmental outcomes should be evaluated in relation to the relevant critical exposure periods during pregnancy if they are known. Exposure measurements with shorter time frames are less informative for studying the prevalence or incidence of chronic disease, such as physician-diagnosed asthma, cardiovascular disease, or cancer.

6.2.5. Correlation Between BTEX Compounds and Potential Confounding

BTEX compounds, all traffic pollutants, are correlated in ambient air (r = 0.43 to 0.59) (Sexton et al., 2004a). Ethylbenzene and o-xylene concentrations in blood were correlated in the NHANES III and continuous NHANES cohorts (r = 0.81 and 0.89, respectively) (see Appendix C in Su et al. (2011)). Confounding of observed associations with health outcomes by other BTEX compounds is best considered when interpreting results across studies if they analyzed exposures from different locations or settings (e.g., traffic-related, indoor product use).

6.2.6. Exposure Domain Evaluation Levels

The following exposure domain rating levels will be applied. The exposure assessment methods will be evaluated for how well they characterize either (1) total personal/residential or (2) outdoor (ambient) ethylbenzene exposure to the individuals in the study.

Table 6-3. Estimates representing total individual-level exposure based on personal or residential monitoring

Rating and criteria

Good: Integrated personal measurements using passive monitors, over multiple 24-hr periods (since there could be relevant daily variations), or time-weighted summary concentrations incorporating concentrations in residence and school/workplace. Sampling details provided including type of samplers, placement of samplers, sampling periods, status of activities in structures, chemical analysis methods (or citation provided). Time frame of measurements appropriate to development of health outcome.
OR
Area measurements in home using passive or active monitors, average of measurements in one or more rooms; average over longer periods is better (weeks) and multiple seasons if estimating annual average. Sampling details provided including type of samplers, placement of samplers, sampling periods, status of activities in structures, chemical analysis methods (or citation provided). Time frame of measurements appropriate to development of health outcome.
OR
In cases where co-exposure to styrene can be ruled out, urinary biomarkers collected via standardized procedures (e.g., gas chromatography-mass spectrometry, GC/MS) and appropriate QC.

Adequate: Area measurements in home using passive or active monitors, average of measurements in one or more rooms; average of shorter duration (less than 1 wk) with information about monitoring protocol, and multiple seasons if estimating annual average. Sampling details provided including type of samplers, placement of samplers, sampling periods, status of activities in structures, chemical analysis methods (or citation provided). Time frame of measurements appropriate to development of health outcome.
Deficient: Area measurements in home obtained on one occasion if estimating annual average. (A single measure does not capture daily variations in the relative proportion of time in different microenvironments nor variations in concentrations of VOCs (Kim et al., 2002).) Sampling details provided including type of samplers, placement of samplers, sampling periods, status of activities in structures, chemical analysis methods (or citation provided). Time frame of measurements appropriate to development of health outcome.
OR
Use of questionnaires or observations of VOC products in the home by trained study personnel.
OR
Blood biomarkers collected via standardized procedures (e.g., GC/MS) and appropriate QC.
OR
Urinary biomarkers (not specific to ethylbenzene and where there is concern for co-exposure to styrene) collected via standardized procedures (e.g., GC/MS) and appropriate QC.
OR
Air sampling with gas chromatography-flame ionization detection (the preferred method would use mass spectrometry detection, e.g., gas chromatography-mass spectrometry).

Critically deficient: Time frame for exposure estimation was not appropriate to development of health outcome.

Table 6-4. Exposure to ethylbenzene in ambient air

Rating and criteria

Good: No studies using ambient exposure assessment approaches can reach classification of "good" due to concerns regarding misclassification of personal/individual-level exposure.

Adequate: Average estimates based on land use regression (LUR) models developed for the location where the study was conducted, including a description of model development and sufficient information about how the model adequately characterizes spatial variation in the community given what was known about sources. Time frame of measurements appropriate to development of health outcome. Potentially, methods other than LUR might fall into this category if detailed validation information was provided to ensure the model adequately characterizes spatial variation.

Deficient: Average estimates based on land use regression models developed for the location where the study was conducted, but some uncertainties remain regarding how the model was developed or how the model adequately characterizes spatial variation in the community given what was known about sources.
OR
Annual average estimates or other time-period-specific averages appropriate to development of health outcome based on NATA data linked to residential census tract.
OR
Annual average estimates or other time-period-specific averages appropriate to development of health outcome based on chemical transport models (CMAQ) using spatially resolved grid size (i.e., 1 km).
OR
Annual average estimates based on proximity to central monitor for homes, with multiple sampling locations in a community, with some description of how well the monitoring network characterizes variation due to sources. Time frame of measurement averages appropriate to development of health outcome.

Critically deficient: Annual average estimates or other time-period-specific averages appropriate to development of health outcome based on CMAQ using large grid (resolution) size (i.e., 36 km).
OR
Time frame for exposure estimation was not appropriate to development of health outcome.
OR
Air sampling with gas chromatography-flame ionization detection (the preferred method would use mass spectrometry detection, e.g., gas chromatography-mass spectrometry).

6.3. CONTROLLED HUMAN EXPOSURE STUDY EVALUATION

This study design involves human volunteers to test specific hypotheses about short-term exposures and biological responses that inform potential mechanisms and understanding of exposure-response patterns. The exposures are generated in the laboratory to achieve predetermined concentrations for periods of minutes to hours. For study evaluation, a process incorporating aspects of the approaches used for epidemiology studies and experimental animal studies, as well as the ROBINS-I tool discussed in Section 6.2 (Sterne et al., 2016), is used to evaluate controlled exposure studies in humans. Controlled human exposure studies are evaluated for important attributes of experimental studies, including randomization of exposure assignments, blinding of subjects and investigators, exposure generation, inclusion of a clean air control exposure (if applicable), study sensitivity, and other aspects of the exposure protocol. Sample size is considered, as is the process of recruitment and selection of study subjects and differences in characteristics between groups reflecting potential differences in sensitivity.

6.4. EXPERIMENTAL ANIMAL STUDY EVALUATION

Using the principles described in Section 6.1, animal studies of health effects are evaluated for risk of bias and sensitivity in the following domains: allocation, observational bias/blinding, confounding, selective reporting, attrition, chemical administration and characterization, endpoint measurement and validity, results presentation and comparisons, and sensitivity (see Table 6-5). The rationale for judgments is documented at the outcome level. The evaluation documentation in HAWC includes the identified limitations and their expected impact on the overall confidence level. To the extent possible, the rationale will reflect an interpretation of the potential influence on the outcome-specific results, including the direction or magnitude of influence (or both).
Table 6-5. Domains, questions, and general considerations to guide the evaluation of animal toxicology studies

Domain: Allocation
Core question: Were animals assigned to experimental groups using a method that minimizes selection bias?
Prompting questions: For each study:
• Did each animal or litter have an equal chance of being assigned to any experimental group (i.e., random allocation)?ᵃ
• Is the allocation method described?
• Aside from randomization, were any steps taken to balance variables across experimental groups during allocation?
General considerations: These considerations typically do not need to be refined by assessment teams. A judgment and rationale for this domain should be given for each cohort or experiment in the study.
Good: Experimental groups were randomized, and any specific randomization procedure was described or inferable (e.g., computer-generated scheme. Note that normalization is not the same as randomization [see response for adequate]).
Adequate: Authors report that groups were randomized but do not describe the specific procedure used (e.g., "animals were randomized"). Alternatively, authors used a nonrandom method to control for important modifying factors across experimental groups (e.g., body-weight normalization).
Not reported (interpreted as deficient): No indication of randomization of groups or other methods (e.g., normalization) to control for important modifying factors across experimental groups.
Critically deficient: Bias in the animal allocations was reported or inferable.

Domain: Observational bias/blinding
Core question: Did the study implement measures to reduce observational bias?
Prompting questions: For each endpoint/outcome or grouping of endpoints/outcomes in a study:
• Does the study report blinding or other procedures for reducing observational bias?
• If not, did the study use a design or approach for which such procedures can be inferred?
• What is the expected impact of failure to implement (or report implementation) of these procedures on results?
General considerations: These considerations typically do not need to be refined by the assessment teams. (Note that it can be useful for teams to identify highly subjective measures of endpoints/outcomes where observational bias may strongly influence results prior to performing evaluations.) A judgment and rationale for this domain should be given for each endpoint/outcome or group of endpoints/outcomes investigated in the study.
Good: Measures to reduce observational bias were described (e.g., blinding to conceal treatment groups during endpoint evaluation; consensus-based evaluations of histopathology lesions).ᵇ
Adequate: Methods for reducing observational bias (e.g., blinding) can be inferred or were reported but described incompletely.
Not reported: Measures to reduce observational bias were not described.
• (Interpreted as adequate) The potential concern for bias was mitigated based on use of automated/computer-driven systems, standard laboratory kits, relatively simple, objective measures (e.g., body or tissue weight), or screening-level evaluations of histopathology.
• (Interpreted as deficient) The potential impact on the results is major (e.g., outcome measures are highly subjective).
Critically deficient: Strong evidence for observational bias that impacted the results.

Domain: Confounding
Core question: Are variables with the potential to confound or modify results controlled for and consistent across experimental groups?
Note: Consideration of overt toxicity (possibly masking more specific effects) is addressed under endpoint measurement reliability.
Prompting questions: For each study:
• Are there differences across the treatment groups, considering both differences related to the exposure (e.g., coexposures, vehicle, diet, palatability) and other aspects of the study design or animal groups (e.g., animal source, husbandry, or health status), that could bias the results?
• If differences are identified, to what extent are they expected, based on a specific scientific understanding, to impact the results?
General considerations: These considerations may need to be refined by assessment teams, as the specific variables of concern can vary by experiment or chemical. A judgment and rationale for this domain should be given for each cohort or experiment in the study, noting when the potential for confounding is restricted to specific endpoints/outcomes.
Good: Outside of the exposure of interest, variables that are likely to confound or modify results appear to be controlled for and consistent across experimental groups.
Adequate: Some concern that variables that were likely to confound or modify results were uncontrolled or inconsistent across groups but are expected to have a minimal impact on the results.
Deficient: Notable concern that potentially confounding variables were uncontrolled or inconsistent across groups and are expected to substantially impact the results.
Critically deficient: Confounding variables were presumed to be uncontrolled or inconsistent across groups and are expected to be a primary driver of the results.

Domain: Attrition
Core question: Did the study report results for all tested animals?
Prompting questions: For each study:
• Are all animals accounted for in the results?
• If there is attrition, do authors provide an explanation (e.g., death or unscheduled sacrifice during the study)?
• If unexplained attrition of animals for outcome assessment is identified, what is the expected impact on the interpretation of the results?
General considerations: These considerations typically do not need to be refined by assessment teams. A judgment and rationale for this domain should be given for each cohort or experiment in the study.
Good: Results were reported for all animals. If animal attrition is identified, the authors provide an explanation, and the attrition is not expected to impact the interpretation of the results.
Adequate: Results are reported for most animals. Attrition is not explained but this is not expected to significantly impact the interpretation of the results.
Deficient: Moderate to high level of animal attrition that is not explained and may significantly impact the interpretation of the results.
Critically deficient: Extensive animal attrition that prevents comparisons of results across treatment groups.

Domain: Chemical administration and characterization
Core question: Did the study adequately characterize exposure to the chemical of interest and the exposure administration methods?
Note: Consideration of the appropriateness of the route of exposure (not the administration method) is not a risk of bias consideration. Relevance and utility of the routes of exposure are considered in the PECO criteria for study inclusion and during evidence synthesis. Relatedly, consideration of exposure level selection (e.g., were levels sufficiently high to elicit effects) is addressed during evidence synthesis and is not a risk of bias consideration.
Prompting questions: For each study:
• Are there concerns [specific to this chemical] regarding the source and purity and/or composition (e.g., identity and percent distribution of different isomers) of the chemical?
• Was independent analytical verification of the test article (e.g., composition, homogeneity, and purity) performed?
• Were nominal exposure levels verified analytically?
• Are there concerns about the methods used to administer the chemical (e.g., inhalation chamber type, gavage volume)?
General considerations: It is essential that these considerations are considered, and potentially refined, by assessment teams, as the specific variables of concern can vary by chemical (e.g., stability may be an issue for one chemical but not another). A judgment and rationale for this domain should be given for each cohort or experiment in the study.
Good: Chemical administration and characterization is complete (i.e., source and purity are provided or can be obtained from the supplier and test article is analytically verified). There are no notable concerns about the composition, stability, or purity of the administered chemical, or the specific methods of administration. Exposure levels are verified using reliable analytical methods.
Adequate: Some uncertainties in the chemical administration and characterization are identified but these are expected to have minimal impact on interpretation of the results (e.g., purity of the test article is suboptimal but interpreted as unlikely to have a significant impact; analytical verification of exposure levels is not reported or verified with nonpreferred methods).
Deficient: Uncertainties in the exposure characterization are identified and expected to substantially impact the results (e.g., source of the test article is not reported, and composition is not independently verified; impurities are substantial or concerning; administration methods are considered likely to introduce confounders, such as use of static inhalation chambers or a gavage volume considered too large for the species or lifestage at exposure).
Critically deficient: Uncertainties in the exposure characterization are identified and there is reasonable certainty that the study results are largely attributable to factors other than exposure to the chemical of interest (e.g., identified impurities are expected to be a primary driver of the results).
Endpoint measurement Are the selected procedures, protocols and animal models adequately described and appropriate for the For each endpoint/outcome or grouping of endpoints/outcomes in a study: Are the evaluation methods and animal model adequately described and appropriate? Considerations for this domain are highly variable depending on the endpoint(s)/outcome(s) of interest and typically must be refined by assessment teams. A judgment and rationale for this domain should be given for each endpoint/outcome or group of endpoints/outcomes investigated in the study. Some considerations include the following: This document is a draft for review purposes only and does not constitute Agency policy. 6-25 DRAFT-DO NOT CITE OR QUOTE ------- Protocol for the Ethylbenzene IRIS Assessment Domain and core question Prompting questions General considerations endpoint(s)/outcome(s) of interest? Notes: Considerations related to the sensitivity of the animal model and timing of endpoint measurement are evaluated under Sensitivity Considerations related to adjustments/corrections to endpoint measurements (e.g., organ weight corrected for body weight) are addressed under results presentation. Are there concerns regarding the methodology selected for endpoint evaluation? Are there concerns about the specificity of the experimental design? Are there serious concerns regarding the sample size or how endpoints were sampled? Are appropriate control groups for the study/assay type included? Good: • Adequate description of methods and animal models. • Use of generally accepted and reliable endpoint methods. • Sample sizes are generally considered adequate for the assay or protocol of interest and there are no notable concerns about sampling in the context of the endpoint protocol (e.g., sampling procedures for histological analysis). 
• Includes appropriate control groups and any use of nonconcurrent or historical control data (e.g., for evaluation of rare tumors) is justified (e.g., authors or evaluators considered the similarity between current experimental animals and laboratory conditions to historical controls).
Ratings of Adequate, Deficient, and Critically Deficient are generally defined as follows:
Adequate: Issues are identified that may affect endpoint measurement but are considered unlikely to substantially impact the overall findings or the ability to reliably interpret those findings.
Deficient: Concerns are raised that are expected to notably affect endpoint measurement and reduce the reliability of the study findings.
Critically deficient: Severe concerns are raised about endpoint measurement and any findings are likely to be largely explained by these limitations.
The following specific examples of relevant concerns are typically associated with a Deficient rating, but Adequate or Critically Deficient might be applied depending on the expected impact of limitations on the reliability and interpretation of the results:
• Study report lacks important details that are necessary to evaluate the appropriateness of the study design (e.g., description of the assays or protocols; information on the strain, sex, or lifestage of the animals).
• Selection of protocols that are nonpreferred or lack specificity for investigating the endpoint of interest. This includes omission of additional experimental criteria (e.g., inclusion of a positive control or dosing up to levels causing minimal toxicity) when required by specific testing guidelines/protocols.*
• Overt toxicity (e.g., mortality, extreme weight loss) is observed or expected based on findings from similarly designed studies and may mask interpretation of outcome(s) of interest.
• Sample sizes are smaller than is generally considered adequate for the assay or protocol of interest. Inadequate sampling can also be raised within the context of the endpoint protocol (e.g., in a pathology study, bias that is introduced by only sampling a single tissue depth or an inadequate number of slides per animal).**
• Control groups are not included, considered inappropriate, or comparisons to nonconcurrent or historical controls are not adequately justified.
*These limitations typically also raise a concern for insensitivity.
**Sample size alone is not a reason to conclude an individual study is critically deficient.
Results presentation
Are the results presented and compared in a way that is appropriate and transparent?
For each endpoint/outcome or grouping of endpoints/outcomes in a study: Does the level of detail allow for an informed interpretation of the results? Are the data compared, or presented, in a way that is inappropriate or misleading?
Considerations for this domain are highly variable depending on the outcomes of interest and typically must be refined by assessment teams. A judgment and rationale for this domain should be given for each endpoint/outcome or group of endpoints/outcomes investigated in the study. Some considerations include the following:
Good:
• No concerns with how the data are presented.
• Results are quantified or otherwise presented in a manner that allows for an independent consideration of the data (assessments do not rely on author interpretations).
• No concerns with completeness of the results reporting.*
Ratings of Adequate, Deficient, and Critically Deficient are generally defined as follows:
Adequate: Concerns are identified that may affect results presentation but are considered unlikely to substantially impact the overall findings or the ability to reliably interpret those findings.
Deficient: Concerns with results presentation are identified and expected to substantially impact results interpretation and reduce the reliability of the study findings.
Critically deficient: Severe concerns about results presentation were identified and study findings are likely to be largely explained by these limitations.
The following specific examples of relevant concerns are typically associated with a Deficient rating, but Adequate or Critically Deficient might be applied depending on the expected impact of limitations on the reliability and interpretation of the results:
• Nonpreferred presentation of data (e.g., developmental toxicity data averaged across pups in a treatment group, when litter responses are more appropriate; presentation of only absolute organ weight data when relative weights are more appropriate).
• Pooling data when responses are known or expected to differ substantially (e.g., across sexes or ages).
• Incomplete presentation of the data* (e.g., presentation of mean without variance data; concurrent control data are not presented; dichotomizing or truncating continuous data).
*Failure to describe any findings for assessed outcomes (i.e., report lacks any qualitative or quantitative description of the results in tables, figures, or text) is addressed under Selective Reporting.
Selective reporting
Did the study report results for all prespecified outcomes?
Note: This domain does not consider the appropriateness of the analysis/results presentation. This aspect of study quality is evaluated in another domain.
For each study: Are results presented for all endpoints/outcomes described in the methods (see note)? If unexplained results omissions are identified, what is the expected impact on the interpretation of the results?
These considerations typically do not need to be refined by assessment teams. A judgment and rationale for this domain should be given for each cohort or experiment in the study.
Good: Quantitative or qualitative results were reported for all prespecified outcomes (explicitly stated or inferred), exposure groups and evaluation time points. Data not reported in the primary article is available from supplemental material. If results omissions are identified, the authors provide an explanation, and these are not expected to impact the interpretation of the results.
Adequate: Quantitative or qualitative results are reported for most prespecified outcomes (explicitly stated or inferred) and evaluation time points. Omissions are not explained but are not expected to significantly impact the interpretation of the results.
Deficient: Quantitative or qualitative results are missing for many prespecified outcomes (explicitly stated or inferred), omissions are not explained and may significantly impact the interpretation of the results.
Critically deficient: Extensive results omission is identified and prevents comparisons of results across treatment groups.
Sensitivity
Are there concerns that sensitivity in the study is not adequate to detect an effect?
Note: Consideration of exposure level selection (e.g., were levels sufficiently high to elicit effects) is addressed during evidence synthesis and is not a study sensitivity consideration.
Was the exposure period, timing (e.g., lifestage), frequency, and duration sensitive for the outcome(s) of interest? Given knowledge of the health hazard of concern, did the selection of species, strain, and/or sex of the animal model reduce study sensitivity? Are there concerns regarding the timing (e.g., lifestage) of the outcome evaluation? Are there aspects related to risk of bias domains that raise concerns about insensitivity (e.g., selection of protocols that are known to be insensitive or nonspecific for the outcome(s) of interest)?
These considerations may require customization to the specific exposure and outcomes. Some study design features that affect study sensitivity may have already been included in the other evaluation domains; these should be noted in this domain, along with any features that have not been addressed elsewhere. Some considerations include:
Good
• The experimental design (considering exposure period, timing, frequency, and duration) is appropriate and sensitive for evaluating the outcome(s) of interest.
• The selected animal model (considering species, strain, sex, and/or lifestage) is known or assumed to be appropriate and sensitive for evaluating the outcome(s) of interest.
• No significant concerns with the ability of the experimental design to detect the specific outcome(s) of interest (e.g., outcomes evaluated at the appropriate lifestage; study designed to address known endpoint variability that is unrelated to treatment, such as estrous cyclicity or time of day).
• Timing of endpoint measurement in relation to the chemical exposure is appropriate and sensitive (e.g., behavioral testing is not performed during a transient period of test chemical-induced depressant or irritant effects; endpoint testing does not occur only after a prolonged period, such as weeks or months, of nonexposure).
• Potential sources of bias toward the null are not a substantial concern.
Adequate
Same considerations as Good, except:
• The duration and frequency of the exposure were appropriate, and the exposure covered most of the critical window (if known) for the outcome(s) of interest.
• Potential issues are identified that could reduce sensitivity, but they are unlikely to impact the overall findings of the study.
Deficient
• Concerns were raised about the considerations described for Good or Adequate that are expected to notably decrease the sensitivity of the study to detect a response in the exposed group(s).
Critically deficient
• Severe concerns were raised about the sensitivity of the study and experimental design such that any observed associations are likely to be explained by bias. The rationale should indicate the specific concern(s).
Overall confidence
Considering the identified strengths and limitations, what is the overall confidence rating for the endpoint(s)/outcome(s) of interest?
For each endpoint/outcome or grouping of endpoints/outcomes in a study: Were concerns (i.e., limitations or uncertainties) related to the risk of bias or sensitivity identified? If yes, what is their expected impact on the overall interpretation of the reliability and validity of the study results, including (when possible) interpretations of impacts on the magnitude or direction of the reported effects?
The overall confidence rating considers the likely impact of the noted concerns (i.e., limitations or uncertainties) in reporting, bias and sensitivity on the results. Reviewers should mark studies that are rated lower than high confidence only due to low sensitivity (i.e., bias toward the null) for additional consideration during evidence synthesis. If the study is otherwise well conducted and an effect is observed, it may increase the certainty of evidence judgment. A confidence rating and rationale should be given for each endpoint/outcome or group of endpoints/outcomes investigated in the study. Confidence ratings are described above (see Section 6.1.1).

6.5. IN VITRO AND OTHER MECHANISTIC STUDY EVALUATION

As described in Section 4.4, the initial literature screening identifies sets of other potentially informative studies, including mechanistic studies, as "potentially relevant supplemental information." Mechanistic information includes any experimental measurement related to a health outcome that informs the biological or chemical events associated with phenotypic effects. These measurements can improve understanding of the mechanisms involved in the biological effects following exposure to a chemical but are not generally considered by themselves adverse outcomes. Mechanistic data are reported in a diverse array of observational and experimental studies across species, model systems, and exposure paradigms, including in vitro, in vivo (by various routes of exposure), ex vivo, and in silico studies. Individual study-level evaluations of mechanistic endpoints are not typically pursued.
To undergo a full reporting quality, risk of bias, and sensitivity evaluation of every identified study that may report mechanistic information before the relevant toxicity pathways have been identified or the needs of the assessment are better understood would not be an effective use of time. However, for some chemical assessments, it may be necessary to identify assay-specific considerations for study endpoint evaluations, on a case-by-case basis, to provide a more detailed summary and evaluation for the most relevant individual studies. This may be done, for example, when the scientific understanding of a critical mechanistic event or MOA is less established or lacks scientific consensus, when the reported findings on a mechanistic endpoint are conflicting, when the available mechanistic evidence addresses a complex and influential aspect of the assessment, or when in vitro or in silico data make up the bulk of the evidence base and there is little or no evidence from epidemiological studies or animal bioassays. If a subset of individual mechanistic studies is identified for evaluation, the study evaluation considerations will differ depending on the type of endpoints, study designs, and model systems or populations evaluated. Note that because the evaluation process is outcome specific, overall confidence classifications for human or animal studies that have already been determined will not automatically apply to mechanistic endpoints if reported in the same study; instead, a separate evaluation of the mechanistic endpoints should be performed because the utility of a study may vary for the different outcomes reported. Developing specific considerations requires a familiarity with the studies to be evaluated and cannot be conducted in the absence of knowledge of the relevant study designs, measurements, and analytic issues. 
Knowledge of issues related to the hazards and the outcomes identified in the revised evaluation plan is also important for developing specific evaluation considerations. One challenge is that novel methodologies for studying mechanistic evidence are continually being developed and implemented and often no "standard practices" exist.

The evaluation of mechanistic studies applies similar principles as those described above for the evaluation of experimental animal studies. Table 6-6 provides the standard domains and core questions for evaluating studies conducted in in vitro test systems, along with some basic considerations for guiding the evaluation. The evaluation process focuses on assessing aspects of the study design and conduct through three broad types of evaluations: reporting quality, risk of bias, and study sensitivity. Some domain considerations are tailored to the chemical, as well as the assay(s) and/or endpoint(s) being evaluated. Assessment teams work with subject-matter experts to develop specific considerations. These specific considerations are determined before performing the study evaluation, although they may be refined as the study evaluation proceeds (e.g., during pilot testing). Assessment-specific and/or assay-specific considerations are documented and made publicly available in the assessment.

Table 6-6. Domains, questions, and general considerations to guide the evaluation of in vitro studies
Domain and core question Prompting questions General considerations
Observational bias/blinding
Did the study implement measures, where possible, to reduce observational bias?
Considerations will vary depending on the specific assay/model system being used and may not be applicable to some analyses. For each assay or endpoint in a study: Did the study report steps taken to minimize observational bias during analysis (e.g., blinding/coding of slides or plates for analysis; collection of data from randomly selected fields; positive controls that are not immediately identifiable)? If not, did the study use a design or approach for which such procedures can be inferred, or which would not be possible to implement? Were the assays evaluated using automated approaches (e.g., microplate readers) that reduce concern for observational bias? What is the expected impact of failure to implement (or report implementation) of these methods/procedures on results? These considerations typically do not need to be refined by the assessment teams. Prior to performing evaluations, teams should consider the specific assay to identify highly subjective measures of endpoints where observational bias may strongly influence results. A judgment and rationale for this domain should be given for each assay or endpoint or group of endpoints investigated in the study. Good: Measures to reduce observational bias were described (e.g., specific mention of blinding and/or coding of slides for analysis), or observational bias is not a concern because of use of automated/computer driven systems and/or standard laboratory kits. Not reported, interpreted as adequate: Measures to reduce observational bias were not described, but the potential concern for bias was mitigated because protocol cited includes a description of requirements for blinding/coding, or the impact on results is expected to be minor because the specific measurement is more objective. Not reported, interpreted as deficient: No protocol cited; the potential impact on the results is major because the endpoint measures are highly subjective (e.g., counting plaques or live vs. dead cells). 
Critically deficient: Strong evidence for observational bias that could have impacted the results.
Variable control
Are all introduced variables with the potential to affect the results of interest controlled for and consistent across experimental groups?
For each study: Are there any known or presumed differences across treatment groups (e.g., coexposures, culture conditions, cell passages, variations in reagent production lots, mycoplasma infections) that could bias the results? If differences are identified, to what extent are they expected to impact the results? Did the study address features inherent to the physicochemical properties of the test substance(s) that have the potential to bias the results away from the null? For example, could the test article interfere with a given assay (e.g., auto-fluoresces or inhibits enzymatic processes necessary for assay signals), potentially leading to an erroneous positive signal? (Note that concerns related to dose are addressed in chemical administration and characterization.) Are there known variations in cellular signaling unique to the model system that could influence the possibility of detecting the effect(s) of interest? Are there concerns regarding the negative (untreated and/or vehicle) controls used? Were negative controls run concurrently?
These considerations will need to be refined by assessment teams as the specific variables of concern can vary by the experimental test system and chemical. A judgment and rationale for this domain should be given for each experiment in the study, noting when the potential to affect results is restricted to specific assays or endpoints.
Good: Outside of the exposure of interest, variables or features of the test system and/or chemical properties that are likely to impact results appear to be controlled for and consistent across experimental groups.
Adequate: Some concern that variables or features of the test system and/or chemical properties that are likely to modify or interfere with results were uncontrolled or inconsistent across groups but are expected to have a minimal impact on the results.
Deficient: Notable concern that important study variables and/or features of the test system lacked specificity or were uncontrolled or inconsistent across groups and are expected to substantially impact the results.
Critically deficient: Features of the test system are known to be nonspecific for this endpoint, and/or influential study variables were presumed to be uncontrolled or inconsistent across groups and are expected to be a primary driver of the results.
Selective reporting
Did the study present results, quantitatively or qualitatively, for all prespecified assays or endpoints and replicates described in the methods?
Note: The appropriateness of the analysis or results presentation is considered under results presentation.
For each study: Are results presented for all endpoints/outcomes described in the methods? Did the study clearly indicate the number of replicate experiments performed? Were the replicates technical (from the same sample) or independent (from separate, distinct exposures)? If unexplained results omissions are identified, what is the expected impact on the interpretation of the results?
These considerations typically do not need to be refined by assessment teams. A judgment and rationale for this domain should be given for each assay or endpoint in the study.
Good: Quantitative or qualitative results were reported for all prespecified assays or endpoints (explicitly stated or inferred), exposure groups and evaluation timepoints. Data not reported in the primary article is available from supplemental material.
If results omissions are identified, the authors provide an explanation, and these are not expected to impact the interpretation of the results.
Adequate: Quantitative or qualitative results are reported for most prespecified assays or endpoints (explicitly stated or inferred), exposure groups and evaluation timepoints. Omissions are not explained but are not expected to significantly impact the interpretation of the results.
Deficient: Quantitative or qualitative results are missing for many prespecified assays or endpoints (explicitly stated or inferred), exposure groups and evaluation timepoints; omissions are not explained and may significantly impact the interpretation of the results.
Critically deficient: Extensive results omissions are identified, preventing comparisons of results across treatment groups.
Chemical administration and characterization
Did the study adequately characterize exposure to the chemical of interest and the exposure administration methods?
For each study: Are there concerns regarding the purity and/or composition (e.g., identity and percent distribution of different isomers) of the test material/chemical? If so, can the purity and/or composition be obtained from the supplier (e.g., as reported on the website)? Was independent analytical verification of the test article purity and composition performed? If not, is this a significant concern for this substance?
Are there concerns about the stability of the test chemical in the vehicle and/or culture media (e.g., pH, solubility, volatility, adhesion to plastics) that were not corrected for, leading to potential bias away from the null (e.g., observed precipitate formation at high concentrations) or toward the null (e.g., enclosed chambers not used for testing volatile chemicals)? Are there concerns about the preparation or storage conditions of the test substance? Are there concerns about the methods used to administer the chemical? It is essential that these criteria are considered, and potentially refined, by assessment teams, as the specific variables of concern can vary by chemical (e.g., stability may be an issue for one chemical but not another). A judgment and rationale for this domain should be given for each experiment in the study. Good: Chemical administration and characterization is complete (i.e., source, purity, and analytical verification of the test article are provided). There are no concerns about the composition, stability, or purity of the administered chemical, or the specific methods of administration. Adequate: Some uncertainties in the chemical administration and characterization are identified but these are expected to have minimal impact on interpretation of the results (e.g., source and vendor-reported purity are presented but not independently verified; purity of the test article is suboptimal but not concerning). Deficient: Uncertainties in the exposure characterization are identified and expected to substantially impact the results (e.g., the source and purity of the test article are not reported and no independent verification of the test article was conducted; levels of impurities are substantial or concerning; deficient administration methods were used). 
Critically deficient: Uncertainties in the exposure characterization are identified and there is reasonable certainty that the results are largely attributable to factors other than exposure to the chemical of interest (e.g., identified impurities are expected to be a primary driver of the results).
Endpoint measurement
Are the selected protocols, procedures, and test systems adequately described and appropriate for evaluating the endpoint(s) of interest?
Notes: Considerations related to adjustments or corrections to endpoint measurements are addressed under results presentation. Considerations related to the sensitivity of the animal model and timing of endpoint measurement are evaluated under sensitivity.
For each endpoint or grouping of endpoints in a study: Are the evaluation methods and test systems adequately described and appropriate? Are there concerns regarding the methodology selected (e.g., accepted guidelines, established criteria) for endpoint evaluation? Are there concerns about the specificity of the experimental design? Did the study address features inherent to the test system or experiment that have the potential to lead to bias away from the null? Are there serious concerns about the number of replicates or sample size in the study? Are appropriate control groups for the study/assay type included? Was there a need for the assay to include specific controls to reduce potential sources of underlying bias? Did the test compound induce cytotoxicity (known, or expected based on other studies of similar design) to a degree that is expected to affect interpretation of results?
Considerations for this domain are highly variable depending on the assay or endpoint(s) of interest and must be refined by assessment teams. A judgment and rationale for this domain should be given for each assay or endpoint or group of endpoints investigated in the study. Some considerations include the following:
Good:
• Adequate description of methods and test system.
• Use of generally accepted and reliable endpoint methods that are consistent with accepted guidelines or established criteria for the assay(s)/endpoint(s) of interest.
• Sample sizes are generally considered adequate for the assay or protocol of interest and there are no notable concerns about sampling in the context of the endpoint protocol.
• Includes appropriate control groups (e.g., use of loading controls) and any use of nonconcurrent or historical control data (e.g., for comparison to background levels in negative controls) is justified (e.g., authors or evaluators considered the similarity between current cell cultures and laboratory conditions to historical controls).
Ratings of Adequate, Deficient, and Critically Deficient are generally defined as follows:
Adequate: Issues are identified that may affect endpoint measurement but are considered unlikely to substantially impact the overall findings or the ability to reliably interpret those findings.
Deficient: Concerns are raised that are expected to notably affect endpoint measurement and reduce the reliability of the study findings.
Critically deficient: Severe concerns are raised about endpoint measurement and any findings are likely to be largely explained by these limitations.
The following specific examples of relevant concerns are typically associated with a Deficient rating, but Adequate or Critically Deficient might be applied depending on the expected impact of limitations on the reliability and interpretation of the results:
• Study report lacks important details that are necessary to evaluate the appropriateness of the study design (e.g., description of the assays or protocols; information on the cell line, passage number).
• Selection of protocols that are nonpreferred or lack specificity for investigating the endpoint of interest. This includes omission of additional experimental criteria (e.g., inclusion of a positive control or dosing up to levels causing minimal toxicity) when required by specific testing guidelines/protocols.*
• Cytotoxicity is observed or expected based on findings from similarly designed studies and may mask interpretation of outcome(s) of interest.
• Sample sizes are smaller than is generally considered adequate for the assay or protocol of interest. Inadequate sampling can also be raised within the context of the endpoint protocol (e.g., in a pathology study, bias that is introduced by only sampling a single tissue depth or an inadequate number of slides per animal).**
• Controls are not included or considered inappropriate.
*These limitations typically also raise a concern for insensitivity.
**Sample size alone is not a reason to conclude an individual study is critically deficient.
Results presentation
Are the results presented and compared in a way that is appropriate and transparent and makes the data usable?
For each assay/endpoint or grouping of endpoints in a study: Does the level of detail allow for an informed interpretation of the results? If applicable, was the assay signal normalized to account for nonbiological differences across replicates and exposure groups? Are the data compared or presented in a way that is inappropriate or misleading (e.g., presenting western blot images without including numerical values for densitometry analysis, or vice versa)?
Flag potentially inappropriate statistical comparisons for further review.
Considerations for this domain are highly variable depending on the endpoints of interest and must be refined by assessment teams. A judgment and rationale for this domain should be given for each assay or endpoint or group of endpoints investigated in the study. Some considerations include the following:
Good:
• No concerns with how the data are presented.
• Results are quantified or otherwise presented in a manner that allows for an independent consideration of the data (assessments do not rely on author interpretations).
• No concerns with completeness of the results reporting.*
Ratings of Adequate, Deficient, and Critically Deficient are generally defined as follows:
Adequate: Concerns are identified that may affect results presentation but are considered unlikely to substantially impact the overall findings or the ability to reliably interpret those findings.
Deficient: Concerns with results presentation are identified and expected to substantially impact results interpretation and reduce the reliability of the study findings.
Critically deficient: Severe concerns about results presentation were identified and study findings are likely to be largely explained by these limitations.
The following specific examples of relevant concerns are typically associated with a Deficient rating, but Adequate or Critically Deficient might be applied depending on the expected impact of limitations on the reliability and interpretation of the results:
• Nonpreferred presentation of data (e.g., averaging technical replicates rather than independent replicates).
• Failure to present quantitative results.
• Pooling data when responses are known or expected to differ substantially (e.g., across cell types or passage number).
• Incomplete presentation of the data* (e.g., presentation of mean without variance data; concurrent control data are not presented; failure to report or address overt cytotoxicity).
*Failure to describe any findings for assessed outcomes (i.e., report lacks any qualitative or quantitative description of the results in tables, figures, or text) will result in a critically deficient rating for the outcome(s) of interest for Results Presentation; overall completeness of reporting at the study level is addressed under Selective Reporting.

Sensitivity
Core question: Are there concerns that sensitivity in the study is not adequate to detect an effect?
Prompting questions: Was the exposure period, timing (i.e., cell passage number, insufficient culture maturity for the adequate expression of mature cell markers; insufficient treatment and/or measurement duration for the production of protein above the level of detection), frequency, and duration of exposure sensitive for the assay/model system of interest, particularly in the absence of a positive control?
Assay-specific considerations regarding sensitivity, specificity, and validity of the selection of the test methods will be described here (e.g., metabolic competency, antibody specificity) (some of these external considerations may have been applied during prioritization of studies for evaluation). Are there aspects related to risk of bias domains that raise concerns about insensitivity (e.g., selection of protocols or methods that are known to be insensitive or nonspecific for the outcome(s) of interest)? Are there concerns regarding the need for positive controls (e.g., concerns that the effects of interest may be inhibited or otherwise poorly manifest in the test system, for example due to differences from in vivo biology)? If used, was the selected positive test substance (and dose) reasonable and appropriate and was the intended positive response induced?
Considerations for this domain are highly variable depending on the specific assay/model system used or endpoint(s) of interest and must be refined by assessment teams. Some study design features that affect study sensitivity may have already been included in the other evaluation domains; these should be noted in this domain, along with any features that have not been addressed elsewhere. Some considerations include:
Good
• The experimental design (considering exposure period, timing, frequency, and duration) is appropriate and sensitive for evaluating the outcome(s) of interest.
• The selected test system is appropriate and sensitive for evaluating the outcome(s) of interest (e.g., cell line/cell type is appropriate and routinely used for the selected assay).
• No significant concerns with the ability of the experimental design to detect the specific outcome(s) of interest (e.g., study designed to address known endpoint variability that is unrelated to treatment, such as doubling time or confluency).
• Timing of endpoint measurement in relation to the chemical exposure is appropriate and sensitive (e.g., cultures adequately express mature cell markers).
• Potential sources of bias toward the null are not a substantial concern.
Adequate
• Potential issues are identified related to the considerations described for Good that could reduce sensitivity, but they are unlikely to impact the overall findings of the study.
Deficient
• Concerns were raised about the considerations described for Good that are expected to notably decrease the sensitivity of the study to detect a response in the exposed group(s).
Critically deficient
• Severe concerns were raised about the sensitivity of the study and experimental design such that any observed associations are likely to be explained by bias. The rationale should indicate the specific concern(s).

Overall confidence
Core question: Considering the identified strengths and limitations, what is the overall confidence rating for the assay(s) or endpoint(s) of interest?
Note: Reviewers should mark studies for additional consideration during evidence synthesis if, due to low sensitivity only (i.e., bias toward the null), these studies are rated as lower than high confidence. If the study is otherwise well conducted and an effect is observed, the confidence may be increased.
Prompting questions: For each assay or endpoint or grouping of endpoints in a study:
• Were concerns (i.e., limitations or uncertainties) related to the risk of bias or sensitivity identified?
• If yes, what is their expected impact on the overall interpretation of the reliability and validity of the study results, including (when possible) interpretations of impacts on the magnitude or direction of the reported effects?
The overall confidence rating considers the likely impact of the noted concerns (i.e., limitations or uncertainties) in reporting, bias, and sensitivity on the results. A confidence rating and rationale should be given for each assay or endpoint or group of endpoints investigated in the study. Confidence rating definitions are described above (see Section 4.1).

6.6. PHYSIOLOGICALLY BASED PHARMACOKINETIC (PBPK) MODEL DESCRIPTIVE SUMMARY AND EVALUATION
PBPK (or classical pharmacokinetic [PK]) models should be used in an assessment when a validated and applicable one exists and no equal or better alternative for dosimetric extrapolation is available. Any models used should represent current scientific knowledge and accurately translate the science into computational code in a reproducible, transparent manner. For a specific target organ/tissue, it may be possible to employ or adapt an existing PBPK model or develop a new PBPK model or an alternate quantitative approach. Data for PBPK models may come from studies across various species and may be in vitro or in vivo in design. Specific details for this evaluation are provided below and in the Umbrella Quality Assurance Project Plan (QAPP) for dosimetry and mechanism-based models (U.S. EPA, 2020b) and Umbrella QAPP for PBPK models (U.S. EPA, 2018b).
As interspecies differences in ethylbenzene pharmacokinetics have been noted, a major strength of a PBPK model is its capacity to account for physiological, biochemical, and metabolic determinants when extrapolating findings from higher dose animal studies to lower levels of human exposure.
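To see why extrapolating from higher doses to lower exposure levels is not simply proportional when a metabolic process can saturate, consider the minimal sketch below. It uses a generic Michaelis-Menten rate expression as an illustration only; the function name and the Vmax and Km values are hypothetical placeholders and are not ethylbenzene parameters taken from any of the cited models.

```python
def metabolic_rate(conc, vmax, km):
    """Michaelis-Menten rate of metabolism at substrate concentration `conc`.

    vmax: maximum metabolic rate (hypothetical units, e.g., mg/hr)
    km:   concentration at which the rate is half of vmax (same units as conc)
    """
    if conc < 0 or vmax < 0 or km <= 0:
        raise ValueError("conc and vmax must be non-negative; km must be positive")
    return vmax * conc / (km + conc)


# Hypothetical parameter values chosen only to illustrate saturation.
VMAX = 10.0
KM = 1.0

low = metabolic_rate(0.1, VMAX, KM)     # well below Km: nearly linear regime
high = metabolic_rate(100.0, VMAX, KM)  # well above Km: rate approaches VMAX

# A 1000-fold increase in concentration produces a much smaller
# fold increase in the rate of metabolism.
fold_increase = high / low
```

With these placeholder values, the rate at the higher concentration stays below Vmax, so the amount metabolized (and therefore the internal dose of parent compound or metabolite) does not scale linearly with the external concentration.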
Note that nonlinear ethylbenzene metabolism has been observed, suggesting high-dose saturation of metabolic processes (Sweeney et al., 2015; Nong et al., 2007). Hence, the internal dose responsible for observed toxicity is a nonlinear function of the exposure level. Therefore, the PBPK model(s) selected for assessing ethylbenzene toxicity should account for this dose saturation as well as reflect the current state of knowledge of toxicological mechanisms or MOA for specific toxicological endpoints when estimating relevant dose metrics (U.S. EPA, 2018b).
Over a dozen scientific publications or reports describing the development or application of PBPK models since 2000 have been identified and will be evaluated for quality and potential use in the assessment. This evaluation will be conducted according to EPA's Umbrella QAPP for Dosimetry and Mechanism-Based Models (U.S. EPA, 2020b) and Umbrella QAPP for PBPK models (U.S. EPA, 2018b). It may be that none of the existing PBPK models adequately fulfills all of the assessment applications. In this case, a hybrid model that merges elements from the existing models could be created to achieve this objective, if needed and feasible under the time constraints for the assessment.

6.6.1. Pharmacokinetic (PK)/Physiologically Based Pharmacokinetic (PBPK) Model Descriptive Summary
PBPK modeling is the preferred approach for calculating a human equivalent concentration (HEC) according to the hierarchy of approaches outlined in EPA guidelines (U.S. EPA, 2020a, 2002). Following literature searches, a stepwise approach is taken that includes conducting an initial scoping of the supplemental material studies categorized as PK/PBPK models. Then, an in-depth full model evaluation is implemented to identify PBPK models that are potentially suitable for deriving toxicity values for the ethylbenzene assessment.
The initial scoping process is distinct from the full model evaluation. The scoping process provides a rapid assessment and communication of the availability, structure, and potential uses of PBPK/PK models, but is not a full evaluation. Full model evaluation (the complete and thorough assessment of the quality and utility of a particular model) is conducted if the initial scoping identifies one or more models that are available and considered appropriate for one or more applications in the assessment. The model evaluation is then conducted for the selected application(s). For example, as shown in Table 6-7, key information from PBPK models identified during the scoping process is summarized in tabular format for further in-depth model evaluation following the evaluation approaches summarized in Section 6.6.2.

Table 6-7. Example descriptive summary for a physiologically based pharmacokinetic (PBPK) model study
Study detail | Description/notes
Author | Smith et al. (2003)
Contact email | xxxxx@email.com
Contact phone | xxx-xxx-xxxx
Sponsor | N/A
Model summary
Species | Rat
Strain | F433
Sex | Male and female
Life stage | Adult
Exposure routes | Inhalation; Oral; I.V.; Skin
Tissue dosimetry | Blood; Liver; Kidney; Urine; Lung
Model evaluation
Language | ACSL 11.8
Code available | YES
Effort to recreate model | COMPLETE
Code received | YES
Effort to migrate to open software | SIGNIFICANT
Structure evaluated | YES
Math evaluated | YES
Code evaluated | YES. Issue (minor): Incorrect units listed in comments for liver metabolism (line 233). Issue (major): Mass balance error in stomach compartment.
Available PK data | Urine (cumulative amount excreted) and blood (concentration) time course data for oral (gavage) and inhalation (6 hr/d for 4 d) exposure. In vitro skin permeation.
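As an illustration only, the descriptive summary fields in Table 6-7 could be held in a small structured record so that scoping results for several models can be compared consistently. The class and field names below are hypothetical and are not part of any EPA tool; the values are copied from the example entries in the table, and only a subset of the table fields is shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodeIssue:
    severity: str      # "minor" or "major", following the wording in Table 6-7
    description: str


@dataclass
class PBPKModelSummary:
    author: str
    species: str
    strain: str
    sex: str
    life_stage: str
    exposure_routes: List[str]
    tissue_dosimetry: List[str]
    language: str
    code_available: bool
    issues: List[CodeIssue] = field(default_factory=list)

    def major_issues(self) -> List[CodeIssue]:
        """Return only the issues flagged as major during code evaluation."""
        return [issue for issue in self.issues if issue.severity == "major"]


# Values taken from the example entries in Table 6-7.
example = PBPKModelSummary(
    author="Smith et al. (2003)",
    species="Rat",
    strain="F433",
    sex="Male and female",
    life_stage="Adult",
    exposure_routes=["Inhalation", "Oral", "I.V.", "Skin"],
    tissue_dosimetry=["Blood", "Liver", "Kidney", "Urine", "Lung"],
    language="ACSL 11.8",
    code_available=True,
    issues=[
        CodeIssue("minor", "Incorrect units listed in comments for liver metabolism (line 233)."),
        CodeIssue("major", "Mass balance error in stomach compartment."),
    ],
)
```

Calling `example.major_issues()` returns the single mass balance entry, which is the kind of finding that would be carried forward into the full model evaluation.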
6.6.2. Pharmacokinetic (PK)/Physiologically Based Pharmacokinetic (PBPK) Model Evaluation
Once available PBPK models are summarized, the assessment team undertakes model evaluation in accordance with criteria outlined by U.S. EPA (2018b). Judgments on the suitability of a model are separated into two categories: scientific and technical (see Table 6-8). The scientific criteria focus on whether the biology, chemistry, and other information available for chemical MOA(s) are justified (i.e., preferably with citations to support use) and represented by the model structure and equations. The scientific criteria are judged based on information presented in the publication or report that describes the model and do not require evaluation of the computer code. Preliminary technical criteria include availability of the computer code and completeness of parameter listing and documentation. Studies that meet the preliminary scientific and technical criteria are then subjected to an in-depth technical evaluation, which includes a thorough review and testing of the computational code. The in-depth technical and scientific analyses focus on the accurate implementation of the conceptual model in the computational code, use of scientifically supported and biologically consistent parameters in the model, and reproducibility of model results reported in journal publications and other documents. This approach stresses (1) clarity in the documentation of model purpose, structure, and biological characterization; (2) validation of mathematical descriptions, parameter values, and computer implementation; and (3) evaluation of each plausible dose metric. The in-depth analysis is used to evaluate the potential value and cost of developing a new model or substantially revising an existing one.
PBPK models developed by EPA during the course of the assessment are peer reviewed, either as a component of the draft assessment or by publication in a journal article.

Table 6-8. Criteria for evaluating physiologically based pharmacokinetic (PBPK) models
Category: Scientific
Specific criteria:
Biological basis for the model is accurate.
• Consistent with mechanisms that substantially impact dosimetry.
• Predicts dose metric(s) expected to be relevant.
• Applicable for relevant route(s) of exposure.
Consideration of model fidelity to the biological system strengthens the scientific basis of the assessment relative to standard exposure-based extrapolation (default) approaches.
• Ability of model to describe critical behavior, such as nonlinear kinetics in a relevant dose range, better than the default (i.e., BW^3/4 scaling).
• Model parameterization for critical life stages or windows of susceptibility. Evaluation of these criteria should also consider the model's fidelity vs. default approaches and possible use of an intraspecies uncertainty factor (UFh) in conjunction with the model to account for variations in sensitivity between life stages.
• Predictive power of model-based dose metric vs. default approach, based on exposure.
  o Specifically, model-based metrics may correlate better than the applied doses with animal/human dose-response data.
  o The degree of certainty in model predictions vs. default is also a factor. For example, while target tissue metrics are generally considered better than blood concentration metrics, lack of data to validate tissue predictions when blood data are available may lead to choosing the latter.
Principle of parsimony
• Model complexity or biological scale, including number and parameterization of (sub)compartments (e.g., tissue or subcellular levels), should be commensurate with data available to identify parameters.
Model describes existing PK data reasonably well, both in "shape" (matches curvature, inflection points, peak concentration time, etc.) and quantitatively (e.g., within a factor of 2-3).
Model equations are consistent with biochemical understanding and biological plausibility.
Category: Initial technical
Specific criteria:
Well-documented model code is readily available to EPA and public.
Set of published parameters is clearly identified, including origin/derivation.
Parameters do not vary unpredictably with dose (e.g., any dose dependence in absorption constants is predictable across the dose ranges relevant for animal and human modeling).
Sensitivity and uncertainty analysis has been conducted for relevant exposure levels (local sensitivity analysis is sufficient, but global analysis provides more information).
• If a sensitivity analysis was not conducted, EPA may decide to independently conduct this additional work before using the model in the assessment.
• A sound explanation should be provided when sensitivity of the dose metric to model parameters differs from what is reasonably expected based on experience.

6.6.3. Selection of the Appropriate Dose Metric
The level of confidence in using a pharmacokinetic (PK) or PBPK model depends on its ability to provide a reliable estimation of dose metrics based on biological plausibility and MOA considerations. Thus, one needs to take into consideration mechanism(s) relevant to the endpoint(s) of interest, data availability, and uncertainties in estimating that dose metric.
Compared to liver and kidney toxicity, the appropriate dose metric for other toxicities, including lung and ototoxicity endpoints, remains less well understood. Therefore, various dose metrics (e.g., the area under the curve [AUC] for arterial blood concentration of ethylbenzene or its metabolites) will be explored to inform dose-response extrapolation of animal data to humans.

7. DATA EXTRACTION OF STUDY METHODS AND RESULTS
The process of summarizing study methods and results is referred to as data extraction. Studies that met problem formulation PECO criteria after full-text review are briefly summarized in DistillerSR HDE forms. These study summaries are exported from DistillerSR in Excel format and imported into Tableau software (https://www.tableau.com/) to create interactive literature inventory visualizations used to display the extent and nature of the available evidence (see below for decisions related to studies meeting the assessment PECO).
For experimental animal studies, which are typically studies in rodents, the following information is captured: chemical form, study type (acute [<24 hours], short term [<7 days], short term [7-27 days], subchronic [28-90 days], chronic [>90 days; see footnote 5], and developmental, which includes multigeneration studies), duration of treatment, route, species, strain, sex, dose or concentration levels tested, dose units, health system, and specific endpoints assessed. Animal studies that meet the assessment PECO undergo a subsequent phase of full data extraction in HAWC that includes detailed presentation of results (described below).
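The duration-based study type categories listed above for animal studies can be sketched as a simple classification function. This is an illustrative sketch, not the logic of the DistillerSR forms: the function name is hypothetical, and the handling of boundary values (for example, treating exactly 24 hours as short term, and assigning fractional durations between 27 and 28 days to the 7-27 day category) is an assumption, since the text gives only the ranges.

```python
def classify_study_duration(duration_days):
    """Map an exposure duration in days to a duration-based study type.

    Developmental (including multigeneration) studies are identified by
    study design rather than by duration, so they are not handled here.
    """
    if duration_days <= 0:
        raise ValueError("duration must be positive")
    if duration_days < 1:        # less than 24 hours
        return "acute"
    if duration_days < 7:        # at least 24 hours, fewer than 7 days
        return "short term (<7 days)"
    if duration_days < 28:       # 7-27 days (assumed to include 27 < d < 28)
        return "short term (7-27 days)"
    if duration_days <= 90:      # 28-90 days
        return "subchronic"
    return "chronic"             # more than 90 days


# Example: a 13-week (91-day) study falls just inside the chronic category.
category = classify_study_duration(91)
```

Keeping the thresholds in one function makes the boundary assumptions explicit and easy to revise if an assessment team defines them differently.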
For studies that meet problem formulation PECO criteria (but not the assessment PECO), the SEM (initial) literature inventory summary includes the no-observed-effect level/lowest-observed-effect level (NOEL/LOEL) based on author-reported statistical significance. Expert judgment may be used to identify NOEL/LOELs in cases where only qualitative results are reported (e.g., "no effects on liver weight were observed at any dose level") or when the findings indicate an apparent clear and strong effect of exposure (e.g., large magnitude of change) but the authors did not present a statistical comparison. When findings are not analyzed by the authors and are not readily interpretable, then NOEL/LOELs are not identified, and the extraction field entry indicates "not reported."
For human studies, the following information is summarized in DistillerSR HDE forms: chemical form, population type (e.g., general population-adult, occupational, pregnant women, infants and children), study type (e.g., cross-sectional, cohort, case-control), sex, major route of exposure (if known), description of how exposure was assessed, health system studied, specific endpoints assessed, and a quantitative summary of findings at the endpoint level (or narrative only if the finding was qualitatively presented).
Footnote 5: EPA considers chronic exposure to be more than approximately 10% of the life span in humans. For typical laboratory rodent species, this can lead to consideration of exposure durations of approximately 90 days to 2 years. However, studies 1-2 years in duration are typical of what is considered representative of chronic exposure, rather than durations just over 90 days.
In contrast to the animal studies, epidemiological studies
that met assessment PECO did not undergo additional, more detailed data extraction in HAWC because that module in HAWC was under development at the time of preparation of this protocol.
For animal studies that met the assessment PECO criteria, HAWC is used for full extraction of study methods and results. For animal studies, compared with the literature inventory forms used to describe studies that meet problem formulation PECO criteria, full data extraction in HAWC includes summarizing more details of study design (e.g., diet, chemical purity) and gathering effect size information. Instructions on how to conduct data extraction in HAWC are available at https://hawcproject.org/resources/. Over 100 distinct extraction fields are collected for each animal study and endpoint (for a list of data extraction fields, see Downloads > Animal Bioassay Data > Complete Export at the HAWC Ethylbenzene Project, https://hawc.epa.gov/assessment/100000059/). An additional resource used to implement a consistent vocabulary to summarize endpoints assessed in animal studies is available in the HAWC project "IRIS PPRTV SEM Template Figures and Resources" (see "Attachments," then select the "Environmental Health Vocabulary (EHV) - a recommended terminology for outcomes/endpoints" file). In some cases, EPA may conduct its own statistical analysis of human and animal toxicology data (assuming the data are amenable to doing so and the study is otherwise well conducted) during evidence synthesis.
Data extraction for in vivo and in vitro studies prioritized to assess key mechanistic analyses is conducted in Microsoft Word and presented in tabular format. All findings are considered for extraction, regardless of statistical significance.
The level of extraction for specific outcomes within a study could differ (i.e., narrative only if the finding was qualitative). For quality control, studies were summarized by one member of the evaluation team and independently verified by at least one other member. Discrepancies were resolved by discussion or consultation within the evaluation team.
Data extraction results are presented via figures, tables, or interactive web-based graphics in the assessment. The information is also made available for download in Excel format when the draft is publicly released. The literature inventories are presented in the HAWC Visualization module, with options to link to the native Tableau application where the underlying information is available for download. Download of full data extraction for animal studies is done directly in HAWC.
For non-English studies, online translation tools (e.g., Google Translate) or engagement with a native speaker can be used to summarize studies at the level of the literature inventory. Fee-based translation services for non-English studies are typically reserved for studies considered potentially informative for dose response, a consideration that occurs after preparation of the initial literature inventory during draft assessment development. Digital rulers, such as WebPlotDigitizer (https://automeris.io/WebPlotDigitizer/), are used to extract numerical information from figures, and their use is documented during extraction. For studies that evaluate endpoints at multiple time points (e.g., 7 days, 3 weeks, 3 months), data are generally summarized for the longest duration
in the study report, but other durations may be summarized if they provide important contextual information for hazard characterization (e.g., an effect was present at an interim time point but did not appear to persist or the magnitude of the effect diminished). A free text field is available in HAWC to describe cases when the approach for summarizing results requires explanation.
Author queries may be conducted for studies considered for dose-response to facilitate quantitative analysis (e.g., information on variability or availability of individual animal data). Outreach to study authors or designated contact persons is documented and considered unsuccessful if researchers do not respond to email or phone requests within 1 month of initial attempt(s) to contact. Only information or data that can be made publicly available (e.g., within HAWC or HERO) will be considered.
Exposures are standardized to common units when possible. For hazard characterization, exposure levels are typically presented as reported in the study and standardized to common units (e.g., ppm or mg/m³ for inhalation studies) as an initial phase in evidence synthesis and integration. For inhalation exposures to ethylbenzene, concentration in air in ppm can be converted to concentration in air in mg/m³ by multiplying ppm by 4.344 (106.2 g/mol ÷ 24.45 L/mol) at standard temperature (25°C) and pressure (1 atm).

8. EVIDENCE SYNTHESIS AND INTEGRATION
Evidence synthesis (see footnote 6) is a within-stream analysis, conducted separately for human, animal, and mechanistic evidence.
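Returning briefly to the unit standardization described at the end of Section 7, the ppm-to-mg/m³ conversion for ethylbenzene inhalation exposures can be sketched as follows. The function and constant names are illustrative assumptions; the molecular weight (106.2 g/mol) and the molar volume (24.45 L per mole of gas at 25°C and 1 atm) are the values given in the text.

```python
MOLECULAR_WEIGHT_G_PER_MOL = 106.2   # ethylbenzene, value given in the text
MOLAR_VOLUME_L_PER_MOL = 24.45       # gas molar volume at 25 degrees C and 1 atm


def ppm_to_mg_per_m3(ppm):
    """Convert an air concentration from ppm (by volume) to mg/m3."""
    return ppm * MOLECULAR_WEIGHT_G_PER_MOL / MOLAR_VOLUME_L_PER_MOL


def mg_per_m3_to_ppm(mg_per_m3):
    """Convert an air concentration from mg/m3 to ppm (by volume)."""
    return mg_per_m3 * MOLAR_VOLUME_L_PER_MOL / MOLECULAR_WEIGHT_G_PER_MOL


# Conversion factor for 1 ppm; rounds to the 4.344 cited in the text.
factor = ppm_to_mg_per_m3(1.0)
```

The conversion factor applies only at the stated temperature and pressure; a different molar volume would be needed for other conditions.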
Findings from human and animal evidence for each unit of analysis are separately judged to reach an expression of certainty in the evidence for a hazard (robust, moderate, slight, indeterminate, or compelling evidence of no effect). Within-stream evidence synthesis conclusions directly inform the integration across the evidence streams to draw overall conclusions for each of the assessed health effect categories (evidence demonstrates, evidence indicates, evidence suggests, evidence inadequate, or strong evidence supports no effect). A structured framework approach is used to guide both evidence synthesis and integration. While there are circumstances where specific mechanistic evidence (typically biological precursors) is included in the unit of analysis for human or animal evidence synthesis, in most cases mechanistic findings are presented separately from the human and animal evidence and used to inform conclusions on (1) the coherence, directness of outcome measures, and biological significance of findings within the animal or human evidence streams during evidence synthesis and (2) evidence integration judgments on the human relevance of findings in animals, coherence across evidence streams ("cross-stream coherence"), information on susceptible populations or lifestages, understanding of biological plausibility and MOA, and possibly other critical inferences (e.g., read-across analyses). The structured framework also accommodates consideration of supplemental information (e.g., ADME, non-PECO route of exposure) that can inform evidence synthesis and integration judgments.
• Evidence synthesis: A summary of findings and judgment(s) regarding the certainty in the evidence for hazard for each unit of analysis from the human and animal studies are made in parallel, but separately. A unit of analysis is an outcome or group of related outcomes within a health effect category that are considered together during evidence synthesis.
These judgments can incorporate mechanistic and other supplemental evidence when the unit of analysis is defined as such (see Section 3). The units of analysis can also include or be framed to focus on precursor events (e.g., biomarkers). In addition, this can include an evaluation of coherence across units of analysis within an evidence stream. At this stage, the animal evidence judgment(s) does not yet consider the human relevance of that evidence.
• Evidence integration: The animal and human evidence judgments are combined to draw an overall evidence integration judgment(s) that incorporates inferences drawn based on information on the human relevance of the animal evidence, coherence across evidence streams, potential susceptibility, understanding of biological plausibility and MOA, and other critical inferences informed by mechanistic, ADME, or other supplemental data.
Footnote 6: The phrases "evidence synthesis" and "evidence integration" used here are analogous to the phrases "strength of evidence" and "weight of evidence," respectively, used in some other assessment processes (EFSA, 2017; U.S. EPA, 2017a; NRC, 2014; U.S. EPA, 2005a).
Evidence synthesis and integration judgments are expressed both narratively in the assessment and summarized in tabular format in evidence profile tables (see Table 8-1). Key findings and analyses of mechanistic and other supplemental content are also summarized in narrative and tabular format to inform evidence synthesis and integration judgments (see Table 8-2). In brief, after synthesis, a certainty in the evidence judgment is drawn for each unit of analysis, summarized as robust, moderate, slight, indeterminate, or compelling evidence of no effect (see Section 8.1).
Next, these judgments are used to inform evidence integration judgments, summarized as evidence demonstrates, evidence indicates, evidence suggests, evidence inadequate, or strong evidence supports no effect (see Section 8.2). These summary judgments are included as part of the evidence synthesis and integration narratives. When multiple units of analysis are synthesized, the main evidence integration judgments typically focus on the unit of analysis with the strongest evidence synthesis judgments, although exceptions may occur.⁷ Health outcomes or endpoints where the unit of analysis is considered to present slight, indeterminate, or compelling evidence of no effect can inform the evidence integration hazard judgment but would typically not be used as the basis for deriving a toxicity value. Structured evidence profile tables are used to summarize these analyses and foster consistency within and across assessments. Instructions for using HAWC to create these tables are available at the HAWC project "IRIS PPRTV SEM Template Figures and Resources" (see "Attachments," then select "Creating Evidence Profile Tables in HAWC").
⁷ In some cases, it may be appropriate to draw multiple evidence integration judgments within a given health effect category. This is generally dependent on data availability (i.e., more narrowly defined categories may be possible with more evidence) and the ability to integrate the different evidence streams at the level of these more granular categories. More granular categories will generally be organized by pre-defined manifestations of potential toxicity. For example, within the health effect category of immune effects, separate and different evidence integration judgments might be appropriate for immunosuppression, immunostimulation, and sensitization and allergic response (i.e., the three types of immunotoxicity described in IPCS (2012)).
Likewise, within the category of developmental effects, it may be appropriate to draw separate judgments for potential effects on fetal death, structural abnormality, altered growth, and functional deficits (i.e., the four manifestations of developmental toxicity described in EPA guidelines [U.S. EPA, 1991a]). These separate judgments are particularly important when the evidence supports that the different manifestations might be based on different toxicological mechanisms. As described for the evidence synthesis judgments, the strongest evidence integration judgment will typically be used to reflect certainty in the broader health effect category.

Table 8-1. Generalized evidence profile table to show the relationship between evidence synthesis and evidence integration to reach judgment of the evidence for hazard

Evidence synthesis (certainty of evidence) judgments (note that many factors and judgments require elaboration or evidence-based justification; see IRIS Handbook for details)
Columns: Studies | Summary of key findings | Factors that increase certainty (applied to each unit of analysis) | Factors that decrease certainty (applied to each unit of analysis) | Evidence synthesis judgment(s)

Evidence from human studies
Unit of analysis #1
Studies: Studies considered and study confidence
Summary of key findings: Description of the primary results
Factors that increase certainty:
• All/Mostly medium or high confidence studies
• Consistency
• Dose-response gradient
• Large or concerning magnitude of effect
• Coherence*
Factors that decrease certainty:
• All/Mostly low confidence studies
• Unexplained inconsistency
• Imprecision
• Concerns about biological significance*
• Indirect outcome measures*
• Lack of expected coherence*
Evidence synthesis judgment(s): Judgment reached for each unit of analysis*
⊕⊕⊕ Robust
⊕⊕⊙ Moderate
⊕⊙⊙ Slight
⊙⊙⊙ Indeterminate
Compelling evidence of no effect

Evidence from animal studies
Unit of analysis #1
Studies: Studies considered and study confidence
Summary of key findings: Description of the primary results
Factors that increase certainty:
• All/Mostly medium or high confidence studies
• Consistency
• Dose-response gradient
• Large or concerning magnitude of effect
• Coherence*
Factors that decrease certainty:
• All/Mostly low confidence studies
• Unexplained inconsistency
• Imprecision
• Concerns about biological significance*
• Indirect outcome measures*
• Lack of expected coherence*
Evidence synthesis judgment(s): Judgment reached for each unit of analysis
⊕⊕⊕ Robust
⊕⊕⊙ Moderate
⊕⊙⊙ Slight
⊙⊙⊙ Indeterminate
Compelling evidence of no effect

Evidence integration (weight of evidence) judgment(s)
Describe overall evidence integration judgment(s):
⊕⊕⊕ Evidence demonstrates
⊕⊕⊙ Evidence indicates (likely)
⊕⊙⊙ Evidence suggests
⊙⊙⊙ Evidence inadequate
Strong evidence supports no effect
Highlight the primary supporting evidence for each integration judgment*
Present inferences and conclusions on:
• Human relevance of findings in animals*
• Cross-stream coherence*
• Potential susceptibility*
• Biological plausibility*
• Other critical inferences (e.g., from ADME or other supplemental information)*

*Can be informed by key findings from the mechanistic analyses (see Table 8-2).

Table 8-2. Generalized evidence profile table to show the key findings and supporting rationale from mechanistic analyses.
Mechanistic analyses
Columns: Biological events or pathways (or other relevant evidence grouping) | Summary of key findings and interpretation | Judgment(s) and rationale

Biological events or pathways (or other relevant evidence grouping):
Different analyses may be presented separately, e.g., by exposure route or key uncertainty addressed. Each analysis may include multiple rows separated by biological events or other feature of the approach used for the analysis.
• Generally, will cite mechanistic synthesis (e.g., for references; for detailed analysis)
• Does not have to be chemical-specific (e.g., read-across)

Summary of key findings and interpretation:
May include separate summaries, for example by study type (e.g., new approach methods vs. in vivo biomarkers), dose, or design.
Interpretation: Summary of expert interpretation for the body of evidence and supporting rationale
Key findings: Summary of findings across the body of evidence (may focus on or emphasize highly informative designs or findings), including key sources of uncertainty or identified limitations of the study designs tested (e.g., regarding the biological event or pathway being examined)

Judgment(s) and rationale:
Overall summary of expert interpretation across the assessed set of biological events, potential mechanisms of toxicity, or other analysis approach (e.g., AOP).
• Includes the primary evidence supporting the interpretation(s)
• Describes and substantiates the extent to which the evidence influences inferences across evidence streams
• Characterizes the limitations of the evaluation and highlights existing data gaps
• May have overlap with factors summarized for other streams

8.1. EVIDENCE SYNTHESIS
IRIS assessments synthesize the evidence separately for each unit of analysis by focusing on factors that increase or decrease certainty in the reported findings (see Table 8-1). These factors are adapted from considerations for causality introduced by Austin Bradford Hill (Hill, 1965) with some expansion and adaptation of how they are applied to facilitate transparent application to chemical assessments that consider multiple streams of evidence. Specifically, the factors considered are confidence in study findings (risk of bias and sensitivity), consistency across studies or experiments, dose-/exposure-response gradient, strength (effect magnitude) of the association, directness of outcome or endpoint measures, and coherence [Table 8-3; see additional discussion in U.S. EPA (2005a), U.S. EPA (1994), and U.S. EPA (2020a)]. These factors are similar to the domains considered in the GRADE Quality of Evidence framework (Schünemann et al., 2013). Each of the considered factors and the certainty of evidence judgments require elaboration or evidence-based justification in the synthesis narrative. Analysis of evidence synthesis considerations is qualitative (i.e., numerical scores are not developed, summed, or subtracted).
Biological understanding (e.g., knowledge of how an effect manifests or progresses) or mechanistic inference (e.g., dependency on a conserved key event across outcomes) can be used to define which related outcomes are considered as a unit of analysis. The units of analysis may also include predefined categories of mechanistic evidence (typically precursor events). When mechanistic evidence is included in the units of analysis, it is evaluated against all evidence synthesis factors.
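As an illustration only, the factor-by-factor record described above could be kept in a small data structure. The sketch below is not part of the IRIS framework or of HAWC; the names `Direction`, `FactorNote`, and `UnitOfAnalysis` are hypothetical, and the example entry is invented. It assumes only what the text states: each factor is judged qualitatively, each judgment needs a written justification, and no numerical score is computed.

```python
from dataclasses import dataclass, field
from enum import Enum


class Direction(Enum):
    """Qualitative direction only; no numeric scores are assigned or summed."""
    INCREASES = "increases certainty"
    DECREASES = "decreases certainty"
    NO_CHANGE = "neither increases nor decreases certainty"


# Factor names follow the list in the text; the tuple itself is illustrative.
FACTORS = (
    "confidence in study findings",
    "consistency",
    "dose-/exposure-response gradient",
    "strength (effect magnitude)",
    "directness of outcome measures",
    "coherence",
)


@dataclass(frozen=True)
class FactorNote:
    factor: str
    direction: Direction
    rationale: str

    def __post_init__(self):
        if self.factor not in FACTORS:
            raise ValueError(f"unknown factor: {self.factor!r}")
        # The text requires elaboration or evidence-based justification.
        if not self.rationale.strip():
            raise ValueError("each factor judgment needs a written rationale")


@dataclass
class UnitOfAnalysis:
    name: str
    evidence_stream: str  # "human" or "animal"
    notes: list = field(default_factory=list)

    def add(self, factor, direction, rationale):
        self.notes.append(FactorNote(factor, direction, rationale))

    def unaddressed(self):
        """Factors that have no recorded judgment yet."""
        seen = {note.factor for note in self.notes}
        return [f for f in FACTORS if f not in seen]

    def narrative(self):
        """One line per recorded factor, for pasting into a synthesis draft."""
        return "\n".join(
            f"{n.factor}: {n.direction.value} ({n.rationale})" for n in self.notes
        )


# Hypothetical usage; the outcome group and rationale are made up.
unit = UnitOfAnalysis("hypothetical outcome group", "animal")
unit.add("consistency", Direction.INCREASES,
         "similar direction across independent experiments")
print(unit.narrative())
print(unit.unaddressed())
```

The record deliberately has no method that adds up directions into a score, in keeping with the qualitative analysis the section describes; the final judgment would still be written by the assessor.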
Mechanistic and other supplemental evidence not included in the units of analysis can be analyzed to inform select evidence synthesis factors (i.e., coherence, directness of outcome measures, or biological significance) within the animal and human evidence synthesis. Additional mechanistic evaluations (e.g., biological plausibility) are considered as part of across-stream evidence integration (see Section 8.2).
Five levels of certainty in the evidence for a hazard are used to summarize evidence synthesis judgments: robust (⊕⊕⊕, very little uncertainty exists), moderate (⊕⊕⊙, some uncertainty exists), slight (⊕⊙⊙, large uncertainty exists), indeterminate (⊙⊙⊙), or compelling evidence of no effect (− − −, little to no uncertainty exists for lack of hazard) (see Tables 8-4 and 8-5 for descriptions). Conceptually, before the evidence synthesis framework is applied, certainty in the evidence is neutral (i.e., functionally equivalent to indeterminate). Next, the level of certainty regarding the evidence for (or against) hazard is increased or decreased depending on interpretations using the factors described in Table 8-3. Level of certainty analyses are conducted for each unit of analysis within an evidence stream.
Certainty is increased when the evidence base exhibits a signal of an effect on the health outcome, based on consistency across studies or experiments, the presence of a dose- or exposure-response gradient, a large or concerning magnitude of effect, and coherent findings for closely related endpoints (which can include mechanistic endpoints). These patterns are more compelling when observed among high or medium confidence studies. Certainty is decreased by an evidence base of mostly low confidence studies, unexplained inconsistency, imprecision, concerns about biological significance, indirect measures of outcomes, and a lack of expected coherence. Study sensitivity considerations can be expressed as a factor that can either increase or decrease certainty in the evidence, depending on whether an association is observed. An evidence base of mostly null findings where insensitivity is a serious concern decreases certainty that the evidence is sufficient to support a lack of health effect or association. Conversely, there may be an increase in the evidence certainty in cases where an association is observed although the expected impact of study sensitivity is toward the null.

Table 8-3. Considerations that inform judgments of the certainty of the evidence for hazard for each unit of analysis
Columns: Consideration | Increased evidence certainty (of the human or animal evidence for hazardᵃ) | Decreased evidence certainty (of the human or animal evidence for hazardᵃ)

Consideration: Risk of bias and sensitivity (across studies)
Increased evidence certainty:
• An evidence base of mostly (or all) high or medium confidence studies is interpreted as being only minimally affected by bias and insensitivity.
• This factor should not be used if no other factors would increase or decrease the confidence for a given unit of analysis.
• In addition, consideration of risk of bias and sensitivity should inform how other factors are evaluated (i.e., can inconsistency potentially be explained by variation in confidence judgments?).
Decreased evidence certainty:
• An evidence base of mostly (or all) low confidence studies decreases certainty.
An exception to this is an evidence base of studies in which the issues resulting in low confidence are related to insensitivity. This may increase evidence certainty in cases where an association is identified because the expected impact of study insensitivity is toward the null.
• An evidence base of mostly null findings where insensitivity is a serious concern decreases certainty that the evidence is sufficient to support a lack of health effect or association.
• Decisions to increase certainty for other considerations in this table should generally not be made if there are serious concerns for risk of bias.

Consideration: Consistency
Increased evidence certainty:
• Similarity of findings for a given outcome (e.g., of a similar direction) across independent studies or experiments, especially when medium or high confidence, increases certainty. The increase in certainty is larger when consistency is observed across populations (e.g., geographical location) or exposure scenarios in human studies, and across laboratories, species, or exposure scenarios (e.g., route; timing) in animal studies. When seemingly inconsistent findings are identified, patterns should be further analyzed to discern if the inconsistencies can potentially be explained based on study confidence, dose or exposure levels, population, or experimental model differences, etc. This factor is typically given the most attention during evidence synthesis.
Decreased evidence certainty:
• Unexplained inconsistency [i.e., conflicting evidence; see U.S. EPA (2005a)] decreases certainty. Generally, certainty should not be decreased if discrepant findings can be reasonably explained by considerations such as study confidence conclusions (including sensitivity); variation in population or species, sex, or lifestage (including understanding of differences in pharmacokinetics); or exposure patterns (e.g., intermittent versus continuous), levels (low versus high), or duration. Similar to current recommendations in the Cochrane Handbook [Higgins et al. (2022), see Section 7.8.6], clear conflicts of interest (COI) related to funding source can be considered as a factor to explain apparent inconsistency. For small evidence bases, it may be hard to assess consistency. An evidence base of a single or a few studies where consistency cannot be accurately assessed does not, on its own, increase or decrease evidence certainty. Similarly, a reasonable explanation for inconsistency does not necessarily result in an increase in evidence certainty.

Consideration: Effect magnitude and imprecision
Increased evidence certainty:
• Evidence of a large or concerning magnitude of effect can increase certainty (generally only when observed in medium or high confidence studies).
• Judgments on effect magnitude and imprecision consider the rarity and severity of the effect.
Decreased evidence certainty:
• Certainty may be decreased if the findings are considered not likely to be biologically significant. Effects that are small in magnitude might not be considered to be biologically significant (adverseᵇ) based on information such as historical responses and variability. However, effects that appear to be of small magnitude may be meaningful at the population level (e.g., IQ shifts); in such cases, certainty would not be decreased.
• Certainty may also be decreased for imprecision, particularly if there are only a few studies available to evaluate consistency in effect magnitude across studies.

Consideration: Dose-response
Increased evidence certainty:
• Evidence of dose-response or exposure-response in high or medium confidence studies increases certainty. Dose-response may be demonstrated across studies or within studies, and it can be dose- or duration-dependent.
The dose-response may also be nonmonotonic (monotonicity should not necessarily be expected, as different outcomes may be expected at low vs. high doses or long vs. short durations due to factors such as activation of different mechanistic pathways, systemic toxicity at high doses, or tolerance/acclimation). Sometimes, grouping studies by level of exposure is helpful to identify the dose-response pattern.
• Decreases in a response (e.g., symptoms of current asthma) after a documented cessation of exposure also may increase certainty in a relationship between exposure and outcome (this is primarily applicable to epidemiology studies because of their observational nature).
Decreased evidence certainty:
• A lack of dose-response when expected based on biological understanding can decrease certainty in the evidence. If the data are not adequate to evaluate a dose-response pattern, however, then certainty is neither increased nor decreased.
• In some cases, duration-dependent patterns in the dose-response can decrease evidence certainty. Such patterns are generally only observable in experimental studies. Specifically, the magnitude of effects at a given exposure level might decrease with longer exposures (e.g., due to tolerance or acclimation). Or, effects might rapidly resolve under certain experimental conditions (e.g., reversibility after removal of exposure). As many reversible and short-lived effects can be of high concern, decisions about whether such patterns decrease evidence certainty depend on considering the pharmacokinetics of the chemical and the conditions of exposure [see U.S. EPA (1998)], endpoint severity, judgments regarding the potential for delayed or secondary effects, the underlying mechanism(s) involved, as well as the exposure context focus of the assessment (e.g., addressing intermittent or short-term exposures).

Consideration: Directness of outcome/endpoint measures
Increased evidence certainty:
• Not applicable
Decreased evidence certainty:
• If the evidence base primarily includes outcomes or endpoints that are indirect measures (e.g., biomarkers) of the unit of analysis, certainty (for that unit of analysis) is typically decreased. Judgments to decrease certainty based on indirectness should focus on findings that have an unclear linkage to an apical or clinical (adverseᵇ) outcome. Scenarios where the magnitude of the response is not considered to reflect a biologically meaningful level of change (i.e., biological significance; see 'effect magnitude and imprecision' row above) are not considered under indirectness.
• Related to indirectness, certainty in the evidence may be decreased when the findings are determined to be nonspecific to the hazard under evaluation. This consideration is generally only applicable to animal evidence and the most common example is effects only with exposures (level, duration) shown to cause excessive toxicity in that species and lifestage (including consideration of maternal toxicity in developmental evaluations). This does not apply when an effect is viewed as secondary to other changes (e.g., effects on pulmonary function because of disrupted immune responses).

Consideration: Coherence
Increased evidence certainty:
• Biologically related findings within or across studies, within an organ system or across populations (e.g., sex), increase certainty (generally only when observed in medium or high confidence studies). Certainty is further increased when a temporal or dose-dependent progression of related effects is observed within or across studies, or when related findings of increasing severity are observed with increasing exposure.
• Coherence across findings within a unit of analysis (e.g., consistent changes in disease markers and biological precursors in exposed humans) can increase certainty in the evidence for an effect.
• Coherence within or across biologically related units of analysis can also increase certainty for a given (or multiple) unit(s) of analysis. This considers certainty in the biological relationships between the endpoints being compared, and the sensitivity and specificity of the measures used.
• Mechanistic support for, or biological understanding of, the relatedness between different endpoints within (or across different) units of analysis, can inform an understanding of coherence.
Decreased evidence certainty:
• An observed lack of expected coherent changes (e.g., in well-established biological relationships) within or across biologically related units of analysis typically decreases evidence certainty. This includes mechanistic changes when included in the unit of analysis. However, as described for decisions to increase certainty, the biological relationships between the endpoints being compared, and the sensitivity and specificity of the measures used, need to be carefully examined. The decision to decrease depends on the availability of evidence across multiple related endpoints for which changes would be anticipated, and it considers factors (e.g., dose and duration of exposure, strength of expected relationship) across the studies of related changes.

Consideration: Other factors
Increased evidence certainty:
• Unusual scenarios that cannot be addressed by the considerations above, e.g., read-across inferences supporting the adversity of observed changes.
Decreased evidence certainty:
• Unusual scenarios that cannot be addressed by the considerations above, e.g., strong evidence of publication bias.ᶜ

ᵃ While the focus is on identifying potential adverse human health effects (hazards) of exposure, these factors can also be used to increase or decrease certainty in the evidence supporting lack of an effect (e.g., leading to a judgment of compelling evidence of no effect). The latter application is not explicitly outlined here.
ᵇ Within this framework, evidence synthesis judgments reflect an interpretation of the evidence for a hazard; thus, consideration of the adversity of the findings is an explicit aspect of the analyses. To better define how adversity is evaluated, the consideration of adversity is broken into the two, sometimes related, considerations of the indirectness of the outcome measures and the interpreted biological significance of the effect magnitude.
ᶜ Publication bias involves the influence of the direction, magnitude, or statistical significance of the results on the likelihood of a paper being published; it can result from decisions made, consciously or unconsciously, by study authors, journal reviewers, and journal editors (Dickersin, 1990). This may make the available evidence base unrepresentative. However, publication bias can be difficult to evaluate (NTP, 2019) and should not be used as a factor that decreases certainty unless there is strong evidence.

A structured framework approach is used to draw evidence synthesis judgments for human and animal evidence.
Tables 8-4 and 8-5 (for human and animal evidence, respectively) provide the example-based criteria that guide how to draw the certainty of evidence judgments for each unit of analysis within a health effect category and the terms used to summarize those judgments. These terms are applied to human and animal evidence separately. The terms robust and moderate are characterizations for judgments that the evidence (across studies) supports that the effect(s) results from the exposure being assessed. These two terms are differentiated by the quality and amount of information available to rule out alternative explanations for the results. For example, repeated observations of effects by independent studies or experiments examining various aspects of exposure or response (e.g., different exposure settings, dose levels or patterns, populations or species, biologically related endpoints) result in a stronger certainty of evidence judgment. The term slight indicates situations in which there is some evidence supporting an association within the evidence stream, but substantial uncertainties in the data exist to prevent judgments that the effect(s) can be reliably attributed to the exposure being assessed. Indeterminate reflects judgments for a wide variety of evidence scenarios, including when no studies are available or when the evidence from studies of similar confidence has a high degree of unexplained inconsistency. Compelling evidence of no effect represents a rare situation in which extensive evidence across a range of populations and exposures has demonstrated that no effects are likely to be attributable to the exposure being assessed. This category is applied at the health effect level (e.g., hepatic effects) rather than at the more granular unit of analysis level to avoid giving the impression of confidence in lack of a health effect when aspects of potential toxicity have not been adequately examined.
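The vocabulary in the preceding paragraph can be sketched as an enumeration, shown here only as an illustration. The class and function names are hypothetical and are not taken from HAWC or any EPA tool, and the symbols are a best-effort reading of the glyphs used in the evidence profile tables. The sketch encodes two statements from the text: robust and moderate are the judgments indicating that the evidence supports an effect of the exposure, and compelling evidence of no effect is applied at the health effect level rather than to an individual unit of analysis.

```python
from enum import Enum


class SynthesisJudgment(Enum):
    # (label, symbol); symbols are assumed from the evidence profile tables.
    ROBUST = ("robust", "⊕⊕⊕")
    MODERATE = ("moderate", "⊕⊕⊙")
    SLIGHT = ("slight", "⊕⊙⊙")
    INDETERMINATE = ("indeterminate", "⊙⊙⊙")
    COMPELLING_NO_EFFECT = ("compelling evidence of no effect", "− − −")

    def __init__(self, label, symbol):
        self.label = label
        self.symbol = symbol

    @property
    def supports_effect(self):
        """True for the two terms the text ties to support for an effect."""
        return self in (SynthesisJudgment.ROBUST, SynthesisJudgment.MODERATE)


SCOPES = ("health effect", "unit of analysis")  # hypothetical scope labels


def check_scope(judgment, scope):
    """Return the judgment if it may be applied at the given scope."""
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    if judgment is SynthesisJudgment.COMPELLING_NO_EFFECT and scope != "health effect":
        raise ValueError(
            "compelling evidence of no effect is applied at the health effect level"
        )
    return judgment


# Hypothetical usage: list the labels with their symbols.
for j in SynthesisJudgment:
    print(f"{j.symbol}  {j.label}")
```

A check like `check_scope` only guards the scope rule; it does not decide which judgment applies, which remains a narrative, evidence-based decision.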
Reaching a judgment of compelling evidence of no effect is infrequent because it requires both a high degree of confidence in the conduct of individual studies, including consideration of study sensitivity, as well as comprehensive assessments of outcomes and lifestages of exposure that adequately address concern for the hazard under evaluation.

Table 8-4. Framework for evidence synthesis judgments from studies in humans
Columns: Evidence synthesis judgment | Description

Evidence synthesis judgment: Robust (⊕⊕⊕) ...evidence in human studies (strong signal of effect with very little uncertainty)
Description: A set of high or medium confidence independent studies (e.g., in different populations) reporting an association between the exposure and the health outcome(s), with reasonable confidence that alternative explanations, including chance, bias, and confounding, can be ruled out across studies. The set of studies is primarily consistent, with reasonable explanations when results differ; the findings are considered adverse (i.e., biologically significant and without notable concern for indirectness); and an exposure-response gradient is demonstrated. Additional supporting evidence, such as associations with biologically related endpoints in human studies (coherence) or large estimates of risk or severity of the response, can increase confidence but are not required. Supplemental evidence included in the unit of analysis (e.g., mechanistic studies in exposed humans or human cells) may raise the certainty of evidence to robust for a set of studies that otherwise would be described as moderate. Such evidence not included in the unit of analysis can also inform evaluations of the coherence of the human evidence, the directness of the outcome measures, and the biological significance of the findings. Causality is inferred for a human evidence base of robust.

Evidence synthesis judgment: Moderate (⊕⊕⊙) ...evidence in human studies (signal of effect with some uncertainty)
Description: A set of evidence that does not reach the degree of certainty required for Robust, but which includes at least one high or medium confidence study reporting an association and additional information increasing the certainty of evidence. For multiple studies, there is primarily consistent evidence of an association with reasonable support for adversity, but there may be some uncertainty due to potential chance, bias, or confounding or because of the indirectness of some measures. For a single study, there is a large magnitude or severity of the effect, or a dose-response gradient, or other supporting evidence, and there are no serious residual methodological uncertainties. Supporting evidence could include associations with related endpoints, including mechanistic evidence from exposed humans when included within the unit of analysis. When available and included in the unit of analysis, mechanistic data in humans that address the above considerations may raise the certainty of evidence to Moderate for a set of studies that otherwise would be described as Slight. In exceptional cases, biological support from mechanistic evidence in exposed humans may support raising the certainty of evidence to Moderate for evidence that would otherwise be described as Indeterminate.

Evidence synthesis judgment: Slight (⊕⊙⊙) ...evidence in human studies (signal of effect with large amount of uncertainty)
Description: One or more studies reporting an association between exposure and the health outcome, but considerable uncertainty exists and supporting coherent evidence is sparse. In general, the evidence is limited to a set of consistent low confidence studies, or higher confidence studies with significant unexplained heterogeneity or other serious residual uncertainties.
It also applies when one medium or high confidence study is available without additional information strengthening the likelihood of a causal association (e.g., coherent findings within the same study or from other studies). This category serves primarily to encourage additional study where evidence does exist that might provide some support for an association, but for which the evidence does not reach the degree of confidence required for moderate.

Evidence synthesis judgment: Indeterminate (⊙⊙⊙) ...evidence in human studies (signal cannot be determined for or against an effect)
Description: No studies available in humans or situations when the evidence is inconsistent and primarily of low confidence. In addition, this may include situations where higher confidence studies exist, but there are major concerns with the evidence base such as unexplained inconsistency, a lack of expected coherence from a stronger set of studies, very small effect magnitude (i.e., major concerns about biological significance), or uncertainties or methodological limitations that result in an inability to discern effects from exposure. It also applies for a single low confidence study in the absence of factors that increase certainty. A set of largely null studies could be concluded to be Indeterminate if the evidence does not reach the level required for Compelling evidence of no effect.

Evidence synthesis judgment: Compelling evidence of no effect (− − −) ...in human studies (strong signal for lack of an effect with little uncertainty)
Description: A set of high confidence studies examining a reasonable spectrum of endpoints showing null results (for example, an odds ratio of 1.0), ruling out alternative explanations (including chance, bias, and confounding) with reasonable confidence.
Each of the studies should have used an optimal outcome and exposure assessment and adequate sample size (specifically for higher exposure groups and for susceptible populations). The set as a whole should include diverse sampling (across sexes [if applicable] and different populations) and include the full range of levels of exposures that human beings are known to encounter, an evaluation of an exposure-response gradient, and an examination of at-risk populations and lifestages. Mechanistic data in humans that address the above considerations or that provide information supporting the lack of an association between exposure and effect with reasonable confidence may provide additional support for this judgment.

Table 8-5. Framework for evidence synthesis judgments from studies in animals
Columns: Evidence synthesis judgment | Description

Evidence synthesis judgment: Robust (⊕⊕⊕) ...evidence in animal studies (strong signal of effect with very little uncertainty)
Description: A set of high or medium confidence, independent experiments (i.e., across laboratories, exposure routes, experimental designs [for example, a subchronic study and a multigenerational study], or species) reporting effects of exposure on the health outcome(s). The set of studies is primarily consistent, with reasonable explanations when results differ (i.e., due to differences in study design, exposure level, animal model, or study confidence), and the findings are considered adverse (i.e., biologically significant and without notable concern for indirectness). At least two of the following additional factors in the set of experiments increase the certainty of evidence: coherent effects across multiple related endpoints (within or across biologically related units of analysis and may include mechanistic endpoints); an unusual magnitude of effect, rarity, age at onset, or severity; a strong dose-response relationship; or consistent observations across animal lifestages, sexes, or strains.
Mechanistic evidence from animals included in the unit of analysis or used to assess coherence of findings in the animal evidence may raise the certainty of evidence to robust for a set of studies that otherwise would be described as moderate.

Moderate (⊕⊕⊙) ...evidence in animal studies (signal of effect with some uncertainty)
A set of evidence that does not reach the degree of certainty required for Robust, but which includes at least one high or medium confidence study and additional information increasing the certainty of evidence. For multiple studies or a single study, the evidence is primarily consistent or coherent with reasonable support for adversity, but there are notable remaining uncertainties (e.g., difficulty interpreting the findings due to concerns for indirectness of some measures); however, these uncertainties are not sufficient to reduce or discount the level of concern regarding the positive findings and any conflicting findings are from a set of experiments of lower confidence. The set of experiments supporting the effect provide additional information increasing the certainty of evidence, such as consistent effects across laboratories or species; coherent effects across multiple related endpoints (may include mechanistic endpoints within the unit of analysis); an unusual magnitude of effect, rarity, age at onset, or severity; a strong dose-response relationship; and/or consistent observations across exposure scenarios (e.g., route, timing, duration), sexes, or animal strains. When available and included in the unit of analysis, mechanistic data in animals that address the above considerations may raise the certainty of evidence to Moderate for a set of studies that otherwise would be described as Slight.
In exceptional cases, strong biological support from mechanistic studies may raise the certainty of evidence to Moderate for evidence that would otherwise be described as Indeterminate.

Slight (⊕⊙⊙) ...evidence in animal studies (signal of effect with large amount of uncertainty)
One or more studies reporting an effect of exposure on the health outcome, but considerable uncertainty exists and supporting coherent evidence is sparse. In general, the evidence is limited to a set of consistent low confidence studies, or higher confidence studies with significant unexplained heterogeneity or other serious uncertainties (e.g., concerns about adversity) across studies. It also applies when one medium or high confidence experiment is available without additional information increasing the certainty of evidence (e.g., coherent findings within the same study or from other studies). Biological evidence from mechanistic studies may also be independently interpreted as Slight. This category serves primarily to encourage additional study where evidence does exist that might provide some support for an association, but for which the evidence does not reach the degree of confidence required for Moderate.

Indeterminate (⊙⊙⊙) ...evidence in animal studies (signal cannot be determined for or against an effect)
No studies available in animals or situations when the evidence is inconsistent and primarily of low confidence. In addition, this may include situations where higher confidence studies exist, but there are major concerns with the evidence base such as unexplained inconsistency, a lack of expected coherence from a stronger set of studies, very small effect magnitude (i.e., major concerns about biological significance), or uncertainties or methodological limitations that result in an inability to discern effects from exposure. It also applies for a single low confidence study in the absence of factors that increase certainty.
A set of largely null studies could be concluded to be Indeterminate if the evidence does not reach the level required for Compelling evidence of no effect.

Compelling evidence of no effect (−−−) ...in animal studies (strong signal for lack of an effect with little uncertainty)
A set of high confidence experiments examining a reasonable spectrum of endpoints that demonstrate a lack of biologically significant effects across multiple species, both sexes, and a broad range of exposure levels. The data are compelling in that the experiments have examined the range of scenarios across which health effects in animals could be observed, and an alternative explanation (e.g., inadequately controlled features of the studies' experimental designs; inadequate sample sizes) for the observed lack of effects is not available. Each of the studies should have used an optimal endpoint and exposure assessment and adequate sample size. The evidence base should represent both sexes and address potentially susceptible populations and lifestages. Mechanistic data in animals that address the above considerations or that provide information supporting the lack of an association between exposure and effect with reasonable confidence may provide additional support for this judgment.

8.2. EVIDENCE INTEGRATION

The phase of evidence integration combines animal and human evidence synthesis judgments while also considering information on the human relevance of findings in animal evidence, coherence across evidence streams ("cross-stream coherence"), information on susceptible populations or lifestages, understanding of biological plausibility and MOA, and possibly other critical inferences (e.g., read-across analyses) that generally draw on mechanistic and other supplemental evidence (see Table 8-6). This analysis culminates in an evidence integration judgment and narrative for each potential health effect (i.e., each noncancer health effect and specific type of cancer, or broader grouping of related outcomes as defined in the evaluation plan). To the extent it can be characterized prior to conducting dose-response analyses, exposure context is provided.

Table 8-6. Considerations that inform evidence integration judgments

Judgment | Description

Human relevance of findings
• Used to describe and justify the interpretation of the relevance of the animal data to humans. This can include consideration of mechanistic or other supplemental information. When human evidence is lacking or has results that differ from animals, analyses of the mechanisms underlying the animal response in relation to those presumed to operate in humans, and the chemical's pharmacokinetics, can inform the extent to which the animal response is likely to be relevant to humans and potentially strengthen overall confidence in the evidence integration conclusion. Conversely, evidence for a mechanistic pathway that is expected to only occur in animals and not in humans can provide support for a conclusion that the animal evidence for an effect is not relevant to humans.
• In the absence of chemical-specific evidence informing human relevance, the evidence integration narrative will briefly describe the interpreted comparability of experimental animal organs/systems to humans based on underlying biological similarity (e.g., thyroid signaling processes are well conserved across rodents and humans). Generally, a high-level systems summary should be possible for most encountered effects. In some cases, however, it may be appropriate to use a statement such as, 'without evidence to the contrary, [health effect described in the table] responses in animals are presumed to be relevant to humans.' As noted in EPA guidelines (U.S. EPA, 2005a), there needs to be evidence or a biological explanation to support an interpreted lack of human relevance for findings in animals, and site concordance is neither expected nor required.

Cross-stream coherence
• Addresses the concordance of findings known to be biologically related across human, animal, and mechanistic studies, considering factors such as exposure timing and levels. Notably, for many health effects (e.g., some nervous system and reproductive effects; cancer), it is not necessary (or expected) that effects manifest in humans are identical to those observed in animals, although this typically provides stronger evidence. For example, tumors in one animal species can be predictive of carcinogenic potential in humans or other species, but not necessarily at the same site. EPA guidelines and other resources (e.g., OECD guidelines) are consulted when drawing these inferences.
• Mechanistic support for, or biological understanding of, the relatedness between different outcomes (and the manner in which they are manifest) in different species can inform an understanding of coherence across evidence streams. Evidence supporting a biologically plausible mechanistic pathway across species adds coherence (see below).

Potential susceptibility Susceptible populations and lifestages
• Used to summarize analyses relating to individual and social factors that may increase susceptibility to exposure-related health effects in certain populations or lifestages, or to highlight the lack of such information. These analyses are based on knowledge about the health outcome or organ system affected and focus primarily on the influence of intrinsic biological factors such as race/ethnicity, genetic variability, sex, lifestage, and pre-existing health conditions (which can also have an extrinsic basis). Information on extrinsic factors potentially influencing susceptibility (e.g., proximity to exposure; certain lifestyle factors including subsistence living) is not considered in evidence integration judgments on potential susceptibility; these exposure-focused factors are considered by risk managers after the human health assessment is complete. Evaluation of potential susceptibility can also include consideration of mechanistic and ADME evidence.

Biological plausibility or MOA understanding
• Support for the biological plausibility of an association between exposure and the health effect increases evidence certainty, particularly when observed across species. This may be provided by data from experimental studies of mechanistic pathways, particularly when support is provided for key events or is conserved across multiple components of the pathway.
Mechanisms or biological changes with broad scientific acceptance for their relevance to chemical toxicity or the health effect (e.g., key characteristics, hallmarks of cancer) may be used to organize the chemical-specific evidence and identify key events leading from exposure to the health effect. For each key event and key event relationship, the evidence is considered regarding the consistency of experimental data and the generalizability, or likelihood of similarities (e.g., in presence or function) across species, as well as the strength of the support for the biological mechanism.
• Mechanistic evidence from well-conducted studies that demonstrates that the health effect is unlikely to occur (i.e., species-specific effects, irrelevant exposure conditions) can support a judgment that the effects from animal or human studies are not biologically relevant, which weakens the summary evidence integration judgment. Such a decision depends on an evaluation of the certainty of the information supporting vs. opposing biological plausibility, as well as the certainty of the health effect-specific findings (e.g., stronger health effect data require more certainty in mechanistic evidence opposing plausibility). Importantly, because understanding biological plausibility is dependent on expert knowledge and canonical scientific knowledge, the lack of such understanding does not provide a rationale to decrease the certainty of the evidence for an effect (NTP, 2015; NRC, 2014).
• These analyses are typically conducted separately to establish MOA understanding and referenced in the evidence integration judgment.
If sufficiently supported, MOA understanding can serve to increase (e.g., strong support for mutagenicity) or decrease (e.g., critical dependence on a key event not likely to be operant in humans) certainty in the evidence integration judgments.

Other critical inferences (optional)
• Consideration of other evidence or nonchemical-specific information that informs evidence integration judgments (e.g., read-across analyses or inferences; ADME understanding used to inform other considerations; judgments on other health effects expected to be linked to the health effect under evaluation) may be separately described as "other critical inferences."

Using a structured framework approach, one of five phrases is used to summarize the evidence integration judgment based on the within-evidence-stream integration of the human and animal evidence, and supplemental (mechanistic) evidence: evidence demonstrates, evidence indicates, evidence suggests, evidence is inadequate, or strong evidence supports no effect (see Table 8-7). The five integration judgment levels reflect the differences in the amount and quality of the data that inform the evaluation of whether exposure may cause the health effect(s). As it is assumed that any identified health hazards will only be manifest given exposures of a certain type and amount (e.g., a specific route; a minimal duration, periodicity, and level), the evidence integration narrative and summary judgment levels include the generic phrase, "given sufficient exposure conditions." This highlights that, for those assessment-specific health effects identified as potential hazards, the exposure conditions associated with those health effects will be defined (as will the uncertainties in the ability to define those conditions) during dose-response analysis. More than one descriptor can be used when the evidence base is able to support that a chemical's effects differ by exposure level or route (U.S. EPA, 2005a).
The analyses and judgments are summarized in the evidence profile table (see Table 8-1).

Table 8-7. Framework for summary evidence integration judgments in the evidence integration narrative

Summary evidence integration judgmentᵃ in narrative | Evidence integration judgment level | Explanation and example scenariosᵇ

Summary judgment in narrative: The currently available evidence demonstrates that [chemical] causes [health effect] in humansᶜ given sufficient exposure conditions. This conclusion is based on studies of [humans or animals] that assessed [exposure or dose] levels of [range of concentrations or specific cutoff level concentrationᵈ].
Judgment level: Evidence demonstrates
Explanation and example scenarios: A strong evidence base demonstrating that [chemical] exposure causes [health effect] in humans.
• This conclusion level is used if there is robust human evidence supporting an effect.
• This conclusion level could also be used with moderate human evidence and robust animal evidence if there is strong mechanistic evidence that MOAs and key precursors identified in animals are anticipated to occur and progress in humans.

Summary judgment in narrative: The currently available evidence indicates that [chemical] likely causes [health effect] in humans given sufficient exposure conditions. This conclusion is based on studies of [humans or animals] that assessed [exposure or dose] levels of [range of concentrations or specific cutoff level concentration].
Judgment level: Evidence indicates (likelyᵉ)
Explanation and example scenarios: An evidence base that indicates that [chemical] exposure likely causes [health effect] in humans, although there may be outstanding questions or limitations that remain, and the evidence is insufficient for the higher conclusion level.
• This conclusion level is used if there is robust animal evidence supporting an effect and slight-to-indeterminate human evidence, or with moderate human evidence when strong mechanistic evidence is lacking.
• This conclusion level could also be used with moderate human evidence supporting an effect and moderate-to-indeterminate animal evidence, or with moderate animal evidence supporting an effect and moderate-to-indeterminate human evidence. In these scenarios, any uncertainties in the moderate evidence are not sufficient to substantially reduce confidence in the reliability of the evidence, or mechanistic evidence in the slight or indeterminate evidence base (e.g., precursors) exists to increase confidence in the reliability of the moderate evidence.

Summary judgment in narrative: The currently available evidence suggests that [chemical] may cause [health effect] in humans. This conclusion is based on studies of [humans or animals] that assessed [exposure or dose] levels of [range of concentrations or specific cutoff level concentration].
Judgment level: Evidence suggests
Explanation and example scenarios: An evidence base that suggests that [chemical] exposure may cause [health effect] in humans, but there are very few studies that contributed to the evaluation, the evidence is very weak or conflicting, and/or the methodological conduct of the studies is poor.
• This conclusion level is used if there is slight human evidence and indeterminate-to-slight animal evidence.
• This conclusion level is also used with slight animal evidence and indeterminate-to-slight human evidence.
• This conclusion level could also be used with moderate human evidence and slight or indeterminate animal evidence, or with moderate animal evidence and slight or indeterminate human evidence. In these scenarios, there are outstanding issues or uncertainties regarding the moderate evidence (i.e., the synthesis judgment was borderline with slight), or mechanistic evidence in the slight or indeterminate evidence base (e.g., null results in well-conducted evaluations of precursors) exists to decrease confidence in the reliability of the moderate evidence.
• Exceptionally, when there is general scientific understanding of mechanistic events that result in a health effect, this conclusion level could also be used if there is strong mechanistic evidence that is sufficient to highlight potential human toxicityᶠ in the absence of informative conventional studies in humans or in animals (i.e., indeterminate evidence in both).

Summary judgment in narrative: The currently available evidence is inadequate to assess whether [chemical] may cause [health effect] in humans.
Judgment level: Evidence inadequate
Explanation and example scenarios: This conveys either a lack of information or an inability to interpret the available evidence for [health effect]. On an assessment-specific basis, a single use of this "inadequate" conclusion level might be used to characterize the evidence for multiple health effect categories (i.e., all health effects that were examined and did not support other conclusion levels).ᵍ
• This conclusion level is used if there is indeterminate human and animal evidence.
• This conclusion level is also used with slight animal evidence and compelling evidence of no effect human evidence.
• This conclusion level could also be used with slight-to-robust animal evidence and indeterminate human evidence if strong mechanistic information indicated that the animal evidence is unlikely to be relevant to humans.
A conclusion of inadequate is not a determination that the agent does not cause the indicated health effect(s). It simply indicates that the available evidence is insufficient to reach conclusions.

Summary judgment in narrative: Strong evidence supports no effect in humans. This conclusion is based on studies of [humans or animals] that assessed [exposure or dose] levels of [range of concentrations].
Judgment level: Strong evidence supports no effect
Explanation and example scenarios: This represents a situation in which extensive evidence across a range of populations and exposure levels has identified no effects/associations. This scenario requires a high degree of confidence in the conduct of individual studies, including consideration of study sensitivity, and comprehensive assessments of the endpoints and lifestages of exposure relevant to the health effect of interest.
• This conclusion level is used if there is compelling evidence of no effect in human studies and compelling evidence of no effect to indeterminate in animals.
• This conclusion level is also used if there is indeterminate human evidence and compelling evidence of no effect animal evidence in models concluded to be relevant to humans.
• This conclusion level could also be used with compelling evidence of no effect in human studies and moderate to robust animal evidence if strong mechanistic information indicated that the animal evidence is unlikely to be relevant to humans.
ᵃ Evidence integration judgments are typically developed at the level of the health effect when there are sufficient studies on the topic to evaluate the evidence at that level; this should always be the case for "evidence demonstrates" and "strong evidence supports no effect," and typically for "evidence indicates (likely)." However, some databases only allow for evaluations at the category of health effects examined; this will more frequently be the case for conclusion levels of "evidence suggests" and "evidence inadequate." A judgment of "strong evidence supports no effect" is drawn at the health effect level.
ᵇ Terminology of "is" refers to the default option; terminology of "could also be" refers to situational options dependent on mechanistic understanding.
ᶜ In some assessments, these conclusions might be based on data specific to a particular lifestage of exposure, sex, or population (or another specific group). In such cases, this would be specified in the narrative conclusion, with additional detail provided in the narrative text. This applies to all conclusion levels.
ᵈ If concentrations cannot be estimated, an alternative expression of exposure level, such as "occupational exposure levels," is provided. This applies to all conclusion levels.
ᵉ For some applications, such as benefit-cost analysis, to better differentiate the categories of "evidence demonstrates" and "evidence indicates," the latter category should be interpreted as evidence that supports an exposure-effect linkage that is likely to be causal.
ᶠ Scientific understanding of adverse outcome pathways (AOPs) and of the human implications of new toxicity testing methods (e.g., from high-throughput screening, from short-term in vivo testing of alternative species or from new in vitro testing) will continue to increase.
This may make possible the development of hazard conclusions when there are mechanistic or other relevant data that can be interpreted with a similar level of confidence to positive animal results in the absence of conventional studies in humans or in animals.
ᵍ Specific narratives for each of these health effects may also be deemed unnecessary.

For evaluations of carcinogenicity, consistent with EPA's cancer guidelines (U.S. EPA, 2005a), one of EPA's standardized cancer descriptors is used to describe the overall potential for carcinogenicity within the evidence integration narrative for carcinogenicity. These descriptors are: (1) carcinogenic to humans, (2) likely to be carcinogenic to humans, (3) suggestive evidence of carcinogenic potential, (4) inadequate information to assess carcinogenic potential, or (5) not likely to be carcinogenic to humans. The standardized cancer descriptors will often align with the evidence integration judgments (i.e., "evidence demonstrates" aligns with "carcinogenic to humans") but not in all cases. For example, the evidence integration judgments are generally used for individual tumor or cancer types and the standardized EPA descriptors are used to characterize overall cancer hazard. For each type of cancer evaluated (e.g., lung cancer; renal cancer) or sets of related cancer types, an evidence integration narrative and summary judgment level are provided as described above for noncancer health effects. When considering evidence on carcinogenicity across human and animal evidence, site concordance is not required (U.S. EPA, 2005a).
If a systematic review of more than one cancer type was conducted, then the strongest evidence integration judgment(s) is used as the basis for selecting the standardized cancer descriptor in accordance with the EPA cancer guidelines (U.S. EPA, 2005a).

9. DOSE-RESPONSE ASSESSMENT: SELECTING STUDIES AND QUANTITATIVE ANALYSIS

9.1. OVERVIEW

Selection of specific data sets for dose-response assessment and performance of the dose-response assessment is conducted after hazard identification is complete and involves database- and chemical-specific biological judgments. A number of EPA guidelines and support documents detail data requirements and other considerations for dose-response modeling, especially EPA's Benchmark Dose Technical Guidance (U.S. EPA, 2012b), EPA's Review of the Reference Dose and Reference Concentration Processes (U.S. EPA, 2005a, 2002), Guidelines for Carcinogen Risk Assessment (U.S. EPA, 2005a), and Supplemental Guidance for Assessing Susceptibility from Early-Life Exposure to Carcinogens (U.S. EPA, 2005b). This section of the protocol provides an overview of considerations for conducting the dose-response assessment, particularly statistical considerations specific to dose-response analysis that support quantitative risk assessment. Importantly, these considerations do not supersede existing EPA guidelines.

For IRIS assessments, dose-response assessments are typically performed for both noncancer and cancer hazards, and for both oral and inhalation routes of exposure following chronic exposure⁸ to the chemical of interest, if supported by existing data. For noncancer hazards, an inhalation reference concentration (RfC) and an oral reference dose (RfD) will be derived.
In addition to an RfC and RfD, this assessment will attempt to derive organ- or system-specific toxicity values when the data are sufficiently strong (i.e., noncancer conclusions of evidence demonstrates or evidence indicates [likely]). A reference value may also be derived for cancer effects in cases where a nonlinear MOA is concluded that indicates a key precursor event necessary for carcinogenicity does not occur below a specific exposure level (U.S. EPA, 2005a, Section 3.3.4). In addition, when feasible and if the available data are appropriate for doing so, the assessment will derive a less-than-lifetime toxicity value (a "subchronic" reference value) for noncancer hazards. Both less-than-lifetime and hazard-specific values may be useful to EPA risk assessors within specific decision contexts.

⁸ Dose-response assessments may also be conducted for shorter durations, particularly if the evidence base for a chemical indicates risks associated with shorter exposures to the chemical (U.S. EPA, 2002).

When low-dose linear extrapolation for cancer effects is supported, particularly for chemicals with direct mutagenic activity or those for which the data indicate a linear component below the point of departure (POD), an inhalation unit risk (IUR) facilitates estimation of human cancer risks. Low-dose linear extrapolation is also used as a default when the data are insufficient to establish the mode of action (U.S. EPA, 2005a). An IUR is a plausible upper-bound lifetime cancer risk from chronic inhalation of a chemical per unit of air concentration (expressed as ppm or µg/m³). In contrast with RfCs, an IUR can be used in conjunction with exposure information to estimate cancer risk at a given dose.
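The arithmetic implied by the IUR definition above can be sketched as follows. This is a minimal illustration under low-dose linear extrapolation, not a method prescribed by this protocol; the function names and all numeric values are hypothetical and are not ethylbenzene toxicity values.

```python
def upper_bound_cancer_risk(iur_per_ug_m3: float, air_conc_ug_m3: float) -> float:
    """Upper-bound lifetime excess cancer risk = IUR x chronic air concentration.

    Assumes the concentration lies in the low-dose range where linear
    extrapolation is applied, and that both inputs share the same unit basis.
    """
    if iur_per_ug_m3 < 0 or air_conc_ug_m3 < 0:
        raise ValueError("IUR and concentration must be non-negative")
    return iur_per_ug_m3 * air_conc_ug_m3


def concentration_at_target_risk(iur_per_ug_m3: float, target_risk: float) -> float:
    """Air concentration (ug/m3) corresponding to a chosen upper-bound risk level."""
    if iur_per_ug_m3 <= 0:
        raise ValueError("IUR must be positive")
    return target_risk / iur_per_ug_m3


# Hypothetical inputs, for illustration only.
example_iur = 2e-6    # risk per ug/m3 (made-up value)
example_conc = 5.0    # ug/m3, chronic exposure (made-up value)
print(upper_bound_cancer_risk(example_iur, example_conc))
print(concentration_at_target_risk(example_iur, 1e-6))
```

Because an RfC is a single reference level and not a slope, no comparable multiplication applies to it; this is the contrast the preceding sentence draws.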
The derivation of toxicity values also depends on the nature of the hazard conclusion. Specifically, EPA generally conducts dose-response assessments and derives cancer values for chemicals that are classified as carcinogenic or likely to be carcinogenic to humans. When there is suggestive evidence of carcinogenic potential to humans, EPA generally would not conduct a dose-response assessment and derive a cancer value. Similarly, for noncancer outcomes, dose-response assessment is conducted when there is stronger evidence of a hazard (generally, "evidence demonstrates" and "evidence indicates [likely]"). When the evidence for a noncancer outcome only suggests a potential hazard to humans ("evidence suggests"), EPA generally would not conduct a dose-response assessment and derive an RfC or RfD. Exceptions where suggestive evidence might be used to develop cancer risk estimates or noncancer toxicity values arise when the evidence base includes a well-conducted study (overall medium or high confidence for the outcome); in such cases, quantitative analyses may be useful for some purposes (e.g., providing a sense of the magnitude and uncertainty of potential risks, ranking potential hazards, or setting research priorities) (U.S. EPA, 2005a).

9.2. SELECTING STUDIES FOR DOSE-RESPONSE ASSESSMENT

9.2.1. Hazard and MOA Considerations for Dose Response

The assessment presents a summary of hazard identification conclusions to transition to dose-response considerations, highlighting (1) information used to inform the selection of outcomes or broader health effect categories for which toxicity values will be derived, (2) whether toxicity values can be derived to protect specific populations or life stages, (3) how dose-response modeling will be informed by pharmacokinetic information, and (4) the identification of biologically based BMR levels.
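As a rough illustration of how the hazard conclusions feed this transition, the general screening rule stated in Section 9.1 for noncancer outcomes can be written as a small lookup. This is a simplified sketch: the protocol describes these as general practice ("generally"), not fixed rules, and the enum, function name, and return labels below are hypothetical.

```python
from enum import Enum


class Judgment(Enum):
    """The five evidence integration judgment levels named in Table 8-7."""
    DEMONSTRATES = "evidence demonstrates"
    INDICATES = "evidence indicates (likely)"
    SUGGESTS = "evidence suggests"
    INADEQUATE = "evidence inadequate"
    NO_EFFECT = "strong evidence supports no effect"


def dose_response_status(judgment: Judgment,
                         has_medium_or_high_confidence_study: bool = False) -> str:
    """Return a coarse label for whether dose-response assessment would proceed.

    Simplification: real decisions involve database- and chemical-specific
    judgment that a lookup like this cannot capture.
    """
    if judgment in (Judgment.DEMONSTRATES, Judgment.INDICATES):
        return "generally conducted"
    if judgment is Judgment.SUGGESTS and has_medium_or_high_confidence_study:
        # Suggestive evidence plus a well-conducted study: quantitative
        # analysis may still be useful for some purposes.
        return "case-by-case"
    return "generally not conducted"


for level in Judgment:
    print(level.value, "->", dose_response_status(level))
```

The boolean flag stands in for the "well-conducted study (overall medium or high confidence for the outcome)" condition; it affects only the "evidence suggests" level.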
The pool of outcomes and study-specific endpoints is discussed to identify which categories of effects and study designs are considered the strongest and most appropriate for quantitative assessment of a given health effect, particularly among the studies that exemplify the study attributes summarized in Table 9-1. Also considered is whether there are opportunities for quantitative evidence integration. Examples of quantitative integration, from simplest to more complex, include (1) combining results for an outcome across sex (within a study); (2) characterizing overall toxicity, as in combining effects that comprise a syndrome, or occur on a continuum (e.g., precursors and eventual overt toxicity, benign tumors that progress to malignant tumors); and (3) conducting a meta-analysis or meta-regression of all studies addressing a category of important health effects.

Some studies that are used qualitatively for hazard identification may or may not be useful quantitatively for dose-response assessment due to such factors as the lack of quantitative measures of exposure or lack of variability measures for response data. If the needed information cannot be located, semiquantitative analysis may be feasible (e.g., via NOAEL/LOAEL). In the draft and final assessments, specific endpoints considered for dose-response are summarized in a tabular format that includes rationales for decisions to proceed (or not) for POD derivation (see Table 9-2 for example format).

In addition, mechanistic evidence that influences the dose-response analyses is highlighted, for example, evidence related to susceptibility or potential shape of the dose-response curve (i.e., linear, nonlinear, or threshold model).
The mode(s) of action are summarized, including any interactions between them relevant to understanding overall risk. For cancer dose-response, the relevant biological considerations are:

• Is there evidence for direct mutagenicity?
• Does tumor latency decrease with increasing exposure?
• If there are multiple tumor types, which cancers have a longer latency period?
• Are incidence data available (incidence data are preferred to mortality data)?
• Were there different background incidences in different (geographic) populations?
• While benign and malignant tumors of the same cell of origin are generally evaluated together, was there an increase only in malignant tumors?

Table 9-1. Attributes used to evaluate studies for derivation of toxicity values (in addition to the health effect category-specific evidence integration judgment)

| Study attributes | Considerations |
|---|---|
| Study confidence | High or medium confidence studies are highly preferred over low confidence studies. The available high and medium confidence studies are further differentiated based on the study attributes below as well as a reconsideration of the specific limitations identified and their potential impact on dose-response analyses. |
| Rationale for choice of species | Human studies: Human data are preferred over animal data to eliminate interspecies extrapolation uncertainties (e.g., in pharmacodynamics, relevance of specific health outcomes to humans). Animal studies: Animal studies provide supporting evidence when adequate human studies are available and are considered principal studies when adequate human studies are not available. For some hazards, studies of particular animal species known to respond similarly to humans would be preferred over studies of other species. |
| Relevance of exposure paradigm: exposure route | Human studies: Studies involving human environmental exposures (oral, inhalation). Animal studies: Studies by a route of administration relevant to human environmental exposure are preferred. A validated pharmacokinetic or PBPK model can also be used to extrapolate across exposure routes. |
| Relevance of exposure paradigm: exposure durations | When developing a chronic toxicity value, chronic or subchronic studies are preferred over studies of acute exposure durations. Exceptions exist, such as when a susceptible population or life stage is more sensitive in a particular time window (e.g., developmental exposure). |
| Relevance of exposure paradigm: exposure levels | Exposures near the range of typical environmental human exposures are preferred. Studies with a broad exposure range and multiple exposure levels are preferred to the extent that they can provide information about the shape of the exposure-response relationship [see the EPA Benchmark Dose Technical Guidance (U.S. EPA, 2012b), Section 2.1.1] and facilitate extrapolation to more relevant (generally lower) exposures. |
| Subject selection | Studies that provide risk estimates in the most susceptible groups are preferred. Attempts are made to highlight where it might be possible to develop separate risk estimates for a specific population or life stage or to determine whether evidence is available to select a data-derived uncertainty factor (UF). |
| Controls for possible confoundingᵃ | Studies with a design (e.g., matching procedures, blocking) or analysis (e.g., covariates or other procedures for statistical adjustment) that adequately address the relevant sources of potential critical confounding for a given outcome are preferred. |
| Measurement of exposure | Human studies: Studies that can reliably distinguish between levels of exposure in a time window considered most relevant for development of a causal effect are preferred. Exposure assessment methods that provide measurements at the level of the individual and that reduce measurement error are preferred. Measurements of exposure should not be influenced by knowledge of health outcome status. Animal studies: Studies providing actual measurements of exposure (e.g., analytical inhalation concentrations vs. target concentrations) are preferred. Relevant internal dose measures may facilitate extrapolation to humans, as would availability of a suitable animal PBPK model in conjunction with an animal study reported in terms of administered exposure. |
| Measurement of health outcome(s) | Studies that can reliably distinguish the presence or absence (or degree of severity) of the outcome are preferred. Outcome ascertainment methods using generally accepted or standardized approaches are preferred. Studies with individual data are preferred in general, for example, to characterize experimental variability more realistically or to characterize the overall incidence of individuals affected by related outcomes (e.g., phthalate syndrome). Among several relevant health outcomes, preference is generally given to those with greater biological significance. When there are multiple endpoints for an organ/system, characterizing the overall impact on this organ/system is considered. For example, if there are multiple histopathological alterations relevant to liver function changes, liver necrosis may be selected as the most representative endpoint to consider for dose-response analysis. For cancer types, consideration is given to the overall risk of multiple types of tumors. Multiple tumor types (if applicable) are discussed, and a rationale is given for any grouping. |
| Study size and design | Preference is given to studies using designs reasonably expected to have power to detect responses of suitable magnitude.ᵇ This does not mean that studies with substantial responses but low power would be ignored, but that they should be interpreted in light of a confidence interval or variance for the response. Studies that address changes in the number at risk (through decreased survival, loss to follow-up) are preferred. |

ᵃ An exposure or other variable that is associated with both exposure and outcome but is not an intermediary between the two.
ᵇ Power is an attribute of the design and population parameters, based on a concept of repeatedly sampling a population; it cannot be inferred post hoc using data from one experiment (Hoenig and Heisey, 2001).

Table 9-2. Example table used in the assessment to show endpoint consideration judgments for POD derivation

| Endpoint | Study reference/confidence | Exposure route and duration | Human population or strain/species | Sexes studied | POD derivation | Rationale |
|---|---|---|---|---|---|---|
| Endpoint 1 | Study citation and confidence (endpoint-specific level) | e.g., Gestational (route) | e.g., Wistar rats | Males, females, or both | ✓ | e.g., Exposure-related increase |
| Endpoint 2 | Study citation and confidence (endpoint-specific level) | e.g., Gestational (route) | e.g., Sprague-Dawley rats | Males, females, or both | X | e.g., No exposure-related effect; response not considered biologically significant (<5%) |
| Endpoint 3 | Study citation and confidence (endpoint-specific level) | e.g., Ongoing, measured during gestation | e.g., Children aged 7 yr | Both males and females | ✓ | e.g., Consistent associations across studies, minimal concerns for exposure measurement |

Table 9-3. Specific example of presenting endpoints considered for dose-response modeling and derivation of points of departure

Endocrine Effects (hazard judgment of evidence indicates [likely])

| Endpoint | Study reference/confidence | Exposure route and duration | Human population or test species and strain | Lifestage and sex | POD derivation | Rationale |
|---|---|---|---|---|---|---|
| Decreased serum free and total T4 | NTP (2018); high confidence | Gavage, 28 d | S-D rat | Adult female | Yes ✓ | Dose-dependent effects in free and total T4 in females and free T4 in males; large magnitude of effect in both sexes (91% reduction in free T4 in males at low dose where body weight unaffected, and 36%-53% reduction in free and total T4 in females at >3.12 mg/kg-d); effects in males were not prioritized due to elevated weight loss at higher doses. |
| Decreased serum free and total T4 | NTP (2018); high confidence | Gavage, 28 d | S-D rat | Adult male | No, X | (same rationale as the row above) |
| Add a second endpoint, maybe not modeled due to large insensitivity vs. T4 | | | | Adult males and females | No, X | |

9.3. CONDUCTING DOSE-RESPONSE ASSESSMENTS

EPA uses a two-step approach for dose-response assessment that distinguishes analysis of the dose-response data in the range of observation from any inferences about responses at lower, generally more environmentally relevant, exposure levels (U.S. EPA, 2012b; U.S. EPA, 2005a,
Section 3):

1) Within the observed dose range, the preferred approach is to use dose-response modeling to incorporate as much of the data set as possible into the analysis for the purpose of deriving a POD (see Section 9.3.1 for more details).
2) Derivation of cancer risk estimates or reference values nearly always involves extrapolation to exposures lower than the POD; these extrapolations are described in more detail in Sections 9.3.2 and 9.3.3, respectively.

When sufficient and appropriate human data and laboratory animal data are both available for the same outcome, human data are generally preferred for the dose-response assessment because their use eliminates the need to perform interspecies extrapolations.

For noncancer analyses, IRIS assessments typically derive a candidate value from each suitable data set, whether human or animal. Evaluating these candidate values grouped within a particular organ/system yields a single organ/system-specific reference value for each organ/system under consideration. Next, evaluation of these organ/system-specific reference values results in the selection of a single overall reference value to cover all health outcomes across all organs/systems. While this overall reference value is the focus of the assessment, the organ/system-specific reference values can be useful for subsequent cumulative risk assessments that consider the combined effect of multiple agents acting at a common organ/system.

For cancer analyses, if there are multiple tumor types in a study population (human or animal), final cancer risk estimates will typically address overall cancer risk.

9.3.1. Dose-Response Analysis in the Range of Observation

For conducting a dose-response assessment, pharmacodynamic ("biologically based") modeling can be used when there are sufficient data to ascertain the mode of action and quantitatively support model parameters that represent rates and other quantities associated with the key precursor events of the modes of action.
When pharmacodynamic modeling is not available to assess health effects associated with exposure to ethylbenzene, empirical dose-response modeling is used to fit the data (on the apical outcomes or key precursor events) in the range of observation. For empirical dose-response modeling, EPA has developed a standard set of models (https://www.epa.gov/bmds) that can be applied to typical dichotomous and continuous data sets, including those that are nonlinear. In situations where there are alternative models with significant biological support, the users of the assessment can be informed by the presentation of these alternatives along with the models' strengths and uncertainties. EPA has developed guidelines on modeling dose-response data, assessing model fit, selecting suitable models, and reporting modeling results [see the EPA Benchmark Dose Technical Guidance (U.S. EPA, 2012b)]. The U.S. EPA Benchmark Dose Software (BMDS) is designed to model dose-response datasets in accordance with the EPA Benchmark Dose Technical Guidance (U.S. EPA, 2012b). For noncancer (and nonlinear cancer) datasets, a BMDL is computed from a model selected from the BMDS suite of models using statistical and graphical criteria. Linear analysis of cancer datasets is generally based on the Multistage model, with the degree selected following a U.S. EPA Statistical Workgroup technical memo available on the BMDS website (https://cfpub.epa.gov/ncea/bmds/recordisplay.cfm?deid=308382). Modeling of cancer data may in some cases involve additional, specialized methods, particularly for multiple tumors or early removal from observation (due to death or morbidity).
Additional judgments or alternative analyses may be used if initial modeling procedures fail to yield results in reasonable agreement with the data. For example, modeling may be restricted to the lower doses, especially if there is competing toxicity at higher doses.

For noncancer (and nonlinear cancer) datasets, EPA recommends (1) application of a preferred set of models that use maximum likelihood estimation (MLE) methods (default models in BMDS) and (2) selection of a POD from a single model based on criteria designed to limit model selection subjectivity (implemented automatically in BMDS version 3 and higher). For the linear analysis of cancer datasets, EPA recommends (1) application of the Multistage MLE model; (2) selection of a single Multistage degree; and (3) in cases where tumors are observed in multiple organ systems, use of a multitumor model (i.e., MS-Combo) that appropriately estimates combined tumor risk (both (2) and (3) are available in BMDS).⁹ Version 3.2 and higher of BMDS also provides an alternative modeling approach that uses Bayesian model averaging for dichotomous data (dichotomous model averaging [DMA]). EPA makes DMA available as an alternative approach but has not yet finalized guidelines for its use.

For each modeled dataset for an outcome, a POD from the observed data should be estimated to mark the beginning of extrapolation to lower doses. The POD is an estimated dose (expressed in human equivalent terms) near the lower end of the observed range without significant extrapolation to lower doses. For linear extrapolation of cancer risk, the POD is used to calculate an OSF or IUR, and for nonlinear extrapolation, the POD is used in calculating an RfD or RfC. The selection of the response level at which the POD is calculated is guided by the severity of the endpoint.
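To make the relationship between a fitted model, the response level, and the POD concrete, the following minimal sketch works through a single dichotomous model. It is not BMDS and does not reproduce its model suite or selection criteria. It assumes a quantal-linear form, P(d) = g + (1 - g)(1 - exp(-b d)), with hypothetical parameter values, and uses the standard definition of extra risk, (P(d) - P(0)) / (1 - P(0)). The slope calculation shows the usual straight-line extrapolation from the POD to the origin (response level divided by POD). Obtaining the BMDL itself requires a confidence-limit calculation that this sketch does not attempt, so the BMDL below is simply an assumed input.

```python
import math


def extra_risk(p_dose: float, p_background: float) -> float:
    """Extra risk: added risk divided by the fraction not responding at background."""
    return (p_dose - p_background) / (1.0 - p_background)


def quantal_linear(dose: float, background: float, slope: float) -> float:
    """Response probability under P(d) = g + (1 - g) * (1 - exp(-b * d))."""
    return background + (1.0 - background) * (1.0 - math.exp(-slope * dose))


def bmd_quantal_linear(bmr: float, slope: float) -> float:
    """Dose at which extra risk equals bmr (closed form for this model only)."""
    return -math.log(1.0 - bmr) / slope


def linear_slope_from_pod(bmr: float, pod: float) -> float:
    """Slope of the straight line from the POD to the origin: bmr / pod."""
    return bmr / pod


if __name__ == "__main__":
    g, b = 0.05, 0.02          # hypothetical fitted parameters (dose in mg/kg-day)
    bmr = 0.10                 # assumed response level: 10% extra risk
    bmd = bmd_quantal_linear(bmr, b)
    check = extra_risk(quantal_linear(bmd, g, b), g)
    print(f"BMD at {bmr:.0%} extra risk: {bmd:.3f} mg/kg-day (extra risk check: {check:.3f})")

    assumed_bmdl = 4.0         # hypothetical lower bound supplied by the user
    print(f"Linear slope from POD: {linear_slope_from_pod(bmr, assumed_bmdl):.4f} per mg/kg-day")
```

For this model the background term cancels out of the extra risk, which is why the BMD depends only on the slope parameter and the chosen response level.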
If linear extrapolation is used, selection of a response level corresponding to the POD is not highly influential, so standard values near the low end of the observable range are generally used (for example, 10% extra risk for cancer bioassay data, 1% for epidemiologic data, lower for rare cancers). Nonlinear approaches take both statistical and biologic considerations into account. For dichotomous data, a response level of 10% extra risk is generally used for minimally adverse effects and 5% or lower for more severe effects. For continuous data, a response level is ideally based on an established definition of biologic significance. In the absence of such a definition, one control standard deviation from the control mean is often used for minimally adverse effects and 1/2 standard deviation for more severe effects. The POD is the 95% lower bound on the dose associated with the selected response level.

⁹ The Multistage degree selection process outlined in the memo is auto-implemented in the BMDS multitumor model, which can be run on one or more tumor data sets, but only the noncancer model selection process is auto-implemented for individual Multistage model runs in the current version, BMDS 3.2.

EPA has developed standard approaches for determining the relevant dose to be used in the dose-response modeling in the absence of appropriate pharmacokinetic modeling. These standard approaches also facilitate comparison across exposure patterns and species:

• Intermittent study exposures are standardized to a daily average over the duration of exposure. For chronic effects, daily exposures are averaged over the lifespan.
Exposures during a critical period, however, are not averaged over a longer duration (U.S. EPA, 2005a, Section 3.1.1; U.S. EPA, 1991a, Section 3.2). Note that this will typically be done after modeling because the conversion is linear.
• Doses are standardized to equivalent human terms to facilitate comparison of results from different species. Oral doses are scaled allometrically using mg/kg^(3/4)-day as the equivalent dose metric across species. Allometric scaling pertains to equivalence across species, not across life stages, and is not used to scale doses from adult humans or mature animals to infants or children (U.S. EPA, 2011; U.S. EPA, 2005a, Section 3.1.3). Inhalation exposures are scaled using dosimetry models that apply species-specific physiologic and anatomic factors and consider whether the effect occurs at the site of first contact or after systemic circulation (U.S. EPA, 2012a, 1994, Section 3).
• It can be informative to convert doses across exposure routes. If this is done, the assessment describes the underlying data, algorithms, and assumptions (U.S. EPA, 2005a, Section 3.1.4).
• In the absence of study-specific data on, for example, intake rates or body weight, EPA has developed recommended values for use in dose-response analysis (U.S. EPA, 1988).
• The preferred approach for dosimetry extrapolation from animals to humans is through PBPK modeling.
• Briefly, PBPK model simulations can be used to estimate internal dose metrics (e.g., ethylbenzene in blood or its oxidative metabolite produced in the liver) corresponding to the applied doses for each experimental animal bioassay. By simulating the exposure scenario for each toxicity study (e.g., 6 hours/day, 5 days/week inhalation exposure), the resulting internal metric effectively accounts for the difference between the study's exposure pattern and a nominal 24 hours/day, 7 days/week exposure.
The set of internal dose metrics for each toxicity study and endpoint can then be used in dose-response analysis to identify a BMDL or other POD for individual animal toxicity studies. In this assessment, the internal dose metric is either the tissue-specific rate of oxidative metabolism or a daily average blood concentration of ethylbenzene. The human version of the PBPK model can then be used to estimate the exposure concentration in air that, given continuous (24 hours/day, 7 days/week) inhalation exposure, would result in the aforementioned internal dose PODs. Any remaining uncertainty factors, including the factor of 10 for human interindividual variability (UFH), will then be applied for derivation of the HECs.
• If needed, a similar approach can be applied for oral-to-inhalation route extrapolation for endpoints where toxicity data are available from oral dosimetry studies but not from inhalation.

9.3.2. Extrapolation: Slope Factors and Unit Risk

An OSF or IUR facilitates estimation of human cancer risks when low-dose linear extrapolation for cancer effects is supported, particularly for chemicals with direct mutagenic activity or those for which the data indicate a linear component below the POD. Low-dose linear extrapolation is also used as a default when the data are insufficient to establish the mode of action (U.S. EPA, 2005a). If data are sufficient to ascertain one or more modes of action consistent with low-dose nonlinearity, or to support their biological plausibility, low-dose extrapolation may use the reference value approach when suitable data are available (U.S. EPA, 2005a).

9.3.3.
Extrapolation: Reference Values

Reference value derivation is EPA's most frequently used type of nonlinear extrapolation method. Although it is most commonly used for noncancer effects, this approach is also used for cancer effects if there are sufficient data to ascertain the MOA and conclude that it is not linear at low doses. For these cases, reference values for each relevant route of exposure are developed following EPA's established practices (U.S. EPA, 2005a, Section 3.3.4). In general, it has been the IRIS Program's preference to base cancer reference values on key precursor events in the MOA that are necessary for tumor formation rather than on the incidence of tumors themselves. For example, see the ethylene glycol monobutyl ether (EGBE) assessment, where the cancer RfD was based on hemosiderin deposition in the liver rather than liver tumor incidence (U.S. EPA, 2010b).

For each data set selected for reference value derivation, reference values are estimated by applying relevant adjustments to the PODs to account for the conditions of the reference value definition: human variation, extrapolation from animals to humans, extrapolation to chronic exposure duration, and extrapolation to a minimal level of risk (if not observed in the data set). Increasingly, data-based adjustments (U.S. EPA, 2014a) and Bayesian methods for characterizing population variability (NRC, 2014) are feasible and may be distinguished from the UF considerations outlined below. The assessment will discuss the scientific bases for estimating these data-based adjustments and UFs:

• Animal-to-human extrapolation: If animal results are used to make inferences about humans, the reference value derivation incorporates the potential for cross-species differences, which may arise from differences in pharmacokinetics or pharmacodynamics. If available, a biologically based model that adjusts fully for pharmacokinetic and pharmacodynamic differences across species may be used.
Otherwise, the POD is standardized to equivalent human terms or is based on pharmacokinetic or dosimetry modeling, which may range from detailed chemical-specific to default approaches (U.S. EPA, 2014a, 2011), and a factor of 10^(1/2) (rounded to 3) is applied to account for the remaining uncertainty involving pharmacokinetic and pharmacodynamic differences.
• Human variation: The assessment accounts for variation in susceptibility across the human population and the possibility that the available data may not represent individuals who are most susceptible to the effect, by using a data-based adjustment or UF or a combination of the two. Where appropriate data or models for the effect or for characterizing the internal dose are available, the potential for data-based adjustments for pharmacodynamics or pharmacokinetics is considered 9,10 (U.S. EPA, 2014a, 2002). When sufficient data are available, an intraspecies UF either less than or greater than 10-fold may be justified (U.S. EPA, 2002). This factor may be reduced if the POD is derived from or adjusted specifically for susceptible individuals [not for a general population that includes both susceptible and nonsusceptible individuals; see U.S. EPA (2002), Section 4.4.5; U.S. EPA (1998), Section 4.2; U.S. EPA (1996), Section 4; U.S. EPA (1994), Section 4.3.9.1; U.S. EPA (1991a), Section 3.4]. When the use of such data or modeling is not supported, a UF with a default value of 10 is considered.
• LOAEL to NOAEL: If a POD is based on a LOAEL, the assessment includes an adjustment to an exposure level where such effects are not expected. This can be a matter of great uncertainty if there is no evidence available at lower exposures.
A factor of 3 or 10 is generally applied to extrapolate to a lower exposure expected to be without appreciable effects. A factor other than 10 may be used depending on the magnitude and nature of the response and the shape of the dose-response curve (U.S. EPA, 2002, 1998, 1996, 1994, 1991a).
• Subchronic-to-chronic exposure: When using subchronic studies to make inferences about chronic/lifetime exposure, the assessment considers whether lifetime exposure could have effects at lower levels of exposure. A factor of up to 10 may be applied to the POD, depending on the duration of the studies and the nature of the response (U.S. EPA, 2002, 1998, 1994).
• Database deficiencies: In addition to the adjustments above, if database deficiencies raise concern that further studies might identify a more sensitive effect, organ system, or life stage, the assessment may apply a database UF (U.S. EPA, 2002, 1998, 1996, 1994, 1991a). The size of the factor depends on the nature of the database deficiency. For example, EPA typically follows the recommendation that a factor of 10 be applied if both a prenatal toxicity study and a two-generation reproduction study are missing and a factor of 10^(1/2) (i.e., 3) if either one or the other is missing (U.S. EPA, 2002, Section 4.4.5).

The POD for a reference value is divided by the product of these factors. U.S. EPA (2002, Section 4.4.5) states that any composite factor that exceeds 3,000 represents excessive uncertainty and recommends against relying on the associated reference value.

10. PROTOCOL HISTORY
10-1 DRAFT-DO NOT CITE OR QUOTE ------- 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 Protocol for the Ethylbenzene IRIS Assessment REFERENCES Adgate. TL: Church. TR: Ryan. AD: Ramachandran. G: Fredrickson. AL: Stock. TH: Morandi. MT: Sexton. K. (2004). Outdoor, indoor, and personal exposure to VOCs in children. Environ Health Perspect 112: 1386-1392. http://dx.doi.org/10.1289/ehp.71Q7. Aguilera. I: Sunver. 1: Fernandez-Patier. R: Hoek. G: Aguirre-Alfaro. A: Meliefste. K: Bomboi- Mingarro. MT: Nieuwenhuiisen. MT: Herce-Garraleta. D: Brunekreef. B. (2008). Estimation of outdoor NOx, N02, and BTEX exposure in a cohort of pregnant women using land use regression modeling. Environ Sci Technol 42: 815-821. http://dx.doi.org/10.1021/es0715492. ANL (Argonne National Laboratory). (2021). Active thermochemical tables (ATcT), version 1.122d: Ethylbenzene enthalpy of formation. Available online at https://atctanl.gov/Thermochemical%20Data/version%201.122d/species/7species numb er=1092 (accessed December 16, 2021). Ashley. PL: Prah. ID. (1997). Time dependence of blood concentrations during and after exposure to a mixture of volatile organic compounds. Arch Environ Health 52: 26-33. http://dx.doi.org/10.1080/00039899709603796. ATSDR (Agency for Toxic Substances and Disease Registry). (2010). Toxicological profile for ethylbenzene [ATSDR Tox Profile], (PB2010100004). Atlanta, GA: U.S. Department of Health and Human Services, Public Health Service. http://www.atsdr.cdc.gov/ToxProfiles/tp.asp?id=383&tid=66. Basagana. X: Aguilera. I: Rivera. M: Agis. D: Foraster. M: Marrugat. 1: Elosua. R: Kiinzli. N. (2013). Measurement error in epidemiologic studies of air pollution based on land-use regression models. Am J Epidemiol 178: 1342-1346. http: //dx.doi.org/10.1093 /aie/kwtl27. Cannella. W. (2007). Xylenes and ethylbenzene [Encyclopedia], In Kirk-Othmer Encyclopedia of Chemical Technology. 
Michigan: John Wiley & Sons. Capella. KM: Roland. K: Geldner. N: Rev deCastro. B: De Testis. VR: van Bemmel. D: Blount. BC. (2019). Ethylbenzene and styrene exposure in the United States based on urinary mandelic acid and phenylglyoxylic acid: NHANES 2005-2006 and 2011-2012. Environ Res 171: 101- 110. http://dx.doi.Org/10.1016/i.envres.2019.01.018. Clayton. GD: Clayton. FE. (1981). "Alcohols". In Patty's industrial hygiene and toxicology: Toxicology. New York, NY: John Wiley & Sons. Dickersin. K. (1990). The existence of publication bias and risk factors for its occurrence. JAMA 263: 1385-1389. EFSA (European Food Safety Authority). (2017). Guidance on the use of the weight of evidence approach in scientific assessments. EFSA J15: 1-69. http://dx.doi.Org/10.2903/i.efsa.2017.4971. Heinrich-Ramm. R: Takubowski. M: Heinzow. B: Christensen. TM: Olsen. E: Hertel. 0. (2000). Biological monitoring for exposure to volatile organic compounds (VOCs). Pure Appl Chem 72: 385-436. http://dx.doi.org/10.1351/pac200072030385. Higgins. TPT: Thomas. 1: Chandler. 1: Cumpston. M: Li. T: Page. MT: Welch. VA. (2022). Cochrane handbook for systematic reviews of interventions version 6.3. Higgins, JPT; Thomas, J; Chandler, J; Cumpston, M; Li, T; Page, MJ; Welch, VA. http://www.training.cochrane.org/handbook. This document is a draft for review purposes only and does not constitute Agency policy. R-l DRAFT-DO NOT CITE OR QUOTE ------- 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 Protocol for the Ethylbenzene IRIS Assessment Hill. AB. (1965). The environment and disease: Association or causation? Proc R Soc Med 58: 295- 300. Hoenig. 1M: Heisev. DM. (2001). The abuse of power: The pervasive fallacy of power calculations for data analysis. Am Stat 55: 19-24. Howard. BE: Phillips. 1: Miller. K: Tandon. A: Mav. D: Shah. MR: Holmgren. S: Pelch. KE: Walker. V: Roonev. AA: Macleod. M: Shah. RR: Thayer. K. 
(2016). SWIFT-Review: A text-mining workbench for systematic review. Syst Rev 5: 87. http://dx.doi.org/10.1186/sl3643-016- 0263-z. IPCS (International Programme on Chemical Safety). (2012). Harmonization project document no. 10: Guidance for immunotoxicity risk assessment for chemicals. (Harmonization Project Document No. 10). Geneva, Switzerland: World Health Organization. http://www.inchem.org/documents/harmproi/harmproi/harmproilO.pdf. Tia. C: Batterman. SA: Relvea. GE. (2012). Variability of indoor and outdoor V0C measurements: an analysis using variance components. Environ Pollut 169: 152-159. http://dx.doi.Org/10.1016/i.envpol.2011.09.024. Kim. YM: Harrad. S: Harrison. RM. (2002). Levels and sources of personal inhalation exposure to volatile organic compounds. Environ Sci Technol 36: 5405-5410. http://dx.doi.org/10.1021/esQ10148y. Konkle. SL: Zierold. KM: Taylor. KC: Riggs. DW: Bhatnagar. A. (2020). National secular trends in ambient air volatile organic compound levels and biomarkers of exposure in the United States. Environ Res 182: 108991. http://dx.doi.Org/10.1016/i.envres.2019.108991. Lin. YS: Kupper. LL: Rappaport. SM. (2005). Air samples versus biomarkers for epidemiology. Occup Environ Med 62: 750-760. http://dx.doi.org/10.1136/oem.2004.0131Q2. Mcdonald. BC: de Gouw. TA: Gilman. IB: Tathar. SH: Akherati. A: Cappa. CD: Timenez. TL: Lee-Tavlor. I: Hayes. PL: Mckeen. SA: Cui. YY: Kim. SW: Gentner. PR: Isaacman-Vanwertz. G: Goldstein. AH: Harlev. RA: Frost. GT: Roberts. TM: Rverson. TB: Trainer. M. (2018). Volatile chemical products emerging as largest petrochemical source of urban organic emissions. Science 359: 760-764. http://dx.doi.org/10.1126/science.aaq0524. Mukeriee. S: Smith. LA: Tohnson. MM: Neas. LM: Stallings. CA. (2009). Spatial analysis and land use regression of VOCs and N0(2) from school-based urban air monitoring in Detroit/Dearborn, USA. Sci Total Environ 407: 4642-4651. http://dx.doi.Org/10.1016/i.scitotenv.2009.04.030. 
NASEM (National Academies of Sciences, Engineering, and Medicine). (2021). Review of U.S. EPA's ORD staff handbook for developing IRIS assessments: 2020 version. Washington, DC: National Academies Press. http://dx.doi.org/10.17226/26289.
Nong, A; Charest-Tardif, G; Tardif, R; Lewis, DF; Sweeney, LM; Gargas, ML; Krishnan, K. (2007). Physiologically based modeling of the inhalation pharmacokinetics of ethylbenzene in B6C3F1 mice. J Toxicol Environ Health A 70: 1838-1848. http://dx.doi.org/10.1080/15287390701459239.
NRC (National Research Council). (2014). Review of EPA's Integrated Risk Information System (IRIS) process. Washington, DC: The National Academies Press. http://dx.doi.org/10.17226/18764.
NTP (National Toxicology Program). (2015). Handbook for conducting a literature-based health assessment using OHAT approach for systematic review and evidence integration. Research Triangle Park, NC: U.S. Department of Health and Human Services, National Toxicology Program, Office of Health Assessment and Translation. https://ntp.niehs.nih.gov/ntp/ohat/pubs/handbookjan2015_508.pdf.
NTP (National Toxicology Program). (2018). 28-day evaluation of the toxicity (C04049) of perfluorononaoic acid (PFNA) (375-95-1) on Harlan Sprague-Dawley rats exposed via gavage [NTP]. http://dx.doi.org/10.22427/NTP-DATA-002-02655-0003-0000-3.
NTP (National Toxicology Program). (2019). Handbook for conducting a literature-based health assessment using OHAT approach for systematic review and evidence integration. Research Triangle, NC: National Institute of Environmental Health Sciences. https://ntp.niehs.nih.gov/ntp/ohat/pubs/handbookmarch2019_508.pdf.
Ransley, DL. (1984). Xylenes and ethylbenzene. In HF Mark; DF Othmer; CG Overberger; GT Seaborg; M Grayson (Eds.), Kirk-Othmer encyclopedia of chemical technology (Vol 24) (3rd ed., pp. 709-744). New York, NY: John Wiley & Sons.
Schünemann, H; Brozek, J; Guyatt, G; Oxman, A. (2013). GRADE handbook. Available online at https://gdt.gradepro.org/app/handbook/handbook.html (accessed April 22, 2022).
Sexton, K; Adgate, JL; Church, TR; Ashley, DL; Needham, LL; Ramachandran, G; Fredrickson, AL; Ryan, AD. (2005). Children's exposure to volatile organic compounds as determined by longitudinal measurements in blood. Environ Health Perspect 113: 342-349. http://dx.doi.org/10.1289/ehp.7412.
Sexton, K; Adgate, JL; Mongin, SJ; Pratt, GC; Ramachandran, G; Stock, TH; Morandi, MT. (2004a). Evaluating differences between measured personal exposures to volatile organic compounds and concentrations in outdoor and indoor air. Environ Sci Technol 38: 2593-2602. http://dx.doi.org/10.1021/es030607q.
Sexton, K; Adgate, JL; Ramachandran, G; Pratt, GC; Mongin, SJ; Stock, TH; Morandi, MT. (2004b). Comparison of personal, indoor, and outdoor exposures to hazardous air pollutants in three urban communities. Environ Sci Technol 38: 423-430. http://dx.doi.org/10.1021/es030319u.
Sexton, K; Mongin, SJ; Adgate, JL; Pratt, GC; Ramachandran, G; Stock, TH; Morandi, MT. (2007). Estimating volatile organic compound concentrations in selected microenvironments using time-activity and personal exposure data. J Toxicol Environ Health A 70: 465-476. http://dx.doi.org/10.1080/15287390600870858.
Smith, MT; Guyton, KZ; Gibbons, CF; Fritz, JM; Portier, CJ; Rusyn, I; DeMarini, DM; Caldwell, JC; Kavlock, RJ; Lambert, PF; Hecht, SS; Bucher, JR; Stewart, BW; Baan, RA; Cogliano, VJ; Straif, K. (2016). Key characteristics of carcinogens as a basis for organizing data on mechanisms of carcinogenesis [Review]. Environ Health Perspect 124: 713-721. http://dx.doi.org/10.1289/ehp.1509912.
Sterne, JAC; Hernán, MA; Reeves, BC; Savović, J; Berkman, ND; Viswanathan, M; Henry, D; Altman, DG; Ansari, MT; Boutron, I; Carpenter, JR; Chan, AW; Churchill, R; Deeks, JJ; Hróbjartsson, A; Kirkham, J; Jüni, P; Loke, YK; Pigott, TD; Ramsay, CR; Regidor, D; Rothstein, HR; Sandhu, L; Santaguida, PL; Schünemann, HJ; Shea, B; Shrier, I; Tugwell, P; Turner, L; Valentine, JC; Waddington, H; Waters, E; Wells, GA; Whiting, PF; Higgins, JPT. (2016). ROBINS-I: A tool for assessing risk of bias in non-randomised studies of interventions. BMJ 355: i4919. http://dx.doi.org/10.1136/bmj.i4919.
Su, FC; Mukherjee, B; Batterman, S. (2011). Trends of VOC exposures among a nationally representative sample: Analysis of the NHANES 1988 through 2004 data sets. Atmos Environ 45: 4858-4867. http://dx.doi.org/10.1016/j.atmosenv.2011.06.016.
Sweeney, LM; Kester, JE; Kirman, CR; Gentry, RP; Banton, MI; Bus, JS; Gargas, ML. (2015). Risk assessments for chronic exposure of children and prospective parents to ethylbenzene (CAS No. 100-41-4) [Review]. Crit Rev Toxicol 45: 662-726. http://dx.doi.org/10.3109/10408444.2015.1046157.
Thayer, KA; Shaffer, RM; Angrish, M; Arzuaga, X; Carlson, LM; Davis, A; Dishaw, L; Druwe, I; Gibbons, C; Glenn, B; Jones, R; Kaiser, JP; Keshava, C; Keshava, N; Kraft, A; Lizarraga, L; Markey, K; Persad, A; Radke, EG; ... Yost, E. (2022). Use of systematic evidence maps within the US environmental protection agency (EPA) integrated risk information system (IRIS) program: Advancements to date and looking ahead [Comment]. Environ Int 169: 107363. http://dx.doi.org/10.1016/j.envint.2022.107363.
U.S. EPA (U.S. Environmental Protection Agency). (1988).
Recommendations for and documentation of biological values for use in risk assessment [EPA Report]. (EPA600687008). Cincinnati, OH. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=34855.
U.S. EPA (U.S. Environmental Protection Agency). (1991a). Guidelines for developmental toxicity risk assessment. Fed Reg 56: 63798-63826.
U.S. EPA (U.S. Environmental Protection Agency). (1991b). Integrated Risk Information System (IRIS): Chemical assessment summary for ethylbenzene (CASRN 100-41-4) [EPA Report]. Washington, DC. http://www.epa.gov/iris/subst/0051.htm.
U.S. EPA (U.S. Environmental Protection Agency). (1994). Methods for derivation of inhalation reference concentrations and application of inhalation dosimetry [EPA Report]. (EPA600890066F). Research Triangle Park, NC. https://cfpub.epa.gov/ncea/risk/recordisplay.cfm?deid=71993&CFID=51174829&CFTOKEN=25006317.
U.S. EPA (U.S. Environmental Protection Agency). (1996). Guidelines for reproductive toxicity risk assessment (pp. 1-143). (EPA/630/R-96/009). Washington, DC: U.S. Environmental Protection Agency, Risk Assessment Forum. https://www.epa.gov/sites/production/files/2014-11/documents/guidelines_repro_toxicity.pdf.
U.S. EPA (U.S. Environmental Protection Agency). (1998). Guidelines for neurotoxicity risk assessment [EPA Report] (pp. 1-89). (ISSN 0097-6326 EISSN 2167-2520 EPA/630/R-95/001F). Washington, DC: U.S. Environmental Protection Agency, Risk Assessment Forum. http://www.epa.gov/risk/guidelines-neurotoxicity-risk-assessment.
U.S. EPA (U.S. Environmental Protection Agency). (2002). A review of the reference dose and reference concentration processes. (EPA630P02002F). Washington, DC. https://www.epa.gov/sites/production/files/2014-12/documents/rfd-final.pdf.
U.S. EPA (U.S. Environmental Protection Agency). (2005a). Guidelines for carcinogen risk assessment [EPA Report]. (EPA630P03001F). Washington, DC. https://www.epa.gov/sites/production/files/2013-09/documents/cancer_guidelines_final_3-25-05.pdf.
U.S. EPA (U.S. Environmental Protection Agency). (2005b). Supplemental guidance for assessing susceptibility from early-life exposure to carcinogens [EPA Report]. (EPA/630/R-03/003F). Washington, DC: U.S. Environmental Protection Agency, Risk Assessment Forum. https://www.epa.gov/risk/supplemental-guidance-assessing-susceptibility-early-life-exposure-carcinogens.
U.S. EPA (U.S. Environmental Protection Agency). (2010a). Comparison of 1999 model-predicted concentrations to monitored data. https://archive.epa.gov/airtoxics/nata1999/web/html/99compare.html.
U.S. EPA (U.S. Environmental Protection Agency). (2010b). Toxicological review of ethylene glycol monobutyl ether (EGBE) (CAS no. 111-76-2) in support of summary information on the integrated risk information system (IRIS), march 2010. (EPA/635/R-08/006F). https://iris.epa.gov/static/pdfs/0500tr.pdf.
U.S. EPA (U.S. Environmental Protection Agency). (2011). Recommended use of body weight 3/4 as the default method in derivation of the oral reference dose. (EPA100R110001). Washington, DC. https://www.epa.gov/sites/production/files/2013-09/documents/recommended-use-of-bw34.pdf.
U.S. EPA (U.S. Environmental Protection Agency). (2012a). Advances in inhalation gas dosimetry for derivation of a reference concentration (RfC) and use in risk assessment (pp. 1-140). (EPA/600/R-12/044). Washington, DC. https://cfpub.epa.gov/ncea/risk/recordisplay.cfm?deid=244650&CFID=50524762&CFTOKEN=17139189.
U.S. EPA (U.S. Environmental Protection Agency). (2012b). Benchmark dose technical guidance [EPA Report]. (EPA100R12001). Washington, DC: U.S. Environmental Protection Agency, Risk Assessment Forum. https://www.epa.gov/risk/benchmark-dose-technical-guidance.
U.S. EPA (U.S. Environmental Protection Agency). (2014a). Guidance for applying quantitative data to develop data-derived extrapolation factors for interspecies and intraspecies extrapolation [EPA Report]. (EPA/100/R-14/002F). Washington, DC: Risk Assessment Forum, Office of the Science Advisor. https://www.epa.gov/sites/production/files/2015-01/documents/ddef-final.pdf.
U.S. EPA (U.S. Environmental Protection Agency). (2014b). Scoping and problem formulation for the identification of potential health hazards for the Integrated Risk Information System (IRIS) toxicological review of ethylbenzene [CASRN 100-41-4] [EPA Report]. (EPA/635/R-14/198). Washington, DC. http://nepis.epa.gov/Exe/ZyPURL.cgi?Dockey=P100L2BQ.txt.
U.S. EPA (U.S. Environmental Protection Agency). (2015). Peer review handbook [EPA Report] (4th ed.). (EPA/100/B-15/001). Washington, DC: U.S. Environmental Protection Agency, Science Policy Council. https://www.epa.gov/osa/peer-review-handbook-4th-edition-2015.
U.S. EPA (U.S. Environmental Protection Agency). (2017a). Guidance to assist interested persons in developing and submitting draft risk evaluations under the Toxic Substances Control Act. (EPA/740/R17/001). Washington, DC: U.S. Environmental Protection Agency, Office of Chemical Safety and Pollution Prevention. https://www.epa.gov/sites/production/files/2017-06/documents/tsca_ra_guidance_final.pdf.
U.S. EPA (U.S. Environmental Protection Agency). (2017b). IRIS assessment plan for ethylbenzene [CASRN 100-41-4].
U.S. EPA (U.S. Environmental Protection Agency). (2018a). Chemistry Dashboard. Washington, DC. Retrieved from https://comptox.epa.gov/dashboard
U.S. EPA (U.S. Environmental Protection Agency). (2018b). An umbrella Quality Assurance Project Plan (QAPP) for PBPK models [EPA Report]. (ORD QAPP ID No: B-0030740-QP-1-1). Research Triangle Park, NC.
U.S. EPA (U.S. Environmental Protection Agency). (2019a). ChemView [Database]. Retrieved from https://chemview.epa.gov/chemview
U.S. EPA (U.S. Environmental Protection Agency). (2019b). CompTox Chemicals Dashboard [Database]. Research Triangle Park, NC. Retrieved from https://comptox.epa.gov/dashboard
U.S. EPA (U.S. Environmental Protection Agency). (2020a). ORD staff handbook for developing IRIS assessments (public comment draft) [EPA Report]. (EPA/600/R-20/137). Washington, DC: U.S. Environmental Protection Agency, Office of Research and Development, Center for Public Health and Environmental Assessment. https://cfpub.epa.gov/ncea/iris_drafts/recordisplay.cfm?deid=350086.
U.S. EPA (U.S. Environmental Protection Agency). (2020b). Umbrella quality assurance project plan (QAPP) for dosimetry and mechanism-based models. (EPA QAPP ID Number: L-CPAD-0032188-QP-1-2). Research Triangle Park, NC.
U.S. EPA (U.S. Environmental Protection Agency). (2021). CompTox chemicals dashboard. Washington, DC. Retrieved from https://comptox.epa.gov/dashboard
U.S. EPA (U.S. Environmental Protection Agency). (2022). ORD staff handbook for developing IRIS assessments [EPA Report]. (EPA 600/R-22/268). Washington, DC: U.S. Environmental Protection Agency, Office of Research and Development, Center for Public Health and Environmental Assessment. https://cfpub.epa.gov/ncea/iris_drafts/recordisplay.cfm?deid=356370.
Wallace, L; Nelson, W; Ziegenfus, R; Pellizzari, E; Michael, L; Whitmore, R; Zelon, H; Hartwell, T; Perritt, R; Westerdahl, D. (1991). The Los Angeles TEAM Study: personal exposures, indoor-outdoor air concentrations, and breath concentrations of 25 volatile organic compounds. J Expo Anal Environ Epidemiol 1: 157-192.
Wallace, LA; Pellizzari, ED; Hartwell, TD; Sparacino, C; Whitmore,
R; Sheldon, L; Zelon, H; Perritt, R. (1987). The TEAM study: Personal exposures to toxic substances in air, drinking water, and breath of 400 residents of New Jersey, North Carolina, and North Dakota. Environ Res 43: 290-307. http://dx.doi.org/10.1016/S0013-9351(87)80030-0.
Welch, VA; Fallon, KJ; Gelbke, HP. (2005). Ethylbenzene. In Ullmann's Encyclopedia of Industrial Chemistry. http://dx.doi.org/10.1002/14356007.a10_035.pub2.
Wolffe, TAM; Whaley, P; Halsall, C; Rooney, AA; Walker, VR. (2019). Systematic evidence maps as a novel tool to support evidence-based decision-making in chemicals policy and risk management. Environ Int 130: 104871. http://dx.doi.org/10.1016/j.envint.2019.05.065.

APPENDIX A. ELECTRONIC DATABASE SEARCH STRATEGIES

Table A-1. Database search strategy

Search: PubMed (chemical terms)

Search strategy: (100-41-4[rn] OR "ethylbenzene"[tw] OR "Ethylbenzol"[tw] OR "4-Ethylphenetole"[tw] OR "Ethyl(benzene-d5)"[tw] OR "Ethyl-1,1-d2 benzene-d5"[tw] OR "Ethyl, 2-phenyl-"[tw] OR "α-Methyltoluene"[tw] OR "Phenylurethane"[tw] OR "Ethyl-d5-benzene"[tw] OR "Ethylbenzene-d10"[tw] OR "NCI-C56393"[tw] OR "NSC 406903"[tw] OR "Phenylethane"[tw] OR "UNII-L5I45M5G0O"[tw] OR "Ethylbenzol"[tw] OR "Etilbenzene"[tw] OR "Etylobenzen"[tw] OR "HSDB 84"[tw] OR "EC 202-849-4"[tw] OR "EINECS 202-849-4"[tw] OR "Ethyl benzene"[tw] OR "Ethylbenzeen"[tw] OR "Aethylbenzol"[tw] OR "AI3-09057"[tw] OR "CCRIS 916"[tw] OR "DA0700000"[tw] OR "Phenylethane"[tw] OR "C004912"[tw] OR "ethyl-benzene"[tw]) NOT medline
Date and results: Date: 4/22/2019; Results: 2,765; Batch: 31018

Search strategy: (100-41-4[rn] OR "ethylbenzene"[tw] OR "Ethylbenzol"[tw] OR "4-Ethylphenetole"[tw] OR "Ethyl(benzene-d5)"[tw] OR "Ethyl-1,1-d2 benzene-d5"[tw] OR "Ethyl, 2-phenyl-"[tw] OR "α-Methyltoluene"[tw] OR "Phenylurethane"[tw] OR "Ethyl-d5-benzene"[tw] OR "Ethylbenzene-d10"[tw] OR "NCI-C56393"[tw] OR "NSC 406903"[tw] OR "Phenylethane"[tw] OR "UNII-L5I45M5G0O"[tw] OR "Ethylbenzol"[tw] OR "Etilbenzene"[tw] OR "Etylobenzen"[tw] OR "HSDB 84"[tw] OR "EC 202-849-4"[tw] OR "EINECS 202-849-4"[tw] OR "Ethyl benzene"[tw] OR "Ethylbenzeen"[tw] OR "Aethylbenzol"[tw] OR "AI3-09057"[tw] OR "CCRIS 916"[tw] OR "DA0700000"[tw] OR "Phenylethane"[tw] OR "C004912"[tw] OR "ethyl-benzene"[tw]) AND ("2019/04/01"[PDAT] : "3000"[PDAT]) NOT medline
Date and results: Date: 4/13/2020; Results: 180; Batch: 37652

Search strategy: (100-41-4[rn] OR "ethylbenzene"[tw] OR "Ethylbenzol"[tw] OR "4-Ethylphenetole"[tw] OR "Ethyl(benzene-d5)"[tw] OR "Ethyl-1,1-d2 benzene-d5"[tw] OR "Ethyl, 2-phenyl-"[tw] OR "α-Methyltoluene"[tw] OR "Phenylurethane"[tw] OR "Ethyl-d5-benzene"[tw] OR "Ethylbenzene-d10"[tw] OR "NCI-C56393"[tw] OR "NSC 406903"[tw] OR "Phenylethane"[tw] OR "UNII-L5I45M5G0O"[tw] OR "Ethylbenzol"[tw] OR "Etilbenzene"[tw] OR "Etylobenzen"[tw] OR "HSDB 84"[tw] OR "EC 202-849-4"[tw] OR "EINECS 202-849-4"[tw] OR "Ethyl benzene"[tw] OR "Ethylbenzeen"[tw] OR "Aethylbenzol"[tw] OR "AI3-09057"[tw] OR "CCRIS 916"[tw] OR "DA0700000"[tw] OR "Phenylethane"[tw] OR "C004912"[tw] OR "ethyl-benzene"[tw])
Date and results: Date: 11/13/2020; Results: 164

Search strategy: (100-41-4[rn] OR "ethylbenzene"[tw] OR "Ethylbenzol"[tw] OR "4-Ethylphenetole"[tw] OR "Ethyl(benzene-d5)"[tw] OR "Ethyl-1,1-d2 benzene-d5"[tw] OR "Ethyl, 2-phenyl-"[tw] OR "α-Methyltoluene"[tw] OR "Phenylurethane"[tw] OR "Ethyl-d5-benzene"[tw] OR "Ethylbenzene-d10"[tw] OR "NCI-C56393"[tw] OR "NSC 406903"[tw] OR "Phenylethane"[tw] OR "UNII-L5I45M5G0O"[tw] OR "Ethylbenzol"[tw] OR "Etilbenzene"[tw] OR "Etylobenzen"[tw] OR "HSDB 84"[tw] OR "EC 202-849-4"[tw] OR "EINECS 202-849-4"[tw] OR "Ethyl benzene"[tw] OR "Ethylbenzeen"[tw] OR "Aethylbenzol"[tw] OR "AI3-09057"[tw] OR "CCRIS 916"[tw] OR "DA0700000"[tw] OR "Phenylethane"[tw] OR "C004912"[tw] OR "ethyl-benzene"[tw]) AND (2020/11/01:3000[dp])
Date and results: Date: 1/21/2022; Results: 232; Batch: 46084

Search: Web of Science (chemical terms) (a)

Search strategy: TS=("100-41-4" OR "Benzene, ethyl-" OR "4-Ethylphenetole" OR "Ethyl(benzene-d5)" OR "Ethyl-1,1-d2 benzene-d5" OR "Ethyl, 2-phenyl-" OR "α-Methyltoluene" OR "Phenylurethane" OR "Ethyl-d5-benzene" OR "Ethylbenzene-d10" OR "NCI-C56393" OR "NSC 406903" OR "Phenylethane" OR "UNII-L5I45M5G0O" OR "Ethylbenzol" OR "Etilbenzene" OR "Etylobenzen" OR "HSDB 84" OR "EC 202-849-4" OR "EINECS 202-849-4" OR "Ethyl benzene" OR "Ethylbenzeen" OR "Aethylbenzol" OR "AI3-09057" OR "CCRIS 916" OR "DA0700000" OR "Phenylethane" OR "C004912" OR "ethyl-benzene")
Date and results: Date: 4/22/2019; Results: 1,585; Batch: 31051

Search strategy: TS=("100-41-4" OR "Benzene, ethyl-" OR "4-Ethylphenetole" OR "Ethyl(benzene-d5)" OR "Ethyl-1,1-d2 benzene-d5" OR "Ethyl, 2-phenyl-" OR "α-Methyltoluene" OR "Phenylurethane" OR "Ethyl-d5-benzene" OR "Ethylbenzene-d10" OR "NCI-C56393" OR "NSC 406903" OR "Phenylethane" OR "UNII-L5I45M5G0O" OR "Ethylbenzol" OR "Etilbenzene" OR "Etylobenzen" OR "HSDB 84" OR "EC 202-849-4" OR "EINECS 202-849-4" OR "Ethyl benzene" OR "Ethylbenzeen" OR "Aethylbenzol" OR "AI3-09057" OR "CCRIS 916" OR "DA0700000" OR "Phenylethane" OR "C004912" OR "ethyl-benzene") AND PY=(2019-2020)
Date and results: Date: 4/13/2020; Results: 73; Batch: 37653

Search strategy: TS=("100-41-4" OR "Benzene, ethyl-" OR "4-Ethylphenetole" OR "Ethyl(benzene-d5)" OR "Ethyl-1,1-d2 benzene-d5" OR "Ethyl, 2-phenyl-" OR "α-Methyltoluene" OR "Phenylurethane" OR "Ethyl-d5-benzene" OR "Ethylbenzene-d10" OR "NCI-C56393" OR "NSC 406903" OR "Phenylethane" OR "UNII-L5I45M5G0O" OR "Ethylbenzol" OR "Etilbenzene" OR "Etylobenzen" OR "HSDB 84" OR "EC 202-849-4" OR "EINECS 202-849-4" OR "Ethyl benzene" OR "Ethylbenzeen" OR "Aethylbenzol" OR "AI3-09057" OR "CCRIS 916" OR "DA0700000" OR "Phenylethane" OR "C004912" OR "ethyl-benzene")
Date and results: Date: 4/13/2020; Results: 50

Search strategy: (TI=("100-41-4" OR "Benzene, ethyl-" OR "4-Ethylphenetole" OR "Ethyl(benzene-d5)" OR "Ethyl-1,1-d2 benzene-d5" OR "Ethyl, 2-phenyl-" OR "α-Methyltoluene" OR "Phenylurethane" OR "Ethyl-d5-benzene" OR "Ethylbenzene-d10" OR "NCI-C56393" OR "NSC 406903" OR "Phenylethane" OR "UNII-L5I45M5G0O" OR "Ethylbenzol" OR "Etilbenzene" OR "Etylobenzen" OR "HSDB 84" OR "EC 202-849-4" OR "EINECS 202-849-4" OR "Ethyl benzene" OR "Ethylbenzeen" OR "Aethylbenzol" OR "AI3-09057" OR "CCRIS 916" OR "DA0700000" OR "Phenylethane" OR "C004912" OR "ethyl-benzene") OR AB=("100-41-4" OR "Benzene, ethyl-" OR "4-Ethylphenetole" OR "Ethyl(benzene-d5)" OR "Ethyl-1,1-d2 benzene-d5" OR "Ethyl, 2-phenyl-" OR "α-Methyltoluene" OR "Phenylurethane" OR "Ethyl-d5-benzene" OR "Ethylbenzene-d10" OR "NCI-C56393" OR "NSC 406903" OR "Phenylethane" OR "UNII-L5I45M5G0O" OR "Ethylbenzol" OR "Etilbenzene" OR "Etylobenzen" OR "HSDB 84" OR "EC 202-849-4" OR "EINECS 202-849-4" OR "Ethyl benzene" OR "Ethylbenzeen" OR "Aethylbenzol" OR "AI3-09057" OR "CCRIS 916" OR "DA0700000" OR "Phenylethane" OR "C004912" OR "ethyl-benzene") OR AK=("100-41-4" OR "Benzene, ethyl-" OR "4-Ethylphenetole" OR "Ethyl(benzene-d5)" OR "Ethyl-1,1-d2 benzene-d5" OR "Ethyl, 2-phenyl-" OR "α-Methyltoluene" OR "Phenylurethane" OR "Ethyl-d5-benzene" OR "Ethylbenzene-d10" OR "NCI-C56393" OR "NSC 406903" OR "Phenylethane" OR "UNII-L5I45M5G0O" OR "Ethylbenzol" OR "Etilbenzene" OR "Etylobenzen" OR "HSDB 84" OR "EC 202-849-4" OR "EINECS 202-849-4" OR "Ethyl benzene" OR "Ethylbenzeen" OR "Aethylbenzol" OR "AI3-09057" OR "CCRIS 916" OR "DA0700000" OR "Phenylethane" OR "C004912" OR "ethyl-benzene")) AND DOP=2020-11-01/2022-01-30
Date and results: Date: 1/21/2022; Results: 56; Batch: 46083

Search: Toxline (chemical terms)

Search strategy: @AND+@OR+("Benzene, ethyl-"+"4-Ethylphenetole"+"Ethyl(benzene-d5)"+"Ethyl-1,1-d2 benzene-d5"+"Ethyl, 2-phenyl-"+"α-Methyltoluene"+"Phenylurethane"+"Ethyl-d5-benzene"+"Ethylbenzene-d10"+"NCI-C56393"+"NSC 406903"+"Phenylethane"+"UNII-L5I45M5G0O"+"Ethylbenzol"+"Etilbenzene"+"Etylobenzen"+"HSDB 84"+"EC 202-849-4"+"EINECS 202-849-4"+"Ethyl benzene"+"Ethylbenzeen"+"Aethylbenzol"+"AI3-09057"+"CCRIS 916"+"DA0700000"+"Phenylethane"+"C004912"+"ethyl-benzene"+@TERM+@rn+100-41-4)
Date and results: Date: 4/22/2019; Results: 2,780

Search: TSCATS (chemical terms)

Search strategy: @AND+@OR+@rn+"100-41-4"+@AND+@org+TSCATS+@NOT+@org+pubmed
Date and results: Date: 4/22/2019; Results: 245

(a) The search conducted on 1/21/2022 utilized an updated Web of Science search process. Previous searches used only the topic (TS) field tag, which searches title, abstract, author keywords, and Keywords Plus. The updated process searches the title (TI), abstract (AB), and author keywords (AK) tags, filtering out references that matched only in Keywords Plus, which are WOS-generated keywords and typically are not relevant to assessments.

APPENDIX B. PROCESS FOR SEARCHING AND COLLECTING EVIDENCE FROM SELECTED OTHER RESOURCES

Review of the citation reference lists is typically done manually because they are not available in a file format (e.g., RIS) that permits uploading into screening software applications.
Manual review entails scanning the title, study summary, or study details as presented in the resource for records that appear to meet the PECO criteria. Any records not already identified from the other sources are formatted as an RIS file, imported into DistillerSR, annotated with respect to source, and screened as outlined in Section 4.4. For tracking assessments or reviews, the name of the source citation and the number of records imported into DistillerSR are noted. The reference list of any study included in the literature inventory is reviewed manually to identify titles that appear relevant to the PECO criteria. These citations are tracked in a spreadsheet, compared against the literature base to determine whether they are unique to the project, and then added to DistillerSR to be screened at the title and abstract stage for PECO relevance.

B.1 EPA COMPTOX CHEMICALS DASHBOARD (TOXVAL)

ToxVal is searched in the EPA CompTox Chemicals Dashboard (U.S. EPA, 2018a), and data available from the "Hazard" tab are exported from the CompTox File Transfer Protocol site. Using both the human health POD summary file and the Record Source file, citations that apply to human health PODs are identified. A citation for each referenced study is generated in HERO and checked to confirm that it has not already been identified from the database search (or searches of "other sources consulted") before moving forward to screening in DistillerSR. Full texts are retrieved where possible; if full texts are not available, data from the ToxVal dashboard are entered and the citation is annotated accordingly for Tableau and HAWC visualizations by adding "(ToxVal)" to the citation.

B.2 EUROPEAN CHEMICALS AGENCY (ECHA)

A search of the ECHA registered substances database is conducted using the CASRN. The registration dossier associated with the CASRN is retrieved by navigating to and clicking the eye-shaped view icon displayed in the chemical summary panel. The general information page and all subpages included under the Toxicological Information tab are downloaded in Portable Document Format (PDF), including all nested reports having unique URLs. In addition, the data are extracted from each dossier page and used to populate an Excel tracking sheet. Extracted fields include data from the general information page regarding the registration type and publication dates and, on a typical study summary page, the primary fields reported in the administrative data, data source, and effect levels sections. Each study summary results in more than one row in the tracking sheet if more than one data source or effect level is reported. At this stage, each study summary is reviewed for inclusion based on PECO criteria. Study summaries without administrative data information are excluded from review, and study summaries labeled "read-across" (if any) are screened and considered supplemental material. When a study summary considered relevant reports data from a study or lab report, a citation for the full study is generated in HERO and checked to confirm that it has not already been identified from the database search (or searches of "other sources consulted") before moving forward to screening. When citation information is not available and a full text cannot be retrieved, the generated PDF is used as the full text for screening and extraction, and the citation is annotated accordingly for Tableau and HAWC visualizations by adding "(ECHA Summary)" to the citation.

B.3 EPA CHEMVIEW

The EPA ChemView database (U.S. EPA, 2019a) is searched using the chemical CASRN.
The prepopulated CASRN match and the "Information Submitted to EPA" output option filter are selected before generating results. If results are available, the square-shaped icon under the "Data Submitted to EPA" column is selected, and the following records are included:

• High Production Volume Challenge Database (HPVIS)
• Human Health studies (Substantial Risk Reports)
• Monitoring (includes environmental, occupational, and general entries)
• TSCA Section 4 (chemical testing results)
• TSCA Section 8(d) (health and safety studies)
• TSCA Section 8(e) (substantial risk)
• FYI (voluntary documents)

All records for ecotoxicology and physical and chemical property entries are excluded. When results are available, extractors navigate into each record until a substantial risk report link is identified and saved as a PDF file. If the report cannot be saved because of file corruption or broken links, the record is excluded during full-text review as "unable to obtain record." Most substantial risk reports contain multiple document IDs, so citations are derived by concatenating the unique report numbers (OTS; 8EHD Num; DCN; TSCATS RefID; and CIS) associated with each document, along with the typical author organization, year, and title. Once a citation is generated, the study moves forward to DistillerSR, where it is screened according to PECO and supplemental material criteria.

B.4 NTP CHEMICAL EFFECTS IN BIOLOGICAL SYSTEMS

This database is searched using the chemical CASRN (https://manticore.niehs.nih.gov/cebssearch). All non-NTP data are excluded using the "NTP Data Only" filter. Data tables for reports undergoing peer review are also searched for studies that have not been finalized (https://ntp.niehs.nih.gov/data/tables/index.html) based on a manual review of chemical names.

B.5 OECD ECHEMPORTAL

The OECD eChemPortal (https://hpvchemicals.oecd.org/UI/Search.aspx) is searched using the chemical CASRN. Only database entries from the following sources are included, and entries from all other databases are excluded from the search. Final assessment reports and other relevant SIDS reports embedded in the links are captured and saved as PDF files.

• OECD HPV
• OECD SIDS IUCLID
• SIDS United Nations Environment Programme (UNEP)

B.6 ECOTOX DATABASE

EPA's ECOTOX Knowledgebase (https://cfpub.epa.gov/ecotox/search.cfm) is searched using the CASRN. Results are refined to terrestrial mammalian studies by selecting the terrestrial tab at the top of the search page and sorting the results by species group. A citation for each referenced study is generated in HERO and checked to confirm that it has not already been identified from the database search (or searches of "other sources consulted") before moving forward to screening in DistillerSR.

B.7 EPA COMPTOX CHEMICAL DASHBOARD VERSION TO RETRIEVE A SUMMARY OF ANY TOXCAST OR TOX21 HIGH-THROUGHPUT SCREENING INFORMATION

Version 3.0.9 of the CompTox Chemicals Dashboard (U.S. EPA, 2019b) is accessed for high-throughput screening (HTS) data by searching the Dashboard by CASRN. Next, the "Bioactivity" section is selected and the availability of ToxCast/Tox21 HTS data for active and inactive assays is examined in the "TOXCAST: Summary" tab. If active assays are reported, the figure is copied for presentation in the systematic evidence map. This figure presents (1) a scatterplot of scaled assay responses versus AC50 values for each active assay endpoint and (2) a cytotoxicity limit as a vertical line.
More detailed information on the results of ToxCast and Tox21 assays is available in the CompTox Chemicals Dashboard section "ToxCast/Tox21," which includes chemical analysis data, dose-response data and model fits, and "flags" assigned by an automated analysis, which might suggest false positivity/negativity or indicate other anomalies in the data. This information is not summarized further for the purposes of the systematic evidence map, which is focused on identifying the extent of available evidence.

B.8 ETHYLBENZENE GRAY LITERATURE SEARCH SUMMARY

Dates Run: All gray literature searches were conducted in 2020 (between 11/1/2020 and 12/1/2020) and on 1/21/2022.

Search Limits: No date limits were applied to the gray literature search.

Search Terms:
• CASRN: 100-41-4
• "EC 202-849-4"
• "ethylbenzene"
• "1-ethylbenzene"

Sources Searched: The following sources were searched:
• ECHA Registration Dossiers
• ChemView
• OECD eChem Portal
• NTP Chemical Effects In Biological Systems (CEBS)
• EPA ToxVal - Searched using internal data files provided by CCTE
• EPA ECOTOX

Table B-1. Summary table for ethylbenzene other sources search results (12/2021)

Source              Search method           Total results retrieved (2020)   Total results retrieved (2022)   Unique results
ECHA                Automated Webscraping   359                              3                                60
ChemView            Manual Searching        23                               0                                5
OECD eChem Portal   Manual Searching        2                                0                                0
CEBS                Manual Searching        1                                0                                1
ToxVal              Manual Searching        83                               -                                14
ECOTOX              Manual Searching        -                                3                                0
Total               N/A                     468                              6                                80

CEBS = Chemical Effects in Biological Systems; ECHA = European Chemicals Agency; N/A = not applicable; OECD = Organisation for Economic Co-operation and Development.
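
As a quick arithmetic check on Table B-1, the column totals can be recomputed from the per-source rows. The Python sketch below is illustrative only: the dictionary layout and the function name `column_totals` are assumptions made for this example, and a "-" cell is treated as contributing zero results.

```python
# Rows transcribed from Table B-1 as
# (results retrieved 2020, results retrieved 2022, unique results).
# None stands for a "-" cell; treating it as zero is an assumption of this sketch.
ROWS = {
    "ECHA": (359, 3, 60),
    "ChemView": (23, 0, 5),
    "OECD eChem Portal": (2, 0, 0),
    "CEBS": (1, 0, 1),
    "ToxVal": (83, None, 14),
    "ECOTOX": (None, 3, 0),
}

# The "Total" row as reported in Table B-1.
REPORTED_TOTALS = (468, 6, 80)


def column_totals(rows):
    """Sum each of the three columns, counting None cells as zero."""
    totals = [0, 0, 0]
    for counts in rows.values():
        for i, value in enumerate(counts):
            totals[i] += value or 0
    return tuple(totals)


if __name__ == "__main__":
    computed = column_totals(ROWS)
    print("computed:", computed)
    print("matches reported totals:", computed == REPORTED_TOTALS)
```

Keeping the rows in a small data structure like this makes it straightforward to rerun the check if the table is updated after a later search.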
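
Several steps in Sections B.1, B.2, and B.6 follow one pattern: generate a citation, confirm that it is not already in the literature base, and annotate its source before screening. A minimal Python sketch of that pattern follows. It is not the HERO or DistillerSR workflow; the record fields (`title`, `citation`, `full_text`), the title-normalization rule, and the function names are all hypothetical choices made for illustration.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace (assumed matching rule)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def unique_to_project(candidates, literature_base, source_tag):
    """Return candidate records that are not already in the literature base.

    Records are plain dicts with 'title' and 'citation' keys and an optional
    'full_text' flag (a hypothetical shape). When no full text is available,
    the source tag, e.g. "ToxVal" or "ECHA Summary", is appended to the citation.
    """
    seen = {normalize_title(rec["title"]) for rec in literature_base}
    unique = []
    for rec in candidates:
        key = normalize_title(rec["title"])
        if key in seen:
            continue  # already identified elsewhere, or repeated in this batch
        seen.add(key)
        citation = rec["citation"]
        if not rec.get("full_text", True):
            citation = f"{citation} ({source_tag})"
        unique.append({**rec, "citation": citation, "source": source_tag})
    return unique


# Made-up records, for illustration only.
literature_base = [
    {"title": "Example inhalation study in rats", "citation": "Author A (2001)"},
]
candidates = [
    {"title": "Example Inhalation Study in Rats.", "citation": "Author A (2001)"},
    {"title": "Example oral study in mice", "citation": "Author B (1999)", "full_text": False},
]
result = unique_to_project(candidates, literature_base, "ToxVal")
print(result)
```

Matching on a normalized title is only one possible rule; a real workflow might compare identifiers such as DOIs or HERO IDs instead, which would avoid false matches between similarly titled reports.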