Are We Doing Non-targeted Analysis Right? A Progress Report from the Benchmarking and Publications for Non-Targeted Analysis Working Group

Elin M. Ulrich1 (ulrich.elin@epa.gov), Benjamin J. Place2 (benjamin.place@nist.gov)
1. U.S. Environmental Protection Agency; Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC
2. National Institute of Standards and Technology, Material Measurement Laboratory, Chemical Sciences Division, Organic Chemical Measurement Science Group, Gaithersburg, MD

WP123

The views expressed in this presentation are those of the author(s) and do not necessarily represent the views or policies of the U.S. Environmental Protection Agency or National Institute of Standards and Technology.

Abstract

The broad availability of high-resolution mass spectrometers has made their use more common for environmental and other applications, particularly for suspect screening and non-targeted analysis (NTA). While fields like metabolomics have developed mature methodologies and quality control practices, NTA of the exposome is still experiencing a steep learning curve and growing pains. The current state of NTA has been described as "the wild west," where each research group approaches the technique in its own way, with little overlap, consistency, or harmonization. At a 2018 EPA NTA workshop, a discussion resulted in the formation of a working group called "Benchmarking and Publications for Non-Targeted Analysis," or BP4NTA. The group has a near-term goal of publishing a white paper describing terms, definitions, recommendations, and best practices surrounding NTA studies. Topics of particular importance and relevance will include recommendations on how to characterize an NTA method's performance and the minimum reporting information for publications to improve transparency and reproducibility.
If NTA exposome data is to be used to support further scientific endeavors and/or for regulatory purposes, practitioners should use sound, defensible, and commonly accepted techniques for data collection, analysis, and reporting. Therefore, BP4NTA and other groups like it in related fields are critical for reaching a scientific-community consensus that can make an impact on environmental and health policies.

About BP4NTA

> 61 members: 26 government, 20 industry, 15 academia
> Monthly conference calls or in-person meetings
> Collaborative documentation/discussions via Google Drive
> 9 working groups for definitions (see Table)

Short term goals:
1. Publish a white paper containing: NTA terms and their definitions; calculations of performance metrics; reporting recommendations to promote transparency/reproducibility; scientific best practices.
2. Build consensus with like-minded organizations (e.g., NORMAN).
3. Work with journal editors and NTA researchers to establish guidelines for NTA reporting of methods and performance evaluations.

Long term goal (~10 yrs): Move the field of NTA toward proficiency testing, using a mechanism like ASTM/ISO Guidance on Performance and Data Reporting Requirements. Define proficiency levels for SSA, NTA, expert, competent, etc.

JOIN US! Leave your card or contact the poster authors Ben or Elin.
Definitions in Progress

Category: Terms and Sub-terms to Define
Experimental Design: Blank (laboratory, matrix, solvent); Replicate (data analysis, extraction, injection, sample); Laboratory standard; Spike
General Methods: Method (data analysis, instrumental, sample preparation); Library/database; Non-targeted analysis; Suspect screening analysis
Interpretation of Data: Identification; Confidence of identification
Performance Statistics: True positive (TP), TP rate/ratio; True negative (TN), TN rate/ratio; False negative (FN), FN rate/ratio; False positive (FP), FP rate/ratio; Unintended positive (UP), UP rate/ratio; False discovery rate (FDR); Negative predictive value; Accuracy; Performance; Specificity; Precision; Sensitivity, recall; Area under precision-recall (AUPR) curve; F1 score
QA/QC: Accuracy; Precision; Repeatability; Reproducibility

Working Group Assignments
Group 1: Identification + Confidence of identification
Group 2: Non-targeted analysis + Suspect screening
Group 3: False positive + True negative + True positive
Group 4: Data analysis method
Group 5: Library/database
Group 6: Blank + Matrix blank + Spike + Laboratory standard + Replicate
Group 7: Instrumental method
Group 8: Performance + Accuracy + Reproducibility + Precision
Group 9: Sample preparation method

HOW WOULD YOU DEFINE THESE TERMS? LEAVE A POST-IT WITH YOUR IDEAS!

Examples of Working Group Progress

Group 1: Identification

BP4NTA Initial Definition: Identification is the attribution of a single chemical identity (specific to the stereoisomer form, at minimum) within a sample with an associated confidence; confidence level must be assigned for all identifications.
Comment on "single chemical identity": could be identified to the formula level.
Comment on "must be": pretty strong language.

Revised definition: Identification/Annotation: attribution of a chemical identity (Identification) or chemical formula (Annotation) of a detected feature within a sample with an associated confidence.
Comment on "feature": is there a better term?
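The performance statistics listed under Definitions in Progress are still being defined by the working group. As a rough illustration only, the sketch below computes several of them from TP/TN/FP/FN counts using the conventional confusion-matrix formulas; these are an assumption and not BP4NTA's agreed definitions. The names `ConfusionCounts` and `performance_metrics` are hypothetical, and terms with no standard formula here (unintended positives, AUPR) are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of true/false positives and negatives for one NTA evaluation."""
    tp: int
    tn: int
    fp: int
    fn: int


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def performance_metrics(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Conventional confusion-matrix metrics (assumed, not BP4NTA-endorsed)."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)  # also called sensitivity or TP rate
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {
        "tp_rate": recall,
        "tn_rate": _ratio(c.tn, c.tn + c.fp),  # specificity
        "fp_rate": _ratio(c.fp, c.fp + c.tn),
        "fn_rate": _ratio(c.fn, c.fn + c.tp),
        "fdr": _ratio(c.fp, c.fp + c.tp),  # false discovery rate
        "npv": _ratio(c.tn, c.tn + c.fn),  # negative predictive value
        "precision": precision,
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up counts, purely to show the call pattern.
    example = ConfusionCounts(tp=8, tn=85, fp=2, fn=5)
    for name, value in performance_metrics(example).items():
        print(name, value)
```

Undefined ratios are returned as `None` rather than zero so that a report can distinguish "not computable" from "measured as zero," which matters when true negatives cannot be enumerated in an NTA study.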
Group 4: Data Analysis Method

BP4NTA Initial Definition: Raw Data is the unmodified data that has been generated from the instrument. A data analysis method is the treatment of the raw NTA data (with no processing) that... {needs completion}

Components of a Data Analysis Method:
Raw data processing: Includes centroiding, smoothing, thresholding.
Peak picking: Selecting unidentified peaks/mass features/m/z-RT pairs that could represent compounds. Alignment or binning? Annotation?
Library searching: Using a library/database to match spectra/precursor masses between experimental and known information, to include a match score.
In silico fragmentation generation: Generating a theoretical spectrum to compare with the empirical spectrum.
Mass spectral interpretation: Annotation of m/z fragments to "build" supporting information for the identity.
"User expertise": Need to report exactly how this was applied.

Group 6: Experimental Design Terms

Types of blanks: Ambient, Calibration, Equipment, Field, Filter, Fortified method, Matrix, Method, Preservation, Reagent, Trip.
http://www.chromatographyonline.com/vital-role-blanks-sample-preparation?pageID=1
Discussion: Define all 11? Are all applicable to NTA? What is used depends strongly on the analytical task.
New strategy: It is more important to detail HOW the blank was produced and its intended use; don't leave the reader to interpret it for themselves.

Future Directions

> Continue to refine definitions/equations, build/refine "components of" lists, build reference library in BP4NTA.
> Discussion/refinement/(dis)agreement within BP4NTA.
> Develop a list of questions/topics for discussion within the broader NTA community. Is consensus possible?
> Develop outline and draft of white paper.
> Publish white paper.
> Communicate. Communicate! COMMUNICATE and obtain buy-in from the broader NTA community.
> Put ideas into practice as practitioners, editors, reviewers, mentors, etc.
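Returning to the library-searching component listed under Group 4, the following is a minimal sketch of precursor-mass matching within a ppm tolerance. It is one common approach, assumed here for illustration, and not a method prescribed by BP4NTA. `LibraryEntry`, `ppm_error`, `search_library`, the 5 ppm default, and the two-entry library (approximate [M+H]+ values for caffeine and atrazine) are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class LibraryEntry:
    """One reference compound: a name and its expected precursor m/z."""
    name: str
    mz: float


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error of the observation, in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def search_library(
    observed_mz: float,
    library: Sequence[LibraryEntry],
    tol_ppm: float = 5.0,  # assumed tolerance; report whatever was actually used
) -> List[Tuple[str, float]]:
    """Return (name, ppm error) for entries within tolerance, best match first."""
    hits = []
    for entry in library:
        err = ppm_error(observed_mz, entry.mz)
        if abs(err) <= tol_ppm:
            hits.append((entry.name, err))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Illustrative library; m/z values are approximate [M+H]+ masses.
DEMO_LIBRARY = [
    LibraryEntry("caffeine", 195.08765),
    LibraryEntry("atrazine", 216.10105),
]

if __name__ == "__main__":
    for name, err in search_library(195.0880, DEMO_LIBRARY):
        print(f"{name}: {err:+.2f} ppm")
```

Returning the signed ppm error alongside each hit keeps the match evidence visible, in line with the point above that a library search should include a match score and that user choices (here, the tolerance) need to be reported.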