Are We Doing Non-targeted Analysis Right? A Progress Report from the
Benchmarking and Publications for Non-Targeted Analysis Working Group
Elin M. Ulrich1 (ulrich.elin@epa.gov), Benjamin J. Place2 (benjamin.place@nist.gov)
1. U.S. Environmental Protection Agency; Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC
2. National Institute of Standards and Technology, Material Measurement Laboratory, Chemical Sciences Division, Organic Chemical Measurement Science Group, Gaithersburg, MD
WP123	The views expressed in this presentation are those of the author(s) and do not necessarily represent the views or policies of the U.S. Environmental Protection Agency or National Institute of Standards and Technology.
Abstract
The broad availability of high-resolution mass spectrometers has made their use
more common for environmental and other applications, particularly for suspect
screening and non-targeted analysis (NTA). While fields like metabolomics have
developed mature methodologies and quality control practices, NTA of the
exposome is still experiencing a steep learning curve and growing pains. The
current state of NTA has been described as "the wild west," where each research
group approaches the technique in their own way, with little overlap, consistency, or
harmonization. In a 2018 EPA NTA workshop, a discussion resulted in the formation
of a working group called "Benchmarking and Publications for Non-Targeted
Analysis" or BP4NTA. The group has a near-term goal of publishing a white paper
describing terms, definitions, recommendations, and best practices surrounding
NTA studies. Topics of particular importance and relevance will include
recommendations on how to characterize an NTA method's performance and
minimum reporting information for publications to improve transparency and
reproducibility. If NTA exposome data is to be used to support further scientific
endeavors and/or for regulatory purposes, practitioners should use sound,
defensible, and commonly accepted techniques for data collection, analysis, and
reporting. Therefore, BP4NTA and other groups like it in related fields are critical for
coming to a scientific-community consensus to make an impact on environmental
and health policies.
About BP4NTA
61 members: 26 government, 20 industry, 15 academia
>	Monthly conference calls or in-person meetings
>	Collaborative documentation/discussions via Google Drive
>	9 working groups for definitions (see Table)
Short-term goals:
1.	Publish a white paper containing:
NTA terms and their definitions;
Calculations of performance metrics;
Reporting recommendations to promote transparency/reproducibility;
Scientific best practices.
2.	Build consensus with like-minded organizations (e.g., NORMAN).
3.	Work with journal editors and NTA researchers to establish guidelines for
reporting NTA methods and performance evaluations.
Long-term goal (~10 yrs): Move the field of NTA toward proficiency testing, using a
mechanism like ASTM/ISO Guidance on Performance and Data Reporting
Requirements. Define proficiency levels (expert, competent, etc.) for suspect
screening analysis (SSA) and NTA.
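As a rough illustration of the "calculations of performance metrics" goal, the Python sketch below computes several of the statistics under discussion from counts of true/false positives and negatives. It uses the standard confusion-matrix formulas, not definitions agreed on by BP4NTA, and it assumes the true composition of the sample is known (for example, a spiked mixture) so that all four counts can be tallied. The names `ConfusionCounts` and `performance_metrics` are hypothetical, and "unintended positive" is left out because no formula for it is given here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing reported identifications to a known composition."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def performance_metrics(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Standard confusion-matrix statistics (assumed, not BP4NTA-endorsed)."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity_recall": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "negative_predictive_value": _ratio(c.tn, c.tn + c.fn),
        "false_discovery_rate": _ratio(c.fp, c.fp + c.tp),
        "accuracy": _ratio(c.tp + c.tn, total),
        "f1_score": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = ConfusionCounts(tp=8, fp=2, tn=85, fn=5)
    for name, value in performance_metrics(example).items():
        print(name, value)
```

Returning `None` for an undefined ratio, rather than zero, keeps "not calculable" distinct from "poor performance" when results are reported.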
JOIN US! Leave your card or contact
the poster authors Ben or Elin.
Definitions in Progress
Terms and Sub-terms to Define, by Category
Experimental Design:
    Blank (laboratory, matrix, solvent)
    Replicate (data analysis, extraction, injection, sample)
    Laboratory standard
    Spike
General:
    Methods (data analysis, instrumental, sample preparation)
    Library/database
    Non-targeted analysis
    Suspect screening analysis
Interpretation of Data:
    Identification
    Confidence of identification
Performance Statistics:
    True positive (TP), TP rate/ratio
    True negative (TN), TN rate/ratio
    False negative (FN), FN rate/ratio
    False positive (FP), FP rate/ratio
    Unintended positive (UP), UP rate/ratio
    False discovery rate (FDR)
    Negative predictive value
    Accuracy
    Performance
    Specificity
    Precision
    Sensitivity, recall
    Area under precision-recall (AUPR) curve
    F1 score
QA/QC:
    Accuracy
    Precision
    Repeatability
    Reproducibility
Working Group Assignments
Group 1: Identification + Confidence of identification
Group 2: Non-targeted analysis + Suspect screening
Group 3: False positive + True negative + True positive
Group 4: Data analysis method
Group 5: Library/database
Group 6: Blank + Matrix blank + Spike + Laboratory standard + Replicate
Group 7: Instrumental method
Group 8: Performance + Accuracy + Reproducibility + Precision
Group 9: Sample preparation method
HOW WOULD YOU DEFINE THESE TERMS?
LEAVE A POST-IT WITH YOUR IDEAS!
Examples of Working Group Progress
Group 1: Identification
BP4NTA Initial Definition:
Identification is the attribution of a single chemical
identity (specific to the stereoisomer form, at minimum)
within a sample with an associated confidence; a
confidence level must be assigned for all identifications.
Comments on the initial definition:
"single chemical identity": could be identified to the formula level
"must be": pretty strong language
Revised definition:
Identification/Annotation: attribution of
a chemical identity (Identification) or
chemical formula (Annotation) of a
detected feature within a sample with
an associated confidence.
"feature": is there a better term?
Group 4: Data Analysis Method
BP4NTA Initial Definition:
Raw data is the unmodified data that has been generated from the instrument.
A data analysis method is the treatment of the raw NTA data (with no processing) that... {needs completion}
Components of a Data Analysis Method:
Raw data processing: Includes centroiding, smoothing, thresholding.
Peak picking: Selecting unidentified peaks/mass features/m/z-RT pairs
that could represent compounds.
Alignment or binning? Annotation?
Library searching: Using a library/database to match spectra/precursor
masses between experimental and known information, including a match score.
In silico fragmentation generation: Generating a theoretical spectrum to
compare with the empirical spectrum.
Mass spectral interpretation: Annotation of m/z fragments to "build"
supporting information for the identity.
"User expertise": Need to report exactly how this was applied.
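To make the library searching component concrete, here is a minimal Python sketch that matches an observed precursor m/z against library entries within a ppm tolerance and reports a match score. Everything in it is an assumption for illustration: the `LibraryEntry` type, the placeholder compound names and m/z values, the 5 ppm default tolerance, and the linear score are not taken from any real library or software. Real workflows typically also compare fragment spectra.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LibraryEntry:
    name: str
    precursor_mz: float  # reference precursor m/z (placeholder values below)


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error of the observation relative to the reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def search_library(
    observed_mz: float,
    library: List[LibraryEntry],
    tol_ppm: float = 5.0,  # assumed tolerance; must be > 0
) -> List[Tuple[str, float, float]]:
    """Return (name, ppm error, score) for entries within tolerance, best first.

    The score is a hypothetical linear measure: 1.0 for an exact mass match,
    falling to 0.0 at the edge of the tolerance window.
    """
    hits = []
    for entry in library:
        err = ppm_error(observed_mz, entry.precursor_mz)
        if abs(err) <= tol_ppm:
            score = 1.0 - abs(err) / tol_ppm
            hits.append((entry.name, err, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Placeholder library; names and masses are invented for this example.
LIBRARY = [
    LibraryEntry("compound_A", 300.0000),
    LibraryEntry("compound_B", 300.0012),
    LibraryEntry("compound_C", 301.0000),
]

if __name__ == "__main__":
    for name, err, score in search_library(300.0003, LIBRARY):
        print(f"{name}: {err:+.2f} ppm, score {score:.2f}")
```

Reporting the tolerance and scoring rule alongside the hits is the kind of detail that lets a reader reproduce the search.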
Group 6: Experimental Design Terms
Types of blanks: Ambient, Calibration,
Equipment, Field, Filter, Fortified method,
Matrix, Method, Preservation, Reagent, Trip.
http://www.chromatographyonline.com/vital-role-blanks-sample-preparation?pageID=1
Discussion: Define all 11? Are all applicable to NTA?
What is used depends strongly on the analytical task.
New strategy: It is more important to detail HOW the blank
was produced and its intended use; don't leave the
reader to interpret for themselves.
Future Directions
>	Continue to refine definitions/equations, build/refine "components of" lists, build reference library in BP4NTA.
>	Discussion/refinement/(dis)agreement within BP4NTA.
>	Develop a list of questions/topics for discussion within broader NTA community. Is consensus possible?
>	Develop outline and draft of white paper.
>	Publish white paper.
>	Communicate. Communicate! COMMUNICATE and obtain buy in from the broader NTA community.
>	Put ideas into practice as practitioners, editors, reviewers, mentors, etc.

-------