http://www.orcid.org/0000-0002-2668-4821
-------
Outline
United States Environmental Protection Agency (EPA)
• Quick overview of the dashboard
• Specific data of interest to this audience
(it's not just Computational Toxicology)
• Support for Mass Spectrometry
• Data quality in the public domain
• Work in progress - prototypes
• A request for help
-------
CompTox Chemicals Dashboard
https://comptox.epa.gov/dashboard
875 Thousand Chemicals
Search tabs: Chemicals, Product/Use Categories, Assay/Gene
BATCH SEARCH
Step Four: Select Data Output Format and Choose Data Fields to Download
TOX DATA
Bisphenol A, 80-05-7 | DTXSID7020182: Hazard
BIOACTIVITY
Bisphenol A, 80-05-7 | DTXSID7020182: Chemical Activity Summary
SIMILARITY
Bisphenol A, 80-05-7 | DTXSID7020182: Searched with a similarity threshold of 0.8
-------
BASIC Search
Chemicals
Product/Use Categories Assay/Gene
Search term: Bisphenol
Bisphenol A
DTXSID7020182
Bisphenol A bis(2-hydroxyethyl ether) diacrylate
DTXSID (illegible in source)
Bisphenol A bis(2-hydroxyethyl ether) dimethacrylate
DTXSID (illegible in source)
Bisphenol A bis(2-hydroxypropyl) ether
DTXSID (illegible in source)
Bisphenol A carbonate polymer
DTXSID6027840
Bisphenol A diglycidyl ether
DTXSID6024624
Bisphenol A glycidyl methacrylate
DTXSID (illegible in source)
-------
Detailed Chemical Pages
Tabs: DETAILS, EXECUTIVE SUMMARY, PROPERTIES, ENV. FATE/TRANSPORT, HAZARD, ADME, EXPOSURE, BIOACTIVITY, SIMILAR COMPOUNDS, GENRA (BETA), RELATED SUBSTANCES, SYNONYMS, LITERATURE, LINKS, COMMENTS
Bisphenol A
80-05-7 | DTXSID7020182
Searched by DSSTox Substance Id.
Details sections: Wikipedia, Intrinsic Properties, Structural Identifiers, Linked Substances, Record Information, Quality Control Notes, Presence in Lists
Wikipedia
Bisphenol A (BPA) is an organic synthetic compound with the chemical formula (CH3)2C(C6H4OH)2 belonging to the group of diphenylmethane derivatives and bisphenols, with two hydroxyphenyl groups. It is a colorless solid that is soluble in organic solvents, but poorly soluble in water (0.344 wt % at 83 °C). BPA is a starting material for the synthesis of plastics, primarily certain polycarbonates
Intrinsic Properties
Molecular Formula: C15H16O2
Average Mass: 228.291 g/mol
Monoisotopic Mass: 228.11503 g/mol
Isotope Mass Distribution
-------
Properties, Fate and Transport
Bisphenol A
80-05-7 | DTXSID7020182
Searched by DSSTox Substance Id.
Summary
Property                 Experimental average   Predicted average
LogKow: Octanol-Water    3.32 (1)               3.30
Melting Point            155 (7)                140
Boiling Point            200 (1)                360
Water Solubility         8.55e-4 (3)            8.78e-4
Vapor Pressure           -                      6.83e-7
Flash Point              -                      190
Surface Tension          -                      46.0
Index of Refraction      -                      1.60
Molar Refractivity       -                      68.2
Summary (median view; the row labels were not captured with these values)
Experimental median and Predicted median values shown: 156, 5.26e-4, 3.39, 144, 355, 7.56e-4, 1.51e-7, 190
-------
Properties, Fate and Transport
e.g. Solubility
Experimental data
Source                                                             Result
PhysPropNCCT                                                       5.26e-4
Tetko et al. J. Chem. Inf. and Comp. Sci. 41.6 (2001): 1488-1493    1.51e-3
Kovdienko et al. Molecular Informatics 29.5 (2010): 394-406         5.25e-4

Predicted data
Source      Result     Calculation Details
EPISUITE    7.56e-4    Not Available
NICEATM     1.31e-3    Not Available
TEST        1.24e-3    TEST Report
OPERA       5.44e-4    OPERA Model Report [Inside AD]
OPERA2      5.35e-4    Not Available
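
The per-source solubility values on this slide appear to roll up into the averages and medians shown on the Properties summary slide. The sketch below recomputes them, assuming a plain arithmetic mean and median over the listed sources. The TEST value is read here as 1.24e-3 (an assumption, since the digit is damaged in the capture), and small differences from the displayed averages are expected because the inputs are already rounded.

```python
from statistics import mean, median

# Per-source water solubility values for Bisphenol A as listed on the slide.
experimental = {
    "PhysPropNCCT": 5.26e-4,
    "Tetko et al. 2001": 1.51e-3,
    "Kovdienko et al. 2010": 5.25e-4,
}
predicted = {
    "EPISUITE": 7.56e-4,
    "NICEATM": 1.31e-3,
    "TEST": 1.24e-3,  # assumption: the captured digit is damaged
    "OPERA": 5.44e-4,
    "OPERA2": 5.35e-4,
}


def summarize(values_by_source):
    """Return the count, arithmetic mean, and median of the source values."""
    values = list(values_by_source.values())
    return {"n": len(values), "average": mean(values), "median": median(values)}


exp_summary = summarize(experimental)
pred_summary = summarize(predicted)

for label, summary in (("experimental", exp_summary), ("predicted", pred_summary)):
    print(f"{label}: n={summary['n']} "
          f"average={summary['average']:.2e} median={summary['median']:.2e}")
```

Under these assumptions the medians reproduce the summary values exactly (5.26e-4 experimental, 7.56e-4 predicted), and the averages land within rounding distance of 8.55e-4 and 8.78e-4.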
-------
Properties, Fate and Transport
e.g. log P
Predicted value: 3.35
Global applicability domain: Inside
Local applicability domain index: 0.877
Confidence level: 0.748
Model Performance
QMRF
LogP data: Weighted KNN model
Data set       Evaluation         Metric   Value   RMSE
Training set   5-fold CV (75%)    Q2       0.850   0.690
Training set   Training (75%)     R2       0.860   0.670
Test set       Test (25%)         R2       0.860   0.780
-------
Sources of Exposure to Chemicals
Bisphenol A
80-05-7 | DTXSID7020182
Searched by DSSTox Substance Id.
Product and Use Categories (PUCs)
Product or Use Categorization    Categorization type    Number of Unique Products
manufacturing, metals            CPCat Cassette         17
adhesive                         CPCat Cassette         17
Eight further CPCat Cassette rows list 16, 12, 11, 8, 8, 8, 7, and 6 unique products; their category names were not captured.
Exposure tab sections: PRODUCT & USE CATEGORIES, CHEMICAL WEIGHT FRACTION, CHEMICAL FUNCTIONAL USE, TOXICS RELEASE INVENTORY, MONITORING DATA, EXPOSURE PREDICTIONS, PRODUCTION VOLUME
-------
Identifiers to Support Searches
Bisphenol A
80-05-7 | DTXSID7020182
Searched by Approved Name.
Synonyms
Synonym                                        Quality
Bisphenol A                                    Valid
4,4'-(Propane-2,2-diyl)diphenol                Valid
Phenol, 4,4'-(1-methylethylidene)bis-          Valid
(synonym not captured)                         Valid
BPA                                            Valid
4,4'-Propane-2,2-diyldiphenol                  Valid
Phenol, 4,4'-(1-methylethylidene)bis-          Valid
4-06-00-05717                                  Beilstein
(4,4'-Dihydroxydiphenyl)dimethylmethane        Good
2,2-Bis(4-hydroxyphenyl) propane               Good
2,2'-Bis(4-hydroxyphenyl)propane               Good
2,2-BIS-(4-HYDROXY-PHENYL)-PROPANE             Good
2,2-Bis(4-hydroxyphenyl)propane                Good
2,2-Bis(p-hydroxyphenyl)propane                Good
2,2-Di(4-Hydroxyphenyl) Propane                Good
-------
Link Access
Bisphenol A
80-05-7 | DTXSID7020182
Searched by Approved Name.
General: EPA Substance Registry Service, Household Products Database, Chemical Entities of Biological Interest (ChEBI), PubChem, Chemspider, CPCat, DrugBank, HMDB, Wikipedia, MSDS Lookup, ChEMBL, Chemical Vendors, CalEPA Office of Environmental Health Hazard Assessment, NIOSH Chemical Safety Cards, ToxPlanet, ACS Reagent Chemicals, Wikidata, ChemHat Hazards and Alternatives Toolbox, Wolfram Alpha, ScrubChem
Toxicology: ACToR, DrugPortal, CCRIS, ChemView, CTD, eChemPortal, Gene-Tox, HSDB, ToxCast Dashboard 2, LactMed, International Toxicity Estimates for Risk, ATSDR Toxic Substances Portal, Superfund Chemical Data Matrix, NIOSH IDLH Values, ACToR PDF Report, Toxics Release Inventory, CREST, National Air Toxics Assessment, ECHA Brief Profile
Publications: (one entry illegible in source), Environmental Health Perspectives, NIEHS, National Toxicology Program, Google Books, Google Scholar, Google Patents, PPRTVWEB, PubMed, IRIS Assessments, EPA HERO, NIOSH Skin Notation Profiles, NIOSH Pocket Guide, RSC Publications, BioCaddie DataMed, Springer Materials, Federal Register, Regulations.gov, Bielefeld Academic Search Engine, CORE Literature Search
Analytical: FOR-IDENT, NEMI: National Environmental Methods Index, RSC Analytical Abstracts, Tox21 Analytical Data, MONA: MassBank North America, mzCloud, NIST IR Spectrum, NIST MS Spectrum
Prediction: 2D NMR HSQC/HMBC Prediction, Carbon-13 NMR Prediction, Proton NMR Prediction, ChemRTP Predictor, LSERD
-------
Analytical
RSC Analytical Abstracts
Tox21 Analytical Data
MONA: MassBank North America
mzCloud
NIST IR Spectrum
NIST MS Spectrum
MassBank
NEMI: National Environmental Methods Index
NIST Antoine Constants
IR Spectra on PubChem
NIST Kovats Index values
-------
NIST WebBook
https://webbook.nist.gov/chemi
Analytical links: FOR-IDENT, NEMI: National Environmental Methods Index, RSC Analytical Abstracts, Tox21 Analytical Data, MONA: MassBank North America, mzCloud, NIST IR Spectrum, NIST MS Spectrum
Cholesterol
Mass Spectrum
-------
MassBank of North America
https://mona.fiehnlab.ucdavis.e
Analytical links: FOR-IDENT, NEMI: National Environmental Methods Index, RSC Analytical Abstracts, Tox21 Analytical Data, MONA: MassBank North America, mzCloud, NIST IR Spectrum, NIST MS Spectrum
MoNA - MassBank of North America
Bisphenol A
Originally submitted to the MassBank High Quality Mass Spectral Database
Tags: MassBank, LC-MS
instrument: LTQ Orbitrap XL Thermo Sc...
instrument type: LC-ESI-ITFT
ms level: MS2
ionization: ESI
collision energy: 30 % (nominal)
retention time: 14.0 min
precursor m/z: 229.1223
precursor type: [M+H]+
ionization mode: positive
accession: EA016309
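
The precursor m/z in this record ties back to the monoisotopic mass on the Bisphenol A details page: a singly protonated ion in positive mode sits one proton mass above the neutral molecule. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of any dashboard or MoNA API.

```python
# Mass of a proton in daltons (a hydrogen atom minus one electron).
PROTON_MASS = 1.007276


def protonated_mz(monoisotopic_mass: float) -> float:
    """Return the m/z of the singly charged [M+H]+ ion for a neutral mass."""
    return monoisotopic_mass + PROTON_MASS


# Monoisotopic mass of Bisphenol A as shown on its details page.
bpa_monoisotopic = 228.11503
print(f"[M+H]+ m/z for Bisphenol A: {protonated_mz(bpa_monoisotopic):.4f}")
```

Rounded to four decimals this gives 229.1223, the precursor m/z listed for accession EA016309.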
-------
Batch Searching
-------
Aggregate data for a list of chemicals
Trends in Environmental Analytical Chemistry (Elsevier)
Volume 20, October 2018, e00059
Opioid occurrence in environmental
water samples—A review
Marina Celia Campos-Mañas, Imma Ferrer, E. Michael Thurman, Ana Agüera
https://doi.org/10.1016/j.teac.2018.e00059
-------
Batch Search Names
Buprenorphine
Codeine
Dextromethorphan
Dihydrocodeine
Dihydromorphine
Ethylmorphine
Fentanyl
Heroin
Hydrocodone
Hydromorphone
Ketamine
Meperidine
Methadone
Morphine
Morphinone
Naloxone
Naltriben
Oxycodone
Oxymorphone
Propoxyphene
Sufentanil
Tramadol
Step Five: Choose Data Fields to Download
Enter Identifiers to Search (searches should be limited to <5000 identifiers)
Please enter one identifier per line
Select Input Type(s), Identifiers: Chemical Name, CASRN, InChIKey, DSSTox Substance ID, DSSTox Compound ID, InChIKey Skeleton, MS-Ready Formula(e), Exact Formula(e), Monoisotopic Mass
Display All Chemicals
Download Chemical Data: Excel
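
The batch search form expects one identifier per line and asks that a search stay under 5000 identifiers. The sketch below prepares pasteable text blocks under those two constraints; `to_batches` is a hypothetical helper for illustration, not a dashboard function, and the de-duplication step is an added convenience rather than a stated requirement.

```python
# The form asks for fewer than 5000 identifiers per search.
MAX_IDENTIFIERS = 4999


def to_batches(identifiers, batch_size=MAX_IDENTIFIERS):
    """Clean a list of identifiers and return newline-joined text blocks.

    Blank entries are dropped, surrounding whitespace is stripped, and
    duplicates are removed while preserving the original order.
    """
    cleaned = []
    seen = set()
    for raw in identifiers:
        name = raw.strip()
        if name and name not in seen:
            seen.add(name)
            cleaned.append(name)
    return [
        "\n".join(cleaned[start:start + batch_size])
        for start in range(0, len(cleaned), batch_size)
    ]


opioids = ["Buprenorphine", "Codeine", "Dextromethorphan", "Fentanyl", "Heroin"]
blocks = to_batches(opioids)
print(f"{len(blocks)} block(s) to paste")
print(blocks[0])
```

Each returned block can be pasted into the identifier box as a separate search.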
Input               Found By         DTXSID
Buprenorphine       Approved Name    DTXSID2022705
Codeine             Approved Name    DTXSID2020341
Dextromethorphan    Approved Name    DTXSID3022908
Dihydrocodeine      Approved Name    DTXSID5022936
Dihydromorphine     Approved Name    DTXSID7048908
Ethylmorphine       Approved Name    DTXSID1046760
Fentanyl            Approved Name    DTXSID9023049
Heroin              Synonym          DTXSID6046761
Hydrocodone         Approved Name    DTXSID8023131
Hydromorphone       Approved Name    DTXSID8023133
Ketamine            Approved Name    DTXSID8023187
Meperidine          Approved Name    DTXSID9023253
Methadone           Approved Name    DTXSID7023273
Morphine            Approved Name    DTXSID9023336
-------
Add Other Data of Interest
Chemical Identifiers: DTXSID, Chemical Name, DTXCID, CAS-RN, InChIKey, IUPAC Name
Structures: Mol File, SMILES, InChI String, MS-Ready SMILES, QSAR-Ready SMILES
Intrinsic And Predicted Properties: Molecular Formula, Average Mass, Monoisotopic Mass, TEST Model Predictions, OPERA Model Predictions
Downloaded spreadsheet (DTXSID and MS-Ready SMILES cells are cut off by the column width in the capture):
INPUT              DTXSID         CASRN         MOLECULAR FORMULA   MONOISOTOPIC MASS   MS READY SMILES
Buprenorphine      DTXSID202...   52485-79-7    C29H41NO4           467.3035588         [H]C12CC3=C4C...
Codeine            DTXSID202...   76-57-3       C18H21NO3           299.1521435         [H]C12CC3=C4C...
Dextromethorphan   DTXSID302...   125-71-3      C18H25NO            271.1936144         [H]C12CC3=C(C...
Dihydrocodeine     DTXSID502...   125-28-0      C18H23NO3           301.1677936         [H]C12CC3=C4C...
Dihydromorphine    DTXSID704...   509-60-4      C17H21NO3           287.1521435         [H]C12CC3=C4C...
Ethylmorphine      DTXSID104...   76-58-4       C19H23NO3           313.1677936         [H]C12CC3=C4C...
Fentanyl           DTXSID902...   437-38-7      C22H28N2O           336.2201635         CCC(=O)N(C1CC...
Heroin             DTXSID604...   561-27-3      C21H23NO5           369.1576228         [H]C12CC3=C4C...
Hydrocodone        DTXSID802...   125-29-1      C18H21NO3           299.1521435         [H]C12CC3=C4C...
Hydromorphone      DTXSID802...   466-99-9      C17H19NO3           285.1364935         [H]C12CC3=C4C...
Ketamine           DTXSID802...   6740-88-1     C13H16ClNO          237.0920418         CNC1(CCCCC1=...
Meperidine         DTXSID902...   57-42-1       C15H21NO2           247.1572289         CCOC(=O)C1(CC...
Methadone          DTXSID702...   76-99-3       C21H27NO            309.2092645         CCC(=O)C(CC(C...
Morphine           DTXSID902...   57-27-2       C17H19NO3           285.1364935         [H]C12CC3=C4C...
Morphinone         DTXSID501...   467-02-7      C17H17NO3           283.1208434         [H]C12CC3=C4C...
Naloxone           DTXSID802...   465-65-6      C19H21NO4           327.1470582         [H]C12CC3=C4C...
Naltriben          -              -             -                   -                   -
Oxycodone          DTXSID502...   76-42-6       C18H21NO4           315.1470582         [H]C12CC3=C4C...
Oxymorphone        DTXSID502...   76-41-5       C17H19NO4           301.1314081         [H]C12CC3=C4C...
Propoxyphene       DTXSID102...   469-62-5      C22H29NO2           339.2198292         CCC(=O)OC(CC1...
Sufentanil         DTXSID602...   56030-54-7    C22H30N2O2S         386.2027994         CCC(=O)N(C1=C...
Tramadol           DTXSID908...   27203-92-5    C16H25NO2           263.188529          COC1=CC=CC(=...
-------
Chemical Lists of Interest...
-------
225 Chemical Lists (and growing)
Lists of Chemicals (filtered on "mass")

HDXEXCH
  List Name: MASSPECDB: Hydrogen Deuterium Exchange Standard Set - Under HDX Conditions
  Last Updated: 2018-11-07
  Number of Chemicals: 592
  List Description: Observed species (deuterated and undeuterated) from the HDXNOEX list under hydrogen deuterium exchange conditions (Ruttkies, Schymanski et al. in prep.)

HDXNOEX
  List Name: MASSPECDB: Hydrogen Deuterium Exchange Standard Set - No Exchange
  Last Updated: 2018-11-07
  Number of Chemicals: 765
  List Description: Environmental standard set used to investigate hydrogen deuterium exchange in small molecule high resolution mass spectrometry (Ruttkies, Schymanski et al. in prep.)

MASSBANKEUSP
  List Name: MASSPECDB: MassBank.EU Collection: Special Cases
  Last Updated: 2017-07-16
  Number of Chemicals: 263
  List Description: The MassBank.EU list contains curated chemicals (Schymanski/Williams) associated with the literature/tentative/unknown/SI spectra available on MassBank.EU that are not available as part of the full MassBank collection of reference standard spectra.

MASSBANKREF
  List Name: MASSPECDB: MassBank Reference Spectra Collection
  Last Updated: 2017-07-13
  Number of Chemicals: 1267
  List Description: This MassBank list contains chemicals associated with the full MassBank collection of reference standard spectra available on MassBank.EU, MassBank.JP and MassBank of North America as well as the Open Data collection, curated by Williams/Schymanski.

MYCOTOXINS
  List Name: MASSPECDB: Mycotoxins from MassBank.EU
  Last Updated: 2017-08-02
  Number of Chemicals: 88
  List Description: This is a set of mycotoxins, initiated by the contribution of spectra of 90 mycotoxins to MassBank.EU by Justin Renaud and colleagues from Agriculture and Agri-Food Canada, Government of Canada
-------
"Volatilome" Human Breath
LIST: VOLATILOME: Human Breath
List Details
Description: This list is a subset of compounds detected in human breath and reported in the peer-reviewed literature and identified in experimental work at US-EPA. The bulk of the collection is extracted from the article "The human volatilome: volatile organic compounds (VOCs) in exhaled breath, skin emanations, urine, feces and saliva" by de Lacy Costello et al. in J. Breath Res. 8 (2014) 034001 (DOI: 10.1088/1752-7155/8/3/034001) and from the article "On-line analysis of exhaled breath" by Bruderer et al. in Chemical Reviews (DOI: 10.1021/acs.chemrev.9b00005), as well as an increasing number of chemicals identified in our own laboratory studies.
Number of Chemicals: 1075
Acetamide
CASRN: 60-35-5
DTXSID: DTXSID7020005
Acetonitrile
CASRN: 75-05-8
DTXSID: DTXSID7020009
Acrolein
CASRN: 107-02-8
DTXSID: DTXSID5020023
Acrylonitrile
CASRN: 107-13-1
DTXSID: DTXSID5020029
-------
"Volatilome" Saliva
-------
PFAS lists of Chemicals
Select List (filtered on "PFAS")

EPAPFAS75S1
  List Name: PFAS|EPA: List of 75 Test Samples (Set 1)
  Last Updated: 2018-06-29
  Number of Chemicals: 74
  List Description: PFAS list corresponds to 75 samples (Set 1) submitted for initial testing screens conducted by EPA researchers in collaboration with researchers at the National Toxicology Program.

EPAPFAS75S2
  List Name: PFAS|EPA: List of 75 Test Samples (Set 2)
  Last Updated: 2019-02-21
  Number of Chemicals: 75
  List Description: PFAS list corresponds to a second set of 75 samples (Set 2) submitted for testing screens conducted by EPA researchers in collaboration with researchers at the National Toxicology Program.

EPAPFASCAT
  List Name: PFAS|EPA Structure-based Categories
  Last Updated: 2018-06-29
  Number of Chemicals: 64
  List Description: List of registered DSSTox "category substances" representing PFAS categories created using ChemAxon's Markush structure-based query representations.

EPAPFASINSOL
  List Name: PFAS|EPA: Chemical Inventory Insoluble in DMSO
  Last Updated: 2018-06-29
  Number of Chemicals: 43
  List Description: PFAS chemicals included in EPA's expanded ToxCast chemical inventory found to be insoluble in DMSO above 5mM.

EPAPFASINV
  List Name: PFAS|EPA: ToxCast Chemical Inventory
  Last Updated: 2018-06-29
  Number of Chemicals: 430
  List Description: PFAS chemicals included in EPA's expanded ToxCast chemical inventory and available for testing.

EPAPFASRL
  List Name: PFAS|EPA: Cross-Agency Research List
  Last Updated: 2017-11-16
  Number of Chemicals: 199
  List Description: EPAPFASRL is a manually curated listing of mainly straight-chain and branched PFAS (Per- & Poly-fluorinated alkyl substances) compiled from various internal, literature and public sources by EPA researchers and program office representatives.

PFASKEMI
  List Name: PFAS: List from the Swedish Chemicals Agency (KEMI) Report
  Last Updated: 2017-02-09
  Number of Chemicals: 2416
  List Description: Perfluorinated substances from a Swedish Chemicals Agency (KEMI) Report on the occurrence and use of highly fluorinated substances.

PFASMASTER
  List Name: PFAS Master List of PFAS Substances
  Last Updated: 2018-07-26
  Number of Chemicals: 5061
  List Description: PFASMASTER is a consolidated list of PFAS substances spanning and bounded by the below lists of current interest to researchers and regulators worldwide.

PFASOECD
  List Name: PFAS: Listed in OECD Global Database
  Last Updated: 2018-05-16
  Number of Chemicals: 4729
  List Description: OECD released a New Comprehensive Global Database of Per- and Polyfluoroalkyl Substances (PFASs) listing more than 4700 new PFAS

PFASTRIER
  List Name: PFAS Community-Compiled List (Trier et al., 2015)
  Last Updated: 2017-07-16
  Number of Chemicals: 597
  List Description: PFASTRIER community-compiled public listing of PFAS (Trier et al., 2015)
-------
Building a "reference" PFAS list
PFAS structure list (PFASSTRUCT) is expanded from public databases, agency lists and literature
Approaching ~7000 structures; 98.8% have associated CAS Numbers
Compare with PubChem: 220,720 structures
SureChEMBL
SureChEMBL automatically extracts chemistry from the full text patent documents provided by the three major patent authorities (WIPO, USPTO, EPO). Compounds are derived from the chemical names found in text, images and attached MOL files, where available.
-------
Formula Search can find isomers
Perfluorooctanesulfonic acid
1763-23-1 | DTXSID3031864
Searched by Synonym from Valid Source.
Wikipedia
Perfluorooctanesulfonic acid (conjugate base perfluorooctanesulfonate) (PFOS) is an ... was the key ingredient in Scotchgard, a fabric protector made by 3M, and numerous stain ... Convention on Persistent Organic Pollutants in May 2009. PFOS can be synthesized in indu... precursors. PFOS levels that have been detected in wildlife ...
Quality Control Notes
Intrinsic Properties
Molecular Formula: C8HF17O3S
Average Mass: 500.13 g/mol
Monoisotopic Mass: 499.937494 g/mol
Isotope Mass Distribution
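
A formula search can surface isomers because the monoisotopic mass depends only on the elemental composition, so every structure sharing a formula shares the same mass. The sketch below computes that mass from a formula string. The small isotope table and the simple parser (no brackets, charges, or isotope labels) are illustrative assumptions, not the dashboard's implementation.

```python
import re

# Masses (Da) of the most abundant isotope of each element used here.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "F": 18.99840322,
    "S": 31.97207100,
    "Cl": 34.96885268,
}


def parse_formula(formula):
    """Return element counts for a simple formula such as 'C8HF17O3S'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


for name, formula in (("PFOS", "C8HF17O3S"), ("Bisphenol A", "C15H16O2")):
    print(f"{name} ({formula}): {monoisotopic_mass(formula):.5f}")
```

Both results agree with the dashboard values (499.937494 and 228.11503 g/mol) to within about 1e-5; the last digit depends on which isotope mass table is used.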
-------
Active expansion of the PFAS list
From 2 to 8 variants of PFOS
Searched by Exact Molecular Formula: C8HF17O3S.
Perfluorooctanesulfonic acid
DTXSID: DTXSID3031864
CASRN: 1763-23-1
TOXCAST: 207/979
1,1,2,3,3,4,4,5,5,6,6,7,7,7-tetradecafluoro...
DTXSID: DTXSID501019148
CASRN: NOCAS_1019148
TOXCAST: -
Heptadecafluorooctane-2-sulfonic acid
DTXSID: DTXSID30895921
CASRN: 927670-12-0
TOXCAST: -
1,1,2,2,3,3,5,5,6,6,6-undecafluoro-4,4-bis...
DTXSID: DTXSID201019149
CASRN: NOCAS_1019149
TOXCAST: -
IsoPFOS
DTXSID: DTXSID701019144
CASRN: NOCAS_1019144
TOXCAST: -
1,1,2,2,3,3,4,4,5,6,6,7,7,7-tetradecafluoro...
DTXSID: DTXSID401019145
CASRN: NOCAS_1019145
TOXCAST: -
1,1,2,2,3,3,4,5,5,6,6,7,7,7-tetradecafluoro...
DTXSID: DTXSID101019146
CASRN: NOCAS_1019146
TOXCAST: -
1,1,2,2,3,4,4,5,5,6,6,7,7,7-tetradecafluoro...
DTXSID: DTXSID801019147
CASRN: NOCAS_1019147
TOXCAST: -
-------
Disinfection By-Products
LIST: Disinfection By-Products
Description: Disinfection by-products (DBPs) result from chemical reactions between organic and inorganic matter in water with chemical treatment agents during the water disinfection process. DBPs are present in most drinking water supplies that have been subject to chlorination, chloramination, ozonation, or treatment with chlorine dioxide.
Number of Chemicals: 87
Bromodichloromethane
CASRN: 75-27-4
DTXSID: DTXSID1020198
3-Chloro-4-(dichloromethyl)-5-hydroxy-...
CASRN: 77439-76-0
DTXSID: DTXSID6020276
Chloroacetaldehyde
CASRN: 107-20-0
DTXSID: DTXSID4020292
Chlorodibromomethane
CASRN: 124-48-1
DTXSID: DTXSID1020300
-------
Mycotoxins
-------
Structural Identifiers
IUPAC Name: (3alpha,7alpha)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
SMILES: CC1=C[C@H]2O[C@@H]3[C@H](O)C[C@@](C)([C@]33CO3)[C@@]2(CO)[C...
InChI String: InChI=1S/C15H20O6/c1-7-3-9-14(5-16,11(19)10(7)18)13(2)4-8(17)12(2... 2H3/t8-,9-,11-,12-,13-,14-,15+/m1/s1
InChIKey: LINOMUASTDIRTM-QGRHZQQGSA-N
Search Google for: Structural Skeleton, Full Structure
-------
BIG databases are GREAT!
Thanks to all of the public database efforts
So much benefit from what's been done
There are hundreds of them at this point...
[Figure: log-scale chart of database sizes, with PubChem, ChemSpider, and UC Davis (University of California) logos]
-------
Vomitoxin - ChemSpider
19 "Vomitoxins" - 3 isotopically labeled
Search term: LINOMUASTDIRTM (Found by InChlKey (skeleton match))
Result names recoverable from the capture (several are truncated):
Deoxynivalenol
3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
12,13-Epoxy-3,7,15-trihydroxytrichothec-9-en-8-one
(3alpha,7alpha)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3beta,7alpha)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3alpha,7beta)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3alpha,7alpha,12R)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3beta,7alpha,12xi)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3alpha,7alpha,11xi)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3alpha,7alpha,11xi,12xi)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3alpha,5alpha,7alpha,12xi)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3alpha,6beta,7alpha)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3alpha,6xi,7alpha)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(3beta,6beta,7beta,11beta,12R)-3,7,15-Trihydroxy-12,13-epoxytrichothec-9-en-8-one
(2alpha,3alpha,5alpha,6beta,7beta,11beta...)-3,7,15-Trihydroxy(2,3,4,5,6,7,8,9,10,11,12,13,14,...-13C15)-12,13-epoxytrichothec-9-... (isotopically labeled)
-------
Vomitoxin - PubChem
33 unique InChl Keys
Compounds, Substances (10)
Searching chemical names and synonyms including IUPAC names and InChIKeys across the compound collection. Note th... pages is not searched. Read More...
33 results
Not Vomitoxin
Vomitoxin; Trichothec-9-En-8-One, 12,13-Epoxy-3,7,15-Trihydroxy-, (3.Alpha.,7.Alpha.)-; 12,13-Epoxy-3.Alpha.,7.Alpha.,15-Trihydroxy-9-Trichothecen-8-One; LINOMUASTDIRTM-LMJBVPRVSA-N; Spiro[2,5-Methano-1-Benzoxepin-10,2'-Oxirane], Trichothec-9-En-8-One Deriv.
Compound CID: 6432495
MF: C15H20O6 MW: 296.31 g/mol
InChIKey: LINOMUASTDIRTM-LMJBVPRVSA-N
IUPAC Name: (3R,10S)-3,10-dihydroxy-2-(hydroxymethyl)-1,5-dimethylspiro[8-oxatricyclo[7.2.1.02,7]dodec-5-ene-12,2'-oxirane]-4-one
Create Date: 2006-04-28
-------
• Other databases grow quickly...a lot of "virtual
chemistry" and "make on demand" compounds.
Vomitoxin has 7 ZINC stereoforms.
PUBCHEM_CID   Compound_Name                   Compound_Synonym   InChIKey
98043267      (1R,2S,3R,7R,9S,10R,12S)-...    ZINC100006545      LINOMUASTDIRTM-DOZBXCHUSA-N
98051113      (1R,2R,3S,7S,9S,10R,12S)-...    ZINC100066010      LINOMUASTDIRTM-KCWNRFLPSA-N
100853641     (1R,2R,3S,7S,9S,10R,12R)-...    ZINC229762267      LINOMUASTDIRTM-OMTHLLQNSA-N
98043268      (1R,2S,3R,7S,9S,10R,12S)-...    ZINC100006546      LINOMUASTDIRTM-UBTIPYQWSA-N
95566296      (1R,2S,3R,7R,9R,10R,12S)-...    ZINC71789640       LINOMUASTDIRTM-WYQUPHEGSA-N
100853642     (1R,2R,3S,7R,9S,10R,12R)-...    ZINC229762273      LINOMUASTDIRTM-XFRIDARHSA-N
95566297      (1R,2S,3R,7S,9R,10R,12S)-...    ZINC71789642       LINOMUASTDIRTM-XGQZSAOASA-N
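
All seven ZINC stereoforms share the first InChIKey block, which is what a "skeleton match" search keys on: the first 14 characters hash the connectivity, while the second block carries stereochemistry and other layers. The sketch below groups the keys from the table by that first block; the helper name `skeleton` is illustrative only.

```python
from collections import defaultdict


def skeleton(inchikey: str) -> str:
    """Return the first (connectivity) block of a standard InChIKey."""
    return inchikey.split("-")[0]


# InChIKeys of the seven ZINC stereoforms listed in the table.
zinc_stereoforms = [
    "LINOMUASTDIRTM-DOZBXCHUSA-N",
    "LINOMUASTDIRTM-KCWNRFLPSA-N",
    "LINOMUASTDIRTM-OMTHLLQNSA-N",
    "LINOMUASTDIRTM-UBTIPYQWSA-N",
    "LINOMUASTDIRTM-WYQUPHEGSA-N",
    "LINOMUASTDIRTM-XFRIDARHSA-N",
    "LINOMUASTDIRTM-XGQZSAOASA-N",
]

# Group full keys under their shared skeleton.
groups = defaultdict(list)
for key in zinc_stereoforms:
    groups[skeleton(key)].append(key)

for first_block, members in groups.items():
    print(f"{first_block}: {len(members)} distinct stereoforms")
```

Seven distinct full keys collapse to one skeleton, which is why a skeleton search returns many more hits than a full-key search.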
• The Dashboard database grows slowly (next
release is +20k chemicals in 6 months)
PubChem - "virtual chemistry"
-------
ChemSpider - lots of virtuals???
olMillion
chemical structures
52 million chemicals
from one vendor
< Data Sources
Data Source              Count       Date Created   Last Updated
Aurora Fine Chemicals    51885566    13/04/2009     09/01/2020
Chemspace                14283313    30/11/2016     04/12/2018
AKos                     12326374    15/04/2008     09/10/2017
Mcule                    9299739     21/01/2014     26/10/2018
Molport                  8200357     09/02/2010     09/01/2020
Enamine                  3056649     15/04/2008     15/10/2019
-------
Taxol: 79 Results
Found 79 results
Search term: RCINICONZNJXQF (Found by InChlKey (skeleton match))
paclitaxel
The remaining result tiles show truncated stereoisomer names on the same skeleton, for example (2alpha,5beta,7beta,10beta,13alpha)-4,10-Diacetoxy-13-{[(2R,3S)-3-(benzoylamino)-2-hydroxy-3-..., along with variants carrying 3xi, 7alpha, 10alpha, or (2S,3R) descriptors.
-------
Data Quality is important
Data quality in free web-based databases!
Towards a gold standard: regarding quality in public domain chemistry databases and approaches to improving the situation. Antony J. Williams et al. Keynote review, Drug Discovery Today, Volume 17, Issues 13-14, July 2012, Pages 685-701.
Editorial, Drug Discovery Today, Volume 16, Issues 17-18, September 2011, Pages 747-750.
Machines first, humans second: on the importance of algorithmic interpretation of open chemistry data. Alex M Clark, Antony J Williams and Sean Ekins. Journal of Cheminformatics 2015 7:9. https://doi.org/10.1186/s13321-015-0057-7. Received: 24 November 2014, Accepted: 23 February 2015, Published: 22 March 2015.
-------
HgHTe SeHHg
We're still cleaning data too
Record Information
Benzeneselenol;lambda1-selanylbenzene;lambda1-tellai
-------
Tire Crumb Rubber (298)
July 2019 Report: Tire Crumb Rubber Characterization
Key Takeaways:
• EPA is releasing a new report that addresses exposure (that is, chemicals and how people come in contact with these) to tire crumb rubber on synthetic turf. This is not a risk assessment nor can the information be used to identify a level above which health effects could occur...
• In general, the findings f... human exposure appear...
• Only Part 1 is being relea... assessment.
• Part 1 of this report pres...
• The scope of this study v...
Tire Crumb Rubber
Description: This chemical list is based on data contained within the Federal Research Action Plan (FRAP) on Recycled Tire Crumb Used on Playing Fields and Playgrounds. The chemical list is obtained from the Toxicity reference information spreadsheet compiled for the potential tire crumb rubber chemical constituents identified in the State-of-Science Literature Review/Gaps Analysis, White Paper Summary of Results. Eleven sources of publicly available toxicity reference information were searched. It is important to recognize that not all potential chemical constituents identified through the literature search were confirmed through measurements made under the Federal Research Action Plan.
Number of Chemicals: 298
Acetonitrile
CASRN: 75-05-8
DTXSID: DTXSID7020009
Acrolein
CASRN: 107-02-8
DTXSID: DTXSID5020023
Aniline
CASRN: 62-53-3
DTXSID: DTXSID8020090
Azobenzene
CASRN: 103-33-3
DTXSID: DTXSID8020123
-------
Terpenes in Vape (37)
LIST: Terpenes in vape
Description: Terpenes are organic compounds found in the marijuana plant that give strains their distinct aromatic and flavor profiles. They are now being isolated and concentrated into oils for individual vaping.
Number of Chemicals: 37
1,8-Cineol
CASRN: 470-82-6
DTXSID: DTXSID4020616
Geranyl acetate
CASRN: 105-87-3
DTXSID: DTXSID0020654
Nerolidol
CASRN: 7212-44-4
DTXSID: DTXSID3022247
beta-Caryophyllene
CASRN: 87-44-5
DTXSID: DTXSID8024739
-------
Hydraulic Fracturing (1640)
EPA's Study of Hydraulic Fracturing and Its
Potential Impact on Drinking Water Resources
Contact Us
Hydraulic Fracturing Study
Home
Final Assessment
EPA Published Research
Fact Sheets
Questions & Answers about
the final assessment
Multi-agency collaboration
on unconventional oil and
gas research
EPA Hydraulic Fracturing -
Agency Main Page
Hydraulic Fracturing For Oil And
Gas: Impacts From The Hydraulic
WATER|EPA: Chemicals associated with hydraulic fracturing
Description: Chemicals used in hydraulic fracturing fluids and/or identified in produced water from 2005-2013, corresponding to chemicals listed in Appendix H of EPA's Hydraulic Fracking Drinking Water Assessment Final Report (Dec 2016). Citation: U.S. EPA. Hydraulic Fracturing for Oil and Gas: Impacts from the Hydraulic Fracturing Water Cycle on Drinking Water Resources in the United States (Final Report). U.S. Environmental Protection Agency, Washington, D.C., EPA/600/R-16/236F, 2016. https://www.epa.gov/hfstudy
*Note that Appendix H chemical listings in Tables H-2 and H-4 were mapped to current DSSTox content which has undergone additional curation since the publication of the original EPA HF Report (Dec 2016). In the few cases where a Chemical Name and CASRN from the original report map to distinct substances (as of Jan 2018), both were included in the current EPAHFR chemical listing for completeness; additionally, 34 previously unmapped chemicals in Table H-5 are now registered in DSSTox (all but 2 assigned CASRN) and, thus, have been added to the current EPAHFR listing.
Number of Chemicals: 1640
Acrolein
CASRN:107-02-8
DTXSID:DTXSID5020023
Acrylamide
CASRN:79-06-1
DTXSID:DTXSID5020027
Acrylonitrile
CASRN:107-13-1
DTXSID:DTXSID5020029
Aldrin
CASRN:309-00-2
DTXSID:DTXSID8020040
-------
DRUGS: Opioids and related metabolites
Description: This list of opioids and related metabolites is assembled primarily from public resources (e.g. Wikipedia, databases and literature articles) and is under ongoing curation and expansion.
Number of Chemicals: 180
Codeine
CASRN:76-57-3
DTXSID:DTXSID2020341
Alfentanil
CASRN:71195-58-9
DTXSID:DTXSID9022570
Alphaprodine
CASRN:77-20-3
DTXSID:DTXSID4022575
Anileridine
CASRN:144-14-9
DTXSID:DTXSID8022610
-------
"MS-ready"
structures
Journal of Cheminformatics
METHODOLOGY Open Access
"MS-Ready" structures for non-targeted
high-resolution mass spectrometry screening
studies
Andrew D. McEachran, Kamel Mansouri, Chris Grulke, Emma L. Schymanski, Christoph Ruttkies
and Antony J. Williams
-------
Overview of MS-Ready Structures
All structure-based chemical substances are
algorithmically processed to
- Split multicomponent chemicals into individual structures
- Desalt and neutralize individual structures
- Remove stereochemical bonds from all chemicals
• MS-Ready structures are then mapped to
original substances to provide a path from
chemicals detected by mass spectrometry to
original substances
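The processing steps listed above can be sketched at the string level. This is a toy illustration only, not the Dashboard's actual workflow: it works on SMILES text rather than on parsed structures, the counter-ion list is a hypothetical stand-in, and neutralization of charged species is not attempted. A real implementation would use a cheminformatics toolkit.

```python
# Toy sketch of "MS-Ready" style processing on SMILES strings.
# Assumptions: COUNTER_IONS is a tiny hypothetical list; charge
# neutralization is deliberately left out.
COUNTER_IONS = {"Cl", "[Cl-]", "[Na+]", "[K+]", "[Li+]"}
STEREO_MARKS = ("@@", "@", "/", "\\")


def strip_stereo(smiles):
    """Remove stereochemistry markers from a single-component SMILES."""
    for mark in STEREO_MARKS:
        smiles = smiles.replace(mark, "")
    # "[C@H]" becomes "[CH]" once the marker is gone; write it as a plain atom.
    return smiles.replace("[CH]", "C")


def ms_ready(smiles):
    """Split a multicomponent SMILES, drop counter-ions, remove stereo."""
    components = smiles.split(".")
    kept = [c for c in components if c not in COUNTER_IONS]
    return sorted({strip_stereo(c) for c in kept})
```

Under these assumptions, `ms_ready("Cl.CN1CCC[C@H]1C1=CN=CC=C1")` and `ms_ready("CN1CCC[C@@H]1C1=CN=CC=C1")` both reduce to the same stereo-free nicotine string, which is the kind of collapse the mapping relies on.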
-------
LEGEND: Name; SMILES; DTXSID | InChIKey 1st Block; CAS | Monoiso. Mass | logP | Sources; Data on: Toxicity | Exposure | Bioassays
Nicotine
CN1CCC[C@H]1C1=CN=CC=C1
DTXSID1020930 | SNICXCGAKADSCV
54-11-5 | 162.1157 | 0.929 | 72
Tox: yes | Expo: yes | Bioassay: yes
D-Nicotine
CN1CCC[C@@H]1C1=CN=CC=C1
DTXSID0046351 | SNICXCGAKADSCV
25162-00-9 | 162.1157 | 0.929 | 20
Tox: no | Expo: yes | Bioassay: yes
Benzoic acid, 2-hydroxy-, compd. with 3-[(2S)-1-methyl-2-pyrrolidinyl]pyridine (1:1)
OC(=O)C1=C(O)C=CC=C1.CN1CCC[C@H]1C1=CN=CC=C1
DTXSID5075319 | AIBWPBUAKCMKNS
29790-52-1 | 300.1474 | 0.929 | 6
Tox: no | Expo: yes | Bioassay: no
MS-ready
DL-Nicotine
CN1CCCC1C1=CN=CC=C1
DTXSID3048154 | SNICXCGAKADSCV
22083-74-5 | 162.1157 | 0.953 | 9
Tox: yes | Expo: no | Bioassay: yes
Nicotine hydrochloride
Cl.CN1CCC[C@H]1C1=CN=CC=C1
DTXSID602093 | HDJBTCAJIMNXEW
2820-51-1 | 198.0924 | 0.929 | 9
Tox: no | Expo: yes | Bioassay: yes
DL-Nicotine-d3
[2H]C([2H])([2H])N1CCCC1C1=CN=CC=C1
DTXSID80442666 | SNICXCGAKADSCV
69980-24-1 | 165.1345 | 0.929 | 1
Tox: no | Expo: no | Bioassay: no
Environmental Science & Technology
Open Science for Identifying "Known Unknown" Chemicals
Emma L. Schymanski and Antony J. Williams
43
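The nicotine records on this slide share a first InChIKey block when they have the same connectivity (stereo and isotope variants), while salts carry a different block and need the MS-Ready mapping to be linked. A minimal sketch of grouping by that first block follows; the record list is transcribed from the slide and the function name is hypothetical.

```python
from collections import defaultdict

# (name, first InChIKey block) pairs as shown on the slide.
RECORDS = [
    ("Nicotine", "SNICXCGAKADSCV"),
    ("D-Nicotine", "SNICXCGAKADSCV"),
    ("DL-Nicotine", "SNICXCGAKADSCV"),
    ("DL-Nicotine-d3", "SNICXCGAKADSCV"),
    ("Nicotine hydrochloride", "HDJBTCAJIMNXEW"),
]


def group_by_first_block(records):
    """Group substance names by the first block of their InChIKey."""
    groups = defaultdict(list)
    for name, first_block in records:
        groups[first_block].append(name)
    return dict(groups)
```

Grouping on the first block alone links the four free-base records but leaves the hydrochloride in its own group, which is why a separate desalting step is needed before salts can be tied back to the parent.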
-------
MS-Ready Mappings from
Details Page
Perfluorooctanesulfonic acid
1763-23-1 | DTXSID3031864
Searched by Synonym from Valid Source.
Wikipedia
Perfluorooctanesulfonic acid (conjugate base perfluorooctanesulfonate) (PFOS) is an anthropogenic fluorosurfactant and global pollutant. PFOS
was the key ingredient in Scotchgard, a fabric protector made by 3M, and numerous stain repellents. It was added to Annex B of the Stockholm
Convention on Persistent Organic Pollutants in May 2009. PFOS can be synthesized in industrial production or result from the degradation of
precursors. PFOS levels that have been detected in wildlife
Quality Control Notes
Intrinsic Properties
Structural Identifiers
Linked Substances
Same Connectivity: 4 records (based on first layer of InChI)
Mixtures, Components and Neutralized Forms: 9 records (based on QSAR ready mappings and with the compound as a component of a mixture)
MS-Ready Mappings: DTXCID1011864: 18 records
Similar Compounds: 83 records (based on Tanimoto coefficient >0.8)
44
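The "Similar Compounds" count above depends on a Tanimoto coefficient threshold of 0.8. As a reminder of what that coefficient measures, here is a minimal sketch on sets of fingerprint bits; the bit sets in the usage note are made up for illustration, and the Dashboard's actual fingerprint type is not stated on the slide.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared bits divided by all bits set in either."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def is_similar(bits_a, bits_b, threshold=0.8):
    """True when the coefficient is strictly above the threshold."""
    return tanimoto(bits_a, bits_b) > threshold
```

For example, two hypothetical fingerprints sharing 5 of 6 set bits give 5/6, which passes a 0.8 threshold, while 3 shared of 5 gives 0.6 and does not.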
-------
MS-Ready Mappings Set of 20
substances for "PFOS"
MS-Ready Mappings of Perfluorooctanesulfonic acid (Isotopes pre-filtered)
18 of 20 chemicals visible
Perfluorooctanesulfonic acid
CASRN:1763-23-1
DTXSID:DTXSID3031864
Piperidinium perfluorooctanesulfonate
CASRN:71463-74-6
DTXSID:DTXSID0072352
Lithium perfluorooctanesulfonate
CASRN:29457-72-5
DTXSID:DTXSID2032421
Potassium perfluorooctanesulfonate
CASRN:2795-39-3
DTXSID:DTXSID8037706
Perfluorooctanesulfonate
CASRN:45298-90-6
DTXSID:DTXSID80108992
Tetrabutylammonium perfluorooctanesu...
CASRN:111873-33-7
DTXSID:DTXSID40584995
-------
Mass and Formula
Searching
46
-------
Advanced Searches
Mass Search
Mass Search form: mass 191.131 Da; ± Min/Max in Da; adduct Neutral (or All Adducts, chosen from the dropdown)
47
-------
Advanced Searches
Mass Search
Search Results
Searched by Mass: 191.131 +/- 5.0 ppm.
329 of 329 chemicals visible
DEET | DTXSID: DTXSID2021995 | CASRN: 134-62-3 | TOXCAST: 12/768 | Mass Diff: 0.000014
Benzamide, N-pentyl- | DTXSID: DTXSID20174196 | CASRN: 20308-43-4 | TOXCAST: -
Phendimetrazine | DTXSID: DTXSID1023447 | CASRN: 634-03-7 | TOXCAST: - | Mass Diff: 0.000014
p-t-Butylacetanilide | DTXSID: DTXSID80174238 | CASRN: 20330-45-4 | TOXCAST: -
N-Butylacetanilide | DTXSID: DTXSID2042197 | CASRN: 91-49-6 | TOXCAST: - | Mass Diff: 0.000014
N,N-Diethylphenylacetamide | DTXSID: DTXSID00179048 | CASRN: 2431-96-1 | TOXCAST: -
Benzaldehyde, 4-(diethylamino)-2-meth... | DTXSID: DTXSID4059C41 | CASRN: 92-14-8 | TOXCAST: - | Mass Diff: 0.000014
3-(Dimethylamino)-2-methylpropiophen... | DTXSID: DTXSID60180796 | CASRN: 26171-50-6 | TOXCAST: -
Acetanilide, 2',6'-diethyl- | DTXSID: DTXSID90168U8 | CASRN: 16665-89-7 | TOXCAST: - | Mass Diff: 0.000014
Butyramide, 2-ethyl-2-phenyl- | DTXSID: DTXSID60184653 | CASRN: 30568-39-9 | TOXCAST: -
Azetidine, 1,3-dimethyl-3-(m-methoxyp... | DTXSID: DTXSID40173560 | CASRN: 19832-26-9 | TOXCAST: - | Mass Diff: 0.000014
1-Heptanone, 1-(4-pyridyl)- | DTXSID: DTXSID40136594 | CASRN: 32941-30-3 | TOXCAST: -
48
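The search on this slide uses a tolerance of 5.0 ppm around 191.131 Da. The arithmetic behind such a window is short enough to sketch; the function names are hypothetical and this is not the Dashboard's code.

```python
def ppm_window(mass, ppm):
    """Return the (low, high) mass bounds for a +/- ppm tolerance."""
    delta = mass * ppm / 1_000_000
    return mass - delta, mass + delta


def ppm_error(observed, theoretical):
    """Signed mass error of an observed value, in parts per million."""
    return (observed - theoretical) / theoretical * 1_000_000


low, high = ppm_window(191.131, 5.0)
```

At this mass, 5 ppm is just under 0.001 Da on either side. The "Mass Diff: 0.000014" reported for DEET corresponds to well under 0.1 ppm, so it sits comfortably inside the window.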
-------
Molecular Formula Search
Search options: Exact Formula or MS Ready Formula
Formula: please use the format of the following examples: C6H8O2 or C6H(8-10)O(0-2)
49
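The formula box accepts either a fixed formula or element count ranges such as C6H(8-10)O(0-2). One way to read that range notation is as shorthand for every formula in the ranges; the sketch below expands it under that assumption. The parsing rules here are a guess at the notation, not the Dashboard's parser.

```python
import itertools
import re

# An element symbol followed by either "(lo-hi)" or an optional plain count.
TOKEN = re.compile(r"([A-Z][a-z]?)(?:\((\d+)-(\d+)\)|(\d*))")


def expand_formula(pattern):
    """Expand a pattern like 'C6H(8-10)O(0-2)' into explicit formulas."""
    symbols, ranges = [], []
    for symbol, low, high, count in TOKEN.findall(pattern.replace(" ", "")):
        symbols.append(symbol)
        if low:
            ranges.append(range(int(low), int(high) + 1))
        else:
            n = int(count) if count else 1
            ranges.append(range(n, n + 1))
    formulas = []
    for counts in itertools.product(*ranges):
        # Elements with a count of zero are omitted; a count of one is implicit.
        formulas.append("".join(
            f"{s}{c if c > 1 else ''}" for s, c in zip(symbols, counts) if c > 0
        ))
    return formulas
```

With this reading, the slide's example expands to nine formulas, from C6H8 through C6H10O2.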
-------
MS-Ready Mappings
EXACT Formula: C10H16N2O8: 3 Hits
Formula: C10H16N2O8
3 of 3 chemicals
Ethylenediaminetetraacetic acid
DTXSID: DTXSID5022977
PubChem: 158
CPDAT: 387
N,N'-Ethylenedi-L-aspartic acid
DTXSID: DTXSID1051852
PubChem: 25
CPDAT: 8
Dimethyl 2,7-dinitrooctanedioate
DTXSID: DTXSID20498864
PubChem: 5
CPDAT: 0
50
-------
MS-Ready Mappings
Same Input Formula: C10H16N2O8
MS Ready Formula Search: 125 Chemicals
Trisodium ethylenediaminetetraacetate | DTXSID: DTXSID7020556 | PubChem: 33 | CPDAT: 82
Ethylenediaminetetraacetic acid | DTXSID: DTXSID5022977 | PubChem: 158 | CPDAT: 387
Ethylenediaminetetraacetic acid tetrasod... | DTXSID: DTXSIDS026350 | PubChem: 57 | CPDAT: 1227
Ethylenediaminetetraacetic acid disodiu... | DTXSID: DTXSID9027073 | PubChem: 56 | CPDAT: 1359
Ethylenediaminetetraacetic acid ferric so... | DTXSID: DTXSID5027774 | PubChem: 53 | CPDAT: 62
Diammonium dihydrogen ethylenediami... | DTXSID: DTXSID9027813 | PubChem: 12 | CPDAT: 17
Ferrate(1-), [[N,N'-1,2-ethanediylbis[N-[(... | DTXSID: DTXSID9027815 | PubChem: 24 | CPDAT: 20
Tetraammonium ethylenediaminetetraac... | DTXSID: DTXSID8027820 | PubChem: 11 | CPDAT: 12
Zincate(2-), [[N,N'-1,2-ethanediylbis[N-[... | DTXSID: DTXSID8323343 | PubChem: 5 | CPDAT: 10
EDTA copper salt | DTXSID: DTXSID0034564 | PubChem: 8 | CPDAT: 10
Calcium disodium ethylenediaminetetra... | DTXSID: DTXSID2036409 | PubChem: 42 | CPDAT: 29
Tetrapotassium ethylenediaminetetraace... | DTXSID: DTXSID3036442 | PubChem: 25 | CPDAT: 36
51
-------
MS-Ready Mappings
• 125 chemicals returned in total
- 8 of the 125 are single component chemicals
- 3 of the 8 are isotope-labeled
- 3 are neutral compounds and 2 are charged
• Multiple components, stereo, isotopes and
charge all collapsed and mapped through
MS-Ready
52
-------
Batch Searching
mass and formula
53
-------
Batch Searching
Singleton searches are useful but we work
with thousands of masses and formulae!
Typical questions
- What is the list of chemicals for the formula CxHyOz?
- What is the list of chemicals for a mass +/- error?
- Can I get chemical lists in Excel files? In SDF files?
- Can I include properties in the download file?
54
-------
Batch Searching Formula/Mass
Batch Search (Steps 1 to 6)
Step Five: Choose Data Fields to Download
Select Input Type(s)
Identifiers: Chemical Name, CASRN, InChIKey, DSSTox Substance ID, DSSTox Compound ID, InChIKey Skeleton
ppm
Enter Identifiers to Search (searches should be limited to <5000 identifiers); please enter one identifier per line
41.0265
56.02621
53.0265
58.041 s|
93.057S
113.9639
151.8754
69.9377
77.9872
Other input types: MS-Ready Formula(e), Exact Formula(e), Monoisotopic Mass
MS-Ready Formula(e) search is based on what we refer to as "Mass Spec (MS) Ready" structures. All chemicals within the database are treated in a manner such that all are desalted, mixtures are separated, and stereochemistry is removed as Mass Spectrometry detects the major components of a salt or mixture and is insensitive to stereochemistry. As an example, a search for the monoisotopic mass of phenol will return phenol, sodium phenolate and calcium phenoxide. See the publication for more details: https://doi.org/10.1186/s13321-018-0299-2.
55
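The help text above refers to searching by the monoisotopic mass of a neutral formula such as phenol. A minimal sketch of that calculation follows, using standard monoisotopic masses for a handful of elements; the small element table and the flat-formula parser (no brackets, no charges) are simplifying assumptions.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes; only a few elements.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "Cl": 34.96885268,
}


def monoisotopic_mass(formula):
    """Sum isotope masses for a flat formula such as 'C14H22N2O3'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total
```

For phenol (C6H6O) this gives about 94.0419 Da. For C14H22N2O3 it gives about 266.1630 Da, in line with the value reported for atenolol in the batch output on the following slide.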
-------
Searching batches using MS-Ready
Formula (or mass) searching
INPUT | DTXSID | CASRN | PREFERRED NAME | MOL FORMULA | MONOISOTOPIC MASS | DATA SOURCES
C14H22N2O3 | DTXSID2022628 | 29122-68-7 | Atenolol | C14H22N2O3 | 266.163042576 | 46
C14H22N2O3 | | | D-Atenolol | C14H22N2O3 | 266.163042576 | 19
C14H22N2O3 | DTXSID2048531 | 5011-34-7 | Trimetazidine | C14H22N2O3 | 266.163042576 | 14
C14H22N2O3 | DTXSID10239405 | 93379-54-5 | Esatenolol | C14H22N2O3 | 266.163042576 | 12
C14H22N2O3 | DTXSID50200634 | 52662-27-8 | N-(2-Diethylaminoethyl)-2-(4-hydroxyphenoxy)acetamide | C14H22N2O3 | 266.163042576 | 7
C14H22N2O3 | DTXSID4020111 | 51706-40-2 | dl-Atenolol hydrochloride | C14H23ClN2O3 | 302.1397203 | 6
C14H22N2O3 | DTXSID1068693 | 51963-82-7 | Benzenamine, 2,5-diethoxy-4-(4-morpholinyl)- | C14H22N2O3 | 266.163042576 | 5
C18H34N2O6S | DTXSID3023215 | 154-21-2 | Lincomycin | C18H34N2O6S | 406.213757997 | 35
C18H34N2O6S | DTXSID7047803 | 859-18-7 | Lincomycin hydrochloride | C18H35ClN2O6S | 442.1904357 | 22
C18H34N2O6S | DTXSID20849438 | 1398534-62-7 | PUBCHEM 71432748 | C18H35ClN2O6S | 442.1904357 | 1
C10H12N2O | DTXSID1047576 | 486-56-6 | Cotinine | C10H12N2O | 176.094963014 | 40
C10H12N2O | DTXSID8075330 | 50-67-9 | Serotonin | C10H12N2O | 176.094963014 | 22
C10H12N2O | DTXSID8044412 | 2654-57-1 | 4-Methyl-1-phenylpyrazolidin-3-one | C10H12N2O | 176.094963014 | 18
C10H12N2O | DTXSID80165186 | 153-98-0 | Serotonin hydrochloride | C10H13ClN2O | 212.0716407 | 11
C10H12N2O | DTXSID2048870 | 29493-77-4 | (4R,5S)-4-methyl-5-phenyl-4,5-dihydro-1,3-oxazol-2-amine | C10H12N2O | 176.094963014 | 10
C10H12N2O | DTXSID10196105 | 443-31-2 | 6-Hydroxytryptamine | C10H12N2O | 176.094963014 | 9
C10H12N2O | DTXSID90185693 | 31822-84-1 | 1,4,5,6-Tetrahydro-5-phenoxypyrimidine | C10H12N2O | 176.094963014 | 7
C10H12N2O | DTXSID40178777 | 2403-66-9 | 2-Benzimidazolepropanol | C10H12N2O | 176.094963014 | 7
C10H12N2O | DTXSID80157026 | 13140-86-8 | N-Cyclopropyl-N'-phenylurea | C10H12N2O | 176.094963014 | 6
C10H12N2O | DTXSID30205607 | 570-14-9 | 4-Hydroxytryptamine | C10H12N2O | 176.094963014 |
C14H18N4O3 | DTXSID5023900 | 17804-35-2 | Benomyl | C14H18N4O3 | 290.137890456 | 68
C14H18N4O3 | DTXSID3023712 | 738-70-5 | Trimethoprim | C14H18N4O3 | 290.137890456 | 51
C14H18N4O3 | DTXSID40209671 | 60834-30-2 | Trimethoprim hydrochloride | C14H19ClN4O3 | 326.1145682 | 8
C14H18N4O3 | DTXSID70204210 | 55687-49-5 | Benzenemethanol, 4-((2,4-diamino-5-pyrimidinyl)methyl)-2,... | C14H18N4O3 | 290.137890456 | 5
C14H18N4O3 | DTXSID20152671 | 120075-57-2 | 6-Methoxy-4-(3-(N,N-dimethylamino)propylamino)-5,8-quin... | C14H18N4O3 | 290.137890456 | 4
C14H18N4O3 | DTXSID30213742 | 63931-79-3 | 1H-1,2,4-Benzotriazepine-3-carboxylic acid, 4,5-dihydro-4-... | C14H18N4O3 | 290.137890456 | 3
C14H18N4O3 | DTXSID30219608 | 69449-07-6 | 2,4-Pyrimidinediamine, 5-((3,4,5-trimethoxyphenyl)methyl)-... | C14H20N4O4 | 308.14845514 | 3
C14H18N4O3 | DTXSID20241155 | 94232-27-6 | L-Aspartic acid, compound with 5-((3,4,5-trimethoxyphenyl... | C18H25N5O7 | 423.175398165 | 3
C14H18N4O3 | DTXSID80241156 | 94232-28-7 | L-Glutamic acid, compound with 5-((3,4,5-trimethoxyphen... | C19H27N5O7 | 437.191048229 | 3
C14H18N4O3 | DTXSID20143781 | 101204-93-7 | 1H-Pyrido[2,3-e][1,4]diazepine-2,3,5-trione, 4-(2-(diethylam... | C14H18N4O3 | 290.137890456 | 3
C12H11N7 | DTXSID6021373 | 396-01-0 | Triamterene | C12H11N7 | 253.107593382 | 52
C12H11N7 | DTXSID00204465 | 5587-93-9 | Ampyrimine | C12H11N7 | 253.107593382 | 7
C12H11N7 | DTXSID5064621 | 7300-26-7 | Benzenamine, 4-azido-N-(4-azidophenyl)- | C12H9N7 | 251.091943318 | 4
C12H11N7 | DTXSID00848025 | 90293-82-6 | Sulfuric acid-6-phenylpteridine-2,4,7-triamine (1/1) | C12H13N7O4S | 351.074973101 | 1
C12H11N7 | DTXSID50575293 | 92310-83-3 | (1E)-N-Phenyl-1,2-bis(1H-1,2,4-triazol-1-yl)ethan-1-imine | C12H11N7 | 253.107593382 | 1
C8H9NO2 | DTXSID2020006 | 103-90-2 | Acetaminophen | C8H9NO2 | 151.063328534 |
75
-------
Batch Search in specific lists
INPUT | DTXSID | MASSBANKREF | NEMILIST | WRTMSD | NORMANPRI | SUSDAT
Buprenorph... | DTXSID202... | - | - | Y | - | Y
Codeine | DTXSID202... | Y | Y | Y | Y | Y
Dextrometh... | DTXSID302... | Y | Y | Y | - | Y
Dihydrocodi... | DTXSID502... | Y | - | Y | Y | Y
Dihydromor... | DTXSID704... | - | - | - | - | Y
Ethylmorph... | DTXSID104... | - | - | Y | - | Y
Fentanyl | DTXSID902... | Y | - | Y | - | Y
Heroin | DTXSID604... | Y | - | Y | Y | Y
Hydrocodon... | DTXSID802... | Y | Y | Y | Y | Y
Hydromorph... | DTXSID802... | - | - | Y | - | Y
Ketamine | DTXSID802... | Y | - | Y | - | Y
Meperidine | DTXSID902... | Y | - | Y | - | Y
Methadone | DTXSID702... | Y | Y | Y | - | Y
Morphine | DTXSID902... | Y | Y | Y | Y | Y
Morphinone | DTXSID501... | - | - | - | - | Y
Naloxone | DTXSID802... | - | - | Y | - | Y
Naltriben | - | - | - | - | - | -
Oxycodone | DTXSID502... | Y | Y | Y | Y | Y
Oxymorpho... | DTXSID502... | - | - | Y | - | Y
Propoxyphe... | DTXSID102... | Y | Y | Y | - | Y
Sufentanil | DTXSID602... | - | - | Y | - | Y
Tramadol | DTXSID908... | Y | Y | Y | Y | Y
-------
Benefits of bringing it all together
The true dashboard benefit is integration
Rank potential candidates for toxicity using
available data - hazard, exposure, in vitro
Environment International 88 (2016) 269-280
Linking high resolution mass spectrometry data with exposure and toxicity forecasts to advance high-throughput environmental monitoring
Julia E. Rager, Mark J. Strynar, Shuang Liang, Rebecca L. McMahen, Ann M. Richard, Christopher M. Grulke, John F. Wambaugh, Kristin K. Isaacs, Richard Judson, Antony J. Williams, Jon R. Sobus
a Oak Ridge Institute for Science and Education (ORISE) Participant, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, United States
b U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, United States
c U.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, United States
d Lockheed Martin, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, United States
[Figure: labeled chemical structures, including N,N-diethyl-m-toluamide (DEET)]
58
-------
Candidate ranking
using metadata
© American Society for Mass Spectrometry, 2011. J. Am. Soc. Mass Spectrom. (2012) 23:179-185
DOI: 10.1007/s13361-011-0265-y
RESEARCH ARTICLE
Identification of "Known Unknowns" Utilizing
Accurate Mass Data and ChemSpider
-------
Data Source Ranking of
"known unknowns"
A mass and/or formula search is
for an unknown chemical but it
is a known chemical contained
within a reference database
Most likely candidate chemicals
have the most associated data
sources, most associated
literature articles or both
[Workflow diagram: C14H22N2O3 (266.16304) -> Chemical Reference Database -> Sorted candidate structures]
60
-------
• CompTox Dashboard Data Sources
• PubChem Data Source Count
• PubMed Reference Count
• ToxCast in vitro bioactivity
• Presence in CPDat database
• OPERA PhysChem Properties
• Other possibilities - predicted media
occurrence, frequency of InChIs online
-------
Search 228.115 +/- 5.0 ppm
234 single component chemicals
Search Results
Searched by Mass: 228.115 +/- 5.0 ppm.
234 of 247 chemicals (Multicomponent Chemicals hidden)
Bisphenol A | CASRN:80-05-7 | DTXSID:DTXSID7020182 | Mass Diff:0.00003
Phenol, 3-methoxy-5-(2-phenylethyl)- | CASRN:17635-59-5
Phenol, 2-[1-(4-hydroxyphenyl)-1-methy... | CASRN:837-08-1 | DTXSID:DTXSID7042275 | Mass Diff:0.00003
1,1'-(Dimethoxymethylene)bisbenzene | CASRN:??3S-01-n
4,4'-Propane-1,1-diyldiphenol | CASRN:1576-13-2 | DTXSID:DTXSID3044594 | Mass Diff:0.00003
Nabumetone | CASRN:42924-53-8 | DTXSID:DTXSID4045472 | Mass Diff:0.00003
Preventol D2 | CASRN:?749-70-4
1-Methoxy-2-((4-methoxyphenyl)methyl... | CASRN:50S67-R7-4
Phenol, 2,2'-methylenebis[4-methyl- | CASRN:3236-63-3 | DTXSID:DTXSID0062923 | Mass Diff:0.00003
1,2-Naphthalenedione, 3,8-dimethyl-5-(... | CASRN:5574-34-5
Gweicurculactone | CASRN:123914-43-2 | DTXSID:DTXSID50154143 | Mass Diff:0.00003
Methane, bis-(o-methoxyphenyl)- | CASRN:5819-93-7
62
-------
Search 228.115 +/- 5.0 ppm
234 single component chemicals
CASRN | QC Level | CPDat Count | Number of Sources | PubChem Data Sources | PubMed Ref. Counts
80-05-7 | Level 1 | 326 | 170 | 161 | 3850
42924-53-8 | Level 2 | 14 | 45 | 138 | 342
87619-52-1 | Level 5 | 0 | 2 | 0 |
87607-32-7 | Level 5 | 0 | 2 | 0 |
63
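The idea on these slides is that, among candidates sharing a mass, the one with the most associated data sources and literature is the most likely identification. A minimal sketch of such a ranking follows, using the two fully legible rows of the table above; the sort key (data sources first, PubMed count as tie-breaker) is an assumption for illustration, not a documented Dashboard rule.

```python
# Candidate metadata taken from the two complete rows of the table.
CANDIDATES = [
    {"casrn": "42924-53-8", "sources": 45, "pubmed": 342},
    {"casrn": "80-05-7", "sources": 170, "pubmed": 3850},
]


def rank_candidates(candidates):
    """Order candidates by data source count, then PubMed reference count."""
    return sorted(
        candidates,
        key=lambda c: (c["sources"], c["pubmed"]),
        reverse=True,
    )


ranked = rank_candidates(CANDIDATES)
```

With these numbers Bisphenol A (80-05-7) ranks first for a 228.115 Da search, ahead of nabumetone (42924-53-8).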
-------
The original ChemSpider work
Compound class | Number in class | Average rank | Number of compounds in each position, rank-ordered (#1, #2, #3, #4, #5+)
Pharmaceutical drug | 72 | 1.4 | 55, 9, 6, 2
Industrial chemicals | 42 | 5.5 | 28, 6, 3, 5
Personal care products | 8 | 6.1 | 3, 1, 4
Steroid hormones | 7 | 1.0 | 7
Perfluorochemicals | 6 | 1.2 | 5, 1
Pesticides | 12 | 2.3 | 6, 2, 3, 1
Veterinary drugs | 3 | 1.3 | 2, 1
Dyes | 2 | 1.0 | 2
Food product/natural compounds | 4 | 3.8 | 2, 1, 1
Illicit drugs | 2 | 2.0 | 1, 1
Misc. molecules | 3 | 1.3 | 2, 1
64
-------
Is a bigger database better?
• ChemSpider was 26 million chemicals for
the original work
• Much BIGGER today (Si/Million chemical structures)
• Is bigger better??
• Are there other metadata to use for ranking?
-------
Comparing Search Performance
Anal Bioanal Chem (2017) 409:1729-1735
DOI 10.1007/s00216-016-0139-z
RAPID COMMUNICATION
Identifying known unknowns using the US EPA's CompTox
Chemistry Dashboard
Andrew D. McEachran1 • Jon R. Sobus2 • Antony J. Williams1
When dashboard contained 720k chemicals
Only 3% of ChemSpider size
What was the comparison in performance?
66
-------
SAME dataset for comparison
[Table repeated from the slide "The original ChemSpider work", overlaid with the label: THE SAME DATASET]
67
-------
How did performance compare?
Mass-based searching
Formula-based searching
Dashboard
ChemSpider
Dashboard
ChemSpider
Average rank position
Percent in #1 position
.3
9 y
1.2
1.4
a Average rank in ChemSpider shown here does not include an outlier where the rank was 201, when added the
average rank position is 3.5
For the same 162 chemicals,
Dashboard outperforms
ChemSpider for both Mass and
Formula Ranking
68
-------
Identification ranks for 1783 chemicals
using multiple data streams
[Bar chart: y-axis from 0 to 100]
Data Sources alone
rank ~75% of the
chemicals as Top Hit
Top Hit
Top 10
DS: Data Sources
PC: PubChem
PM: PubMed
STOFF: DB
KEMI: DB
59
-------
"UVCB"
Chemicals
-------
UVCB Chemicals
Chemical Substances of Unknown
or Variable Composition, Complex
Reaction Products and Biological
Materials (UVCB Substance) on
the TSCA Inventory
This paper is a compendium of information related to the broad class of chemical substances referred to as UVCBs for the Toxic Substances Control Act (TSCA) Chemical Substance Inventory. These chemical substances cannot be represented by unique structures and molecular formulas.
71
-------
UVCBs challenge in non-target analysis
o Complex mixtures (UVCBs) are a huge
and very challenging part of the
unknowns in many environmental
samples
[Plots: m/z (200 to 800) against retention time]
Homologue screening plots from
Swiss Wastewater (Schymanski et al
2014, left) and Novi Sad (right)
72
-------
Public TSCA Inventory on Dashboard
31,460 Chemicals (1/24/2020)
EPA|TSCA: TSCA Inventory, active non-confidential portion
Description: Section 8 (b) of the Toxic Substances Control Act (TSCA) requires EPA to compile, keep current and publish a list of each chemical substance that is manufactured or processed including imports, in the United States for uses
under TSCA. Information about what types of substances are on the TSCA inventory can be found here. The Toxic Substances Control Act (TSCA), as amended by the Frank R. Lautenberg Chemical Safety for the 21st Century Act, requires EPA
to designate chemical substances on the TSCA Chemical Substance Inventory as either "active" or "inactive" in U.S. commerce. To accomplish this, EPA finalized a rule requiring industry reporting of chemicals manufactured (including
imported) or processed in the U.S.. This reporting is used to identify which chemical substances on the TSCA Inventory are active in U.S. commerce and help inform the prioritization of chemicals for risk evaluation. The list contained in the
dashboard includes the active TSCA inventory based on notifications through Feb. 7th 2018 and substances reported from Feb 8, 2018 - March 30, 2018 that have been unambiguously mapped to DSSTox using CASRN and chemical names.
The curation of the non-confidential portion of active TSCA inventory is an ongoing process involving trained chemists to validate the correctness of DSSTox structural and identifier data. The content of the list will change over time as the
non-confidential active TSCA inventory is updated and more substances are curated. (Updated January 5th 2020)
Number of Chemicals: 31460
Acetaldehyde oxime
CASRN:107-29-9
DTXSID:DTXSID2020004
Acetamide
CASRN:60-35-5
DTXSID:DTXSID7020005
Acetonitrile
CASRN:75-05-8
DTXSID:DTXSID7020009
Acetaminophen
CASRN:103-90-2
DTXSID:DTXSID2020006
73
-------
Many Chemicals are "Complex"
>14000 chemicals are UVCBs
2-Imidazolidinone, 4,5-dimethoxy-1,3-b... | DTXSID: DTXSID0027569 | PubChem: 24 | CASRN: 4356-80-9
Palm kernel oil | DTXSID: DTXSID0027694 | PubChem: 0 | CASRN: 8023-79-8 | 0 related chemical structures with this substance
Malic acid | DTXSID: DTXSID0027640 | PubChem: 273 | CASRN: 6915-15-7
Tallow, hydrogenated | DTXSID: DTXSID00271BQ6 | PubChem: 0 | CASRN: 8030-12-4 | 0 related chemical structures with this substance
C.I. Pigment Red 43, calcium salt (1:1) | DTXSID: DTXSID0027642 | PubChem: 0 | CASRN: 7023-61-2
Quaternary ammonium compounds, tn.. | DTXSID: DTXSID0027893 | PubChem: 0 | CASRN: 3030-73-2 | 0 related chemical structures with this substance
Lard, oil | DTXSID: DTXSIDOO2709O | PubChem: 0 | CASRN: 8016-28-2 | 0 related chemical structures with this substance
4-(2-Phenylpropan-2-yl)-N-[4-(2-phenyl.. | DTXSID: DTXSID0027721 | PubChem: 50 | CASRN: 10081-67-1
Tall-oil pitch | DTXSID: DTXSID0027992 | PubChem: 0 | CASRN: 8016-81-7 | 0 related chemical structures with this substance
Isomethyltetrahydrophthalic anhydride | DTXSID: DTXSID0027729 | PubChem: 0 | CASRN: 11070-44-3 | 1 related chemical structure with this substance
74
-------
"Markush Structures"
https://en.wikipedia.org/wiki/Markush_structure
Xylenes
1330-20-7
(C10-C16) Alkylbenzenesulfonic acid
68584-22-5
Methylnaphthalene
1321-94-4
Sodium xylenesulfonate
1300-72-7
n-Nonylphenol
25154-52-3
Diisononyl phthalate
28553-12-0
75
-------
How to represent complexity?
Alkylbenzenesulfonate, linear
42615-29-2 | DTXSID3020041
Searched by DSSTox Substance Id.
15 of 25 chemicals selected
Relationship labels shown: Searched Chemical, Component, Markush Child
Alkylbenzenesulfonate, linear | CASRN:42615-29-2 | DTXSID:DTXSID3020041 (Searched Chemical)
4-(3-Dodecanyl)benzenesulfonic acid | CASRN:18777-54-3 | DTXSID:DTXSID7058670
(C10-C16) Alkylbenzenesulfonic acid | CASRN:68584-22-5 | DTXSID:DTXSID2028723
4-(1-Heptylnonyl)benzenesulfonic acid | CASRN:80233-94-9 | DTXSID:DTXSID40273953
C-12-linear alkyl benzene sulfonate | CASRN:NOCAS_891641 | DTXSID:DTXSID90891641
2-Phenyldodecane-p-sulfonate | CASRN:18777-53-2 | DTXSID:DTXSID40274021
C10-linear alkylbenzenesulfonate | CASRN:NOCAS_891689 | DTXSID:DTXSID70891689
4-Isopropylbenzenesulfonic acid | CASRN:16066-35-6 | DTXSID:DTXSID1044932
4-(Decan-3-yl)benzene-1-sulfonic acid | CASRN:65186-00-7 | DTXSID:DTXSID20859618
4-(Dodecan-6-yl)benzene-1-sulfonic acid | CASRN:23003-92-1 | DTXSID:DTXSID30860093
76
-------
In the Dashboard
Abstract
Sifter
-------
Literature Searching
Limonene
138-86-3 | DTXSID2029612
Searched by Approved Name.
1) Select a PubMed starting point query, then 2) click on Retrieve Articles.
Select a Query Term
Hazard
Fate and Transport
Metabolism/PK/PD
Chemical Properties
Exposure
Mixtures
Male Reproduction
Androgen Disruption
Female Reproduction
GeneTox
Cancer
Clinical Trials
Embryo and embryonic development
Child (infant through adolescent)
Dust and Exposure
Food and Exposure
Water and Exposure
Algae
Disaster / Emergency
Optionally, edit the query before retrieving.
("138-86-3" OR "Limonene") AND (food AND (exposure OR near-field OR far-field OR nhanes OR Environmental Monitoring OR Environmental Exposure OR exposome))
70 of 70 articles loaded...
78
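The "Food and Exposure" query shown on this slide combines the chemical's identifiers with a topic term and a set of exposure-related terms. A minimal sketch of assembling a query string of that shape follows; the function and argument names are hypothetical, and only the final string format is taken from the slide.

```python
def build_query(identifiers, topic, context_terms):
    """Build '(ids) AND (topic AND (context terms))' as on the slide."""
    ids = " OR ".join(f'"{identifier}"' for identifier in identifiers)
    context = " OR ".join(context_terms)
    return f"({ids}) AND ({topic} AND ({context}))"


query = build_query(
    ["138-86-3", "Limonene"],
    "food",
    ["exposure", "near-field", "far-field", "nhanes",
     "Environmental Monitoring", "Environmental Exposure", "exposome"],
)
```

The resulting string matches the limonene query displayed above and could be edited by hand before retrieval, as the slide suggests.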
-------
Literature Searching
To find articles quickly, enter terms to sift abstracts.
Terms: limonene, food, exposure
limonene | food | exposure | Total | PMID | Year | Title
17 | 2 | 2 | 21 | 2024047 | 1991 | The human relevance of the renal tumor-inducing pote...
11 | 2 | 3 | 16 | 23424676 | 2013 | Mechanism of bacterial inactivation by (+)-limonene an...
10 | 1 | 3 | 14 | 23573938 | 2013 | Safety evaluation and risk assessment of d-Limonene.
10 | 5 | 0 | 15 | 12633519 | 2003 | Development of a questionnaire and a database for as.
9 | 1 | 1 | 11 | 18809464 | 2008 | Odour of limonene affects feeding behaviour in the bio...
The human relevance of the renal tumor-inducing potential of d-limonene in male rats: implications for risk assessment.
The monoterpene d-limonene is a naturally occurring chemical which is the major component in oil of orange. Currently, d-limonene is widely used as a flavor and fragrance and is listed to be generally recognized as safe (GRAS) in food by the Food and Drug Administration (21 CFR 182.60 in the Code of Federal Regulations). Recently, however, d-limonene has been shown to cause a male rat-specific kidney toxicity referred to as hyaline droplet nephropathy. Furthermore, chronic exposure to d-limonene causes a significant incidence of renal tubular tumors exclusively in male rats. Although d-limonene is not carcinogenic in female rats or male and female mice given much higher dosages, the male rat-specific nephrocarcinogenicity of d-limonene may raise some concern regarding the safety of d-limonene for human consumption. A considerable body of scientific data has indicated that the renal toxicity of d-limonene results from the accumulation of a protein, alpha 2u-globulin, in male rat kidney proximal tubule lysosomes. This protein is synthesized exclusively by adult male rats. Other species, including humans, synthesize proteins that share significant homology with alpha 2u-globulin. However, none of these proteins, including the mouse equivalent of alpha 2u-globulin, can produce this toxicity, indicating a unique specificity for alpha 2u-globulin. With chronic exposure to d-limonene, the hyaline droplet nephropathy progresses and the kidney shows tubular cell necrosis, granular cast formation at the corticomedullary junction, and compensatory cell proliferation. Both d-limonene and cis-d-limonene-1,2-oxide (the major metabolite involved in this toxicity) are negative in in vitro mutagenicity screens. Therefore, the toxicity-related renal cell proliferation is believed to be integrally involved in the carcinogenicity of d-limonene as persistent elevations in renal cell proliferation may increase fixation of spontaneously altered DNA or serve to promote spontaneously initiated cells. The scientific data base demonstrates that the tumorigenic activity of d-limonene in male rats is not relevant to humans. The three major lines of evidence supporting the human safety of d-limonene are (1) the male rat specificity of the nephrotoxicity and carcinogenicity; (2) the pivotal role that alpha 2u-globulin plays in the toxicity, as evidenced by the complete lack of toxicity in other species despite the presence of structurally similar proteins; and (3) the lack of genotoxicity of both d-limonene and d-limonene-1,2-oxide, supporting the concept of a nongenotoxic mechanism, namely, sustained renal cell proliferation. (ABSTRACT TRUNCATED AT 400 WORDS)
79
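The sifting shown on this slide counts how often each entered term occurs in an abstract and reports a total per article. A minimal sketch of that counting follows; the case-insensitive substring matching and the sort by total are assumptions, and the abstracts in the usage note are invented placeholders.

```python
import re


def sift(abstracts, terms):
    """Count term occurrences per abstract and sort by total, highest first.

    abstracts: dict mapping an article id to its abstract text.
    Returns a list of (article_id, per_term_counts, total) tuples.
    """
    rows = []
    for article_id, text in abstracts.items():
        lowered = text.lower()
        counts = [
            len(re.findall(re.escape(term.lower()), lowered)) for term in terms
        ]
        rows.append((article_id, counts, sum(counts)))
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

For two placeholder abstracts, "Limonene exposure in food; limonene again." and "Food only.", sifting on limonene, food and exposure gives counts of 2, 1, 1 (total 4) and 0, 1, 0 (total 1), so the first abstract is listed first.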
-------
Abstract Sifter for Excel
F1000Research
F1000Research 2017, 6(Chem Inf Sci):2164 Last updated: 02 OCT 2019
SOFTWARE TOOL ARTICLE
Abstract Sifter: a comprehensive front-end system to PubMed
[version 1; peer review: 2 approved]
Nancy Baker (1), Thomas Knudsen (2), Antony Williams (2)
(1) Leidos, Research Triangle Park, NC, USA
(2) National Center for Computational Toxicology, U.S. Environmental Protection Agency, Research Triangle Park, NC, USA
First published: 21 Dec 2017, 6(Chem Inf Sci):2164 (https://doi.org/10.12688/f1000research.12865.1)
Latest published: 21 Dec 2017, 6(Chem Inf Sci):2164 (https://doi.org/10.12688/f1000research.12865.1)
80
-------
Work in Progress
-------
List Registration Activities
Registering and curating numerous lists
- NIST library of chemicals: clean-up, especially around stereochemical representation
- United States Geological Survey chemicals in water
- Scientific Working Group for the Analysis of Seized Drugs
- Synthetic Cannabinoids
- Blood Exposome Database
-------
Blood Exposome Curation
• Blood exposome data collection from Barupal and Fiehn. Great work, and we are reviewing it.
Vol. 127, No. 9 | Research
Generating the Blood Exposome Database Using a Comprehensive Text Mining and Database Fusion Approach
Dinesh Kumar Barupal and Oliver Fiehn
Published: 26 September 2019 | CID: 097008 | https://doi.org/10.1289/EHP4713 | Cited by: 1
• Aggregating large datasets is CHALLENGING
• Comparing with our "Abstract Sifter" approach
• We will iterate toward a dashboard form.
-------
Prototype Work in Progress
• CFM-ID
- Viewing and Downloading pre-predicted spectra
- Search spectra against the database
• Structure/substructure/similarity search
• Access to API and web services
-------
Predicted Mass Spectra
http://cfmid.wishartlab.com/
CFM-ID: Competitive Fragmentation Modeling for Metabolite Identification
MS/MS spectra prediction for ESI+, ESI-, and EI
Predictions generated and stored for >800,000 structures, to be accessible via Dashboard
[Figure: predicted spectrum for DTXCID00916157; horizontal axis ticks 20 to 270, vertical axis ticks 20 to 100]
-------
Search Expt. vs. Predicted Spectra
Non Target Analysis Prototype
Mass Search
  Min/Max: 321.136493473 Da
Molecular Formula Search
  [value illegible] Da / ppm
  Molecular Formula
Mass or Formula must be entered before searching spectrum
Ionization Type: ESI+ (options: ESI+, ESI-, EI)
Spectra Input (Single Energy or Multiple), entered as m/z and intensity pairs; "..." marks digits lost in extraction:
  304.1332052   11.9...
  188.0913404   7.30...
  123.0440569   6.53...
  196.0756654   5.206463115
  216.1019051   4.7...
Peak Match Window: 0.02 (Da or ppm)
Search
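The form accepts its mass and peak-match windows in either Da or ppm. The sketch below shows how such a tolerance check could work, assuming the usual definition of ppm error (1e6 × (observed − theoretical) / theoretical). The function names are hypothetical and this is not the Dashboard's own code; the 0.02 Da default simply mirrors the Peak Match Window shown on the slide.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_window(observed_mz, theoretical_mz, window, unit="Da"):
    """True if two m/z values agree within the window (absolute Da or relative ppm)."""
    if unit == "Da":
        return abs(observed_mz - theoretical_mz) <= window
    if unit == "ppm":
        return abs(ppm_error(observed_mz, theoretical_mz)) <= window
    raise ValueError(f"unknown unit: {unit!r}")


def count_matched_peaks(experimental, predicted, window=0.02, unit="Da"):
    """Count experimental m/z values with at least one predicted peak inside the window."""
    return sum(
        any(within_window(mz, p, window, unit) for p in predicted)
        for mz in experimental
    )
```

A fixed Da window is looser in relative terms at low m/z, whereas a ppm window scales with the mass; that is presumably why the form offers both.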
-------
Search Expt. vs. Predicted Spectra
Search results (export: TSV, CSV)
Chemical Structure ID   Score (10eV)
DTXCID101048191         0.22
DTXCID101181567         0.19
DTXCID50879086          0.17
DTXCID60686349          0.14
DTXCID00830900          0.13
DTXCID10971176          0.12
DTXCID60301242          0.12
DTXCID40703048          0.11
DTXCID60349982          0.11
DTXCID10316649          0.09
Showing 1 to 10 of 32 entries
-------
Spectral Viewer Comparison
[Figure: spectral viewer comparing the input spectrum with predicted spectra; score columns labeled Input, 10eV, 20eV and 40eV, with a top score of 0.45; remaining labels illegible]
-------
Predicted Data Already Public
Publication and Data Files
Data Descriptor | OPEN | Published: 02 August 2019
Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
Andrew D. McEachran, Ilya Balabin, Tommy Cathey, Thomas R. Transue, Hussein Al-Ghoul, Chris Grulke, Jon R. Sobus & Antony J. Williams
Scientific Data 6, Article number: 141 (2019)
CFM-ID Paper Data
Dataset posted on 01.03.2019, 08:38 by EPA's National Center for Computational Toxicology
This upload is a zip containing the following files:
Predicted EI-MS Spectra of CompTox Chemicals Dashboard Structures:
Predicted EI-MS spectra of ~700,000 chemical structures from the CompTox Chemicals Dashboard were generated using the CFM-ID model developed by Allen et al. (https://doi.org/10.1021/acs.analchem.6b01622). These data are provided in .dat ASCII format.
Predicted MS/MS Spectra in ESI-positive mode of CompTox Chemicals Dashboard Structures:
Predicted MS/MS spectra of ~700,000 chemical structures from the CompTox Chemicals Dashboard were generated using the CFM-ID model developed by Allen et al. (https://doi.org/10.1007/s11306-014-0676-4) in ESI-positive mode. These data are provided in .dat ASCII format.
Predicted MS/MS Spectra in ESI-negative mode of CompTox Chemicals Dashboard Structures:
Predicted MS/MS spectra of ~700,000 chemical structures from the CompTox Chemicals Dashboard were generated using the CFM-ID model developed by Allen et al. (https://doi.org/10.1007/s11306-014-0676-4) in ESI-negative mode. These data are provided in .dat ASCII format.
https://epa.figshare.com/articles/CFM-ID_Paper_Data/7776212/1
-------
Published: Chao et al
Analytical and Bioanalytical Chemistry
RESEARCH PAPER
In silico MS/MS spectra for identifying unknowns: a critical examination using CFM-ID algorithms and ENTACT mixture samples
Alex Chao, Hussein Al-Ghoul, Andrew D. McEachran, Ilya Balabin, Tom Transue, Tommy Cathey, Jarod N. Grossman, Randolph Singh, Elin M. Ulrich, Antony J. Williams, Jon R. Sobus
Received: 4 October 2019 / Revised: 27 November 2019 / Accepted: 11 December 2019
© The Author(s) 2019
-------
Prototype Development
AADashboard
Select properties to predict: T.E.S.T., OPERA
Search type: Exact, Substructure, Similarity, Molecular Formula, Molecular Weight
Filter by elements (enter comma-separated list, e.g. C,F,H)
Search result options: show Isotopically Labeled, Charged, Salts or Mixtures; sort by Similarity
[Figure: grid of structure hits from a similarity search, each labeled with its similarity score, from 0.62 down to 0.38]
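The slide does not say how the similarity scores beside each hit are computed. A common choice for structure similarity is the Tanimoto coefficient over fingerprint bits, and the sketch below assumes that metric. The fingerprints are plain Python sets of made-up bit positions, and the function names are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared bits divided by all bits set in either fingerprint."""
    if not bits_a and not bits_b:
        return 0.0
    return len(bits_a & bits_b) / len(bits_a | bits_b)


def rank_by_similarity(query_bits, candidates, threshold=0.0):
    """Score every candidate against the query and sort from most to least similar."""
    scored = [(name, tanimoto(query_bits, bits)) for name, bits in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Made-up fingerprints, for illustration only.
query = {1, 2, 3, 4}
hits = rank_by_similarity(query, {"a": {1, 2, 3, 4, 5}, "b": {1, 9}}, threshold=0.5)
```

With a threshold of 0.5, only candidate "a" survives, since it shares four of the five bits set across both fingerprints.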
-------
CASMI 2012-2017 revisited
CASMI: Critical Assessment of Small Molecule Identification
The experimental and computational mass spectrometry communities are invited to
participate in the fifth round of an open contest on the identification of small molecules
from mass spectrometry data.
This year the contest will test the applicability of MS and MS/MS on natural products chemistry
identifications. With 45 (Category 1) and up to 243 (Categories 2&3) natural products challenges -
including a few tricky ones - there's something for everyone!
Application of metadata candidate ranking
and CFM-ID to all five years of CASMI data
-------
Method Amenability
Charles Lowe
Why?
• Chromatography-mass spectrometry can be LC or GC
• Which phase is more appropriate for which chemicals?
[Figure: Random Forest Simplified. An instance is passed to each tree in the forest (Tree-1 predicts Class-A, a second tree predicts Class-B, Tree-n predicts Class-B); majority voting over the trees gives the final class.]
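The majority-voting step in the diagram can be sketched in a few lines. This illustrates only the voting idea; it is not the model used for the amenability predictions, and the function name is hypothetical.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees.

    On a tie, Counter.most_common keeps first-seen order, so the class
    that appeared first among the tied classes is returned.
    """
    if not tree_predictions:
        raise ValueError("at least one tree prediction is required")
    return Counter(tree_predictions).most_common(1)[0][0]


# The three votes shown in the diagram: Tree-1 says Class-A, the others Class-B.
final_class = majority_vote(["Class-A", "Class-B", "Class-B"])
```

Here `final_class` is "Class-B", matching the diagram.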
-------
Ongoing Work
Data sources to date
• MassBank of North America
- 9,275 chemicals for non-derivatized GC
- 846 chemicals for derivatized GC
- 816 chemicals for APCI+
- 454 chemicals for APCI-
- 4,907 chemicals for ESI+
- 3,430 chemicals for ESI-
• EPA Non-targeted Analysis Collaborative Trial (ENTACT)
- 886 chemicals for non-derivatized GC
- 44 chemicals for derivatized GC
- 774 chemicals for APCI+
- 431 chemicals for APCI-
- 1,113 chemicals for ESI+
- 648 chemicals for ESI-
-------
TMAP Visualization of MoNA GC Data
FZLJPEPAYPUMMR-FMDGEEDCSA-N
-------
Future Work: Add database of Collision Cross Section Predictions
PNNL Collision Cross Section Database (Pacific Northwest National Laboratory)
Showing 1 to 25 of 1000 rows
Chemical | Formula | Mass | CCS (Å²) by adduct, all from ISiCLE Lite v0.1.0
(3E)-pent-3-en-2-one | C5H8O | 84.0575 | [M-H]- 112.1; [M+Na]+ 112.6; [M+H]+ 113.1
Dimethyl sulfone | C2H6O2S | 94.0089 | [M-H]- 106.9; [M+Na]+ 107.3; [M+H]+ 108.1
isothiocyanatocyclopropane | C4H5NS | 99.0143 | [M-H]- 111.9; [M+Na]+ 112.1; [M+H]+ 110.0
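The Mass column in this table is the monoisotopic mass of the formula beside it. As a rough sketch, it can be recomputed from the formula. The parser below is a simplification that handles only flat formulas (no parentheses, charges or isotope labels), and the element table covers only the elements that appear on these slides.

```python
import re

# Monoisotopic masses (most abundant isotope) in Da.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
    "Cl": 34.96885268,
}


def parse_formula(formula):
    """Turn a flat formula such as 'C5H8O' into {'C': 5, 'H': 8, 'O': 1}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


for formula in ("C5H8O", "C2H6O2S", "C4H5NS"):
    print(formula, round(monoisotopic_mass(formula), 4))
```

Rounded to four decimals, the three formulas give 84.0575, 94.0089 and 99.0143, in agreement with the table.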
-------
API services and Open Data
Web Services: https://actorws.epa.gov/actorws/
Data sets also available for download.
DSSTox identifiers mapped to CAS Numbers and Names File Posted: 11/14/2016
The DSSTox Identifiers file is in Excel format and includes the CAS Number, DSSTox substance identifier (DTXSID) and the Preferred Name.
casrn | dsstox_substance_id | preferred_name
26148-68-5 | DTXSID7020001 | A-alpha-C
107-29-9 | DTXSID2020004 | Acetaldehyde oxime
60-35-5 | DTXSID7020005 | Acetamide
103-90-2 | DTXSID2020006 | Acetaminophen
968-81-0 | DTXSID7020007 | Acetohexamide
18523-69-8 | DTXSID2020008 | Acetone[4-(5-nitro-2-furyl)-2-thiazolyl] hydrazone
75-05-8 | DTXSID7020009 | Acetonitrile
127-06-0 | DTXSID6020010 | Acetoxime
65734-38-5 | DTXSID6020012 | N'-Acetyl-4-(hydroxymethyl) phenylhydrazine
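A mapping file like this lends itself to a simple CAS-to-DTXSID lookup. The sketch below assumes the Excel file has been exported to CSV with the three column headers shown above; that export step is an assumption. It also applies the standard CAS Registry Number check-digit rule as a basic sanity check, which the slide does not describe. Function names are hypothetical.

```python
import csv
import io


def cas_check_digit_ok(casrn):
    """Validate a CAS RN: the last digit equals the weighted digit sum modulo 10."""
    digits = casrn.replace("-", "")
    if len(digits) < 3 or not digits.isdigit():
        return False
    body, check = digits[:-1], int(digits[-1])
    # Weights run 1, 2, 3, ... from the rightmost body digit leftwards.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(body), start=1))
    return total % 10 == check


def load_mapping(csv_text):
    """Map casrn -> dsstox_substance_id, skipping rows whose CAS RN fails the check."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["casrn"]: row["dsstox_substance_id"]
        for row in reader
        if cas_check_digit_ok(row["casrn"])
    }


SAMPLE = (
    "casrn,dsstox_substance_id,preferred_name\n"
    "103-90-2,DTXSID2020006,Acetaminophen\n"
    "75-05-8,DTXSID7020009,Acetonitrile\n"
)
mapping = load_mapping(SAMPLE)
```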
DSSTox MS Ready Mapping File Posted: 11/14/2016
The CompTox Chemistry Dashboard can be used by mass spectrometrists for the purpose of structure identification. A normal formula search would search the exact formula
associated with any chemical, whether it includes solvents of hydration, salts or multiple components. However, mass spectrometry detects ionized chemical structures, and molecular
formulae searches should be based on desalted and desolvated structures with stereochemistry removed. We refer to these as "MS ready structures" and the MS-ready mappings are
delivered as Excel Spreadsheets containing the Preferred Name, CAS-RN, DTXSID, Formula, Formula of the MS-ready structure and associated masses, SMILES and InChI Strings/Keys.
DSSTox SDF File Posted: 12/14/2016
This zip file contains the entire chemical structure collection of over 700,000 chemicals from the DSSTox database contained in one large SDF file. The file contains the structure, the
DSSTox Structure Identifier (DTXCID), the DSSTox Substance Identifier (DTXSID, listed as PubChem External Data Source), the associated Dashboard URL, associated synonyms and
-------
Web Services
https://actorws.epa.gov/actorws
• Data in UI, JSON and XML format
• Our services are free, of course.
https://actorws.epa.gov/actorws/dsstox/v02/msready?identifier=80-05-7
https://actorws.epa.gov/actorws/dsstox/v02/msready.json?identifier=80-05-7
https://actorws.epa.gov/actorws/dsstox/v02/msready.xml?identifier=80-05-7
https://actorws.epa.gov/actorws/dsstox/v02/msready?identifier=DTXCID60513
https://actorws.epa.gov/actorws/dsstox/v02/msready.json?identifier=DTXCID60513
https://actorws.epa.gov/actorws/dsstox/v02/msready.xml?identifier=DTXCID60513
https://actorws.epa.gov/actorws/dsstox/v02/msready?identifier=UVOFGKIRTCCNKG-UHFFFAOYSA-N
https://actorws.epa.gov/actorws/dsstox/v02/msready.json?identifier=UVOFGKIRTCCNKG-UHFFFA
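The example URLs follow one pattern: the same `msready` endpoint, an optional `.json` or `.xml` suffix, and an `identifier` query parameter holding a CAS RN, DTXCID or InChIKey. The helper below only builds such URLs. Its name is hypothetical, the base path is the one read off this slide, and no request is sent, because the service may not be reachable.

```python
from urllib.parse import urlencode

# Base path as shown on the slide; whether it still resolves is not checked here.
BASE_URL = "https://actorws.epa.gov/actorws/dsstox/v02/msready"


def msready_url(identifier, fmt=None):
    """Build an MS-ready lookup URL; fmt is None (UI view), 'json' or 'xml'."""
    if fmt not in (None, "json", "xml"):
        raise ValueError(f"unsupported format: {fmt!r}")
    suffix = f".{fmt}" if fmt else ""
    return f"{BASE_URL}{suffix}?{urlencode({'identifier': identifier})}"


for ident in ("80-05-7", "DTXCID60513", "UVOFGKIRTCCNKG-UHFFFAOYSA-N"):
    print(msready_url(ident, "json"))
```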
-------
InChIKey to DTXCIDs
https://actorws.epa.gov/actorws/dsstox/v02/msready?identifier=UVOFGKIRTCCNKG-UHFFFAOYSA-N
DTXCID | SMILES | MsReady DTXCID | MsReady SMILES
DTXCID60513 | C[NH2+]C.CN(C)C([S-])=S | DTXCID0023797 | CN(C)C(S)=S
DTXCID60513 | C[NH2+]C.CN(C)C([S-])=S | DTXCID704057 | CNC
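In this example a single salt structure maps to two MS-ready structures, one per component. The sketch below covers only the first step of that mapping: splitting a dot-disconnected SMILES into components and flagging those that carry a formal charge. Neutralizing charges and removing stereochemistry (the step that turns C[NH2+]C into CNC) needs a cheminformatics toolkit and is not attempted. Function names are hypothetical.

```python
import re

# A bracket atom with one or more +/- signs, optionally followed by a number, e.g. [NH2+] or [S-].
CHARGED_ATOM = re.compile(r"\[[^\]]*[+-]+\d*\]")


def split_components(smiles):
    """Split a multi-component SMILES on '.', the disconnection symbol."""
    return [part for part in smiles.split(".") if part]


def has_formal_charge(smiles):
    """True if any bracket atom in the SMILES carries a formal charge."""
    return CHARGED_ATOM.search(smiles) is not None


salt = "C[NH2+]C.CN(C)C([S-])=S"
for component in split_components(salt):
    print(component, "charged" if has_formal_charge(component) else "neutral")
```

Both components of the salt are flagged as charged, while the MS-ready forms in the table (CN(C)C(S)=S and CNC) are not.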
-------
Data and
used by the
Community
-------
NORMAN Suspect List Exchange
https://www.norman-network.com/?q=node/236
Network of reference laboratories, research centres and related organisations for monitoring of emerging environmental substances
• Wastewater Suspect List based on Swedish Product Data (KEMIWWSUS)
  Files: Wastewater Suspect List Original File with Mapped DTXSIDs (12/02/2019); KEMIWWSUS InChIKeys (12/02/2019)
  A prioritized list of 1,123 substances relevant for wastewater based on Swedish product registry data, including scores. Provided by Stellan Fischer, KEMI.
• Algal toxins list from CompTox (ALGALTOX)
  Files: ALGALTOX XLSX, CSV (14/02/2019); CompTox ALGALTOX List; ALGALTOX InChIKeys (14/02/2019)
  List of algal toxins (generated during blooms) from the CompTox Chemicals Dashboard.
• CCL 4 Chemical Candidate List (CCL4)
  Files: CCL4 XLSX, CSV (14/02/2019); CompTox CCL4 List; CCL4 InChIKeys (14/02/2019)
  Contaminants that are not (yet) regulated in the USA but are known or anticipated to occur in public water systems; from CompTox.
• Hydrogen Deuterium Exchange (HDX) Standard Set (HDXNOEX, HDXEXCH)
  Files: HDXNOEX XLSX, CSV (14/02/2019); CompTox HDXNOEX List; CompTox HDXEXCH List; HDXNOEX InChIKeys (14/02/2019)
  Environmental standard set used to investigate hydrogen deuterium exchange in small molecule HRMS (Ruttkies et al., submitted). HDXEXCH list also contains observed deuterated species.
• Neurotoxicants Collection from Public Resources (NEUROTOXINS)
  Files: NEUROTOXINS XLSX, CSV (14/02/2019); CompTox NEUROTOXINS List; NEUROTOXINS InChIKeys (14/02/2019)
  A list of neurotoxicants compiled from public resources, details on CompTox and Schymanski et al. (submitted).
• Statins Collection from Public Resources (STATINS)
  Files: STATINS XLSX, CSV (14/02/2019); CompTox STATINS List; STATINS InChIKeys (14/02/2019)
  A list of statins (lipid-lowering medications) compiled from public resources, details on CompTox.
• Synthetic Cannabinoids and Psychoactive Compounds (SYNTHCANNAB)
  Files: SYNTHCANNAB XLSX, CSV (14/02/2019); CompTox SYNTHCANNAB List; SYNTHCANNAB InChIKeys (14/02/2019)
  A list of synthetic cannabinoids and psychoactive compounds assembled from public resources, from CompTox.
-------
Integration to MetFrag in place
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0299-2
Candidate Score Distribution
Score terms: FinalScore, MetFrag, [label illegible], DATA_SOURCES, NORMANPRI, NUMBER_OF_PUBMED_ARTICLES, STOFFIDENT, TOXCAST_PERCENT_ACTIVE
Molecule | Identifier | InChIKeyBlock1 | Mass | Formula
Terbutylazine | DTXSID4027608 | FZXISNSWEXTPMF | 229.10948 | C9H16ClN5
Propazine | DTXSID3021186 | WJNRPILHCQKWCK | 229.10948 | C9H16ClN5
Sebuthylazine | DTXSID7058171 | BZRUVKZGXNSXMB | 229.10948 | C9H16ClN5
[Figure: normalized score bars (0.0 to 1.0) and explained-peak counts per candidate, e.g. Peaks 10/14 and 7/14; FinalScore values illegible]
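The candidate table identifies structures by "InChIKeyBlock1", the first 14-letter block of the InChIKey. That block is derived from the connectivity layer, so records that differ only in stereochemistry share it. The sketch below extracts the block and groups keys by it; function names are hypothetical.

```python
from collections import defaultdict


def first_block(inchikey):
    """Return the first (connectivity) block of an InChIKey, checking its shape."""
    block = inchikey.split("-")[0]
    if len(block) != 14 or not (block.isalpha() and block.isupper()):
        raise ValueError(f"not a valid InChIKey first block: {block!r}")
    return block


def group_by_first_block(inchikeys):
    """Group full InChIKeys that share the same connectivity block."""
    groups = defaultdict(list)
    for key in inchikeys:
        groups[first_block(key)].append(key)
    return dict(groups)


print(first_block("MXWJVTOOROXGIU-UHFFFAOYSA-N"))  # MXWJVTOOROXGIU
```

Grouping on this block is one way to collapse stereoisomers into a single candidate, which suits MS data that cannot distinguish them.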
-------
MassBank mapping to Dashboard
Based on Web Service lookup
MassBank Record: EA028808
Atrazine; LC-ESI-ITFT; MS2; CE: 15%; R=15000; [M+H]+
[Figure: mass spectrum (horizontal axis 214.0 to 218.0, vertical axis 0 to 1000) and chemical structure]
CH$NAME: Atrazine
CH$NAME: 6-chloro-N-ethyl-N'-isopropyl-1,3,5-triazine-2,4-diamine
CH$NAME: 6-chloranyl-N4-ethyl-N2-propan-2-yl-1,3,5-triazine-2,4-diamine
CH$COMPOUND_CLASS: N/A; Environmental Standard
CH$FORMULA: C8H14ClN5
CH$EXACT_MASS: 215.0938
CH$SMILES: c1(nc(nc(n1)Cl)NCC)NC(C)C
CH$IUPAC: InChI=1S/C8H14ClN5/c1-4-10-7-12-6(9)13-8(14-7)11-5(2)3/h5H,4H2,1-3H3,(H2,10,11,12,13,14)
CH$LINK: CAS 1912-24-9
CH$LINK: CHEBI 15930
CH$LINK: KEGG C06551
CH$LINK: PUBCHEM CID:2256
CH$LINK: INCHIKEY MXWJVTOOROXGIU-UHFFFAOYSA-N
CH$LINK: CHEMSPIDER 2169
CH$LINK: COMPTOX DTXSID9020112
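The `CH$LINK` lines are what tie a MassBank record to other databases, including the Dashboard through the COMPTOX entry. The sketch below collects those links from record text; the function name is hypothetical and only the line layout shown on this slide is assumed.

```python
def parse_massbank_links(record_text):
    """Collect 'CH$LINK: <DATABASE> <value>' lines into a {database: value} dict."""
    prefix = "CH$LINK:"
    links = {}
    for line in record_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        parts = line[len(prefix):].split(None, 1)
        if len(parts) == 2:
            database, value = parts
            links[database] = value
    return links


RECORD = """\
CH$NAME: Atrazine
CH$LINK: CAS 1912-24-9
CH$LINK: PUBCHEM CID:2256
CH$LINK: INCHIKEY MXWJVTOOROXGIU-UHFFFAOYSA-N
CH$LINK: COMPTOX DTXSID9020112
"""
links = parse_massbank_links(RECORD)
print(links["COMPTOX"])  # DTXSID9020112
```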
-------
Conclusion
• Dashboard access to data for ~875,000 chemicals (~895k in the Spring Release)
• MS-Ready data facilitates structure identification
• Related metadata facilitates candidate ranking
• Relationship mappings and chemical lists are of great utility
• Curation and mutual sharing of chemical lists is important (e.g. NORMAN)
[Figure: EPA's CompTox Chemicals Dashboard data diagram. Labels: Other Lists; User Interactions; Aggregated Public Data; Consumer Products & Categories, Functional Use Data; ToxValDB Summarized In Vivo Data; InVitroDB In Vitro Assay Data (ToxCast, Tox21); ACToR; CPDat; DSSTox Chemistry Core (Substances, CASRNs, Names, chemical structures); ChemProp Experimental & Predicted Property Data; Models, Documentation for Predictive Models]
-------
Acknowledgements
Credit: the Research Triangle Foundation
EPA ORD
Ann Richard
Chris Grulke
Jeremy Dunne
Jeff Edwards
Grace Patlewicz
Alex Chao
Kristin Isaacs
Charles Lowe
James McCord
Seth Newton
Katherine Phillips
Jon Sobus
Mark Strynar
Elin Ulrich
Joachim Pleil
TEAMS
IT Development Team
Curation Team
ILS
Kamel Mansouri
GDIT
Ilya Balabin
Tom Transue
Tommy Cathey
Collaborators
Emma Schymanski
NORMAN Network
Andrew McEachran
-------
MANY presentations online
Search results for "Antony Williams" on EPA figshare (type: Presentation, sorted by Relevance):
• Antony Williams, the ChemConnector: A career path through a diverse series of roles and responsibilities (Antony Williams, 09/05/2019)
• The needs for chemistry standards, database tools and data curation at the chemical-biology interface (Antony Williams, 30/06/2017)
• EDSP21 and ToxCast Dashboards To Be Discontinued (Antony Williams, 30/07/2019)
• Non-Targeted Screening of Wastewater for Water Reuse using ... (Jerry Zweigenbaum, 12/09/2019)
• Investigating Impact Metrics for Performance for the US EPA National Center for Computational Toxicology (0000-0002-2668-4821, 30/06/2017)
• Consensus ranking and fragmentation prediction for identification of unknowns in high resolution mass spectrometry (Andrew McEachran, 21/08/2018)
• Generalised Read-Across GenRA, research, implementation and prac... (Grace Patlewicz, 18/09/2018)
• OPERA: A QSAR tool for physicochemical properties and environmental fate predictions (Kamel Mansouri, 20/06/2018)
• Building an Online Profile Using Social Networking Tools (Antony Williams, 30/05/2018)
• The EPA CompTox Chemistry Dashboard - a centralized hub for integrating data for the environmental sciences (Antony Williams, 05/07/2018)
• The CompTox Chemicals Dashboard as An Integration Hub for Chemistry, Biology and Environmental Toxicity Data (Antony Williams, 09/10/2019)
• Environmental Chemistry Compound Identification Using High Resolution Mass Spectrometry Data Integrated to the EPA Chemistry Dashboard
-------
Antony Williams
CCTE, US EPA Office of Research and Development,
Williams.Antony@epa.gov
ORCID: https://orcid.org/0000-0002-2668-4821
Journal of Cheminformatics
DATABASE | Open Access
The CompTox Chemistry Dashboard: a community data resource for environmental chemistry
Antony J. Williams, Christopher M. Grulke, Jeff Edwards, Andrew D. McEachran, Kamel Mansouri, Nancy C. Baker, Grace Patlewicz, Imran Shah, John F. Wambaugh, Richard S. Judson and Ann M. Richard
https://doi.org/10.1186/s13321-017-0247-6
-------