United States
Environmental Protection
Agency
Pollution Prevention
and Toxics (7403)
748-R-93-002
February 1994
A Computer Program for
Estimating the Ecotoxicity
of Industrial Chemicals
Based on Structure Activity
Relationships
User's Guide
ECOTOXICOLOGY
-------
-------
ECOSAR (Ecological Structure Activity Relationships)
-------
-------
Last Update: February 25, 1998
URL: http://www.epa.gov/opptintr/cbep/actlocal/21ecosar.htm
-------
Ecological Hazard Potential
Consumer and Environmental Exposure and Chemical Fate
Chemical Information and Design
Ecological Structure-Activity
Relationships, MS-Windows
Version 0.99c (ECOSAR)
What Does ECOSAR Do?
- Predicts the toxicity of industrial chemicals to aquatic organisms such as fish, invertebrates, and algae.
How Does ECOSAR Work?
- Uses Structure-Activity Relationships (SARs) to predict the aquatic toxicity of chemicals based on their
structural similarity to chemicals with existing aquatic toxicity data. SARs express the correlations between
a compound's physicochemical properties and its aquatic toxicity. SARs measured for one compound can be
used to predict the toxicity of similar compounds belonging to the same chemical class.
- Allows users to access over 100 SARs developed for 42 chemical classes. The SARs contained within the
program are based on test data. Many of the SAR predictions have been validated.
What Do I Need to Use ECOSAR?
- Octanol/water partitioning coefficient (Kow) and molecular weight.
- Charge density and other information may also be required for some chemical classes.
- Descriptions of chemical structures, input according to Simplified Molecular Input Line Entry System
(SMILES) notation. SMILES is a widely used language for describing chemical structures.
What Type of Information Is Produced?
- Estimates a chemical's acute (short-term) toxicity and, when available, chronic (long-term or delayed)
toxicity.
How Are ECOSAR Data Used?
- The U.S. Environmental Protection Agency (EPA) uses SARs to predict the aquatic toxicity of new industrial
chemicals in the absence of test data. The use of SARs is an accepted practice for estimating ecotoxicity for
many chemicals.
- Environmental assessors and others use ECOSAR data to develop quantitative toxicity profiles for fish,
invertebrates, and aquatic green algae.
What Type of Computer System Do I Need?
- Hardware: IBM-compatible computer with an 80386 or 80286 processor and 640K RAM. At least 512K to
550K RAM must be free for acceptable performance, and expanded memory will improve performance.
- Software: Windows 3.1, Windows 95, or Windows NT.
What Type of Training Do I Need to Run the Program and Interpret Data?
- The program requires a basic understanding of organic chemistry, ecotoxicology, SARs, and SMILES
notation. Users must also know how to estimate Kow for situations where measured or estimated data are
not available. Most of the SARs in ECOSAR were developed using Kow values predicted by ClogP, a
computer program available from BioByte Corporation (201 W. Fourth Street, Suite 204, Claremont, CA
91711-4707; Phone: 909 624-5992; Fax: 909 624-1398). Syracuse Research Corporation offers two
programs, EPIWIN© and KOWWIN©, for estimating Kow. Contact Philip H. Howard for pricing and
ordering information.
What Is ECOSAR's Availability?
- ECOSAR is available on disk from EPA, free of charge. Contact J. Vincent Nabholz or
Gordon G. Cash. ECOSAR can also be purchased from Syracuse Research Corporation.
For pricing and ordering information, contact Philip H. Howard.
-------
Where Can I Go for More Information?
The ECOSAR User's Manual, ECOSAR: A Computer Program for Estimating the
Ecotoxicity of Industrial Chemicals (EPA-748-R-93-002), and Estimating Toxicity of
Industrial Chemicals to Aquatic Organisms Using Structure Activity Relationships
(EPA-748-R-93-001) are available by contacting the National Center for
Environmental Publications and Information at 800 490-9198.

J. Vincent Nabholz
Risk Assessment Division
U.S. Environmental Protection Agency
401 M Street, SW. (7403)
Washington, DC 20460-0001
Phone: 202 260-1271
Fax: 202 260-1236

Gordon G. Cash
U.S. Environmental Protection Agency
401 M Street, SW. (7403)
Washington, DC 20460-0001
Phone: 202 260-3900
Fax: 202 260-1236
E-mail: cash.gordon@epa.gov

Philip H. Howard
Syracuse Research Corporation
Environmental Science Center
6225 Running Ridge Road
North Syracuse, NY 13212
Phone: 315 452-8417
Fax: 315 452-8440
E-mail: howardp@syrres.com
Peer review has been conducted according to the 1998 EPA Peer Review Guidance.
-------
User's Guide for the ECOSAR
Class Program
MS-Windows Version 0.99c
February 1998
prepared by:
William M. Meylan and Philip H. Howard
Syracuse Research Corporation
Environmental Science Center
6225 Running Ridge Road
North Syracuse, NY 13212
prepared for:
J. Vincent Nabholz and Gordon Cash
Environmental Effects Branch
Health and Environmental Review Division (7403)
U.S. Environmental Protection Agency
401 M St., SW
Washington, DC 20460
-------
Table of Contents
1. INTRODUCTION
2. COMPUTER-SOFTWARE REQUIREMENTS
3. INSTALLING the ECOSAR Class Program
4. STARTING the ECOSAR Class Program
5. DATA ENTRY and EDIT KEYS
   5.1. Entering Data
      5.1.1. SMILES Notation
      5.1.2. Individual Data Entry Fields
   5.2. Function Keys & Buttons
6. RESULTS
   6.1. Konemann Equation
   6.2. Example SAR Equations
7. BATCH RUNS
   7.1. Batch Output Formats
8. SPECIAL CLASS Calculations
9. BIBLIOGRAPHY
APPENDIX A - Selected SMILES Information
APPENDIX B - Description of User Input File
APPENDIX C - CAS Number Data Base
APPENDIX D - Estimation of Water Solubility
APPENDIX E - List of ECOSAR Chemical Classes
-------
1. INTRODUCTION
The structure-activity relationships (SARs) presented in this program are used to predict
the aquatic toxicity of chemicals based on their similarity of structure to chemicals for which the
aquatic toxicity has been previously measured. Most SAR calculations in the ECOSAR Class
Program are based upon the octanol/water partition coefficient (Kow). Various surfactant SAR
calculations are based upon the average length of carbon chains or the number of ethoxylate
units.
SARs have been used by the U.S. Environmental Protection Agency since 1981 to predict
the aquatic toxicity of new industrial chemicals in the absence of test data. The acute toxicity of
a chemical to fish (both fresh and saltwater), water fleas (daphnids), and green algae has been the
focus of the development of SARs, although for some chemical classes SARs are available for
other effects (e.g., chronic toxicity and bioconcentration factor) and organisms (e.g., earthworms).
SARs are developed for chemical classes based on measured test data that have been submitted
by industry, or they are developed by other sources for chemicals with similar structures, e.g.,
phenols. Using the measured aquatic toxicity values and estimated Kow values, regression
equations can be developed for a class of chemicals. Toxicity values for new chemicals may
then be calculated by inserting the estimated Kow into the regression equation and correcting the
resultant value for the molecular weight of the compound.
To date, over 150 SARs have been developed for more than 50 chemical classes. These
chemical classes range from the very large, e.g., neutral organics, to the very small, e.g., aromatic
diazoniums. Some chemical classes have only one SAR, such as acid chlorides, for which only a
fish 96-hour LC50 has been developed. The class with the greatest number of SARs is the neutral
organics, which has SARs ranging from acute and chronic SARs for fish to a 14-day LC50 for
earthworms in artificial soil.
The ECOSAR Class Program is a computerized version of the ECOSAR analysis
procedures as currently practiced by the Office of Pollution Prevention and Toxics (OPPT). It
has been developed within the regulatory constraints of the Toxic Substances Control Act
(TSCA). It is a pragmatic approach to SAR as opposed to a theoretical approach.
This ECOSAR program is designed for the expert user. You are expected to have some
knowledge of environmental toxicology and organic chemistry. It is menu-driven and contains
various help functions to assist you. You cannot change any of the equations or data stored
within the program or accidentally erase any important information. The following pages show
-------
3. INSTALLING the ECOSAR Class Program
The ECOSAR Class Program diskette contains an installation program that can install
ECOSAR Class Program and create a Windows Program Group with program icon. The
installation program must be started while Microsoft Windows (3.1, 95 or NT) is running. To
install, place the floppy diskette in the appropriate floppy drive. Then, (a) in Windows 3.1, select
FILE, RUN from the Program Manager's menu, or (b) in Windows 95, press the Start button
and select Run. Then:
   If the floppy is in the a: drive, enter a:install
   If the floppy is in the b: drive, enter b:install
The FILE, RUN entry box (in Windows 3.1) may look similar to the following:
[Screen capture not reproduced.]
The Run entry box (in Windows 95) may look similar to the following:
[Screen capture of the Run dialog box: "Type the name of a program, folder, or document, and
Windows will open it for you." Open: A:\install.exe; buttons: OK, Cancel, Browse...]
The ECOSAR Class Program does not actually require the installation program
because installation can be handled manually; ECOSAR Class Program and its help file can be
used as they exist on the floppy (that is, you can start ECOSAR Class Program directly from the
floppy if you want). However, the installation program automatically creates a hard-drive
subdirectory, copies the necessary files to it, and creates a Windows program group. The
ECOSAR Class Program group folder (named "Ecosar") contains an ECOSAR icon that starts the
program.
The following files are installed during the installation process:
ECOWIN.EXE: the necessary ECOSAR executable file
ECOWHELP.HLP: a file containing extensive help information for SMILES notations,
   program execution, key & button usage, etc.
The following files are NOT installed during the installation process (and are not on the
diskette); they can be used if obtained separately:
SMILECAS.DB: a database of more than 103,000 SMILES notations indexed by
   CAS (Chemical Abstract Registry) number. Simply entering a
   CAS number in ECOSAR allows automatic retrieval of available
   SMILES. This database can also be used to run automated batches
   of CAS numbers.
SMILECAS.IDX: index file for SMILECAS.DB
KOWWIN.EXE: a Syracuse Research Corporation program that estimates log Kow
   from SMILES. When this program (and its two library files listed
   below) are available in the same subdirectory as ECOSAR, it
   allows ECOSAR to automatically start KOWWIN and retrieve the
   KOWWIN estimate. The estimation methodology is described in a
   journal article (Meylan and Howard, 1995).
   Note: the KOWWIN program located in the ECOSAR
   subdirectory must be closed while ECOSAR is running.
   Otherwise, ECOSAR cannot use it. A duplicate copy of
   KOWWIN can be running from a different subdirectory, however.
CSDLL.DLL: a library file required by the KOWWIN program.
QCBASED.DLL: a library file required by the KOWWIN program.
-------
4. STARTING the ECOSAR Class Program
The ECOSAR Class Program is started like any other Microsoft Windows program. The
easiest way to start ECOSAR Class Program is to double-click the program icon installed in the
ECOSAR program group during installation. For additional information on starting Windows
programs, consult your Windows documentation.
Program execution begins at the data entry screen; an example is illustrated in Figure 1.
[Figure 1 screen capture: "Ecosar Classes v0.99" window.
Menu bar: File, Edit, Functions, BatchMode, Special_Classes, Help
Buttons: Previous, Get User, Save User, CAS Input, Calculate
Enter SMILES: c1ccccc1O
Enter NAME: Phenol
CAS Number: 108-95-2
Chemical ID 1: Hydroxybenzene
Chemical ID 2:
Chemical ID 3:
Measured Water Sol (mg/L):
Melting Point (deg C):
Measured Log Kow:]
Figure 1. Example Data Entry Screen
Note: the appearance of the screen may vary somewhat due to screen resolution (e.g. 640 X 480
vs. 800 X 600), user selection of MS-Windows attributes (e.g. colors, font size, etc.), etc.
In addition, Figure 1 illustrates how the entry screen appears when using Windows 95.
Appearance in Windows 3.1 varies slightly.
-------
(1) SMILES: the SMILES notation of the structure to be estimated. A maximum of 360
    characters are allowed. This field is required. Do not leave any blank spaces
    in front of a SMILES notation ... a SMILES is considered finished when a
    blank space is encountered.
(2) Name: the name and/or description of the structure. This field is optional; not required.
    A maximum of 120 characters are allowed.
(3) CAS Number: the CAS (Chemical Abstract Service Registry) Number. This field is
    optional; not required. When a SMILES is retrieved from the SMILECAS
    Database, the CAS is automatically inserted in this field.
(4) Chemical ID 1: optional description/identity field; not required.
(5) Chemical ID 2: optional description/identity field; not required.
(6) Chemical ID 3: optional description/identity field; not required.
(7) Log Kow: the log octanol-water partition coefficient. A value is required unless the
    KOWWIN Program (Syracuse Research Corporation) is present in the same
    subdirectory as the ECOSAR Class Program. When KOWWIN (and its two
    library files) are available in the same subdirectory as ECOSAR, it allows
    ECOSAR to automatically start KOWWIN and retrieve the KOWWIN
    estimate. Note: the KOWWIN program located in the ECOSAR subdirectory
    must be closed while ECOSAR is running. Otherwise, ECOSAR cannot use
    it. A duplicate copy of KOWWIN can be running from a different
    subdirectory, however.
(8) Measured Water Solubility:
    the Measured Water Solubility in mg/L. This field is optional. It is NOT
    required. When left blank, a Water Solubility will be calculated from the log
    Kow value. Predicted toxicity values are compared to the Water Solubility ...
    if toxicity exceeds Water Solubility, the toxicity value is marked with an
    asterisk (*) to indicate 'No Effect at Saturation'. Water Solubility is not used
    to calculate ecotoxicity values. The estimation methodology is described in
    Appendix D.
(9) Melting Point: the Melting Point (in deg C). This field is optional; not required. It is
    used to calculate Water Solubility when a measured Water Solubility is
    unavailable. It generally helps in estimating more accurate water
    solubilities, but is not required to estimate Water Solubility.
(10) Measured Log Kow:
    the measured log Kow value, if available. This field is informational only. It
    is not used to calculate ecotoxicity values. The value in the Log Kow field is
    used to calculate ecotoxicity values.
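The 'No Effect at Saturation' marking described under field (8) can be illustrated with a short sketch. The function name and the three-decimal output format are assumptions made for illustration; the guide does not describe how the program implements the comparison.

```python
def format_prediction(predicted_mg_per_l: float, water_sol_mg_per_l: float) -> str:
    """Format a predicted toxicity value, flagging it when it exceeds water solubility.

    Hypothetical helper: only the comparison rule (toxicity greater than
    water solubility gets an asterisk) comes from the guide.
    """
    text = f"{predicted_mg_per_l:.3f}"
    if predicted_mg_per_l > water_sol_mg_per_l:
        text += " *"  # asterisk indicates 'No Effect at Saturation'
    return text


# Phenol values from the example Results Window: Wat Sol 3725 mg/L
print(format_prediction(373.245, 3725.0))  # below solubility, no flag
print(format_prediction(5000.0, 3725.0))   # made-up value above solubility, flagged
```

Water solubility is used here only to flag the value, consistent with the statement that it is not used to calculate the ecotoxicity value itself.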
-------
5.2. Function Keys & Buttons
F1: Accesses a help message for the individual field where the blinking cursor is located. General Help is available
from "Help" on the Menu Bar at the top of the screen. It is a standard Windows help system; to access a specific
help topic, simply click on the topic (or keyword) that is highlighted in green where the mouse pointer changes to a
hand.
F2: Pressing the F2 key or clicking the "Previous" button recalls the most recent SMILES and chemical name that
has been calculated or attempted to be calculated by the program. It can save a lot of time when making small changes to
SMILES and name. It is especially useful after a SMILES notation error occurs ... the incorrect SMILES can
be recalled and edited.
F3: clears the currently displayed SMILES Notation, Chemical Name and other data. All entry fields are filled
with blank spaces.
F4: Pressing the F4 key or clicking the "Get User" button displays a file selection dialog
box that allows the user to open a file of previously saved SMILES notations and chemical
names. The default name of the file is SMILES.INP; this is for compatibility with similar
estimation programs. The file selection box looks for files with the .INP extension; name files
with this extension when entering them with the F6 key. The file can contain up to 1500 SMILES
notations, and the user can select any SMILES and name for input. The SMILES.INP file can be
created one chemical at a time by using the F6 key as described below. This option is only usable
after a file has been created with the F6 key feature!! An example screen is shown in Figure 2.
Selection is made by highlighting the desired line and clicking the "OK" button or by [...]
[Figure 2 screen capture: "Select from User File" list box showing one "name: SMILES" entry per line,
for example "Thiobencarb: N(CC)(CC)C(=O)SCc1ccc(CL)cc1" and
"Malathion: S=P(OC)(OC)SC(C(=O)OCC)CC(=O)OCC", with OK and Cancel buttons]
Figure 2. Example User Input File Selection
-------
F5: Pressing the F5 key (or clicking the "BatchMode" option on the main menu and selecting "Batch File Input
Using SMILES Strings") brings up the selection box shown. The F5 key is used for batch entry of SMILES strings
from ascii text files. The text files MUST be in either of two formats: (1) String Format or (2) EcowinFormat.
String Format must have the SMILES string at the beginning of each line in the file; it can then be followed by a
space(s) and then the name or other ID. The SMILES is considered terminated at the first space. An example
String Format is as follows:
   CCCCO Butanol
   c1ccccc1 Benzene
   Fc1ccccc1 Fluorobenzene
   CC(=O)C Acetone
EcowinFormat is the same format used by the "Get User" and "Save User" button features. Therefore, the
"SMILES.INP" file can be used directly to run batch file outputs. In this format, the name comes first (maximum of
60 characters) followed by a colon and one space, and then the SMILES notation. An example EcowinFormat is as
follows:
   Butanol: CCCCO
   Benzene: c1ccccc1
   Fluorobenzene: c1ccccc1F
   Acetone: CC(=O)C
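The two batch text formats can be told apart with very little parsing logic. The sketch below is an illustration only: the function names are hypothetical, and it is not the program's own parser. It applies the two rules stated above: in String Format the SMILES ends at the first space, and in EcowinFormat the name is followed by a colon and one space.

```python
from typing import Tuple


def parse_string_format(line: str) -> Tuple[str, str]:
    """String Format: SMILES first, terminated at the first space; name or ID may follow."""
    smiles, _, rest = line.rstrip("\r\n").partition(" ")
    return smiles, rest.strip()


def parse_ecowin_format(line: str) -> Tuple[str, str]:
    """EcowinFormat: name, then a colon and one space, then the SMILES notation."""
    name, sep, smiles = line.rstrip("\r\n").partition(": ")
    if not sep:
        raise ValueError(f"not an EcowinFormat line: {line!r}")
    return smiles.strip(), name


# Both calls return a (SMILES, name) pair for the same chemical.
print(parse_string_format("CC(=O)C Acetone"))
print(parse_ecowin_format("Acetone: CC(=O)C"))
```

Returning the pair in the same order from both functions lets downstream code treat the two file formats identically.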
F6: Pressing the F6 key or clicking the "Save User" button displays a file selection dialog box that allows the user
to save the SMILES notation and chemical name currently showing on the data entry screen to the file. The default
name of the file is SMILES.INP; this is for compatibility with similar estimation programs. After a file is selected
(or entered by the user), ECOSAR appends the SMILES notation and chemical name currently showing on the data
entry screen to the file. If the file does not already exist, ECOSAR will create it and append the current SMILES
and name as the first entry. The SMILES and names in a "Save User" file can be accessed from the data input
screen by pressing the F4 key. See Appendix B.
F7: The F7 key is used to enter CAS numbers from an ascii text file ... the number of CAS numbers in the file is not
limited. The user must enter the file name; a selection menu is not currently available. The F7 key is used
primarily for batch-mode runs ... output is written to files named "CASLOG#.OUT"
where "#" is a number determined by the program.
The format of the ascii text file is:
no spaces in front of the CAS number,
hyphens and leading zeros are optional, and a
trailing carriage return ... example:
   000050-00-0
   71-43-2
   108883
   000050-02-2
NOTE: the presence of SRC's SMILECAS.DB database is required! It is
not included with ECOSAR Class Program unless acquired separately.
[Screen capture of the "Select Batch Text Format" selection box displayed by the F5 key, with two choices.
StringFormat: on each line, the SMILES string comes first and ends with the first blank space ... name or ID can
follow the blank space.
EcowinFormat: each line must be in the format used by the "Get User" list, which is kept in the SMILES.INP file.]
-------
F8: Pressing the F8 key or clicking the "CAS Input" button requires the presence of a supplemental database file
(SMILECAS.DB) and index file in the current subdirectory. A small data entry window is created on the data entry
screen which asks for the CAS number of the chemical. An error message will appear in the window if the program
cannot find the database or index file. The database file contains about 103,000 entries, but not all chemicals with
CAS numbers are included in the file. If the chemical is not in the database, an appropriate message is displayed.
The program can identify impossible CAS numbers by examining the check digit (the final number of the CAS).
The SMILECAS Database is not included with the ECOSAR Class Program installation. It must be acquired and
installed separately (Syracuse Research Corp., Environmental Science Center).
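The guide does not say how the program examines the check digit. As a sketch, the standard CAS Registry Number rule can be combined with the input tolerances listed for the F7 key (optional hyphens and leading zeros). The function names are hypothetical and this is not taken from the ECOSAR source.

```python
def normalize_cas(text: str) -> str:
    """Strip hyphens and leading zeros, then re-insert hyphens as NNN-NN-N."""
    digits = text.strip().replace("-", "")
    if not digits.isdigit():
        raise ValueError(f"not a CAS number: {text!r}")
    digits = digits.lstrip("0")
    if len(digits) < 5:  # the shortest registry numbers, such as 50-00-0, have five digits
        raise ValueError(f"too short for a CAS number: {text!r}")
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


def cas_check_digit_ok(text: str) -> bool:
    """Standard CAS rule: weight the digits before the check digit 1, 2, 3, ...
    from right to left; the weighted sum modulo 10 must equal the check digit."""
    digits = normalize_cas(text).replace("-", "")
    body, check = digits[:-1], int(digits[-1])
    total = sum(int(d) * weight for weight, d in enumerate(reversed(body), start=1))
    return total % 10 == check


# The four example entries from the F7 file format description
for entry in ["000050-00-0", "71-43-2", "108883", "000050-02-2"]:
    print(normalize_cas(entry), cas_check_digit_ok(entry))
```

A check of this kind only rules out impossible numbers; a number that passes may still be absent from the SMILECAS database.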
PgDn: Pressing the PgDn key or clicking the "Calculate" button calculates the SMILES currently showing on the
data entry screen. If an acceptable SMILES has been entered, the Results Window will either appear or be updated.
If an incorrect SMILES has been entered, an error message box will appear. After removing the error message box,
the incorrect SMILES can be recalled and then edited by pressing the F2 key or clicking the "Previous" button.
Esc: During data entry, pressing the Esc key exits the program. When the Results Window is active, pressing the
Esc key removes the Results Window.
Enter: Pressing the Enter (Return) key sends the cursor to the next data entry field.
Tab or Shift-Tab: changes entry fields.
-------
6. RESULTS
The Results Window presents the results of ECOSAR Class Program's estimations. It
appears when a SMILES notation is calculated. Figure 3 below illustrates an example Results
Window:
[Figure 3 screen capture: "Ecowin Results" window; menu bar: Print, Save Results, Copy, Remove Window, Help]
SMILES : c1ccccc1O
CHEM   : Phenol
CAS Num: 108-95-2
ChemID1: Hydroxybenzene
ChemID2:
ChemID3:
MOL FOR: C6 H6 O1
MOL WT : 94.11
Log Kow: 1.46 (User entered)
Melt Pt: 41.00 deg C
Wat Sol: 3725 mg/L (calculated)
ECOSAR Class(es) Found: Phenols

ECOSAR Class        Organism       End Pt   Predicted mg/L (ppm)
Konemann Equation   Fish (guppy)   LC50     373.245
Phenols             Daphnid        LC50       8.424
Phenols             Daphnid        EC50     140.468
Phenols             Daphnid        ChV        3.193
Phenols             Fish           LC50      29.737
Phenols             Fish           ChV        4.573
Phenols             Fish           ChV        0.196
Phenols             Green Algae    ChV       10.280
Duration column entries recovered from the capture, in order: 14-day, 48-hr, 96-hr, 96-hr, 30-day, 60-day.
Figure 3. Example Results Screen
The Results Window can be moved, sized and placed anywhere on the Microsoft Windows
desktop. It does not need to be removed to calculate another SMILES notation; the Results
Window will be updated when another SMILES is calculated.
The Results Window lists the SMILES (which might have been modified by the program
due to aromatic detection or other conversion), molecular formula, molecular weight, and the
fragments used to derive the estimation. The following menu choices are available when the
Results Window is active:
Print: prints the results as shown.
-------
Save Results: saves the summary output to a file. The output files are named ECOW*.DAT where "*" is a
number from 1 to 100. Numbering begins at 1 and automatically proceeds to number 100. Currently, all results are
appended to the same file number until the program is exited. The next time the program is started, the next
available number is used; therefore different files are used from session to session! If all numbers have been used
in existing files, then number 1 will be used and the existing file ECOW001.DAT will be overwritten!!
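The numbering rule for Save Results files can be sketched as follows. Two points are assumptions: "next available number" is read as the lowest unused number, and the three-digit zero padding is inferred from the name ECOW001.DAT. The function name is hypothetical.

```python
import re
from typing import Iterable


def next_output_name(existing_names: Iterable[str]) -> str:
    """Pick the output file name for a new session from the names already on disk."""
    used = set()
    for name in existing_names:
        match = re.fullmatch(r"ECOW(\d{3})\.DAT", name.upper())
        if match:
            used.add(int(match.group(1)))
    for number in range(1, 101):  # numbers run from 1 to 100
        if number not in used:
            return f"ECOW{number:03d}.DAT"
    # All 100 numbers are taken: number 1 is reused and the old file is overwritten.
    return "ECOW001.DAT"


print(next_output_name(["ECOW001.DAT", "ECOW002.DAT"]))
```

Because the last branch overwrites ECOW001.DAT, older output files should be archived or deleted before all 100 numbers are used.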
Copy: copies the results as shown (minus the rectangle enclosing the estimate) to the Windows clipboard. The
results can then be copied into other Windows programs such as word processors. When copied to a word processor
(such as Word Perfect, Ami Pro, or Microsoft Word), a non-proportional font (such as courier) must be used for
corrc'ct formatting!!... Also, the page width margins must be wide enough!
Remove Window: deletes the Results Window; a new Results Window will appear with the next estimation.
It may be more convenient to move and size the Results Window for personal preference (after the first estimation)
rather than to remove it after each estimation. If the Results Window is left on the screen, the next estimation
results will simply replace the existing results.
The Log Kow value in the Results Window is labeled to show whether it was entered by the user or
calculated by the KOWWIN program. The Water Solubility value is labeled to show whether it was
calculated or measured (see Appendix D for water solubility estimation methodology).
6.1 Konemann Equation
The Konemann equation was developed from a variety of compounds
(including chlorobenzenes, chlorotoluenes, chloroalkanes, diethyl ether, and acetone) using
guppies and 14-day exposure periods (Konemann, 1981). The equation is:
    Log (1/LC50) = 0.871 log Kow - 4.87
where LC50 is in umol/L.
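The Konemann equation can be evaluated directly. The following is a minimal Python sketch, not code from the ECOSAR program; the function names and the example log Kow value are illustrative assumptions. The unit conversion relies only on the fact that 1 umol of a compound weighs its molecular weight in micrograms.

```python
def konemann_lc50_umol_per_l(log_kow):
    """Return the 14-day guppy LC50 in umol/L.

    Rearranges Log (1/LC50) = 0.871 log Kow - 4.87 to solve for LC50.
    """
    log_inverse_lc50 = 0.871 * log_kow - 4.87
    return 10.0 ** (-log_inverse_lc50)


def umol_per_l_to_mg_per_l(conc_umol_per_l, mol_weight):
    """Convert umol/L to mg/L; mol_weight is in g/mol."""
    # umol/L * g/mol = ug/L; divide by 1000 to get mg/L.
    return conc_umol_per_l * mol_weight / 1000.0


if __name__ == "__main__":
    # Hypothetical compound: log Kow = 2.0, molecular weight = 100 g/mol.
    lc50 = konemann_lc50_umol_per_l(2.0)
    print("LC50 (umol/L):", lc50)
    print("LC50 (mg/L):  ", umol_per_l_to_mg_per_l(lc50, 100.0))
```

Because the slope is positive in Log (1/LC50), a higher log Kow gives a lower predicted LC50 (greater toxicity).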
6.2 Example SAR Equations
The following are example SAR equations used by the ECOSAR Class Program to
calculate ecotoxicity values. They are indicative of all SARs calculated from SMILES and log
Kow values.
Acrylates:
    Log 48-h LC50 = 0.00886 - 0.51136 log Kow    (Daphnids, mortality)
    Log 96-h LC50 = -1.46 - 0.18 log Kow         (Fish, mortality)
    Log ChV       = -1.99 - 0.526 log Kow        (Fish chronic value; survival/growth)
    Log 96-h EC50 = -1.02 - 0.49 log Kow         (Green Algae, growth)
The values calculated by these equations are in units of millimoles/L.
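As an illustration of how such linear SARs are applied, the Python sketch below evaluates the four acrylate equations and converts the results from millimoles/L to mg/L. It is an assumption-laden example rather than the program's own code: the dictionary, function name, and endpoint labels are hypothetical, and the coefficients are simply those printed above.

```python
# (intercept, slope) pairs for log10(value in mmol/L) = intercept + slope * log Kow,
# copied from the acrylate example equations in this section.
ACRYLATE_SARS = {
    "daphnid 48-h LC50": (0.00886, -0.51136),
    "fish 96-h LC50": (-1.46, -0.18),
    "fish ChV": (-1.99, -0.526),
    "green algae EC50": (-1.02, -0.49),
}


def acrylate_endpoints_mg_per_l(log_kow, mol_weight):
    """Return each acrylate endpoint in mg/L for one compound.

    mol_weight is in g/mol; mmol/L * g/mol gives mg/L.
    """
    results = {}
    for endpoint, (intercept, slope) in ACRYLATE_SARS.items():
        mmol_per_l = 10.0 ** (intercept + slope * log_kow)
        results[endpoint] = mmol_per_l * mol_weight
    return results


if __name__ == "__main__":
    # Hypothetical acrylate with log Kow = 1.5 and molecular weight 100 g/mol.
    for name, value in acrylate_endpoints_mg_per_l(1.5, 100.0).items():
        print(f"{name}: {value:.3f} mg/L")
```

Every slope is negative, so each predicted effect concentration falls as log Kow rises.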
7. BATCH RUNS
Batch runs are used to make multiple estimates from a single input file. The ECOSAR
Class Program can make "batch runs" from three different types of input files. Each input file
must be in a specific format; otherwise, the batch run will fail. Program access to batch runs is
available from (a) the top menu option "BatchMode", (b) various options under the top menu
option "Functions", and (c) the F5 and F7 function keys. The following describes each "batch run"
input file that the ECOSAR Class Program can use.
(1) CAS Number List - This is a plain text file (usually with a ".txt" file extension) containing a
list of CAS (Chemical Abstracts Service) Registry numbers. The format of the ASCII text file is: no
spaces in front of the CAS number, hyphens and leading zeros are optional, and a trailing
carriage return. For example:
    000050-00-0
    71-43-2
    108883
    000050-02-2
NOTE: the presence of SRC's SMILECAS.DB database is required! The SMILECAS Database
must be in the same subdirectory as the ECOSAR Class Program. There is no limit to
the number of CAS numbers in the file. The F7 function key accesses the CAS batch list option.
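Because hyphens and leading zeros are optional, the same CAS number can appear in several spellings. The following Python sketch shows one way to check a list before running a batch by reducing every entry to the usual hyphenated form. The function names are hypothetical, and the sketch does not consult SMILECAS.DB or verify that a number is actually registered.

```python
def normalize_cas(line):
    """Return a CAS number in hyphenated form without leading zeros."""
    if line[:1].isspace():
        # The list format allows no spaces in front of the CAS number.
        raise ValueError("CAS number must start in the first column")
    digits = line.strip().replace("-", "").lstrip("0")
    if not digits.isdigit() or len(digits) < 5:
        raise ValueError(f"not a CAS number: {line!r}")
    # The last digit is the check digit; the two digits before it form
    # the middle group; everything else is the first group.
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


def read_cas_list(lines):
    """Normalize every non-blank line of a CAS Number List."""
    return [normalize_cas(line) for line in lines if line.strip()]


if __name__ == "__main__":
    example = ["000050-00-0\n", "71-43-2\n", "108883\n", "000050-02-2\n"]
    print(read_cas_list(example))
```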
(2) SMILES String, String Format List - This is a plain text file (usually with a ".txt" file
extension) containing a list of SMILES notations. A "String Format" list must have the SMILES
string at the beginning of each line in the file; it can then be followed by one or more spaces and
then the name or other ID. The SMILES string is considered terminated at the first space! An example
String Format is as follows:
    CCCCO Butanol
    c1ccccc1 Benzene
    Fc1ccccc1 Fluorobenzene
    CC(=O)C Acetone
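The "terminated at the first space" rule for String Format lines can be sketched in Python as follows. This is an illustrative helper with a hypothetical name, not part of the ECOSAR program; it assumes spaces (not tabs) separate the SMILES from the name.

```python
def parse_string_format(line):
    """Split one String Format line into (smiles, name_or_id).

    The SMILES ends at the first space; whatever follows, with the
    extra spaces trimmed, is treated as the name or other ID.
    """
    smiles, _, rest = line.rstrip("\r\n").partition(" ")
    return smiles, rest.strip()


if __name__ == "__main__":
    for text in ["CCCCO Butanol", "c1ccccc1 Benzene", "CC(=O)C   Acetone"]:
        print(parse_string_format(text))
```

A line that begins with a space yields an empty SMILES under this rule, which is why the SMILES string must start at the beginning of the line.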
The F5 function key accesses the SMILES String batch option. The output file is named
"BATCH#.OUT" where "#" is a number determined by the program.
(3) SMILES String, Ecosar Format List - This is a plain text file (usually with a ".inp" file
extension) containing a list of SMILES notations. Ecowin Format is the same format used by the
"Get User" and "Save User" button features. Therefore, the "SMILES.INP" file can be used
directly to run batch file outputs. In this format, the name comes first (maximum of 60
characters) followed by a colon and one space and then the SMILES notation. An example
of this format is as follows:
    Butanol: CCCCO
    Benzene: c1ccccc1
    Fluorobenzene: c1ccccc1F
    Acetone: CC(=O)C
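A name-first line of this kind can be split on the colon-plus-space separator. The Python sketch below is a hedged illustration with hypothetical names. It splits at the last ": " on the line, on the assumption that a SMILES notation never contains a space while a chemical name might contain a colon.

```python
MAX_NAME_LENGTH = 60  # maximum name length stated for this format


def parse_name_first_format(line):
    """Split 'Name: SMILES' into (name, smiles)."""
    name, separator, smiles = line.rstrip("\r\n").rpartition(": ")
    if not separator:
        raise ValueError(f"missing ': ' separator in {line!r}")
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name longer than {MAX_NAME_LENGTH} characters")
    # Anything after the first space is not part of the SMILES.
    return name, smiles.split(" ", 1)[0]


if __name__ == "__main__":
    for text in ["Butanol: CCCCO", "Fluorobenzene: c1ccccc1F"]:
        print(parse_name_first_format(text))
```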
The F5 function key accesses the SMILES String batch option. The output file is named
"BATCH#.OUT" where "#" is a number determined by the program.
7.1 Batch Output Formats
Batch runs can capture results as either "Full Output" or "Summary Output". Full Output
captures results for each compound the same as they would appear in the "Result Window" (if
each compound was estimated individually); these output files can get very large for large
numbers of compounds. Summary Output captures selected results and places these results on a
single line for each compound. Before running a batch with "Summary Output", the format of the
output file can be selected from the "Batch Output Format" dialog box. The default is "space filled"
with required identifiers to identify various results. Output can also be "Comma de-limited"
or "Tab de-limited". These output selections separate results on each line
with either commas or tabs. This is useful for importing batch output files directly
into other programs (such as Microsoft Excel or Lotus 123 spreadsheets).
[Dialog box: Batch Output Format. Select the format for single line (summary) batch output files.
Options: Original (spaces), Comma de-limited, Tab de-limited. Original uses space de-limiters and
parentheses identifiers when needed (this is the default). Comma or Tab-delimited output
separates selected result options with either commas or tabs. Buttons: OK, Cancel.]
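Comma- or tab-delimited summary files can also be read by a short script rather than a spreadsheet. The following Python sketch uses the standard csv module. The function name and the sample text are made-up placeholders: this guide does not list the fields written on each summary line, so the sketch returns each line as a plain list of strings.

```python
import csv
import io

# Delimiter for each of the two delimited summary formats.
FORMAT_DELIMITERS = {"comma": ",", "tab": "\t"}


def read_delimited_summary(stream, output_format):
    """Return the non-empty lines of a summary file as lists of fields."""
    delimiter = FORMAT_DELIMITERS[output_format]
    return [row for row in csv.reader(stream, delimiter=delimiter) if row]


if __name__ == "__main__":
    # Placeholder text standing in for a real batch output file.
    sample = io.StringIO("field1\tfield2\tfield3\nvalueA\tvalueB\tvalueC\n")
    for fields in read_delimited_summary(sample, "tab"):
        print(fields)
```

For a file on disk, open it with `open(path, newline="")` and pass the file object as `stream`.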
8. SPECIAL CLASS CALCULATIONS
The ECOSAR Class Program has been developed primarily for the following scenario:
(1) the user enters a SMILES notation, (2) the computer determines the appropriate ECOSAR classes
for the SMILES notation, and (3) the program calculates the ecotoxicity SARs using a log Kow value.
Several "Special Classes" of ECOSAR SARs or classifications do not use the log Kow value or cannot be
adequately classified from the SMILES (in this ECOSAR version). These "Special Classes"
include (a) Polymers, (b) Inorganics, (c) Dyes, and (d) Surfactants. The current version of the
ECOSAR Class Program does not include SARs for Polymers, Dyes, or Inorganics (these may
be added in the future). However, SARs are available for various Anionic, Cationic, Nonionic,
and Amphoteric Surfactants. Instead of the log Kow value, these SARs use the number of
ethoxylate units or the average length of a carbon chain. These "Special Classes" are accessed
from the Main Menu bar (see Figure 4).
[Screen image: Ecosar Classes v0.99 main window with the special class menu open; visible
entries include Dyes, Polymers, Inorganics, Neutral/Nonionic, Anionic, and Amphoteric.]
Figure 4. Special Class Menu Options
The Special Classes have their own data entry dialog box (see Figure 5). The calculated results
are placed in the same Results Window as results using SMILES notations (an example is
illustrated in Figure 6). Note: the Water Solubility or Water Dispersibility fields in the data
entry dialogs are not used in SAR calculations.
9. BIBLIOGRAPHY
Konemann, H. 1981. Fish toxicity tests with mixtures of more than two chemicals: a proposal for
a quantitative approach and experimental results. Toxicology 19: 229-238.
Meylan, W.M. and P.H. Howard. 1994a. Upgrade of PCGEMS Water Solubility Estimation
Method (May 1994 Draft). Prepared for Robert S. Boethling, U.S. Environmental
Protection Agency, Office of Pollution Prevention and Toxics, Washington, DC;
prepared by Syracuse Research Corporation, Environmental Science Center, Syracuse,
NY 13210.
Meylan, W.M. and P.H. Howard. 1994b. Validation of Water Solubility Estimation Methods
Using Log Kow for Application in PCGEMS & EPI (Sept 1994, Final Report). Prepared
for Robert S. Boethling, U.S. Environmental Protection Agency, Office of Pollution
Prevention and Toxics, Washington, DC; prepared by Syracuse Research Corporation,
Environmental Science Center, Syracuse, NY 13210.
Meylan, W.M. and P.H. Howard. 1995. Atom/fragment contribution method for estimating
octanol-water partition coefficients. J. Pharm. Sci. 84: 83-92.
Meylan, W.M. and P.H. Howard. 1996. Improved method for estimating water solubility from
octanol/water partition coefficient. Environ. Toxicol. Chem. 15: 100-106.
Weininger, D. 1988. SMILES, a chemical language and information system. 1. Introduction
to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28: 31-36.
APPENDIX A
Selected SMILES Information
A SMILES notation is considered terminated at the first blank space. Characters following the first blank
space are ignored!
Entering the Nitro Function (NO2)
The nitro function (NO2) is usually written as N(=O)(=O) or N(=O)=O in SMILES notation. In this
program version, the nitro function can also be designated (simply) by the capital letter T.
Entering the Sulfonic Acid Function
The sulfonic acid function (-SO2-OH) is usually written as S(=O)(=O)O in SMILES notation.
Carbonyl Function (C=O) Information
The carbonyl function (C=O) should always be entered in upper case letters. Additional information is
presented in the SRC document "A Brief Description of SMILES Notation".
Metals & Charged Species
Charged species cannot be entered directly into the program with + and - signs. Compounds such as
QACs (quaternary ammonium compounds) can be entered by simply attaching the charged groups as if a direct bond
exists; for example, tetramethyl ammonium bromide can be entered as N(C)(C)(C)(C)Br. Also, for many
hydrochlorides, simply ignore the HCL portion of the structure (leave it out and enter the compound as the
nonhydrochloride; alternatively, see the section on entering hydrogen directly).
The ECOSAR Class Program can accept and evaluate the following METALS:
    Na sodium    Hg mercury    K potassium    Li lithium
Use the chemical symbol to include any of these metals; for example, sodium acetate could be: NaOC(=O)C.
Alternatively, the above metals and ALL OTHER metals can be put in a SMILES notation by bracketing as follows:
    [Na] sodium    [As] arsenic    [Ca] calcium    [Sn] tin    [Pb] lead    ...etc.
Valence charges are NOT evaluated in brackets! ATTACH metals to the corresponding negatively charged species
and do NOT use + and - charges in the SMILES! Example: in some SMILES notations, sodium hexanoate would
be entered as: [Na+][O]C(=O)CCCCC; however, this is not allowed in this program because charges are not
allowed and oxygen cannot be bracketed.
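A SMILES string written for other software can be screened for charge signs before it is entered. The Python sketch below is an illustrative pre-check with a hypothetical function name; it is not how the ECOSAR Class Program itself parses input. It flags only bracketed atoms that carry a + or - sign and does not test the other rules described above, such as the ban on bracketed oxygen.

```python
import re

# A bracketed atom that contains a + or - sign, e.g. [Na+] or [O-].
_CHARGED_BRACKET = re.compile(r"\[[^\]]*[+-][^\]]*\]")


def find_charged_atoms(smiles):
    """Return the bracketed atoms in a SMILES that carry a charge sign."""
    # Only the text before the first blank space counts as the SMILES.
    smiles = smiles.split(" ", 1)[0]
    return _CHARGED_BRACKET.findall(smiles)


if __name__ == "__main__":
    for text in ["[Na+][O]C(=O)CCCCC", "NaOC(=O)C", "N(C)(C)(C)(C)Br"]:
        print(text, "->", find_charged_atoms(text))
```

An empty list means no charged bracket atoms were found; any entries returned need to be rewritten with the metal attached directly, as in NaOC(=O)C.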
Entering Hydrogen Directly
For the ECOSAR Class Program, direct hydrogen entry in a SMILES notation is not allowed, with the
exception of connection to aliphatic or aromatic nitrogen for the purpose of entering a nitrogen with a valence
greater than +3 (e.g., various quaternary ammonium compounds and hydrochlorides). Nitrogens with a valence of +3
or less ignore direct hydrogen entries. Hydrogen is entered as an upper case H (as in the following examples):
    (1) acridine hydrochloride: c1ccc2cc3ccccc3n(H)(CL)c2c1
    (2) benzenepentanamine hydrochloride: c1ccccc1CCCCCN(H)(H)(H)CL
When to include the "HCL" in SMILES for various hydrochlorides depends upon the nature of the
hydrochloride. For example, most hydrochlorides represented generically as Formula.HCL can ignore the HCL;
however, most ammonium-type compounds (such as #2 above) require the direct hydrogens.
Aromatic Selenium
Aromatic selenium can be entered as either (1) lower case se or (2) [se]. For example, selenofuran
could be entered as (1) c1csecc1 or as (2) c1c[se]cc1. If it is entered as C1=CSeC=C1, the ECOSAR Class Program will
automatically convert it to: c1c[se]cc1
Miscellaneous
In selected diazoacetyl compounds (e.g., azaserine, N2=CH-CO-O-CH2-CH(-NH2)-COOH), the N2 is
commonly written as a charged N=N unit. For the purposes of SMILES notation, the unit is considered as: N#N.
-------
APPENDIX D
Estimation of Water Solubility
The ECOSAR Class Program estimates water solubility using methodology developed for the
U.S. EPA and described in Meylan and Howard (1994a, 1994b, 1996). The estimation equations
used in the current version are as follows:
No Melting Point Available:
    log WaterSol (moles/L) = -0.312 - 1.02 log Kow
Liquid at 25 deg C:
    log WaterSol (moles/L) = 0.551 - 1.091 log Kow
Solid at 25 deg C:
    log WaterSol (moles/L) = 0.2236 - 1.009 log Kow - 0.00956 (Tm - 25)
(where Tm is the melting point in deg C)
Note: all water solubility estimates pertain to 25 deg C.
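The choice among the three equations can be sketched in Python as shown below. This is an illustration under stated assumptions, not the program's code: the function names are hypothetical, and a compound whose melting point is at or below 25 deg C is assumed to count as a liquid. The guide does not say how a melting point of exactly 25 deg C is handled.

```python
def log_water_sol_mol_per_l(log_kow, melting_point_c=None):
    """Return log10 of water solubility (moles/L) at 25 deg C."""
    if melting_point_c is None:
        # No melting point available.
        return -0.312 - 1.02 * log_kow
    if melting_point_c <= 25.0:
        # Assumed to be liquid at 25 deg C.
        return 0.551 - 1.091 * log_kow
    # Solid at 25 deg C: includes the melting point correction term.
    return 0.2236 - 1.009 * log_kow - 0.00956 * (melting_point_c - 25.0)


def water_sol_mg_per_l(log_kow, mol_weight, melting_point_c=None):
    """Convert the estimate to mg/L; mol_weight is in g/mol."""
    mol_per_l = 10.0 ** log_water_sol_mol_per_l(log_kow, melting_point_c)
    # moles/L * g/mol = g/L; multiply by 1000 to get mg/L.
    return mol_per_l * mol_weight * 1000.0


if __name__ == "__main__":
    # Hypothetical solid: log Kow = 3.0, MW = 150 g/mol, Tm = 80 deg C.
    print(water_sol_mg_per_l(3.0, 150.0, melting_point_c=80.0))
```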
ECOSAR Chemical Classes
Alphabetic list of chemical classes identified from SMILES notations by the program
(only the entries legible in this copy are listed):
    Acid Chloride/Halide
    Acrylamides
    Amines
    Anilines (amino-ortho)
    Diazoniums
    Diepoxides
    Diketones
    Dinitro Aromatic Amine
    Neutral Organics
    Peroxy Acids
    Phenols
    Phenols (dinitro)
    Propargyl Alcohols
    Propargyl Ethers
    Quinone
    Salicylates
    Salicylic Acid
-------
Nabholz JV, Cash G, Meylan WM, and Howard PH. 1998. ECOSAR: A
computer program for estimating the ecotoxicity of industrial
chemicals based on structure activity relationships. Washington,
DC: Risk Assessment Division, Office of Pollution Prevention and
Toxics, United States Environmental Protection Agency. MS-Windows
Version 0.99. Available from J. V. Nabholz, RAD (7403), USEPA, 401
M St, SW, Washington, DC 20460-0001, tel: 202-260-1271, email:
nabholz.joe@epamail.epa.gov, or P. H. Howard, Syracuse Research
Corp., 6225 Running Ridge Rd, N. Syracuse, NY 13210, tel: 315-452-8417,
email: howardp@syrres.com.