National Environmental Supercomputing Center (NESC) Annual Report, FY 1994


            United States       National Environmental Supercomputing  EPA 208/K-95-QQ1
            Environmental Protection    Center (NESC)          February 1 1995
            AQency         135 Washington Avenue
                       Bay City, Ml 48708-5845
&EPA      FY 1994 National
            Environmental
            Supercomputing
            Center (NESC)
            Annual Report

-------
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
NATIONAL ENVIRONMENTAL SUPERCOMPUTING CENTER
The NESC Mission:

The mission of the NESC is to
provide compute-intensive
processing for scientific appli-
cations that have a high prior-
ity within the EPA's overall
mission. These applications
will come from the EPA
researchers, regulatory staff,
and support personnel. In
addition, the NESC will ser-
vice those universities, agen-
cies, and private companies
external to the EPA having grants, cooperative agreements, and
memoranda of understanding with the EPA, in which their spi-
ence qualifies as compute-intensive and has a high priority ;
within the EPA.
The computational services of the NESC include:
•   Management of a wide range of hardware, software, and
    services into an integrated supercomputing service.
•   Centralized compute-intensive processing facility.      •
    Data communications networks.                      \
•   Consultation services relating to the NESC.
A secondary mission of the NESC is to promote environmental
modeling and computational science within local schools, by
means of academic-year and summer programs.

-------
   Contents
 Introduction to the NESC FY1994 Annual Fteport.	1
 Message from the NESC Director.	3
 Framework for Environmental Modeling	19
 Models of Service Level for Jobs Submitted to the NESC's HPC
 Resources..		23
 Second Annual International Environmental Visualization Workshop	31
 Calculating the Rates of DNA-Catalyzed Reactions of Aromatic
 Hydrocarbon Diol Epoxides	33
 Prediction of Oxidative Metabolites by Cytochrome P450s with
 Quantum Mechanics and Molecular Dynamics Simulations.,.	37
 High Performance Computing For Environmental Research ,i	45
 U.S. EPA Workshop on Watershed, Estuarine, and Large Lakes
 Modeling (WELLM)	49
 Visualization for Place-Based Management	57
 Experimental and Calculated Stabilities of Hydrocarbons and
 Carbocations	59
 Infrared Frequencies and Intensities Calculated from MM3,
 Semiempirical, and Ab Initio Methods for Organic Molecules	65
 Ab Initio Calculations of Proton Affinities, lonization Potentials,
 and Electron Affinities of Polynuclear Aromatic Hydrocarbons  and
 Correlation with Mass Spectral Positive and Negative Ion Sensitivities	71
 Parameterization of Octanol / Water Partition Coefficients (LogP)
 Using 3d Molecular Properties:   Evaluation of Two Published Models
for LogP Prediction	77
 Regional Acid Deposition Model (RADM) Evaluation.		83
Atmospheric Deposition of Nitrogen to Chesapeake Bay		85
Study of the Feasibility of Acidic Deposition Standards	91
NESC Annual Report - FY1994

-------
Contents
1990 Clean Air Act Section 812 Retrospective Study		P	95

The Role of Supercomputers in Pollution Prevention: Predicting
Bioactivation for the Design of Safer Chemicals		.	i	97

Application of Molecular Orbital Methods to Qsar And Lfer:             |
Understanding and Characterizing Interactions of Solutes and Solvents, i	99

Calculated Infrared Spectra for Halogenated Hydrocarbons		i	105

Development of Physiologically Based-Pharmacokinetic and            ;
Biologically Based-Dose Response (PB-PK/BB-DR) Models             ;
for Dioxins and Related Compounds Incorporating Structure-Activity    i
Considerations	•	f •
                                                                      113
                                                                      117
The Chesapeake Bay Program Simulation Models:  A Multimedia       j
Approach	•	\ •

Visualization Techniques for the Analysis and Display of Chemical,     !
Physical, and Biological Data Across Regional Midwestern Watersheds . L	131

Estimation of Global Climate Change Impacts on Lake and Stream     ;
Environmental Conditions and Fishery Resources		f	135

Integration of Computational and Theoretical Chemistry with          i
Mechanistically-Based Predictive Ecotoxicology Modelling	 ^	139

Evaluation and Refinement of the Littoral Ecosystem Risk            !
Assessment Model (LERAM) Year Two		145
The Reactivities of Classes of Environmental Chemicals -
Understanding Potential Health Risks		
                                                                      149
                                                    NESC Annual Report - FY1994

-------
Introduction to the NESC FY1994 Annual Report1
The National Environmental Supercom-
puting Center (NESC) is EPA's latest invest-
ment to assure that science of the highest
quality is conducted to support environmen-
tal protection. The benefits of the NESC to
the public and EPA programs center on fos-
tering collaborative efforts among scientists
to bring the results of scientific research to
the decision making process.
To achieve success at the NESC, four
tightly integrated programs are maintained.
• First, operation of a supercomputing
resource to provide the maximum
amount of computing time to research-
ers is the backbone of the NESC
mission.
• Second, a strong computational science
support effort for Agency scientists is
essential to ensure the efficient use of
resources and the improvement of math-
ematical models.
• Third, an aggressive and innovative
program in visualization and presenta-
tion of scientific information is directed
toward scientists, environmental deci-
sion making officials, and the public.
• Fourth, collaborative efforts among all
groups are strongly supported through a
nationwide telecommunications network,
workshops and seminars, and educa-
tional programs such as Earth Vision:
EPA's Grand Challenge for High
Schools.
In its first years of operation, the NESC
has become the central resource for carry-
ing out the research programs that are vital
to large-scale studies of ecological and bio-
logical systems. Its continued success in
supporting these efforts depends upon the
collaboration among scientists, managers,
and the public.
The NESC remains dedicated to providing
not only supercomputing resources, but also
to providing the coHegial environment nec-
essary for that collaboration.
Walter M. Shackelford, Director of Scientific Computing, U.S. EPA, National Data Prcicessing Division (NDPD)
Research Triangle Park, NC 27711.
NESC Annual Report - FY1994

-------
Introduction to the NESC FY1994 Annual Report
                                                            NESC Annual Report - FY1994

-------
   Message from the NESC Director
 Overview of The NESC
   The United States Environmental Protec-
 tion Agency's (EPA) National Environmental
 Supercomputing Center (NESC) has a sin-
 gle purpose; to support world-class environ-
 mental research.  The NESC provides
 scientists and researchers with the high-per-
 formance computational tools necessary to
 solve EPA's Grand Challenges.
   On August 25,1992 the first EPA-owned
 supercomputer, a Cray Research Y-MP 8i/
 232,  was delivered and installed at the cen-
 ter. In 1994, this computer was replaced
 with a Cray C94/264 (subsequently
 upgraded to a C94/364). The replacement
 was required to satisfy the demands of com-
 pute-intensive jobs, (e.g., computational
 chemistry) and those requiring more mem-
 ory space (e.g., the Regional Acid Deposi-
 tion Model).
   The NESC is housed in an 80-year-old
 building near downtown Bay City. The build-
 ing was extensively renovated to make it
 suitable as a supercomputing center.
 Despite that renovation, the building retains
 much of its circa 1910 Chicago-style exte-
 rior, belying the state-of-the-art sophistica-
 tion and functionality that lies within. In.
 addition to housing the computer, the  build-
 ing includes a Visitor's Center, classroom,
 visualization laboratory, and staff offices.

 The NESC's Customers
   In FY1994, 51 research projects had allo-
 cations on the NESC's supercomputer.
 Table 1, page 4, shows a summary of the
 projects. Table 2, page 7, shows a break-
 down  of the FY1994 projects by EPA
 organization.
  Although the bulk of research projects at
the NESC come from the ORD laboratories,
 a broad cross-section of EPA organizations
 are using the computing resources at the
 NESC to approach problems in a wide vari-
 ety of disciplines, j
   Within ORD, the breakdown is shown in
 Table 3, page 7. Table 4, page 8, lists the
 NESC's customers by their area of research.

 NESC Usage   i

   The NESC became officially operational in
 October of 1992.  Within three  months con-
 strained demand for NESC's resources very
 nearly saturated the first Cray Y-MP super-
 computer and the third processor that was
 soon added. The following May (1993), the
 initial Storage Technology silo (capacity of
 about one Terabyte) was full of environmen-
 tal research data,  and a second silo was
 installed. On May 6,1994 the Cray Y-MP
 was replaced with a more powerful Cray
 C94. The C94's two processors were
 almost immediately saturated, and on Octo-
 ber 12,1994, a third processor was added.
 Demand for the firs* four months of FY1995
 has averaged 91%. At this time the upper
 limit of research demand is unknown.
  It is useful to visualize the distribution of
 projects and organizations together with the
 percentages of the use of supercomputing
 resources at the NESC.  Figure 1, page 9,
 shows the usage of the NESC Cray C94
 supercomputer (Sequoia) for FY1994. Fig-
 ure 2, page 10, shows FY1994 Sequoia
 usage by project.  Figure 3, page 11, shows
 Sequoia FY1994 usage by organization.
  Three projects (Earth Systems Modeling,
 Regional Acid Deposition Model, and Struc-
ture Activity Relations) use more than one-
half of the NESC's resources, and eight
projects (Regional Acid Deposition Model,
Earth Systems Modeling, Structure Activity
Relations, Regional Oxidant Model -
NESC Annual Report - FY1994

-------
Message from the NESC Director
                           Table 1: NESC FY 1994 Projects

Project
AQDECON
ARSENIC
ASTER
BUFFSED
CERES
CHBAY
CORVTEST
DOSMETRY
ENGINE
ERTOM
FOXSED
GENESIS
GENSED
GLOBCHEM
HPCC
HSPF
HYDRO3D
EPALab
RSKERL Ada
ORD
HERL
ORD
ERL Duluth
ORD
LLRS
ORD
ERL Athens
ORD
CBPO
ERL Corvallis
ORD
HERL
ORD
NVFEL
OAR
RSKERL
ORD
LLRS
ORD
ERL Athens
ORD
LLRS
ORD
AREAL
ORD
AREAL
ORD
ERL Athens
ORD
ERL Athens
ORD
•'/' JV^'::V a ;v'':^;, Description --; ,'< .>;";;!
Transport of Multiple Components in Aquifers under Bio-
logical or Chemical Reactions
Pharmokinetic modeling of arsenic speciation, ligiand
exchange chemistry and toxicokinetics ,
Database of Electronic Structures for Environmental Risk
and Toxicology
Buffalo River Contaminant Transport Modeling
i
Earth Systems Model
Chesapeake Bay Program Water Quality Models!
Corvallis Evaluation Account '
Multiple Paths Dosimetry Model for Gas Uptake j
Flow Field Fuel-Air Mixing and Combustion Process in an
Alternative Fueled Engine
Inverse Solution of Electrical Resistance Tomography
Fox River/Green Bay Contaminant Transport Modeling
GENESIS Earth Systems Modeling Project :
Sediment and Contaminant Transport and Fate in Aquatic
Systems j
Global Chemical Transport Modeling for MODELS3
I
High Performance Computing and Communications
Linked Airshed/Watershed/Water Quality Simulat on Mod-
els
Integration of 3D Hydrodynamic, Water Quality, ahd Sedi-
ment Models in Surface Waters

4 NESC Annual Report - FY1 994

-------
                                                          Message from the NESG Director
                        Table 1:  NESC FY 1994 Projects (continued)
Project
'•-'' AV'"-:s£'-
INDOOR
JUNEAU
LAKEGCM
LERAM
LISOUND
LUNG
MICHUAM
MICRO
MM4
MOHAVE
MOMOLSAR
NEWCHEM
ONTARIO
PAHLV
PCB
PHOTOX
QSARHERD
RADM
.^.EP^jLab
AREAL
ORD
EPA Region 10
ERL Duluth
ORD
ERL Duluth
ORD
EPA Region 1
HERL
ORD
EPA Region 5
AREAL
ORD
AREAL
ORD
AREAL
ORD
HERL
ORD
OPPT
LLRS
ORD
EMSL Las
Vegas
ORD
EPA Region 7
ERL Duluth
ORD
OPPT
AREAL
ORD
. •• ; .'v': "' ' '. Description',-'-'-.-; ;-; .. •..'•'•' '•'••*A
" ,': :•••-,. •• '• .-,:'•• '••:•' - . • .. .• ,•'. ••••'•":• •••".• • ' ••'•'':. ." •,"'•'-?£
Indoor Air Exposure Modeling Program
Alaska Juneau Mine, Gastineau Channel
Estimation of Global Climate Changes and Agricultural
Activities on Lakes and Streams
Evaluation and Refinement of Ecosystem Risk Assess-
ment Models
Nitrogen Discharge Allocation in Long Island Sound
Mathematical Modeling of Fluid Dynamics in Lung Airways
and Behavior of Inhaled Particles
Photochemical Modeling for the Detroit-Ann Arbor Nonat-
tainment Area
Exposure Assessment Research in Microenvironments
Mesoscale Meteorological Modeling for Air-Quality Simula-
tions
Intensive Meteorological and Dispersion Simulations for
the MOHAVE Field Study
Molecular Modeling !
Toxicity of New Chemicals
A Linked Hydrodynamic-Water Quality Model for Lake
Ontario
Ab Initio Calculations on Polynuclear Aromatic Hydrocar-
bons
Theoretical IR spectra of polychlorobiphenyls and dioxins
3rediction of Photo-oxidation by PAH Modeling
Molecular modelling for health effects research
Regional Acid Deposition Model ;
NESC Annual Report - FY1994

-------
Message from the NESC Director |
Table 1 : NESC FY 1994 Projects (continued)
Project
RELMAP
ROMA
ROMO
RPM
SACRAIR
SARMAP
SEDCRIT
SNAKE
SOS
SPECIES
SUBSURF
TOXMODE
TRANSPRT
UAM
VENTUFIP
WIPP
EPA Lab
AREAL
ORD
AREAL
ORD
OAQPS
OAR
AREAL
ORD
EPA Region 9
EPA Region 9
EPA
OW
EPA Region 10
AREAL
ORD
ERL Corvallis
ORD
RSKERL
ORD
ERL Duluth
ORD
RSKERL
ORD
AREAL
ORD
EPA Region 9
ORIA
OAR
/Description
.-.. :•- •••-••• .-•-," • '. ' .,' •:•'•:;;:•• '•.. ' ';v
Regional Toxic Deposition Modeling
r
Regional Ozone Modeling - Research and Development
Regional Ozone Modeling to Support Clean Air Act Man-
dates
I
Regional Particulate Modeling
Sacramento Area Tropospheric Ozone Modeling for FIP
SJVAQS/AUSPEX Regional Model Adaptation Project
Development of Toxic Sediment Criteria Methodologies
F
A River Basin Modeling Framework for Ecological Risk
Analysis |
Southern Oxidant Study: Urban-Scale Photochemical
Modeling
Heuristic Approximations to the Exact Set Coverage of
Biogeographic Data ;
Uncertainty Analysis of Subsurface Hydrocarbon Releases
Artificial Intelligence Systems for Toxic Mode-of Action
Predictions
Three-dimensional Multiphase Flow and Contam nant
Transport Mathematical Model
Urban Photochemical and Meteorological Modeling
Ventura Air Basin Federal Implementation Plan Study
Waste Isolation Pilot Plant Performance Assessment Mod-
eling
NESC Annual Report - FY1994

-------
                                                           Message from the NESG Director
                             Table 2: NESC Project Affiliations
Affiliation
Office of Research arid Development
Regional Offices
Program Offices
Office of Air and Radiation
Office of Prevention, Pesticides and Toxic Sub-
stances
Office of Water
Number of
Projects
36
8
1
3
2
1
                          Table 3: NESC Projects - by Laboratory
Laboratory^
' ?"' ~ . ,,'v^
V A ~, y
AREAL
ERL Duluth
RSKERL Ada
ERL, Athens
HERL
LLRS
ERL Corvallis
EMSL Las Vegas
.Number of ^
\ Projecfs' -
12
5
4
4
4
4
2
1
NESC Annual Report - FY1994

-------
Message from the NESC Director
                       Table 4:  NESC Project - Areas of Research
Research
Area
Air Quality
Computational
Chemistry
Water Quality
Ground Water
Transport
Global Climate
Modeling
Health Effects
Emissions
Ecological and
linked models
Meterology
Other
Number of
'Projects
13
9
8
4
4
2
1
5
2
3
f
Projects . , £
i
INDOOR, MICHUAM, MICRO, RADM, RELMAP, ROMA,
ROMO, RPM, SACRAIR, SARMAP, SOS, UAM, VENTUFIP
ARSENIC, ASTER, MOMOLSAR, NEWCHEM, PAHLV,
PCB, PHOTOX, QSARHERD, TOXMODE \
BUFFSED, CHBAY, FOXSED, GENSED, HYDRO3D,
LISOUND, ONTARIO, SEDCRIT |
AQDECON, ERTOM, SUBSURF, TRANSPRT
I
CERES, GENESIS, GLOBCHEM, LAKEGCM
DOSMETRY, LUNG
ENGINE
HSPF, JUNEAU, LERAM, SNAKE, SPECIES
MM4, MOHAVE
CORVTEST, HPCC, WIPP .
OAQPS, Polynuclear Aromatic Hydrocar-
bons-UC, San Joaquin Valley Air Quality,
Chemical/Ecological Relation, and Mesos-
cale Meteorological Modeling) use more
than three-quarters of the NESC's resources
(see figure 2, page 10.)
  Figure 3, page 11, illustrates measured
percentages of NESC usage by organiza-
tion. Two organizations, Athens ERL and
AREAL, use 60% of the NESC's supercom-
puter. Four organizations,  AREAL, Athens
ERL, HERL, and OAQPS comprise three-
quarters of the demand.  And seven organi-
zations, AREAL, Athens ERL, HERL,
OAQPS, Las Vegas EMSL, Great Lakes,
and Region 9, use more than 94% of the
supercomputing resources.
Hardware                   ,
  The NESC is a state-of-the-art supercom-
puting center. Figure 4, page 12, |is a simpli-
fied schematic of the NESC's hardware
configuration as of September 1994.  The
major hardware components are detailed in
the following paragraphs.       \

Cray C94 - Sequoia           '
  The NESC's supercomputer, named
Sequoia, is a Cray Research C94/364.  The
C94 was installed in May 1994 and replaced
the NESC's original machine, a Cray Y-MP
8i/364. Specifically designed for scientific
and engineering disciplines, the Cj94 is one
of the most powerful supercomputers cur-
rently available.              j
  Sequoia has three central processing
units (CPUs), 64 megawords of central
                                                          NESC Annual Report - FY1994

-------
                                                         Message from the NESG Director
    N
   0.
   o
          o
                                                                                    I

                                                                                    "5
                                                                                    o


                                                                                    I
o
nf
•=


I
O

i
                                                                                    £

                                                                                    il
NESC Annual Report - FY1994

-------
Message from the NESC Director
(B
CD
19 .Q
N E
o
CQ
0 CQ
0)
0 -
3 Q
IT O
** 4-1
00

-------
Message from the NESC Director
c
o

"S <*

US
C r"
CO i:
.Q
*- C

o I

•j^ ^^

O
k>
N 0)
5 ?;

"o °
3 *-
CT
01
w
• i i i i i i i i i i i

UJNJi-JJlJOOO "B *

||o|S-3Sc°S|||
ID

o
o
e
in CQ

o O
o
o
n
0
n
o
e
-I tt -I -I .1
I"
CL
CJ

ca
(D
CO
NESC Annual Report - FY1994
11
-------
Message from the NESC Director
Cray C94/364
3 processors
64 Megawords memory
512 Megaword SSD
Communications
links
SAS server
DG Aviion 5240
StorageTekSTK4400
Tape Silos (2)
2.4 Terabytes total
Chemistry server
SGI Indigo 2
I Jl J
1600/6250 9 track
tape drives (4)
DS60 disk drive units (24)
DS62 disk drive units (16)
90.7 Gigabytes total
Rgure 4: Hardware Configuration Overview, National Environmental Supercomputing
Center (NESC) - September 1994
memory, and a 512 megaword solid-state
storage device (SSD). The C94's clock
speed is 4.2 nanoseconds, with a theoretical
peak performance of 1.024 billion floating
point operations per second (Gigaflops).
With three CPUs, Sequoia's theoretical peak
performance is three Gigaflops and, if fully
configured with four CPUs, it is capable of
achieving a theoretical peak performance of
four Gigaflops.
In addition to speed, a truly balanced
supercomputer must optimize data retrieval
and storage. Sequoia has very fast input
and output (I/O) channels, which enable it to
read and/or write data at a pace in keeping
with the speed of its CPUs. Internal transfer
rates of up to 200 megabytes per second
(MB/second) are achieved between the
CPUs, memory, and the I/O clusters.
Sequoia's highest transfer rates are
achieved when communicating at 1,800 MB/
second with the SSD. The SSD is used as
an extremely high-speed intermediate stor-
age area for both system and user data, and
can provide significant speedups tar
l/O-intensive applications. This, coupled
with high-speed disk data storage (see the
next section), results in an extremely well-
balanced computing environmentfor the
NESC's customers. j

High-Speed Data Storage :
Augmenting Sequoia's central memory
and SSD are twenty-four DD-60 a;nd sixteen
DD-62 disk drives. Eight of the DJD-60, and
all sixteen of the DD-62 disk drives were
installed during FY1994. Each DD-60 drive
can store 1.96 billion bytes (Gigabytes) of
data; each DD-62 holds 2.73 Gigabytes.
The disk drives give the NESC a total high-
speed storage capacity of more than 90
Gigabytes (9 xio'0 bytes) of information.
12
NESC Annual Report - FY1994
-------
Message from the NESC Director
These disk drives are connected to Sequoia
by 40 data channels, each capable of a
20MB/second transfer rate.

Mass data storage
In addition to disk drives, the NESC has
two StorageTek 4400 robotic tape silos.
Each silo contains approximately 6,000 tape
cartridges and is capable of storing 1.2 tril-
lion bytes (Terabytes) of information. Com-
bined, the two units provide a total storage
capacity of 2.4 Terabytes (2.4x1012 bytes) of
data. All tape handling is performed by
robotic arms and is completely automatic
and "transparent" to the user. Two six MB/
second data channels connect the silos with
the Cray.
Other data transfer media, including
"round" tape facilities may be available upon
special request.

Specialized Servers
To provide expanded distributed comput-
ing functionality to EPA research community,
in FY1994 the NESC added two specialized
server machines to its computing resources.
"Sassafras" is a Data General Aviion 5240
with four CPUs, 192 Megabytes of central
memory, and 15.4 Gigabytes of disk stor-
age. Sassafras is dedicated to the manipu-
lation and statistical processing of data files
used by air quality modelers.
The NESC's computational chemistry
users are running applications on "Almond",
a Silicon Graphics (SGI) Indigo2 with one
CPU, 160 Megabytes of memory, and five
Gigabytes of disk storage. Almond pro-
vides a platform for chemistry software that
is not available or appropriate for use on the
Cray, and complements the larger applica-
tions that run on the C94.

Telecommunications
In order to meet its mission, the NESC
must serve customers throughout the United
States. From its location in Mid-Michigan,
the NESC uses a sophisticated telecommu-
nications network to serve customers at EPA
sites around the country.
The NESC's telecommunications network
consists of both a Local Area Network (LAN)
and a Wide Area Network (WAN). Each is
described in greater detail in the following
paragraphs.

NESC LAN ;
The NESC's LAN, shown in Figure 5,
page 14, is responsible for communications
inside the NESC. It consists of four Ethernet
backbones, capable of transmitting data at a
rate of 10 million bits per second (MbS). In
addition, there is a single Fiber Distributed
Data Interface (FDDI), which moves data at
100MbS. These networks are responsible
for moving data within the NESC.

NESC WAN
The NESC's WAN is illustrated in Figure
6, page 15, and is responsible for moving
data between the NESC and its customers.
The WAN consists of one T3 transmission
link and two T1 transmission links. The T3
link, which is capable of a peak transmission
rate of 45MbS, connects the NESC with
EPA's largest research facility in Research
Triangle Park (RTP), North Carolina.
One T1 link, which has a peak data trans-
mission rate of 1.5MbS, also links the NESC
with EPA's RTP facilities. The other T1 link
connects the NESC to EPA's Cincinnati
communications hub. The second T1 link
connects the NESC with MichNet, which in
turn, connects the NESC with the NSFNet
and the Internet. Plans are in place to
upgrade the MichNet link to a T3
connection.
Telecommunications routing is handled
through three NSC high speed routers.
These routers are fully redundant, with each
router capable of managing all telecommuni-
cations traffic. Two 12MB/second data lines
connect the routers to Sequoia.
Through the use of UNIX TCP/IP proto-
cols and the File Transfer Protocol (FTP),
NESC Annual Report - FY1994
13
-------
Message from the NESC Director
i i i i rn
V H' V H- \S
;
M/ i- \i

! ! ' | ' ! '
1
CB

I
£

I
ruT iui
14
NESC Annual Report - FY1994
-------
Message from the NESG Director
Tl CIliCUIT «1.S«« MI-PS)
aa Ki-PS SKA UACM-OKl: KEIAVOliK
6U KEFS X.2S EACKECNB KETWORK
LAK IHilDGI: ClliCUITS
TCP/IP CUiUUITS
MSFS ANALOG ClliCUITS
T5 CIliCUIT (
-------
Message from the NESC Director
Cooling
Consumption of all that electrical current
produces an unwanted by-product of super-
computing, considerable heat created by the
computer's densely-packed circuitry. With-
out sufficient cooling, Sequoia would be sub-
ject to extensive thermal damage.
To keep Sequoia functioning within its
thermal envelope, the NESC has three 110-
ton chilling units which operate through two
175-ton cooling towers. In the event of a
total failure of the chilling units, a 2,000-gal-
lon chilled water reservoir provides up to 15
minutes of emergency cooling capacity.

Monitoring
During FY1994 the NESC completed the
installation and configuration of a Darwel
monitoring system. The Darwel system is a
PC-based system that is connected to sen-
sors throughout the NESC. It is used to
monitor the condition and security of all vital
facility support systems.

Software
To complement the NESC's supercomput-
ing hardware, the NESC supports EPA
researchers and scientists with specialized
scientific application software packages.
Table 5, lists the software applications that
are available to researchers as of Septem-
ber 1994.
The Cray supercomputer runs Cray's
standard operating system, UNICOS, which
is an acronym formed by the words UNIX
and C_ray Operating System. As the first
part of the acronym suggests, UNICOS is a
UNIX System V-based system with Univer-
sity of California-Berkeley extensions. This
means that the UNICOS internals have been
extensively modified to make the system
usable on a supercomputer.
UNIX is the de-facto standard operating
system in the scientific community. By using
a UNIX-based operating system, research-
ers can easily move their programs and
applications between their local environment
Table 5: Software applications available on
Sequoia (as of September 1994.)
Discipline
Chemistry
Mathematics /
Statistics
Data Exchange
Graphics / Visual-
ization
Application ;; §?
I \*' '".':":"
Amber 4.0 :
AMSOL3.0 |
CHARMm 22
DISCOVER 2.9.0
DMol 2.3 i
GAUSSIAN 92
MOPAC 6.0.2
IMSL 2.0
NAGIibmarkhe
LIBSCI ;
netCDF
AVS 5.0 I
NCAR Graphics 3.2
and that of the NESC. Once a user
becomes familiar with UNIX, those skills are
transferrable across a number of hardware
platforms, including the Cray. ;
Another advantage of a UNIX pllatform is
its adaptability to Distributed Computing. Be
it through Massively Parallel Processing
(MPP) or some form of distributed
computing such as Parallel Virtual Machine
(PVM), UNIX permits the NESC to readily
embrace future trends in large-scale scien-
tific computing. |
I
Visualization ;
In addition to "crunching numbers", the
power and speed of a supercomputer is ide-
ally suited to supporting the extensive use of
graphical visualization. EPA scientists can
call upon state-of-the-art graphical visualiza-
tion and computer-modeling capabilities to
augment their research. These visualization
techniques permit the NESC's users to "see
the unseeable". ;
Using graphically-based scientific work-
stations, environmental researched develop
complex mathematical models of air
16
NESC Annual Report - FY1994
-------
Message from the NESC Director
pollution, atmospheric conditions, the chemi-
cal components of pollution, and other EPA
"Grand Challenges". The speed and data
handling capabilities of supercomputers
allow environmental scientists to model the
interaction of the complex variables that,
until now, could not be simulated in the
laboratory.
Another important aspect of the NESC's
visualization support is in the vital area of
Computational Chemistry. This rapidly
developing branch of chemistry permits
chemists to use a supercomputer in place of
their more traditional test tubes and flasks.
Computational Chemistry experiments are
intuitive, fast, and cost-effective.
The NESC features a state-of-the-art visu-
alization laboratory staffed by experts in sci-
entific visualization. EPA researchers are
encouraged to use the laboratory and its
staff to transform their research data into
strikingly meaningful graphical images.
The NESC's visualization group includes
skilled visualization specialists located at the
NESC in Bay City and at EPA's Scientific
Visualization Center in RTP. They are avail-
able to serve EPA's researchers with per-
sonalized service and advice.
Visualization training classes have been
held for customers, both onsite and at cus-
tomer sites. The training classes focus on
the visualization tool kit adopted as EPA
standard - Advanced Visualization Systems
(AVS). Highly successful visualization con-
ferences have been sponsored by EPA and
have featured nationally-recognized
speakers.

Earth Vision
EarthVision, EPA's educational program,
is administered through a cooperative
agreement with Saginaw Valley State Uni-
versity, a local university. This is a competi-
tive program whereby high schools submit
proposals for EPA evaluation. If selected,
the school's students and teachers partici-
pate in a tutorial program on Saturdays dur-
ing the academic year. The high school
teams then submit another proposal and a
winner is selected for a three-week Summer
Research Institute. It is during this institute
that the schools work on their accepted
projects. The schools continue to work on
their projects during the academic year, and
produce a report on their research. Exam-
ples of some research projects are: Uptake
and Food Chain Transfer of Polychlorinated
Byphenyls (PCBs) in the Zebra Mussel (Dre-
issena polymorpha) and The Computer
Modeling and Visualization of Contaminant
Plume Development in Aquifers.

The NESC Staff
In addition to its hardware and software, a
world-class supercomputing center requires
considerable talent and expertise on the part
of its staff. The NESC's staff includes
experts in supercomputing operations, plan-
ning, computational science, and related
fields. The NESC staff is organized into the
following functional areas:
• Operations
• Systems Support
• Scientific User Support
• Facilities ',
• Visualization '•
• Documentation
• Management
• Telecommunications
The NESC's staff is dedicated to support-
ing the users. Staff expertise is available to
assist researchers with questions about
computer systems, UNIX, code optimization,
application porting, and documentation.
User contacts and inquiries are encouraged.

Customer Support
The NESC's customers are supported by
a special scientific group dedicated to cus-
tomer satisfaction. Staff members include
scientists with advanced degrees in physics,
chemistry, and computer science. This
group helps scientists port their codes to the
supercomputer and optimize codes in order
to meet customer requirements. This group
NESC Annual Report - FY1994
17
-------
Message from the NESC Director
has coordinated workshops in computational
chemistry and water modeling.

User Outreach
The arrival of Working Capital Funding
demands a closer working relationship
between the environmental researchers
and experienced NESC Scientific Customer
Support staff to ensure that real and impor-
tant research needs are adequately met.
This process, started in FY1994 under the
title of "outreach", will become more efficient
as communications improve and computa-
tional research needs are better understood.
Outreach and collaboration with computa-
tional scientists at the NESC will materially
improve the efficiency with which the
NESC's resources are used and save EPA
significant funding in the process.

Collaborative Modeling.
With the rapid pace of today's research,
scientists are increasingly turning to the
Internet, rather than the customary journals,
for the latest in research information. As col-
laborative tools such as collaborative visual-
ization (Mbone) improves and becomes
available at the centers of research, collabo-
rative modeling between environmental
researchers thousands of miles apart will
become the preferred way to work together.
This process will also be augmented by col-
laboration with experienced computational
scientists at the NESC and at RTF.
i

Future Directions at the NESC
The future is always predicated upon the
availability of funding supporting the NESC.
Assuming funding is adequate, one can
speculate on probable future activities at the
NESC. To keep up with resource demand at
the NESC, at least one more processor will
be needed to reach the maximum.processor
configuration on the existing C94,> bringing
the theoretical peak performance to about
four Gigaflops. In addition, sometime
between May and September of 1995,
AREAL-ORD funding will allow installation of
a Cray T3D Massively Parallel Processor
(MPP) computer at the NESC, an; architec-
ture useful for scientific applications that are
highly parallel in nature. Finally, FY1996 will
probably bring with it the installation of the
next generation of Cray Supercomputer.
1 Arthur G. Cullati, U.S. EPA Director, National Environmental Supercomputing Center (NESC), 135 Washington
Avenue, Bay City, Ml 48708,517-894-7600. !
18
NESC Annual Report - FY1994
-------
Framework for Environmental Modeling
Background
The ever-increasing complexity in the
nature and extent of environmental prob-
lems demands the use of more accurate and
reliable integrated assessment tools by
those directly responsible for solving the
problems. Many of these more scientifically
credible models, however, are difficult and
time consuming to use. A different model
exists for each pollutant of interest (i.e.
ozone, acid deposition, particulate, nitrate
loading) and for both the urban and regional
scales. These individual models do not con-
sider interactions between pollutant media,
such as air and water. Control of one pollut-
ant could adversely affect the concentration
of another pollutant, thus all key related pol-
lutants must be considered simultaneously
for integrated environmental assessments.
Technical problems, such as slow execution
times, inadequate access to large volumes
of data, incompatibility of file formats, awk-
ward human-computer interfaces, and diffi-
culty maintaining existing codes or porting
large codes to new computer architectures,
severely limit the usefulness of existing envi-
ronmental assessment tools by scientists
and regulatory analysts. Therefore, an
extensible framework is being developed to:
• provide a community platform for contin-
uous improvement of the scientific basis
of environmental models
• make model application, data, and policy
relevant information more accessible to
a variety of users to enable more effec-
tive environmental decision making
• support the infusion of emerging high
performance computing and communi-
cations technology and associated
numerical algorithm implementations.
Research Objectives
The three main objectives of the research
are to: ;

• develop the Agency's capability to per-
form complex multi-pollutant and multi-
media pollutant assessments
• build the States' capability to use more
advanced assessment tools directly
responsive to iheir needs
• position the Agency to more easily inte-
grate emerging computing technology
into key assessment tools to ensure the
most reliable and timely response to key
environmental issues.
The initial program is focused on develop-
ing a common framework, called Models-3,
to address multi-pollutant and multi-scale air
quality managemeint issues with the flexibil-
ity for future extension to cross media (air
and water) issues.

Approach ;
The initial version of Models-3 is designed
to provide state-of-the-art urban and
regional ozone, acid deposition, and aerosol
modeling; user-friendly human-computer
interaction; automated management of pro-
cessing, data, and resources; and enhanced
tools for analyzing environmental informa-
tion. Atmospheric processes are treated as
interchangeable science modules to enable
rapid testing and integration of new science.
These modules contain explicit formulations
of scale and coordinate dependencies. To
achieve acceptable turnaround for its users,
the system incorporates high performance
computing and communications technol-
ogy. For example, key algorithms are being
adapted to take advantage of parallel com-
puting. The Models-3 system will also be
NESC Annual Report - FY1994
19
-------
Framework for Environmental Modeling
extensible to satisfy special distributed pro-
cessing needs of mixed-media modeling
(e.g. simultaneous simulation of air and
water quality).
Rapid prototypes of user interfaces, data
and resource management, system control
and communication, science process, and
analysis and visualization are developed to
test the feasibility of different approaches
and to resolve critical design issues. The
system implementation is based on formal
system requirements and design specifica-
tions integrating the knowledge gained
through rapid development and user testing
of prototypes.

Results
The first five volumes of "EPA Third Gen-
eration Air Quality Modeling System" are in
final review prior to EPA clearance:
Concept
Project Management Plan
Project Risk Management
Project Verification and Valli-
Volume 1:
Volume 2:
Volume 3:
Volume 4:
dation
Volume 5: Project Configuration Man-
agement
An early prototype of a simplified air quality
model with linear chemistry demonstrated
that a modular approach could effectively
provide interchangeability among science
modules. Scalable algorithms with general-
ized coordinate systems for key physical
and chemical processes are being tested,. A
benchmark chemistry solver was imple-
mented to serve as the baseline for perfor-
mance evaluation of chemistry algorithms
on parallel architectures. The mesoscale
meteorological model (MM5) has been
ported to Sequoia and is being tested for
multiscale data generation.
Several Application Visualization System
(AVS) modules have been written to provide
in-house researchers with desktop data
analysis and animation capabilities which
directly access data sets on Sequoia. Initial
versions of modules for X-Y plots, colored
tiles, colored mesh, and colored tile and col-
ored mesh comparison have been created
and linked with a generic (reusable) AVS
module named "Run Visualization" which
serves as a driver for selecting and running
data display modules. Scientists are able to
visualize both input and output data associ-
ated with the Regional Oxidant Model, the
Regional Acid Deposition Model, and the
Urban Airshed Model. These tools handle
data access across computing platforms,
plotting and manipulating graphic| images,
and automatic selection of background
maps. |
Rapid prototyping of system framework
capabilities such as user interface, model
builder, collaborative tools, interactive analy-
sis/visualization, and decision support have
been completed to better define require-
ments and design alternatives for1 environ-
mental decision support systems;
Knowledge gained from user feedback on
early prototypes is being used to provide a
balanced understanding of the hujman and
machine interaction issues and for formal
specification of the Models-3 system
requirements and design. ;
Researchers at MCNC and NCSU have
demonstrated the use of dependency
graphs for automated execution of multiple
dependent programs including meteorology
and emissions processing for thejUrban Air-
shed Model The processing graphs are
created using the Heterogeneous! Network
Computing Environment (HeNCE) - a public
domain integrated graphical environment for
creating and running parallel programs over
a heterogeneous collections of computers.
These directed graphs which specify both
data and execution dependencies among
the tasks, support the building, stbring, and
reuse of large sequences of executions,
manage the scheduling and execution of the
computational processes, automate the
retrieval and storage of the data required as
inputs and produced as outputs. |MCNC
also completed a portable suite of modules,
data, and networks for an analysis compo-
nent for the environmental decision support
20
NESC Annual Report - FY1994
-------
Framework for Environmental Modeling
system. Interactive analysis has focused on
an interface for controlling compilation, exe-
cution and visualization within the prototype.
Communication channels between different
computational modules are an important
area of extensive testing.

Future Objectives
During FY1995 and FY1996 the detailed
design specifications for the Models-3
framework will be finalized, coded and
tested. Tests will also be performed to ana-
lyze communications issues between cou-
pled meteorology and air quality models. A
variety of tests will be performed to under-
stand the advantages and limitations of dis-
tributed computing in a hardware
environment that employs both vector and
parallel processing components. Models-3
visualization prototypes will be tested to
evaluate latency effects in a network parallel
environment where functional modules are
executed on distinct systems to support ani-
mation, image rendering and image dis-
play. Remote collaborative computing
approaches will be evaluated for
effectiveness.
Relevant Reports and Publications
Byun, D.W., A.F. Hanna, et al., "Models-3 Air
Quality Model Prototype Science Concept
Development". Transactions of the
A&WMA Specialty Conference on
Regional Photochemical Measurements
and Modeling Studies. November, 1993,
San Diego, Caliifornia.
Coats, C.J., Jr., A.F. Hanna, et al., "Model
Engineering Concepts for Air Quality Mod-
els in an Integrated Environmental Model-
ing System.1' Transaction of the A&WMA
Specialty Conference on Regional Photo-
chemical Measurements and Modeling
Studies. Novemiber, 1993, San Diego,
California.
Dennis, R.L., D.W. Byun, et al., "The Next
Generation of Integrated Air Quality Mod-
eling: EPA's Models-3." Transactions of
the A&WMA Specialty Conference on
Regional Photochemical Measurements
and Modeling Studies. November, 1993,
San Diego, California.
1 Joan H. Novak*, Atmospheric Characterization and Modeling Division, Atmospheric Research and Exposure
Assessment Laboratory (AREAL), RTP, NC 27711 (*on assignment from the National Oceanic and Atmospheric
Administration (NOAA), U. S. Department of Commerce.)
NESC Annual Report - FY1994
21
-------
Framework for Environmental Modeling
22
NESC Annual Report - FY1994
-------
Models of Service Level for Jobs Submitted to the NESC's
HPC Resources1. *
Abstract
The Network Queueing System (NQS)
complex on the National Environmental
Supercomputing Center's (NESC) C94/264
was analyzed to determine service levels as
required by the U.S. EPA's National Data
Processing Division (NDPD) under the exist-
ing contract with Martin Marietta Technical
Services, Inc. (MMTSI). The queueing the-
ory method described in the fiscal year 1993
Annual Report has again been applied in the
analysis of jobs submitted to twenty-two
public and four private queues on Sequoia
over an approximate five month period dur-
ing fiscal year 1994. As previously
observed, the analysis showed that the
probability density function of queue wait
times and service (or wallclock) times are
hyperexponential distributions and that pro-
cess rates and mean times are easily
extracted after a fit to the empirical data.
The queueing theory results have been
entered into the qperf command on the
Cray and this gives the expected queue wait
time and service time for the queue appro-
priate to the user specified CPU and mem-
ory requirements. While past performance
is no guarantee of future results, stability of
the analysis is only suspect if the character
of the whole job population changes drasti-
cally at some future time. Clearly, the longer
the time interval of the sample the more sta-
ble the prediction and therefore the analysis
has been updated on a regular basis at the
NESC.

Introduction
It was a requirement of EPA's National
Data Processing Division (NDPD) under the
existing contract with MMTSI that users of
the NESC had access to a quantitative mea-
sure of service levels for jobs submitted to
the Cray resource at the NESC. Detailed
information on job processing is available
from Cray Standard Accounting (CSA) and,
at the NESC, locally written code extracts
job-level transaction data from the CSA
super-record on a daily basis. This process
tabulates (for each batch job) the CPU, ser-
vice (wall-clock) and queue wait times, and
provides the data for queue analysis with a
view to determining the quality of service
users enjoy at NESC.

The NESC NQS System and Job Level
Data Collection

The NESC's supercomputer, Sequoia,
has twenty-two public and four private
queues identified as shown in Table 1 on
page 26. The period of the sample repre-
sents approximately five months of through-
put on Sequoia between commissioning of
the two CPU C94 and the upgrade to a three
CPU configuration. Table 1 shows, the sam-
ple size, N, for the respective queues, the
queue limits on memory and CPU time, and
also some descriptive statistics on the actual
CPU time used. It is interesting to observe
that the mean CPU time is invariably signifi-
cantly smaller than the queue CPU time
limit. A total of 6,796 jobs were processed in
this sample and Table 1 shows the distribu-
tion with respect to the queues. Table 1 also
shows values of the coefficient of variation
(sample standard deviation divided by the
arithmetic mean). This is typically larger
than unity and this indicates the likelihood of
a hyperexponential distribution. Although
not shown here, a simiiar result holds for
queue wait and sen/ice times.
NESC Annual Report - FY1994
23
-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
Elements of Queueing Theory and
Analysis of NESC Data
Once a job has been dispatched to a par-
ticular queue by NQS it can be viewed as
residing in a single server queue system.
Queueing theory then applies and treats
these times as random variables or, observ-
ables that do not have individually predict-
able values but whose values show
statistical regularity. In particular, a random
variable, X, is completely described by a
probability distribution function,
F(t)=Prob{X < t}, or by the corresponding
probability density f(t)=dF/dt. The latter is a
frequency distribution and may have various
possible shapes depending on the details of
the queueing system. However, in this anal-
ysis it is found that the distribution of queue
and service times is dominated by exponen-
tial shapes such as f(t)= ne'1*1, t > 0, or com-
binations of exponentials
(hyperexponential). Therefore, the analysis
requires the determination of the rate
parameter p. from which the probability distri-
bution is computed as F(t)=1-e^1. The func-
tion F(t) is also known as the cumulative
probability because it represents the "area"
under the density curve f(t). Since both job
queue wait and service times have been
recorded at the NESC, each is analyzed in
four simple steps: (1) a sort into bins in a his-
togram plot, (2) the fitting of the resulting dis-
tribution with jae"^ to determine n, (3)
computation of the mean as 1/u, (property of
the exponential distribution) and, (4) compu-
tation of F(t). The observed distribution is
characteristic of a hyperexponential func-
tion where the forward peak is described by
an exponential with one rate and the tail is
described by another exponential with
another rate. The mean times are simply
the inverses of the respective rates for each
exponential distribution. The fact that there
is a hyperexponential distribution shows that
the jobs are of (at least) two types. How-
ever, since the largest fraction of the sample
falls in the peak of the distribution, the analy-
sis focused on results for an exponential dis-
tribution which fits this peak.
Table 2, page 27, shows the values
obtained for the mean queue rate from an
empirical fit to the forward peak of the
observed probability density. Cases where
N is small (less than 40) should be viewed
with some circumspection and cases with
very small samples (less than 10), as intelli-
gent guesses at best. Table 2 shows the
distribution of mean queue times as a func-
tion of the queue. This is seen to| vary over
many orders of magnitude from 0:01 min-
utes for the 4MW, 600 second queue to over
1,000 minutes for queues with 10(3,000 sec-
ond time limits. Table 3, page 28, [shows the
corresponding results for the mean service
rate and mean service time, again for expo-
nential empirical fits to the forward peak of
the probability density. The same bomments
on sample size apply here also. •
Table 3 shows the distribution ojf mean
service times which varies from one minute
to 4,264 minutes. One method of [estimating
a gross throughput is to use the sum of the
individual rates given in the row marked
"TOTALS" In Tables 2 and 3. Forjthe queue
times this gives a rate of 162.7 per minute, a
value biased by two large entries of 80.4 and
81.6, which, when subtracted, leaye the
gross queue estimate of 0.8 jobs/minute.
For the service time the gross rate is 1.8
jobs per minute, with a strong bias of 0.77
from one queue, which when subtracted,
leaves 1.02 jobs per minute. Thus, as a
gross estimate, a two CPU C94 configura-
tion processed one job per minute in the
mean. However, it should be stated that this
simple argument takes no account of vari-
ability in the data. Also, while mean rates
and mean times are shown in Tables 2 and
3, the mean time alone is not the best indi-
cator of service level. Variability of the data
is taken into account when the cumulative
probability is computed as the accumulated
area under the corresponding probability
density distribution. Also, the cumulative
probability distribution curve shows the
probability (or likelihood) that a given job has
the predetermined queue/service Itime cho-
sen for any specified value of the [time.
NESC Annual Report - FY1994
-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
Sequoia's qperf command has been
revised to include the results of the C94
analysis and made available online to users
of the NESC. This command generates a
tabular form of the model probability distribu-
tion function for both queue wait and service
times. The user specifies the CPU time and
memory requirement and the appropriate
queue is selected. Figure 1, page 29,
shows an example of how the command is
used and the resulting output that the com-
mand provides the user.
Conclusions
A five month sample of job level NQS data
on Sequoia as a G94/264 configuration has
been analyzed by a simple single server
queueing model. The resulting analysis
enables the prediction of expected job ser-
vice levels for queue wait and service times.
These predictions have been made avail-
able to NESC's users through a revision of
the on-line qperf command which tabu-
lates expected queue and service times and
associated probabilities.
1 George Delic, Martin Marietta Technical Services, Inc. (MMTSI), National Environmental Supercomputing Center
(NESC), 135 Washington Avenue, Bay City, Ml 48708-5845.
2 Robert Upton, Martin Marietta Technical Services, Inc. (MMTSI), National Environmental Supercomputing Center
(NESC), 135 Washington Avenue, Bay City, Ml 48708-5845. ,
NESC Annual Report - FY1994
25
-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
Table 1: Queue names and sample staitistics for CPU times for the twenty-two public and four
private queues of the NESC's NQS for a sample corresponding to the period 6 May to 30
September 1994 when a Cray C94 2/64 was installed at the NEESC. Queue limits are shown in
million words (memory) and seconds (CPU time). The sample size for each queue is shown in
the column labelled N. i
Queue name
Q4MW_600S
Q4MW_3600S
Q4MW_100KS
Q4MWJJNLS
Q8MW_600S
Q8MW_3600S
Q8MW_100KS
Q8MWJJNLS
Q12MW_600S
Q12MW_3600S
Q12MW_100KS
Q12MW_UNLS
Q18MW_600S
Q18MW_3600S
Q18MW_100KS
Q18MWJJNLS
Q24MW_600S
Q24MW_3600S
Q24MW_100KS
Q24MWJJNLS
Q40MW
AREAL1 (RADM)
AREAL2 (ROM)
AREAL_ROMDP
AREAL_RADMDP
NIGHT
Memory
(MW)
4
4
4
4
8
8
8
8
12
12
12
12
18
18
18
18
24
24
24
24
40
10
16
6
4
24
CPU
(sec)
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
999999
999999
999999
1830
1830
2880
N
•? -/
1136
1588
338
9
98
183
300
36
597
477
375
27
127
181
351
72
45
43
83
43
60
202
414
5
0
6
Mean ,
CPU ;
time
(min) ~
0.82
7.45
114.27
889.46
0.90
17.19
109.74
326.13
2.81
12.83
94.64
1070.54
2.04
12.08
90.88
174.63
1.71
17.74
116.29
623.65
60.57
135.17
51.13
0.03
-
165.88
Standard
deviation
1.88
12.55
211.31
1239.80
1.76
17.37
106.34
578.47
2.48
16.13
143.28
1771.95
2.91
16.91
146.99
389.51
2.76
1.38
145.92
929.47
103.81
93.53
52.27
0.03
-
147.64
Coefficient
of
variation
I 2.29
| 1.68
; 1.85
i 1.39
1.94
i 1.01
| 0.97
; 1-77
! 0.88
j 1.26
j 1.51
I 1.66
! 1.43
! 1.40
| 1.62
| 1.66
I 1.61
i 0.98
i 1.25
i 1.49
i 1.71
i 0.69
I
\ 1.02
i 0.96
i
I 0.89
26
NESC Annual Report - FY1994
-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
Table 2: Queueing model analysis results of Sequoia NQS data for a sample corresponding to the
period 6 May to 30 September 1994 when a Cray C94 2/64 was installed at the NESC. The results
show queue rate parameters and mean queue wait times as installed in the qperf user tool on
Sequoia.
"* y -e
<. s- *? *
*-*'C\,
Queue name'
A < Z» *V
f ^s
^ ' 0
Q4MW_600S
Q4MW_3600S
Q4MWJOOKS
Q4MW_UNLS
Q8MW_600S
Q8MW_3600S
Q8MW_100KS
Q8MWJJNLS
Q12MW_600S
Q12MW_3600S
Q12MW_100KS
Q12MW_UNLS
Q18MW_600S
Q18MWJ3600S
Q18MW_100KS
Q18MW_UNLS
Q24MW_600S
Q24MW_3600S
Q24MW_100KS
Q24MW_UNLS
Q40MW
AREAL1 (RADM)
AREAL2 (ROM)
AREAL_ROMDP
AREAL_RADMDP
NIGHT
TOTALS
Memory
/-(MW)
4
4
4
4
8
8
8
8
12
12
12
12
18
18
18
18
24
24
24
24
40
10
16
6
4
24

CPU (sec)
^ •>
•?
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
999999
999999
999999
1830
1830
28800

N
1136
1588
338
9
98
183
300
36
597
477
375
27
127
181
351
72
45
43
83
43
60
202
414
5
0
6
6796
Mean
.queue
" rate
(number/
min)
80.3958260
0.1909948
0.0176312
0.0339270
0.1061542
0.0094518
0.0006956
0.0021341
0.0274937
0.0113368
0.0008207
0.0022504
0,0082988
0.0220156
0.0073623
0.0025847
0.0087332
0.0047635
0.0050432
0.0010706
0.1050747
0.0005905
81,5830440
0.2132761
-
0.0008520
162.761425
Mean
queue
time
(minutes) 1"
0.012
5.236
56.718
29.475
9.420
105.801
1437.631
468.591
36.372
88.208
1218.470
444.356
; ,t2G.49<3
45.422
135.828
386.895
114.506
209.932
198.288
934.091
9.517
1693.470
0.012
4.689
-
1173.659
0.006
NESC Annual Report - FY1994
27
-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
Table 3: Queueing model analysis results for service (wallclock) rates and times for
corresponding to the period 6 May to 30 September 1994 when a Cray C94 2/64 was installed at
NESC. These results are now installed in the qperf user tool on Sequpiaj
a sample
Queue name '
• - > „
Q4MW_600S
Q4MW_3600S
Q4MW_100KS
Q4MW_UNLS
Q8MW_600S
Q8MW_3600WS
Q8MW_100KS
Q8MW_UNLS
Q12MW_600S
Q12MW_3600S
Q12MW_100KS
Q12MW_UNLS
Q18MW_600S
Q18MW_3600S
Q18MW_100KS
Q18MW_UNLS
Q24MW_600S
Q24MW_3600S
Q24MW_100KS
Q24MW_UNLS
Q40MW
AREAL1 (RADM)
AREAL2 (ROM)
AREAL_ROMDP
AREAL_RADMDP
NIGHT
TOTALS
Memory
(MW)
4
4
4
4
8
8
8
8
12
12
12
12
18
18
18
18
24
24
24
24
40
10
16
6
4
24

.CPU (sec) \f,
s.
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
600
3600
100000
999999
999999
999999
999999
1830
1830
28800

N
1136
1588
338
9
98
183
300
36
597
477
375
27
127
181
351
72
45
43
83
43
60
202
414
5
0
6
6796
Mean
service
rate
(number/ ,
min)
0.7730623
0.2481 766
0.0114870
0.0001752
0.1862655
0.0124481
0.0015649
0.0024683
0.0534026
0.0407103
0.0084882
0.0002345
0.1410151
0.0767252
0.0067908
0.0015469
0.0397179
0.0191779
0.0020375
0.0007568
0.0159816
0.0022673
0.0058797
0.1403448
—
0.0007308
1.7614555
Mean -
service
time" -
(minutes)
j 1 .294
j 4.029
87.055
5709.196
5.369
80.334
639.018
405.144
' 18.726
i 24.564
| 117.811
4264.010
j 7.091
: 13.034
i
[ 147.259
: 646.475
| 25.178
52.143
490.788
1321.414
i 62.572
441 .053
170.077
7.125
i —
1368.415
0.558
28
NESC Annual Report - FY1994
-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
Your memory request is 12 MWords.
Your time request is 600 seconds.

Your job will be placed in the q!2mw_600s queue.
Queue memory limit is 12 MW.
Queue time limit is 600 seconds.
Sampling period is 05/94-09/94. Sample size is 597.
Queue rate is 0.027494. Service rate is 0.053403.
10% of
20% of
30% of
40% of
50% of
60% of
70% of
80% of
90% of

10% of
20% of
30% of
40% of
50% of
60% of
70% of
80% of
90% of
jobs
jobs
jobs
jobs
jobs
jobs
jobs
jobs
jobs

jobs
jobs
jobs
jobs
jobs
jobs
jobs
jobs
jobs
in this
in this
in this
in this
in this
in this
in this
in this
in this

in this
in this
in this
in this
in this
in this
in this
in this
in this
queue
queue
queue
queue
queue
queue
queue
queue
queue

queue
queue
queue
queue
queue
queue
queue
queue
queue
are
are
are
are
are
are
are
are
are
scheduled
scheduled
scheduled
scheduled
scheduled
scheduled
scheduled
scheduled
scheduled
within
within
within
within
within
within
within
within
within
3.83 minutes.
8.12 minutes.
12.97 minutes.
18.58 minutes,
25.21 minutes,
33.33 minutes,
43.79 minutes.
58.54 minutes.
83.75 minutes.
complete
complete
complete
complete
complete
complete
complete
complete
complete
execution
execution
execution
execution
execution
execution
execution
execution
execution
within 1,
within 4,
within 6,
within 9,
within 12
within 17
within 22
within 30
within 43
.97 minutes.
.18 minutes.
.68 minutes.
.57 minutes.
.98 minutes.
.16 minutes.
.55 minutes.
.14 minutes.
. 12 minutes.
Rgure 1: This shows the response to Sequoia command line entry qperf -m 12 -t 600. The the queue and service rates are the
mean rates found in the analysis and the corresponding mean queue time is 0.027494'1 = 36.4 minutes and mean service time is
0.053403" - 18.7 minutes. The tabulations are for the respective cumulative probability distributions in equal percentile increments
to indicate the model prediction for the spread in times. The cumulative probability distribution table can also be read in reverse to
read the probability that a specific time is observed. As an example, for this queue, the probability that a job has a queue wait time of
25.2 minutes and a service time of 22.6 minutes is 0.5 and 0.7, respectively. Since these results are for the C94/264 configuration
they provide conservative estimates of the expected service levels on the current G94/364 installation.
NESC Annual Report - FY1994
29
-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
30
NESC Annual Report - FY1994
-------
Second Annual International Environmental Visualization
Workshop1
The second annual International Environ-
mental Visualization Workshop was held at
the Marriott Society Center in Cleveland,
Ohio from August 30 through September 1,
1994. The event was a collaboration
between the U.S. Environmental Protection
Agency's (EPA) National Data Processing
Division (NDPD), the Great Lakes National
Program Office (GLNPO), the Gulf of Mexico
Program Office, and EPA High Performance
Computing and Communications (HPCC)
Program.
The workshop was designed to provide
EPA and associated environmental
researchers and policy analysts with an
opportunity to learn about scientific visual-
ization tools. The focus of the two-and-one-
half-day workshop was on the application of
visualization, high speed networking, and
high performance computing technologies to
environmental research problems. With that
goal in mind, invited researchers from out-
side EPA shared their experiences and
viewpoints on exploring natural and physical
sciences data sets. Gregory J. McRae, pro-
fessor of Chemical Engineering at the Mas-
sachusetts Institute of Technology, provided
his perspectives on applying visualization
tools to air pollution problems. Mary Whit-
ton, from SUN Microsystems and current
Chair of the Association for Computing
Machinery's Special Interest Group on
Graphics (ACM/Siggraph), gave us some
insights into the trends and futures of visual-
ization products.
Three visualization researchers agreed to
share the latest tools they are developing.
Polly Baker, from the National Center for
Supercomputing Applications (NCSA), pre-
sented task directed visualization tools
which are oriented toward assisting specific
inquiry and analysis activities of scientists.
John Rasure, of Khoral Research Inc. and
the University of Mew Mexico, showed us a
complete application development system
(Khoros 2.0) which redefines the software
engineering process to include all members
of the work group, from the application end-
user (i.e. environmental scientists) to the
infrastructure/visualization programmer.
Peter Kochevar, from Digital Equipment Cor-
poration (DEC) and the San Diego Super-
computer Center (SDSC), demonstrated
the implementation of collaborative comput-
ing, database management, virtual reality,
and visualization tools to examine earth sci-
ences data sets as sociated with the Sequoia
2000 project.
Researchers from the National Center for
Atmospheric Research (NCAR), California
Air Resources Board (GARB), Environment
Canada, and Supercomputer Systems Engi-
neering and Services Company (SSESCO)
also agreed to share their practical experi-
ences associated with applying visualiza-
tion tools to environmental problems. Work
underway at EPA research laboratories in
Ada, Oklahoma and Athens, Georgia and
the Environmental Monitoring and Assess-
ment Program (EMAP) was also featured at
this visualization workshop.
An important component of visualization
technology transfer includes computer
graphics education programs. Gloria
Brown-Simmons, chair of the Visualization
and Presentation Subcommittee for the Glo-
bal Learning and Observations to Benefit the
Environment (GLOBE) Program, demon-
strated current efforts underway to educate
children around the world about environ-
mental sciences research. Ralph Coppolla,
of Saginaw Valley State University, spoke
NESC Annual Report - FY1994
31
-------
Second Annual International Environmental Visualization Workshop
about EarthVision, EPA's computational sci-
ence educational program for high school
students and teachers. Acha Debela, from
North Carolina Central University and cur-
rent Chair of the Historically Black Colleges
computer graphics education effort of ACM/
Siggraph, also shared with us his,latest
activities.
The event was well received and planning
is already underway for the Third Annual
International Environmental Visualization
Workshop in 1995. !
1 Theresa Marie Rhyne, Martin Marietta Technical Services, Inc. (MMTSI), Research Triangle Park, NC 27711.
32
NESC Annual Report - FY1994
-------
Calculating the Rates of DNA-Catalyzed Reactions of
Aromatic Hydrocarbon DJol Epoxides1
Some classes of small electrophiles - the
diol epoxides of polycyclic aromatic hydro-
carbons, for example - react to form cova-
lent adducts with nucleophilic sites within
specific nucleotide sequences of DNA. If
the resulting lesion is not repaired prior to
replication, a transcriptional block or a muta-
tion, possibly in a cancer-associated gene,
may result. The overall goal of our research
program is to understand the factors regulat-
ing the extent to which these molecules
react with DNA as well as their base
sequence specificity and the consequences
of their binding. Explaining the chemical
events in the early etiology of chemical car-
cinogenesis requires predicting relative
extents of DNA binding within a class of
small molecules as well as deducing the
nucleotide sequences that are "hot-spots"
for mutation for each member of the class.
Predicting "mutation "hot-spots" - not neces-
sarily coincident with "binding hot-spots" -
may be possible if the adduct conformation
is known as a function of sequence con-
text. For some molecules (e.g.
benzo[a]pyrene diol epoxide and aflatoxin
BI) binding hot-spots, mutation hot-spots
and adduct conformations have been exper-
imentally characterized.
Some molecules exhibit substantial and
varied sequence preferences in their cova-
lent adduct formation with DNA, The nucle-
otide target is the primary identifier of adduct
type but the sequence context of this nucle-
otide has a dramatic effect on the binding.
For example, aflatoxin B-, has a preference
for reacting with the N7 position of G. In
addition, in considering the sequence spe-
cific binding to the trinucleotide S'-XGY-S',
the likelihood of adduct formation varies with
X as G > C > A > T and varies with Y as
G > T > C > A and the 3' neighbor exerts a
greater influence ihan does the 5' neighbor
(Said and Shank, 1991). The alkylating
agents MNU and CCNU also preferentially
attack the central G of GGG sequences.
The enantiomers of trans-7, 8-dihydroxy-
anti-9,10-epoxy-tetrahydrobenzo[a]pyrene
(BPDE-2) prefer different sequences in their
binding to the exocyclic amine of G (Rill and
Marsch, 1990). The (-) isomer prefers AGG,
CGG and TGG while the (+) isomer rarely
binds to TGY triplets.

Both the sequence context and the adduct
conformation contribute to the mutation
spectrum of a chemical. Mutation "hot-
spots" do not necessarily coincide with bind-
ing "hot-spots". The frameshift mutagen, N-
2-Acetylaminofluorene (AAF), which forms
adducts primarily at the C8 of guanine, binds
approximately equally to all G residues
regardless of sequence context, yet muta-
tional "hot-spots" occur at alternating GC
sequences and at contiguous G
sequences. A model has been advanced
suggesting that the local structure surround-
ing the AAF-DNA a,dduct depends on the
sequence context and that this structure
determines the adduct fate (Lambert et al.
1992). The trans-7, 8-diol-9,10 epoxide of
(+)anfr-benzol[a]pyrene, (+)BPDE-2, a puta-
tive "ultimate carcinogen", has been shown
to predominately lead to G to T transversion
in a site-directed mutagenesis study
(MacKay et al 1992). Rodriguez and
Loechler (1993) proposed a model in which
bulky adducts such as those formed by
(+)BPDE-2 binding with the N2 of G exist in
multiple conformations which depend on
sequence context and that the conformation
determines the choice among G to T, G to A,
and G to C point mutations.
NESC Annual Report - FY1994
33
-------
Calculating the Rates of DNA-Catalyzed Reactions of Aromatic Hydrocarbon Diol Epoxides
The mutation spectrum of small molecules
is correlated with their potential to induce
specific neoplastic transformation. This
statement is based on DNA-adduct forma-
tion of several chemical carcinogens, most
notably, benzo[a]pyrene and aflatoxin B^, as
well as several alkylating agents and the
mutations involved in protooncogenes and
in the well studied tumor-suppressor gene,
p53. For example, G to T transversionsat
the third base of codon 249 (AAG) in the p53
gene are frequently associated with hepato-
cellular carcinoma occurring in southern
Africa, Qidong, China, establishing a link
between exposure to aflatoxin B-\ and a spe-
cific mutation in a cancer-related gene.
Benzo[a]pyrene (B[a]p) has also been
shown to exhibit a high frequency of G to T
transversion in the "hot-spot" region of p53
and in codon 12 of the Ha-ras oncogene
(Bailleul et al. 1989). On the other hand, 7,
12-dimethylbenzanthracene (DMBA) exhib-
its a similar mutational frequency to B[a]p
but is only rarely linked with G to T transver-
sion in p53 (Ruggeri et al., 1993). No tumor
cell line derived from DMBA-induced tumors
contains G to T transversion (Ruggeri et al.
1991). The alkylating agents, N-nitrosoethy-
lurea and N-nitrosomethylurea (MNU),
induce G to A transitions in codons 204 and
213 of p53 with high frequency (Ohgaki, et
al., 1992). The fact that the distinct mutation
spectra of different carcinogens are associ-
ated with different cancers suggests that the
ability to predict the mutations caused by a
given chemical may provide an etiologically
based predictive tool for its carcinogenicity.
Knowing the specific pathway for initiation
carcinogenesis for a given chemical may
ultimately prove useful as suggested by the
recent demonstration that a ras-activating G
to A transition in O6- alkylated-DNA adducts
formed by reaction with MNU has been
blocked in mice by the transgenic expres-
sion of a single human DNA repair gene
(Dumenco, et al., 1993).
Epoxides are subject to acid-catalyzed
(pH-dependent; rate constant kH) and spon-
taneous (pH-independent; rate constant ko)
hydrolyses and the overall observed rate
constant can be written as: |

k = ko + [H*J kH
Equation 1 |

The diol epoxides of numerous j polycyclic
aromatic hydrocarbons undergo DNA-cata-
lyzed hydrolysis, the rate constants of which
transversions have been experimentally
measured by others. We have hypothesized
that acidic domains at the surface! of DNA
serve as catalytic zones for the creation of
reactive carbocations (Lamm and; Pack,
1990). The rate constant for hydrolysis of a
diol epoxide (DE) in the presence! of DNA
can then be written in a way that embodies
this ;
hypothesis: !

J {kO + kH [H+](R)| [DE] |(R) dV

J [DE](R)dV j
i
|
Equation 2 '
I
In this equation, the spatial distribution of
hydronium ion concentration is written as
[H"I"](R). When the pseudo-first-o^der rate
constant kH [H+](R), is added to ko, the rate
constant for hydrolysis at R is obtained.
Multiplying this by the concentration of diol
epoxide at R, [DE](R), yields the local rate of
hydrolysis. The overall rate of hydrolysis, k,
is then given by the normalized summation
over the volume of the system, as; written in
Equation 2. If ko and kH are experimentally
known, k can be calculated from the distribu-
tions [H+](R) and [DE](R). j
Using a variable dielectric Poisson-Boltz-
mann approach (Pack et al., 1993), the spa-
tial distribution of hydrogen ions, [H+](R),
and the electrostatic potential, (j)(jR), of the
DNA-electrolyte system have bee;n mapped
to a non-cartesian grid that contours the
nucleic acid surface. A Metropolis Monte
Carlo calculation was done to determine the
distribution of diol epoxide [DE](R) in this
average electrostatic field of the system.
34
NESC Annual Report - FY1994
-------
Calculating the Rates of DNA-Catalyzed Reactions of Aromatic Hydrocarbon Diol Epoxides
Briefly, this was done by computing the inter-
action energy, U, of the diol epoxide with the
remainder of the system as shown in Equa-
tion 3.
The van der Waals parameters, a and c,
were taken from the AMBER parameter set
(Weiner etal., 1986). The electrostatic
energy was obtained by determining the
environmental grid cell, k, in which each
atom, i, of the diol epoxide was located and
subsequently multiplying the charge on that
atom by the PB-calculated potential in that
cell. The rigid diol epoxide was translated
and rotated by random amounts to generate
new configurations which were kept or
rejected according to the usual Metropolis
criterion. Special adaptations, scaling the
maxima for the translational and rotational
motions at each step, were required to
improve convergence. These will be
detailed in a subsequent publication.
The (+)syn and (+)antf diastereomers of
frans-7,8-diol-9,10-epoxy-tetrahy-
drobenzo[a]pyrene were studied. The over-
all hydrolysis rate constant as well as the
spatial distribution of rates have been calcu-
lated as a function of pH and DMA concen-
tration. These have been compared to the
experimentally determined rate constants for
DNA-catalyzed hydrolysis of these mole-
cules (Islam etal. 1987). The agreement of
the calculated and experimental rates for
this reaction as a function of DMA concen-
tration and of pH indicate that the hypothesis
is correct and suggests that this is a promis-
ing step along the pathway in predicting the
genotoxicity of a chemical from its molecular
structure.
BPDE
Equation 3
References

Said, B. and R.C. Shank, Nucl. Acids Res.
19:1311-1316,1991.
R. L Rill and G.A. Marsch, Biochemistry
29:6050-6058, 1990.
Lambert, I.E., R.L. Napolitano and R.P.R
Fuchs, Proc. Natl. Acad. Sci. USA
89:1310-1314,1992.
MacKay, W., M. Benasutti, E. Drouin and
E.L Loechler, Carcinogenesis 13:1415-
1425,1992.
Rodriguez, H. and E. L. Loechler, Biochem-
istry 32:1759-1769,1993.
Bailleul, B., K. Brown, M. Ramsden, R.J.
Akhurst, F. Fee and A. Balmain, Environ.
Health Perspect. 81:23-27,1989.
Rugerri, B., M. DiRado, S.Y. Zhang., B.
Bauer, T. Goodrow and A.J.P. Klein-
Szanto Proc. Natl. Acad. Sci. USA
90:1013-1017,1993.
Ruggeri, B. J. Caamano, T. Goodrow, M.
DiRado, A. Bianchi, D. Trono, C.J. Conti
and A.J.P. Klein-Szanto, Cancer Res.
51:6615-6621, 1991.
Ohgaki, H., G.C. Hard, N. Hirota, A.
Maekawa, M. Takahashi and P.
Kleihues, Cancer Res. 52:2995-2998,
1992.
Dumenco, L.L., E. Allay, K. Norton and S.L.
Gerson, Science 259:219-222,1993.
Lamm, G. and G. R. Pack, Proc. Natl. Acad.
Sci. USA 87:9033-9036, 1990.
Pack, G.R., G.A. Garrett, L. Wong and G.
Lamm, Biophysical J. (in press), 1993.
Weiner, S.J., P.A. Kollman, D.T. Nguyen and
D.A. Case. J. Comp Chem. 7:230-252,
1986.
Islam, N.B., D.L. Whalen, H. Yagi, and D.M.
Jerina, J. Am. Chern. Soc. 109:2108-
2111,1987. '
1 George R. Pack and Linda Wong, UIC College of Medicine at Rockford, 1601 Parkview Ave., Rockford, Illinois
61107.
NESC Annual Report - FY1994
35
-------
Calculating the Rates of DNA-Catalyzed Reactions of Aromatic Hydrocarbon Diol Epoxides
36
NESC Annual Report - FY1994
-------
Prediction of Oxidative Metabolites by Cytochrome P450s
with Quantum Mechanics and Molecular Dynamics
Simulations1
Background (History of Project)
The ubiquitous cytochrome P450
enzymes are a family of well known monox-
ygenase which are involved in the phase
one metabolism of a variety of xenobiotics
leading to either benign or harmful (i.e., car-
cinogenic, teratogenic, hepatotoxic, or neph-
rotoxic) metabolites. It is therefore important
to identify and characterize metabolites with
toxic potency and modulators which lead to
their formation. During the past few years,
considerable work in our laboratory has
been devoted to (1) the study of the mecha-
nistic nature of the oxidative biotransforma-
tion by P450s, and (2) the identification and
characterization of properties which modu-
late the competition among possible oxida-
tion reactions for a specific group of
compounds. In the case when the three
dimensional structure of the specific P450
enzyme is not known, the study is focused
only on the intrinsic properties, such as elec-
tronic structure, conformation, and functional
groups, of the xenobiotics and their oxidative
metabolites. On the other hand, when the
3D structure of the P450 enzyme is known,
as in the case of the bacterial P450cam
enzyme, the possible modulations of com-
peting reactions from the steric interaction
between the enzyme and the xenobiotics
are also considered.
The major types of P450-metabolized oxi-
dation reactions include aliphatic and aro-
matic C-hydroxylation, epoxidation, and
heteroatom oxygenation. Recently, studies
from our laboratory of the hydroxylation of
aliphatic acids, such as valproic acid and its
analogues1-3 and the epoxidation of styrene
analogues4-5 by cytochrome P450s have
been reported. Here, we reported two
investigations involving the oxidation of het-
eroatom containing compounds. In the first,
as shown below, the competition between
heteroatom oxygenation and Ca-hydroxyla-
tion of N- and S-containing compounds were
studied. See Figure 1, page 38.
The specific questions asked were (1)
why S-containing compounds are more
likely to proceed heteroatom oxidation than
N-containing compounds? (2) why Ca-
hydroxylation is preferred over N-oxidations
in amines? and (3) why S-oxidation is
favored over Ca-hydroxylation in thioethers?
Ab initio quantum mechanical methods were
used to find any possible electronic modula-
tors leading to these results.6
In the second study, Figure 2, page 38,
the stereoselective suifoxidation of thioani-
sole and p-methyl :thioanisole by P450cam
was investigated.
The experimental observations showed
that P450cam-metabolized suifoxidation
produced more R-enantiomer for thioanisole
while more S-enantiomer for p-methyl thio-
anisole. The theoretical study presented
used the known 3D structure of P450cam
enzyme and molecular dynamics simula-
tions of the enzyme-substrate interaction to
reproduce the experimental results and to
find the steric modulators that lead to these
observations.

Method/Approach
For the first study, three amines including
methyl amine, methyl ethyl amine and N-
methyl aniline and three thioethers including
methanethiol, methyl ethyl thioether and
thioanisole were used. A triplet oxygen atom
was used as a model P450 since the active
state of P450 can be best described as con-
taining a triplet [Fe=O] moiety. Ab initio
NESC Annual Report - FY1994
37
-------
Prediction of Oxidative Metabolites by Cytochrome P450s with Quantum Mechanics and Molecular
Dynamics Simulations :
(1) Heteroatom Oxygenations
N— O or
— OH
S -

(2) Ca Hydroxylation
S—O
\ I
N—C-
/ I

-s-c-
\ I
N-C
I
-OH
_S— C-OH
Figure 1: Investigations Involving the Oxidation Of Heteroatom Containing Compounds - First Study;
P450cam/O2
Figure 2: Investigations Involving the Oxidation of Heteroatom Containing Compounds - Second Study!
38
NESC Annual Report - FY1994
-------
Prediction of Oxidative Metabolites by Cytochrome P450s with Quantum Mechanics and Molecular
Dynamics Simulations
quantum mechanics were utilized to fully
optimize the structures and minimize the
energies of these parent compounds, the
possible reactive intermediates, and their
oxidative products. HF was used for closed
shell species while UHF or ROHF (when
large spin contamination was found from
UHF) were used for open shell species. 6-
31G* basis set was used for all species.
For the second study, an extended bind-
ing site model of P450cam containing 87
amino acids, 7 waters and the protoporphy-
rin IX heme unit and an active oxygen at a
distance of 1.7A above the Fe was used.
Thioanisole and p-methyl
thioanisole were then
docked in the binding site
with six different orienta-
tions. Three of the initial
orientations have R config-
urations while the other
three have S configura-
tions. After energy minimi-
zation with the AMBER
3.0A force field, the 6 different enzyme-sub-
strate complexes were each subjected to 5
ps of heating-equilibration and 125 ps of
molecular dynamics simulations with two dif-
ferent initial velocity distributions. Since the
87 amino acids forming the extended bind-
ing site are not contiguous, the backbone
(N, Ca, C) atoms of the enzyme were con-
strained in coordinate space with a harmonic
potential of 100 kcal/A2 during the MD simu-
lations. The coordinates of all atoms during
the 125 ps simulations were saved at 0.2ps
interval resulting in 625 snapshots for each
trajectory. For each snapshot, if the dis-
tance between the active oxygen of
P450cam and the S atom of the substrate
was less than 4.0A, sulfoxidation was
assumed to proceed. An angle parameter,
defined as the angle between the normal of
the C-S-C plane and the S to O vector, was
then used to determine which lone pair of
the S atom was to be attacked and the cor-
responding enantiomer produced. In this
study, values of angle less than 80° lead to S
enantiomer and those greater than 100°, to
the R isomer. Snapshots with values of 9
between 80° and 100° were disregarded,
since they correspond to the less stable
transition state configurations between the S
and R isomers.

Results and Discussion

(1) The Oxidation Reactions of N- and S-
Containing Compounds

(1) Origin of Observed Preference for Het-
eroatom Oxide Formation: S > N Guenger-
ich et al. proposed that heteroatom oxide is
formed via an initial electron transfer from
the substrate to the [FeO] moiety of P450,
followed by an oxygen transfer in the oppo-
site direction. \
R-X-R-+ [Fe=0]v--> [R-X-RT- + [Fe=0]lv
Compound I Compound II
R-X(O)-R' + [Fe]
Resting State
If this electron transfer step is rate limiting,
the energy of the cation radical, [R-X-R']-+,
relative to its parent compound could be a
modulator determining why heteroatom
oxide formation is more likely in S-containing
compounds than in N-containing com-
pounds. Therefore!, the ionization potential
of three similar compounds, methylamine,
methanethiol and nnethylphosphine was cal-
culated and the results are shown in Table 1,
page 40. Comparison among the three
compounds shows the trend of the IP: meth-
anethiol > methylphosphine > methy-
lamine. This contradicts the observation
that the preference for heteroatom oxide for-
mation: P-O > S-O > N-O. Therefore, we
concluded that the relative stability of the
cation radical cannot be a modulator deter-
mining the oxide formation. Our results also
do not support the oxide formation via an
electron transfer mechanism.
Another possible mechanism for the het-
eroatom oxide formation is a one-step direct
oxygen atom transfer without the formation
of the intermediate cation radical. Based on
this mechanism, a comparison of the
NESC Annual Report - FY1994
39
-------
Prediction of Oxidative Metabolites by Cytochrome P450s with Quantum Mechanics and Molejcular
Dynamics Simulations j
heteroatom oxide stabilities of the three
compounds was made in order to examine if
this is a possible modulator of the observed
preference of heteroatom oxide formation in
the order of P-O > S-O > N-O. As shown in
Table 1, our results show that the rank order
of the stability of X-O is parallel to the obser-
vation of the preference of formation. These
results demonstrate that the relative het-
eroatom oxide stability is a very good indica-
tor of the observed preference of S-
containing compounds for heteroatom oxide
formation compared to N-containing com-
pounds.
the three types of products Ca-hydroxides,
N-hydroxides, and N-oxides were; consid-
ered. Specifically, the formation of Ca-
hydroxides and N-hydroxides was assumed
to proceed via an initial hydrogen [atom
abstraction producing corresponding Ca rad-
ical and N radical as reactive intermedi-
ates. The formation of N-oxide wias
assumed to proceed via a one-step direct
oxygen transfer mechanism due to its suc-
cess in explaining the preference jof oxide
formation between N- and S-compounds.
The stabilities of the Ca-radical, the N-radi-
cal intermediates in the proposed:H-abstrac-
Table 1: Energetics of N- and S-compounds chosen for this study. Energy is in
kcal/mol relative to the parent compound and the triplet oxygen atom. Relatiye
stability of radical cations and products.
Substrate
CH3NH2
CH3SH
CH0PH2
IP '"
174.2
192.8
180.9
; [CHsXRj^o- ;;
&••*'* wrt X u.
215.1
233.7
221.8
\ * Ctt3X<0)R v
12.7
-2.5
-56.9
(2) Internal Competition in Formation of
Three Oxidation Products of Amine
To identify reliable determinants of product
distribution in cytochrome P450 mediated
oxidations in amines, plausible pathways to
tion mechanism and the N-oxide product
were hence calculated. As shown in Table 2
for the three amine substrates studied, the
CL-radical is less stable than both the N-rad-
•'a.
ical intermediate and the N-O product by a
Table 2: Energetics of N- and S-compounds chosen for this study. Energy is; in
kcal/mol relative to the parent compound and the triplet oxygen atom. Relative
stability of C and N-radical intermediates from H-abstraction mechanism and the
three possible oxidation products of amines. i
Substrate
CH3NH2
C'HaNHCVV
C6H5NHCH3
[CJ-fOH
15.5
15.9' 14.4"
16.9
INHPH-
12.3
10.2
9.5
[C«-OH3
-46.8
-45.8' -47.7"
-46.4
[N-OH]
-11.2
-10.3
-9.8
>[N:O]
12.7
6.1
12.9
The two possible sites of Ca are denoted with ' and".
40
NESC Annual Report - FY1994
-------
Prediction of Oxidative Metabolites by Cytochrome P450s with Quantum Mechanics and Molecular
: Dynamics Simulations
few kcal/mol, a result that is in contrast to
the observed preference of Ca-hydroxyla-
tion-dealkylation in amines. Therefore, the
relative stability of competing Ca- and N-rad-
ical intermediates is not a reliable determi-
nant of different product formation.
Comparison of the stability of the three
oxidation products also shown in Table 2,
page 40, gives the order Ca-hydroxide > N-
hydroxide > N-oxide in all three amines. The
formation of the Ca-hydroxides and the N-
hydroxides are both exothermic while the
production of N-oxide is endothermic. The
large energy difference of ~50 to ~60 kcal/
mol obtained between the N-oxide and the
Ca-hydroxide in the three amines suggests
that the formation of Ca-hydroxy amiine is
energetically much more favorable. This
result agrees with the observation that Ca-
hydroxylation-dealkylation is the predomi-
nant oxidation reaction in amines. Thus it
appears that the relative stabilities of the dif-
ferent products modulate the competition
among the various reactions. Among the two
heteroatom oxygenations, N-OH was found
to be ~35 kcal/mol more stable than N-O
product and would therefore be predicted to
be the next most abundant product.

(3) Internal Competition in Formation of
Two Oxidation Products of Thioethers
To identify reliable modulators of cyto-
chrome P450 mediated oxidations of sulfur-
containing compounds, it was assumed that
Ca-hydroxylation proceeds via H-abstraction
and sulfoxide formation via direct oxygen
transfer. The stability of the Ca-radical and
OH radical intermediates for each of the three
substrates was calculated as shown in Table 3
and compared with the stability of sulfoxide.
In contrast to the results for the N-containing
compounds, for all three thioethers the sulfox-
ide is more stable than the intermediates Ca-
radical and OH radical by -20 to -30 kcal/mol.
Thus, if the formation of the Ca-radical and
OH radical is the rate limiting step in Ca-
hydroxylation for thioethers, more S-oxide
than C^-hydroxide product is predicted which
is consistent with observations.
To determine the uniqueness of this infer-
ence, a second comparison was made of the
product stabilities as shown in Table 3. For all
three S-containing parent compounds, the Ca-
hydroxylation product was found to be ener-
getically more stable than the sulfoxide by ~25
to ~38 kcal/mol. Thus if the relative product
stability is used as a criterion for predicting
product distribution, it leads to the prediction
that the Ca-hydroxide is the predominant
product in all three thioethers, a result which
contradicts known experimental works of a
few selected thioethers. Therefore, it can be
concluded that the relative stability between
the radical intermediates and the sulfoxide,
rather than the final oxidation products, is an
important factor of product distribution of
thioethers.
Table 3: Energetics of N- and S-compounds chosen for this study. Energy is in kcal/mol relative
to the parent compound and the triplet oxygen atom. Relative stability of C-radical intermediate
from H-abstraction and the two possible oxidation products of thioethers.
[ Substrate
CHgSH2
C'^SC'^HS^
CgHsSCHs
[CJ^OH
18.6
18.8' 15.8"
18.4
[Ca-OHJ ,
-40.2
-37.8' -43.3"
-38.4
[S-0]
-2.5
-12.0
-11.5
The two possible sites of Ca are denoted with ' and".
NESC Annual Report - FY1994
41
-------
Prediction of Oxidative Metabolites by Cytochrome P450s with Quantum Mechanics and Molepular
Dynamics Simulations ;
Stereoselective Sulfoxidation of
Thioanisole and p-methyl Thioanisole
The results of each individual 125 ps
molecular dynamics simulations for thioani-
sole and p-methylthioanisole using the two
geometric determinants, (r<4.oA for reactiv-
ity and 9 for sterochemistry) are summarized
in Tables 4 and 5, page 43, respectively.
The overall results yield enantiomeric ratios
of sulfoxide products R:S 65:35 for thioani-
sole and 22:78 (R:S) for p-methylthioani-
sole. Results from two different cutoff S-O
distances, 3.5 and 4.5 A, were also ana-
lyzed to test the sensitivity of the results to
the cutoff criterion. For thioanisole, 69:31
(R:S) and 61:39 (R:S) ratios were found for
the 3.5 and 4.5 A criteria, both very'similar to
the 65:35 (R:S) value for the 4.0 A. For p-
methylthioanisole, the values of 23:77 (R:S)
and 24:76 (R:S), respectively, were[also very
similar to that found for the 4.0 A cutoff. The
experimental results, also shown inJTable 4,
give 70:30 (R:S) for thioanisole and 48:52
(R:S) for p-methyl thioanisole. !
\
The most interesting results obtained from
both experiment and theory is the modulation
of stereoselectivity from R to S by the thioani-
sole p-methyl substituent. In order to under-
stand the role of the para substituent in
modulating this selectivity, a detailed analysis
of substrate interactions with the binding site
residues and the heme prosthetic group is
required. Our analysis from the trajectories
Table 4: Summary of MD results for each 125 ps trajectory of thioanisole and p-
methylthioanisole and the comparison between theoretical and experimental
enantiomeric ratios - thioanisole.
Run
01 V1
O1 V2
O2V1
O2V2
O3V1
O3V2
O4V1
O4V2
O5V1
O5V2
O6V1
O6V2
Total
Conf
R
R
S
S
S
S
R
R
S
S
R
R

Experimental
Total
318
443
394
364
322
419
398
356
11
11
18
55
3069

0-80°(S}
141
179
99
134
144
85
100
132
10
6
3
21
1035

^80:1QQ° '
6
11
13
16
6
12
15
12
0
4
10
23
114

UQ0rt8Q0(R}l
171
253
282
214
172
322
283
212
1
1
5
11
1920

'Ratid{S;B>;
f ** "*_ <...'...*..>. ,..$.
i 45:55
i
41:59
26:74
39:61
; 46:54
21:79
26:74
38:62
91:1
86:14
37:63
66:34
35:65
28:72
42
NESC Annual Report - FY1994
-------
Prediction of Oxidative Metabolites by Cytochrome P450s with Quantum Mechanics and Molecular
Dynamics Simulations

Table 5: Summary of MD results for each 125 ps trajectory of thioanisole and p-
methylthioanisole and the comparison between theoretical and experimental
enantiomeric ratios - p-methylthioanisole.
Run
O1 V1
O1 V2
O2V1
O2V2
O3V1
O3V2
O4V1
O4V2
O5V1
O5V2
O6V1
O6V2
Total
Conf
R
R
S
S
S
S
R
R
S
S
R
R

Experimental
Totaf
515
451
562
548
535
558
561
557
396
70
438
285
5478

0-80°(S)
247
437
549
334
383
534
514
483
250
11
335
13
4090

80-100°
3
12
8
6
3
3
8
5
49
18
23
85
223

100-18Q°(R)
; 265
2
; 5
'• 208
149
21
39
! 69
97
41
80
189
1165

Ratto(S:R) .
48:52
100:0
99:1
62:38
72:28
96:4
93:7
88:12
72:28
21:79
81:19
6:94
78:22
52:48
shows the stronger lipophilic interaction of p-
CH3 with Phe-87 and Tyr-96 causes the
substrate to stay relatively rigid with a spe-
cific orientation. In this orientation, the steric
effect of residues Val-295, lle-395, and Val-
396 forces the S-CH3 side chain to move
inward toward the empty space of the active
site pocket and results in more S(-) enanti-
omer.

On the contrary, the para proton of thioani-
sole has less interaction with the surround-
ing residues. The only significant exception
is the electrostatic interaction of the para
proton with Asp-297. This smaller interaction
with the residues causes the para proton of
thioanisole to have more mobility than the p-
CH3 of p-methylthioanisole. Although it is
relatively mobile in the binding site, we
found a higher probability from the trajecto-
ries for thioanisole to stay in the orientation
in preference of R(t) sulfoxide formation
because the steric effect of residues Leu-
244 and Thr-101 forces the S-CH3 group
into that configuration.
The predicted value for the p-methylthio-
anisole sulfoxide enantiomer ratio shifts in
the same direction as the observed value
although the two values differ in absolute
terms. Thioanisole favors R(+) sulfoxide for-
mation (R:S = 65:35, theoretical, and 72:28,
experimental) whereas p-methylthioanisole
favors S(-) sulfoxide formation (R:S = 22:78,
theoretical, and 48:52, experimental). The
theoretical treatment thus predicts, in agree-
ment with experiment, that introduction of a
p-methyl substituent leads to an inversion of
NESC Annual Report - FY1994
43
-------
Prediction of Oxidative Metabolites by Cytochrome P450s with Quantum Mechanics and Molecular
Dynamics Simulations |
the preferred absolute stereochemistry of
the sulfoxide product. The experimental and
theoretical results thus indicate that the p-
methyl substituent is an important modulator
of the stereoselectivity. Lipophilic and elec-
trostatic interactions between the p-methyl
substituent and residues of the active site
appear to determine the favored orientation
of the thioanisole framework. This orienta-
tion, in conjunction with the orientation of the
S-CH3 group, determines whether the R or
S enantiomer is produced.

Scientific Accomplishments and
their Relevance to EPA's Mission
The electronic modulators of the compet-
ing hydroxylation and heteroatom oxidation
reactions in N- and S-containing compounds
were identified in this study with ab initio
quantum mechanics. The results obtained
can be helpful for predicting the preferred
oxidation products and assessing their toxic
potency for other N- and S-containing com-
pounds which are so heavily used as drugs,
insecticides or other industrial materials.
In the study of the steroselective sulfoxi-
dation of thioanisole and analogue by
P450cam, we demonstrated that the product
distribution can be predicted from studying
the enzyme-substrate direct interaction
using molecular dynamics simulations. Fur-
thermore, given the fact that xenobiotics-
macromolecule or metabolite-macromole-
cule interaction is one of the important pro-
cesses for triggering toxicity, the experience
obtained here can be useful for further appli-
cation to these important events.

Future Objectives, Goals, and Plans
The mechanistic study of P450-metabo-
lized oxidation reactions for a few groups of
xenobiotics will be continued. We also plan
to begin the study of adduct formations
between xenobiotics and macromolecules
which is a very important step in the overall
toxic process. Specifically, interaction of
some relevant xenobiotics with the heme
unit of the P450 enzyme and the nucleic
acids will be studied. ;
i
Acknowledgment: j
The support from the Environmental Pro-
tection Agency Grant #CR-818677-01-0 is
gratefully acknowledged.

Relevant publications and reports
1 D. L. Camper, G. H. Loew, and J. R. Col-
lins, Steric and Electronic Criteria for
Teratogenicity of Short Chain Aliphatic
Acids, Int. J. Quant. Chem. Q$S 17,173
(1990). ;
2 J. R. Collins, D. L. Camper, and G. H.
Loew, Valproic Acid Metabolism by Cyto-
chrome P450: A Theoretical Study of
Steroelectronic Modulators of'Product
Distribution, J. Am. Chem. Soc. 113,
2736(1991). ;
3 Y. T. Chang and G. H. Loew, Binding of
Flexible Ligands to Proteins: Valproic
Acid and Its Interaction with Cytochrome
P450cam, Int. J. Quant. Chem. QBS 20,
161 (1993). |
4 P. R. Ortiz de Montellano, J. Fruetel, J. R.
Collins, D. Camper, and G. H.iLoew,
Theoretical and Experimental^Analysis
of the Absolute Stereochemistry of cis-b-
Methylstyrene Epoxidation byiCyto-
chrome P450cam, J. Am. Chem. Soc.
113,3195(1991). |
5 P. R. Ortiz de Montellano, J. Friiietel, J. R.
Collins, D. Camper, and G. HJLoew cal-
culated and experimental absolute ste-
reo chemistry of the styrene and b-
Methylstyrene Epoxides formed by cyto-
chrome p450 cam, J. Am. Chem. Soc.
114,6987(1992). j
6 The work presented here was published
in Int. J. Quant. Chem. QCS 27, 815
(1993). G. H. Loew and Y. T. Chang,
Theoretical Studies of the Oxidation of
N- and S-Containing Compounds by
Cytochrome P450. \
1 Gilda H. Loew and Yan-Tyng Chang, Molecular Research Institute, 845 Page Mill Road, Palo Alto, CA 94304.
44
NESC Annual Report - FY1994
-------
High Performance Computing For Environmental
Research1-2
Note: Certain sections of this article were
presented in an invited talk at the "Design for
the Environment" Symposium, American
Chemical Society Fall Meeting, August
1994, Washington, DC.

The National Environmental
Supercomputing Center
The U.S. Environmental Protection
Agency (EPA) established the National Envi-
ronmental Supercomputing Center (NESC)
in October 1992 to provide a high perfor-
mance computing facility for environmental
research. This Supercomputing facility is
located in the middle of the Great Lakes
region of the United States, along the Sagi-
naw River in Bay City, Michigan. The NESC
is the only Supercomputing center in the
world dedicated solely to the support of envi-
ronmental research. As the central comput-
ing facility for EPA scientific needs, the
NESC has become a focal point for collabo-
rative efforts in environmental research
within a short period of time.
The NESC has more than 45 environmen-
tal research projects and has been support-
ing more than 300 users throughout the
United States. These projects are related to
disciplines such as: environmental computa-
tional chemistry, air quality modeling, global
climate change, and the modeling of
streams, lakes, and estuary systems. These
projects have consumed the majority of the
NESC resources. This paper will describe
the efforts of the NESC to provide empower-
ing services to EPA scientists while fostering
collaboration across disciplines, across
organizations, and across geographical
locations. Specifically highlighted will be the
efforts in environmental computational
chemistry.
Hardware and Sloflware Resources
The NESC began its operation with one of
the most powerful supercomputers that was
available at that time, a Cray Research Y-
MP 8i/232. For data storage, the NESC had
two StorageTek 4400, Automated Cartridge
Systems (Silos) with a total capacity of 2.4
terabytes of data. Since the NESC serves
users across the United States, the center is
well-connected through EPA's own dedi-
cated high-speed telecommunication net-
works and the Internet.
Since the dedicsitiori of the NESC's super-
computer (Sequoia) in October 1992, it has
been running at or greater than 95% of its
capacity. After the installation of the Cray Y-
MP, more and more environmental research-
ers have discovered the potential value of
using Sequoia, and the NESC computing
resources have become increasingly in
demand. The NESC's user community and
its computing requirements grew rapidly dur-
ing the first six months of its operation. In
the fall of 1993, in order to accommodate
this growing demand, the NESC upgraded
its original two-processor Y-MP 2/32 to a
three-processor Cray Y-MP 3/64. To meet
the requirements of its growing customer
community, the NESC replaced its original
Cray Y-MP with a state-of-the-art Cray C94
2/64 in May 1994. In the late October 1994
an additional (third) processor was added to
the C94. These latest hardware upgrades
are expected to provide increased through-
put and faster job processing turnaround for
the NESC users. In addition to the Cray
mainframe, the NESC also uses Silicon
Graphics and Data General distributed serv-
ers to satisfy graphics and other UNIX work-
station requirements.
NESC Annual Report - FY1994
45
-------
High Performance Computing For Environmental Research
The NESC also features several public
domain and third-party application software
packages, databases, mathematical, and
statistical library routines. Most of the avail-
able computational chemistry software pack-
ages are also installed and supported at the
NESC. Details about the chemistry user
resources are covered in a later section of
this article. Since visualization is an impor-
tant part of supercomputing, the NESC
maintains and supports several state-of-the-
art hardware and many popular graphics
software packages.
For more detailed information about the
NESC's hardware and software, please refer
to "Message from the NESC Director" on
page 3.

The NESC User Community and Their
Scientific Projects
The NESC's users, scattered throughout
the United States, include EPA scientists,
engineers, and those agencies, universities,
or companies having grants, co-operative
agreements, memoranda of understanding,
or contracts with EPA. The NESC's
resources are allocated annually, based on
peer-review of proposals submitted by indi-
viduals satisfying any of the above require-
ments. Trial accounts are also available,
upon request, for new users. The current
NESC projects, which support Congres-
sional mandates, are the Clean Air Act, the
Clean Water Act, and the Superfund
Cleanup initiatives. There are also projects
supporting local environmental initiatives
such as the Great Lakes, the Chesapeake
Bay, the Green Bay/ Fox River, and the San
Joaquin Valley projects.

Environmental Computational
Chemistry at the NESC
In an attempt to provide a user-friendly
supercomputing environment at the NESC,
the authors of this article have implemented
many innovative approaches. One such
effort provided remote chemistry users with
access to a molecular modeling software
package, which was installed on a powerful
graphics workstation at the NESG. Depend-
ing on the capacity of the network, most of
these remote users experienced excellent to
adequate throughput and response. Some
of the NESC's users at EPA's Headquarters
in Washington, DC are successfully access-
ing this software package on their personal
computers (PCs), using X-WindoW System
emulation software. Plans exist to interface
all the supercomputer chemistry software
packages with at least one user-friendly
graphics workstation molecular modeling
software to improve productivity, i
i
The NESC's chemistry user community is
relatively large, very active, and they con-
sume a major share of its hardware and soft-
ware resources. Some of the major projects
in the area of Environmental Computational
Chemistry include: The Mechanisms of
Chemical Toxicity (U.S. EPA, Health Effects
Laboratory, RTP, NC), Ab Initio Calculation
of the Electronic Spectra of Polycyclic Aro-
matic Hydrocarbons (U.S. EPA, EMSL-Las
Vegas, NV), Database of Moleculjar Elec-
tronic Structures for Environmental Risk
Assessment and Toxicology (U.Sj EPA ERL-
Duluth, MN), and Exploring Toxic IMecha-
nisms for the Design of Safe Chernicals
(U.S. EPA OPPTS-Washington, DC).
To meet the various requirements of the
chemistry users, the NESC has an extensive
list of computational chemistry software
packages such as: AMBER, AMSOL,
CHARMm, DISCOVER, DMol, Gaussian 927
DFT, and MOPAC. The most frequently
used software package from the above list is
Gaussian 92/DFT, distributed by paussian,
Inc. The group from EMSL-Las Vegas,
heavily uses this suite of programs to cali-
brate and predict molecular structure and
electronic spectra of polycyclic aromatic
hydrocarbon molecules, which arb one of
the major health hazards and environmental
contaminants.
46
NESC Annual Repbrt - FY1994
-------
High Performance Computing For Environmental Research
NESC Training and Workshops
The NESC also provides extensive train-
ing for its users. These training sessions,
ranging from Introductory UNIX to Advanced
FORTRAN and C Code Optimization, are
conducted throughout the year. The authors
of this article coordinated the first major
event, the Computational Chemistry Work-
shop, held at the NESC in September 1993.
The United States Environmental Protection
Agency's (EPA) National Data Processing
Division (NDPD) and the Environmental
Monitoring Systems Laboratory (EMSL) -
Las Vegas jointly sponsored that event. The
main objective of that workshop was to sur-
vey the scientific objectives and achieve-
ments of EPA in the field of computational
chemistry. Topics of discussion included
some of the major work being done in the
field of environmental computational chem-
istry using the NESC's supercomputing
resources.
The Computational Chemistry Workshop
was a unique opportunity to hear in-depth
details about the new and exciting scientific
discipline, Environmental Computational
Chemistry. Approximately 60 people, repre-
senting various EPA-related institutions and
others, attended the three-day workshop.
Prominent scientists from regional institu-
tions, such as: Dow Chemical, Michigan
Molecular Institute, the University of Michi-
gan, Saginaw Valley State University, and
Wayne State University, participated in the
scientific presentations and discussions.
Other speakers included scientists from
national and international universities spe-
cializing in the field of computational chemis-
try. Experts in areas such as Quantitative
Structure Activity Relationship (QSAR) for
chemical exposure and risk assessment,
Computational Analytical Chemistry, Water
and Atmospheric Chemistry Modeling, Tox-
icity Prediction, and Database Design, led
the scientific presentations and subsequent
discussions. Of particular interest was the
fact that, for the first time, scientists from the
regulatory and the research wings of EPA
met and exchanged ideas and discussed
issues of mutual interests. Thus, the work-
shop at the NESC served as a melting pot
for the Agency's long-term research and
regulatory objectives in the field of environ-
mental computational chemistry.
In April 1994 the NESC conducted a work-
shop titled "Watershed, Estuarine and Large
Lakes Modeling". In addition to these work-
shops, the NESC was also responsible for
the International Environmental Visualization
Workshop successfully held for the past two
years in Cleveland. Workshops conducted
at the NESC such as the Computational
Chemistry and the modeling of Watershed,
Estuarine and Large Lakes have helped pro-
vide focus to widely distributed environmen-
tal research teams.
In September 1994, the NESC organized
a very successful Gaussian 92/DFT work-
shop. NESC users from EPA, EPA contrac-
tors, and scientists from other government
agencies, industries and universities partici-
pated in the workshop. Of the 28 partici-
pants registered, 25 attended the entire
three days of the workshop. Logistic sup-
port was provided by the Martin Marietta
staff at the NESC: Management, Systems,
Computational Science Services, Visualiza-
tion, Documentation, and Facilities groups.
Overall, the workshop was conducted
smoothly. Some of the suggestions for
improvement provided by the participants
were:
• Split the workshop into Gaussian I and II.
• Some fundamentals are needed, scien-
tists like J. A. Pople should be invited.
• Is it possible for the workshop to focus
on a subset of applications?
• A bit more hands-on sessions, lectures
are too long/complex.
• Overall an excellent workshop. Very
useful and informative.
• Hands-on session could be more struc-
tured.
• This course would be incomprehensible
to a real novice. Also, starting with the
INPUT and OUTPUT is difficult if student
is a novice.
NESC Annual Report - FY1994
47
-------
High Performance Computing For Environmental Research
• More supervision/instruction in hands-on
sessions, send some of the reference
materials out in advance so that partici-
pants can prepare before arriving.
• Q & A sessions and hands-on session
not organized well. Parts of presentai-
tions tended to drag on.
• I guess I would have enjoyed more the-
ory. However, the course is intended to
be more introductory.
• Skip AVS and construct Z-matrix by
hand. Have more help available from
experts in hands-on session. I spent
75% of my time waiting for help.
• Dr. Schlagel was a very interesting
speaker and I enjoyed talking to him very
much.
• Another workshop held. Pre-prepared
exercises in a step-wise fashion. Practi-
cal recipes for hands-on session.
• Well organized and a very pleasant
experience. Hands-on sessions were
particularly interesting. Keep up the
good work!

The Future
The future of high performance computing
and general automated information process-
ing for environmental research appears to
be extremely promising. High performance
computing is a well-known paradigm in envi-
ronmental research. As a direct conse-
quence of this wonderful tool, we are
beginning to experience an information
explosion in this field.
However, what is really lacking in this field
are concerted tools for information manage-
ment. Too many databases are scattered all
over the world addressing certain limited
information. EPA, National Science Founda-
tion, Department of Energy, or any other
environmentally related scientific organiza-
tions will have to come forward in order to
establish a comprehensive plan and initia-
tive in bridging together all this scattered
information. An organization such as the
National Library of Medicine (for the health-
related information processing) needs to
evolve for the environmental information
management as well. It is also surprising to
note that a vast and interdisciplinary science
such as environmental science does not
have its own periodic abstracting jservice
similar to the Chemical Abstracts.; Initiation
of an Environmental Abstract service would
be the first step in the right direction. The
authors strongly believe that the near future
will witness the realization of such new
initiatives. I
i

Conclusion ;
Using computers in environmental chem-
istry research adds a new dimension to the
traditional experimental approach. For
example, it is difficult to comprehend the
nature of ozone depletion or acid rain based
on certain rare and costly experimental data
points. By incorporating computers and
graphic visualization into this research, a
whole new world of graphical and, three-
dimensional images emerges. This not only
improves comprehension but also saves
time and money while permitting more
aggressive and innovative scientific
research. j
i

Acknowledgments
K. Namboodiri wants to express his sin-
cere appreciation to Ben Bryan, Manager,
NESC, Martin Marietta Technical Services,
Inc., and the staff of the NESC, for their
excellent support. Also, special thanks are
due to Kerslin Felske for carefully proofing
this manuscript.
i
Disclaimer
The U.S. EPA does not endors'e or rec-
ommend any of the commercial products or
trade names mentioned in this manuscript.
1 Krishnan Namboodiri, Martin Marietta Technical Services, Inc. (MMTSI), National Environmental
(NESC), 135 Washington Ave., Bay City, Ml 48708.
2 Walter M. Shackelford, Director of Scientific Computing, U.S. EPA, National Data Processing Division
27711.
Supercomputing Center

(NDPD), RTP, NC
48
NESC Annual Report - FY1994
-------
U.S. EPA Workshop on Watershed, Estuarine, and Large
Lakes Modeling (WELLR/i)^
Overview
The U.S. Environmental Protection
Agency's (EPA) workshop on Watershed,
Estuarine, and Large Lakes Modeling
(WELLM) was the second major scientific
meeting to be held at the National Environ-
mental Supercomputing Center (NESC)
since its dedication in October 1992. This
workshop followed the highly successful
Computational Chemistry workshop held
September 27-29,1993. These workshops,
organized by Computational Science Ser-
vices at the NESC, were designed to further
the interaction between modeling, policy,
and management staff within EPA-spon-
sored programs. In particular, the work-
shops focused on the relevance of the
NESC's resources to the future of such pro-
grams and they displayed the resources and
services that EPA's National Data Process-
ing Division (NDPD) provides for compute-
intensive environmental modeling.
The mission of the WELLM workshop was
to survey the scientific objectives and
achievements of EPA in modeling water
quality, hydrology, hydrodynamics, sedi-
ment transport and deposition, ground water
pollutant transport, ecosystem based risk
analysis, and policy analysis. Additional
subjects included the multimedia aspects of
water quality modeling, toolkit availability,
and EPA support levels for models.
The WELLM workshop was jointly spon-
sored by EPA's NDPD, and the following
EPA organizations: The Chesapeake Bay
Program Office (Region 3), Annapolis, MD;
The Great Lakes National Program Office,
Chicago, IL; ERL-Athens, GA; ERL-Duluth/
Large Lakes Research Station, Grosse He,
Ml; and EPA's National Environmental
Supercomputing Center (NESC), Bay City,
Ml. The Program Committee of five senior
EPA staff was selected on the basis of their
dedication to the importance of modeling
within EPA and their commitment to the
importance of the NESC as a resource for
EPA's modeling community. The program
committee consisted of: Robert Carsel,
EPA, ERL-Athens, GA; Lewis Linker, EPA,
Chesapeake Bay Program Office, Annapo-
lis, MD; Pranas Pranckevicius, EPA, Great
Lakes National Program Office, Chicago, IL;
William Richardson, EPA, Large Lakes
Research Station, Grosse lie, Ml; and, Tho-
mas Short, EPA, FiS. Kerr ERL, Ada, OK.
The work of the Program Committee and
Local Organizing Committee was greatly
facilitated by the agreement of Professor
Robert V. Thomann to act as the workshop
moderator. Professor Thomann, of the Envi-
ronmental Engineering Department at
Mahattan College, New York, is a world
leader in the field of water quality issues and
related subjects and proposed the Guiding
Principle for the workshop and prepared the
executive summary.
Thirty-two participants attended the
WELLM workshop and presented sixteen
invited and six contributed papers of high
quality and relevance to EPA interests. The
speakers included representatives from EPA
sites, other Federal agencies such as the
National Oceanographic and Atmospheric
Agency, and the U.S. Army Corps of Engi-
neers, state agencies, environmental con-
sulting companies, Martin Marietta Technical
Services, Inc. (MMTSI), and Universities
(including EPA grantees). The invited pre-
sentations were grouped around themes in
separate (numbered) sessions and each
session had associated with it an extended
discussion session of similar length.
NESC Annual Report - FY1994
49
-------
U.S. EPA Workshop on Watershed, Estuarine, and Large Lakes Modeling (WELLM)
Corresponding to each theme, Professor
Thomann posed a series of key questions to
be addressed in the group discussion ses-
sions. Responses in the group discussion
sessions were coordinated by discussion
leaders and communicated to the workshop
participants in a Workshop Recommenda-
tions and Conclusions session. The final
session posed for discussion the theme of
"Workshop Follow-up and Implementation",
and the result was a group decision to pre-
pare the executive summary, shown below,
for presentation to EPA.
The broad spectrum of participants and
corresponding areas of expertise made for a
productive and exciting workshop. The
results of the workshop evaluation included
in Table 1, page 56, and Figures 5, page 55,
and 6, page 56, confirmed a high degree of
satisfaction on the part of the attendees.

Executive Summary

Introduction
On April 18-20,1994, a group of scien-
tists and engineers experienced in environ-
mental modeling' [see NOTES at end of the
Executive Summary] came together at the
U.S. EPA National Environmental Super-
computing Center (NESC) to conduct an
"audit" of the current state of large scale
interactive environmental models. This
examination of the record of environmental
modeling revealed important insights into
the role of predictive frameworks in advanc-
ing the understanding of environmental
behavior and, most importantly, in assisting
environmental decision making.
More specifically, as the "audit" continued
during the Workshop, it became apparent
that large scale environmental modeling had
much to offer to national environmental deci-
sion making and policy. The assessment of
modeling large watersheds, estuarine, and
Great Lakes systems indicated that while
there were important scientific questions
and issues, the state of the art of modeling
had progressed sufficiently to be of signifi-
cant value in assisting upper level [managers
and directors in developing policy and in
providing support for environmental
decisions.
This Executive Summary is intended to
provide an overview of the important results
of the Workshop with specific emphasis on
the role of modeling in decision making and
policy. |

Whv this Workshop? j
The motivation and rationale for this
Workshop is an outgrowth of several obser-
vations: |
• The questions related to environmental
decision making have grown i'n complex-
ity related to environmental issues and
the extent of spatial and tempbral scale2.
• The economic and policy consequences
of making a "right" or "wrong" decision
have increased substantially.'
• The need for a high level of enforcement
credibility and scientific defen&bility is
absolutely essential. j
• As a result of these pressures, model-
ing of atmospheric, aquatic and terres-
trial environmental systems in recent
years has increased rapidly in geo-
graphic scope and degree of complexity
(i.e., incorporation of increasing physi-
cal, chemical and biological pro-
cesses). The complexity has been
further expanded by another trend
toward the linkage of these systems to
quantify the cross-media impacts of
pollutants. ;
The principal results of the Workshop
Process3 included a series of key j recom-
mendations supporting conclusions and
suggestions for implementation. '

Recommendations ;
1. Because of the increasing spatial, tem-
poral, and multi-media scope of contem-
porary environmental questions, the
U.S. EPA should provide the leadership
50
NESC Annual Report - FY1994
-------
U.S. EPA WorKshop on Watershed, Estuarine, and Large Lakes Modeling (WELLM)
among Federal agencies to develop,
plan and build a National Environmental
Decision Support System (NEDSS).
Such a system would include multi-
media models, telecommunications,
visualization and Geographical Informa-
tion System (GIS) tools integrated into a
unit that would assist and support the
decision making process.
NEDSS would provide:
• Quantitative evaluations of the environ-
mental response to alternative control
strategies.
• Synthesis of monitoring data and model-
ing to provide assessment of environ-
mental quality trends.
• Reduction of the uncertainty of policy
results.
2. The basic modeling structure underlying
the NEDSS should be a synthesized,
linked, and coupled interaction of air-
shed, watershed, and coastal models as
illustrated in Figure 1, page 52.
3. In order to effectively utilize confirmed
and accepted environmental models, the
maintenance and upgrading of such
models needs to be institutionalized
within the proposed NEDDS.
4. The National Environmental Supercom-
puting Center (NESC) has provided the
modeling community with a much
needed, dedicated, and valuable
resource for environmental modeling
research and application. Support for
this resource facility should be continued
and computational resources should
continue to expand to match growth in
demand and model complexity. The
NESC can also assist in the demonstra-
tion of model utility to decision makers
through practical applications to impor-
tant issues confronting EPA and other
resource agencies.
5. Associated facilities such as the visual-
ization laboratory and telecommunica-
tions and teleconferencing capabilities
have shown that such technologies can
aid in comprehending complex environ-
mental systems and should continue to
be supported and expanded to fill latent
demand.

Supporting Conclusions
• The increased costs of environmental
problems, the complexity of multi-media
interactions, and the need to improve
scientific credibility of environmental reg-
ulations mandates the intelligent use of
environmental models.
• Models of environmental systems at
EPA and other Federal and state agen-
cies have been neither used nor sup-
ported by management at an adequate
level because of:
a) an incomplete understanding of model
utility and,
b) the necessity for concerted and
focused efforts by scientists, engineers,
decision makers, and the general public.
• Models are essential to decision making,
yet excessive reliance on models that
have not been confirmed in a manner
acceptable to the community can lead to
erroneous results.
• The "track record" of the successful
management and scientific utility of mod-
els is now extensive and can be
documented.
• Early interactions v/ith managers and
decision makers is crucial to successful
predictive models. Such interactions
would facilitate properly structured mod-
els which are consistent with policy and
problem objectives. Early interactive
relationships determine model complex-
ity that is consisitent with policy goals.
• Model confirmation with observed data is
essential in all cases except for analyses
of a "screening" nature where direct
comparisons of model output to
observed data are not possible.
• Models are the essential and important
component for interpolating and
NESC Annual Report - FY1994
51
-------
U.S. EPA Workshop on Watershed, Estuarine, and Large Lakes Modeling (WELLM)
NATIONAL ENVIRONMENTAL

DECISION SUPPORT SYSTEM (NEDSS)
Rgurel: NEDSS
extrapolating observational data, thereby
expanding in a cost-effective manner the
ability to estimate and understand areas
that were not directly measured.
• The credibility of model diagnoses and
predictions are synonymous with contin-
ual "post auditing" (i.e., monitoring of the
efficacy of purported results to environ-
mental actions or lack of action; "How
well is the prediction actually matched by
subsequent observations?")
• Models (in the fully integrated sense of
field observations and mathematical
descriptions) have often pointed the v/ay
for the development of future policies
and decisions, especially with regard to
the effectiveness of implementing vari-
ous levels of environmental control
strategies.
• As problem contexts increase in geo-
graphical scale and process complexity,
the divergence between model develop-
ment and available field data is increas-
ing. The depth of monitoring and field
validation is lagging model computa-
tional ability by an increasing amount.
• For large scale models that are linked or
coupled (e.g., atmospheric deposition
and watershed models), the data
requirements are unique and must
reflect a common observational strategy
and a resulting common data base.
• Individual models exist at a variety of
locations (e.g., watershed models in
Chesapeake Bay, air quality models at
RTP) but there is no central facility in
EPA for development, maintenance and
upgrading of environmental models.

Implementation Strategy
• Senior EPA policy makers should
receive a briefing report of these recom-
mendations reached at the WELLM
Workshop. !
• The first initiative of NEDSS should be to
focus on the Eastern United States with
specific emphasis on the Great Lakes,
Chesapeake Bay, and additional coastal
52
NESC Annual Report - FY1994
-------
U.S. EPA Workshop on Watershed, Estuarine, and Large Lakes Modeling (WELLM)
regions and associated airsheds and
watersheds4.
• The various subsequent phases for con-
structing the complete NEDSS should be
undertaken by EPA in cooperation with
other agencies.
• Implementation should recognize the
reorganization of the ORD and issues of
how EPA program offices relate to one
another and how they interact on these
issues.
• Representatives of the modeling com-
munity as illustrated by those attending
the WELLM workshop should provide
input into the implementation process
through examples of modeling needs,
successful case studies and the impor-
tance of the larger geographical per-
spective on regional and local
environmental quality.

Notes
Note 1: Models of environmental systems
are integrated and synthesized frame-
works of observations from the laboratory
and field, coupled with quantitative repre-
sentations of environmental processes.
Thus, models provide credible methods for
predicting the results of environmental
controls or policies.
Note 2: An example of this increasing scale
is given by the Chesapeake Bay experi-
ence where the initial focus was on the
Bay proper. Subsequent analyses indi-
cated the significance of developing pre-
dictive assessments of controls on the
associated watershed and airshed of the
Bay together with the nearby coastal
region interactive with the Bay as illus-
trated in Figure 2, page 54.
The results of this expanding scale of
impact on the Bay is illustrated in Figure 3,
page 54, which shows the estimated
improvement in main Bay deep wafer oxy-
gen as calculated by three linked models:
an air quality model of the airshed, a
Chesapeake Bay watershed model and a
hydrodynamic/water quality model of the
Bay. The Program Goal of 40% controlla-
ble nutrient reduction is improved signifi-
cantly when compared to the Limit of
Technology by inclusion of nutrient reduc-
tions from the Clean Air Act and the full
scale Basin-wide application of controls.
Note 3: The Workshop was organized
around five themes:
• Environmental Modeling: Understanding
and Prediction;
• Trade-offs Between Complexity and
Utility.
• Modeling Cross-Media Linkages and
Coupling.
• Model Computation: Current and Future.
• Model Support Systems and
Visualization. :
Analysis of each theme was begun by sev-
eral presentations to set the stage for sub-
sequent discussion of each theme. The
discussions were centered around a series
of focus questions. A final session sum-
marized the key recommendations and
supporting conclusions and implementa-
tion suggestions.
Note 4: The Great Lakes and Chesapeake
Bay systems in particular have been
shown to be impacted by atmospheric
sources far distant from location of impact
as shown in Figure 4, page 55. As such,
the decision making process requires an
integrated framework to credibly evaluate
control strategies to meet environmental
objectives for these water bodies and their
associated watersheds.
I
Workshop Feedback
The results of workshop questionnaires are
shown in Figure 5, page 55, Figure 6,
page 56, and Table 1, page 56.
1 George Delic, WELLM Workshop Program Coordinator, Martin Marietta Technical Services, Inc., National
Environmental Supercomputing Center, 135 Washington Avenue, Bay City, M 48708I.
2 Robert V. Thomann, Workshop Moderator, Manhattan College, Mew York 10471. ,
NESC Annual Report - FY1994
53
-------
U.S. EPA Workshop on Watershed, Estuarine, and Large Lakes Modeling (WELLM)
CHESAPEAKE BAY
Figure 2: Chesapeake Bay Watershed Model Expansion
CHESAPEAKE BAY MODEL ESTIMATES OF IMPROVEMENT IN
MAIN BAY DEEP WATER OXYGEN DUE TO NUTRIENT REMOVAL
PROGRAMS

% IMPROVEMENT FROM BASE CASE
no
40
30
20
10
•
•
•

SSOLVED OXYGEN L£
mg

"SSTHX
a.

N1
32

i
PROGRAM GOAL GOALtCAA+ALL BASIN
GOAL+CLEAN MR AMEND LIMIT OF TECHNOLOGY
SCENARIO

Figure 3: Estimated Improvement in Main Bay Deep Water Oxygen
54
NESC Annual Report - FY1994
-------
U.S. EPA Workshop on Watershed, Estuarine, and Large Lakes Modeling (WELLM)
GREAT LAKES
DOMAIf
SOURCE CONTROL
FOR GREAT LAKES
IMPACTS
CHESAPEAKE BAY VIA
AIRSHED - COASTAL
OCEAN
INTERACTIONS
CHESAPEAKE
BAY DOMAIN
Figure 4
NESC WELLM Workshop, 18-20 April, 19941
Mean of Means =3.47
StdDev of Means = 0.31
Question and area
Figure 5: Mean score with N=19 respondents for each of sixteen questions
NESC Annual Report - FY1994
55
-------
U.S. EPA Workshop on Watershed, Estuarine, and Large Lakes Modeling (WELLM)
NESC WELLM Workshop, 18-20 April, 1994
Mean of Std Dev= 0.56
Std Dev of Std Dev= 0.24
Question and area
Figure 6: Standard deviation of score with N=19 respondents for each of sixteen questions
Table 1: List of questions, sample mean (n) and standard deviation (a) for N=19 respondents on
a scale of 0 to 4. :
- ' vC< ' y"~" • f •?- ,
':•* ' <
i - cr .*;
Evaluation of the presentations: the speakers ,
1
2
3
4
5
6
7
8
Eva
9
10
11
12
13
Eva
14
15
16
Possessed a thorough knowledge of the workshop subject area
Were well-organized and prepared for the workshop
Explained the presentation objectives clearly
Communicated the subject matter well
Stimulated interest in the workshop subject(s)
Were willing to answer questions in the workshop
Were courteous
Were willing to participate in discussions
3.7
3.5
3.2
3.2
3.5
3.8
3.8
3.7
0.45
0.49
0.54
0.56
0.77
0.50
0.50
0.45
uation of the workshop
The abstract book adequately covered topics discussed
The discussion questions listed were appropriate
The time allowed for discussion was adequate
The organization of the workshop was satisfactory
The pace of the workshop was comfortable
3.3
3.7
3.3
3.4
3.0
0.94
0.48
0.89
0.64
0.73
uation of the participant's response
I would attend a NESC workshop again given an opportunity
I measure my own participation in this workshop as
I measure how much this workshop meets my expectations as
3.8
3.1
3.4
0.47
0.69
0.61
56
NESC Annual Report - FY1994
-------
Visualization for Place-Based Management1
Most of us find it hard to value what we
can't visualize. To illustrate the point, con-
sider Alaska. Back in 1867, when Ameri-
cans heard that Secretary Seward had
negotiated a $7.2 million deal to purchase
Alaska from the Russians, they were out-
raged. Cries of "Seward's folly" were heard,
and the purchase barely won approval from
Congress. Why? To most Americans,
Alaska was an unknown place they
assumed to be worthless wilderness. They
couldn't visualize Alaska, and they couldn't
value the un-visualizable.
How greatly the situation had changed by
1980. In that year, Congress passed the
Alaska National Interest Lands Conservation
Act, preserving almost 100 million acres of
land. A key difference across 113 years was
that Americans could now visualize "Alaska"
and perceive it as a valuable place, with
wildlife and natural features worthy of pres-
ervation. Thanks to widespread exposure to
literature, photographs, movies, and televi-
sion coverage, Alaska no longer seemed an
unknown, worthless place.
The Environmental Protection Agency
(EPA) is moving to a "place-based," ecosys-
tem management approach, in which we
work in partnership with many others to pro-
tect, restore, and manage ecosystems. I
believe that visualization will be an indis-
pensable tool to aid our efforts.
Let me offer another example of the spe-
cial importance of visualization. Can people
regulate their high blood pressure without
the use of drugs? The medical literature
suggests that the answer is yes, through the
use of bio-feedback techniques. In bio-feed-
back, the patient is given tools to enable him
or her to self-regulate parameters such as
blood pressure. It seems that, if one pre-
sents a person with the capability to continu-
ously monitor blood pressure - to visualize
the data if you will ~ she will find ways
(many of them subconscious) to self-regu-
late her blood pressure within reasonable
limits.
Isn't this what we would like to see in our
ecosystems? Rather than relying exclu-
sively on command-arid-control regulation,
penalties for non-compliance, and expen-
sive, after-the-fact mitigation and restora-
tion programs, shouldn't we also seek to
empower people to self-regulate and oper-
ate sustainably? (I am indebted to Dr. Dan
Janzen, the tropical biologist, for suggesting
this analogy.)
Visualization is key to such a strategy for
self-regulation for ecosystem protection ~
what one might term: "eco-feedback." And
many of the techniques, technologies, and
tools that you and others have developed for
data visualization can be applied to echo-
feedback. Let me review some ideas about
priority areas for using visualization for eco-
systems, i

Helping People Visualize Places of
Interest to Them
All of us have places we're interested in:
our "home turf;" places we've visited or
would like to visit; places we've studied;
places we're afraid of, and so forth.
Using the coming National Information
Infrastructure, I would like to see all Ameri-
cans, and eventually all citizens of the globe,
presented with rich capabilities to visualize
ecosystems of interest. Combining
remotely-sensed and ground-based moni-
toring data with spatial, demographic, and
biological information, citizens should be
able to do "fly-overs, -throughs, and -
unders;" zoom in and out; select layers for
display and filter data; run animations of
time-series data such as land-use change;
and so forth. Citizens should be given sim-
ple, yet powerful and flexible, interfaces that
NESC Annual Report - FY1994
57
-------
Visualization for Place-Based Management
allow them to visualize and interact with
information about ecosystems and run simu-
lations of alternative scenarios. I note, with-
out making any endorsement, that packages
such as Maxis Corporation's SimCity dem-
onstrate impressive capabilities along such
lines.
Citizens should be empowered to define
their own ecosystems, rather than being pre-
sented with arbitrary boundaries and
instructed that those are the "correct" eco-
systems. We all need to know where the
political boundaries are and the land-owner-
ships, the watersheds and the ecoregions
according to various authorities, but we
should not impose them as the only ways to
think about ecosystems. And just as a citi-
zen can follow his own favored stock portfo-
lio, he should be able to follow his own
portfolio of ecosystems.
Citizens must also be empowered to visu-
alize the components of ecosystems and the
processes associated with ecosystems.
This means innovative use of multimedia
technologies and perhaps virtual reality tools
to help citizens explore, learn, and under-
stand. Let a citizen see the inventory of
known biota of her place. Let her see what
each looks like, how it moves, what it
sounds like, what it eats, what eats it, what
is its habitat, and so forth.

Fostering "Friendly Competition"
Americans believe in the positive power of
competition in the marketplace, on the
sports fields, and often in other contexts as
well. I believe friendly competition could be
of great value for place-based management.
Give each political jurisdiction the visualiza-
tion tools to see what natural resources it
possesses now, as well as in the past and
the predicted future. Do the same for indica-
tors of ecosystem service levels. Empower
citizens to compare their ecosystem and
sustainability statistics ~ their "batting aver-
ages," if you will. Use advanced visualiza-
tion tools to make it compelling and fun ~
and it had better be compelling, arid it
should be fun.
We'll be enabled to make tremendous
progress if we make it possible to foster
friendly competitions among towns, coun-
ties, states, and nations to gauge who's
doing the best jobs of managing their
places. |

Eco-Feedback or Self-Regulation
f
Just as bio-feedback can enable individu-
als to regulate parameters such as blood
pressure, eco-feedback could enable com-
munities and societies to self-regulate their
management of ecosystems. For jthis to
work, citizens need readily-visualizable,
unambiguous, real- or near-real-time indica-
tor information about their places or ecosys-
tems. I know this is challenging, but I am
optimistic because we have seen this done
in other contexts that our society deems
important, such as weather and financial
information. Through national partnerships,
I believe we could construct usable indica-
tors of such things as land use/land cover
and ecosystem service levels that:were con-
sciously designed to empower and enable
self-regulation to protect, restore, pid man-
age ecosystems, conserve biodiversity, and
pursue sustainable development, j If the citi-
zenry is then bathed in this information, and
we are doing other things like promoting
friendly competition, I believe we will see
increased self-regulation and less need for
costly after-the-fact controls, mitigation, and
restoration. |
I have spoken elsewhere about ithe vision
for an "Environmental Channel" oh the infor-
mation superhighway - a two-way, interac-
tive capability to empower citizens to do
more to protect the environment. You in the
visualization community are out \rit front in
demonstrating technical capabilities that will
be essential in making the Environmental
Channel a reality. \
I
1 Written by Steve Young, Chief, Client Support Branch, Program Systems Division, Office of Information Resources
Management. Presented by Bill Laxton at EPA's International Environmental Visualization Workshop, Cleveland,
OH, 30 August 1994. !
58
NESC Annual Report - FY1994
-------
-
Experimental and Calculated Stabilities of Hydrocarbons
and Carbocations1
introduction
Mass spectral analysis of environmental
samples depends critically upon the quanti-
tative sensitivity of detection of a molecule in
question. Analytical strategies for detection
of new molecules can be designed most
effectively when there is a basis for predic-
tion of the detection sensitivity for that sam-
ple and the type of mass spectral analysis to
be done, El, Cl, positive or negative ion
detection, FAB, electrospray, etc. The gas-
phase basicities or proton affinities (PA) and
ionization potentials (IP) of molecules may
be correlated with positive ion sensitivities,
and electron affinities (EA) with negative ion
sensitivities, for example with polynuclear
aromatic hydrocarbons (PAH's). The quan-
tum mechanical calculation of these proper-
ties would make predictions of mass
spectral sensitivities possible even in cases
where experimental properties are not
known. The results obtained might also per-
mit the design of experiments for increased
efficiency of detection with particular Cl
gases tailored to match the properties of the
molecule to be analyzed.
As both gas-phase experimental tech-
niques and theoretical methods have devel-
oped to permit accurate determination of the
energetics of chemical processes, careful
comparisons of experimental and theoretical
results provide important insights into the
current state of theoretical methodology and
into the interpretation of experimental data.
Heats of formation of many stable neutral
molecules have long been well known,1 but
experimental developments in the area of
gas-phase ion energetics have particularly
accelerated in the last 20 years. With the
advent of equilibrium methods based on
high-pressure mass spectrometry and ion
cyclotron resonance spectrometry for deter-
mination of relative ion stability and comple-
mentary methods for determination of
absolute ion stabilities,2'3 we now have a set
of experimental thermodynamic data on
over 1000 positive and negative ions.
These data now rival those for neutral mole-
cules in quantity, although their quantitative
accuracy is often somewhat less reliable,
and entropy and heat capacity data are less
abundant or accurate. The wide variety of
charge and structural types among ions
makes these species especially critical in
testing the limits of quantum theoretical
methodology. Much effort has ensued,
therefore, to calculate energies of ions now
known experimentally, with some success in
reproducing relative trends with lower levels
of theory4 and more success in reproducing
absolute ion energies with higher levels of
theory on small ions. One particular aim of
our current work is to extend previous stud-
ies on proton affinities of heteroatom bases
to proton affinities of hydrocarbon bases and
hydride affinities of corresponding carboca-
tions by investigating systematically the
effects of (1) geometry optimization, (2)
basis set, and (3) correlation method on
these computed reaction energies.
We begin here by concentrating attention
on a series of C-|, C2, C3, and C4 hydrocar-
bon ions and related neutral molecules with
diverse structures typical of those found in
larger carbocations and hydrocarbons of
general interest, such as PAH's. The car-
bocation and hydrocarbon set chosen pro-
vides a particularly critical test of the ability
of theory to handle difficult problems with
structures containing pi-bonds, strained
rings, and resonance-stabilized, aromatic,
H-bridged, and non-classical carbocations.5
The experimental reaction energies are well-
NESC Annual Report - FY1994
59
-------
Experimental and Calculated Stabilities of Hydrocarbons and Carbocations
known in most cases by multiple experimen-
tal methods, so that we can achieve a "cali-
bration" of different levels of theory for
application to larger systems. Systematic
calibration on a wide variety of structures of
known energy is important in assessing the
accuracy that can be expected as the level
of theory is improved. This calibration also
might allow the identification of some lower-
level theoretical methods that are computa-
tionally feasible larger ions, but, at the same
time, are capable of giving results to "chemi-
cal accuracy" on the order of 1-3 kcal mol"1,
comparable to the experimental errors
expected for many larger hydrocarbon
ions. In addition, these calibrations also
might justify sufficient confidence in the
accuracy of the theoretical energies (and
structures) to be able to use them to help
interpret experimental results, either by fill-
ing in data inaccessible experimentally or
clarifying cases where the experimental
results are in question. For carbocations,
experimental problems are sometimes espe-
cially difficult, because of the possibility of
rearrangement, problems in generation of
radicals and assignment of their photelec-
tron spectra, slow rates of proton-transfer,
lack of gas-phase structural data and
entropy data, and an incomplete knowledge
of the dynamics of ion-molecule reactions
used to determine thermochemical limits or
equilibrium data. From the results reported
here and as yet unpublished results on
larger ions, we are especially encouraged by
the prospect that currently accessible quan-
tum theoretical models are now capable of
making major contributions in the solution or
reliable circumvention of such experimental
problems.

Discussion
In this work we have used calculations on
EPA's Cray supercomputer and other com-
puters with the GAUSSIAN program.6 In
detailing basis set effects on hydrogenolysis
energies of small hydrocarbons and car-
bocations, we have found5 some limitations
of augmented 6-31 G(d,p) basis sets for
high-accuracy work, that can be overcome
by employing triple-zeta Dunning basis sets
which give a description of the s and p shell
than possible with 6-31G or 6-311G bases.
We have also found that the use 6f 5d func-
tions instead of the usual 6d function default
in the Gaussian package gives no reduction
in accuracy and more internally consistent
basis set convergence as the bas|is sets are
augmented with further polarization and dif-
fuse functions. For the Pople basis sets, dif-
fuse functions are critical in achiel/ing high
accuracy in comparing energies of neutrals
with carbocations, as noted earlier for proto-
nation of lone pair neutral bases a|nd anionic
bases. Diffuse functions were not needed
with the Dunning triple-zeta basisjset, how-
ever, presumably because of the better sp
description of the diffuse region. '•
We have observed that very high level cal-
culations at the MP4/cc-pVTZ level give
hydrogenolysis energies closely comparable
to experimental values as illustrated in Table
1, page 62.7 The differences between
experiment and theory are small, almost all
positive, and increase somewhat regularly in
magnitude as the size of the ion of molecule
increases. The direction of this error is such
that the theory treats the molecule in ques-
tion less well than the simple symmetrical
reference molecule, methane, as might have
been expected. Significant for our interest
in Pais, however, the benzene molecule is
exceptional, showing a 3 kcal/mol negative
error, for resisons that are, as yet^ unclear.
Raising the level of correlation treatment to
CCSD(T) brings the benzene molecule
closer to experiment, but we have found in
other cases that such an effect is generally
offset by basis set effects in going to quadru-
ple-zeta basis sets. A solution to this prob-
lem will have to await further calculations on
benzene at larger basis sets and calcula-
tions at high level on other aromatic mole-
cules. In Table 2, page 63, are shown the
effects of calculations at various levels of
electron correlation on the proton affinities of
small molecules.7 The MP4 values are
nearly identical with experimental proton
60
NESC Annual Report - FY1994
-------
Experimental and Calculated Stabilities of Hydrocarbons and Carbocations
affinities and all lower levels of correlation
treatment are less satisfactory than MP4.
For larger molecules, however, this level of
theory is not practical, and we have looked
at the effectiveness of an additivity scheme
for approximation of the MP4/cc-pVTZ result
from MP2/cc-pVTZ and MP4/6-31+G(d,p)
values. The calculations at these levels of
theory are achievable with reasonable com-
putational resources for molecules of mod-
erate size and have been within 0.8 kcal/mol
of the known MP4/cc-pVTZ results for all
cases tested in smaller molecules. Calcula-
tions of the other properties of ionization
potential and electron affinity have been car-
ried out on a few small test molecules using
the same methods outlined above and are
giving results that agree closely with experi-
ment, for the cases studied.
Thus, we have found that the Gaussian
program for quantum mechanical calculation
at the MP4/cc-pVTZ levels of theory repro-
duce experimental energies within 1-2 kcal/
mol for a test series of C^ C2, C3, and C4
hydrocarbon ions and neutral molecules.
Further tests of Hartree-Fock level calcula-
tions of ionization potentials and electron
affinities via Koopmans' theorem correlate
with experimental results. These calibra-
tions justify sufficient confidence in the accu-
racy of the theoretical energies (and
structures) to be able to use them to help
interpret experimental results and make pre-
dictions for new molecules.

References
1 (a) Cox, J. D.; Pilcher, G. Thermochemis-
try of Organic and Organometallic Com-
pounds, Academic Press: New York,
1970. (b) Pedley, J. B.; Rylance, J. "Sus-
sex - N.P.L. Computer Analyzed Ther-
mochemical Data: Organic and
Organometallic Compounds"; University
of Sussex, 1977.
2 Lias, S.G.; Liebman, J.F.; Levin, R.D. J.
Phys. Chem. Fief. Data 1984, 13, 695,
Lias, S.G.; Bartmess, J. E.; Liebman,
J.F.; Holmes, J. L.; Levin, R.D.; Mallard,
W. G. J. Phys. Chem. Ref. Date 1988,
17, Suppl. 1.
3 (a) Aue, D.H.; Bowers, M.T. in Gas-Phase
Ion Chemistry, Bowers, M.T., Ed.; Aca-
demic Press: New York, 1979; Vol 2. (b)
Bartmess, J.E.; Mclver, R.T., Jr. Gas-
Phase Ion Chemistry Bowers, M.T., Ed.,
Academic Press, New York, 1979, Vol 2;
(c) Brauman, J.L. Gas-Phase Ion Chem-
istry Bowers, M. T., Ed., Academic
Press, New York, 1979, Vol 2. (d)
Kebarle, P. Annu. Rev. Phys. Chem.
1977, 28,445. (e) Taft, R.W. In Progress
in Physical Organic Chemistry, Taft,
R.W., Ed.; Wiley-lnterscience: New York,
1983; Vol.14, p.247. (f) Wolf, J.F.; Sta-
ley, R.H.; Koppel, I.; Taagepera, M.;
Mclver, R.T., Jr.; Beauchamp, J.L.; Taft,
R.W. J. Am. Chem. Soc. 1977, 99,5417.
g) Richert, J., Ph.D. Thesis, University of
California, Santa Barbara, 1989.
4 Hehre, W. J.; Radom, L.; Schleyer, P. v.
R.; Pople, J. A. Ab Initio Molecular
Orbital Theory, Wiley: New York, 1986.
5 Del Bene, J. E.; Aue, D. H.; Shavitt, I. J.
Am. Chem. Soc. 1992, 114,1631.
6 Frisch, M. J.; Head-Gordon, M.; Trucks, G.
W.; Foresman, J. B.; Schlegel, H.B.;
Raghavachari, K.; Robb, M. A.; Binkley,
J. S.; Gonzalez, C.; DeFrees, D. J.; Fox,
D. J.; Whiteside, R. A.; Seeger, R.;
Melius, C. F.; Baker, J.; Martin, R.; Kahn,
L. R.; Stewart, J. J. P.; Topiol, S.; Pople,
J. A. Gaussian 90; Gaussian, Inc.; Pitts-
burgh, PA, 1990.
7 Del Bene, J. E.; Aue, D. H.; Shavitt, I., to
be published. ••
NESC Annual Report - FY1994
61
-------
Experimental and Calculated Stabilities of Hydrocarbons and Carbocations ;
Table 1 : Hydrogenolysis Energies of Hydrocarbons and Carbocations (kcal/mol).
i

Gated
; ; Expti3 ',;;:;,
; E Difference ,--v
Data for most reliable exptl energies:
C2H2
C2H4
Ca^e
Propyne
Allene
Propene
Cyclopropane
Propane

CH3+
C2H3+
C2H5+
2-Propyl
Allyl
-107.2
-58.7
-18.6
-118.1
-120.6
-72.5
-79.6
-35.0

86.1
50.8
109.9
110.6
71.5
-107.3
-58.3
-18.4
-118.2
-120.0
-71.4
-78.7
-35.0

85.1
50.9
109.5
111.8
72.7
i -°-1
I 0.4
; 0.2
i -0.1
! 0.6
! 1-1
i 0.9
i
0.0

! -1.0
i 0.1
• -0.4
; 1.2
i 1-2
i
1 ,3-Butadiene
2-Methylpropene
Cyclopentadiene
Methylallyl
Cyclopententyl
1 -Butyne
Benzene
-122.8
-85.3
-155.3
72.1
48.9
-134.7
-166.4
-120.7
-84.0
-153.7
75.2
51.2
-133.9
-169.2
: 2.1
; 1.3
: 1-6
i 3.1
2.3
i 0.8
-2.8
Less well-known experimental energies:
Cyclopropene
CH5+
C2H7
-142.4
133.0
127.7
-139.6(?)
135.4(?)
129.1
2.8
2.4
; 1.4
62
NESC Annual Report - FY1994
-------
Experimental and Calculated Stabilities of Hydrocarbons and Garbocations
^
Ethyl+H2
Cyclopropenyl
Propargyl
2-Propenyl
Cyclopropyl
Corner-prot c-propyl
1 -Propyl
Caicd
110.9
35.4
7.3
60.1
34.2
105.2
90.6
Exptl3
115.5(7?)
35.0
8.9
57.0(?)
41 .8(7???)
106.8(7)
92.2(7)
E Difference

-0.4
1.6
-3.1

1.6
1.6
Experimental values are corrected to 0° K, with experimental ZPE's where available. All geome-
tries optimized at MP2/6-31 +G(d,p). ;

Table 2. Differences in Proton Affinities from Calculations at Different Moller-Plesset Correlation
Levels using the cc-pVTZ Basis Set.a
tn
CH4
2.6
0.4
0.0
0.4
2.9
0.1
CoH
2n2
-8.1
4.1
-2.2
1.9
-6.2
-1.9
C2H4
-8.4
2.4
-1.4
1.1
-7.4
-1.5
-2.8
1.8
0.3
2.1
-0.7
-0.1
allene (to allyl)
-16.4
2.9
-1.1
1.9
-14.5
-1.4
propyne
-17.0
7.8
-3.0
4.8
-12.2
-3.2
propene
-13.6
3.7
-1.6
2.1
-11.5
-2.2
cyclopropane
-7.1
2.3
-0.1
2.2
-4.9
-0.4
All energies in Real mol'1. Experimental values are corrected to 0° K, with experimental ZPE's
where available. All geometries optimized at MP2/6-31+G(d,p). ;

1 Donald H. Aue* and James W. Caras, Department of Chemistry, University of California, Santa Barbara, Santa
Barbara, CA 93106.
NESC Annual Report - FY1994
63
-------
Experimental and Calculated Stabilities of Hydrocarbons and Carbocations
64
NESC Annual Report - FY1994
-------
Infrared Frequencies and Intensities Calculated from MM3,
Semiempirical, and Ab Initlo Methods for Organic
Molecules1'2
Introduction
Gas-phase experimental vibratiorial
frequencies1 have been compiled from the
literature for comparison with calculated fre-
quencies. A goal of our work is to evaluate
the quality of the correlation of frequencies
calculated at different levels of theory with
experimental vibrational frequencies for a
large number of molecules. This correlation
will allow one to choose a calculation^
method for new molecules, taking into
account the accuracy that can be expected,
balanced against the calculational cost for
different levels of theory. This work includes
a nearly complete set of organic molecules
for which reliable gas-phase experimental
spectra have been assigned and is much
more extensive than widely-quoted compila-
tions of this type.2-3 The quality of the fit with
experimental infrared spectra is now suffi-
ciently good that these methods promise to
be of practical use in the assignment, or
confirmation of infrared spectra of unknown
molecules of interest in environmental ana-
lytical problems.

Methods
In this work we have used calculations on
EPA's Cray supercomputer and other com-
puters with the GAUSSIAN program.3 The
geometries of all of the molecules haive
been optimized at the single-determinant
Hartree Fock level with the 3-21G and
6-31 G(d) basis sets.3 Many of these struc-
tures are available from the Carnegie-Mellon
Quantum Chemistry Archive.4 Harmonic
vibrational frequencies were computed for
each structure to identify and distinguish
equilibrium structures from saddle-point
structures on the potential surfaces.
Discussion:
A sample of calculated and experimental
results are compared in Table 1, page 67. A
linear least squares correlation analysis with
the intercept constrained to be zero gives
scaling factors for the HF/3-21G and HF/
6-31 G* levels of theory that give a best fit to
experimental fundamental frequencies of
0.9023 and 0.9000, respectively, and signifi-
cantly different from the 0.89 scaling factor
most widely used in the literature for correc-
tion of Hartree Fock frequencies.2'3 Scaled
frequencies are tabulated using these scal-
ing factors and the differences from experi-
mental values are tabulated alongside.
Regression analysis with an unconstrained
linear least squares method gives nearly
identical results and intercepts close to zero.
The correlation coefficients(r^) are 0.9963
and 0.9980 with standard deviations of the
frequencies in the two correlations of 56.8
and 41.6 cm"1. Maximum errors are as high
as 100-300 cm'1 in a few cases. The
regression analysis results are summarized
in Table 2, page 69. Included in these corre-
lations are a large number of molecules of
interest for environmental analysis, for
example, aromatic hydrocarbons, oxirane,
formaldehyde, acrolein, aziridine, aniline,
pyrrole, and the pyridines.
The results indicate, encouragingly, that
these theoretical methods work reliably for a
wide variety of molecular skeletons and
functional groups and should, therefore, be
widely applicable for calculations on other
molecules. We have looked at whether dif-
ferent classes of molecules lead to better
correlations or different scaling factors and
have found no major differences among CH,
CHO, and CHN molecules. Dividing the cal-
culated frequencies into high (above 2800
NESC Annual Report - FY1994
65
-------
Infrared Frequencies and Intensities Calculated from MM3, Semiempirical, and Ab Initio Methods for
Organic Molecules
cm"1), medium (600-2800 cm"1), and low
(below 600 cm"') ranges leads to a signifi-
cant modification of the scaling factor and
reduction in the errors in some ranges, for
example to 27.7, 42.4, and 36.1 cm"1 for the
three frequency ranges at the HF/6-31G*
level with scaling factors of 0.9052, 0.8881,
and 0.8969, respectively. Separate treat-
ment of the CH stretching frequencies was
especially effective in reducing the standard
errors. In Table 1, page 67, the vibrational
assignments for frequencies are illustrated
showing the potential for separate scaling of
other frequency types to reduce the errors in
predicting experimental frequencies from
calculated values. The CC stretching fre-
quencies show the most dramatic differ-
ences with scaling factors closer to 1.0.
In most cases the frequencies have also
been calculated at the MM3, AM1, HF/STO-
3G levels, to see if lower levels of theory can
give usable results. The results in Table 2,
page 69, show that the quality of the correla-
tions decreases regularly with smaller basis
sets and with the semiempirical and molecu-
lar mechanics methods, but the correlations
could still be useful when more expensive
ab initio calculations are not possible.
A variety of molecules have been studied
at higher levels of theory to see if these
errors can be minimized. MP2 frequencies
and intensities have been calculated for a
variety of molecules with the 6-31 G(d), 6-
31+G(d,p), and 6-311+G(2df,2pd) basis
sets. Curiously, the frequencies calculated
at the two smaller basis sets give no better
correlations with experiment than the HF/6-
31G(d) frequencies, but the MP2/6-
311+G(2df,2pd) frequencies give what
appear to be better fits from the limited data
thus far. A summary table of the results of
regression analyses is shown in Table 2.
With the MP2 calculations, the fits with har-
monic experimental frequencies are better
and slopes closer to 1.0 than with funda-
mental frequencies, as expected. Curiously,
the HF/6-31G* frequencies do not, however,
give a closer fit with harmonic experimental
values. ;
Infrared intensities have been calculated
and tabulated for each frequency [reported
and at each level of theory and compared
with what experimental data has been found
to be available thus far. Where quantitative
experimental integrated intensity data is
available, the fits with calculated intensities
are good, and semiquantitative intensity
comparisons in experimental spectra appear
to be well predicted by calculated intensi-
ties. |
The results of this study permit us now to
calculate the infrared frequencies for mole-
cules of concern in the environment whose
spectra are not well known or properly
assigned with confidence in the accuracy
that can be expected from such calculations.
The compilation of theoretical and experi-
mental infrared intensity data in spreadsheet
form can be used to do a full simulation of
experimental spectra. |
i
References
1 Shimanouchi, T. "Tables of Molecular
Vibrational Frequencies", National Stan-
dard Reference Data Series, No. 39,
National Bureau of Standards1, Washing-
ton, D. C., 1972.; Shimanouchi, T. J.
Phys. Chem. Ref. Data 1977,! 6,1.
2 Pople, J. A.; Schlegel, H.B.; Kri'shnan, R.;
DeFrees, D. J.; Binkley, J. S.; Frisch, M.
J.; Whiteside, R. A.; Hout, R. F.; Hehre,
W. J. Int. J. Quantum Chem., Duantum
' Chem. Symp. 1981,15, 269. |Pople, J.
A.; Head-Gordon, M.; Fox, D. J.; Ragha-
vachari, K.; Curtiss, L. A. J. Chem. Phys.
1989, 90, 5622.
3 Hehre, W. J.; Radom, L; Schleyer, P. v. R.;
Pople, J. A. Ab Initio Molecular Orbital
Theory, Wiley: New York, 1986.
4 Whiteside, R. A.; Frisch, M.J.; Pople, J. A.
The Carnegie-Mellon Quantum Chemis-
try Archive; Carnegie-Mellon University:
Pittsburgh, 1983.
66
NESC Annual Report - FY1994
-------
Infrared Frequencies and Intensities Calculated from MM3, Semiempirical, and Ab Initio Methods for
i Organic MoJecuJes
Table 1. Comparison of Calculated and Assigned Experimental Frequencies (cm'1).
"*^ *,
*
CH4
(^Hg

-Symmetry
ai
e
t2

alg

aiu
^2u

Mode
sym
stretch
deg
deform
deg stretch
deg
deform
ch3s-
stretch
ch3 s-
deform
cc stretch
torsion
ch3 s-
stretch
ch3 s-
deform
ch3 d-
stretch
ch3 d-
deform
ch3 rock
ch3d-
stretch
ch3d-
deform
ch3 rock
HF/6-31G*
3197
1703
3302
1488
3206
1580
1062
326
3200
1548
3249
1644
1338
3274
1650
890
•a:Exptt .-
' ' > •'.-* *.:••-> "' '-
2917
1534
3019
1306 ;
2954 '.
\
1388
995
289
2896
1379
2969
1468
1190
2985
1469
822 :
Ratio
' • •:"*&* j'"!" :•••'..,.
0.9124
0.9008
0.9143
0.8777
0.9214
0.8785
0.9369
0.8865
0.9050
0.8908
0.9138
0.8929
0.8894
0.9117
0.8903
0.9236
NESC Annual Report - FY1994
67
-------
Infrared Frequencies and Intensities Calculated from MM3, Semiempirical, and Ab Initio Methods for
Organic Molecules

QsHg
/^mrrietry
a1

;,.:,;• Mocte^
ch3d-
stretch
ch3s-
stretch
ch2s-
stretch
ch3d-
deform
ch2 scis
ch3s-
deform
ch3 rock
cc stretch
ccc deform
ch3d-
stretch
ch3d-
deform
ch3s-
deform
ch3 rock
torsion
HF/6-31G*
3262
3201
3194
1658
1636
1571
1286
930
392
3250
1635
1436
985
234
fexpt!
•*•<•'!., •''• .• '\ ' -:-''
2977
2962
2887
1476
1462
1392
1158
869
369
2967
1451
1292
940
208
'•Ratta|;v
0.9125 i
i
0.9252
0.9039
0.8902 :
0.8936 !
0.8860
0.9001 |
0.9347 |
I
0.9418
0.9129 I
0.8875 !
0.8997 |
i
i
0.9543 i
0.8889
i

68
NESC Annual Report - FY1994
-------
Infrared Frequencies and Intensities Calculated from MM3, Semiempirical, and Ab Initio Methods (or
Organic Molecules
"'<

Symmetry
bi

Mode
ch3d-
stretch
ch3s-
stretch
ch3d-
deform
ch3s-
deform
ch2 wag
cc stretch
ch3 rock
ch3 d-
stretch
ch2a-
stretch
ch3 d-
deform
ch3 rock
ch2 rock

torsion
HF/6-31G*
3259
3193
1642
1561
1503
1126
1014
3264
3217
1652
1331
808

293
Exptl
2968
2887
i
1464
1378 ;
1338 ;
1054
922
2973 ,
2968
1472
1192 i
748 I

223
Ratio
0.9108
0.9042
0.8916
0.8828
0.8905
0.9359
0.9090
0.9107
0.9227
0.8910
0.8952
0.9259

0.7611
Table 2. Standard Errors in Predicted Infrared Frequencies of Organic (G,H,N,O) Molecules.
Standard Error in Frequency (cm-1)
/ ' " •'-<•'
.Level of theory

MM3
AM1
HF/STO-3G
Zero
Intercep
t, "
*•
85.3
87.9
66.9
Non-
zero
tntercep
t

85.0
87.9
63.7

f
v- Nl
^
949
1084
1086
NESC Annual Report - FY1994
69
-------
Infrared Frequencies and Intensities Calculated from MM3, Semiempirical, and Ab Initio Methods for
Organic Molecules i
> tv ^ V?
^. v^ ,, ~J. ^
Z. $Tf * v *' i*
> *- * *
Level of theory *
HF/3-21G
HF/6-31G(d)
MP2/6-31G(d)
MP2/6-31+G(d,p)
HF/6-31G(d)
MP2/6-31+G(d,p)
MP2/6-311+G(2df,2pd)
Zerofi~
Intercep
t
x
56.8
41.6
61.4
59.3
49.5
47.4
34.4
Non-
zero
fntercep
t
56.1
41.0
58.9
51.7
48.6
36.9
26.5
N
1085
1085
290
402
402
54
54
Fundamental Freqs:
HF/6-31G(d)
MP2/6-31+G(d,p)
37.0
54.8
35.4
46.8
133
133
Harmonic Freqs:
HF/6-31G(d)
MP2/6-31+G(d,p)
51.1
43.7
44.0
41.4
133
133
High Frequencies (above 2800 cm-1):
HF/6-31G(d)
MP2/6-31+G(d,p)
27.7
39.7

255
108
1 Donald H. Aue*. Michele Guidoni and James W. Caras, Department of Chemistry, University of California, Santa
Barbara, Santa Barbara, CA 93106. i
2 Donald Gurka, United States Environmental Protection Agency, Environmental Monitoring Systems Laboratory, Las
Vegas, NV 89119.
70
NESC Annual Report - FY1994
-------
Ab Initio Calculations of Proton Affinities, lonization
Potentials, and Electron Affinities of Polynuclear Aromatic
Hydrocarbons and Correlation with Mass Spectral Positive
and Negative Ion Sensitivities 1»2
Background
The field of environmental analytical
chemistry has been slow to utilize the power
of computational chemistry. Recently, statis-
tical treatment of the data and statistical
experimental design have been computer-
ized and have been a great help in methods
development and methods evaluation.
However, limited use has been made of
structure and energy optimization programs
for the benefit of analytical methods
development.
Part of a feasibility study for the potential
of computational chemistry in environmen-
tal analytical chemistry to support U.S. EPA
RCRA and Superfund legislation has been
our effort to predict relative intensities of
environmental pollutants, under both posi-
tive and negative ion Chemical lonization
(Cl) conditions in the mass spectrometer.
This is important because of the possibility
of achieving enhancement of the ion signal
under Cl compared with electron impact ion-
ization for trace analysis. The parameters
important for predicting these sensitivities
are lonization Potentials (IPs), Proton Affini-
ties (PAs), and Electron Affinities (EAs). It is
possible to calculate these parameters from
ab initio calculations.
Let us see how these parameters affect
the sensitivities in the positive and negative
ion modes. Siegel modeled the chemical
ionization source, considering both positive
and negative ion reactions in an ion source
and accounting for recombination rates and
ambipolar diffusion1. He came up with the
simple expression:
This tells us that the ratio of the sensitivi-
ties is just the ratio of the rate constants for
the attachment and charge transfer (or ion-
molecule reaction) processes.
Classical theories2 of ion-molecule inter-
actions predict a collision or capture-rate
coefficient (k,;) given by:
For example: a (polarizability) for HCN =
2.59 A3; Mo (dipole moment) = 2.98d
k = 1 0-9*1 0'
For CH5+ + HCN, ^(reduced mass) = 10.4
and kc = 3.38 x1(T9.
For e' + HCN, p, = (3.000545 and k~ = 4.67 x
10'7. ;
This shows that theoretically one can
have a two-orders-of-magnitude enhance-
ment of the negative ion over the positive ion
mode. On a more practical side, laboratory
determinations of rate coefficients kexp have
indicated that most exothermic proton trans-
fer reactions proceed extremely fast at ther-
mal energies3. The transfer of a proton,
when energetically allowed, often proceeds
on nearly every collision. There are few
exothermic reactions for which kexp/kc < 0.5.
The exothermicities of most of these reac-
tions are relatively small, AH < 15 kcal/mol.
Therefore, the proton affinities or relative
proton affinities seem to have an influence
on the rate of reaction and consequently the
intensity of the protonated molecule.
NESC Annual Report - FY1994
71
-------
Ab Initio Calculations of Proton Affinities, lonization Potentials, and Electron Affinities of Polyngclear
Aromatic Hydrocarbons and Correlation with Mass Spectral Positive and Negative Ion Sensitivities
For the electron attachment process:
A"'
For molecules with EA > 0, k-| is generally
large. If EA is sufficiently large, the molecu-
lar anion A"- will be stable against electron
detachment (i.e., k.1 will be small). There-
fore, an intense response to A"- will be
observed, assuming that A does not
undergo a dissociative capture process. If
EA is small (less than 10 kcal/mol), k.1 will
be large, and the intensity of A'- will be
weak4.
Let us now compare the responses (Table
1, page 73) that one obtains with Negative
Ion Chemical lonization (NICI) with those by
Electron Impact lonization (El). These
responses normalized to the electron impact
ionization response of pyrene show two
trends: (1) the El response is fairly constant
for these PAHs, ranging from 0.75 for
anthanthrene to 1.12 for benzo[puoran-
thene; and (2) the NICI response variation is
extreme, spanning 4 orders of magnitude in
response. Since the El response does not
show much variation, the influence of IPs on
the mass spectral intensities, at least for the
PAHs, is minimal in the common El ion
source that utilizes 70-eV electrons. In con-
trast, the effect of EAs on sensitivities is
likely to be great. A similar table that corn-
pares the negative to the positive ion inten-
sity under chemical ionization conditions is
presented as Table 2, page 746. Also, EAs7
for most of the PAHs listed are given. There
is a loose relationship between EAs and the
N/P ratio. Generally, the compounds with
EAs above 0.42 eV show an enhancement
in the negative ion mode. However, the
size of the enhancement does not correlate
well with the magnitude of the EAs. Other
factors are most likely involved, such as the
proton affinities. Also, some of the reported
EAs may have to be reexamined.
Method / Approach
The calculations were performed with
Gaussian software and mainly with the
National Environmental Supercorhputing
Center's (NESC) Cray computer. iThe calcu-
lations were generally attempted up to the
6-31G* levels. These levels will not give
good absolute levels (MP2 or MP^ would be
better), but for a series of compounds, such
as the PAHs, they should give good relative
values (within 1-2.5 kcal/mol).8 This level of
calculation should be satisfactory-for IPs
using Koopmans' theorem9 because of the
cancellation of errors when using Hartree-
Fock (HF) calculations for the highest occu-
pied molecular orbitals (HOMOs). The
same cannot be said about the EAs where
there is not a cancellation of errors in the
calculations.
The following expressions show the "true"
values of the IP and EA, respectively:

IP = -e, + E+(R) + XE|P(c6rr)
and
EA = -ea - E.(R) + XEEA(corr)

where 6j and ea are respectively the Har-
tree-Fock energies of the appropriate occu-
pied and unoccupied orbitals of the neutral
molecule; E+(R) and E.(R) are the| relaxation
(reoptimization) energies of the cation and
anion, respectively, and are the differences
between the energy of the ion using the
relaxed orbitals and the frozen neutral-mole-
cule orbitals; and XE|P(corr) and XEEA(corr)
are the electron correlation corrections to
the IP and EA, respectively. The correlation
energy difference XE|P(corr) is usually posi-
tive, whereas the relaxation energy of the
cation E+(R) is negative, and they are of
approximately equal magnitude. As a result,
the correlation and relaxation corrections
tend to cancel in the calculation of IPs. The
anion case is different. Again the correlation
energy difference XEEA(corr) is usually posi-
tive, but now the relaxation energy enters as
a positive quantity, -E.(R). The result is that
the correlation and relaxation corrections
72
NESC Annual Report - FY1994
-------
Ab Initio Calculations of Proton Affinities, lonization Potentials, and Electron Affinities of Polynuclear
Aromatic Hydrocarbons and Correlation with Mass Spectral Positive and Negative Ion Sensitivities

Table 1. Comparison of Response Factors for PAHs Obtained by NICI (with Methane) and
E-lectron Impact lonization5.
.COMPOUND
Phenanthrene
Anthracene
2-Methylanthracene
1 -Methylphenanthrene
Floranthene
Pyrene
Benzo[a]fluorene
Benzo[b]fluorene
Benzo[a]anthracene
Benzo[b]fluoranthene
Benzojj]fluoranthene
Benzo[k]fluoranthene
Benzofejpyrene
Benzo[a]pyrene
Perylene
ldeno[1 ,2,3-cd]pyrene
Benzo[ghi]perylene
Anthanthrene
El
0.94
0.89
0.78
0.84
0.99
1.00
0.99
1.07
1.09
1.07
1.12
1.02
0.80
0.89
0.90
0.81
0.87
0.75
NICI
0.04
0.54
0.35
0.30
8.90
0.03
0.12
0.16
6.50
38
42
36
0.01
120
; 1.5
120
i 64
170
are of the same sign and magnitude and,
therefore, -ea is not a good approximation to
the EA.

Results and Discussion
The IPs that were calculated using Koop-
mans' theorem correlated linearly to the
experimentally derived IPs. For the 18 cal-
culated IPs, the average difference from the
experimental numbers was 0.27 ev. The
EAs calculated from Koopmans' theorem did
not agree with the experimental values.
However, the experimental values showed
some correlation with the energies of the
lowest unoccupied molecular orbitals.
The proton affinities calculated from the
Hartree-Fock energies of the neutrals and
the protonated species are listed in Table 3,
page 75. The proton affinities calculated
with the 6-31G* basis set at the 3-21G
geometry (6-31GV/3-21G, single point cal-
culation) are approximately equal to the
NESC Annual Report - FY1994
73
-------
Ab Initio Calculations of Proton Affinities, lonization Potentials, and Electron Affinities of Polynuclear
Aromatic Hydrocarbons and Correlation with Mass Spectral Positive and Negative Ion Sensitivities

Table 2: NICI/PICI Response (N/P) for Selected PAHs.
COMPOUNDS
Naphthalene
Acenaphthylene
Fluorene
Phenanthrene
Anthracene
Fluoranthene
Pyrene
Benzo[a]anthracene
Chrysene
Benzo[b]fluoranthene
Benzo[k]fluoranthene
Benzo[a]pyrene
Benzo[e]pyrene
ldeno[1 ,2,3-cd]pyrene
Benzo[ghi]perylene
Dibenzo[a,h]anthracene
EA ...;; :
-0.06
0.77
—
0.03
0.49
0.63
0.45
0.42
0.26
—
—
0.64
0.35
—
0.51
0.37
;;..;-*,.:;;N/P, ... ;•_,,
0.006
6.0
0.31
I
0.11
2.9
8.1
0.23
13
0.12
220
150
400
0.46
380
210
27
proton affinities calculated at the fully opti-
mized 6-31G* geometry. The proton affini-
ties at the 3-21G level are approximately 5
kcal/mol less than the other values.
Although these values are closer to the
experimental numbers10, this is most likely
fortuitous. We are investigating the trend as
we go to higher levels. At the MP2 level for
benzene and toluene, the proton affinities
decrease by approximately 14 kcal/mol from
the 6-31 G* levels. Figure 1, page 76, plots
the experimental proton affinities against the
proton affinities calculated at the 6-31 G*
level. The theoretical values correlate fairly
well (correlation coefficient = 0.956) against
the experimentally determined proton
affinities. ;

Conclusion
At the levels of theory attempted for the
PAHs, the correlation is satisfactory. How-
ever, the absolute proton affinities^ and elec-
tron affinities are significantly different from
the experimentally determined values. The
enhancements in sensitivities in the nega-
tive ion chemical ionization mode pver those
in the El mode and in the positiveiion chemi-
cal ionization mode is mainly due to the
magnitude of EAs for these compounds and
the reaction rate for the electron qapture
process. Further work will concentrate on
74
NESC Annual Report - FY1994
-------
Ab Initio Calculations of Proton Affinities, lonization Potentials, and Electron Affinities of Poiynuctear
Aromatic Hydrocarbons and Correlation with Mass Spectral Positive and Negative Ion Sensitivities

Table 3. Comparison of Proton Affinities: Theory and Experiment.
COMPOUND
Benzene
Toluene
Naphthalene
Azulene
Biphenylene
Acenaphthylene
Anthracene
Phenanthrene
Pyrene
Triphenylene
Chrysene
Naphthacene
Coronene
1 -Methyl naphthalene
Acenaphthene
Fluorene
9-Methyl anthracene
PA(3-21G)
190.0
198.9
205.2
245.7
209.2
221.3
225.1
207.4
208.9
206.3
213.6
235.1
214.3
211.4
217.2
199.3
229.5
PA1
195.5

210.4
248.1
214.4

214.4
212 .4
219.4

220.5
216.1

205.3
233.6
PA(6r3tG*>
195.5
204.2
210.4
248.0
214.9
225.2
229.4
213.1
•

239.4

216.1
222.8
205.3
233.4
^?MP
-------
Ab Initio Calculations of Proton Affinities, lonization Potentials, and Electron Affinities of Polynuclear
Aromatic Hydrocarbons and Correlation with Mass Spectral Positive and Negative Ion Sensitivities
Figure 1: Correlation of Theoretical Proton Affinities with Experiment.
4 Valkenberg, C.A.; Knighton, W.B.; Grim-
srud, E.P. J. High Resolut. Chromatogr.
1986,9,320.
5 Oehme, M. Anal. Chem. 1983, 55, 2290.
6 Daishima, S.; lida, Y.; Shibata, A.; Kanda,
F. Org. Mass Spectrom. 1992, 27, 571.
7 Younkin, J.M.; Smith, L.J.; Compton, R.N.
Theoret. Chem. Acta1B7Q, 41,157.
8 Uggerud, E. Mass Spectrom. Reviews
1992,11,389. ;
9 Koopmans, T. Physica 1933,1 ,| 104.
10 Lias, S.G.; Liebman, J.F.; Levin, R.D. J.
Phys. Chem. Ref. Dafa1984,;13, 695.
1 L.D. Betowski and Steve Pyle, U.S. EPA, EMSL-Las Vegas, P.O. Box 93478, Us Vegas, NV 89193-3478.
2 Donald H. Aue and James W. Caras, Department of Chemistry, University of California, Santa Barbara, Santa
Barbara, CA 93106. '
76
NESC Annual Report - FY1994
-------
Parameterization of Octanoi / Water Partition Coefficients
(LogP) Using 3d Molecular Properties: Evaiyation of Two
Published Models for LogP Prediction
* •* * f " ^ "* ^^ .N.A , •• .j
Disclaimer
This manuscript has been reviewed by the
Health Effects Research Laboratory, U.S.
Environmental Protection Agency (EPA) and
approved for publication. Approval does not
signify that the contents necessarily reflect
the views and policies of the Agency, nor
does mention of trade names or commercial
products constitute endorsement or recom-
mendation for use.

Background
The water/lipid partitioning process that
governs transport and bioavailability in bio-
logical systems plays a central role in modu-
lating and determining the relative activity of
many xenobiotics. The development of
approximate computational means for mod-
eling this partitioning capability has contrib-
uted greatly to the field of quantitative
structure-activity relationships. The most
extensively parameterized, tested, and
widely used computational approach for
estimating such quantities is the empirical,
fragment-based approach of Hansch and
Leo1 (and the automated CLOGP version2)
that is heavily relied upon within the Agency
for toxicity screening. CLOGP treats a mol-
ecule as a collection of fragments and mod-
els the log of the octanol/water partition
coefficient (logP) as a sum of parameterized
fragment constants, where the contribution
of each fragment to the total partitioning is
assumed to be invariant to 3-dimensional
molecular environment. As a result, the
CLOGP method is unable to account for
many types of non-bonded interactions and,
thus, is incapable of distinguishing 3D con-
formational isomers or closely related posi-
tional isomers. The method also relies on
an incomplete set of fragment coefficients
and on a large number of correction factors
to account for cases where the strict additiv-
ity assumption does not hold.
An alternative approach to the estimation
of partition coefficients explicitly incorpo-
rates properties derived from the 3D molec-
ular structure. The present study contrasted
two published models for logP estimation
based on 3D molecular properties. The
goals were to evaluate the overall predictiv-
ity, relative performance, and ability of these
models to meaningiFully distinguish logP val-
ues for conformational and positional
isomers.

Methods and Models
The published Bodor and Huang3 logP
prediction model was derived from a training
set of 302 compounds. A large number of
potential parameters, including higher order
non-linear terms, were screened for statisti-
cal association with experimental logP val-
ues by a stepwise linear regression
procedure3. The final 18 term function has
the form shown in Equation 1 where:
LogP = Const. + a1 . n + a2. laik + a3 • SOON + a4. ABSQ + a5. MW + ae. NC +
a7.SQN + a8.SQN2 + a9.SQN4+ ;

a10 • SQO + an . SQO2 + a12 • SQO4 + ,
a13.SA + a14.SA2+ \

. OVAL + a! 6. OVAL2 + a^7. OVAL4 ;
Equation 1 ',
NESC Annual Report - FY1994
77
-------
Parameterization of Octanol/Water Partition Coefficients (LogP) Using 3d Molecular Properties: Evaluation
of Two Published Models for LogP Prediction j
3j - regression coefficients3, n = total dipole
moment; Iaik=1 for alkanes, 0 for non-
alkanes; SQN, SQO, and SOON are sums
and products of the partial atomic charges
on nitrogen and oxygen atoms; ABSQ is a
sum of absolute charges of all atoms; MW is
the molecular weight; NC is the number of
carbon atoms; SA is the Van der Waals sur-
face area; and OVAL is an ovality function.
All models of this general form are referred
to by us as BODOR models; the original
published model is referred to as BODORO.
The published Kantola, Villar and Loew4
logP prediction model function was pro-
posed prior to statistical analysis and, sub-
sequently, was "empirically calibrated" on a
training set of 90 chemicals. Three terms
involving summations over all atoms were
chosen to represent: hydrophobic cavity-for-
mation, hydrophilic charge contributions to
the interaction energy, and atomic polari;:-
ability contributions to the hydrophobic inter-
action. The 18 term function has the form
shown in Equation 2

LogP = 2, [ aj(R). SAj + p,(R). SAj. (Aq,)2
Equation 2

where: i sums over all atoms of six atom
types, R = C, H, N, O, Cl and F;
-------
Parameterization of Octanol / Water Partition Coefficients (LogP) Using 3d Molecular Properties; Evaluation
of Two Published Models for LogP Prediction
recalculated parameters for the TRAIN90
data set. This was successfully accom-
plished for Equation 2, page 78, with the
resulting model referred to as LOEW1.
LOEW1 and published LOEWO logFD values
were in good agreement. However, the
effort to derive new regression coefficients
for Equation 1, page 77, based on TRAIN90
was unsuccessful due to the highly collinear
nature of the regression parameters. To test
the significance of the non-linear terms in
Equation 1 and to effect a solution to the
regression problem, four new condensed
parameters - SQNX, SQOX, SAX, and
OVALX - were defined in terms of the
weighted polynomials in the original pub-
lished BODORO model3, e.g. SQNX = a7.
SON + aa. SQN2 + a9. SQN4. The result-
ing 11-term equation was denoted
BODOR1X. In addition, the performance of
models were examined in which the VDW
radii surface area was replaced by the sol-
vent-accessiblesurface area (defined by the
addition of 1.5 A to the VDW surface) -
denoted BODOR1XS and LOEW1S, and no
higher order polynomial terms were included
in an Equation 1-denoted BODOR1.

Results and Discussion
Summary statistics for application of
CLOGR BODORO' and various models
derived from TRAIN90 to prediction of the
TRAIN90 data set and TEST102 data set
are provided in Table 1. Not surprisingly, all
models performed reasonably well on the
TRAIN90 set. In contrast, all models, with
the sole exception of CLOGP, perform signif-
icantly worse when extrapolated to predic-
tion of the TEST102 set. The predictive
capability of these models varies consider-
ably within subclasses of the TEST102 data
set (results not shown), with predictions for
some chemical classes (steroids and gluco-
pyranosides) much worse than for others.
In order to assess the impact of the nature
and size of the TRAIN set on the predictivity
of the various models, 30 chemicals, repre-
sentative of the various chemicals classes in
TEST102, were added to the TRAIN90 set
to yield the expanded TRAIN120 set.
Table 1: Summary Statistics (Adj. R2) for Overall Model Performances.
^ * V
"* Vk
,, ,~>-~
\CL00f»;
BODORCU
v *•* \ *i v *
^BOBORIX
\xgEwr:o
"LOEWISV
TRAIN90 »:
TRAIN90
TEST102
0.909 *
0.957
0.884
0.729
0.919
0.748
0.921
0.731
0.927
0.711
TRAIN120'.
TEST102
0.957
0.729
0.857
0.901
0.884
* Adjusted R squared of the linear regression fit of predicted to experimental logP values.
BODOR1X, LOEW1 and LOEWIs models developed from TRAIN90 set; all models applied to prediction of logP for
the TRAIN90 set and the TEST102 set.
c BODOR1X, LOEW1 and LOEWIs models developed from TRAIN120 set (TRAIN90+30 from TEST102); all models
applied to prediction of logP for TEST102 set.
NESC Annual Report - FY1994
79
-------
Parameterization of Octanol / Water Partition Coefficients (LogP) Using 3d Molecular Properties': Evaluation
of Two Published Models for LogP Prediction
Application of this model to prediction of the
TEST102 set is presented in the last row of
Table 1. [Note that this is no longer a strict
extrapolation since TRAIN120 includes 30 of
the TEST102 chemicals.] The performance
of the BODOR and LOEW models markedly
improves, with the LOEW models realizing
greater overall improvement than; the
BODOR1X model. I
LogP predictions using the various models
developed from the TRAIN120 set are pre-
sented in Table 2 for a variety of positional
Table 2: LogP Model Predictions3 for Positional and Conformational Isomers.

2-cyanopyridine#
3-cyanopyridine
2,5-dichlorobiphenyl#
2,6-dlchlorobiphenyl
2,2'-dichlorobiphenyl
2,4'-dichlorobiphenyI
2,4,4-trIchlorobiphenyl
2,3,4-trichIorobiphenyi#
2,4,5-trichIorobiphenyl
2,4,6-trfchlorobiphenyl
2,3',4'-trichloroblphenyl
2,3,4,5-tetrachlorobiphenyl
2,3',4'15I-tetrachlorobiphenyl
3-fIuorophenylacetic acid#
4-fluorophenylacetic acid
2-propylamino benzonorbomene (exo)
2-propylamino benzonorbomene (endo)
6-trifluorornethyl-2-amino°"(exo)
6-trifluoromethyl-2-amino""(endo)
7-trifluoromethyI-2-amino"u(exo)
7-trifluoromethyi-2-amino ""(endo)
dexamethasone#
betamethasone
EXP
0.50
0.36
5.06
4.84
4.65
5.07
5.67
5.51
5.60
5.44
5.60
6.04
6.13
1.55
1.65
3.30
3.13
3.21
2.91
3.19
2.85
1.83
1.94
CLOGP
0.23
0.23
5.46
5.46
5.46
5.46
6.17
6.17
6.17
6.17
6.17
6.88
6.88
1.56
1.56
3.22
3.22
2.86
2.86
2.86
2.86
1.41
1.41
BODORO
1.56
1.45
5.43
5.38
5.30
5.28*
5.69
5.32
5.66
5.74*
5.30*
5.89
5.61*
1.66
1.74
3.56
3.72*
3.34
3.28
3.35
3.21
1.94
1.97
BODOR1X
1.13
1.1 4*b
5.07
4.98
4.83
4.74*
5.20
4.82
5.15
5.32*
4.76*
5.30
5.04*
0.97
1.11
3.39
3.57*
3.09
3.05
3.13
2.91
1.75
1.78
LOEW1
0.93 i
0.89
4.98 [
4.88 ;
4.79 |
4.91 i
5.29 ;
4.87 !
5.46
5.35
5.10*
5.73 |
5.44*
1.53
1.50*
3.07
3.09*
2.86 :
2.85
I
2.86 ,
2.76 i
1.38
1.34*'
LOEW1S
0.91
0.79
4.99
4.72
4.60
4.96
5.43
4.91
5.44
5.25
5.19*
5.58
5.62
1.36
1.38
3.16
3.14
2.84
2.72
2.81
2.74
1.72
1.68*
a All models developed from TRAIN120 set; set includes 7 chemicals marked with a # symbol. ':
b LogP predictions marked with an * represent an incorrect direction of change from the previous logP value
column, e.g., logP should decrease from 2- to 3-cyanopyridine but is incorrectly predicted to increase in
BODOR1X model.
in the
80
NESC Annual Report - FY1994
-------
Parameterization of Octanol / Water Partition Coefficients (LogP) Using 3d Molecular Properties: Evaluation
of Two Published Models for LogP Prediction
isomers and conformational isomers from ;
TEST102. Although the absolute errors in
logP predictions for all models is greater, on
average, than the small differences in exper-
imental logP values between closely related
isomers, cancellation of errors may occur for
similar chemicals. The results in Table 2,
page 80, clearly indicate that the BODOR
and LOEW models are able to distinguish
between closely related positional and con-
formational isomers, but it is not clear that
such models can correctly predict relative
changes in logP values between such iso-
mers? In Table 2 and results not shown, the
LOEW1S model has the best overall suc-
cess predicting the correct direction of logP
change between isomer pairs approxi-
mately 80%, while the success rate of the
BODOR models is near random for some
chemicals classes and much better than
random for others.

Conclusions
Both the BODOR and LOEW model func-
tions provide reasonably accurate means for
predicting logP values in cases where
CLOGP values are unavailable, with
adjusted R2 values near 0.9. The results of
this study are not conclusive but do suggest
that these models have some capability for
meaningfully distinguishing between closely
related isomers, although their accuracy and
reliability for this purpose should be demon-
strated for the chemical class of interest.
For the chemicals considered in this study,
the LOEW models seem to provide more
accurate prediction of relative logP changes
for closely related isomers, and realize
greater improvement in performance than
the BODOR models as the size of the train-
ing set increases. Other significant advan-
tages of the LOEW model functional form
are its a-priori rational design that could be
refined by a more rigorous consideration of
the solvation process, and its atom-based
parameterization that provides a convenient
means for examining fragment contributions
to the total hydrophobicity. Due to these
advantages, future investigations would
likely focus on refinement of the LOEW
model approach for logP prediction.
The Agency relies on the CLOGP method
for estimation of logP values for use in a
variety of toxicity estimation models. In
cases where CLOGP is unable to make a
prediction, or is unable to provide the logP
discrimination required for a given study,
alternative logP calculation methods based
on 3D structure should prove extremely
useful.

A full manuscript discussing this topic is
in preparation for journal publication.
|

References

1 Hansch, C. and Leo, A. (1979) Substituent
Constants for Correlation Analysis in
Chemistry and Biology, Wiley-lnter-
science, New York
2 Leo, A., and Weiininger, D. (1992)
CLOGP: Medchem Software Release
3.5, Medicinal Chemistry Project,
Pomona College, Claremont, CA.
3 Bodor, N. and Huang, M.-J. (1992) J.
PharmSci., 81,; p. 272.
4 Kantola, A., H. O. Villar, and G. Loew
(1991) J. Comput. Chem., 12, p. 681.
5 Hawker, D. W. and D. W. Connell (1988)
Environ. Sci. Technol., 22, p. 382.
6 Gobas, F. (1987) in: QSAR in Environ-
mental Toxicology II (K. L. E. Kaiser,
ed.), D. Reidel Publishing Co., p. 112.
7 Voogt, P. D., J. W. M. Wegener, J. C.
Klamer, G. A. Van Zijl and H. Govers
(1988) Biomed.and Environ.Sci. 1, p.
194. |
8 Pleiss, M. A. and G. L. Grunewald (1983)
J. Med. Chem., 26, p. 1760.
9 Mannhold, R., K. P. Dross and R. F. Rek-
ker (1990) Quant. Struct.-Act. Relat, 9,
p. 21.
10 Kim, K. H. and Y. C. Martin (1986) J.
Pharm. Sci, 75 p. 637.
NESC Annual Report - FY1994
81
-------
Parameterization of Octanol/ Water Partition Coefficients (LogP) Using 3d Molecular Properties: Evaluation
of Two Published Models for LogP Prediction

' Ann M. Richard, Carcinogenesis and Metabolism Branch, Health Effects Research Laboratory, U.S. Environmental
Protection Agency, Research Triangle Park, NC 27711 and Phillip F. Boone, Environmental Health Research and
Testing, Inc., Research Triangle Park, NC 27711. Current address: Integrated Laboratory Systems, Research
Triangle Park, NC 27711.
82
NESC Annual Report - FY1994
-------
Regional Acid Deposition Model (RADM) Evaluation1
EPA Research Objectives
Regional air quality models are needed
and used to extrapolate outside current con-
ditions, therefore, these advanced models
are developed with parameterizations and
physical and chemical mathematical
descriptions as close to first principles as
possible. The purpose of the evaluation is to
test the science incorporated into the
advanced models. Evaluation is diagnostic,
to explore quality of predictions and develop
an appraisal of model strengths and weak-
nesses. The data used were specially col-
lected for the Regional Air Deposition Model
(RADM) evaluation as part of the National
Acid Precipitation Assessment Program
(NAPAP) and a bi-national effort, the Eule-
rian Model Evaluation Field Study, (EMEFS).
The data were collected over two-year
period with special, several-week intensities
that used very advanced instruments to col-
lect air concentrations to provide data that
would support the most diagnostic testing.

Overview of Project
Early evaluation research concentrated on
examining the predictions for the sulfur
cycle. Significant improvements to the
RADM were accomplished (see references).
Current research continues to investigate
the nitrogen cycle, which is much more com-
plex. This investigation focuses on testing
the ability of the model to accurately repli-
cate in time and space the conversion (or
oxidation) of nitrogen oxides (NOX) to their
oxidized products, PAN and nitrates (particu-
late nitrate, NO3', and nitric acid, HNO3).
Measurements aloft taken by aircraft carry-
ing sophisticated instruments to measure air
quality in the EMEFS's 1990 aircraft inten-
sive and measurements at the surface at
two special sites, Scotia, PA and Egbert,
Ontario, are used for the diagnostic testing.

Background and Approach
The observations were developed from
measurements taken during ten aircraft
flights over a 45-day period from April 15 to
May 30,1990. The standard 80-km version
of RADM 2.6 was used to simulate the 45-
day period and a dlata-probe was "flown"
through the model,, These "data" from the
model are compared to equivalent data from
the aircraft measurements. Work from Fis-
cal Year 1993 showed that reducing grid
size had little effect on the rate of conversion
of NOX to PAN and HNO3, at least for grid
meshes ranging from 20 to 80 km. Thus,
the runs prepared for the April 15 to May 30,
1990 period focused on use of the 80-km
RADM that encompassed the aircraft flights.
Due to uncertainties in the NOX emissions
inventory, comparison runs were also made
with the new 1990 interim inventory.

Accomplishments Using the NECS's
Cray
The 80-km RADM runs for the April 15 to
May 30,1990 period required approximately
65 hours on a single-processor Cray Y-MP.
The reruns using the 1990 Interim Emis-
sions Inventory for a sub-period required
approximately 35 hours on a single proces-
sor of the NESC's Cray Y-MP. It required
about eight weeks to prepare the initial eval-
uation emissions and three weeks to pre-
pare the sub-set for testing the emissions
inventory. The model runs on the Cray were
able to be completed in about two weeks
and one week, respectively.
NESC Annual Report - FY1994
83
-------
Regional Acid Deposition Model (RADM) Evaluation

Scientific Results and Relevance to
EPA Mission
The full set of comparisons showed simi-
lar results to those from the 1988 period.
However, the meteorological model per-
formed much better for 1990 because it was
a more normal year, whereas 1988 was a
drought year with a significant portion of
rainfall being convective. The 1990 aircraft
results were very similar to those from 1988,
showing that photochemistry issues regard-
ing model performance are similar across
several seasons. Also, the results from the
surface sites during summer of 1988 were
very consistent with the aircraft
measurements.
This study enhances our understanding of
the working of regional model photochemis-
try for rural ambient concentrations condi-
tions. This in situ understanding is critical
because smog chambers cannot be used to
test the chemical mechanisms at the low
concentrations representative of regional
conditions. Proper computation of the pho-
tochemistry for rural conditions is important
to the ability of models to support explora-
tion of and establishment of appropriate
emissions controls to reduce and eliminate
violations of the ozone health standard.
Examination of nitrogen chemistry is impor-
tant because it is a central part of the oxida-
tion process forming ozone and because
rural oxidant production is generally
believed to be NOx-limited.
Future Objectives and Plans
The evaluation will continue with addi-
tional sensitivity studies directed at under-
standing biogenic emissions influences on
the RADM chemistry. Preliminary indica-
tions are that to improve ambient profiles of
isoprene concentrations, vertical resolution
will need to be increased from 15 to 30 lay-
ers. In addition, a new nested grid mesh
resolution set of 54-km coarse-grid and 18-
km fine-grid will be tested. Once the newly
adapted meteorological model has been
tested, roughly 400 of Cray Y-MP; hours will
be required to regenerate new meteorology
for the 1988 evaluation period, 60 of Cray
Y-MP hours to generate new 54-km RADM
results, and 300 of Cray Y-MP hours to gen-
erate new 18-km HR-RADM results for the
next round of diagnostic testing of the chem-
istry in RADM. !

Publications and Reports
Dennis, R.L., J.N. McHenry, W.R.| Barchet,
F.S. Binkowski, and D.W. Byuh, 1993:
Correcting RADM's sulfate underpredic-
tion: discovery and correction [Of model
errors and testing the corrections
through comparisons against jfield data.
Atmospheric Environment 27A, 975-
997. i
Cohn, R.D. and R.L Dennis, 1994: The
evaluation of acid deposition models
using principal component spaces.
Atmospheric Environment 28A(15),
2531-2543. [
1 Robin L. Dennis, Talat Odman, Richard D. Cohn, and Daewon Byun, U. S. EPA Regional Acid Deposition Model
Evaluation; EPA Atmospheric Research and Exposure Assessment Laboratory, Research Triangle Park, NC.
84
NESC Annual Report - FY1994
-------
Atmospheric Deposition of Nitrogen to Chesapeake Bay
EPA Research Objectives
Nitrogen is the primary cause of eutrophi-
cation in Chesapeake Bay. Nitrogen input
from the atmosphere represents a signifi-
cant source of nitrogen to the Bay (25-35%)
of the nitrogen loading. Water quality models
have incorporated atmospheric nitrogen, but
in a very simple manner. One objective of
this research is to provide more accurate
estimates of the quantity and the pattern of
nitrogen loading from the atmosphere to the
Chesapeake Bay watershed and the Bay
itself. These estimates will be provided as
inputs to the water quality models for the
watershed (the HSPF model adapted by the
Chesapeake Bay Program Office) and the
Bay (the 3-D Bay Water Quality model
developed by the Army Corps of Engineers).
Another objective of this research is to
determine the extent of the airshed that is
primarily responsible for the atmospheric
nitrogen affecting the Bay watershed. The
airshed will be larger than the watershed.
The overall purpose is to develop an under-
standing of which controls of NOX emissions
to the atmosphere will have the greatest
benefit on reducing the nitrogen loading to
coastal estuaries. This work is important to
the Chesapeake Bay Program Office's
efforts to achieve a 40% reduction in control-
lable nitrogen loading to the Bay by the year
2000 and to the upcoming 1996 Agency
decision on the amount of Phase 2 MOX
controls required by the 1990 Clean Air Act
Amendments.

Overview of Project
Development of more accurate spatial
fields of nitrogen loading estimates involves
estimation of annual average nitrogen depo-
sition to coastal areas using the Regional
Acid Deposition Model. These deposition
estimates are made for the new 1990 interim
emissions and representative meteorology.
Development of an understanding of the air-
shed influencing the Chesapeake Bay
watershed involves using the Regional Acid
Deposition Model (RADM) as a laboratory of
the real world to carry out sensitivity studies
that elucidate the contributions of different
emissions sources to the Bay watershed.
This source-receptor understanding is very
difficult to nearly impossible to develop from
empirical data and requires the designing of
sensitivity studies that will extract that infor-
mation from a mathematical model.

Background and Approach
Because the RADM is very computation-
ally intensive, it is not feasible, with today's
computing power to simulate an entire
year's worth of meteorology to develop
annual average estimates of deposition
loading. Instead, annual averages are
developed from a weighted average of a sta-
tistical sample of 30 five-day model runs.
The average is representative of meteorol-
ogy for the 1982 to 1985 period, which has a
rainfall pattern very close to a 30-year aver-
age. Meteorological events (synoptic pro-
gressions of high and low pressure systems)
with similar 850-mb wind-flow patterns were
grouped or classified by applying cluster
analysis to them. This resulted in 19 sam-
pling groups or strata. Meteorological cases
were randomly selected from each stratum,
based on the number of wind-flow patterns
in that stratum and on the number in each of
the other strata. This procedure approxi-
mates proportionate sampling. The number
of cases, 30, was set after carrying out a
sampling-error analysis on wet sulfur
NESC Annual Report - FY1994
85
-------
Atmospheric Deposition of Nitrogen to Chesapeake Bay
deposition and taking into consideration
computer resource limitations. These are
termed the aggregation cases.
Producing an estimate of the airshed
affecting the Bay watershed requires devel-
opment of a source-receptor understanding
on an annual basis. This development
requires an experimental design that will
extract this information from sensitivity stud-
ies with RADM. Because NOX emissions
contribute to oxidant production and there is
a dynamic interplay between the production
of ozone and nitric acid, the dominant form
by which nitrogen is deposited to the Earth's
surface, the modeling of nitrogen must incor-
porate full photochemistry as is done in
RADM. As a first approximation to the
source-receptor relations implicit in the
model calculations, a portion of the emis-
sions from sources of interest are subtracted
from the emissions fields. The 30 aggrega-
tion cases are run and the results subtracted
from results obtained with unperturbed
emissions fields. For this study, the objec-
tive was to develop an understanding of the
range of influence of ground-level NOX
emissions (such as automobiles) and upper-
level NOx emissions (such as power plants)
with regard to nitrogen deposition. From the
range of influence of different subregions,
the extent of the airshed can be estimated.
In addition, the same approach is used to
developing an estimate of the responsibility
of designated Bay states to deposition
across the Bay watershed.

Accomplishments Using NESC's Cray
Base 1990 emissions together with sub-
tractions of emissions from ten different
emissions areas roughly 200 km on a side
formed an effective "tagging" of the emis-
sions that was the core of the study. The ten
subregions span the potential regions of
influence on watershed deposition. The
determination of the range of influence of
the ten 'lagged" regions requiring 1,400
hours on a single-processor Cray Y-MP.
Runs to establish state-level responsibility
for nitrogen deposition required 850 hours
on a single processor Cray Y-MP. Initiation
of work for next year that will depend on 20
km resolution runs required 200 hours on a
single processor Cray Y-MP. The quantity of
CPU hours was large and was obtained over
two quarters, requiring close coordination
and cooperation between the NESC and the
RADM modeling team, a cooperation that
was successful. This project also! benefited
from 850 Cray Y-MP CPU hours of runs from
the Feasibility of Deposition Standards
Study. |

Scientific Results and Relevance to
EPA Mission :
The 'lagged" subregions study !on the
range of influence of NOX emissions on
nitrogen deposition as a function of the
height of the emissions (surface or tall-
stack) produced a somewhat unexpected
result. The range of influence of surface
emissions appears to be roughly 70-100% of
the range from tall stacks. This is different
than the "conventional" wisdom which would
"predict" that the range of influence from tall
stacks would be much greater. It is possible
that conventional wisdom has been influ-
enced by study of the sulfur system where
the primary specie, SO2, plays a significant
role in the total deposition. In the; nitrogen
system, nitrogen deposition is almost
entirely due to the secondary specie nitric
acid, HNOs (ignoring ammonia for the
moment). It is also likely that design effects
reduce the range computed by the model for
tall stacks from what it should be.
A comparison of the nitrogen 'lagged"
sub-regions with the tagged sulfur model for
the exact same subregion indicated the
range of influence of SO2 and NOX emis-
sions is very similar. This is not what was
expected, producing another surprise. This
result will need to be investigated further.
The distance over which NOX emissions
appear to noticeably influence the nitrogen
deposition is approximately 600 to 800 kilo-
meters. The comparability of the iSO2 and
NOX range of influence allowed us to use
the sulfur results to more precisely estimate
86
NESC Annual Report - FY1994
-------
or define the airshed affecting the Bay
watershed. The results of this study are that
the airshed affecting Chesapeake Bay is sig-
nificantly larger than the watershed. The air-
shed includes many sources along the
middle and upper Ohio River and it was sur-
prising how far down the Ohio River utility
sources were influencing the Chesapeake
Bay watershed. The airshed is roughly
200,000 to 220,000 mi2, more thanViree
times the watershed's 64,000 mi2. These
results are very important and, as intended,
have effectively set the stage for Fiscal Year
1995 work.

Future Objectives and Plans
Future plans call for a major shift to devel-
oping annual average nitrogen deposition
using the 20 km High Resolution RADM.
Higher resolution is needed to better resolve
urban influences, major point source influ-
ences and deposition to water surfaces for
more accurate linkage with the water quality
models. This shift will quadruple the CPU
requirements for each study. Two major
investigations are needed by the Chesa-
peake Bay Program Office. The first relates
to estimating the reductions in deposition
that are expected to occur due to the 1990
Clean Air Act and to implementation of more
stringent reductions in NOx emissions due
to requirements stemming from oxidants,
rather than acid rain. This study is expected
to require on the order of 2,000 Cray Y-MP
CPU hours. The second investigation
relates to defining the influence of urban
Atmospheric Deposition of Nitrogen to Chesapeake Bay

areas on nearby estuaries and defining the
portion of the nitrogen deposition that is due
to the urban areas. A modeling study is
required because monitoring data are lack-
ing. The second study is also expected to
require 2,000 Cray Y-MP hours.

Publications and Reports
Results from the 1994 studies are being
presented at the Annual SETAC Meeting in
November 1994. A book will be produced
after the meeting with a chapter describing
the results of these studies.

Figures
Figure 1, page 88. Range of integrated total
nitrogen deposition from a NOX source
region at the watershed boundary
(Southwest Pa/Northern WV), showing
the overlap with the Chesapeake Bay
watershed, as simulated by the Regional
Acid Deposition Model.
Figure 2, page 89. Range of integrated
total nitrogen deposition from a NOx
source region far from the watershed
boundary (Cincinnati Area), showing the
overlap with the Chesapeake Bay water-
shed, as simulated by the Regional Acid
Deposition Model.
Figure 3, page 90. Estimated boundary of
the airshed within which NOx emissions
significantly affect the Chesapeake Bay
watershed compared to the watershed
boundary.
1 Robin L. Dennis and Lewis C. Linker, U. S. EPA, Regional Acid Deposition Model Applications; EPA Atmospheric
Research and Exposure Assessment Laboratory, Research Triangle Park, NC 27711.
NESC Annual Report - FY1994
87
-------
Atmospheric Deposition of Nitrogen to Chesapeake Bay
RADM TOTAL NITROGEN DEPOSITION
INTEGRATED PERCENT CONTRIBUTION FROM
SUBREGION 13, SOUTHWEST PA/NORTHERN WV
1985 (100* REDUCTION IN NOX AND SO2)
/•*
Figure 1: Range of Integrated Total Nitrogen Deposition From a NOx Source at the
Watershed Boundary.
88
NESC Annual Report - FY1994
-------
Atmospheric Deposition of Nitrogen to Chesapeake Bay
RADM TOTAL NITROGEN DEPOSITION
INTEGRATED PERCENT CONTRIBUTION FROM
SUBREGION 22, CINCINNATI AREA ;
1985 (100% REDUCTION IN NOX AND S02) i
Figure 2: Range of Integrated Total Nitrogen Deposition From a NOX Source Region Far
From the Watershed Boundary.
NESC Annual Report - FY1994
89
-------
Atmospheric Deposition of Nitrogen to Chesapeake Bay
WATERSHED AND AJBSHED OUTLINES
Figure 3: Estimated boundary of the Airshed Within Which NOX Emissions Significantly
Affect The Chesapeake Bay Watershed. ,
90
NESC Annual Report - FY1994
-------
Study of the Feasibility of Acidic Deposition Standards1
EPA Research Objectives
Developing accurate estimates of the
impact of the 1990 Clean Air Act Amend-
ments (CAAA) on acidic deposition and
atmospheric sulfate (key to visibility degra-
dation in the Eastern United States) are
important to the Agency. The amount of
reduction in sulfur deposition to be antici-
pated by 2005 or 2010 due to implementa-
tion of Title IV, Phases I and II sets an
important baseline for understanding how
much mitigation in deposition is expected
and how much farther we might need to go
to provide protection to ecological
resources. The reduction in deposition load-
ing in Canada that is likely coming from the
United States and vice versa is important to
the U.S. Canada Air Quality Accord. These
estimates are important to the Canadians for
them to project whether they will achieve
their goal of wet sulfur deposition being
below 20 kg-SOf/ha. As well, the European
community is interested in estimates of the
long-range transport across the Atlantic of
sulfur-related acidic deposition that could be
affecting them. The objective is to develop
best estimates from evaluated and well-
characterized models of changes in acidic
deposition loading, visibility impacting pollut-
ants, and oxidants. Also, the cross-program
effects from the different Titles of the 1990
CAAA need to be characterized.

Overview of Project
Development of estimates of future depo-
sition involves creation of estimates of future
emissions that account for population and
economic growth plus the incorporation of
emissions controls called for by the 1990
CAAA. The new emissions are input to
RADM simulations to estimate the new
deposition, assuming the same meteorology
as today's. A difficult element for the projec-
tion of future emissions is the estimation of
power-plant retirements, and the installation
of new generating capacity to make up the
difference in on-line capacity and projected
demand. Of special difficulty is locating or
siting potential future plants for the model-
ing. These projections are generated by
experts in the field of emissions estimation
and projection. While the project focused on
sulfur deposition, in the course of the study it
became clear that nitrogen deposition was
also important. Therefore, defining the dep-
osition caused by the utility sector separate
from the other sectors became important. In
addition, the reduction that could be
obtained by enhanced reduction in utility
NOX emissions took on elevated importance
for the study. ;
i
Background and Approach
Because the RADM is very computation-
ally intensive, it is not feasible, with today's
computing power to simulate an entire years
worth of meteorology to develop annual
average estimates of deposition loading.
Instead, annual averages are developed
from a weighted average of a statistical
sample of 30 five-day model runs. The aver-
age is representative of meteorology for the
1982 to 1985 period, which has a rainfall
pattern very close to a 30-year average.
Meteorological events (synoptic progres-
sions of high and low pressure systems)
with similar 850-mb wind-flow patterns were
grouped or classified by applying cluster
analysis to them. This resulted in 19 sam-
pling groups or strata. Meteorological cases
were randomly selected from each stratum,
based on the number of wind-flow patterns
NESC Annual Report - FY1994
91
-------
Study of the Feasibility of Acidic Deposition Standards
in that stratum and on the number in each of
the other strata. This procedure approxi-
mates proportionate sampling. The number
of cases, 30, was set after carrying out a
sampling-error analysis on wet sulfur depo-
sition and taking into consideration com-
puter resource limitations. These are
termed the aggregation cases.
Several emissions projection scenarios
were developed for the year 2010. They
included reductions expected due to the
1990 CAAA under assumptions of trading
and no trading of SO2 emissions allocations.
They also included reductions beyond the
1990 CAA requirements for both utility and
industrial emissions. Very importantly, the
tagged sulfur engineering model was run for
53 emissions sub-regions for both 1985
emissions and 2010 CAAA-projected
emissions.
The 1990 Interim emissions were used as
a basis for assessing the responsibility of
the utility sector for nitrogen deposition and
for estimating the reduction in deposition
that might occur due to a 50% reduction in
utility emissions. The same technique as
used for the Chesapeake Bay long-range
influence study was used for this study. A
portion of the emissions from the sector of
interest was subtracted from the base and
the difference, scaled to 100%, defined the
magnitude of the responsibility for
deposition.

Accomplishments Using the NESC's
Cray
While the tagged engineering model runs
did not use the Cray, it is noteworthy that
more than 3,800 model runs on EPA's IBM
were required as part of this study. To
establish the sector responsibility for deposi-
tion (utilities, mobile sources, industry and
other) and to assess the effect a 50% reduc-
tions in utility and mobile emissions requires
approximately 850 Cray Y-MP CPU hours.
Scientific Results and Relevance to
EPA Mission
The reductions of total sulfur deposition
due to the Title IV acid rain controls are not
very protective and substantial reductions
beyond controls currently mandated may be
needed. In addition, reduction of nitrogen
deposition will be required. The results with
the Tagged Engineering model showed that
the distribution of responsibility fqr deposi-
tion shifts from 1985 to 2010 because the
largest sources reduce most and Isome of
the smallest source regions actually
increase. The tagged results were also
used to study the feasibility of geographi-
cally targeting emissions reductions while
still achieving deposition targets at the sen-
sitive aquatics regions. For modest deposi-
tion goals, geographic targeting was
feasible. For more stringent deposition
goals, geographic targeting became less
practical.
For nitrogen deposition, the spatial distri-
bution of emissions is such that utility emis-
sions and mobile emissions have! distinct
regions of influence. To address acidic dep-
osition at several sensitive aquatic regions,
control of both utility and mobile sources
may be needed. Therefore, if nitrogen dep-
osition is to be reduced at the right places,
then both acidic-deposition motivated and
ozone-motivated control of NOX emissions
may be needed. This is consistent with the
new emphasis in EPA on considering the full
range of multi-media and multi-program
effects and taking a more holistic perspec-
tive towards pollution control. '•,
I

Future Objectives and Plans
Future plans call for a more explicit study
of the ability to trade SO2 reductions for
NO/ reductions. In addition, benefit to
acidic deposition of oxidant-motivated con-
trols of NOX emissions will be studied
together with their benefit to nitrogen loading
reduction for coastal estuaries. \
92
NESC Annual Report - FY1994
-------
Study of the Feasibility of Acidic Deposition Standards
Publications and Reports
A report to Congress is in the process of
being reviewed for release early in FY 1995.

Figures
Figure 1. Map of the percent contribution
from utility NOX emissions to the anthro-
pogenic portion of the total (wet + dry)
annual nitrogen deposition for the RADM
modeling domain.
Figure 2, page 94. Map of the percent con-
tribution from mobile source NOX emis-
sions to the anthropogenic portion of the
total (wet + dry) annual nitrogen deposi-
tion for the RADM modeling domain.
RADM TOTAL (WET+ORY) OiTOmON
CASE
UTUTY SOURCES
OF fMO i
Figure 1: Utility NOX Emissions Contributions.

1 Robin L. Dennis, U. S. EPA, Regional Acid Deposition Model Applications; EPA Atmospheric Research and
Exposure Assessment Laboratory, Research Triangle Park, NC.
NESC Annual Report - FY1994
93
-------
Study of the Feasibility of Acidic Deposition Standards
RADM TOTAL (WST+DfWI NlieD<3EN
(EXCLUONG
MOBILE
OF 1990 8ASE
Figure 2: Mobile NOX Emissions Contributions.
94
NESC Annual Report - FY1994
-------
1990 Clean Air Act Section 812 Retrospective Study1
EPA Research Objectives
In the 1990 Clean Air Act Amendments, in
Section 812, Congress asked for a retro-
spective assessment of the benefits and
costs of the 1970 Clean Air Act. A multi-pol-
lutant assessment is called for that focuses
on criteria pollutants (SO2, NOX, and O3),
particulates, sulfates, visibility and acidic
deposition. The basic objective is to
develop a retrospective assessment. Thus,
estimates are to be developed of the change
in pollutant concentration and loads that
would have occurred had the 1970 CAA not
been enacted, and contrast these with the
pollutant concentrations and loadings that
historically occurred. The pollutant loading
estimates are to be linked to effects models
to generate estimates of effects. Costs and
benefits associated with implementation
against non-implementation are to be com-
pared and contrasted.

Overview of Project
The 812 Retrospective project involves a
coordinated effort between the Office of Pol-
icy Analysis Research (OPAR), the Office of
Program Planning and Evaluation (OPPE),
and the Office of Research and Develop-
ment (ORD). It is being coordinated and
tracked at the Assistant Administrator Level
within the Agency. This study has been
mandated by Congress and is being tracked
by the Government Accounting Office
(GAO).
All pollutants are being addressed in this
investigation at both the urban and regional
scale. Models are required, both of emis-
sions and of air quality, to develop estimates
of improvements expected due to full imple-
mentation of the 1990 CAAA, and estimates
of increases in air pollution that would have
accrued had not the CAAA been enacted.
The RADM model is being used because
it provides the advantage of being able to
consistently address in the same modeling
system regional SO2, sulfate, visibility deg-
radation due to sulfates, acidic deposition for
sulfur and nitrogen species, and ozone.
RADM inclusion of clouds and precipitation
makes it a tool of choice for predicting
regional ozone during summer months for
welfare (crop and terrestrial) effects.

Background and Approach
This project requires estimates of annual
averages for acidic; deposition (sulfur and
nitrogen), annual pollutant distributions for
sulfate and sulfate-associated visibility deg-
radation, and seasonal distributions of
oxidants (ozone).
Because the RADM is very computation-
ally intensive, it is not feasible, with today's
computing power to simulate an entire years
worth of meteorology to develop annual
average estimates of deposition loading or
annual concentration distributions. Instead,
annual averages and distributions are devel-
oped from a weighted average of a statisti-
cal sample of 30 five-day model runs. The
average is representative of meteorology for
the 1982 to 1985 period, which has a rainfall
pattern very close to a 30-year average.
Meteorological events (synoptic progres-
sions of high and low pressure systems)
with similar 850-mb wind-flow patterns were
grouped or classified by applying cluster
analysis to them. This resulted in nineteen
sampling groups or strata. Meteorological
cases were randomly selected from each
stratum, based on the number of wind-flow
patterns in that stratum and on the number
in each of the other strata. This procedure
approximates proportionate sampling. The
number of cases, 30, was set after carrying
out a sampling-error analysis on wet sulfur
deposition and taking into consideration
NESC Annual Report - FY1994
95
-------
1990 Clean Air Act Section 812 Retrospective Study
computer resource limitations. These are
termed the aggregation cases.
Annual average sulfur deposition, sulfate
distributions, and visibility degradation distri-
butions are developed with the RADM sul-
fur-only Engineering Model together with
aggregation. Annual average nitrogen dep-
osition estimates are developing using the
full RADM with aggregation. A sample of
the 1988 ozone season simulated with the
High Resolution RADM is used to estimate
the seasonal distribution of ozone. Histori-
cal data is used to establish the actual distri-
butions. The model is used to provide an
estimate of the relative change in the distri-
bution by simulating both the control and no-
control cases for the same average meteo-
rology. Because biogenic emissions are
highly uncertain and a new estimate for iso-
prene emission fluxes came out during the
study that was approximately three times
higher, a special sensitivity study regarding
the ozone simulations was also carried out
as part of the Retrospective Study to
develop a judgment of uncertainties
involved.

Accomplishments Using the NESC's
Cray
The RADM Engineering Model was run on
EPA's IBM mainframe. The RADM and High
Resolution RADM were run on the NESC's
Cray. To simulate the change in acidic dep-
osition for 1980 and 1990 required 550 Cray
Y-MP CPU hours. To create the seasonal
estimates of the ozone distributions for 1980
and 1990 required approximately 500 Cray
Y-MP CPU hours. The ozone sensitivity
study with respect to isoprene emissions
required approximately 600 Cray iY-MP
hours during FY 1994. This sensitivity study
is not yet complete. More than a third of the
CPU hours were needed by the sensitivity
study. It is estimated that the ozone sensi-
tivity study will eventually take half of the
total hours spent on the 812 Retrospective
study. This is crucial to the credibility of the
overall study. However, sufficient CPU
resources often are not available to allow an
appropriate development of uncertainty
estimates.

Scientific Results and Relevance to
EPA Mission
The main conclusion thus far is from the
uncertainty study. The relative change in the
distribution of ozone simulated by the
regional model is fairly insensitive to the
uncertainty in biogenic isoprene emissions,
even though the absolute ozone concentra-
tions that were predicted are sensitive to the
uncertainty. Other results are waiting guid-
ance from the effects groups as to how the
RADM outputs should be expressed. The
dose-response functions that will be used
are under Agency review.

Future Objectives and Plans;
Once the 812 Retrospective Study is com-
pleted, the Agency expects to learn from the
experience and design a Prospective Study,
also mandated by Congress. We [expect the
RADM model to be involved nereis well.
Publications and Reports
None as yet.
1 Robin L. Dennis and Jim DeMocker, U. S. EPA, Regional Acid Deposition Model Applications; EPA Atmospheric
Research and Exposure Assessment Laboratory, Research Triangle Park, NC. ;
96
NESC Annual Report - FY1994
-------
•. ., - . ,.;^r-.._ J\'"
The Role of Supercomputers in Pollution Prevention:
Predicting Bioactivation for the Design of Safer Chemicals1
Chemical design has traditionally focused
upon developing chemicals that perform a
specific function (e.g. solvent, reagent, dye,
etc.), with essentially no consideration given
to avoiding the inherent toxicity or hazard-
ous nature that chemicals often have. Years
of this reasoning has resulted in the synthe-
sis and release into the environment of enor-
mous quantities of many toxic chemicals.
The toxic effects that many of these chemi-
cals have had on human health and the
environment as a result of their production
and use has become a major societal con-
cern. From this concern, the concept of pol-
lution prevention evolved and the Pollution
Prevention Act was passed by Congress in
1990. Pollution Prevention has become the
central environmental ethic of the Environ-
mental Protection Agency (EPA) and the
nation.
The ultimate approach to pollution preven-
tion is source prevention or, more simply, not
to create toxic chemicals in the first place.
The most desirable and efficient way of pre-
venting toxic effects that chemicals may
have on human health and the environment
is during the design of new chemicals or the
redesign of existing chemicals. That is, one
of the best ways to prevent pollution is to
design chemicals such that they are not only
useful industrially or commercially, but that
they are not toxic as well. Not only is human
and environmental harm reduced, avoided,
or alleviated, but the cost of any regulatory
action in terms of job loss and capital invest-
ment is minimized. In responding to the
Administration's and EPA Administrator
Carol Browner's requests for EPA to incor-
porate pollution prevention whenever and
wherever possible, scientists of EPA's Office
of Pollution Prevention and Toxics (OPPT)
have recently initiated a new source preven-
tion initiative called "Designing Safer Chemi-
cals". The underlying principles of this
initiative are: 1) consideration of the toxicity
of existing chemicals already in commerce
and of new chemicals before they are manu-
factured and used; and 2) when necessary,
make structural modifications that reduce
toxicity without affecting overall commercial
usefulness (i.e., efficacy).
The toxicity of many chemicals is attribut-
able to their metabolism. That is, following
absorption, most chemicals are converted
(metabolized) in the body to other chemi-
cals. These latter chemicals (metabolites)
can be nontoxic, but are often toxic. The
metabolism of a chemical to toxic sub-
stances is known as bioactivation, and
accounts for the toxicity of most chemicals
that are known to tie toxic. Knowledge of
the metabolites that are formed, and the
structural requirements that lead to their for-
mation can be highly useful for designing
chemicals such tha,t they will not form toxic
metabolites. Using EPA's supercomputer to
perform ab initio calculations that otherwise
would be impossible on large databases, we
are developing models that can be used to
predict the tendency of a chemical to be bio-
activated. Using the supercomputer, we
have already developed and validated a
model that is not only useful for predicting
the hazard potential of new, untested chemi-
cals, but also for the design of less toxic and
equally efficacious analogs of the chemical.
Other models are currently being developed
using the supercomputer.
' Stephen C. DeVito, Office of Pollution Prevention and Toxics (OPPT), U.S. Environmental Protection Agency (Mail
Code 7406), Washington, DC 20460.
NESC Annual Report - FY1994
97
-------
The Role of Supercomputers in Pollution Prevention: Predicting Bioactivation for the Design of Safer
Chemicals
98 NESC Annual Report - FY1994
-------
Application of Molecular Orbital Methods to Qsar And Lfer:
Understanding and Characterizing interactions of Solutes
and Solvents1'2
Introduction
Quantitative Structure Activity Relation-
ships (QSAR) and linear free energy rela-
tionships (LFER) have been used widely to
correlate molecular structural features with
their known biological, physical and chemi-
cal properties1-7. QSAR and LFER assume
there is a quantifiable relationship between
the microscopic (molecular) and macro-
scopic (empirical) properties in a compound
or set of compounds. Based on the
assumption that properties can be related to
the change in free energy (AG), and that this
AG is dependent on structure, the original
relationships were identified and quantified
by Burkhardt and Hammett8-11. Hansch
eventually applied this technique to medici-
nal chemistry with the realization that the
octanol/water partition coefficient would pro-
vide reasonable correlations with many bio-
logical activities2.
The application of QSAR and LFER meth-
ods to predicting and understanding solute/
solvent interactions has been a relatively
recent development, primarily conceptual-
ized by Kamlet and Taft with the linear solva-
tion energy relationship (LSER) and the
solvatochromic parameters12-17. A number of
additional regressions have also appeared,
using additional descriptors to correlate sol-
vation effects18.
The LSER methodology, however, does
provide one advantage that most QSAR
and LFER regressions lack, a single set of
four descriptors that are used in every corre- •
lation. The advantage here is obvious: One
is able to immediately and directly compare
different data sets and activities and have a
base of reference. The generalized LSER,
formalized by Kamlet and Taft is shown in
Equation 1.
The LSER parameters (molar volume,
hildebrand solubility parameter, and three
spectroscopically (derived parameters
describing polarizability, hydrogen bond
acidity and hydrogen basicity) have been
used to successfully correlate over 250 dif-
ferent properties involving solute/solvent
interactions. The LSER has led to a much
better understanding of the effects of solvent
on properties.
Based on the LSER methodology, we
have developed a new set of theoretically
derived parameters for correlations in QSAR
and LFER relationships. Termed the theoret-
ical linear solvatiori energy relationship
(TLSER), this methodology has been
applied to a numbeir of diverse data sets cor-
relating general toxicity, specific receptor-
based toxicity, solute/solvent based physical
properties and UV-visible spectral shifts19-26.
In each case, the resulting TLSER regres-
sion is equivalent or better in correlative
capability to the similar LSER or classical
QSAR regression.
This paper provides a general overview of
the TLSER, and presents three examples of
how the TLSER descriptors have been used
to provide insight into solute/solvent interac-
tions. In particular, three different solute/sol-
vent properties are presented that provide
an example of the correlative ability of the
TLSER.
LOG (Property) =
bulk/cavity + polarizability/dipolarity
+ hydrogen bonding

Equation 1
NESC Annual Report - FY1994
99
-------
Application of Molecular Orbital Methods to Qsar And Lfer. Understanding and Characterizing interactions
of Solutes and Solvents
Methods
All geometries were optimized using the
MNDO algorithm as contained within
MOPAC vG.O27-28. Table 1 lists the com-
pounds used in this study. The experimental
data is taken directly from the original
sources29-35. Visualization and structure
entry were performed using the in-house
developed Molecular Modeling Analysis
and Display System and PC Model36'37. All
multiple regressions were performed using
Minitab38.
The TLSER descriptors were taken
directly from the MNDO calculations. These
descriptors consist of six molecular parame-
ters that attempt to describe the important
features involved in solute/solvent interac-
tions. These parameters were developed
from and are modeled after the LSER meth-
odology. The same generalized equation as
the LSER, as shown in Equation 1, page 99,
is applicable to the TLSER.
The bulk/steric term of the TLSER is
described by the molecular van der Waal's
volume (Vmo), given in cubic angstroms, and
is computed by the method of Hopfinger39.
The dipolarity/polarizability term uses the
polarizability index (n\), obtained by dividing
the polarization volume40-41 by the molecular
volume to obtain a unitless quantity42. The
resulting 7Cl is not generally correlated with
Vjnc and defines the ability of electron cloud
to be polarized by an external field.
The hydrogen bonding term from Equa-
tion 1 is divided into two effects in the LSER
approach, a hydrogen bond acidity (HBA)
and a hydrogen bond basicity (HBB). In the
TLSER, the HBA and HBB effects are fur-
ther subdivided into two contributions each,
covalent acidity and basicity, and electro-
static acidity and basicity. The covalent
HBB contribution is defined as the molecular
orbital basicity (sb), and is computed by sub-
tracting the energy of the highest [occupied
molecular orbital (HOMO) in the Substrate
from the energy of the lowest unoccupied
molecular orbital (LUMO) of water. The
covalent HBA contribution is the molecular
orbital acidity (sa), and is computed in an
analogous manner, with the energy of the
HOMO of water being subtracted from the
energy of the LUMO of the substrate. The
electrostatic HBB, or the electrostatic basic-
ity (q~) is the magnitude of the most negative
formal charge in the molecule. The electro-
static acidity (q+) is the value of the most
positive hydrogen. Both q+ and q" are
derived directly from the Mulliken iCharges.
The general form for this equation, then, is
shown in Equation 2. P is the property of
interest and P0 is the intercept. It is impor-
tant to note that although there are six
descriptors in the generalized model, most
correlations reduce to a 2-4 parameter u.

Results and Discussion
The TLSER can be used to characterize
three different types of data sets: a) multiple
solutes in a single solvent, b) multiple sol-
vents with a single solute, and c) multiple
solutes in multiple solvents. In each case,
the information derived from the solute/sol-
vent interaction is different. In case (a), the
resulting coefficients describe the effect of
the solvent in the process, and the parame-
ters provide the details of the solute. In
case (b) the reverse is true; the coefficients
provide the effects of the solute in the pro-
cess, and the descriptors provide informa-
tion about the solvents. Finally in' case (c),
which is the most general, the coefficient
describe the process independent of solute
or solvent, as descriptors are present to
describe both solute and solvent.
LOG P = PQ + aVmc + bftj + ca +dq' + e sa + fq+
Equation 2
100
NESC Annual Report - FY1994
-------
Application of Molecular Orbital Methods to Qsar And Lfer: Understanding and Characterizing Interactions
of Solutes and Solvents
solvents provides a unique probe with which
to study solute/solvent interactions35'46. The
data set from Ruoff, et al, was used to gen-
erate a TLSER35. It is of importance to note
that although the TLSER descriptors now
relate to the solvent (whereas in case (a) the
descriptors referred to the solute), it is the
same set that is used in case (a). The
resulting regressions is shown in Equation 4.
From this regression, increased solvent
volume and covalent acidity and basicity
increase solubility of 060, where increased
electrostatic basicity decreases solubility. In
this case, the covalent terms have been
slightly modified, with the e« and sa being
divided by 100 and subtracted from 0.3 in
order to present an increasing scale (larger
numbers indicate better acids or bases,
depending on the scale). The dependence
on volume can be seen as an indication of
the effect of the Hildebrand solubility param-
eter (8h is proportional to 1/V). This does
not explain why V is significant although 8h
is not. The dependence on both epsilons
indicate that the solubility is dependent on
primarily interactions between orbitals, and
not electrostatic interactions. This is further
reinforced by the negative dependence on
solvent electrostatic basicity.

Case C: Al-L of t-Butvl Haliries in Alcohols
and Water ;

The final case to be described is the
enthalpy of solution of t-butyl chloride, bro-
mide and iodide in water and 13
Case A: Solubility in Supercritical COo
As an example of case (a), let us exam-
ine the case of solubility of solutes in super-
critical carbon dioxide24'31-33'43-45. Solubility in
supercritical media has been recently
reviewed by a number of researchers.
Reactions in supercritical media is currently
being investigated as a means of enhancing
the reactivity of hazardous materials. The
resulting regression, using literature data of
solubility in supercritical CO2 is shown in
Equation 3.
The regression agrees very favorably to
the published regression of Politzer. By
examining the coefficients, it is possible to
speculate about the important aspects of the
solubility in supercritical carbon dioxide.
The electrostatic basicity, q_, is negative
suggesting an inverse relationship between
basicity and solubility. Likewise, the molecu-
lar orbital basicity is positive, indicating a
similar inverse relationship. The electro-
static acidity is also important in this regres-
sion, suggesting increased solubility as the
solute hydrogen bond acidity is increased.
The final term, n\, is also negative, indicating
harder solutes (in the Pearson or Drago
sense) would be better solvate. This is
entirely consistent with accepted models of
acidity and basicity.

Case B: Solubility of C60 in Various
Solvents
The carbon structure of C60 and the
method by which it is solvated by different
l4MPa
LOG S CQ = -6.0377tj+10.44080 -22.098^+24.350^-8.370

2 N=19 R=0.928 sd=0.477 F=22

Equation 3
LOGS = 1.515 -—- +45.761 Sp -3.081 q. +51.81 Osa -16.895

N=45 R=0.890 sd=0.789 F=37 ;

Equation 4 ',
NESC Annual Report - FY1994
101
-------
Application of Molecular Orbital Methods to Qsar And Lfer: Understanding and Characterizing Interactions
of Solutes and Solvents :
alcohols3*-47-*8. This presents a unique data
set for two reasons. First, it is one of only a
few thermodynamic properties we have
examined using the TLSER. Second, the
data sets consists of information on three
solutes in 14 solvents, providing the possibil-
ity of a multiple solute/multiple solvent
regression. Goncalves and coworkers mea-
sured this data and correlated it against a
series of more "conventional" LFER type
descriptors, including dipole moment, the
Kirkwood dielectric function, molar volume,
and the Reichardt-Dimroth ET. Both individ-
ual correlations for each halide and for the
combination was presented by Goncalves.
The corresponding TLSER regression for
the chloride, bromide and iodide is shown in
Equations 5 through 7, respectively.
These regressions do show the similarity
in the behavior of each of the halides in the
set of hydroxylic solvents. In each case, the
same descriptors are significant, and in each
case, the same descriptor is the "most
important" (based on the t statistic, not
shown). In each case, there is an inverse
relationship between the molecular orbital
acidity (8a is inversely related to the acidity,
so a positive coefficient indicates an inverse
relationship). Further, solvent electrostatic
basicity leads to lower values of AHS, and
increased solvent electrostatic hydrogen
acidity leads to a higher value of AHS.
Based on the assumptions and previous
findings of Goncalves, each of these
descriptions, along with the sign of the coef-
ficient, behaves as expected. Further, the
inclusion of each of; these descriptors
makes "chemical sense". This is a vital part
of the TLSER, and one to which we have
attempted to adhere.
In addition to regressions based on the
individual solutes, a single regression can
be obtained from the combination of the
enthalpies of all three cases. In this case,
the generalized TLSER equation is modified
to account for solute as well as solvent. This
form is shown in Equation 8, page 103.
This equation is identical to Equation 2,
page 100, except that those descriptors with
subscript 1 refer to the solvent, and those
descriptors with subscript 2 refer to the sol-
ute. As with the individual solute equations,
not all descriptors are statistically significant
(at the 0.95 confidence level). Further,
those terms that are significant in Equations
5 - 8, page 102 and page 103 are, for the
most part, also significant in the combined
equation, which is shown as Equation 9,
page 103.
The only aspect of the overall equation
that is not apparent in Equations 5 through 8
AHS = 4.827-r^f + 7.6378a - 749.35q. + 895.70q+ - 37.05
's ^•v"-' 100
N=14 R=0.965 sd=0.513 F=31
Equation 5
AHU = 4.056-
8.6518a - 898.90q. +934.90q+ - 35.14
N=14 R=0.959 sd=0.599 F=26

Equation 6

Vmc
9.343sa - 913.10q. + 1114,30q+ - 48.92
N=14 R=0.961 sd=0.608 F=27

Equation 7
102
NESC Annual Report - FY1994
-------
Application of Molecular Orbital Methods to Qsar And Lfer: Understanding arid Characterizing Interactions
of Solutes and Solvents
is the inclusion of the solvent molecular
orbital basicity. Plus, the solvent electro-
static basicity is no longer significant in the
generalized equation. A more detailed
description of this data set, as well as a
complete review of the Goncalves regres-
sions, is currently in preparation, and is
being submitted to the Journal of Physical
Organic Chemistry tor publication.

Conclusions
The TLSER methodology attempts to pro-
vide an alternate formalism to the classical
LFER and QSAR approaches, but still stay-
ing within the concepts advanced by Kamlet
and Taft in the LSER. Further, our efforts
have been geared toward identifying a con-
sistent set of parameters calculated solely
from molecular orbital methods, that can be
used as a replacement for the LSER based
solvatochromic parameters. The TLSER
has been shown to provide correlations on
the same order as the LSER when direct
comparisons are made, and significantly
better than classical QSAR approaches.
The examples provided here further rein-
force the usefulness of the TLSER, and
describe three very diverse types of interac-
tions dependent on solute/solvent interac-
tions. The TLSER provides a description of
macroscopic properties in terms of molecu-
lar orbital derived microscopic descriptors.
Further, due to the nature of these descrip-
tors, these descriptors usually do not cross
correlate (although this is not always the
case). Finally, the TLSER provides a funda-
mental capability of allowing for the a priori
prediction of properties of new compounds.

References
1 Gupta, S. Chem Rev 1987, 87,1183.
2 Hansch, C.; Fujita, T. J. J Am Chem Soc
1964,86,1616.
3 Kier, L. B.; Hall, L. H. Connectivity in
Structure-Activity Analysis; Research
Studies Press: Letchworth, 1986.
4 Lewis, D. F. V. In Progress in Drug
Metabolisms^; Vol. Plenum Press.
5 Reichardt, C. Solvents and Solvent
Effects in Organic Chemistry, Second
ed.; VCH: New York, 1988.
6 Shorter, J. Quart Rev 1970, 27, 44.
7 Silipo, C.; Vittoria, A. QSAR: Rational
Approaches to the Design ofBioactive
Compounds; Elsevier: New York, 1991;
Vol. 16, pp 573.
8 Burkhardt, G. N. Nature (London) 1935,
17,684. |
9 Hammett, L. P. Chem Rev-[935, 17,125.
10 Hammett, L. P. J Am Chem Soc 1937,
59,96. . i
11 Exner, O. In Correlation Analysis of
Chemical Data; J. Shorter, Ed.; Plenum
Press: New York, 1988; pp 25.
12 Kamlet, M. J.; Taft, R. W.; Abboud, J.-L.
M. J/4CS1977, 91, 8325.
13 Kamlet, M. J.; Taft, R. W. Prog Org Chem
1983, 48, 2877,,
14 Kamlet, M. J.; Abraham, M. A.; Doherty,
R. M.; Taft, R. W. JACS1984, 106, 464.
15 Kamlet, M. J.; Abraham, M. A.; Doherty,
R. M.; Taft, R. W. Nature 1985, 313, 384.
LOG P = a8h1Vmc2 + b TC
Equation 8
AHS = -0.502-^- -0.764Sp+0.531Sa-3049.0q.+43.12

Equation 9
NESC Annual Report - FY1994
103
-------
Application of Molecular Orbital Methods to Qsar And Lfer: Understanding and Characterizing ^Interactions
of Solutes and Solvents
16 Kamlet, M. J.; Taft, R. W. Acta Chem
Scand 1986, Part B 40, 619.
17 Kamlet, M. J.; Doherty, R. M.; Abraham,
M. H.; Taft, R. W. QSAR 1988, 7, 71.
18 Kamlet, M. J.; Taft, R. W.; Famini, G. R.;
Doherty, R. M. Acta Chem Scand 1987,
41, 589.
19 Famini, G. R. Using Theoretical Descrip-
tors in Structure Activity Relationships"
V., U.S. Army Chemical Research,
Development and Engineering Center,
1989.
20 Famini, G. R.; Penski, C. A.; Wilson, L. Y.
J Phys Org Chem 1992, 5, 395.
21 Famini, G. R.; Ashman, W. P.; Mick-
iewicz, A. P.; Wilson, L. Y. QSAR 1992,
11,162.
22 Famini, G. R.; DeVito, S. C.; Wilson, L. Y.
In ACS Symposium Series on Biomark-
ers; M. Saleh and J. Blancato, Ed.; 1992.
23 Famini, G. R.; Marquez, B. C.; Wilson, L.
Y. J Chem Soc P//1993, 773.
24 Famini, G. R.; Wilson, L. Y. J Phys Org
Chem 1993, 6, in press.
25 Wilson, L Y; Famini, G. R. J Med Chem
1991, 34,1668.
26 Cramer, C. J.; Famini, G. R.; Lowrey, A.
H. Accts Chem Res 1993, In Press.
27 Dewar, M. J. S.; Thiel, W. JACS1977,
99, 4899.
28 Stewart, J. J. P. "MOPAC Manual, Sixth
Edition", U.S. Air Force Academy, 1990.
29 Bartle, K. D.; Clifford, A. A.; Jafar, S. A.;
Shilstone, G. F. J Phys Chem RefData
1991,20,713.
30 Dobbs, J. M.; Wong, J. M.; Lahiere, R. J.;
Johnston, K. P. IndEng Chem Res 1987,
26, 56.
31 Nakatani, T.; Ohgaki, K.; Katayama, T. J
Supercrit Fluids 1989, 2, 9.
32 Sako, S.; Ohgaki, K.; Katayama, T. J
Supercrit Fluids 1988, 1,1.
33 Sako, S. Ohgaki, K.; Katayama, T. J
Supercrit Fluids 1989, 2, 3. j
34 Goncalves, R. M. C.; Albuquerque, L. M.
P. C.; Martins, R. E. L; Simoes, A. M. N.;
Ramos, J. J. M. J Phys Org Chem 1992,
5, 93.
35 Ruoff, R. S.; Tse, D. S.; Malhotra, R.;
Lorents, D. C. J Phys Chem 1993, 97,
3379.
36 Leonard, J. M.; Famini, G. R. A User's
Guide to the Molecular Modeling Analy-
sis and Display System (MMADS)", U.S.
Army Chemical Research, Development
and Engineering Center, 1989.
37 PCModel In Serena Software, PO Box
3076, Bloomington, IN 47402: pp.
38 Minitab In Minitab Inc, 3081 Enterprise
Dr, State College, PA 16801: pp.
39 Hopfinger, A. J. JACS 1980, 702,7126.
40 Dewar, M. J. S.; Stewart, J. J. P. Chem
P/?ys Lett 1984, 111, 416. \
41 Kurtz, H. A.; Stewart, J. J. P.; Dieter, K.
M. J Comp Chem 1990, 11, 82.
42 Famini, G. R. Using Theoretical Descrip-
tors in Structure Activity Relationships II:
Polarizability Index", U.S. Army Chemi-
cal Research, Development and Engi-
neering Center, 1988.
43 Politzer, P.; Lane, P.; Murray, J. S.;
Brinck, T. J Phys Chem 1992, 96, 7938.
44 Politzer, P.; Murray, J. S.; Concha, M. C.;
Brinck, T. J Mol Struc (THEOCHEM)
1992, submitted. ;
45 Politzer, P.; Murray, J. S.; Lane, P.;
Brinck, T. J Phys Chem 1992, in press.
46 Smith, A. L; Li, D.; King, B.; Romanow,
W. J. J Phys Chem 1993, Submitted.
47 Goncalves, R. M. C.; Albuquerque, L. M.
P. C.; Simoes, A. M. N. Port Elect Chim
/tote 1991, 9, 487.
48 Goncalves, R. M. C.; Simoes, A. M. N.;
Leitao, A. S. E.; Albuquerque,!. M. P. C.
J Chem Res 1992, 330. i
1 George R. Famini, U.S. Army Edgewood Research, Development and Engineering Center, Aberdeen Proving
Ground, MD 21010.
2 Leland Y. Wilson, Department of Chemistry, La Sierra University, Riverside, CA 92515. :
104
NESC Annual Report - FY1994
-------
Calculated Infrared Spectra for Haiogenated
Hydrocarbons12 ., \":
Background
The U.S. Environmental Protection
Agency (EPA) requires high quality refer-
ence spectra for reliable identification and
quantification of pollutants in environmental
monitoring. Costs involved in acquiring such
spectra, which include preparation and puri-
fication of standards, physical measure-
ments of spectra, safety considerations, as
well as waste disposal, can be substantial.
Consequently, the current experimental
methods for obtaining reference spectra in
environmental analysis are not economical
and cannot keep pace with the proliferation
of new chemicals requiring such
identification.
In an alternative approach, the U.S. EPA
(EMSL-LV) has begun a pilot project to eval-
uate the application of ab initio computa-
tional chemistry methods in the
determination of infrared spectra (frequen-
cies and intensities) for molecules of envi-
ronmental interest. In this study,
experimental vibrational frequencies for
halogenated hydrocarbons are correlated
with frequencies determined computation-
ally in order to ascertain a suitable level of
theoretical treatment with a desired level of
accuracy that does not result in excessive
computational demands. Results from this
work are promising and indicate strong pos-
sibilities for the role of computational meth-
ods in the determination of infrared spectra
for molecules whose infrared properties are
not known, or to aid in the interpretation of
experimental spectra that have not been
fully assigned.

Method / Approach
In this work, halogenated (fluorinated and
chlorinated) aliphatic and aromatic com-
pounds were treated with the computational
prescription for determining infrared fre-
quencies and intensities of compounds out-
lined by Aue et al. in their compilation study
of organic compounds1. The Hartree-Fock
level of theory was employed in combination
with a range of basis sets (STO-3G, 3-21G,
3-21 G*, and 6-31 G*). This level of treat-
ment is suitable for larger molecular systems
that would be of interest to EPA. All calcula-
tions were obtained using the GAUSSIAN
92 electronic structure program on the
NSCEE supercomputer at Las Vegas, NV.

Results / Discussion
The computed fundamental vibrational
frequencies are obtained from the harmonic
oscillator approximation, while the experi-
mentally determined values are best charac-
terized by an anharmonic potential.
Scaling factors multiplying the theoretical
frequencies have been utilized in the past to
compensate for this comparison of unlike
quantities as well as to account for system-
atic errors resulting from the neglect of elec-
tron correlation2. The standard textbook
scaling factor value of 0.89 developed by
Pople and coworkers, has been routinely
applied to frequencies computed at the Har-
tree-Fock level of theory3.
In this study, linear regression analysis
(with an intercept forced through the origin)
was used to compare experimental funda-
mental frequencies with those that had been
determined computationally in order to more
accurately correlate subsets of the haloge-
nated hydrocarbon compounds (Table 1,
page 107). Four groups of compounds
were defined: aliphatic fluorinated, aliphatic
chlorinated, a combination of aliphatic chlori-
nated and fluorinated, and aromatic chlori-
nated. Experimental gas phase
frequencies were used for comparison with
NESC Annual Report - FY1994
105
-------
Calculated Infrared Spectra for Halogenated Hydrocarbons
theoretically assigned frequencies for all ali-
phatic compounds4. The only available
chlorinated aromatic assigned frequencies
were characterized in the liquid phase and,
in some cases, the solid phase5.
Using the best theoretical treatment
employed, HF/6-31G*, the aliphatic fluori-
nated compounds produced a slope of
0.8991 with a standard error (defined as the
measure of the amount of error in the predic-
tion of y=experimental frequency for an indi-
vidual #=theorical value) of 48.4 cm"1 (R2,
the Pearson product moment correlation
coefficient, equaled 0.9971). The scaling
factors obtained at the same level of theory
for the aliphatic and aromatic chlorinated
compounds were 0.8937 and 0.9017 with
standard errors of 28.3cm"1 and 27.3cm"1,
respectively (R2 for aliphatic chlorinated
equaled 0.9991 and for chlorinated aromat-
ics, the R2 value was 0.9988). We there-
fore conclude that our scaling factors are in
reasonable agreement with the previously
accepted value of 0.89. The HF/3-21G
(HF/3-21 G* for chlorinated aromatics) and
HF/STO-3G results produced larger stan-
dard errors than would typically be desired
for computing IR spectra, but both methods
can be systematically employed if high-level,
ab initio calculations are not computationally
feasible.
Coupling the scaled fundamental frequen-
cies with the calculated intensities provides
a theoretical representation of an IR spec-
trum that can be compared with an experi-
mental vapor phase spectrum. A typical
problem encountered by EPA is the identifi-
cation of a specific structural isomer for a
given compound. In this work, the spectra
(frequencies and intensities) for the three
structural isomers of dichlorobenzene were
determined at the HF/6-31 G* level and com-
pared with the experimental vapor phase
spectra6 (Figure 1, page 108, Figure 2,
page 109, and Figure 3, page 110).
Results indicate that the calculated spec-
tra are representative of the experimental
spectra for each structural isomer, providing
a unique identification of the fundamental
frequencies and their relative intensities.
Spectra computed for the remaining chlori-
nated aromatics gave similar results. We
would conclude that for this particular set of
compounds, theory provides a "fingerprint"
spectrum that can be compared with the
experimental gas phase spectrum for the
complete identification of a given bompound.

Conclusion
Computational IR spectra (frequencies
and intensities) for a number of halogenated
aliphatic and aromatic compounds have
been determined. A linear regression anal-
ysis between experimental gas phase fun-
damental frequencies (where available) and
the calculated frequencies provides a scal-
ing factor in reasonable agreement with the
currently accepted value of 0.89. \ Com-
puted spectra for the chlorinated aromatics
provides good agreement with the frequen-
cies and intensities of the experimental
vapor phase spectra, allowing for a com-
plete structural identification. Our results
indicate strong possibilities for the role of
computational chemistry techniques coupled
with experimental methods for the identifi-
cation of environmentally relevant
compounds.

References
1 D. H. Aue, M. Guidoni, J. W. Caras, & D.
Gurka, "Infrared Frequencies and Inten-
sities Calculated from MM3 Semiempiri-
cal and Ab Initio Methods for Organic
Molecules", Proceedings from the Com-
putational Chemistry Workshop spon-
sored by the Environmental Protection
Agency, Bay City, Ml, Sept. 27-29,1993.
2 W. J. Hehre, L. Radom, P.v.R. Schleyer, &
J.A. Pople, Ab Initio Molecular Orbital
Theory, Wiley, New York, 1936.
3 J. A. Pople, R. Krishnan, H. B. Schlegel, D.
DeFrees, J. S. Binkley, M. J. Frish, R. F.
Whitesicle, R. F. Hout, and W.'J. Hehre,
Int. J. Quantum Chem. Symft., 15, 269
(1981). '.
106
NESC Annual Report - FY1994
-------
Calculated Infrared Spectra for Halogenated Hydrocarbons
Table 1: Linear Regression Statistics Comparing Theoretical (HF / 6-31G*)
Fundamental Frequencies (x-axis) and Experimental Frequencies (y-axis) for the
Halogenated Hydrocarbons.
Molecules N Metfaod/Baais
Std. Error
Aliphatic (F) 75 HF/6-31G* 48,4

Aliphatk (Q) 57 HF/6-31G* 283

AHphatk 132 HF/6-31G* 41,1

274
AromaticCd) 322 HF/6-31G*
Intercept £2
0.8991 0 0.9971

0.8937 0 0.9991

0.8967 0 0.9980
0.9017
0.9988
Aliphatfc(F) 75
Aliphati^Q) 57
Aliphatic 132
(CI&?)
Aromatk{Ci) 322
HF/3-21G
HF/3-21G
HF/3-21G
HF/3-21G*
623
645
63.0
43.6
0.8980
0.8959
0.8967
0.8970
0
0
0
0
0.9952
0.9955
0.9953
0.9969
AliphatkKF) 75
Afiphatic(CI) 57
Aliphatic 132
(Q&F)
Aromatic(a)322
•»"«i^^ !»•••••
HF/STO-3G
HF/STO-3G
HF/STO-3G
HF/STO-3G
!•! 1
753
40.4
619
58.0
0.8340
0.8259
0.8303
0.8366
0
0
0
0
0.9930
0.9982
0.9954
0.9944
Currently accepted scaling factor (slope) is 0.89 (see reference 3).
NESC Annual Report - FY1994
107
-------
Calculated Infrared Spectra for Halogenated Hydrocarbons
1S7U ttM.7
a, -
MICRONS
« T ?
? 7
• T

II '
•
T
•Vi^v~n f
VAPOR
I lHn|>.22S*C
Sf
WAVENUUBERS
b.
Frequency (etn-1)
4000 3500 3QQO 2SOQ 2000 1500 1000 500 0
f I..I- I I | < I. 1 I f *- I I I | I .1 I I f » I I I [ ' ' ' ' ) ,1 ' ul ' I '_f l_f
3057
1131 1023
1478 771

a Experimental Spectrum (reference 6.)
b Scaled HF / 6-31G* frequencies and calculated relative intensities.
110
30
I
1
Figure 1: Experimental and theoretical spectra for 1,2-dichlorobenzene.
108
NESC Annual Report - FY1994
-------
Calculated Infrared Spectra for Halogenated Hydrocarbons
9otas 141 as
1S81.S llJtS 7W.7
a.«»
* t 1
VAPOR
(PRI
I '
WAVENUUBERS
WCOCET60SXFWB
Frequency (cm-1)
4000 3500 3000 2500 2000 1500 1000 500 0
3048
8(11
- 1479
1609
772
:0

•20
•30
40

60
t70
80
90
a Experimental Spectrum (reference 6.) ;
b Scaled HF/ 6-31G* frequencies and calculated relative intensities.

Figure 2: Experimental and theoretical spectra for 1,3-dichlorobenzene.
NESC Annual Report - FY1994
109
-------
Calculated Infrared Spectra for Halogenated Hydrocarbons
a«
v v
MORONS
V j_ ; f
1881.S 14)118 Krt.0
1«at£ 1flM£ (464
147M
u u u M u M» n n H
- VAPOR
•
s
o
R

S-

<&'
WAVEMUM8ERS
WCOLBT B08X FWB
Frequency (cm-1)

b. 4000 3500 3000 2500 2000 1500 1000 500 0
h
3076
1001 -

348
Jzhi
1498
1092
I
c
••100
•M20
a Experimental Spectrum (reference 6.)

b Scaled HF / 6-31G* frequencies and calculated relative intensities.
Figure 3: Experimental and theoretical spectra for 1,4-dichlorobenzene.
110
NESC Annual Report - FY1994
-------
Calculated Infrared Spectra for Halogenated Hydrocarbons
4 T. Shimanouchi, Tables of Molecular
Vibrational Frequencies", National Stan-
dard Reference Data Series, No. 39,
National Bureau of Standards, Washing-
ton, DC, 1972. T. Shimanouchi, J. Phys.
Chem. Ref. Data, 6,1 (1977).
5 J. R. Scherer & J. C. Evans, Spectrochim-
icaActa,19, 1739 (1963); J. R.
Scherer, Spectrochimica Acta, 23A,
1489 (1967).
6 C. J. Pouchert, The Aldrich Library of
FT-IR Spectra, Edition I, Volume 3,
Aldrich Chemical Company, Milwaukee,
Wl, 1989. ;
I
Disclaimer

The U.S. Environmental Protection
Agency (EPA), through its Office of
Research and Development (ORD), pre-
pared this extended abstract for publication
in a conference proceedings. It does not
necessarily reflect the views of EPA or ORD.
1 Kathleen Robins, U.S. EPA, EMSL-Las Vegas, P.O. Box 93478, Las Vegas, NV 89193-3478 and Department of
Chemistry, University of Nevada, Las Vegas, 4505 South Maryland Parkway, Las: Vegas, NV 89154.
2 Donald H. Aue and James W. Caras, Department of Chemistry, University of California', Santa Barbara Santa
Barbara, CA 93106.
NESC Annual Report - FY1994
111
-------
Calculated Infrared Spectra for Halogenated Hydrocarbons
112
NESC Annual Report - FY1994
-------
Development of Physiologically Based-Pharmacokinetic and
Biologically Based-Dose Response (PB-PK/BI3-DR) Models
for Dioxins and Related Compounds Incorporating
Structure-Activity Considerations12
Introduction / History
It is possible to understand the structure-
activity relationship for the toxicokinetic
properties of environmental chemicals by
revealing the relationship between their
molecular structure and their effects on bio-
logical systems based on the molecular
properties encoded in their structure. It was
previously reported (Waller and McKinney,
1992) for polychlorinated dibenzo-p-dioxins,
dibenzofurans, and biphenyls that differ-
ences between the three-dimensional nature
of their steric and electrostatic molecular
fields were indicative of their relative affini-
ties for the Ah receptor. These field differ-
ences were not fully descriptive of the
associated in vitro enzyme induction data. It
was hypothesized that additional parame-
ters (i.e., hydrophobicity) should be consid-
ered for this purpose. The initial results of
this approach are reported herein.

Method / Approach

3D-QSAR

Comparative molecular field analysis
(CoMFA) (Cramer et al., 1988), a three-
dimensional quantitative structure-activity
relationship (QSAR) paradigm, was used to
examine the physicochemical properties
underlying the toxicity of polyhalogenated
dibenzo-p-dioxins, dibenzofurans, and
biphenyls. The steric (van der Waals) and
electrostatic (coulombic) molecular field
characteristics were represented as point
values on a regularly-spaced grid surround-
ing each molecule. The potential utility of
these values as predictive descriptors of
receptor binding and enzyme induction was
then examined using partial least squares
regression in conjunction with cross-
validation.
Hydrohobia
Recent developments in the computa-
tional chemistry/molecular modeling arena
have allowed the hydrophobic nature of a
molecule to be represented as point values
on a grid surrounding the molecule. This
technique is the basis of the HINT program
(Kellogg et al., 1991). These values are
particularly well-suited for inclusion as
regressors in a CoMFA/QSAR study and
can be analyzed in a manner analogous to
that previously described.

Results and Discussion
The statistical results of all analyses are
listed in Table 1 , page 114. Using the
CoMFA steric and electrostatic fields as
regressors, significant relationships were
discovered with respect to Ah receptor bind-
ing affinity. In addition, graphical represen-
tation of the results of the regression
analyses as p-coefficient contour plots
allowed one to visually display those areas
in space surrounding the molecules under
study in which either increased or decreased
steric bulk or positive electrostatic character
was desired for increased binding affinity.
The steric and electrostatic regressors alone
were not as predictive of induction data for
the associated enzymes AHH and EROD
(data taken from in vitro studies). This was
thought to be primarily due to the lack of par-
titioning, or transport, of data. Preliminary
attempts to include calculated log
octanohwater partition coefficient (clogP)
values as QSAR regressors indicated a
severe limitation. The clogP algorithm is an
additive fragment-based method, and as
NESC Annual Report - FY1994
113
-------
Development of Physiologically Based-Pharmacokinetic and Biologically Based-Dose Response (PB-PK/
BB-DR) Models for Dioxins and Related Compounds Incorporating Structure-Activity Considerations

Table 1: CoMFA Statistical Summary.

Independent Variable: Ah Receptor Binding Data
Regressor(s)
Steric and Electrostatic
Hydrophobic
Steric, Electrostatic, and Hydrophobic
. l?x
0.783(8)
0.672(4)
0.771(9)
sep ;,
0.724
0.866
0.750
^•r2;;'
0.916
0.802
0.910
: .. $'. \;:.
0.452
0.674
0.471
Corit f
57,43
! 100
! 35,39,26
Independent Variable: AHH Induction Data
Regressorfs)
Steric and Electrostatic
Hydrophobic
Steric, Electrostatic, and Hydrophobic
, r2
x
0.612(3)
0.504(4)
0.580(3)
sep~
1.047
1.197
1.091
r2 .
, '
0.793
0.778
0.787
S v"
j
0.766
0.801
0.776
; ,Cont'.rjf
* <, A»-
49,51
100
, 30,30,40
Independent Variable: EROD Induction Data
Regressor(s) > ' ~
•-...-.•...: *. - - •» » < r-""» *- *;
Steric and Electrostatic
Hydrophobic
Steric, Electrostatic, and Hydrophobic
.2 *'
- -cy ,.
0.546(3)
0.430(1)
0.501(2)
/ sep-
*~ ??>
&
1.158
1.271
1.201
Hi2'./..
0.743
0.563
0.694
- « „,
«-^ * X. '
0.871
1.113
0.940
-' Cdntf;
•( v^. »v ' i^-
57,43
; 100
29,25,46
such It does not account for substitution pat-
terns. Therefore, structural isomers are
indistinguishable (1,2,3,4- TCDD and
2,3,7,8-TCDD have the same clogP value).
The addition of three-dimensional hydro-
phobicity parameters to the existing analy-
ses based on steric and electrostatic
qualities alone typically did not affect the sta-
tistical significance of the models. However,
if one considers the individual contributions
of the steric, electrostatic, and hydrophobic
fields in the combined models, they can be
seen to vary depending on the biological
endpoint being modeled. Hydrophobic fields
contributed significantly more to the induc-
tion response models than to the binding
affinity models.
Scientific Accomplishments and EPA
Mission Relevance
Even in the absence of increased statis-
tics, the resultant hydrophobic field coeffi-
cient contour plots from the QSAR equation
provide for additional insight into the under-
lying physicochemical properties responsi-
ble for the toxicokinetic properties of this
class of compounds, facilitating the ultimate
development of PB-PK/BB-DR models for
use in risk assessment.

Future Objectives
It is possible to represent binding affini-
ties, and other experimentally-determined
114
NESC Annual Report - FY1994
-------
Development of Physiologically Based-Pharmacokinetic and Biologically Based-Dose Response (PB-PK/
BB-DR) Models for Dioxins and Related Compounds Incorporating Structure-Activity Considerations
quantitative analytical data, for this conge-
neric series of molecules as toxic equiva-
lency factors (TEFs) relative to TCDD (TEF
= 1.0). It is hopeful that models of this type
will be useful in the prediction of TEFs for
untested compounds.

Acknowledgments
One author (CLW) acknowledges support
from NIH Training Grant #T32HLO7275.
Relevant Publications and Reports
(References)
C.L. Waller and J.D. McKinney, J. Med.
Chem.,1993, 35, 3660.
R.D. Cramer, D.E. Patterson, J.D. Bunce, J.
Am. Chem. Soc, 1988, 110, 5959.
G.E. Kellogg, S.E. Semus, D.J. Abraham, J.
Comput.-Aided Mot. Des., 1991, 5, 545.
Rgure 1: Hydrophobicity field coefficient contour plot. Black polyhedra represent areas where increased hydrophobia
character is desired. Gray polyhedra represent areas where less hydrophobia character is desired.
1 Chris L. Waller, Center for Molecular Design, School of Medicine, Washington University, St. Louis, MO 63130.
2 James D. McKinney, Health Effects Research Laboratory, Environmental Toxicology Division, Pharmacokinetics
Branch, U.S. EPA, RTP, NC 27711.
NESC Annual Report - FY1994
115
-------
Development of Physiologically Based-Pharrnacokinetic and Biologically Based-Dose Response (PB-PK/
BB-DR) Models for Dioxins and Related Compounds Incorporating Structure-Activity Considerations
116
NESC Annual Report - FY1994
-------
The Chesapeake Bay Program Simulation Models: A
Multimedia Approach '2
Background:
In 1983 the Chesapeake Bay Program
identified excess nutrients, or eutrophica-
tion, as the primary reason for the water
quality decline in the Chesapeake (Gillelan
et al., 1983). Several water quality models
have been developed and successfully
applied to help identify the nutrients contrib-
uting the eutrophication and to quantify the
nutrient reductions necessary to restore
Chesapeake Bay resources. Efforts are
underway to link the water quality models of
the estuary and its watershed with an air
deposition model and simulation models of
key living resources.
The Chesapeake Bay Program was
formed in 1983 for the purpose of restoring
the Chesapeake, and includes the states of
Maryland, Pennsylvania, Virginia, the Dis-
trict of Columbia, the Chesapeake Bay Com-
mission, and the U.S. Environmental
Protection Agency. These jurisdictions real-
ized that the Bay's deterioration could not be
arrested by any one of them acting alone.
They acknowledged that the Bay was
endangered because of changes in its entire
watershed, a 64,000 square mile area
extending from a northern boundary of Coo-
perstown, NY; south to Virginia Beach, VA;
and west to the Ohio River basin.
Between 1983 and 1987, the Chesapeake
Bay Program entered into a long-term plan-
ning and implementation phase. Nutrient
reductions were underway, but answers
were needed for two central questions:
What should the total nutrient reductions be,
and where should these reductions be
made? Water quality models were used to
help answer these questions. The models
provided an inventory of the sources of nutri-
ents from the watershed and quantified the
possible reductions of these sources, as well
as the water quality improvements resulting
from different nutrient reduction actions.
The first Chesapeake Bay Program model
application was the Watershed Model, which
quantified the magnitude and source of the
nutrient loads to the Bay for wet, dry, and
average hydrology years. Four major
upgrades to this model have been com-
pleted since 1983 (Hartigan, 1983; Donigian
et al., 1985; Donigian et al., and 1990; Doni-
gian etal., 1994).
A steady-state water quality model of the
Bay was completed in 1987 to examine the
impact of nutrient loads on the mainstem
Bay. Using simplified loading estimates and
simulation procedures, the steady-state
model calculated the average or steady-
state summer (June - September) conditions
in the mainstem Bay. Results from the
steady-state model indicated that a 40%
reduction in controllable nutrient loads would
eliminate anoxia (dissolved oxygen concen-
trations less than 1.0 rng/L) in the mainstem
(U.S. EPA Chesapeake Bay Program,
1987).
The planning phase was brought to a
close with the signing of the 1987 Bay
Agreement. This document called for reduc-
ing the controllable amount of nutrients
reaching the Bay by 40% by the turn of the
century. Of the many commitments in the
1987 Bay Agreement, the nutrient issue was
the only one of such consequence that the
goals were formulated in quantitative and
not merely qualitative terms. The Agree-
ment also called for a major reevaluation of
the nutrient reduction goal in 1992 using
refined computer simulation models. Con-
sequently, an extensive? program to affect
significant reductions of nutrients entering
the Bay was instituted along with increas-
ingly sophisticated water quality models to
NESC Annual Report - FY1994
117
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
guide decision-making on cost effective
water quality management in the Chesa-
peake Bay. Work began on an integrated
set of Chesapeake Bay water quality models
in 1987 as shown in Figure 1.
In 1992 the 40% reduction goal was con-
firmed by the results of the integrated water-
shed and estuarine computer models, the
first application of multimedia models in the
Chesapeake (Thomann et al. 1994).

The Water Quality Models
The Chesapeake Bay Watershed Model
(Linker et al., 1994; Donigian et al., 1994) is
designed to simulate nutrient loads deliv-
ered to the estuary under different manage-
ment scenarios. The model divides the Bay
basin into sixty-three model segments with
an average area of 260,300 hectares as
illustrated in Figure 2, page 119. Hydrology,
sediment, and nonpoint source loads
(including atmospheric deposition) are simu-
lated on nine land uses. In the river
reaches, sediment, nonpoint source loads,
point source loads, and water supply diver-
sions are simulated on a one hour time-step.
Paniculate and dissolved nutrients are
transported sequentially through each seg-
ment, and ultimately to the tidal Chesapeake
Bay. This model was used to 1) determine
the distribution of the point and nonpoint
source loads and the controllable and
uncontrollable portions of each; 2) deter-
mine the quantity of loads reduced under dif-
ferent management actions including
reductions in nitrogen atmospheric deposi-
tion; and 3) quantify the loads under future
(year 2000) conditions. These loads were
used as input conditions for the Chesapeake
Bay Water Quality Model (CBWQM). The
Watershed Model is based on EPA sup-
ported model HSPF, Version 10.
The Chesapeake Bay Water Quality
Model (CBWQM) takes loads from the
Watershed Model as input. The CBWQM is
a time variable, three dimensional water
quality model of the tidal Bay coupled with a
Figure 1: The Current Chesapeake Bay Integrated Modeling
118
NESC Annual Report - FY1994
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
NESC Annual Report - FY1994
119
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
model of estuarine sediment processes.
The CBWQM computational grid is shown in
Figure 3, page 121. The sediment model
provides simulation of sediment nutrient
sources and sinks. An ocean boundary sub-
model simulates the expected coastal
exchange of loads with the Chesapeake
under different nutrient management condi-
tions. The CBWQM is driven by a hydrody-
namic model simulating the movement of
Bay waters over the three year (1984 -
1986) simulation period on a five minute
time-step. The details of model develop-
ment, structure, calibration, and sensitivity
are given in separate reports (Cerco and
Cole, 1993; DiToro and Fitzpatrick, 1993;
Johnson et al., 1991). The CBWQM is
based on an extensively modified and
expanded WASP code.

The Findings So Far
Several key management scenarios were
carried out using the linked Watershed
Model and CBWQM. These key scenarios
provided a basic inventory of loads under
specific management conditions.

Base Case Scenario
This scenario is the base case year (1985)
loads to the Chesapeake Bay. The 1985
loads are the benchmark loads against
which all reductions, particularly the 40%
Bay Agreement reduction, are measured.

Bay Agreement Scenario
This scenario represents the nutrient
loads to be reduced by the year 2000 under
the Bay Agreement. The reduction is a 40%
reduction of controllable nutrient loads.
Controllable loads were defined as the base
case loads minus the loads delivered to the
Bay under an all forest condition. In other
words, controllable loads are defined as
everything over and above the total phos-
phorus and total nitrogen loads that would
have come from an entirely forested water-
shed. Point source loads are considered, in
this definition, to be entirely controllable.
After the year 2000, the 40% reduction goal
becomes a cap on nutrient loads. Post year
2000 efforts in Chesapeake resource man-
agement will focus on maintenance of the
nutrient cap in the face of load increases
due to growth.

Limit of Technology Scenario
The limit of current technology scenario is
defined as having all cropland in conserva-
tion tillage; the Conservation Reserve Pro-
gram fully implemented; nutrient •
management, animal waste controls, and
pasture stabilization systems implemented
where needed; a 20% reduction in urban
loads; and point source effluent controlled to
a level of 0.075 mg/L total phosphorus and
3.0 mg/L total nitrogen. It is important to
note that the limit of technology scenario
was used to determine the feasibility of the
40% nutrient reduction. In no basins did the
reduction called for in the Bay Agreement
exceed what could be achieved by current
technology. i

"No Action" Option Scenario
This scenario represents the growth in
population and the projected changes in
land use by the year 2000 with no additional
controls after 1985. Only the controls in
place in the base case scenario are applied
to the year 2000 point source flows and land
use. This scenario represents what the
loading conditions might be without the nutri-
ent reductions of the Bay Agreement. Bay
Program projections of land use changes
and growth in sewage treatment plant loads
by the year 2000 add 14.2 million; kilograms
of nitrogen to the 33.7 million kilograms that
must already be removed to reach the year
2000 nutrient cap. In other words, for every
kilogram we remove approximately one-half
kilogram returns simply as a result of popu-
lation growth. '•

1993 Progress Scenario
This scenario applies the actua;! reduc-
tions made in nitrogen and phosphorus by
120
NESC Annual Report - FY1994
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach

-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
the year 1993, the midpoint between the
base year of 1985 and the year 2000 goal.
Figures 4 and 5, page 123, show the 1985
Base Case total phosphorus and total nitro-
gen loads in each major Bay basin relative
to the 1993 Progress loads, the Bay Agree-
ment loads, the loads under the current limit
of technology, and the "No Action" Option
loads.
In all basins, the limit of technology phos-
phorus loads are less than the Bay Agree-
ment loads. For nitrogen loads, only the
point source dominated basins of Potomac,
James, and the West Shore have limit of
technology loads appreciably less than Bay
Agreement loads. For the nonpoint source
dominated basins of the Susquehanna,
Rappahannock/York, and the East Shore,
the limit of technology loads are essentially
equivalent to the Bay Agreement loads.
The "No Action" scenario has increased
loads of phosphorus and nitrogen for all
basins. The point source dominated basins
of Potomac, James, and West Shore, which
experience the greatest urbanization in the
Year 2000 scenario, have the greatest
increase in nutrient loads.
The 1993 Progress Scenario charts phos-
phorus reductions as being on track for the
year 2000 goal. The nitrogen reductions
posted by the 1993 Progress Scenario are
meager, in fact in many basins the Bay Pro-
gram has lost ground due to load increases
from growth. In the West Shore basin,
where progress is being made in point
source nitrogen reductions, the 1993 reduc-
tion in nitrogen is significant. Reductions in
nitrogen are expected to increase in the
other basins as nitrogen reductions are
made in more facilities. Given the proven
technologies and increasing cost-effective-
ness of new biological nutrient removal pro-
cesses, the point source role in the clean-up
is becoming more significant than previously
thought. Point source reductions are vital to
the restoration effort due to the inevitable
increase in point source loads to the Bay
with increases in population.
Given the difficulty of controlling nonpoint
source nitrogen and questions about the
level of participation achievable under volun-
tary programs, the nonpoint source role in
the clean-up may be relatively more chal-
lenging than previously believed. Neverthe-
less, nonpoint source control actions are a
key component in all tributary strategies.
A series of runs was conducted to test and
explore the Bay response to nutrient loading
scenarios. These runs indicate that anoxia
in the Bay, brought on by the excessive
nutrient loading, will be reduced by 20%
under the Bay Agreement nutrient caps.
The maximum nutrient reductions achieved
under the limit of technology loads produce
a 32% reduction in anoxia. Under the "No
Action" loads the Bay anoxia would increase
by about 120%. :
Phytoplankton growth is limited by the
availability of phosphorus in the upper Bay
and by nitrogen in the lower Bay, with a tran-
sition in the mid-Bay. Consequently, load
reductions of both phosphorus and nitrogen
are necessary to reduce eutrophication in
the Bay. However, reductions of phospho-
rus do not have as significant an effect on
bottom anoxia as do nitrogen reductions.
The impact of load reductions on bottom
water anoxia also varies geographically, with
the greatest anoxia reductions from load
reductions in the upper Bay and mid-Bay
areas.
The calculated ocean input load com-
prises 30 to 35% of the total nitrogen load
and 45% to 65% of the total phosphorus
load to the Bay. Excluding these uncontrol-
lable ocean loads, the upper limit of overall
total nutrient load reduction (current limit of
technology) is 20 to 30% for nitrogen and
30% to 55% for phosphorus.
Feasible reductions in nutrient loadings of
about 20 - 30% for both nitrogen and phos-
phorus result in an improvement in bottom
dissolved oxygen of about 0.2 to 0.4 mg/L
above the base case summer average bot-
tom dissolved oxygen. Load reductions of
50% or more result in minimum dissolved
122
NESC Annual Report - FY1994
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
Sceajirio Legend
198:5 B«« Cue
Bay Agreement
Limit of Technology
"No
Kappahaaaock/York
Western Shore
Figure 4: Chesapeake Bay Total Phosphorus Loads by Scenario
168
MO
;ioo
* *»
4O
20
S««!Barfo l^egeud
1
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
oxygen concentrations above 1 mg/L as
shown in Figure 6.

Looking to the Future
The Bay Agreement reductions become
nutrient load caps after the year 2000. Pop-
ulation continues to grow in the watershed,
so control efforts will have to be strength-
ened to maintain progress already achieved
by assuring that the caps are maintained.
The Baywide caps are 104.3 and 7.00 mil-
lion kilograms per year for nitrogen and
phosphorus respectively. The Watershed
Model will annually track changes in water-
shed loads to ensure that the caps are not
exceeded in any basin.
The nutrient reductions and caps will be
achieved through the implementation of trib-
utary strategies. The strategies examine the
mix of nutrient management controls for the
different tributaries and apply controls on
waste water treatment plants, agricultural
runoff, or effluent from urban areas. Alloca-
tion of the nutrient caps to each of ten major
tributary systems was achieved through the
application of the Watershed Model. Exist-
ing, modified, or in some cases, new imple-
mentation mechanisms are applied in point
source programs, nonpoint source pro-
grams, and in associated incentive or disin-
centive programs. Most importantly, citizen
action in the review, refinement, and imple-
mentation of the tributary strategies is key.
Model refinements are now being devel-
oped for the Watershed Model. One such
modification is a finer scale segmentation of
the watershed, dividing it into 86 segments
to provide better spatial resolution of model
Figure 6: Water Quality Response to Nitrogen and Phosphorus Reductions
124
NESC Annual Report - FY1994
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
results. Another refinement that will be used
in future Watershed Model runs is an
updated GIS Chesapeake Bay Program
Land Use Data Base. This data base was
developed using satellite imagery available
through EPA's Environmental Monitoring
and Assessment Program (EMAP) and
NOAA's Coastal Change Assessment Pro-
gram (CCAP). USDA Agricultural Census
Data were also used to provide crop and
animal information. The Chesapeake Bay
Program Land Use map is shown in Figure
7, page 126.
Further model refinements are under way
for the CBWQM. These refinements will
examine the relationship among air deposi-
tion, water quality and key living resource
areas including SAV, benthos, and phy-
toplankton/zooplankton. The refined model
analysis of air deposition and water quality/
living resource interactions will be com-
pleted in 1997 as illustrated in Figure 8,
page 127.
Throughout the 1994-96 period the Bay
Program will improve modeling of atmo-
spheric loads. These activities will move
toward estimates of the controllable atmo-
spheric load delivered to the tidal Bay.
Inherent in an improved understanding of
atmospheric loads are estimates of the con-
trollable and uncontrollable atmospheric
sources, the boundaries of the Chesapeake
airshed, and the transformations and losses
of deposited atmospheric loads. The nutri-
ent sources of point, nonpoint, and air depo-
sition will be simulated along with the
attainable controls of these sources to
develop a least cost pollutant reduction plan.
Estimates of the growth of atmospheric
sources of nitrogen are also important
because while current estimates of these
loads show an initial reduction through
implementation of the Clean Air Act, atmo-
spheric loads beyond the year 2005 will
increase unless further controls are initiated.
A major review of the goals and progress of
the tributary strategies will occur in 1997.
This evaluation will use the computer mod-
els now under development to examine con-
nections among watershed, airshed, estuary
water quality, and living resources. Under-
water grasses and benthic organisms will be
simulated, providing tributary specific goals
for nutrients based on habitat
improvements.

Conclusion

The participants in the Chesapeake Bay
Program have consistently marshalled the
resources needed to continue the effort to
restore and protect the Chesapeake Bay.
The challenge today is to finance, plan,
implement, and construct the needed nutri-
ent control measures by the year 2000 and
then to maintain these loadings even as
population and development continue to
expand.
We are making progress. Bay Program
tracking of nutrient reductions show a reduc-
tion of phosphorus by 1992 of 1.86 million
kilograms, an achievement of 48% of the
phosphorus reduction goal. A major factor
in the phosphorus reductions was the phos-
phate detergent ba,n, an excellent example
of pollution prevention in the Bay basin.
Reductions in nitrogen are coming more
slowly. By 1992, 2.95 million kilograms of
nitrogen were reduced, achieving 9% of the
nitrogen reduction goal.
And there are encouraging developments.
Recent advances in biological nutrient
removal, supported by CBP funding, dem-
onstrate that cost effective technologies for
year-round nutrient removal can achieve
significant reductions in nitrogen effluent at
municipal Sewage Treatment Plants (STPs).
Most tributary strategies contain biological
nutrient removal as a key element. The chal-
lenge ahead is to achieve similar technologi-
cal breakthroughs for controlling nutrients,
particularly nitrogen, from nonpoint sources.
Our improved understanding of atmo-
spheric nitrogen pollutants is also encourag-
ing. We have learned that about a quarter of
the nitrogen load entering the Bay comes
from atmospheric sources. These sources
originated from the tailpipes of cars and from
NESC Annual Report - FY1994
125
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
Emergent Wetland
Forest
Legend
Forested Urban
HertetceotE Urban
Low JMteasity Urban
High Intensity Ute
Maes, Quames, aod Boosed
Figure 7: Chesapeake Bay Program Land Use
126
NESC Annual Report - FY1994
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
* Mffi^fjjffi
Barest
Legend
Rsrested Urban
Herbaceous Urban
Low teeasity Urban
Urban
Maes,
Figure 8: THe Cross-media Watershed, Estuarine, Airshed, and Living Resource Model Now Being Developed
NESC Annual Report - FY1994
127
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
the smokestacks of power plants and indus-
tries. These sources may have originated in
the watershed or even from outside the
watershed boundaries. Accordingly, we
have learned to add a new word to our lexi-
con of Bay restoration - the airshed. Not
enough is yet known about air deposition as
a source of nutrients and how to control it.
This is one reason why air deposition reduc-
tions are not included in the nutrient caps.
Although air reductions are not counted in
the caps, the Clean Air Act is expected to
reduce nitrogen entering the Bay by air dep-
osition and to provide another 4% reduction
in anoxia in addition to the 20% reduction in
anoxia brought about by the Bay Agree-
ment. Unfortunately, like point sources, pop-
ulation increases will begin to erode gains
made in reducing the atmospheric source
after 2005. Further understanding of atmo-
spheric deposition and how to control it v/ill
be obtained by the inclusion of Chesapeake
airshed simulation models into the inte-
grated water quality models.
Many challenges lie ahead. The Chesa-
peake Bay Program is about to enter a new
phase, which will focus first on tracking nutri-
ent reductions as we move toward the year
2000 goal, and then on maintenance of the
nutrient caps. The multimedia simulation
models now being developed, will track
nutrient loads as the Chesapeake Basin
moves toward sustainable development. A
key element in attaining sustainable devel-
opment will be an analysis of the best and
least cost solutions to achieve and maintain
the nutrient caps.
An administrative challenge will be to
develop consistent and reliable methods to
assess progress in implementing tributary
strategies and determining movement
towards the 40% nutrient reduction goal.
The Bay Program partners will complete
annual tracking of the nutrient load reduc-
tions through computer model progress sce-
narios. Coordinated and targeted
monitoring efforts will verify model predic-
tions and provide a real world measure of
water quality and living resource response to
our efforts.

Acknowledgment
The author wishes to acknowledge the
state, regional, and federal members of the
Chesapeake Bay Program Modeling Sub-
committee for their essential guidance and
direction throughout the development of the
Chesapeake Bay Watershed Model, and in
particular Dr. Robert Thomann for his exper-
tise and experience gained through twenty
three years of water quality modeling work
on the Chesapeake and its tributaries.

References
Cerco, C.F. And T. Cole, 1993. Application
of the Three-Dimensional Eutrophica-
tion Model CE-QUAL-ICM to Chesa-
peake Bay. U.S. Corps of Engineers
Waterways Experiment Station. Vicks-
burg, MS.
D.M. DiToto and J.J Fitzpatrick, 1993.
Chesapeake Bay Sediment Flux Model.
U.S. Corps of Engineers Waterways
Experiment Station. Vicksburg, MS.
Donigian, Jr., A.S., Bicknell, and J.L. Kittle,
Jr., 1986. Conversion of the Chesa-
peake Bay Basin Model to HSPF Opera-
tion. U.S. EPA Chesapeake Bay
Program. Annapolis, MD. ;
Donigian, Jr., A.S., and H.H. Davis, 1978.
Users Manual for Agricultural Runoff
Management (ARM) Model. U.S. EPA
Environmental Research Laboratory.
Athens, GA.
Donigian, Jr., A.S., B.R. Bicknell, L.C.
Linker, J. Hannawald, C.H. Chang, and
R. Reynolds, 1990. Watershed Model
Application to Calculate Bay Nutrient
Loads: Phase I Findings and Recom-
mendations. U.S. EPA Chesapeake Bay
Program. Annapolis, MD.
Donigian, Jr., A.S., B.R. Bicknell, LC.
Linker, C.H. Chang, and R. Reynolds,
1994. Watershed Model Application to
Calculate Bay Nutrient Loads: Phase II
Findings and Recommendations. U.S.
128
NESC Annual Report - FY1994
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
EPA Chesapeake Bay Program. Annap-
olis, MD.
Gillelan, M.E., D. Haberman, G.B. Mackier-
nan, J. Macknis, and H.W. Wells, Jr.,
1983. Chesapeake Bay: A Framework
for Action. U.S. EPA Chesapeake Bay
Program. Annapolis, MD.
Hartigan, J.P., 1983. Chesapeake Bay
Basin Model. Final Report prepared by
the Northern Virginia Planninig District
Commission for the U.S. EPA Chesa-
peake Bay Program. Annapolis, MD.
Johnson, B.C., J.C. Imhoff, J.L. Kiggle, Jr.,
and A.S. Donigian, Jr., 1993. Hydrologic
Simulation Program - Fortran (HSPF):
User's Manual for Release 10.0. U.S.
EPA Environmental Research Labora-
tory. Athens, GA.
B. H. Johnson et al., 1991. Users Guide for
a Three-Dimensional Numerical Hydro-
dynamic, Salinity, and Temperature
Model of Chesapeake Bay. U.S. Corps
of Engineers Waterways Experiment
Station. Vicksburg, MS.
Linker, L.C., G.E. Stigall, and C.H. Chang,
1994. The Chesapeake Bay Water
Quality Model,, Accepted for publication
in Environmental Science and Technol-
ogy.
Thomann, R.V., J.R. Collier, A. Butt, E. Gas-
man, and L.C. Linker, 1994. Response
of the Chesapeake Bay Water Quality
Model to Loading Scenarios. U.S. EPA
Chesapeake Bay Program. Annapolis,
MD.
U.S. EPA Chesapeake Bay Program, 1987.
A Steady-State Coupled Hydrodynamic/
Water Quality Model of the Eutrophica-
tion and Anoxia Process in the Chesa-
peake Bay. Prepared by HydroQual, Inc.
for the U.S. EPA Chesapeake Bay Pro-
gram. Annapolis, MD.
1 Lewis C. Linker, U.S. EPA, Chesapeake Bay Program.
2 Katherine E. Bennett, Chesapeake Research Consortium.
NESC Annual Report - FY1994
129
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
130 NESC Annual Report - FY1994
-------
Visualization Techniques for the Analysis and Display of
Chemical, Physical, and Biological Data Across Regional
Midwestern Watersheds'2
Problem Description
An important goal of the Federal Water
Quality Act is that of defining conditions nec-
essary to maintain the biological integrity in
the nation's surface waters. Past studies
have largely relied on either biosurvey or
toxicological approaches to define water-
shed health. However, these approaches do
not supply comprehensive data for identify-
ing relationships among physical, chemical
and biological watershed variables. A series
of joint studies by the Natural Resources
Research Institute (NRRI) and the U.S. EPA
Environmental Research Laboratory -
Duluth (ERL-D) relate instream physical
(habitat), chemical (surface and substrate)
and biological (macroinvertebrate) proper-
ties to landscape and land use attributes
(Johnson and Richards, 1992; Richards et
al., 1993a; Richards et al. 1993b; Johnson
et al., 1994; Richards and Host, 1994). The
intention of this research is to identify biocri-
teria and ecocriteria for midwestern agricul-
tural watersheds. Important landscape
attributes are land cover, topography, and
soils. The degree to which the agricultural
land was used was found to be a factor influ-
encing the health of watersheds.
Comparing physical, chemical, and bio-
logical measurements to stream health can
be difficult because the measured variables
are often done on different temporal and
spatial scales, with varied levels of accuracy
and precision. In addition, it is difficult with
large multi-dimensional data sets to identify
and explain patterns and trends in the data
outside of relatively sophisticated multivari-
ate procedures. However, there is clearly a
need to present this data in a format that can
be interpreted by policy-makers and the
general public. Data visualization tech-
niques provide such means of interpretation.
Objectives
Our objective is to analyze, model, and
visualize the relationships found between
landscape/land use attributes and water-
shed physical, chemical, and biological
properties in the Saginaw River basin. The
analyses include calculation of spatial statis-
tics from the existing Geographic Informa-
tion System (GIS) databases. The modeling
component focuses on predicting either pos-
itive or negative changes in stream health as
a function of landscape/land use changes.
Data visualization is being used to rapidly
synthesize model outputs in formats to
enable researchers to gain further under-
standings about watershed properties and
interactions. A second advantage of using
visualization procedures is to place the
watershed research results into formats eas-
ily comprehensible to policy-makers and
public review groups.

Research Approach
As part of our studies to develop biocrite-
ria and ecocriteria for a midwestern agricul-
tural watershed, a (database spanning the
macroinvertebrate community, water chem-
istry and physical habitat has been gathered
representing over 60 stations distributed
throughout the Saginaw River Basin (Rich-
ards et al., 1993a, 1993b). We have also
assembled GIS databases in an ARC/INFO
format relating hydrology, landscape/land
use, elevation, soils, geomorphology, and
climate features (Johnson and Richards,
1992). Results using redundancy, principle
component, and multiple regression tech-
niques have shown presdictive relationships
among the watershed database variables
and landscape/land use features (Johnson
eta!., 1994).
NESC Annual Report - FY1994
131
-------
Visualization Techniques for the Analysis and Display of Chemical, Physical, and Biological Data Across
Regional Midwestern Watersheds
Preliminary Results/Progress
George Host, one of NRRI's principal
investigators on this project, attended a data
visualization training session at the National
Environmental Supercomputing Center
(NESC) during the summer of 1994.
Selected landscape data was uploaded onto
NESC workstations. Using the Cass River
watershed as a subset of the Saginaw River
basin database, features of geology, land
use, elevation, and hydrology were trans-
ferred into the FAST visualization package.
Monthly temperature for the entire basin was
also incorporated into the AVS visualization
package. These data were stored in a raster
format with a 1 km2 resolution.
The modular programming capabilities of
AVS were used to generate an animation of
temperature changes across the Saginaw
basin from January through December.
Features found with this analysis included
lake effects due to basin and topographic
effects in terms of river drainage and
morainal systems on seasonal tempera-
tures. These effects would be difficult to dis-
cern from the flat-file databases as inputs to
the visualization routines.
FAST was used to visualize geology, land
use, hydrology, and elevation features in the
Cass River watershed. A three-minute
video was developed representing an aerial
"fly-over" the watershed. The basin's fea-
tures were first generated as a flat image,
then rotated to show elevation from a three-
dimensional perspective. A flight path was
traced from the river's mouth, proceeding
upstream along the main stem and into the
headwater areas. Elevation changes were
rendered both as distance from a surface
and with color. The fly-over video, synthe-
sized from 15 megabytes of elevation data,
gave a clear picture of the Cass River water-
shed landscape features including relative
size of the moraines, outwash plains, low-
lands, and drainage channels.
These results will provide a template for
future analyses. By developing the data
transfer and transformation routines neces-
sary for incorporating GIS data into the visu-
alization systems, we can readily subject
new data to similar analyses. Several new
tasks are planned for next year. Other
watersheds from our existing database
along with new fish community data
obtained during the 1994 field season will be
incorporated into the AVS system. The
biotic data will be explored to quantify
trends. This would be a new application for
visualization, which historically has been
used to interpret data which varies continu-
ously (i.e., temperature) rather than dis-
cretely (i.e., with biological population
data). We also intend to use supercomput-
ers for quantifying fragmentation, lacuniarity,
and other landscape spatial statistics. Cur-
rently the routines necessary for generating
these statistics are extremely computer
intensive, taking literally weeks of worksta-
tion CPU time. By taking advantage of the
parallel processing capabilities of supercom-
puting environments, we hope to greatly
reduce computational times and conse-
quently be able to analyze landscapes in
larger spatial scales.

References
Johnson, L.B. and C. Richards. 1992.
Investigating landscape influences on
stream macroinvertebrate communi-
ties. Water Resources Update 87:41-
48.
Richards, C., G.E. Host, and J.W. Arthur.
1993a. Identifications of predominant
environmental factors structuring
stream macroinvertebrate communities
within a large agricultural catchment.
Freshwater Biology 29: 285-294.
Richards, C. et ai., 1993b. Landscape influ-
ences on habitat, water chemistry, and
macroinvertebrate assemblages in mid-
western stream ecosystems. 29 pp +
appendices, October, ERL-Duluth
Report Number 5815.
Johnson, L, C. Richards, G. Host, and J.
Arthur. 1994. Landscape influences on
water chemistry in midwestern stream
132
NESC Annual Report - FY1994
-------
Vfsuatfzatton Techniques for the Analysis and Display of Chemical, Physical, and Biological Data Across
Regional Midwestern Watersheds
ecosystems. Internal report for publica-
tion. 28 p.
Richards, C. and G.E. Host. 1994. Examin-
ing land use influences on stream habi-
tats and macroinvertebrates: A GIS
Approach. Water Resources Bulletin In
30:729-738.
1 Q
-------
Visualization Techniques for the Analysis and Display of Chemical, Physical, and Biological Data Across
Regional Midwestern Watersheds
134 NESC Annual Report - FY1994
-------
" 1 •', i -i. - -W • "*•• ,'.-:
Estimation of Global Climate Change Impacts on Lake
and Stream Environmental Conditions and Fishery
Resources1'2'3 /
Abstract
Mathematical models have been devel-
oped for estimating the effects of global cli-
mate change on lake and stream thermal
structure, dissolved oxygen concentrations
and on fishery resources. These models
require the development of lake and stream
classification systems to define waterbody
types, which in turn require the availability of
extensive regional databases for these
resources. Fishery resource response pre-
dictions require the development of large
field temperature and fish distribution data-
bases from which species and guild thermal
requirements can be derived. Supercom-
puting capabilities are being utilized in the
development and manipulation of the large
databases to integrate the various data and
program modules, and to make the calcula-
tions required to perform regional impact
estimates.

EPA Research Objectives
According to a 1992 technical review of
several general circulation models of ocean
atmosphere heat budgets by the Intergov-
ernmental Panel on Climate Change, dou-
bling atmospheric concentrations of CO2
could increase global mean air temperatures
by 1.5 to 4.5°C in the next 50 years. This is
likely to have many environmental conse-
quences, for example, changes in water
temperature and dissolved oxygen concen-
trations which, in turn, are likely to affect fish
populations. The fact that such changes
would occur many times faster than have
occurred previously has resulted in requests
for information on effects and response
options to the climate changes. The Envi-
ronmental Research Laboratory - Duluth
and the University of Minnesota have initi-
ated a cooperative study to determine the
impacts of global warming on lake and
stream environmental conditions and fishery
resources. In order to conduct this study,
fish thermal requirements need to be esti-
mated using a historical fish presence/tem-
perature record database.

Background/Approach
The Fish and Temperature Database
Management System (FTDMS) is a national
database system that spatially and tempo-
rally associates discrete fish sample records
with water temperature data. Recent efforts
have concentrated on the expansion of data-
base content by assembling information
from a multitude of sources, including Fed-
eral agencies (i.e., EPA/STORET and
USGS) and private museum and university
collections. The assimilation of data from
many sources necessitates the need for
automated spatial and temporal matching of
a fish record with water temperature data.
Prior versions of several program modules
have been converted from a PC database
platform to C and have; been run on the
National Environmental Supercomputing
Center's (NESC) Cray,

Scientific Accomplishments
A modeling approach has been developed
for estimating the effects of global warming
on the environmental conditions of lakes and
streams and fisheries resources. The initial
phase of the work was partially supported by
EPA Office of Policy, Planning, and Evalua-
tion. Results from the studies have yielded
data that are currently being applied to in an
economic impact analysis of global climate
impacts on the U.S. One ongoing program
is a component of the Committee on Envi-
ronmental and Natural Resources (CENR),
Global Climate Research Program. At last
count, over twenty project reports, most in
NESC Annual Report - FY1994
135
-------
Estimation of Global Climate Change Impacts on Lake and Stream Environmental Conditions and Fishery
Resources
the form of technical journal articles, had
been produced by the project.

Results
The program module that calculates raw
temperature data into weekly mean values
has been run on the NESC's Cray for 28
states. The results have been used to cal-
culate the maximum (warmest throughout
the year) 95th percentile temperature where
a fish species was collected for over 50 spe-
cies of North American freshwater fish. This
temperature is used as an approximation of
the lethal limit for that species and allows us
to estimate the distribution of fish after glo-
bal climate change. The data has also been
integrated with GIS to visually examine the
relationship between fish presence and
maximum stream temperatures.
The speed with which the weekly mean
temperatures are calculated on the super-
computer makes it possible to perform the
temporal matching in a number of different
ways. This feature has the potential for
improving estimates of thermal requirements
for fish. For example, the southern range of
distribution of cool-water fish is generally
near 40° north latitude. Maximum weekly
mean values from south of this parallel
would be expected to provide a better esti-
mate of thermal tolerances than values from
all of North America. Temporal matching cri-
teria can be restricted in other ways (fish
and temperatures sampled in the same year,
season, or month). Comparing "monthly"
and "yearly" datasets provides a means of
examining the importance of the temporal
relationship of temperature and fish records.

Future Objectives
Currently, the weekly mean temperature is
used to describe surface water conditions
for the week in which a fish sample was
taken. The weekly maximum temperature,
daily means and daily maximums have also
been calculated and matched to fish collec-
tions to re-calculate the maximum 95th per-
centile temperature. These values provide a
unique and valuable look at the relationship
between various expressions of a fish's ther-
mal regime and its geographic distribution.
Most laboratory-derived measures of ther-
mal tolerance, the source of most past tem-
perature effects information, have employed
constant temperature exposure conditions
when short-term peaks might be as much or
more important in nature. The relationship
of cold temperatures to the distribution of
warm water fishes has only been;examined
superficially. The data storage and manipu-
lation requirements and modeling demands
will be greatly increased to accommodate
these additional analyses. Research is
underway to incorporate functional ecosys-
tem responses (e.g. system productivity)
into models projecting climate change
impacts. The area of ecological processes
and effects research as related to aquatic
ecosystems is large and can benefit greatly
from enhanced computational capabilities.

Bibliography
Hondzo, M., and H.G. Stefan. 1993.
Regional water temperature characteris-
tics of lakes subjected to climate
change. Climate Change 24:187-211.
Stefan, H.G., M. Hondzo, J.G. Eaton, and
J.H. McCormick. 1994. Predicted effects
of climate change on fishes in Minnesota
lakes. Canadian Spec. Publ. Fisheries
and Aquatic Sci. 121.
Stefan, H.G., M. Hondzo, J.G. Eatxin, and
J.H. McCormick. 1994. Validation offish
habitat models for lakes. Ecological
Modeling.
Stefan, H.G., and B.A. Sinokrot. 1993. Pro-
jected global climate change impact on
water temperature in five north central
U.S. streams. Climate Change 24:353-
381.
136
NESC Annual Report - FY1994
-------
Estimation of Global Climate Change Impacts on Lake and Stream Environmental Conditions and Fishery
Resources

1 J.G. Eaton, U.S. EPA, ERL-Duluth, 6201 Congdon Blvd., Duluth, MN 55804 218-720-5357.
2 H.G. Stefan, St. Anthony Falls Hydraulic Lab, Dept. Civil & Mineral Eng., Mississippi R. & 3rd Ave. S.E., Minneapolis,

3 R.M. Scheller, Science Applications International Corporation, ERL-Duluth, 6201 Congdon Blvd., Duluth, MN 55804
NESC Annual Report-FY1994 •> 137
-------
Estimation of Global Climate Change Impacts on Lake and Stream Environmental Conditions and Fishery
Resources i
138 NESC Annual Report - FY1994
-------
Integration of Computational and Theoretical Chemistry
with Mechanistically-Based Predictive Ecotoxicology
Modeling1'2'3'4
Background
In the field of environmental toxicology,
and especially in aquatic toxicology, a vari-
ety of computer-based models have been
developed that are scientifically-credible
tools for use in predicting and characterizing
the ecological effect and fate of chemicals
when little or no empirical data is available.
These models are used in ecological risk
assessments by EPA as well as other Fed-
eral and State agencies. In some instances
these models are used to predict the effects
of new chemicals being proposed for
release in the environment (e.g., under
TSCA approximately 2,000 new chemicals
per year must be reviewed), while in other
cases the models are used to diagnose pos-
sible cause and effect relationships for
chemicals at impacted sites (e.g., CERCLA,
CAAA, CWA). Diagnostic risk assessments
are especially challenging when it is realized
that over 50,000 chemicals exist in commer-
cial production; however, rudimentary toxic-
ity data is available only for about 10% of the
compounds. Finally, predictive toxicology
models are also used to help predict the
future outcome of ecosystems, based on dif-
ferent environmental management scenar-
ios (e.g., CERCLA, CAAA, CWA).

Research
Reports from the early 1980s, in both the
U.S. and the Netherlands, established that
the majority of industrial organic chemicals
(excluding pesticides and pharmaceutical
agents) elicit their acute toxic effects through
a narcosis mechanism. With the develop-
ment of initial toxicity data sets, numerous
QSAR investigations were undertaken, and
the findings of Veith et al. (1983) and Konne-
mann (1981) established that the potency of
narcotics was entirely dependent upon
xenobiotic hydrophobicity. With subsequent
experimental studies and modeling efforts it
has been generally accepted that the rela-
tionships reported by Veith et al. (1983) and
Konnemann (1981) represent the minimum,
or baseline, toxicity that a compound can
elicit in the absence of a more specific mode
of toxic action. With additional study it
became clear that there were subclasses of
narcotics - more potent than would be pre-
dicted from the baseline narcosis QSARs -
that could be classified by either acute
potency and/or physiological and behavioral
characteristics of the narcosis response.
Further, it was obvious that some industrial
chemicals were significantly more toxic than
would be predicted from narcosis QSARs
because they were capable of acting as oxi-
dative phosphoryla.tion uncouplers, respira-
tory chain blockers, or other more specific
mechanisms (e.g., see Bradbury, 1994).
Although many modes of toxic action can
be reasonably predicted using non-elec-
tronic descriptors, the likelihood of a com-
pound to act as a reactive toxicant has not
been well developed. The specific issue of
classifying reactive toxicants, and subse-
quently predicting their acute toxicity, has
been an area of interest because these
compounds are typically among the most
potent industrial chemicals; their identifica-
tion also raises concern over possible
chronic effects. Clearly, the ability to predict
chemical reactivity requires the use of
QSARs that employ stereoelectronic
descriptors. ;

Research Objectives
To reduce uncertainties in ecological risk
assessments of chemical stressors, a sec-
ond generation of advanced predictive mod-
eling techniques is required. The second
NESC Annual Report - FY1994
139
-------
Integration of Computational and Theoretical Chemistry with Mechanistically-Based Predictive
Ecotoxicology Modeling
generation models must be based on funda-
mental principles of chemistry, biochemistry
and toxicology, and designed in such a way
that will efficiently assess the thousands of
chemicals in commercial use. Research
must be directed towards developing mech-
anistically-based QSARs for reactive chemi-
cals for the purpose of improving toxicity
predictions, estimates of metabolism, and
ultimately, chemical similarity.
Studies have been undertaken to explore
specific toxicological processes and associ-
ated chemical reactivity parameters that will
establish a mechanistically-based approach
for screening compounds and stereoelec-
tronic indices for QSAR models, thereby
focusing future three-dimensional calcula-
tions. Based on hypotheses concerning
toxic mechanisms and metabolic activation
pathways, several studies have been con-
ducted to explore the use of stereoelectronic
descriptors and to identify potentially reac-
tive toxicants. Descriptors of soft electrophi-
licity and one electron reduction potential
have been calculated for a diverse group of
aromatic compounds and used to discrimi-
nate the narcosis mode(s) of toxic action
from mechanisms associated with covalent
binding to soft nucleophiles in biomacromol-
ecules and oxidative stress, respectively.
These studies are providing some insights
into ways to develop a mechanistically-
based strategy for selecting and using elec-
tronic indices in QSARs for biochemical and
cellular toxicity.

Approach and Results to Date
Recently, a series of studies done at the
Environmental Research Laboratory in
Duluth, MN (ERL-Duluth), have described
exploratory approaches through which
diverse groups of compounds in the ERL-
Duluth fathead minnow acute mode of action
database, are being identified using global
and local measures of reactivity. These
studies deal with modes of toxic action asso-
ciated with the bonding of soft electrophiles
and oxidative stress. Through these initial
investigations - generally based on semi-
empirical methods - approaches to predict-
ing soft electrophilicity (Mekenyan et al.,
1993; Mekenyan and Veith, 1994; Veith and
Mekenyan, 1993) photo-activation potential
(Mekenyan et al., 1994a,b) and one electron
reduction potential (Bradbury et al., 1994)
have been established. In some cases,
these descriptors are being related to mode
of toxic action classifications and potency.
The results from these studies are briefly
discussed below and suggest that a system-
atic approach can be employed whereby
specific stereoelectronic parameters can be
calculated from structure and used to predict
toxic responses associated with reactive
chemicals.

Photoactivation of Polycyclic Aromatic
Hydrocarbons (PAHs) ;
Research with a variety of aquatic species
has shown that while PAHs generally are not
toxic in conventional laboratory tests, many
are extremely toxic in the presence of sun-
light. The increased toxicity is attributed to
the excitation of the parent PAH to an acti-
vated intermediate through the absorption of
UV radiation and the subsequent generation
of reactive oxygen species when the acti-
vated species returns to the ground state.
Photo-induced toxicity is the result of com-
peting processes, such as stability and light
absorbance which interact to produce a
complex, multilinear relationship between
toxicity and chemical structure. Initial
research undertaken through ERL-Duluth
(Mekenyan et al., 1994a) established that for
a series of 28 compounds a measure of
energy stabilization in the ground-state
(HOMO-LUMO gap), provided a useful
index to explain persistence, light absorption
and photo-induced toxicity. An additional
study was then conducted, in which 'HOMO-
LUMO gaps' for excited states of PAHs were
determined and also related to phototoxicity
(Mekenyan et al., 1994b).

One Electron Reduction Potentials
Benzoquinones, within an appropriate one
electron reduction potential range, can be
140
NESC Annual Report - FY1994
-------
Integration of Computational and Theoretical Chemistry with Mechanistically-Based Predictive
Ecotoxicology Modeling
reduced by flavoproteins to semiquinones
that subsequently react with molecular oxy-
gen to form superoxide anion and the regen-
eration of the parent compounds. This
redox cycling, a form of futile metabolism,
produces reactive oxygen species and
depletes the reducing equivalents of cells
without concomitant energy production. The
resulting cytotoxic effects have been termed
'oxidative stress.' Because the ability of a
quinone to undergo redox cycling is related
to its one electron reduction potential, a
study with eight benzoquinones was con-
ducted to determine if this property can be
estimated from structure. The results of this
study suggest that one electron reduction
potentials between +99 mV and -240 mV
can be estimated by several electronic indi-
ces obtained from semi-empirical quantum
chemistry models.
QSARs for reduction potential were
derived from electronic properties of the par-
ent quinones as well as the semiquinone
radical anions. Charge delocalization was
judged to be an important factor in interpret-
ing relationships between molecular descrip-
tors and reduction potentials. From the set
of eight benzoquinones, it is apparent that
EHOMO for the parent compound, the
charge on the carbonyl atoms of the parent
molecules and localization of electron den-
sity assessed by aromaticity indices, were
molecular descriptors for estimating poten-
tial. Additional studies with naphthoquino-
nes, nitro aromatics, and phenols have also
indicated that descriptors associated with
charge delocalization are critical (Bradbury
etal., 1994).

Soft Electrophilicity
Initial insights reported by our research
group indicated that for a,p unsaturated
alcohols, aldehydes and ketones and
related allene derivatives, descriptors of soft
electrophilicity, such as average superdelo-
calizability (SSI), could differentiate those
compounds classified as narcotics from
those classified as reactive toxicants (Mek-
enyan et al., 1993). Isoelectrophilic
windows were established that separated
alcohols that act as narcotics (average S&
values of approximately 0.285) from reactive
aldehydes and ketones (average S£v values
of approximately 0.305). It was also
reported in subsequent investigations (Mek-
enyan and Veith, 1994; Veith and Meken-
yan, 1993) with a larger set of substituted
benzenes, phenols, and anilines (identified
as narcotics-l, narcotics-ll, oxidative phos-
phorylation uncouplers, and proelectro-
philes/electrophiles in the ERL-Duluth mode
of action knowledge base) that there is a
tendency for modes of toxic action to cluster
according to soft electrophilicity. In this
case, narcotic-l and narcotic-ll compounds
had average S^v values of 0.280, and com-
pounds typically classified as uncouplers or
proelectrophiles/electrophiles had average
S^v values of 0.345.
The clustering of uncouplers and electro-
philes within a common range of S^v sug-
gests that highly reactive compounds with a
dissociable proton are capable of disrupting
oxidative phosphoiylation, while those with-
out dissociable protons produce toxic
responses through covalent binding with soft
nucleophiles within biomacromolecules
(Veith and Mekenyan, 1993). This classifi-
cation is, of course, not applicable for dis-
cerning hard nucleophiles. Within the soft
isoelectrophilic windows, QSARs based on
log P have been established. These results
have led to the suggestion (Mekenyan and
Veith, 1994; Veith and Mekenyan, 1993) that
the acute toxicity of chemicals can be
described by a nearly orthogonal relation-
ship between molecular descriptors for
hydrophobicity and soft electrophilicity.

Future Objectives
Although the results of studies completed
to date that address predictions associated
with soft electrophiliicity, photo-activation
potential and one electron reduction poten-
tial are encouraging, ssjveral important
issues remain unresolved. Future efforts will
involve a systematic extension of the
hypotheses developed thus far and will
NESC Annual Report - FY1994
141
-------
Integration of Computational and Theoretical Chemistry with Mechanistically-Based Predictive
Ecotoxicology Modeling ;
permit further examination of several impor-
tant topics. Specifically, on-going investiga-
tions include: (1) an assessment of the
predictive capability of current models using
ab initio, rather than semi-empirical methods
and (2) an assessment of whether con-
former distributions rather than single-opti-
mized geometries are more appropriate for
predicting toxic modes of action and
potency.
These studies are essential for the
advancement of the current state of the sci-
ence of predictive ecotoxicology. Efforts
associated with the first objective will permit
a further substantiation of preliminary find-
ings concerning important modes of toxic
action associated with reactive chemicals. If
further substantiated by additional research,
new QSARs could subsequently be included
in an ERL-Duluth supported QSAR system
(ASTER), which is used throughout EPA and
other Federal, state, and international gov-
ernments (Russom etal., 1991). Because
the calculation of optimized geometries can
be CPU intensive and because the assump-
tion that chemicals behave as a single opti-
mized geometry in biological systems may
not be valid, a study has also been under-
taken to test hypotheses that distributions of
probable conformers, rather than single opti-
mized structures, are adequate for some
QSAR applications (Mekenyan et al.,
1994c). This second objective of our ongo-
ing investigations will be addressed through
semi-empirical calculations of stereoelec-
tronic parameters associated with photoacti-
vation, one electron reduction potentials and
soft electrophilicity. By using the ERL-
Duluth fathead minnow database, which
contains toxicity data for approximately 650
chemicals, these studies are contributing to
an overall assessment of CPU-effective
strategies to calculate stereoelectronic prop-
erties for large datasets of industrial chemi-
cals that require ecotoxicoiogical
evaluations.
References
Bradbury, S.P. 1994. Predicting modes of
toxic action from chemical structure: An
overview. SAR QSAR Environ. Res. 2,
89-104. :
Bradbury, S.P., Mekenyan, O.G., Veith, G.D.
and Kamenska, V. 1994. SAR models for
futile metabolism: II. One-electron
reduction of quinones, phenols and
nitrobenzenes. SAR QSAR Environ.
Res. (Submitted).
Konnemann, H. 1981. Quantitative struc-
ture-activity relationships in fish toxicity
studies. Part 1: relationship for industrial
pollutants. Toxicology 19, 209-221.
Mekenyan, O.G., Veith, G.D., Bradbury, S.P.
and Russom, C.L. 1993. Structure-toxic-
ity relationship for a,p-unsaturated alco-
hols in fish. Quant. Struct.-Act Relat. 12,
132-136.
Mekenyan, O.G. and Veith, G.D. 1994. The
electronic factor in QSAR: MO-parame-
ters, competing interactions, reactivity
and toxicity. SAR QSAR Environ. Res. 2,
129-143.
Mekenyan, O.Q., Ankley, G.T., Veith, G.D.
and Call, D.J. 1994a. QSARs for photo-
induced toxicity: I. Acute lethality of
polycyclic aromatic hydrocarbons to
Daphnia magna. Chemosphere 28, 56-
582.
Mekenyan, O.G., Ankley, G.T., Veith, G.D.
and Call, D.J. 1994b. QSAR estimates
of excited states and photoinduced
acute toxicity of polycyclic aromatic
hydrocarbons. SAR QSAR Environ. Res.
(Submitted).
Mekenyan, O.G., Ivanov, J.M., Veith, G.D.
and Bradbury, S.P. 1994c. Dynamic
QSAR: A new search for active confor-
mations and significant stereoelectronic
indices. QSAR Struct.-Act. Relat. (In
press). ;
142
NESC Annual Report - FY1994
-------
Integration of Computational and Theoretical Chemistry with Mechanistically-Based Predictive
Ecotoxicology Modeling
Russom, C.L, Anderson, E.B., Greenwood,
B.E. and Pilli, A. 1991. ASTER: An inte-
gration of the AQUIRE data base and
the QSAR system for use in ecological
risk assessments. In: J.L.M. Hermens
and A. Opperhuizen (Eds.), QSAR in
Environmental Toxicology-IV. Elsevier,
Amsterdam pp. 667-670.
Veith, G.D., Call, D.J. and Brooke, LT.
1983. Structure-Toxicity relationships for
the fathead minnow, Pimephales
promelas: Narcotic industrial chemi-
cals. Can. J. Fish. Aquat. Sci. 40, 743-
748.
Veith, G.D. and Mekenyan, O.G. 1993. A
QSAR approach for estimating the
aquatic toxicity of soft electrophiles.
Quant. Struct.-Act. Relat. 12, 349-356.
1 S.P. Bradbury and G.D. Veith, U.S. EPA, Environmental Research Laboratory-Duluth, Duiuth, MN.
2 O. Mekenyan, Lake Superior Research Institute, University of Wisconsin-Superior, Superior, Wl.
3 R. Hunter, Natural Resources Research Institute, University of Minnesota-Duluth, Duiuth, MN.
4 E. Anderson and E. Heide, Computer Sciences Corporation, Duiuth, MN.
NESC Annual Report - FY1994
143
-------
Integration of Computational and Theoretical Chemistry with Mechanistically-Based Predictive
Ecotoxicology Modeling
144
NESC Annual Report - FY1994
-------
Evaluation and Refinement of the Littoral Ecosystem Risk
Assessment Model (LERAM) Year Two1
Abstract
The goal of this research is to define the
domain of application of the Littoral Ecosys-
tem Risk Assessment Model (LERAM) using
a number of littoral enclosure and pond field
studies. These field studies tested the eco-
logical fate and effects of different types of
pesticides and toxic chemicals at different
times and in different geographic regions.
Once the best uses for LERAM have been
delineated and its ecological risk assess-
ment capabilities defined, it will be made
available to the regulatory and risk manage-
ment communities. It is anticipated that
LERAM will be used by the Office of Preven-
tion, Pesticides and Toxic Substances
(OPPTS) during the registration of pesti-
cides and industrial chemicals and by risk
managers for predicting changes in ecologi-
cal risk associated with watershed manage-
ment options.
The NESC supercomputing facility is
being used for three tasks:
1) Calibrating the parameters of the
LERAM. This is done using an optimiza-
tion algorithm to minimize a function that
measures the distance between the
LERAM simulation of population bio-
masses and biomass data from a littoral
ecosystem.
2) Creating a user-friendly interface for
LERAM that will allow the user to make
predictions of ecological risk to a littoral
ecosystem from exposure to specified
stressors and display the results in real
time.
3) Evaluating LERAM by simulating the
effects of a number of concentrations of
selected chemical stressors and com-
paring these simulations to field data.
The supercomputing environment is
used to run 500 (or more) Monte Carlo
iterations of the model at each treatment
concentration for the purpose of sensitiv-
ity and uncertainty analysis.

Research Objectives
At the present time, laboratory tests and
mathematical models form the basis of the
ecological risk assessment paradigm used
by the U.S. EPA. Until the early 1980s, sin-
gle species tests were used almost exclu-
sively to provide hazard assessments of
chemicals. At that time, the National Acad-
emy of Sciences (1981) and others (Levin
and Kimball, 1984) documented the need for
supplementary information from field tests,
microcosm experiments, and mathematical
models to better assess chemical hazards
for different geographic regions, seasons,
application methods, spatial scales, and lev-
els of biological organization. Along with the
increased interest in using field tests, micro-
cosm experiments,, and mathematical mod-
els to predict system responses to
perturbations, it became apparent that little
was known about the accuracy of predic-
tions made by these techniques. EPA's
objectives for the research proposed here
include evaluating and refining one ecologi-
cal risk assessment technique using field
data from controlled experiments in natural
systems. :

Background / Approach
With this in mind, Lake Superior Research
Institute (LSRI) and U.S EPA Environmental
Research Laboratory - Duluth (ERL-Duluth)
researchers began to develop the LERAM in
June, 1989. LERAM is a bioenergetic eco-
system effects model that links single spe-
cies toxicity data to a bioenergetic model of
the trophic structure of an ecosystem in
order to simulate community and ecosystem
NESC Annual Report - FY1994
145
-------
Evaluation and Refinement of the Littoral Ecosystem Risk Assessment Model (LERAM) Year Two
level effects of chemical stressors. It uses
Monte Carlo iterations of these simulations
in order to calculate probable ecological risk
assessments of chemical stressors. To
date, LSRI and ERL-D researchers have
developed LERAM to the point where it
models the unperturbed behavior of a littoral
ecosystem (i.e., the "behavior" of control
communities), and the response of that sys-
tem to the insecticide chlorpyrifos, with a
high degree of both accuracy and precision
(Hanratty and Stay, 1994). Additionally,
LERAM's ability to predict the ecological
effects of the insect growth regulator
diflubenzuron has also been evaluated
(Hanratty and Liber, 1994). This evaluation
revealed the need for some minor improve-
ments to the model.
During fiscal year 1994, LERAM has been
modified to improve its efficiency on a vector
processor, to increase the flexibility of the
code for representing different food webs
and ecosystems, and to increase the stabil-
ity and accuracy of its predictions. After the
changes were completed, the model param-
eters were recalibrated to give the best pos-
sible representations of the ecosystems in
the control littoral enclosures of two different
field studies performed in Duluth, Minnesota
(Rowan, 1990). These two representations
of littoral ecosystems then could be used to
predict the effects of toxic chemicals used to
examine the impact of the introduction of an
exotic species, or be modified to represent a
similar ecosystem with a few more or less
organisms (e.g., channel catfish were added
to the parameters) and population structure
that best represented the Duluth littoral
enclosures in 1988 in order to simulate the
littoral enclosure system used to measure
the effects of trifluralin during the 1994 field
season. Current work devoted to evaluating
the improvement of LERAM's ability to pre-
dict ecological effects using data from a lit-
toral enclosure study of the fate and effects
of nonylphenol (Liber, pers. comm.) appears
quite promising. However, further work is
required to complete the evaluation of these
refinements of the model, to validate its out-
put using data from littoral enclosure studies
in other geographic regions, and to develop
a more user-friendly interface for the model.

Comparison with Pre-Cray Results
When the calibration program is run on
other computers, such as a DEC VAX or a
486 PC, it takes so long that the program
becomes impractical to use. The model
itself can be run on other computers, but at
the loss of the advantage of being able to sit
and wait about a minute for the results — an
advantage that allows the user to concen-
trate on the ecological risk assessment or
modeling problem at hand. When the simu-
lations are performed on other computers,
the model can take any where from twenty
minutes to three hours, depending on the
computer and the number of Monte Carlo
iterations performed, so the user must find
something else to do while waiting for the
results.

Future Objectives
The tasks proposed for evaluating and
refining LERAM using the NESC supercom-
puting facilities are as follows:

Fiscal year 1995:
1) Create a user-friendly interface for
LERAM.
2) Calibrate the LERAM parameters using
the control experimental units from the
trifluralin littoral enclosure study per-
formed at Auburn University in 1994.
3) Simulate the effect of trifluralin using the
parameters that best represent a Duluth
littoral ecosystem and using the parame-
ters that best represent an Auburn littoral
ecosystem in order to further evaluate
LERAM; to investigate methods for using
LERAM to extrapolate to different geo-
graphic regions, and to further define its
domain of application.
146
NESC Annual Report - FY1994
-------
Evaluation and Refinement of the Littoral Ecosystem Risk Assessment Model (LERAM) Year Two
Fiscal year 1996:
1) Develop a model similar to LERAM that
will simulate a wetland ecosystem.
2) Investigate methods for simulating eco-
logical risk at larger spatial and temporal
scales.
3) Apply these methods to make predic-
tions concerning ecosystems contained
in the Great Lakes.

References
Bartell, S.M. 1987. Technical Reference
and User Manual for Ecosystem Uncer-
tainty Analysis (EUA): 1) - The Pascal
PC Demonstration Program, 2) - The
Standard Water Column Model
1(SWACOM), 3) - The Comprehensive
Aquatic System Model (CASM). U.S.
EPA Office of Toxic Substances Report.
Bowie, G.L, W.B. Mills, D.B. Porcella, C.L.
Campbell, J.R. Pagenkopf, G.L. Rupp,
K.M. Johnson, P.W.H. Chan, S.A.
Gherini, Tetra Tech, Inc. and C.E. Cham-
berlin. 1985. Rates, Constants, and
Kinetics Formulations in Surface Water
Quality Modeling (Second Edition). EPA
600/3-85/040. U.S. Environmental Pro-
tection Agency, Athens, Georgia.
Brazner, J.C., L.J. Heinis, and D.A. Jensen.
1989. A littoral enclosure design for rep-
licated field experiments. Environmental
Toxicology and Chemistry 8:1209-1216.
Hanratty, M.P. and K. Liber. 1994. E-valua-
tion of model predictions of the persis-
. tence and ecological effects of
diflubenzuron in littoral enclosures. In,
M.F. Moffet (ed.) Effects, Persistence
and Distribution of Diflubenzuron in Lit-
toral Enclosures. U.S. EPA Environmen-
tal Research Laboratory- Duluth, Duluth,
Minnesota.
Hanratty, M.P. and F.S. Stay. 1994. Field
evaluation of the Littoral Ecosystem Risk
Assessment Model's predictions of the
effects of chlorpyrifos. J. Appl. Ecol.
31:439-453.
Hanratty, M.P., F.S. Stay, and S.J. Lozano.
1992. Field Evaluation of LERAM: The
Littoral Ecosystem Risk Assessment
Model, Phase I. U.S. EPA Environmen-
tal Research Laboratory- Duluth, Duluth,
Minnesota.
Hanratty, M.P. and S.J. Lozano. Field evalu-
ation of modeling and laboratory risk
assessment methods. In preparation.
Levin, S. A. and K. D. Kimball. 1984. New
Perspectives in Ecotoxicology. Environ-
mental Management 8:375-442.
Lozano, S.J., J.C. Brazner, M.L. Knuth, L.J.
Heinis, K.W. Sargent, O.K. Tanner, L.E.
Anderson, S.L. O'Halloran, S.L. Ber-
telsen, D.A. Jensen, E.R. Kline, M.D.
Balcer, F.S. Stay, and R.E. Siefert.
1989. Effects, Persistence and Distribu-
tion of Esfenvalerate in Littoral Enclo-
sures. Final Report 7592A. U.S.
Environmental Research Laboratory-
Duluth, Duluth, MM 55804. Revised
ApriM990.
Rowan, T. 1990. Functional Stability Analy-
sis of NumericaJ Algorithms, Ph.D. The-
sis, University of Texas, Austin.
1 M.P. Hanratty and K.A. McTavish, Lake Superior Research Institute, University of Wisconsin-Superior, Superior, Wl.
2 F.S. Stay, U.S. EPA Environmental Research Laboratory-Duluth, Duluth, MN. >
NESC Annual Report - FY1994
147
-------
Evaluation and Refinement of the Littoral Ecosystem Risk Assessment Model (LERAM) Year Two
148
NESC Annual Report - FY1994
-------
The Reactivities of Classes of Environmental Chemicals
Understanding Potential Health Risks1-2
The U.S. Environmental Protection
Agency is often faced with the problem of
making regulatory decisions about chemi-
cals for which there is little or no information
on possible health effects. The Office of
Toxic Substances receives applications for
the registration of approximately 2,000 new
chemicals a year. Of those there is acute
toxicity data for only about 35% of the chem-
icals. There is some data on physical prop-
erties and little data on animal toxicity. We
are learning about the many chemicals pro-
duced by secondary processes in the atmo-
sphere for which health data is only now
beginning to be collected. Additionally, there
are chemicals in the atmosphere produced
by combustion, chemicals in drinking water
that result from the purification processes
and chemicals in waste dumps for which
there is a paucity of human health data.
While the necessary data for the evalua-
tion of the potential hazard of these chemi-
cals is being developed, structure activity
relationships developed from the molecular
modeling paradigms may be used to provide
a tool for initial evaluation and provide a
rational basis for the prioritization of testing
needs. In order to more efficiently use these
techniques, it is necessary to understand the
molecular basis for the toxicity of chemical
classes of environmental importance.
There are two general strategies for
obtaining these insights:
• Postulation and development of a molec-
ular model for a specific mechanism of
action from existing biological data and
related chemical information. This
model may be quantified and tested
using computational molecular tech-
niques.
• Determination of correlations between
the structure or properties of molecules
and their biological activity. These corre-
lations are determined using various
types of statistical or artificial intelligence
techniques froim compiled data bases for
chemicals tested in sufficiently similar
protocols. ,
In both of these strategies for the develop-
ment of computational structure activity rela-
tionships, the model is derived from
available experimental information and its
quantitative results suggest additional
experiments for the verification and improve-
ment of the model,. In practical situations,
often elements of both 1.) causal and 2.)
correlative, strategies are used.
Recent advances in: 1) in the computa-
tional methods available for molecular mod-
eling, 2) the speed and storage capacity of
computers, and 3) the experimentally
acquired knowledge of the specific molecu-
lar targets for xenobiotics, make the applica-
tion of these methods to environmental
problems more practical. These methods
have made it possible to perform quantum
mechanical calculations in some instances
on the xenobiotic and its biomolecular target
in an aqueous surroundings. They make it
possible to use molecular dynamics and
mechanics methods to compute the struc-
tural changes in a biopolymer induced by
the binding of a xenobiotic to relevant
sequences in the biopolymer.
Polycyclic Aromatic Hydrocarbons
(PAHs) are a prevalent class of man-made
chemicals of varying mutagenic and carcino-
genic potency. There is a great deal of
experimental evidence to indicate that the
mechanism through which chemicals in this
class are active, requires metabolic activa-
tion to an epoxide or polyol-epoxide. In
association with protonation, the epoxide
opens and binds to nudeophilic sites in
NESC Annual Report - FY1994
149
-------
The Reactivities of Classes of Environmental Chemicals - Understanding Potential Health Risks
DNA. Because the metabolic formation of
the epoxide is enzymatic, both the electronic
configuration of the PAH and its interaction
with the enzyme active site determine the
site of epoxidation. However, the direction
of the opening of the protonated epoxide is
spontaneous.
Experimental scientists at the Health
Effects Research Laboratory (HERL) are
studying this class of chemicals, including a
subclass, the cyclopenta-PAHs (cPAH) (mol-
ecules that in addition to the six membered
rings of other PAHs also contained a five
membered ring). In the cPAHs under con-
sideration, the five membered ring was
formed by the addition of two carbon atoms
with an unsaturated bond to a backbone that
contained only six membered rings. We
were asked if we could determine the direc-
tion of ring opening of an epoxide formed
between the carbon atoms that complete the
five membered ring as shown in Figure 1.
The direction of ring opening determines the
atom that would bind to nucleophilic sites in
DNA.
AM1 was used to determine the three
dimensional structures of the carbocations
that could be formed by ring opening. The
electronic structures, energies and frontier
orbitals were determined using the Gauss-
ian series of programs with a 3-21 g basis set
for those carbocation geometries: The Mul-
liken definition of charge was used for deter-
mining group charges and atomic LUMO
densities. Molecular electrostatic potentials
were computed using our local multipole
method.
Rgure 1: A schematic diagram for the ring opening of aceanthrylene 1,2-epoxide
150
NESC Annual Report - FY1994
-------
The Reactivities of Classes of Environmental Chemicals - Understanding Potential Health Risks
The 2-hydroxy-carbocation was favored
over the 1 -hydroxy-carbocation for acean-
thrylene by 11.8 kcal/mol. This indicates
that the 1 -position (the nominally charged
position for the 2-hydroxy-carbocation) will
bind to DNA to form adducts. The surprising
result of these calculations is that the CH6
group is more positively charged than the
nominally charged CH1 group. The local
LUMO density is also greater at the 6 posi-
tion than the 1 position. Computation of the
molecular electrostatic potential of the 2-
hydroxy-carbocation also showed that the
region near the CH6 is more likely to attract
positively charged species than the CH1
position. All these local reactivities suggest
the possibility of binding to C6 and not the
nominally charged C1. However, similar cal-
culations of the binding of that carbocation
to small model nucleophiles show that the
binding to the 1 position is significantly
favored. The electronic effects are more
important than the electrostatic effects, com-
puted by these single molecule descriptors,
in determining the binding position.
A series of 13 similar cPAHs were investi-
gated. They all contained a backbone of six
membered rings (2-4) and a five-membered
ring completed in a similar manner to acean-
thrylene. It was found that these cPAHs
could be divided into two classes. Those
that had at least three linear six-mernbered
rings and the five membered ring configured
so that one of the carbon atoms was
attached to a middle ring, like acean-
thrylene, and those that did not. For the first
class the energy difference between the two
carbocations was large (>7.5 kcal/mol) and
favored the formation of the carbocation with
the nominal charge on the CH group
attached to the middle ring (like the two-car-
bocation of aceanthrylene). For those car-
bocations a large amount of the positive
charge was localized on the CH group in the
middle ring para to the nominally charged
group (like the CH6 group in aceanthrylene).
For the second class, the energy difference
between the two possible carbocations was
small (<4.0 kcal/mol) and the nominally
charged CH group always was the most
highly charged. ;
Experimental information was available
for four of the cPAHs studied, two from each
group. For the two from the first group, the
experimental data was consistent with the
formation of only one carbocation, the one
predicted by the computations. For the two
from the second group, the experimental
data indicates that both carbocations are
formed. This information suggests that
while our methods are capable of correctly
dividing the class into two subclasses based
on reactivity, the energy differences between
the two potential carbocations may be over-
estimated.
In order to understand the basis for this
overestimation of the energy difference
between the carbocations, ab initio methods
with both the 3-21 g and 6-31 g basis set and
a semiempirical method that includes the
bulk effects of water (AMSOL/SM2) were
used to obtain the carbocation geometries
and energies. It was found that the geome-
try changes introduced by these methods
had an insignificant effect on the difference
in energy between the carbocation pairs,
even though it caused a change in the total
energy of the individual cations that was the
same order of magnitude as the energy dif-
ference between the carbocations. The
introduction of the effects of water in the
semiempirical Hamiltonian (the AMSOL/
SM2 calculation) significantly decreased the
energy difference between the carbocation
pairs. The division of the cPAHs into two
classes remained unchanged. The inclusion
of bulk water stabilizes charge separation in
the carbocations and makes all atomic and
group charges larger. Within the carboca-
tion pairs, the higher energy carbocation has
greater charge separation and therefore is
more stabilized by the inclusion of water and
the energy difference between the carboca-
tion pairs becomes smaller.
NESC Annual Report - FY1994
151
-------
The Reactivities of Classes of Environmental Chemicals - Understanding Potential Health Risks
Relevant Publications
Weinstein, H., J.R. Rabinowitz, M.N. Lieb-
man and R. Osman, 1985. Determi-
nants of molecular reactivity as criteria
for predicting toxicity: Problems and
approaches, Environ. Health Perspect.
61,147-162.
Rabinowitz, J.R. and S.B. Little, 1991. Pre-
dictions of the reactivities of cyclopenta-
polynuclear aromatic hydrocarbons by
quantum mechanical methods, Xenobi-
otica 21,263-275.
Venegas, R.E., P.M. Reggio and J.R.
Rabinowitz, 1992. Computational Stud-
ies of the 3-dimensional structure of
cyclopenta polycyclic aromatic hydrocar-
bons containing a gulf region, Int. J.
Quant. Chem. 41, 497-516.
Rabinowitz, J.R., and S.B. Little, 1992.
Epoxide ring opening and related reac-
tivities of cyclopenta polycyclic aromatic
hydrocarbons: quantum mechanical
studies, Chem. Res. in Tox. 5; 286-292.
1 James R. Rabinowitz and Lan Lewis-Sevan, Carcinogenesis and Metabolism Branch, Health Effects Research
Laboratory, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711.!
2 Stephen B. Little, Integrated Laboratory Systems, Research Triangle Park, NC 27709.

NOTE: Lan Lewis-Sevan is a postdoctoral fellow in the Program in Toxicology of the University of North Carolina. This
report does not reflect U.S. EPA policy. The research described in this paper has been reviewed by the Health
Effects Research Laboratory of the U.S. Environmental Protection Agency and approved for publication.
Approval does not signify that the contents necessarily reflect the views and policy of the Agency nor does
mention of trade names or commercial products constitute endorsement or recommendation for use.
152
NESC Annual Report - FY1994
-------