United States Environmental Protection Agency
National Environmental Supercomputing Center (NESC)
135 Washington Avenue
Bay City, MI 48708-5845

EPA 208/K-95-001
February 1995

&EPA      FY 1994 National Environmental Supercomputing Center (NESC) Annual Report

-------
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
NATIONAL ENVIRONMENTAL SUPERCOMPUTING CENTER
The NESC Mission:

The mission of the NESC is to
provide compute-intensive
processing for scientific appli-
cations that have a high prior-
ity within the EPA's overall
mission. These applications
will come from the EPA
researchers, regulatory staff,
and support personnel. In
addition, the NESC will ser-
vice those universities, agen-
cies, and private companies
external to the EPA having grants, cooperative agreements, and
memoranda of understanding with the EPA, in which their sci-
ence qualifies as compute-intensive and has a high priority
within the EPA.
The computational services of the NESC include:
•   Management of a wide range of hardware, software, and
    services into an integrated supercomputing service.
•   Centralized compute-intensive processing facility.
•   Data communications networks.
•   Consultation services relating to the NESC.
A secondary mission of the NESC is to promote environmental
modeling and computational science within local schools, by
means of academic-year and summer programs.

-------
   Contents
 Introduction to the NESC FY1994 Annual Report .......... 1
 Message from the NESC Director .......... 3
 Framework for Environmental Modeling .......... 19
 Models of Service Level for Jobs Submitted to the NESC's HPC Resources .......... 23
 Second Annual International Environmental Visualization Workshop .......... 31
 Calculating the Rates of DNA-Catalyzed Reactions of Aromatic Hydrocarbon Diol Epoxides .......... 33
 Prediction of Oxidative Metabolites by Cytochrome P450s with Quantum Mechanics and
   Molecular Dynamics Simulations .......... 37
 High Performance Computing For Environmental Research .......... 45
 U.S. EPA Workshop on Watershed, Estuarine, and Large Lakes Modeling (WELLM) .......... 49
 Visualization for Place-Based Management .......... 57
 Experimental and Calculated Stabilities of Hydrocarbons and Carbocations .......... 59
 Infrared Frequencies and Intensities Calculated from MM3, Semiempirical, and Ab Initio
   Methods for Organic Molecules .......... 65
 Ab Initio Calculations of Proton Affinities, Ionization Potentials, and Electron Affinities
   of Polynuclear Aromatic Hydrocarbons and Correlation with Mass Spectral Positive and
   Negative Ion Sensitivities .......... 71
 Parameterization of Octanol/Water Partition Coefficients (LogP) Using 3D Molecular
   Properties: Evaluation of Two Published Models for LogP Prediction .......... 77
 Regional Acid Deposition Model (RADM) Evaluation .......... 83
 Atmospheric Deposition of Nitrogen to Chesapeake Bay .......... 85
 Study of the Feasibility of Acidic Deposition Standards .......... 91
NESC Annual Report - FY1994

-------
Contents
1990 Clean Air Act Section 812 Retrospective Study .......... 95

The Role of Supercomputers in Pollution Prevention: Predicting
  Bioactivation for the Design of Safer Chemicals .......... 97

Application of Molecular Orbital Methods to QSAR and LFER:
  Understanding and Characterizing Interactions of Solutes and Solvents .......... 99

Calculated Infrared Spectra for Halogenated Hydrocarbons .......... 105

Development of Physiologically Based-Pharmacokinetic and
  Biologically Based-Dose Response (PB-PK/BB-DR) Models
  for Dioxins and Related Compounds Incorporating Structure-Activity
  Considerations .......... 113

The Chesapeake Bay Program Simulation Models: A Multimedia
  Approach .......... 117

Visualization Techniques for the Analysis and Display of Chemical,
  Physical, and Biological Data Across Regional Midwestern Watersheds .......... 131

Estimation of Global Climate Change Impacts on Lake and Stream
  Environmental Conditions and Fishery Resources .......... 135

Integration of Computational and Theoretical Chemistry with
  Mechanistically-Based Predictive Ecotoxicology Modelling .......... 139

Evaluation and Refinement of the Littoral Ecosystem Risk
  Assessment Model (LERAM) Year Two .......... 145

The Reactivities of Classes of Environmental Chemicals -
  Understanding Potential Health Risks .......... 149
                                                    NESC Annual Report - FY1994

-------
   Introduction to the NESC FY1994 Annual Report1
    The National Environmental Supercom-
  puting Center (NESC) is EPA's latest invest-
  ment to assure that science of the highest
  quality is conducted to support environmen-
  tal protection.  The benefits of the NESC to
  the public and  EPA programs center on fos-
  tering collaborative efforts among scientists
  to bring the results of scientific research to
  the decision making process.
   To achieve success at the NESC, four
  tightly integrated programs are maintained.
   • First, operation of a supercomputing
     resource to provide the maximum
     amount of computing time to research-
     ers is the backbone of the NESC
     mission.
   • Second, a strong computational science
    support effort for Agency scientists is
    essential to ensure the efficient use of
    resources and the improvement of math-
    ematical models.
   • Third, an aggressive and innovative
    program in visualization and presenta-
    tion of scientific information is directed
    toward scientists, environmental deci-
    sion making officials, and the public.
   • Fourth, collaborative efforts among all
    groups are strongly supported through a
    nationwide telecommunications network,
    workshops and seminars, and educa-
    tional programs such as Earth Vision:
    EPA's Grand Challenge for High
    Schools.
   In its first years of operation, the NESC
has become the central resource for carry-
ing out the research programs that are vital
to large-scale studies of ecological and bio-
logical systems. Its continued success in
supporting these efforts depends upon the
collaboration among scientists, managers,
and the public.
  The NESC remains dedicated to providing
not only supercomputing resources, but also
the collegial environment necessary for that
collaboration.
  Walter M. Shackelford, Director of Scientific Computing, U.S. EPA, National Data Processing Division (NDPD),
     Research Triangle Park, NC 27711.
NESC Annual Report - FY1994

-------
Introduction to the NESC FY1994 Annual Report
                                                            NESC Annual Report - FY1994

-------
   Message from the NESC Director
 Overview of The NESC
   The United States Environmental Protec-
 tion Agency's (EPA) National Environmental
 Supercomputing Center (NESC) has a sin-
 gle purpose: to support world-class environ-
 mental research.  The NESC provides
 scientists and researchers with the high-per-
 formance computational tools necessary to
 solve EPA's Grand Challenges.
   On August 25, 1992, the first EPA-owned
 supercomputer, a Cray Research Y-MP 8i/
 232,  was delivered and installed at the cen-
 ter. In 1994, this computer was replaced
 with a Cray C94/264 (subsequently
 upgraded to a C94/364). The replacement
 was required to satisfy the demands of com-
 pute-intensive jobs (e.g., computational
 chemistry) and those requiring more mem-
 ory space (e.g., the Regional Acid Deposi-
 tion Model).
   The NESC is housed in an 80-year-old
 building near downtown Bay City. The build-
 ing was extensively renovated to make it
 suitable as a supercomputing center.
 Despite that renovation, the building retains
 much of its circa 1910 Chicago-style exte-
 rior, belying the state-of-the-art sophistica-
 tion and functionality that lies within. In
 addition to housing the computer, the build-
 ing includes a Visitor's Center, classroom,
 visualization laboratory, and staff offices.

 The NESC's Customers
   In FY1994, 51 research projects had allo-
 cations on the NESC's supercomputer.
 Table 1, page 4, shows a summary of the
 projects. Table 2, page 7, shows a break-
 down  of the FY1994 projects by EPA
 organization.
  Although the bulk of research projects at
the NESC come from the ORD laboratories,
 a broad cross-section of EPA organizations
 are using the computing resources at the
 NESC to approach problems in a wide vari-
 ety of disciplines.
   Within ORD, the breakdown is shown in
 Table 3, page 7. Table 4, page 8, lists the
 NESC's customers by their area of research.

 NESC Usage

   The NESC became officially operational in
 October of 1992. Within three months,
 demand for the NESC's resources very
 nearly saturated the first Cray Y-MP super-
 computer, as well as the third processor that
 was soon added. The following May (1993), the
 initial Storage Technology silo (capacity of
 about one Terabyte) was full of environmen-
 tal research data,  and a second silo was
 installed. On May 6, 1994, the Cray Y-MP
 was replaced with a more powerful Cray
 C94. The C94's two processors were
 almost immediately saturated, and on Octo-
 ber 12, 1994, a third processor was added.
 Demand for the first four months of FY1995
 has averaged 91%. At this time the upper
 limit of research demand is unknown.
  It is useful to visualize the distribution of
 projects and organizations together with the
 percentages of the use of supercomputing
 resources at the NESC.  Figure 1, page 9,
 shows the usage of the NESC Cray C94
 supercomputer (Sequoia) for FY1994. Fig-
 ure 2, page 10, shows FY1994 Sequoia
 usage by project.  Figure 3, page 11, shows
 Sequoia FY1994 usage by organization.
  Three projects (Earth Systems Modeling,
 Regional Acid Deposition Model, and Struc-
ture Activity Relations) use more than one-
half of the NESC's resources, and eight
projects (Regional Acid Deposition Model,
Earth Systems Modeling, Structure Activity
Relations, Regional Oxidant Model -
NESC Annual Report - FY1994

-------
Message from the NESC Director
                            Table 1: NESC FY 1994 Projects

Project    EPA Lab             Description
AQDECON    RSKERL Ada, ORD     Transport of Multiple Components in Aquifers under Biological or Chemical Reactions
ARSENIC    HERL, ORD           Pharmacokinetic modeling of arsenic speciation, ligand exchange chemistry, and toxicokinetics
ASTER      ERL Duluth, ORD     Database of Electronic Structures for Environmental Risk and Toxicology
BUFFSED    LLRS, ORD           Buffalo River Contaminant Transport Modeling
CERES      ERL Athens, ORD     Earth Systems Model
CHBAY      CBPO                Chesapeake Bay Program Water Quality Models
CORVTEST   ERL Corvallis, ORD  Corvallis Evaluation Account
DOSMETRY   HERL, ORD           Multiple Paths Dosimetry Model for Gas Uptake
ENGINE     NVFEL, OAR          Flow Field Fuel-Air Mixing and Combustion Process in an Alternative Fueled Engine
ERTOM      RSKERL, ORD         Inverse Solution of Electrical Resistance Tomography
FOXSED     LLRS, ORD           Fox River/Green Bay Contaminant Transport Modeling
GENESIS    ERL Athens, ORD     GENESIS Earth Systems Modeling Project
GENSED     LLRS, ORD           Sediment and Contaminant Transport and Fate in Aquatic Systems
GLOBCHEM   AREAL, ORD          Global Chemical Transport Modeling for MODELS3
HPCC       AREAL, ORD          High Performance Computing and Communications
HSPF       ERL Athens, ORD     Linked Airshed/Watershed/Water Quality Simulation Models
HYDRO3D    ERL Athens, ORD     Integration of 3D Hydrodynamic, Water Quality, and Sediment Models in Surface Waters

4                                       NESC Annual Report - FY1994

-------
                                                          Message from the NESC Director

                        Table 1:  NESC FY 1994 Projects (continued)

Project    EPA Lab              Description
INDOOR     AREAL, ORD           Indoor Air Exposure Modeling Program
JUNEAU     EPA Region 10        Alaska Juneau Mine, Gastineau Channel
LAKEGCM    ERL Duluth, ORD      Estimation of Global Climate Changes and Agricultural Activities on Lakes and Streams
LERAM      ERL Duluth, ORD      Evaluation and Refinement of Ecosystem Risk Assessment Models
LISOUND    EPA Region 1         Nitrogen Discharge Allocation in Long Island Sound
LUNG       HERL, ORD            Mathematical Modeling of Fluid Dynamics in Lung Airways and Behavior of Inhaled Particles
MICHUAM    EPA Region 5         Photochemical Modeling for the Detroit-Ann Arbor Nonattainment Area
MICRO      AREAL, ORD           Exposure Assessment Research in Microenvironments
MM4        AREAL, ORD           Mesoscale Meteorological Modeling for Air-Quality Simulations
MOHAVE     AREAL, ORD           Intensive Meteorological and Dispersion Simulations for the MOHAVE Field Study
MOMOLSAR   HERL, ORD            Molecular Modeling
NEWCHEM    OPPT                 Toxicity of New Chemicals
ONTARIO    LLRS, ORD            A Linked Hydrodynamic-Water Quality Model for Lake Ontario
PAHLV      EMSL Las Vegas, ORD  Ab Initio Calculations on Polynuclear Aromatic Hydrocarbons
PCB        EPA Region 7         Theoretical IR spectra of polychlorobiphenyls and dioxins
PHOTOX     ERL Duluth, ORD      Prediction of Photo-oxidation by PAH Modeling
QSARHERD   OPPT                 Molecular modelling for health effects research
RADM       AREAL, ORD           Regional Acid Deposition Model
NESC Annual Report - FY1994

-------
Message from the NESC Director

                        Table 1:  NESC FY 1994 Projects (continued)

Project    EPA Lab              Description
RELMAP     AREAL, ORD           Regional Toxic Deposition Modeling
ROMA       AREAL, ORD           Regional Ozone Modeling - Research and Development
ROMO       OAQPS, OAR           Regional Ozone Modeling to Support Clean Air Act Mandates
RPM        AREAL, ORD           Regional Particulate Modeling
SACRAIR    EPA Region 9         Sacramento Area Tropospheric Ozone Modeling for FIP
SARMAP     EPA Region 9         SJVAQS/AUSPEX Regional Model Adaptation Project
SEDCRIT    EPA OW               Development of Toxic Sediment Criteria Methodologies
SNAKE      EPA Region 10        A River Basin Modeling Framework for Ecological Risk Analysis
SOS        AREAL, ORD           Southern Oxidant Study: Urban-Scale Photochemical Modeling
SPECIES    ERL Corvallis, ORD   Heuristic Approximations to the Exact Set Coverage of Biogeographic Data
SUBSURF    RSKERL, ORD          Uncertainty Analysis of Subsurface Hydrocarbon Releases
TOXMODE    ERL Duluth, ORD      Artificial Intelligence Systems for Toxic Mode-of-Action Predictions
TRANSPRT   RSKERL, ORD          Three-dimensional Multiphase Flow and Contaminant Transport Mathematical Model
UAM        AREAL, ORD           Urban Photochemical and Meteorological Modeling
VENTUFIP   EPA Region 9         Ventura Air Basin Federal Implementation Plan Study
WIPP       ORIA, OAR            Waste Isolation Pilot Plant Performance Assessment Modeling
NESC Annual Report - FY1994

-------
                                                           Message from the NESC Director

                             Table 2: NESC Project Affiliations

Affiliation                                                Number of Projects
Office of Research and Development                         36
Regional Offices                                            8
Program Offices                                             1
Office of Air and Radiation                                 3
Office of Prevention, Pesticides and Toxic Substances       2
Office of Water                                             1

                          Table 3: NESC Projects - by Laboratory

Laboratory         Number of Projects
AREAL              12
ERL Duluth          5
RSKERL Ada          4
ERL Athens          4
HERL                4
LLRS                4
ERL Corvallis       2
EMSL Las Vegas      1
NESC Annual Report - FY1994

-------
Message from the NESC Director
                        Table 4:  NESC Projects - Areas of Research

Research Area                 Number of Projects  Projects
Air Quality                   13                  INDOOR, MICHUAM, MICRO, RADM, RELMAP, ROMA, ROMO, RPM, SACRAIR, SARMAP, SOS, UAM, VENTUFIP
Computational Chemistry        9                  ARSENIC, ASTER, MOMOLSAR, NEWCHEM, PAHLV, PCB, PHOTOX, QSARHERD, TOXMODE
Water Quality                  8                  BUFFSED, CHBAY, FOXSED, GENSED, HYDRO3D, LISOUND, ONTARIO, SEDCRIT
Ground Water Transport         4                  AQDECON, ERTOM, SUBSURF, TRANSPRT
Global Climate Modeling        4                  CERES, GENESIS, GLOBCHEM, LAKEGCM
Health Effects                 2                  DOSMETRY, LUNG
Emissions                      1                  ENGINE
Ecological and linked models   5                  HSPF, JUNEAU, LERAM, SNAKE, SPECIES
Meteorology                    2                  MM4, MOHAVE
Other                          3                  CORVTEST, HPCC, WIPP
OAQPS, Polynuclear Aromatic Hydrocar-
bons-UC, San Joaquin Valley Air Quality,
Chemical/Ecological Relation, and Mesos-
cale Meteorological Modeling) use more
than three-quarters of the NESC's resources
(see Figure 2, page 10).
  Figure 3, page 11, illustrates measured
percentages of NESC usage by organiza-
tion. Two organizations, Athens ERL and
AREAL, use 60% of the NESC's supercom-
puter. Four organizations,  AREAL, Athens
ERL, HERL, and OAQPS comprise three-
quarters of the demand.  And seven organi-
zations, AREAL, Athens ERL, HERL,
OAQPS, Las Vegas EMSL, Great Lakes,
and Region 9, use more than 94% of the
supercomputing resources.
Hardware
  The NESC is a state-of-the-art supercom-
puting center. Figure 4, page 12, is a simpli-
fied schematic of the NESC's hardware
configuration as of September 1994.  The
major hardware components are detailed in
the following paragraphs.

Cray C94 - Sequoia
  The NESC's supercomputer, named
Sequoia, is a Cray Research C94/364.  The
C94 was installed in May 1994 and replaced
the NESC's original machine, a Cray Y-MP
8i/364. Specifically designed for scientific
and engineering disciplines, the C94 is one
of the most powerful supercomputers cur-
rently available.
  Sequoia has three central processing
units (CPUs), 64 megawords of central
                                                          NESC Annual Report - FY1994

-------
                                                          Message from the NESC Director

[Figure 1: FY1994 usage of the NESC Cray C94 supercomputer (Sequoia).]
NESC Annual Report - FY1994

-------
Message from the NESC Director
[Figure 2: FY1994 Sequoia usage by project.]
-------
                                                       Message from the NESC Director
[Figure 3: FY1994 Sequoia usage by organization.]
NESC Annual Report - FY1994
                                                       11

-------
Message from the NESC Director
                Figure 4: Hardware Configuration Overview, National Environmental Supercomputing
                                 Center (NESC) - September 1994

  - Cray C94/364: 3 processors, 64 Megawords memory, 512 Megaword SSD
  - Communications links
  - SAS server: DG Aviion 5240
  - Chemistry server: SGI Indigo 2
  - StorageTek STK 4400 tape silos (2): 2.4 Terabytes total
  - 1600/6250 9-track tape drives (4)
  - DD-60 disk drive units (24) and DD-62 disk drive units (16): 90.7 Gigabytes total
memory, and a 512 megaword solid-state
storage device (SSD). The C94's clock
speed is 4.2 nanoseconds, with a theoretical
peak performance of 1.024 billion floating
point operations per second (Gigaflops).
With three CPUs, Sequoia's theoretical peak
performance is three Gigaflops and, if fully
configured with four CPUs, it is capable of
achieving a theoretical peak performance of
four Gigaflops.
  In addition to speed, a truly balanced
supercomputer must optimize data retrieval
and storage. Sequoia has very fast input
and output (I/O) channels, which enable it to
read and/or write data at a pace in keeping
with the speed of its CPUs. Internal transfer
rates of up to 200 megabytes per second
(MB/second) are achieved between the
CPUs, memory, and the I/O clusters.
Sequoia's highest transfer rates are
achieved when communicating at 1,800 MB/
second with the SSD. The SSD is used as
an extremely high-speed intermediate stor-
age area for both system and user data, and
can provide significant speedups for
I/O-intensive applications.  This, coupled
with high-speed disk data storage (see the
next section), results in an extremely well-
balanced computing environment for the
NESC's customers.

High-Speed Data Storage
  Augmenting Sequoia's central memory
and SSD are twenty-four DD-60 and sixteen
DD-62 disk drives.  Eight of the DD-60, and
all sixteen of the DD-62 disk drives were
installed during FY1994. Each DD-60 drive
can store 1.96 billion bytes (Gigabytes) of
data; each DD-62 holds 2.73 Gigabytes.
The disk drives give the NESC a total high-
speed storage capacity of more than 90
Gigabytes (9 x 10^10 bytes) of information.
12
              NESC Annual Report - FY1994

-------
                                                           Message from the NESC Director
  These disk drives are connected to Sequoia
  by 40 data channels, each capable of a
  20MB/second transfer rate.
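  The figures above follow from simple arithmetic. The sketch below (Python) is
added here only as an illustration; the drive counts, per-drive capacities, and
channel rates are the numbers quoted in this section, not values read from any
NESC configuration file:

    # Rough arithmetic behind the disk storage figures quoted above.
    dd60_drives, dd60_gb = 24, 1.96   # DD-60 units and capacity of each (GB)
    dd62_drives, dd62_gb = 16, 2.73   # DD-62 units and capacity of each (GB)
    total_gb = dd60_drives * dd60_gb + dd62_drives * dd62_gb
    print(f"High-speed disk capacity: {total_gb:.1f} GB")      # about 90.7 GB

    channels, mb_per_sec = 40, 20     # data channels and per-channel rate (MB/s)
    print(f"Peak aggregate disk bandwidth: {channels * mb_per_sec} MB/s")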

  Mass data storage
    In addition to disk drives, the NESC has
  two StorageTek 4400 robotic tape silos.
  Each silo contains approximately 6,000 tape
  cartridges and is capable of storing 1.2 tril-
  lion bytes  (Terabytes) of information. Com-
  bined, the two units provide a total storage
capacity of 2.4 Terabytes (2.4 x 10^12 bytes) of
  data. All tape handling is performed by
  robotic arms and is completely automatic
and "transparent" to the user. Two 6 MB/
second data channels connect the silos with
 the Cray.
   Other data transfer media, including
"round" tape facilities, may be available upon
 special request.

 Specialized Servers
   To provide expanded distributed comput-
 ing functionality to the EPA research community,
 in FY1994 the NESC added two specialized
 server machines to its computing resources.
 "Sassafras" is a Data General Aviion 5240
 with four CPUs, 192 Megabytes of central
 memory, and 15.4 Gigabytes of disk stor-
 age.  Sassafras is dedicated to the manipu-
 lation and statistical processing of data files
 used by air quality modelers.
  The NESC's computational chemistry
 users are running applications on "Almond",
 a Silicon Graphics (SGI) Indigo2 with one
 CPU, 160 Megabytes of memory, and five
 Gigabytes of disk storage.  Almond pro-
 vides a platform for chemistry software that
 is not available or appropriate for use on the
 Cray, and complements the larger applica-
 tions that run on the C94.

 Telecommunications
  In order to meet its  mission, the NESC
 must serve customers throughout the United
 States. From its location in Mid-Michigan,
the NESC uses a sophisticated telecommu-
  nications network to serve customers at EPA
  sites around the country.
   The NESC's telecommunications network
  consists of both a Local Area Network (LAN)
  and a Wide Area Network (WAN). Each is
  described in greater detail in the following
  paragraphs.

 NESC LAN
   The NESC's LAN, shown in Figure 5,
  page 14, is responsible for communications
  inside the NESC. It consists of four Ethernet
  backbones, capable of transmitting data at a
  rate of 10 million bits per second (MbS). In
 addition, there is a single Fiber Distributed
  Data Interface (FDDI), which moves data at
  100MbS.  These networks are responsible
 for moving data within the NESC.

 NESC WAN
   The NESC's WAN is illustrated in Figure
 6, page 15, and is responsible for moving
 data between the NESC and its customers.
 The WAN consists of one T3 transmission
 link  and two T1  transmission links. The T3
 link, which is capable of a peak transmission
 rate of 45MbS, connects the NESC with
 EPA's largest research facility in Research
 Triangle Park (RTP), North Carolina.
  One T1 link, which has a peak data trans-
 mission rate of 1.5MbS, also links the NESC
 with EPA's RTP facilities. The other T1 link
 connects the NESC to  EPA's Cincinnati
 communications hub. The second T1 link
 connects the NESC with MichNet, which in
 turn, connects the NESC with the NSFNet
 and the Internet. Plans are in place to
 upgrade the MichNet link to a T3
 connection.
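  To put the line rates in perspective, a rough, best-case comparison can be
made (Python sketch; the one-gigabyte data set is an arbitrary example, and
real transfers run below these peak rates):

    # Illustrative peak-rate transfer times over the WAN links described above.
    file_bits = 1e9 * 8                 # a hypothetical 1-gigabyte data set, in bits
    t3_bps = 45e6                       # T3 peak rate: 45 million bits per second
    t1_bps = 1.5e6                      # T1 peak rate: 1.5 million bits per second
    print(f"T3: {file_bits / t3_bps / 60:.1f} minutes")    # about 3 minutes
    print(f"T1: {file_bits / t1_bps / 60:.1f} minutes")    # about 89 minutes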
  Telecommunications routing is handled
through three NSC high speed routers.
These routers are fully redundant, with each
router capable of managing all telecommuni-
cations traffic. Two 12MB/second data lines
connect the routers to Sequoia.
  Through the use of UNIX TCP/IP proto-
cols and the File Transfer Protocol (FTP),
users move programs and data between
their local systems and the NESC over
these networks.
NESC Annual Report - FY1994
                                     13

-------
Message from the NESC Director
[Figure 5: NESC Local Area Network (LAN).]
14
                       NESC Annual Report - FY1994

-------
                                                               Message from the NESC Director

[Figure 6: NESC Wide Area Network (WAN).]
-------
Message from the NESC Director
Cooling
  Consumption of all that electrical current
produces an unwanted by-product of super-
computing: considerable heat created by the
computer's densely-packed circuitry.  With-
out sufficient cooling, Sequoia would be sub-
ject to extensive thermal damage.
  To keep Sequoia functioning within its
thermal envelope, the NESC has three 110-
ton chilling units which operate through two
175-ton cooling towers.  In the event of a
total failure of the chilling units, a 2,000-gal-
lon chilled water reservoir provides up to 15
minutes of emergency cooling capacity.

Monitoring
  During FY1994 the NESC completed the
installation and configuration of a Darwel
monitoring system. The Darwel system is a
PC-based system that is connected to sen-
sors throughout the NESC. It is used to
monitor the condition and security of all vital
facility support systems.

Software
  To complement the NESC's supercomput-
ing hardware, the NESC supports EPA
researchers and scientists with specialized
scientific application software packages.
Table 5 lists the software applications that
are available to researchers as of Septem-
ber 1994.
  The Cray supercomputer runs Cray's
standard operating system, UNICOS, which
is an acronym formed from the words UNIX
and Cray Operating System.  As the first
part of the acronym suggests, UNICOS is a
UNIX System V-based system with Univer-
sity of California-Berkeley extensions. This
means that the UNICOS internals have been
extensively modified to make the system
usable on a supercomputer.
  UNIX is the de-facto standard operating
system in the scientific community. By using
a UNIX-based operating system, research-
ers can easily  move their programs and
applications between their local environment
    Table 5: Software applications available on
       Sequoia (as of September 1994)

Discipline                  Application
Chemistry                   Amber 4.0, AMSOL 3.0, CHARMm 22, DISCOVER 2.9.0, DMol 2.3, GAUSSIAN 92, MOPAC 6.0.2
Mathematics / Statistics    IMSL 2.0, NAG library, LIBSCI
Data Exchange               netCDF
Graphics / Visualization    AVS 5.0, NCAR Graphics 3.2
and that of the NESC.  Once a user
becomes familiar with UNIX, those skills are
transferable across a number of hardware
platforms, including the Cray.
  Another advantage of a UNIX platform is
its adaptability to Distributed Computing. Be
it through Massively Parallel Processing
(MPP) or some form of distributed
computing such as Parallel Virtual  Machine
(PVM), UNIX permits the NESC to readily
embrace future trends in large-scale scien-
tific computing.
Visualization
  In addition to "crunching numbers", the
power and speed of a supercomputer is ide-
ally suited to supporting the extensive use of
graphical visualization.  EPA scientists can
call upon state-of-the-art graphical visualiza-
tion and computer-modeling capabilities to
augment their research. These visualization
techniques permit the NESC's users to "see
the unseeable".
  Using graphically-based scientific work-
stations, environmental researchers develop
complex mathematical models of air
16
              NESC Annual Report - FY1994

-------
                                                            Message from the NESC Director
  pollution, atmospheric conditions, the chemi-
  cal components of pollution, and other EPA
  "Grand Challenges". The speed and data
  handling capabilities of supercomputers
  allow environmental scientists to model the
  interaction of the complex variables that,
  until now, could not be simulated in the
  laboratory.
   Another important aspect of the NESC's
  visualization support is in the vital area of
  Computational Chemistry.  This rapidly
  developing branch of chemistry permits
  chemists to use a supercomputer in place of
  their more traditional test tubes and flasks.
  Computational Chemistry experiments are
  intuitive, fast, and cost-effective.
   The NESC features a state-of-the-art visu-
 alization laboratory staffed by experts in sci-
 entific visualization. EPA researchers are
 encouraged to use the laboratory and its
 staff to transform their research data into
 strikingly meaningful graphical images.
   The NESC's visualization group includes
 skilled visualization specialists located at the
 NESC in Bay City and at EPA's Scientific
 Visualization Center in  RTP. They are avail-
 able to serve EPA's researchers with per-
 sonalized service and advice.
   Visualization training classes have been
 held for customers, both onsite and at cus-
 tomer sites. The training classes focus on
the visualization tool kit adopted as the EPA
standard - the Application Visualization Sys-
tem (AVS). Highly successful visualization con-
 ferences have been sponsored by EPA and
 have featured nationally-recognized
 speakers.

 Earth Vision
   EarthVision, EPA's educational program,
 is administered through a cooperative
 agreement with Saginaw Valley State Uni-
 versity, a local university.  This is a competi-
 tive program whereby high schools submit
 proposals for EPA evaluation. If selected,
 the school's students and teachers partici-
 pate in a tutorial program on Saturdays dur-
  ing the academic year. The high school
  teams then submit another proposal and a
  winner is selected for a three-week Summer
  Research Institute. It is during this institute
  that the schools work on their accepted
  projects. The schools continue to work on
  their projects during the academic year, and
  produce a report on their research.  Exam-
  ples of some research projects are:  Uptake
   and Food Chain Transfer of Polychlorinated
   Biphenyls (PCBs) in the Zebra Mussel (Dre-
  issena polymorpha) and The Computer
  Modeling and Visualization of Contaminant
  Plume Development in Aquifers.

 The NESC Staff
   In addition to its hardware and software, a
 world-class supercomputing center requires
 considerable talent and expertise on the part
 of its staff. The NESC's staff includes
 experts in supercomputing operations, plan-
 ning, computational science, and related
 fields. The NESC staff is organized into the
 following functional areas:
   • Operations
   • Systems Support
   • Scientific User Support
    • Facilities
    • Visualization
   • Documentation
   • Management
   • Telecommunications
   The  NESC's staff is dedicated to support-
 ing the users.  Staff expertise is available to
 assist researchers with questions about
 computer systems, UNIX, code optimization,
 application porting, and documentation.
 User contacts and inquiries are encouraged.

 Customer Support
  The NESC's customers are supported by
 a special  scientific group dedicated to cus-
 tomer satisfaction. Staff members include
 scientists with advanced degrees in physics,
 chemistry, and computer science. This
 group helps scientists port their codes to the
supercomputer and optimize codes in order
to meet customer requirements. This group
NESC Annual Report - FY1994
                                     17

-------
Message from the NESC Director
has coordinated workshops in computational
chemistry and water modeling.

User Outreach
  The arrival of Working Capital Funding
demands a closer working relationship
between the environmental researchers
and experienced NESC Scientific Customer
Support staff to ensure that real and impor-
tant research needs are adequately met.
This process, started in FY1994 under the
title of "outreach", will become more efficient
as communications improve and computa-
tional research needs are better understood.
Outreach and collaboration with computa-
tional scientists at the NESC will materially
improve the efficiency with which the
NESC's resources are used and save EPA
significant funding in the process.

 Collaborative Modeling
  With the rapid pace of today's research,
scientists are increasingly turning to the
Internet,  rather than the customary journals,
for the latest in research information.  As col-
laborative tools such as collaborative visual-
ization (MBone) improve and become
available at the centers of research, collabo-
rative modeling between environmental
researchers thousands of  miles apart will
become the preferred way to work together.
This process will also be augmented by col-
laboration with experienced computational
scientists at the NESC and at RTP.

Future Directions at the NESC
  The future is always predicated upon the
availability of funding supporting the NESC.
Assuming funding is adequate, one can
speculate on probable future activities at the
NESC. To keep up with resource demand at
the NESC, at least one more processor will
be needed to reach the maximum processor
configuration on the existing C94, bringing
the theoretical peak performance to about
four Gigaflops. In addition, sometime
between May and September of 1995,
AREAL-ORD funding will allow installation of
a Cray T3D Massively Parallel Processor
(MPP) computer at the NESC, an architec-
ture useful for scientific applications that are
highly parallel in nature. Finally, FY1996 will
probably bring with it the installation of the
next generation of Cray Supercomputer.
1  Arthur G. Cullati, U.S. EPA Director, National Environmental Supercomputing Center (NESC), 135 Washington
     Avenue, Bay City, MI 48708, 517-894-7600.
18
              NESC Annual Report - FY1994

-------
   Framework for Environmental Modeling
 Background
   The ever-increasing complexity in the
 nature and extent of environmental prob-
 lems demands the use of more accurate and
 reliable integrated assessment tools by
 those directly responsible for solving the
 problems. Many of these more scientifically
 credible models, however, are difficult and
 time consuming to use. A different model
 exists for each pollutant of interest (e.g.,
 ozone, acid deposition, particulate, nitrate
 loading) and for both the urban and regional
 scales. These individual models do not con-
 sider interactions between pollutant media,
 such as air and water.  Control of one pollut-
 ant could adversely affect the concentration
 of another pollutant, thus all key related pol-
 lutants must be considered simultaneously
 for integrated environmental assessments.
 Technical problems, such as slow execution
 times, inadequate access to large volumes
 of data, incompatibility of file formats, awk-
 ward human-computer  interfaces, and diffi-
 culty maintaining existing codes or porting
 large codes to new computer architectures,
 severely limit the usefulness of existing envi-
 ronmental assessment  tools by scientists
 and regulatory analysts. Therefore, an
 extensible framework is being developed to:
   • provide a community platform for contin-
    uous improvement of the scientific basis
    of environmental models
   • make model application, data, and policy
    relevant information more accessible to
    a variety of users to enable more effec-
    tive environmental decision making
   • support the infusion of emerging high
    performance computing and communi-
    cations technology and associated
    numerical algorithm  implementations.
 Research Objectives
   The three main objectives of the research
 are to:           ;

   • develop the Agency's capability to per-
     form complex multi-pollutant and multi-
     media pollutant assessments
   • build the States' capability to use more
     advanced assessment tools directly
      responsive to their needs
   • position the Agency to more easily inte-
     grate emerging computing technology
     into key assessment tools to ensure the
    most reliable and timely response to key
    environmental issues.
   The initial program is focused on develop-
 ing a common framework, called Models-3,
 to address multi-pollutant and multi-scale air
 quality management issues with the flexibil-
 ity for future extension to cross media (air
 and water) issues.

 Approach
  The initial version of Models-3 is designed
 to provide state-of-the-art urban and
 regional ozone, acid deposition, and aerosol
 modeling; user-friendly human-computer
 interaction; automated management of pro-
 cessing, data, and resources; and enhanced
 tools for analyzing environmental informa-
 tion. Atmospheric processes are treated as
 interchangeable science modules to enable
 rapid testing and integration of new science.
 These modules contain explicit formulations
 of scale and coordinate dependencies. To
 achieve acceptable turnaround for its users,
the system incorporates high performance
computing and communications technol-
ogy. For example, key algorithms are being
adapted to take advantage of parallel com-
puting. The Models-3 system will also be
NESC Annual Report - FY1994
                                     19

-------
Framework for Environmental Modeling
extensible to satisfy special distributed pro-
cessing needs of mixed-media modeling
(e.g. simultaneous simulation of air and
water quality).
  Rapid prototypes of user interfaces, data
and resource management, system control
and communication, science process, and
analysis and visualization are developed to
test the feasibility of different approaches
and to resolve critical design issues.  The
system implementation is based on formal
system requirements and design specifica-
tions integrating the knowledge gained
through rapid development and user testing
of prototypes.

Results
  The first five volumes of "EPA Third Gen-
eration Air Quality Modeling System" are in
final review prior to EPA clearance:
   Volume 1: Concept
   Volume 2: Project Management Plan
   Volume 3: Project Risk Management
   Volume 4: Project Verification and Validation
   Volume 5: Project Configuration Management
An early prototype of a simplified air quality
model with linear chemistry demonstrated
that a modular approach could effectively
provide interchangeability among science
modules.  Scalable algorithms with general-
ized coordinate systems for key physical
and chemical processes are being tested. A
benchmark chemistry solver was imple-
mented to serve as the  baseline for perfor-
mance evaluation of chemistry algorithms
on parallel architectures. The mesoscale
meteorological model (MM5)  has been
ported to Sequoia and is being tested for
multiscale data generation.
  Several Application Visualization System
(AVS) modules have been written to provide
in-house researchers with desktop data
analysis and animation  capabilities which
directly access data sets on Sequoia. Initial
versions of modules for X-Y plots, colored
tiles, colored mesh, and colored tile and col-
ored mesh comparison have been created
and linked with a generic (reusable) AVS
module named "Run Visualization" which
serves as a driver for selecting and running
data display modules. Scientists are able to
visualize  both input and output data associ-
ated with the Regional Oxidant Model, the
Regional Acid Deposition Model, and the
Urban Airshed Model. These tools handle
data access across computing platforms,
plotting and manipulating graphic images,
and automatic selection of background
maps.
  Rapid prototyping of system framework
capabilities such as user interface, model
builder, collaborative tools, interactive analy-
sis/visualization, and decision support have
been completed to better define require-
ments and design alternatives for environ-
mental decision support systems.
Knowledge gained from user feedback on
early prototypes is being used to provide a
balanced understanding of the human and
machine  interaction issues and for formal
specification of the Models-3 system
requirements and design.
  Researchers at MCNC and NCSU have
demonstrated the use of dependency
graphs for automated execution of multiple
dependent programs including meteorology
and emissions processing for the Urban Air-
shed Model. The processing graphs are
created using the Heterogeneous Network
Computing Environment (HeNCE) - a public
domain integrated graphical environment for
creating and running parallel programs over
a heterogeneous collection of computers.
These directed graphs, which specify both
data and execution dependencies among
the tasks, support the building, storing, and
reuse of large sequences of executions,
manage the scheduling and execution of the
computational processes, and automate the
retrieval and storage of the data required as
inputs and produced as outputs. MCNC
also  completed a portable suite of modules,
data, and networks for an analysis compo-
nent for the environmental decision support
20
                                                         NESC Annual Report - FY1994

-------
                                                        Framework for Environmental Modeling
  system.  Interactive analysis has focused on
  an interface for controlling compilation, exe-
  cution and visualization within the prototype.
  Communication channels between different
  computational modules are an important
  area of extensive testing.
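  The dependency-graph idea described above can be illustrated with a small
sketch. The Python fragment below is not HeNCE and is not taken from the
Models-3 framework; the task names and the graph are invented purely to show
how an execution order falls out of declared data dependencies:

    # Minimal sketch of dependency-driven execution: each step runs only after
    # every step it depends on has completed (illustrative only).
    from graphlib import TopologicalSorter

    # Hypothetical processing graph: meteorology and emissions preprocessing
    # feed the air quality model, whose output feeds visualization.
    graph = {
        "met_preprocess": [],
        "emissions_preprocess": [],
        "air_quality_model": ["met_preprocess", "emissions_preprocess"],
        "visualization": ["air_quality_model"],
    }

    def run(task):
        # Stand-in for launching the real program for this step.
        print(f"running {task}")

    for task in TopologicalSorter(graph).static_order():
        run(task)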

  Future Objectives
    During FY1995 and FY1996 the detailed
  design specifications for the Models-3
  framework will be finalized, coded and
  tested. Tests will also be performed to ana-
  lyze communications issues between cou-
  pled meteorology and air quality models. A
  variety of tests will be performed to under-
  stand the advantages and limitations of dis-
 tributed computing in a hardware
 environment that employs both vector and
 parallel processing components. Models-3
 visualization prototypes will be tested to
 evaluate latency effects in a network parallel
 environment where functional modules are
 executed on distinct systems to support ani-
 mation, image rendering and image dis-
 play.  Remote collaborative computing
 approaches will be evaluated for
 effectiveness.
 Relevant Reports and Publications
 Byun, D.W., A.F. Hanna, et al., "Models-3 Air
  Quality Model Prototype Science Concept
  Development".  Transactions of the
  A&WMA Specialty Conference on
  Regional Photochemical Measurements
  and Modeling Studies.  November, 1993,
   San Diego, California.
Coats, C.J., Jr., A.F. Hanna, et al., "Model
  Engineering Concepts for Air Quality Mod-
  els in an Integrated Environmental  Model-
   ing System." Transactions of the A&WMA
  Specialty Conference on Regional Photo-
  chemical Measurements and Modeling
   Studies. November, 1993, San Diego,
  California.
Dennis, R.L., D.W. Byun, et al., "The Next
  Generation of Integrated Air Quality Mod-
  eling: EPA's Models-3."  Transactions of
  the A&WMA Specialty Conference on
  Regional Photochemical Measurements
  and Modeling Studies. November, 1993,
  San Diego, California.
 1 Joan H. Novak*, Atmospheric Characterization and Modeling Division, Atmospheric Research and Exposure
     Assessment Laboratory (AREAL), RTP, NC 27711 (*on assignment from the National Oceanic and Atmospheric
     Administration (NOAA), U. S. Department of Commerce.)
NESC Annual Report - FY1994
                                    21

-------
Framework for Environmental Modeling
22
NESC Annual Report - FY1994

-------
   Models of Service Level for Jobs Submitted to the NESC's
   HPC Resources1,2
 Abstract
   The Network Queueing System (NQS)
 complex on the National Environmental
 Supercomputing Center's (NESC) C94/264
 was analyzed to determine service levels as
 required by the U.S.  EPA's National Data
 Processing Division (NDPD) under the exist-
 ing contract with Martin Marietta Technical
 Services, Inc. (MMTSI).  The queueing the-
 ory method described in the fiscal year 1993
 Annual Report has again been applied in the
 analysis of jobs submitted to twenty-two
 public and four private queues on Sequoia
 over an approximate five month period dur-
 ing fiscal year 1994. As previously
 observed, the analysis showed that the
 probability density function of queue wait
 times and service (or wallclock) times are
 hyperexponential distributions and that pro-
 cess rates and mean times are easily
 extracted after a fit to the empirical data.
 The queueing theory results have been
 entered into the qperf command on the
 Cray and this gives the expected queue wait
 time and service time for the queue appro-
 priate to the user specified CPU and mem-
 ory requirements. While past performance
 is no guarantee of future results, stability of
 the analysis is only suspect if the character
 of the whole job population changes drasti-
 cally at some future time. Clearly, the longer
 the time interval of the sample the more sta-
 ble the prediction and therefore the analysis
 has been updated on a regular basis at the
 NESC.

 Introduction
  It was a requirement of EPA's National
 Data Processing Division (NDPD) under the
 existing contract with MMTSI that users of
the NESC had access to a quantitative mea-
 sure of service levels for jobs submitted to
 the Cray resource at the NESC.  Detailed
 information on job processing is available
 from Cray Standard Accounting (CSA) and,
 at the NESC, locally written code extracts
 job-level transaction data from the CSA
 super-record on a daily basis. This process
 tabulates (for each batch job) the CPU, ser-
 vice (wall-clock) and queue wait times, and
 provides the data for queue analysis with a
 view to determining the quality of service
 users enjoy at NESC.

 The NESC NQS System and Job Level
 Data Collection

   The NESC's supercomputer, Sequoia,
 has twenty-two public and four private
 queues identified as shown in Table 1 on
 page 26. The period of the sample repre-
 sents approximately five months of through-
 put on Sequoia between commissioning of
 the two CPU C94 and the upgrade to a three
  CPU configuration. Table 1 shows the sam-
 ple size, N, for the respective queues, the
 queue limits on memory and CPU time, and
 also some descriptive statistics on the actual
 CPU time used. It is interesting to observe
 that the mean CPU time is invariably signifi-
 cantly smaller than the queue CPU time
 limit. A total of 6,796 jobs were processed in
 this sample and Table 1 shows the distribu-
 tion with respect to the queues. Table 1 also
 shows values of the coefficient of variation
 (sample standard deviation divided by the
 arithmetic mean). This is typically larger
than unity, which indicates the likelihood of
a hyperexponential distribution. Although
not shown here, a similar result holds for
queue wait and service times.
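  A coefficient of variation above one is easy to check directly from the
job-level records. The sketch below (Python) uses a short, made-up list of CPU
times purely for illustration; it is not the NESC accounting extraction itself:

    # Coefficient of variation = sample standard deviation / arithmetic mean.
    # Values well above 1 suggest a single exponential will not fit the data.
    from statistics import mean, stdev

    cpu_minutes = [0.2, 0.5, 0.8, 1.1, 2.0, 3.5, 9.0, 40.0]  # hypothetical CPU times
    cv = stdev(cpu_minutes) / mean(cpu_minutes)
    print(f"coefficient of variation = {cv:.2f}")             # > 1 for this sample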
NESC Annual Report - FY1994
                                    23

-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
Elements of Queueing Theory and
Analysis of NESC Data
  Once a job has been dispatched to a par-
ticular queue by NQS it can be viewed as
residing in a single server queue system.
Queueing theory then applies and treats
these times as random variables, or observ-
ables that do not have individually predict-
able values but whose values show
statistical regularity. In particular, a random
variable, X, is completely described by a
probability distribution function,
F(t) = Prob{X ≤ t}, or by the corresponding
probability density f(t) = dF/dt. The latter is a
frequency distribution and may have various
possible shapes depending on the details of
the queueing system. However, in this anal-
ysis it is found that the distribution of queue
and service times is dominated by exponen-
tial shapes such as f(t) = μe^(-μt), t ≥ 0, or
combinations of exponentials
(hyperexponential). Therefore, the analysis
requires the determination of the rate
parameter μ, from which the probability distri-
bution is computed as F(t) = 1 - e^(-μt). The
function F(t) is also known as the cumulative
probability because it represents the "area"
under the density curve f(t). Since both job
queue wait and service times have been
recorded at the NESC, each is analyzed in
four simple steps: (1) a sort into bins in a his-
togram plot, (2) the fitting of the resulting dis-
tribution with μe^(-μt) to determine μ, (3)
computation of the mean as 1/μ (a property
of the exponential distribution), and (4) com-
putation of F(t). The observed distribution is
characteristic of a hyperexponential func-
tion where the forward peak is described by
an exponential with one rate and the tail is
described by another exponential with
another rate. The mean times are simply
the inverses of the respective rates for each
exponential distribution. The fact that there
is a hyperexponential distribution shows that
the jobs are of (at least)  two types. How-
ever, since the largest fraction of the sample
falls in the peak of the distribution, the analy-
sis focused on results for an exponential dis-
tribution which fits this peak.
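  For the forward peak, the fit and the resulting cumulative probability can be
written down compactly. The fragment below (Python) is only a schematic of
steps (2)-(4); the wait times are invented, and the NESC analysis itself worked
from histograms of the Cray Standard Accounting data:

    # Exponential fit to queue wait times and the cumulative probability F(t).
    import math

    waits = [0.3, 0.6, 0.9, 1.2, 2.0, 2.4, 3.1, 4.8, 6.0]  # illustrative waits (minutes)
    mu = 1.0 / (sum(waits) / len(waits))     # exponential rate = 1 / mean wait

    def F(t, mu=mu):
        # Probability that a job waits no longer than t minutes.
        return 1.0 - math.exp(-mu * t)

    for t in (1, 5, 10, 30):
        print(f"P(wait <= {t} min) = {F(t):.2f}")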
  Table 2, page 27, shows the values
obtained for the mean queue rate from an
empirical fit to the forward peak of the
observed probability density.  Cases where
N is small  (less than 40) should be viewed
with some circumspection and cases with
very small samples (less than 10), as intelli-
gent guesses at best. Table 2 shows the
distribution of mean queue times as a func-
tion of the queue. This is seen to vary over
many orders of magnitude, from 0.01 min-
utes for the 4MW, 600 second queue to over
1,000 minutes for queues with 100,000 sec-
ond time limits. Table 3, page 28, shows the
corresponding results for the  mean service
rate and mean service time, again for expo-
nential empirical fits to the forward peak of
the probability density. The same comments
on sample size apply here also.
  Table 3 shows the distribution of mean
service times which varies from one minute
to 4,264 minutes. One method of estimating
a gross throughput is to use the sum of the
individual rates given in the row marked
"TOTALS" in Tables 2 and 3. For the queue
times this gives a rate of 162.7 per minute, a
value biased by two large entries of 80.4 and
81.6, which, when subtracted, leave the
gross queue estimate of 0.8 jobs/minute.
For the service time the gross rate is 1.8
jobs per minute, with a strong bias of 0.77
from one queue, which when subtracted,
leaves 1.02 jobs per minute.  Thus, as a
gross estimate, a two CPU C94 configura-
tion processed one job per minute in the
mean. However, it should be  stated that this
simple argument takes no account of vari-
ability in the data. Also, while mean rates
and mean  times are shown in Tables 2 and
3, the mean time alone  is not the best indi-
cator of service level. Variability of the data
is taken into account when the cumulative
probability is computed  as the accumulated
area under the corresponding probability
density distribution. Also, the cumulative
probability distribution curve shows the
probability (or likelihood) that a given job has
the predetermined queue/service time cho-
sen for any specified value of the time.
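  The gross-throughput arithmetic can be reproduced from the unrounded
TOTALS entries in Table 2 (a quick check added here for convenience; it is not
part of the original analysis):

    # Gross queue-rate estimate: total rate minus the two dominant entries,
    # which belong to the Q4MW_600S and AREAL2 (ROM) queues in Table 2.
    total_rate = 162.761425                  # jobs per minute, TOTALS row
    dominant = (80.3958260, 81.5830440)      # the two large per-minute rates
    print(f"{total_rate - sum(dominant):.2f} jobs/minute")   # about 0.8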
                                                             NESC Annual Report - FY1994

-------
                           Models of Service Level for Jobs Submitted to the NESC's HPC Resources
   Sequoia's qperf command has been
 revised to include the results of the C94
 analysis and made available online to users
 of the NESC.  This command generates a
 tabular form of the model probability distribu-
 tion function for both queue wait and service
 times.  The user specifies the CPU time and
 memory requirement and the appropriate
 queue is selected. Figure 1, page 29,
 shows an example of how the command is
 used and the resulting output that the com-
 mand provides the user.
Conclusions
  A five month sample of job level NQS data
on Sequoia as a C94/264 configuration has
been analyzed by a simple single server
queueing model. The resulting analysis
enables the prediction of expected job ser-
vice levels for queue wait and service times.
These predictions have been made avail-
able to NESC's users through a revision of
the on-line qperf command which tabu-
lates expected queue and service times and
associated probabilities.
 1  George Delic, Martin Marietta Technical Services, Inc. (MMTSI), National Environmental Supercomputing Center
      (NESC), 135 Washington Avenue, Bay City, MI 48708-5845.
 2  Robert Upton, Martin Marietta Technical Services, Inc. (MMTSI), National Environmental Supercomputing Center
      (NESC), 135 Washington Avenue, Bay City, MI 48708-5845.
NESC Annual Report - FY1994
                                     25

-------
Models of Service Level for Jobs Submitted to the NESC's HPC Resources
  Table 1: Queue names and sample statistics for CPU times for the twenty-two public and four
   private queues of the NESC's NQS, for a sample corresponding to the period 6 May to 30
  September 1994 when the Cray C94/264 was installed at the NESC. Queue limits are shown in
  million words (memory) and seconds (CPU time). The sample size for each queue is shown in
                                 the column labelled N.

Queue name      Memory  CPU limit  N     Mean CPU    Standard   Coefficient
                (MW)    (sec)            time (min)  deviation  of variation
Q4MW_600S       4       600        1136  0.82        1.88       2.29
Q4MW_3600S      4       3600       1588  7.45        12.55      1.68
Q4MW_100KS      4       100000     338   114.27      211.31     1.85
Q4MW_UNLS       4       999999     9     889.46      1239.80    1.39
Q8MW_600S       8       600        98    0.90        1.76       1.94
Q8MW_3600S      8       3600       183   17.19       17.37      1.01
Q8MW_100KS      8       100000     300   109.74      106.34     0.97
Q8MW_UNLS       8       999999     36    326.13      578.47     1.77
Q12MW_600S      12      600        597   2.81        2.48       0.88
Q12MW_3600S     12      3600       477   12.83       16.13      1.26
Q12MW_100KS     12      100000     375   94.64       143.28     1.51
Q12MW_UNLS      12      999999     27    1070.54     1771.95    1.66
Q18MW_600S      18      600        127   2.04        2.91       1.43
Q18MW_3600S     18      3600       181   12.08       16.91      1.40
Q18MW_100KS     18      100000     351   90.88       146.99     1.62
Q18MW_UNLS      18      999999     72    174.63      389.51     1.66
Q24MW_600S      24      600        45    1.71        2.76       1.61
Q24MW_3600S     24      3600       43    17.74       1.38       0.98
Q24MW_100KS     24      100000     83    116.29      145.92     1.25
Q24MW_UNLS      24      999999     43    623.65      929.47     1.49
Q40MW           40      999999     60    60.57       103.81     1.71
AREAL1 (RADM)   10      999999     202   135.17      93.53      0.69
AREAL2 (ROM)    16      999999     414   51.13       52.27      1.02
AREAL_ROMDP     6       1830       5     0.03        0.03       0.96
AREAL_RADMDP    4       1830       0     -           -          -
NIGHT           24      2880       6     165.88      147.64     0.89
26
NESC Annual Report - FY1994

-------
                          Models of Service Level for Jobs Submitted to the NESC's HPC Resources
  Table 2: Queueing model analysis results of Sequoia NQS data for a sample corresponding to the period 6 May to 30 September 1994 when a Cray C94 2/64 was installed at the NESC. The results show queue rate parameters and mean queue wait times as installed in the qperf user tool on Sequoia.

Queue name       Memory   CPU       N     Mean queue rate   Mean queue time
                  (MW)    (sec)           (number/min)      (minutes)
Q4MW_600S           4        600   1136      80.3958260           0.012
Q4MW_3600S          4       3600   1588       0.1909948           5.236
Q4MW_100KS          4     100000    338       0.0176312          56.718
Q4MW_UNLS           4     999999      9       0.0339270          29.475
Q8MW_600S           8        600     98       0.1061542           9.420
Q8MW_3600S          8       3600    183       0.0094518         105.801
Q8MW_100KS          8     100000    300       0.0006956        1437.631
Q8MW_UNLS           8     999999     36       0.0021341         468.591
Q12MW_600S         12        600    597       0.0274937          36.372
Q12MW_3600S        12       3600    477       0.0113368          88.208
Q12MW_100KS        12     100000    375       0.0008207        1218.470
Q12MW_UNLS         12     999999     27       0.0022504         444.356
Q18MW_600S         18        600    127       0.0082988         120.499
Q18MW_3600S        18       3600    181       0.0220156          45.422
Q18MW_100KS        18     100000    351       0.0073623         135.828
Q18MW_UNLS         18     999999     72       0.0025847         386.895
Q24MW_600S         24        600     45       0.0087332         114.506
Q24MW_3600S        24       3600     43       0.0047635         209.932
Q24MW_100KS        24     100000     83       0.0050432         198.288
Q24MW_UNLS         24     999999     43       0.0010706         934.091
Q40MW              40     999999     60       0.1050747           9.517
AREAL1 (RADM)      10     999999    202       0.0005905        1693.470
AREAL2 (ROM)       16     999999    414      81.5830440           0.012
AREAL_ROMDP         6       1830      5       0.2132761           4.689
AREAL_RADMDP        4       1830      0       -                    -
NIGHT              24      28800      6       0.0008520        1173.659
TOTALS                             6796     162.761425            0.006
  Table 3: Queueing model analysis results for service (wallclock) rates and times for a sample corresponding to the period 6 May to 30 September 1994 when a Cray C94 2/64 was installed at the NESC. These results are now installed in the qperf user tool on Sequoia.

Queue name       Memory   CPU       N     Mean service rate   Mean service time
                  (MW)    (sec)           (number/min)        (minutes)
Q4MW_600S           4        600   1136       0.7730623             1.294
Q4MW_3600S          4       3600   1588       0.2481766             4.029
Q4MW_100KS          4     100000    338       0.0114870            87.055
Q4MW_UNLS           4     999999      9       0.0001752          5709.196
Q8MW_600S           8        600     98       0.1862655             5.369
Q8MW_3600S          8       3600    183       0.0124481            80.334
Q8MW_100KS          8     100000    300       0.0015649           639.018
Q8MW_UNLS           8     999999     36       0.0024683           405.144
Q12MW_600S         12        600    597       0.0534026            18.726
Q12MW_3600S        12       3600    477       0.0407103            24.564
Q12MW_100KS        12     100000    375       0.0084882           117.811
Q12MW_UNLS         12     999999     27       0.0002345          4264.010
Q18MW_600S         18        600    127       0.1410151             7.091
Q18MW_3600S        18       3600    181       0.0767252            13.034
Q18MW_100KS        18     100000    351       0.0067908           147.259
Q18MW_UNLS         18     999999     72       0.0015469           646.475
Q24MW_600S         24        600     45       0.0397179            25.178
Q24MW_3600S        24       3600     43       0.0191779            52.143
Q24MW_100KS        24     100000     83       0.0020375           490.788
Q24MW_UNLS         24     999999     43       0.0007568          1321.414
Q40MW              40     999999     60       0.0159816            62.572
AREAL1 (RADM)      10     999999    202       0.0022673           441.053
AREAL2 (ROM)       16     999999    414       0.0058797           170.077
AREAL_ROMDP         6       1830      5       0.1403448             7.125
AREAL_RADMDP        4       1830      0       -                     -
NIGHT              24      28800      6       0.0007308          1368.415
TOTALS                             6796       1.7614555             0.558
     Your memory request is 12 MWords.
     Your time request is 600 seconds.

     Your job will be placed in the q12mw_600s queue.
     Queue memory limit is 12 MW.
     Queue time limit is 600 seconds.
     Sampling period is 05/94-09/94.  Sample size is 597.
     Queue rate is 0.027494.  Service rate is 0.053403.

     10% of jobs in this queue are scheduled within 3.83 minutes.
     20% of jobs in this queue are scheduled within 8.12 minutes.
     30% of jobs in this queue are scheduled within 12.97 minutes.
     40% of jobs in this queue are scheduled within 18.58 minutes.
     50% of jobs in this queue are scheduled within 25.21 minutes.
     60% of jobs in this queue are scheduled within 33.33 minutes.
     70% of jobs in this queue are scheduled within 43.79 minutes.
     80% of jobs in this queue are scheduled within 58.54 minutes.
     90% of jobs in this queue are scheduled within 83.75 minutes.

     10% of jobs in this queue complete execution within 1.97 minutes.
     20% of jobs in this queue complete execution within 4.18 minutes.
     30% of jobs in this queue complete execution within 6.68 minutes.
     40% of jobs in this queue complete execution within 9.57 minutes.
     50% of jobs in this queue complete execution within 12.98 minutes.
     60% of jobs in this queue complete execution within 17.16 minutes.
     70% of jobs in this queue complete execution within 22.55 minutes.
     80% of jobs in this queue complete execution within 30.14 minutes.
     90% of jobs in this queue complete execution within 43.12 minutes.

  Figure 1: This shows the response to the Sequoia command line entry qperf -m 12 -t 600. The queue and service rates are the mean rates found in the analysis; the corresponding mean queue time is 1/0.027494 = 36.4 minutes and the mean service time is 1/0.053403 = 18.7 minutes. The tabulations are for the respective cumulative probability distributions in equal percentile increments, to indicate the model prediction for the spread in times. The cumulative probability distribution table can also be read in reverse, to give the probability that a specific time is observed. As an example, for this queue, the probability that a job has a queue wait time within 25.2 minutes is 0.5, and the probability that it has a service time within 22.6 minutes is 0.7. Since these results are for the C94/264 configuration they provide conservative estimates of the expected service levels on the current C94/364 installation.
   Second Annual International Environmental  Visualization
   Workshop1
   The second annual International Environ-
 mental Visualization Workshop was held at
 the Marriott Society Center in Cleveland,
 Ohio from August 30 through September 1,
 1994. The event was a collaboration
 between the U.S. Environmental Protection
 Agency's (EPA) National Data Processing
 Division (NDPD), the Great Lakes National
 Program Office (GLNPO), the Gulf of Mexico
 Program Office, and EPA High Performance
 Computing and Communications (HPCC)
 Program.
   The workshop was designed to provide
 EPA and associated environmental
 researchers and policy analysts with an
 opportunity to learn about scientific visual-
 ization tools. The focus of the two-and-one-
 half-day workshop was on the application of
 visualization, high speed networking, and
 high performance computing technologies to
 environmental research problems. With that
 goal in mind, invited researchers from out-
 side EPA shared their experiences and
 viewpoints on exploring natural and physical
 sciences data sets.  Gregory J. McRae, pro-
 fessor of Chemical Engineering at the Mas-
 sachusetts Institute of Technology, provided
 his perspectives on applying visualization
 tools to air pollution problems. Mary Whit-
 ton, from SUN Microsystems and current
 Chair of the Association for Computing
 Machinery's Special  Interest Group on
 Graphics (ACM/Siggraph), gave us some
 insights into the trends and futures of visual-
 ization products.
  Three visualization researchers agreed to
 share the latest tools they are developing.
 Polly Baker, from the National Center for
 Supercomputing Applications  (NCSA), pre-
 sented task directed visualization tools
 which are oriented toward assisting specific
 inquiry and analysis activities of scientists.
 John Rasure, of Khoral Research Inc. and
 the University of New Mexico, showed us a
 complete application development system
 (Khoros 2.0) which redefines the software
 engineering process to include all members
 of the work group, from the application end-
 user (i.e. environmental scientists) to the
 infrastructure/visualization programmer.
 Peter Kochevar, from Digital Equipment Cor-
 poration (DEC)  and the San Diego Super-
 computer Center (SDSC), demonstrated
 the implementation of collaborative comput-
 ing, database management, virtual reality,
 and visualization tools to examine earth sci-
 ences data sets associated with the Sequoia
 2000 project.
   Researchers from the National Center for
 Atmospheric Research (NCAR), California
 Air Resources Board (CARB), Environment
 Canada, and Supercomputer Systems Engi-
 neering and Services Company (SSESCO)
 also agreed to share their practical experi-
 ences associated with applying visualiza-
 tion tools to environmental problems. Work
 underway at EPA research laboratories in
 Ada, Oklahoma  and Athens, Georgia and
 the Environmental Monitoring and Assess-
 ment Program (EMAP) was also featured at
 this visualization workshop.
  An important component of visualization
 technology transfer includes computer
 graphics education programs. Gloria
 Brown-Simmons, chair of the Visualization
 and Presentation Subcommittee for the Glo-
 bal Learning and Observations to Benefit the
 Environment (GLOBE) Program, demon-
strated current efforts underway to educate
children  around the world about environ-
mental sciences  research. Ralph Coppolla,
of Saginaw Valley State University, spoke
about EarthVision, EPA's computational sci-
ence educational program for high school
students and teachers. Acha Debela, from
North Carolina Central University and cur-
rent Chair of the Historically Black Colleges
computer graphics education effort of ACM/
 Siggraph, also shared with us his latest
activities.
  The event was well received and planning
is already underway for the Third Annual
International Environmental Visualization
Workshop in 1995.
1  Theresa Marie Rhyne, Martin Marietta Technical Services, Inc. (MMTSI), Research Triangle Park, NC 27711.
   Calculating the Rates of DNA-Catalyzed Reactions of
   Aromatic Hydrocarbon Diol Epoxides1
   Some classes of small electrophiles - the
 diol epoxides of polycyclic aromatic hydro-
 carbons, for example - react to form cova-
 lent adducts with nucleophilic sites within
 specific nucleotide sequences of DNA. If
 the resulting lesion is not repaired prior to
 replication, a transcriptional block or a muta-
 tion,  possibly in a cancer-associated gene,
 may  result. The overall goal of our research
 program is to understand the factors regulat-
 ing the extent to which  these molecules
 react with  DNA as well  as their base
 sequence  specificity and the consequences
 of their binding.  Explaining the chemical
 events in the early etiology of chemical car-
 cinogenesis requires predicting relative
 extents of  DNA binding within a class of
 small molecules as well as deducing the
 nucleotide sequences that are "hot-spots"
 for mutation for each member of the class.
 Predicting "mutation hot-spots" - not necessarily coincident with "binding hot-spots" - may be possible if the adduct conformation is known as a function of sequence context. For some molecules (e.g., benzo[a]pyrene diol epoxide and aflatoxin B1), binding hot-spots, mutation hot-spots, and adduct conformations have been experimentally characterized.
   Some molecules exhibit substantial and varied sequence preferences in their covalent adduct formation with DNA. The nucleotide target is the primary identifier of adduct type, but the sequence context of this nucleotide has a dramatic effect on the binding. For example, aflatoxin B1 has a preference for reacting with the N7 position of G. In addition, in considering the sequence-specific binding to the trinucleotide 5'-XGY-3', the likelihood of adduct formation varies with X as G > C > A > T and with Y as G > T > C > A, and the 3' neighbor exerts a greater influence than does the 5' neighbor (Said and Shank, 1991). The alkylating agents MNU and CCNU also preferentially attack the central G of GGG sequences. The enantiomers of trans-7,8-dihydroxy-anti-9,10-epoxy-tetrahydrobenzo[a]pyrene (BPDE-2) prefer different sequences in their binding to the exocyclic amine of G (Rill and Marsch, 1990). The (-) isomer prefers AGG, CGG and TGG while the (+) isomer rarely binds to TGY triplets.

   Both the sequence context and the adduct
 conformation contribute to the mutation
 spectrum of a chemical.  Mutation "hot-
 spots" do not necessarily coincide with bind-
 ing "hot-spots". The frameshift mutagen, N-
 2-Acetylaminofluorene (AAF), which forms
 adducts primarily at the C8 of guanine, binds
 approximately equally to all G residues
 regardless of sequence context, yet muta-
 tional "hot-spots" occur at alternating GC
 sequences and at contiguous G
 sequences.  A model has been advanced
 suggesting that the local structure surround-
 ing the AAF-DNA a,dduct depends on the
 sequence context and that this structure
 determines the adduct fate (Lambert et al.
 1992). The trans-7, 8-diol-9,10 epoxide of
 (+)anfr-benzol[a]pyrene, (+)BPDE-2,  a puta-
 tive "ultimate carcinogen", has been shown
 to predominately lead to G to T transversion
 in a site-directed mutagenesis study
 (MacKay et al 1992).  Rodriguez and
 Loechler (1993) proposed a model in which
 bulky adducts such as those formed by
 (+)BPDE-2 binding with the N2 of G exist in
multiple conformations which depend on
sequence context and that the conformation
determines the choice among G to T, G to A,
and G to C point mutations.
  The mutation spectrum of small molecules
is correlated with their potential to induce
specific neoplastic transformation. This
statement is based on DNA-adduct forma-
tion of several chemical carcinogens, most
notably, benzo[a]pyrene and aflatoxin B1, as well as several alkylating agents, and the mutations involved in protooncogenes and in the well studied tumor-suppressor gene, p53. For example, G to T transversions at the third base of codon 249 (AAG) in the p53 gene are frequently associated with hepatocellular carcinoma occurring in southern Africa and Qidong, China, establishing a link between exposure to aflatoxin B1 and a specific mutation in a cancer-related gene.
Benzo[a]pyrene (B[a]p) has also been
shown to exhibit a high frequency of G to T
transversion in the "hot-spot" region of p53
and in codon 12 of the Ha-ras oncogene
(Bailleul et al.  1989). On the other hand, 7,
12-dimethylbenzanthracene (DMBA) exhib-
its a similar mutational frequency to B[a]p
but is only rarely linked with G to T transver-
sion in p53 (Ruggeri et al., 1993). No tumor
cell line derived from DMBA-induced tumors
contains G to T transversion (Ruggeri et al.
1991). The alkylating agents, N-nitrosoethy-
lurea and  N-nitrosomethylurea (MNU),
induce G to A transitions in codons 204 and
213 of p53 with high frequency (Ohgaki, et
al., 1992). The fact that the distinct mutation
spectra of different carcinogens are associ-
ated with different cancers suggests that the
ability to predict the mutations caused by a
given chemical may provide an etiologically
based predictive tool for its carcinogenicity.
Knowing the specific pathway for initiation of carcinogenesis for a given chemical may
ultimately prove useful as suggested by the
recent demonstration that a ras-activating G
to A transition in O6- alkylated-DNA adducts
formed by reaction with MNU has been
blocked in mice by the transgenic expres-
sion of a single human DNA repair gene
(Dumenco, et al.,  1993).
  Epoxides are subject to acid-catalyzed (pH-dependent; rate constant kH) and spontaneous (pH-independent; rate constant k0) hydrolyses, and the overall observed rate constant can be written as:

              k = k0 + kH [H+]

                 Equation 1

  The diol epoxides of numerous polycyclic aromatic hydrocarbons undergo DNA-catalyzed hydrolysis, the rate constants of which have been experimentally measured by others. We have hypothesized that acidic domains at the surface of DNA serve as catalytic zones for the creation of reactive carbocations (Lamm and Pack, 1990). The rate constant for hydrolysis of a diol epoxide (DE) in the presence of DNA can then be written in a way that embodies this hypothesis:

                  ∫ {k0 + kH [H+](R)} [DE](R) dV
            k  =  -------------------------------
                         ∫ [DE](R) dV

                 Equation 2
  In this equation, the spatial distribution of hydronium ion concentration is written as [H+](R). When the pseudo-first-order rate constant kH [H+](R) is added to k0, the rate constant for hydrolysis at R is obtained. Multiplying this by the concentration of diol epoxide at R, [DE](R), yields the local rate of hydrolysis. The overall rate of hydrolysis, k, is then given by the normalized summation over the volume of the system, as written in Equation 2. If k0 and kH are experimentally known, k can be calculated from the distributions [H+](R) and [DE](R).
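  Once the two distributions are available on a grid, Equation 2 reduces to a weighted sum over grid cells. The following sketch (in Python; the function and array names are illustrative and not taken from the authors' code) shows one way such an evaluation could look, assuming the concentrations and cell volumes are supplied as arrays of equal length:

    import numpy as np

    def overall_rate(k0, kH, h_conc, de_conc, cell_vol):
        # Equation 2 as a discrete sum: the numerator integrates the local
        # rate constant (k0 + kH*[H+](R)) weighted by the diol epoxide
        # distribution; the denominator normalizes over that distribution.
        local_k = k0 + kH * h_conc
        numerator = np.sum(local_k * de_conc * cell_vol)
        denominator = np.sum(de_conc * cell_vol)
        return numerator / denominator

Cells where [DE](R) is negligible contribute little to either sum, so the calculated k reflects the local hydronium ion concentration in the regions the diol epoxide actually samples.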
  Using a variable dielectric Poisson-Boltz-
mann approach (Pack et al., 1993),  the spa-
tial distribution of hydrogen ions, [H+](R),
and the electrostatic potential,  (j)(jR), of the
DNA-electrolyte system have bee;n mapped
to a non-cartesian grid that contours the
nucleic acid surface. A Metropolis Monte
Carlo calculation was done to determine the
distribution of diol epoxide [DE](R) in this
average electrostatic field of the system.
34
               NESC Annual Report - FY1994

-------
              Calculating the Rates of DNA-Catalyzed Reactions of Aromatic Hydrocarbon Diol Epoxides
  Briefly, this was done by computing the inter-
  action energy, U, of the diol epoxide with the
  remainder of the system as shown in Equa-
  tion 3.
   The van der Waals parameters, a and c, were taken from the AMBER parameter set (Weiner et al., 1986). The electrostatic
  energy was obtained by determining the
  environmental grid cell, k, in which each
  atom, i, of the diol epoxide was located and
  subsequently multiplying the charge on that
  atom by the PB-calculated potential in that
  cell. The rigid diol epoxide was translated
  and rotated by random amounts to generate
  new configurations which were kept or
  rejected according to the usual Metropolis
  criterion.  Special adaptations, scaling the
  maxima for the translational and rotational
  motions at each step, were required to
  improve convergence.  These will be
 detailed in a subsequent publication.
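  The core of that procedure is the standard Metropolis acceptance test applied to rigid-body moves. A minimal sketch is given below (in Python); the step-size defaults, the orientation representation, and the energy callable are illustrative assumptions, and the adaptive scaling of the translational and rotational maxima mentioned above would be layered on top by adjusting max_trans and max_rot as the run proceeds:

    import math
    import random

    def metropolis_step(position, orientation, energy, kT,
                        max_trans=0.2, max_rot=0.05):
        # Propose a rigid-body move: random translation and random change
        # of the orientation angles.
        trial_pos = [x + random.uniform(-max_trans, max_trans) for x in position]
        trial_ori = [a + random.uniform(-max_rot, max_rot) for a in orientation]
        # energy() stands in for the interaction energy described above:
        # AMBER van der Waals terms plus each atomic charge times the
        # Poisson-Boltzmann potential of the grid cell containing that atom.
        dU = energy(trial_pos, trial_ori) - energy(position, orientation)
        if dU <= 0.0 or random.random() < math.exp(-dU / kT):
            return trial_pos, trial_ori, True    # accept the new configuration
        return position, orientation, False      # reject and keep the old one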
   The (+)syn and (+)anti diastereomers of trans-7,8-diol-9,10-epoxy-tetrahydrobenzo[a]pyrene were studied. The overall hydrolysis rate constant as well as the spatial distribution of rates have been calculated as a function of pH and DNA concentration. These have been compared to the experimentally determined rate constants for DNA-catalyzed hydrolysis of these molecules (Islam et al., 1987). The agreement of the calculated and experimental rates for this reaction as a function of DNA concentration and of pH indicates that the hypothesis is correct and suggests that this is a promising step along the pathway to predicting the genotoxicity of a chemical from its molecular structure.
            U  =  Σi Σj ( aij / rij^12  -  cij / rij^6 )  +  Σi qi φk(i)

                 Equation 3

In Equation 3, i runs over the atoms of the diol epoxide (BPDE), j over the atoms of the remainder of the system, rij is the interatomic distance, aij and cij are the AMBER van der Waals parameters, qi is the charge on atom i, and φk(i) is the Poisson-Boltzmann potential in the environmental grid cell k containing atom i.
 References

 Said, B. and R.C. Shank, Nucl. Acids Res. 19:1311-1316, 1991.
 Rill, R.L. and G.A. Marsch, Biochemistry 29:6050-6058, 1990.
 Lambert, I.B., R.L. Napolitano and R.P.P. Fuchs, Proc. Natl. Acad. Sci. USA 89:1310-1314, 1992.
 MacKay, W., M. Benasutti, E. Drouin and E.L. Loechler, Carcinogenesis 13:1415-1425, 1992.
 Rodriguez, H. and E.L. Loechler, Biochemistry 32:1759-1769, 1993.
 Bailleul, B., K. Brown, M. Ramsden, R.J. Akhurst, F. Fee and A. Balmain, Environ. Health Perspect. 81:23-27, 1989.
 Ruggeri, B., M. DiRado, S.Y. Zhang, B. Bauer, T. Goodrow and A.J.P. Klein-Szanto, Proc. Natl. Acad. Sci. USA 90:1013-1017, 1993.
 Ruggeri, B., J. Caamano, T. Goodrow, M. DiRado, A. Bianchi, D. Trono, C.J. Conti and A.J.P. Klein-Szanto, Cancer Res. 51:6615-6621, 1991.
 Ohgaki, H., G.C. Hard, N. Hirota, A. Maekawa, M. Takahashi and P. Kleihues, Cancer Res. 52:2995-2998, 1992.
 Dumenco, L.L., E. Allay, K. Norton and S.L. Gerson, Science 259:219-222, 1993.
 Lamm, G. and G.R. Pack, Proc. Natl. Acad. Sci. USA 87:9033-9036, 1990.
 Pack, G.R., G.A. Garrett, L. Wong and G. Lamm, Biophysical J. (in press), 1993.
 Weiner, S.J., P.A. Kollman, D.T. Nguyen and D.A. Case, J. Comp. Chem. 7:230-252, 1986.
 Islam, N.B., D.L. Whalen, H. Yagi and D.M. Jerina, J. Am. Chem. Soc. 109:2108-2111, 1987.
1 George R. Pack and Linda Wong, UIC College of Medicine at Rockford, 1601 Parkview Ave., Rockford, Illinois
    61107.
   Prediction of Oxidative Metabolites by Cytochrome  P450s
   with Quantum Mechanics and Molecular Dynamics
   Simulations1
 Background (History of Project)
   The ubiquitous cytochrome P450
 enzymes are a family of well-known monooxygenases which are involved in the phase one metabolism of a variety of xenobiotics
 leading to either benign or harmful (i.e., car-
 cinogenic, teratogenic, hepatotoxic, or neph-
 rotoxic) metabolites. It is therefore important
 to identify and characterize metabolites with
 toxic potency and modulators which lead to
 their formation. During the past few years,
 considerable work in our laboratory has
 been devoted to (1) the study of the mecha-
 nistic nature of the oxidative biotransforma-
 tion by P450s, and (2) the identification and
 characterization of properties which modu-
 late the competition among possible oxida-
 tion reactions for a specific group of
 compounds. In the case when the three
 dimensional structure of the specific P450
 enzyme is not known, the study is focused
 only on the intrinsic properties, such as elec-
 tronic structure, conformation, and functional
 groups, of the xenobiotics and their oxidative
 metabolites. On the other hand, when the
 3D structure of the P450 enzyme is known,
 as in the case of the bacterial P450cam
 enzyme, the possible modulations of com-
 peting reactions from the steric interaction
 between the enzyme and the xenobiotics
 are also considered.
   The major types of P450-metabolized oxi-
 dation reactions include aliphatic and aro-
 matic C-hydroxylation, epoxidation, and
 heteroatom oxygenation. Recently, studies
 from our laboratory of the hydroxylation of
 aliphatic acids, such as valproic acid and its
 analogues1-3 and the epoxidation of styrene
 analogues4-5 by cytochrome P450s have
 been reported. Here, we report two investigations involving the oxidation of heteroatom-containing compounds. In the first, as shown below, the competition between heteroatom oxygenation and Cα-hydroxylation of N- and S-containing compounds was studied. See Figure 1, page 38.
   The specific questions asked were (1) why are S-containing compounds more likely to undergo heteroatom oxidation than N-containing compounds? (2) why is Cα-hydroxylation preferred over N-oxidation in amines? and (3) why is S-oxidation favored over Cα-hydroxylation in thioethers?
 Ab initio quantum mechanical methods were
 used to find any possible electronic modula-
 tors leading to these results.6
   In the second study, Figure 2, page 38, the stereoselective sulfoxidation of thioanisole and p-methyl thioanisole by P450cam was investigated.
   The experimental observations showed that P450cam-metabolized sulfoxidation produced more of the R-enantiomer for thioanisole and more of the S-enantiomer for p-methyl thio-
 anisole. The theoretical study presented
 used the known 3D structure of P450cam
 enzyme and molecular dynamics simula-
 tions of the enzyme-substrate interaction to
 reproduce the experimental results and to
 find the steric modulators that lead to these
 observations.

 Method/Approach
  For the first study, three amines including
 methyl amine, methyl ethyl amine and N-
 methyl aniline and three thioethers including
 methanethiol, methyl ethyl thioether and
thioanisole were used. A triplet oxygen atom
was used as a model P450 since the active
state of P450 can be best described as con-
taining a triplet [Fe=O] moiety. Ab initio
            Figure 1: Investigations involving the oxidation of heteroatom-containing compounds - first study: (1) heteroatom oxygenations (N-oxide or S-oxide formation) and (2) Cα-hydroxylation.

            Figure 2: Investigations involving the oxidation of heteroatom-containing compounds - second study: sulfoxidation by P450cam/O2.
 quantum mechanics were utilized to fully
 optimize the structures and minimize the
 energies of these parent compounds, the
 possible reactive intermediates, and their
 oxidative products. HF was used for closed-shell species, while UHF or ROHF (when large spin contamination was found with UHF) was used for open-shell species. The 6-31G* basis set was used for all species.
   For the second study, an extended bind-
 ing site model of P450cam containing 87
 amino  acids, 7 waters and the protoporphy-
 rin IX heme unit and an active oxygen at a
 distance of 1.7A  above the Fe was used.
 Thioanisole and p-methyl
 thioanisole were  then
 docked in the binding site
 with six different orienta-
 tions. Three of the initial
 orientations have R config-
 urations while the other
 three have S configura-
 tions. After energy minimi-
 zation with the AMBER
 3.0A force field, the 6 different enzyme-sub-
 strate complexes were each subjected to 5
 ps of heating-equilibration and 125 ps of
 molecular dynamics simulations with two dif-
 ferent initial velocity distributions. Since the
 87 amino acids forming the extended bind-
 ing site are not contiguous, the backbone
 (N, Ca,  C) atoms  of the enzyme were con-
 strained in coordinate space with a harmonic
 potential of 100 kcal/A2 during the MD simu-
 lations.  The coordinates of all atoms during
 the 125 ps simulations were saved at 0.2ps
 interval resulting in 625 snapshots for each
 trajectory. For each snapshot,  if the dis-
 tance between  the active oxygen of
 P450cam and the S atom of the substrate
 was less than 4.0A, sulfoxidation was
 assumed to proceed. An angle parameter, θ, defined as the angle between the normal of the C-S-C plane and the S-to-O vector, was then used to determine which lone pair of the S atom was attacked and hence the corresponding enantiomer produced. In this study, values of θ less than 80° lead to the S enantiomer and those greater than 100° to the R isomer. Snapshots with values of θ between 80° and 100° were disregarded, since they correspond to the less stable transition state configurations between the S and R isomers.
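  A per-snapshot classification of this kind is straightforward to express in code. The sketch below (in Python) is an illustration of the two geometric tests just described, not the authors' analysis script; the coordinate arrays are assumed to hold the positions of the active oxygen, the substrate sulfur, and the two carbons bonded to it in one saved snapshot:

    import numpy as np

    def classify_snapshot(o_pos, s_pos, c1_pos, c2_pos, cutoff=4.0):
        # Reactivity test: the active oxygen must lie within the cutoff
        # distance (4.0 A in the study) of the substrate sulfur.
        s_to_o = o_pos - s_pos
        if np.linalg.norm(s_to_o) >= cutoff:
            return None
        # Stereochemistry test: theta is the angle between the normal of the
        # C-S-C plane and the S-to-O vector; < 80 deg counts as S, > 100 deg
        # as R, and the 80-100 deg transition region is discarded.
        normal = np.cross(c1_pos - s_pos, c2_pos - s_pos)
        cos_theta = np.dot(normal, s_to_o) / (np.linalg.norm(normal) * np.linalg.norm(s_to_o))
        theta = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
        if theta < 80.0:
            return "S"
        if theta > 100.0:
            return "R"
        return None

Counting the "S" and "R" labels over the 625 snapshots of a trajectory gives per-run totals of the kind reported in Tables 4 and 5; note that which face of the C-S-C plane corresponds to which enantiomer depends on the order in which the two carbons are passed, so the sign convention has to be fixed against a known configuration.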

  Results and Discussion

  (1) The Oxidation Reactions of N- and S-
  Containing Compounds

   (1) Origin of Observed Preference for Heteroatom Oxide Formation: S > N

   Guengerich et al. proposed that heteroatom oxide is formed via an initial electron transfer from the substrate to the [Fe=O] moiety of P450, followed by an oxygen transfer in the opposite direction:

    R-X-R' + [Fe=O]^V  -->  [R-X-R']·+ + [Fe=O]^IV  -->  R-X(O)-R' + [Fe]
              Compound I                 Compound II        Resting State

 If this electron transfer step is rate limiting, the energy of the cation radical, [R-X-R']·+, relative to its parent compound could be a modulator determining why heteroatom oxide formation is more likely in S-containing compounds than in N-containing compounds. Therefore, the ionization potential of three similar compounds, methylamine, methanethiol and methylphosphine, was calculated and the results are shown in Table 1, page 40. Comparison among the three compounds shows the trend of the IP: methanethiol > methylphosphine > methylamine. This contradicts the observed preference for heteroatom oxide formation: P-O > S-O > N-O. Therefore, we concluded that the relative stability of the cation radical cannot be a modulator determining the oxide formation. Our results also do not support oxide formation via an electron transfer mechanism.
  Another possible mechanism for the het-
 eroatom oxide formation is a one-step direct
 oxygen atom transfer without the formation
 of the intermediate  cation radical. Based on
this mechanism,  a comparison of the
heteroatom oxide stabilities of the three compounds was made in order to examine if this is a possible modulator of the observed preference of heteroatom oxide formation in the order P-O > S-O > N-O. As shown in Table 1, our results show that the rank order of the stability of X-O parallels the observed preference of formation. These results demonstrate that the relative heteroatom oxide stability is a very good indicator of the observed preference of S-containing compounds for heteroatom oxide formation compared to N-containing compounds.

        Table 1: Energetics of N- and S-compounds chosen for this study. Energy is in kcal/mol relative to the parent compound and the triplet oxygen atom. Relative stability of radical cations and products.

Substrate      IP       [CH3XR]·+ + O·-    CH3X(O)R
CH3NH2        174.2          215.1            12.7
CH3SH         192.8          233.7            -2.5
CH3PH2        180.9          221.8           -56.9

(2) Internal Competition in Formation of Three Oxidation Products of Amines
  To identify reliable determinants of product distribution in cytochrome P450 mediated oxidations in amines, plausible pathways to the three types of products, Cα-hydroxides, N-hydroxides, and N-oxides, were considered. Specifically, the formation of Cα-hydroxides and N-hydroxides was assumed to proceed via an initial hydrogen atom abstraction producing the corresponding Cα radical and N radical as reactive intermediates. The formation of the N-oxide was assumed to proceed via a one-step direct oxygen transfer mechanism, due to its success in explaining the preference of oxide formation between N- and S-compounds. The stabilities of the Cα-radical and N-radical intermediates in the proposed H-abstraction mechanism and of the N-oxide product were hence calculated. As shown in Table 2, for the three amine substrates studied the Cα-radical is less stable than both the N-radical intermediate and the N-O product by a few kcal/mol, a result that is in contrast to the observed preference of Cα-hydroxylation-dealkylation in amines. Therefore, the relative stability of competing Cα- and N-radical intermediates is not a reliable determinant of different product formation.

        Table 2: Energetics of N- and S-compounds chosen for this study. Energy is in kcal/mol relative to the parent compound and the triplet oxygen atom. Relative stability of C- and N-radical intermediates from the H-abstraction mechanism and the three possible oxidation products of amines.

Substrate      [Cα·]+OH·        [N·]+OH·    [Cα-OH]           [N-OH]    [N-O]
CH3NH2           15.5             12.3       -46.8             -11.2     12.7
CH3NHC2H5        15.9'  14.4"     10.2       -45.8'  -47.7"    -10.3      6.1
C6H5NHCH3        16.9              9.5       -46.4              -9.8     12.9

  The two possible sites of Cα are denoted with ' and ".
    Comparison of the stability of the three oxidation products, also shown in Table 2, page 40, gives the order Cα-hydroxide > N-hydroxide > N-oxide in all three amines. The formation of the Cα-hydroxides and the N-hydroxides is exothermic in both cases, while the production of the N-oxide is endothermic. The large energy difference of ~50 to ~60 kcal/mol obtained between the N-oxide and the Cα-hydroxide in the three amines suggests that the formation of the Cα-hydroxy amine is energetically much more favorable. This result agrees with the observation that Cα-hydroxylation-dealkylation is the predominant oxidation reaction in amines. Thus it appears that the relative stabilities of the different products modulate the competition among the various reactions. Of the two heteroatom oxygenations, the N-OH product was found to be ~35 kcal/mol more stable than the N-O product and would therefore be predicted to be the next most abundant product.

 (3) Internal Competition in Formation of Two Oxidation Products of Thioethers
   To identify reliable modulators of cytochrome P450 mediated oxidations of sulfur-containing compounds, it was assumed that Cα-hydroxylation proceeds via H-abstraction and sulfoxide formation via direct oxygen transfer. The stability of the Cα-radical and OH radical intermediates for each of the three substrates was calculated, as shown in Table 3, and compared with the stability of the sulfoxide. In contrast to the results for the N-containing compounds, for all three thioethers the sulfoxide is more stable than the Cα-radical and OH radical intermediates by ~20 to ~30 kcal/mol. Thus, if the formation of the Cα-radical and OH radical is the rate limiting step in Cα-hydroxylation for thioethers, more S-oxide than Cα-hydroxide product is predicted, which is consistent with observations.
   To determine the uniqueness of this inference, a second comparison was made of the product stabilities, as shown in Table 3. For all three S-containing parent compounds, the Cα-hydroxylation product was found to be energetically more stable than the sulfoxide by ~25 to ~38 kcal/mol. Thus, if the relative product stability is used as a criterion for predicting product distribution, it leads to the prediction that the Cα-hydroxide is the predominant product in all three thioethers, a result which contradicts known experimental work on a few selected thioethers. Therefore, it can be concluded that the relative stability of the radical intermediates and the sulfoxide, rather than that of the final oxidation products, is an important determinant of product distribution for thioethers.
  Table 3: Energetics of N- and S-compounds chosen for this study. Energy is in kcal/mol relative to the parent compound and the triplet oxygen atom. Relative stability of the C-radical intermediate from H-abstraction and the two possible oxidation products of thioethers.

Substrate      [Cα·]+OH·        [Cα-OH]           [S-O]
CH3SH            18.6            -40.2             -2.5
CH3SC2H5         18.8'  15.8"    -37.8'  -43.3"   -12.0
C6H5SCH3         18.4            -38.4            -11.5

  The two possible sites of Cα are denoted with ' and ".
 Stereoselective Sulfoxidation of Thioanisole and p-methyl Thioanisole
   The results of each individual 125 ps molecular dynamics simulation for thioanisole and p-methylthioanisole, using the two geometric determinants (r < 4.0 A for reactivity and θ for stereochemistry), are summarized in Tables 4 and 5, page 43, respectively.
 The overall results yield enantiomeric ratios of sulfoxide products of 65:35 (R:S) for thioanisole and 22:78 (R:S) for p-methylthioanisole. Results for two other cutoff S-O distances, 3.5 and 4.5 A, were also analyzed to test the sensitivity of the results to the cutoff criterion. For thioanisole, 69:31 (R:S) and 61:39 (R:S) ratios were found for the 3.5 and 4.5 A criteria, both very similar to the 65:35 (R:S) value for the 4.0 A cutoff. For p-methylthioanisole, the values of 23:77 (R:S) and 24:76 (R:S), respectively, were also very similar to that found for the 4.0 A cutoff. The experimental results, also shown in Table 4, give 70:30 (R:S) for thioanisole and 48:52 (R:S) for p-methyl thioanisole.
   The most interesting result obtained from both experiment and theory is the modulation of stereoselectivity from R to S by the thioani-
sole p-methyl substituent. In order to under-
stand the role of  the para substituent in
modulating this selectivity, a detailed analysis
of substrate interactions with the binding  site
residues and the heme prosthetic group is
required. Our analysis  from the trajectories
         Table 4: Summary of MD results for each 125 ps trajectory of thioanisole and p-methylthioanisole and the comparison between theoretical and experimental enantiomeric ratios - thioanisole.

Run      Conf   Total   θ 0-80° (S)   θ 80-100°   θ 100-180° (R)   Ratio (S:R)
O1V1      R      318        141            6           171            45:55
O1V2      R      443        179           11           253            41:59
O2V1      S      394         99           13           282            26:74
O2V2      S      364        134           16           214            39:61
O3V1      S      322        144            6           172            46:54
O3V2      S      419         85           12           322            21:79
O4V1      R      398        100           15           283            26:74
O4V2      R      356        132           12           212            38:62
O5V1      S       11         10            0             1            91:9
O5V2      S       11          6            4             1            86:14
O6V1      R       18          3           10             5            37:63
O6V2      R       55         21           23            11            66:34
Total           3069       1035          114          1920            35:65
Experimental                                                          28:72

          Table 5: Summary of MD results for each 125 ps trajectory of thioanisole and p-methylthioanisole and the comparison between theoretical and experimental enantiomeric ratios - p-methylthioanisole.

Run      Conf   Total   θ 0-80° (S)   θ 80-100°   θ 100-180° (R)   Ratio (S:R)
O1V1      R      515        247            3           265            48:52
O1V2      R      451        437           12             2           100:0
O2V1      S      562        549            8             5            99:1
O2V2      S      548        334            6           208            62:38
O3V1      S      535        383            3           149            72:28
O3V2      S      558        534            3            21            96:4
O4V1      R      561        514            8            39            93:7
O4V2      R      557        483            5            69            88:12
O5V1      S      396        250           49            97            72:28
O5V2      S       70         11           18            41            21:79
O6V1      R      438        335           23            80            81:19
O6V2      R      285         13           85           189             6:94
Total           5478       4090          223          1165            78:22
Experimental                                                          52:48
 shows that the stronger lipophilic interaction of p-CH3 with Phe-87 and Tyr-96 causes the
 substrate to stay relatively rigid with a spe-
 cific orientation. In this orientation, the steric
 effect of residues Val-295, Ile-395, and Val-
 396 forces the S-CH3 side chain to move
 inward toward the empty space of the active
 site pocket and results in more S(-) enanti-
 omer.

  On the contrary, the para proton of thioani-
 sole has less interaction with the surround-
 ing residues. The only significant exception
 is the electrostatic interaction of the para
 proton with Asp-297. This smaller interaction
 with the residues causes the para proton of
 thioanisole to have more mobility than the p-
 CH3 of p-methylthioanisole. Although it is
 relatively mobile in the binding site, we found from the trajectories a higher probability for thioanisole to stay in the orientation favoring R(+) sulfoxide formation
 because the steric effect of residues Leu-
 244 and Thr-101 forces the S-CH3 group
 into that configuration.
   The predicted value for the p-methylthio-
 anisole sulfoxide enantiomer ratio shifts in
 the same direction as the observed value
 although the two values differ in absolute
 terms. Thioanisole favors R(+) sulfoxide for-
 mation (R:S = 65:35, theoretical, and 72:28,
 experimental) whereas p-methylthioanisole
 favors S(-) sulfoxide formation (R:S = 22:78,
theoretical, and 48:52, experimental). The
theoretical treatment thus predicts, in agree-
ment with experiment, that introduction of a
p-methyl substituent leads to an inversion of
the preferred absolute stereochemistry of
the sulfoxide product. The experimental and
theoretical results thus indicate that the p-
methyl substituent is an important modulator
of the stereoselectivity. Lipophilic and elec-
trostatic interactions between the p-methyl
substituent and residues of the active site
appear to determine the favored orientation
of the thioanisole framework. This orienta-
tion, in conjunction with the orientation of the
S-CH3 group, determines whether the R or
S enantiomer is produced.

Scientific Accomplishments and
their Relevance to EPA's Mission
  The electronic modulators of the compet-
ing hydroxylation and heteroatom oxidation
reactions in N- and S-containing compounds
were identified in this study with ab initio
quantum mechanics. The results obtained
can be helpful for predicting the preferred
oxidation products and assessing their toxic
potency for other N- and S-containing com-
pounds which are so heavily used as drugs,
insecticides or other industrial materials.
   In the study of the stereoselective sulfoxidation of thioanisole and its analogue by P450cam, we demonstrated that the product
distribution can be predicted from studying
the enzyme-substrate direct interaction
using molecular dynamics simulations. Fur-
thermore, given the fact that xenobiotics-
macromolecule or metabolite-macromole-
cule interaction is one of the important pro-
cesses for triggering toxicity, the experience
obtained here can be useful for further appli-
cation to these important events.

Future Objectives, Goals, and Plans
  The mechanistic study of P450-metabo-
lized oxidation reactions for a few groups of
xenobiotics will be continued. We also plan
to begin the study of adduct formations
between xenobiotics and macromolecules
which is a very important step in the overall
toxic process. Specifically, interaction of
some relevant xenobiotics with the heme
unit of the P450 enzyme and the nucleic
 acids will be studied.

 Acknowledgment:
  The support from the Environmental Pro-
tection Agency Grant #CR-818677-01-0 is
gratefully acknowledged.

 Relevant publications and reports
 1 D. L. Camper, G. H. Loew, and J. R. Collins, Steric and Electronic Criteria for Teratogenicity of Short Chain Aliphatic Acids, Int. J. Quant. Chem. QBS 17, 173 (1990).
 2 J. R. Collins, D. L. Camper, and G. H. Loew, Valproic Acid Metabolism by Cytochrome P450: A Theoretical Study of Stereoelectronic Modulators of Product Distribution, J. Am. Chem. Soc. 113, 2736 (1991).
 3 Y. T. Chang and G. H. Loew, Binding of Flexible Ligands to Proteins: Valproic Acid and Its Interaction with Cytochrome P450cam, Int. J. Quant. Chem. QBS 20, 161 (1993).
 4 P. R. Ortiz de Montellano, J. Fruetel, J. R. Collins, D. Camper, and G. H. Loew, Theoretical and Experimental Analysis of the Absolute Stereochemistry of cis-b-Methylstyrene Epoxidation by Cytochrome P450cam, J. Am. Chem. Soc. 113, 3195 (1991).
 5 P. R. Ortiz de Montellano, J. Fruetel, J. R. Collins, D. Camper, and G. H. Loew, Calculated and Experimental Absolute Stereochemistry of the Styrene and b-Methylstyrene Epoxides Formed by Cytochrome P450cam, J. Am. Chem. Soc. 114, 6987 (1992).
 6 The work presented here was published in Int. J. Quant. Chem. QCS 27, 815 (1993). G. H. Loew and Y. T. Chang, Theoretical Studies of the Oxidation of N- and S-Containing Compounds by Cytochrome P450.
1 Gilda H. Loew and Yan-Tyng Chang, Molecular Research Institute, 845 Page Mill Road, Palo Alto, CA 94304.
   High Performance Computing For Environmental
   Research1-2
   Note: Certain sections of this article were
 presented in an invited talk at the "Design for
 the Environment" Symposium, American
 Chemical Society Fall Meeting, August
 1994, Washington, DC.

 The National Environmental
 Supercomputing Center
   The U.S.  Environmental Protection
 Agency (EPA) established the National Envi-
 ronmental Supercomputing Center (NESC)
 in October 1992 to provide a high perfor-
 mance computing facility for environmental
 research. This supercomputing facility is located in the middle of the Great Lakes region of the United States, along the Saginaw River in Bay City, Michigan. The NESC is the only supercomputing center in the world dedicated solely to the support of envi-
 ronmental research. As the central comput-
 ing facility for EPA scientific needs, the
 NESC has become a focal point for collabo-
 rative efforts in environmental research
 within a short period of time.
  The NESC has more than 45 environmen-
 tal research projects and has been support-
 ing more than 300 users throughout the
 United States.  These projects are related to
 disciplines such as: environmental computa-
 tional chemistry, air quality modeling, global
 climate change, and the modeling of
 streams, lakes, and estuary systems. These
 projects have consumed the majority of the
 NESC resources. This paper will  describe
 the efforts of the NESC to provide empower-
 ing services to EPA scientists while fostering
 collaboration across disciplines, across
 organizations, and across geographical
 locations. Specifically highlighted  will be the
 efforts in environmental computational
 chemistry.
 Hardware and Software Resources
   The NESC began its operation with one of
 the most powerful supercomputers available at that time, a Cray Research Y-MP 8i/232. For data storage, the NESC had two StorageTek 4400 Automated Cartridge Systems (silos) with a total capacity of 2.4 terabytes of data. Since the NESC serves
 users across the United States, the center is
 well-connected through EPA's own dedi-
 cated high-speed telecommunication net-
 works and the Internet.
   Since the dedication of the NESC's supercomputer (Sequoia) in October 1992, it has been running at or greater than 95% of its
 capacity. After the installation of the Cray Y-
 MP, more and more environmental research-
 ers have discovered the potential value of
 using Sequoia, and the NESC computing
 resources have become increasingly in
 demand. The NESC's user community and
 its computing requirements grew rapidly dur-
 ing the first six months of its operation.  In
 the fall of 1993, in order to accommodate
 this growing demand, the NESC upgraded
 its original two-processor Y-MP 2/32 to a
 three-processor Cray Y-MP 3/64. To meet
 the requirements of its growing customer
 community, the NESC replaced its original
 Cray Y-MP with a state-of-the-art Cray C94
 2/64 in May 1994. In late October 1994, an additional (third) processor was added to
 the C94. These latest hardware upgrades
 are expected to provide increased through-
 put and faster job processing turnaround for
the NESC users.  In addition to the Cray
mainframe, the NESC also uses Silicon
Graphics and Data General distributed serv-
ers to  satisfy graphics and other UNIX work-
station requirements.
  The NESC also features several public
domain and third-party application software
packages, databases, mathematical, and
statistical library routines. Most of the avail-
able computational chemistry software pack-
ages are also installed and supported at the
NESC. Details about the chemistry user
resources are covered in a later section of
this article. Since visualization is an impor-
tant part of supercomputing, the NESC
maintains and supports several state-of-the-
art hardware and many popular graphics
software packages.
  For more detailed information about the
NESC's hardware and software, please refer
to "Message from the NESC Director" on
page 3.

The NESC User Community and Their
Scientific Projects
  The NESC's users, scattered throughout
the United States, include EPA scientists,
engineers, and those agencies, universities,
or companies having grants, co-operative
agreements, memoranda of understanding,
or contracts with EPA.  The NESC's
resources are allocated annually, based  on
peer-review of proposals submitted by indi-
viduals satisfying any of the above require-
ments. Trial accounts are also available,
 upon request, for new users. The current NESC projects support Congressional mandates such as the Clean Air Act, the Clean Water Act, and the Superfund cleanup initiatives. There are also projects
supporting local environmental initiatives
such as the Great Lakes, the Chesapeake
Bay, the Green Bay/ Fox River, and the San
Joaquin Valley projects.

Environmental Computational
Chemistry at the NESC
  In an attempt to provide a user-friendly
supercomputing environment at the NESC,
the authors of this article have implemented
many innovative approaches.  One such
effort provided remote chemistry users with
access to a molecular modeling software
package, which was installed on a powerful
 graphics workstation at the NESC. Depend-
ing on the capacity of the network, most of
these remote users experienced excellent to
adequate throughput and response. Some
of the NESC's users at EPA's Headquarters
in Washington, DC are successfully access-
ing this software package on their personal
 computers (PCs), using X Window System
emulation software. Plans exist to interface
all the supercomputer  chemistry software
packages with at least one user-friendly
graphics workstation molecular modeling
 software to improve productivity.
   The NESC's chemistry user community is relatively large and very active, and it consumes a major share of the NESC's hardware and software resources. Some of the major projects
in the area of Environmental Computational
Chemistry include: The Mechanisms of
Chemical Toxicity (U.S. EPA, Health Effects
Laboratory,  RTP, NC),  Ab Initio Calculation
of the Electronic Spectra of Polycyclic Aro-
matic Hydrocarbons (U.S. EPA, EMSL-Las
 Vegas, NV), Database of Molecular Electronic Structures for Environmental Risk Assessment and Toxicology (U.S. EPA ERL-Duluth, MN), and Exploring Toxic Mechanisms for the Design of Safe Chemicals (U.S. EPA OPPTS-Washington, DC).
  To meet the various  requirements of the
chemistry users, the NESC has an extensive
list of computational chemistry software
packages such as: AMBER, AMSOL,
 CHARMm, DISCOVER, DMol, Gaussian 92/DFT, and MOPAC. The most frequently used software package from the above list is Gaussian 92/DFT, distributed by Gaussian, Inc. The group from EMSL-Las Vegas heavily uses this suite of programs to calibrate and predict molecular structure and electronic spectra of polycyclic aromatic hydrocarbon molecules, which are among the major health hazards and environmental contaminants.
 NESC Training and Workshops
   The NESC also provides extensive train-
 ing for its users.  These training sessions,
 ranging from Introductory UNIX to Advanced
 FORTRAN and C Code Optimization, are
 conducted throughout the year. The authors
 of this article coordinated the first major
 event, the Computational Chemistry Work-
 shop, held at the NESC in September 1993.
 The United States Environmental Protection
 Agency's (EPA) National Data  Processing
 Division (NDPD)  and the Environmental
 Monitoring Systems Laboratory (EMSL) -
 Las Vegas jointly sponsored that event. The
 main objective of that workshop was to sur-
 vey the scientific objectives and achieve-
 ments of EPA in the field of computational
 chemistry. Topics of discussion included
 some of the major work being done in the
 field of environmental computational chem-
 istry using the NESC's supercomputing
 resources.
   The Computational Chemistry Workshop
 was a unique opportunity to hear  in-depth
 details about the new and exciting scientific
 discipline, Environmental Computational
 Chemistry.  Approximately 60 people, repre-
 senting various EPA-related institutions and
 others, attended the three-day workshop.
 Prominent scientists from regional institu-
 tions, such as: Dow Chemical, Michigan
 Molecular Institute, the University of Michi-
 gan, Saginaw Valley State University, and
 Wayne State University, participated in the
 scientific presentations and discussions.
 Other speakers included scientists from
 national and international universities spe-
 cializing in the field of computational chemis-
 try. Experts in areas such as Quantitative
 Structure Activity Relationship (QSAR) for
 chemical exposure and risk assessment,
 Computational Analytical Chemistry, Water
 and Atmospheric Chemistry Modeling, Tox-
 icity Prediction, and  Database Design, led
the scientific presentations and  subsequent
discussions. Of particular interest was the
fact that, for  the first  time, scientists from the
regulatory and the research wings of EPA
met and exchanged  ideas and discussed
 issues of mutual interests. Thus, the work-
 shop at the NESC served as a melting pot
 for the Agency's long-term research and
 regulatory objectives in the field of environ-
 mental computational chemistry.
   In April 1994 the NESC conducted a work-
 shop titled "Watershed, Estuarine and  Large
 Lakes Modeling". In addition to these  work-
 shops, the  NESC was also responsible for
 the International Environmental Visualization
 Workshop successfully held for the past two
 years in Cleveland.  Workshops conducted
 at the NESC such as the Computational
 Chemistry and the modeling of Watershed,
 Estuarine and Large Lakes have helped pro-
 vide focus to widely distributed environmen-
 tal research teams.
  In September 1994, the NESC organized
 a very successful Gaussian 92/DFT work-
 shop. NESC  users from EPA, EPA contrac-
 tors, and scientists from other government
 agencies, industries and universities partici-
 pated in the workshop.  Of the 28 partici-
 pants registered, 25 attended the entire
 three days of the workshop. Logistic sup-
 port was provided by the Martin Marietta
 staff at the NESC: Management, Systems,
 Computational Science Services, Visualiza-
 tion, Documentation, and Facilities groups.
 Overall, the workshop was conducted
 smoothly. Some of the suggestions for
 improvement provided by the participants
were:
  • Split the workshop into Gaussian I and II.
  • Some fundamentals are needed, scien-
   tists like J. A.  Pople should be invited.
  • Is it possible for the workshop to focus
   on a subset of applications?
  • A bit more hands-on sessions, lectures
   are too long/complex.
  • Overall an excellent workshop. Very
   useful and informative.
  • Hands-on session could be more struc-
   tured.
  • This course would be incomprehensible
   to a real  novice. Also, starting with  the
   INPUT and OUTPUT is difficult if student
   is a novice.
  • More supervision/instruction in hands-on
    sessions, send some of the reference
    materials out in advance so that partici-
    pants can prepare before arriving.
  • Q & A sessions and hands-on session
     not organized well. Parts of presenta-
     tions tended to drag on.
  • I guess I would have enjoyed more the-
    ory. However, the course is intended to
    be more introductory.
  • Skip AVS and construct Z-matrix by
    hand. Have more help available from
    experts in hands-on session.  I spent
    75% of my time waiting for help.
  • Dr.  Schlagel was a very interesting
    speaker and I enjoyed talking to him very
    much.
  • Another workshop held. Pre-prepared
    exercises in a step-wise fashion. Practi-
    cal recipes for hands-on session.
  • Well organized and a very pleasant
    experience. Hands-on sessions were
    particularly interesting. Keep up the
    good work!

The Future
  The future of high performance computing
and general automated information process-
ing for environmental research appears to
be extremely promising. High performance
computing is a well-known paradigm in envi-
ronmental research. As a direct conse-
quence of this wonderful tool, we are
beginning to experience an information
explosion in this field.
  However, what is really lacking  in this field
are concerted tools for information manage-
ment. Too many databases are scattered all
over the world addressing certain limited
information. EPA, National Science Founda-
tion, Department of Energy, or any other
environmentally related scientific  organiza-
tions will have to come forward in order to
establish a comprehensive plan and initia-
tive to bring together all this scattered
information.
National Library of Medicine (for the health-
related information processing) needs to
evolve for the environmental information
management as well.  It is also surprising to
note that a vast and interdisciplinary science
such as environmental science does not
have its own periodic abstracting service
similar to Chemical Abstracts. Initiation
of an Environmental Abstracts service would
be the first step in the right direction. The
authors strongly believe that the near future
will witness the realization of such new
initiatives.

Conclusion
  Using computers in environmental chem-
istry  research adds a new dimension to the
traditional experimental approach.  For
example, it is difficult to comprehend the
nature of ozone depletion or acid rain based
on certain rare and costly experimental data
points. By incorporating computers and
graphic visualization into this research, a
whole new world of graphical and three-
dimensional images emerges. This not only
improves comprehension but also saves
time and money while permitting more
aggressive and innovative scientific
research.

Acknowledgments
  K.  Namboodiri wants to express his sin-
cere  appreciation to Ben Bryan, Manager,
NESC, Martin Marietta Technical Services,
Inc.,  and the staff of the NESC, for their
excellent support.  Also, special thanks are
due to Kerslin Felske for carefully proofing
this manuscript.
Disclaimer
  The U.S. EPA does not endorse or rec-
ommend any of the commercial products or
trade names mentioned in this manuscript.
1  Krishnan Namboodiri, Martin Marietta Technical Services, Inc. (MMTSI), National Environmental Supercomputing
     Center (NESC), 135 Washington Ave., Bay City, MI 48708.
2  Walter M. Shackelford, Director of Scientific Computing, U.S. EPA, National Data Processing Division (NDPD),
     RTP, NC 27711.
   U.S. EPA Workshop on Watershed, Estuarine, and Large
   Lakes Modeling (WELLM)1
 Overview
   The U.S. Environmental Protection
 Agency's (EPA) workshop on Watershed,
 Estuarine, and Large Lakes Modeling
 (WELLM) was the second major scientific
 meeting to be held at the National Environ-
 mental Supercomputing Center (NESC)
 since its dedication in October  1992. This
 workshop followed the highly successful
 Computational Chemistry workshop held
 September 27-29, 1993.  These workshops,
 organized by Computational Science Ser-
 vices at the NESC, were designed to further
 the interaction between modeling, policy,
 and management staff within EPA-spon-
 sored programs.  In particular,  the work-
 shops focused on the relevance of the
 NESC's resources to the future of such pro-
 grams and they displayed the resources and
 services that EPA's National Data Process-
 ing Division (NDPD) provides for compute-
 intensive environmental modeling.
  The mission of the WELLM workshop was
 to survey the scientific objectives and
 achievements of EPA in modeling water
 quality, hydrology, hydrodynamics, sedi-
 ment transport and deposition, ground water
 pollutant transport, ecosystem based risk
 analysis, and policy analysis. Additional
 subjects included the multimedia aspects of
 water quality modeling, toolkit availability,
 and EPA support levels for models.
  The WELLM workshop was jointly spon-
 sored by EPA's NDPD, and the  following
 EPA organizations:  The Chesapeake Bay
 Program Office (Region 3), Annapolis, MD;
 The Great Lakes National Program Office,
 Chicago, IL; ERL-Athens,  GA; ERL-Duluth/
 Large Lakes Research Station, Grosse Ile,
 Ml; and EPA's National Environmental
Supercomputing Center (NESC), Bay City,
 Ml. The Program Committee of five senior
 EPA staff was selected on the basis of their
 dedication to the importance of modeling
 within EPA and their commitment to the
 importance of the NESC as a resource for
 EPA's modeling community.  The program
 committee consisted of: Robert Carsel,
 EPA, ERL-Athens, GA; Lewis Linker, EPA,
 Chesapeake Bay Program Office, Annapo-
 lis, MD; Pranas Pranckevicius, EPA, Great
 Lakes National Program Office, Chicago, IL;
 William Richardson, EPA, Large Lakes
 Research Station, Grosse Ile, MI; and Tho-
 mas Short, EPA, R.S. Kerr ERL, Ada, OK.
   The work of the Program Committee and
 Local Organizing Committee was greatly
 facilitated by the agreement of Professor
 Robert V. Thomann to act as the workshop
 moderator.  Professor Thomann, of the Envi-
 ronmental Engineering Department at
 Manhattan College, New York, is a world
 leader in the field of water quality issues and
 related subjects and proposed the Guiding
 Principle for the workshop and prepared the
 executive summary.
  Thirty-two participants attended the
 WELLM workshop and presented sixteen
 invited and six contributed papers of high
 quality and relevance to EPA interests. The
 speakers included representatives from EPA
 sites, other Federal agencies such as the
 National Oceanic and Atmospheric
 Administration, and the U.S. Army Corps of Engi-
 neers, state agencies, environmental con-
 sulting companies, Martin Marietta Technical
 Services, Inc.  (MMTSI), and Universities
 (including EPA grantees). The invited pre-
sentations were grouped around themes in
separate (numbered) sessions and each
session had associated with it an extended
discussion session of similar length.
Corresponding to each theme, Professor
Thomann posed a series of key questions to
be addressed in the group discussion ses-
sions.  Responses in the group discussion
sessions were coordinated by discussion
leaders and communicated to the workshop
participants in a Workshop Recommenda-
tions and Conclusions session. The final
session posed for discussion the theme of
"Workshop Follow-up and Implementation",
and the result was a group decision to pre-
pare the executive summary, shown below,
for presentation to EPA.
  The broad spectrum of participants and
corresponding areas of expertise made for a
productive and exciting workshop.  The
results of the workshop evaluation included
in Table 1, page 56, and Figures 5, page 55,
and 6, page 56, confirmed a high degree of
satisfaction on the part of the attendees.

Executive Summary

Introduction
  On April 18-20, 1994, a group of scien-
tists and engineers experienced in environ-
mental modeling1 [see NOTES at end of the
Executive Summary] came together at the
U.S. EPA National Environmental Super-
computing Center (NESC) to conduct an
"audit" of the current state of large scale
interactive environmental models. This
examination of the record of environmental
modeling revealed important insights into
the role of  predictive frameworks in advanc-
ing the understanding of environmental
behavior and, most importantly, in assisting
environmental decision making.
  More specifically, as the "audit" continued
during the Workshop, it became apparent
that large scale environmental modeling had
much to offer to national  environmental deci-
sion  making and policy. The assessment of
modeling large watersheds, estuarine, and
Great Lakes systems indicated that while
there were important scientific questions
and issues, the state of the art of modeling
had progressed sufficiently to be of signifi-
cant value in assisting upper level managers
and directors in developing policy and in
providing support for environmental
decisions.
  This Executive Summary is intended to
provide an overview of the important results
of the Workshop with specific emphasis on
the role of modeling in decision making and
policy.

Why this Workshop?
  The motivation and rationale for this
Workshop are an outgrowth of several obser-
vations:
  • The questions related to environmental
    decision making have grown in complex-
    ity, both in the environmental issues involved
    and in the extent of spatial and temporal scale2.
  • The economic and policy consequences
   of making a "right" or "wrong" decision
    have increased substantially.
  • The need for a high level of enforcement
    credibility and scientific defensibility is
    absolutely essential.
  • As a result of these pressures,  model-
   ing of atmospheric, aquatic and terres-
   trial environmental systems in recent
   years has increased rapidly in geo-
   graphic scope and degree of complexity
   (i.e., incorporation of increasing physi-
   cal, chemical and biological pro-
   cesses). The complexity has been
   further expanded by another trend
   toward the linkage of these systems to
   quantify the cross-media impacts of
    pollutants.
  The principal results of the Workshop
Process3 included a series of key recom-
mendations, supporting conclusions, and
suggestions for implementation.

Recommendations
1. Because of the  increasing spatial, tem-
   poral, and multi-media scope of contem-
   porary environmental questions, the
   U.S. EPA should provide the leadership
     among Federal agencies to develop,
     plan and build a National Environmental
     Decision Support System (NEDSS).
     Such a system would include multi-
     media models, telecommunications,
     visualization and Geographical Informa-
     tion System (GIS) tools integrated into a
     unit that would assist and support the
     decision making process.
 NEDSS would provide:
   • Quantitative evaluations of the environ-
     mental response to alternative control
     strategies.
   • Synthesis of monitoring data and model-
     ing to provide assessment of environ-
     mental quality trends.
   • Reduction of the uncertainty of policy
     results.
 2.   The basic modeling structure underlying
    the NEDSS should be a synthesized,
     linked, and  coupled interaction of  air-
    shed, watershed, and coastal models as
    illustrated in Figure 1, page 52.
 3.   In order to effectively utilize confirmed
    and accepted environmental models, the
    maintenance and upgrading of such
    models needs to  be institutionalized
     within the proposed NEDSS.
 4.  The National Environmental Supercom-
    puting Center (NESC) has provided the
    modeling community with a much
    needed, dedicated, and valuable
    resource for environmental modeling
    research and application. Support for
    this resource facility should be continued
    and computational resources should
    continue to expand to match growth in
    demand and model complexity.  The
    NESC can also assist in the  demonstra-
    tion of model utility to decision makers
    through practical applications to impor-
    tant issues confronting EPA and other
    resource agencies.
5.  Associated facilities such  as the visual-
    ization laboratory  and telecommunica-
    tions and teleconferencing capabilities
    have shown that such technologies can
   aid in comprehending complex environ-
   mental systems and should continue to
   be supported and expanded to fill latent
   demand.

Supporting Conclusions
  • The increased costs of environmental
   problems, the complexity of multi-media
   interactions, and the need to improve
   scientific credibility of environmental reg-
   ulations mandates the intelligent use of
   environmental models.
  • Models of environmental systems at
   EPA and other Federal and state agen-
   cies have been neither used nor sup-
   ported by management at an adequate
   level because of:
   a) an incomplete understanding of model
   utility and,
   b) the necessity for concerted and
   focused efforts by scientists, engineers,
   decision makers, and the general public.
 • Models are essential to decision making,
  yet excessive reliance on models that
  have not been confirmed in a manner
  acceptable to the community can lead to
  erroneous results.
 • The "track record" of the successful
  management and scientific utility of mod-
  els is now extensive and can be
  documented.
 • Early interactions with managers and
   decision makers are crucial to successful
  predictive models.  Such interactions
  would facilitate properly structured mod-
  els which are consistent with policy and
  problem objectives.  Early interactive
  relationships determine model complex-
   ity that is consistent with policy goals.
 • Model confirmation with observed data is
  essential in all cases except for analyses
  of a "screening" nature where direct
  comparisons of model output to
  observed data are not possible.
 • Models are the essential and important
  component for interpolating and
   extrapolating observational data, thereby
   expanding in a cost-effective manner the
   ability to estimate and understand areas
   that were  not directly measured.
   • The credibility of model diagnoses and
   predictions is synonymous with contin-
   ual "post auditing" (i.e., monitoring of the
   efficacy of purported results to environ-
   mental actions or lack of action; "How
   well is the prediction actually matched by
   subsequent observations?")
   • Models (in the fully integrated sense of
   field observations and mathematical
   descriptions) have often pointed the way
   for the development of future policies
   and decisions, especially with regard to
   the effectiveness of implementing vari-
   ous levels of environmental control
   strategies.
   • As problem contexts increase in geo-
   graphical  scale and process complexity,
   the divergence between model develop-
   ment and available field data is increas-
   ing. The depth of monitoring and field
   validation is lagging model computa-
   tional ability by an increasing amount.
  • For large scale models that are linked or
   coupled (e.g., atmospheric deposition
   and watershed models), the data
   requirements are unique and must
   reflect a common observational strategy
   and a resulting common data base.
  • Individual models exist at a variety of
   locations (e.g., watershed models in
   Chesapeake Bay, air quality models at
   RTP) but there is no central facility in
   EPA for development, maintenance and
   upgrading of environmental models.

Implementation Strategy
  • Senior EPA policy makers should
   receive a briefing report of these  recom-
   mendations reached at the WELLM
    Workshop.
  • The first initiative of NEDSS should be to
   focus on the Eastern United States with
   specific emphasis on the Great Lakes,
   Chesapeake Bay, and additional coastal
     regions and associated airsheds and
     watersheds4.
   • The various subsequent phases for con-
     structing the complete NEDSS should be
     undertaken by EPA in cooperation with
     other agencies.
   • Implementation should recognize the
     reorganization of the ORD and issues of
     how EPA program offices relate to one
     another and how they interact on these
     issues.
   • Representatives of the modeling com-
     munity as illustrated by those attending
     the WELLM workshop should provide
     input into the implementation process
     through examples of modeling needs,
     successful case studies and the impor-
     tance of the larger geographical per-
     spective on regional and local
     environmental quality.

 Notes
 Note 1:  Models of environmental systems
  are integrated and synthesized frame-
  works of observations from the laboratory
  and field, coupled with quantitative repre-
  sentations of environmental processes.
  Thus, models provide credible methods for
  predicting the results of environmental
  controls or policies.
 Note 2: An example of this increasing scale
  is  given by the Chesapeake Bay experi-
  ence where the initial focus was on the
  Bay proper.  Subsequent analyses indi-
  cated the significance of developing pre-
  dictive assessments of controls on the
  associated watershed and airshed of the
  Bay together with the nearby coastal
  region interactive with the Bay as illus-
  trated in Figure 2, page 54.
  The results of this expanding scale of
   impact on the Bay are illustrated in Figure 3,
   page 54, which shows the estimated
   improvement in main Bay deep water oxy-
  gen as calculated by three linked models:
   an air quality model of the airshed, a
   Chesapeake Bay watershed model and a
   hydrodynamic/water quality model of the
   Bay. The Program Goal of 40% controlla-
   ble nutrient reduction is improved signifi-
   cantly when compared to the Limit of
   Technology by inclusion of nutrient reduc-
   tions from the Clean Air Act and the full
   scale Basin-wide application of controls.
 Note 3: The Workshop was organized
   around five themes:
   • Environmental Modeling: Understanding
    and Prediction;
   • Trade-offs Between Complexity and
    Utility.
   • Modeling Cross-Media Linkages and
    Coupling.
   • Model Computation: Current and Future.
   • Model Support Systems and
     Visualization.
  Analysis of each theme was begun by sev-
  eral presentations to set the stage for sub-
  sequent discussion of each theme. The
  discussions were centered around a series
  of focus questions. A final session  sum-
  marized the key recommendations and
  supporting conclusions and implementa-
  tion suggestions.
Note 4: The Great Lakes and Chesapeake
  Bay systems in particular have been
  shown to be impacted by atmospheric
  sources far distant from location of impact
  as shown in Figure 4, page 55. As such,
  the decision making process requires an
  integrated framework to credibly evaluate
  control strategies to meet environmental
  objectives for these water bodies and their
  associated watersheds.
Workshop Feedback
The results of workshop questionnaires are
  shown in Figure 5, page 55, Figure 6,
  page 56, and Table 1, page 56.
1  George Delic, WELLM Workshop Program Coordinator, Martin Marietta Technical Services, Inc., National
     Environmental Supercomputing Center, 135 Washington Avenue, Bay City, MI 48708.
2  Robert V. Thomann, Workshop Moderator, Manhattan College, New York 10471.
Figure 2: Chesapeake Bay Watershed Model Expansion

Figure 3: Estimated Improvement in Main Bay Deep Water Oxygen (Chesapeake Bay model estimates of the percent
    improvement from the base case in main Bay deep water dissolved oxygen due to nutrient removal programs,
    for the Program Goal, Goal + Clean Air Act Amendments, Goal + CAA + All Basin, and Limit of Technology
    scenarios)
Figure 4: Source control for Great Lakes impacts and for Chesapeake Bay via airshed-coastal ocean interactions
    (Great Lakes and Chesapeake Bay model domains)

Figure 5: Mean score with N=19 respondents for each of sixteen questions (NESC WELLM Workshop, 18-20 April
    1994; mean of means = 3.47, standard deviation of means = 0.31)
Figure 6: Standard deviation of score with N=19 respondents for each of sixteen questions (NESC WELLM
    Workshop, 18-20 April 1994; mean of standard deviations = 0.56, standard deviation of standard
    deviations = 0.24)
Table 1: List of questions, sample mean (μ) and standard deviation (σ) for N=19 respondents on
    a scale of 0 to 4.

Evaluation of the presentations: the speakers
  1  Possessed a thorough knowledge of the workshop subject area        μ = 3.7   σ = 0.45
  2  Were well-organized and prepared for the workshop                  μ = 3.5   σ = 0.49
  3  Explained the presentation objectives clearly                      μ = 3.2   σ = 0.54
  4  Communicated the subject matter well                               μ = 3.2   σ = 0.56
  5  Stimulated interest in the workshop subject(s)                     μ = 3.5   σ = 0.77
  6  Were willing to answer questions in the workshop                   μ = 3.8   σ = 0.50
  7  Were courteous                                                     μ = 3.8   σ = 0.50
  8  Were willing to participate in discussions                         μ = 3.7   σ = 0.45

Evaluation of the workshop
  9  The abstract book adequately covered topics discussed              μ = 3.3   σ = 0.94
 10  The discussion questions listed were appropriate                   μ = 3.7   σ = 0.48
 11  The time allowed for discussion was adequate                       μ = 3.3   σ = 0.89
 12  The organization of the workshop was satisfactory                  μ = 3.4   σ = 0.64
 13  The pace of the workshop was comfortable                           μ = 3.0   σ = 0.73

Evaluation of the participant's response
 14  I would attend a NESC workshop again given an opportunity          μ = 3.8   σ = 0.47
 15  I measure my own participation in this workshop as                 μ = 3.1   σ = 0.69
 16  I measure how much this workshop meets my expectations as          μ = 3.4   σ = 0.61
   Visualization for Place-Based Management1
   Most of us find it hard to value what we
 can't visualize. To illustrate the point, con-
 sider Alaska.  Back in 1867, when Ameri-
 cans heard that Secretary Seward had
 negotiated a $7.2 million deal to purchase
 Alaska from the Russians, they were out-
 raged. Cries of "Seward's folly" were heard,
 and the purchase barely won approval from
 Congress. Why? To most Americans,
 Alaska was an unknown place they
 assumed to be worthless wilderness. They
 couldn't visualize Alaska, and they couldn't
 value the un-visualizable.
   How greatly the situation had changed by
 1980.  In that year, Congress passed the
 Alaska National Interest Lands Conservation
 Act, preserving almost 100 million acres of
 land. A key difference across 113 years was
 that Americans could now visualize "Alaska"
 and perceive it as a valuable place, with
 wildlife and natural features worthy of pres-
 ervation. Thanks to widespread exposure to
 literature, photographs, movies, and televi-
 sion coverage, Alaska no longer seemed an
 unknown, worthless place.
  The Environmental Protection Agency
 (EPA) is moving to a "place-based," ecosys-
 tem management approach, in which we
 work in partnership with many others to pro-
 tect, restore, and manage ecosystems. I
 believe that visualization will be an indis-
 pensable tool to aid our efforts.
  Let me offer another example of the spe-
 cial importance of visualization. Can people
 regulate their high blood pressure without
 the use of drugs?  The medical literature
 suggests that the answer is yes, through  the
 use of bio-feedback techniques. In bio-feed-
 back, the patient is given tools to enable him
 or her to self-regulate parameters such as
 blood pressure. It seems that, if one pre-
 sents a person with the capability to continu-
ously monitor blood pressure -- to visualize
the data if you will -- she will find ways
 (many of them subconscious) to self-regu-
 late her blood pressure within reasonable
 limits.
   Isn't this what we would like to see in our
 ecosystems?  Rather than relying exclu-
 sively on command-and-control regulation,
 penalties for non-compliance, and expen-
 sive, after-the-fact mitigation and restora-
 tion programs, shouldn't we also seek to
 empower people to self-regulate and oper-
 ate sustainably? (I am indebted to Dr. Dan
 Janzen, the tropical biologist, for suggesting
 this analogy.)
   Visualization is key to such a strategy for
 self-regulation for ecosystem protection --
 what one might term "eco-feedback."  And
 many of the techniques, technologies, and
 tools that you and others have developed for
 data visualization can be applied to eco-
 feedback. Let me review some ideas about
 priority areas for using visualization for eco-
 systems.

 Helping People Visualize Places of
 Interest to Them
  All of us have places we're interested in:
 our "home turf;" places we've visited or
 would like to visit; places we've studied;
 places we're afraid of, and so forth.
  Using the coming National Information
 Infrastructure, I would like to see all Ameri-
 cans, and eventually all citizens of the globe,
 presented with rich capabilities to visualize
 ecosystems of interest. Combining
 remotely-sensed and ground-based moni-
toring data with spatial, demographic, and
 biological information, citizens should be
able to do "fly-overs, -throughs, and -
unders;" zoom in and out; select layers for
display and filter data; run animations of
time-series data such as land-use change;
and so forth. Citizens should be given sim-
ple, yet powerful and flexible, interfaces that
allow them to visualize and interact with
information about ecosystems and run simu-
lations of alternative scenarios. I note, with-
out making any endorsement, that packages
such as Maxis Corporation's SimCity dem-
onstrate impressive capabilities along such
lines.
  Citizens should be empowered to define
their own ecosystems, rather than being pre-
sented with arbitrary boundaries and
instructed that those are the "correct" eco-
systems. We all need to know where the
political boundaries are and the land-owner-
ships, the watersheds and the ecoregions
according to various authorities, but we
should not impose them as the only ways to
think about ecosystems. And just as a citi-
zen can follow his own favored stock portfo-
lio, he should be able to follow his own
portfolio of ecosystems.
  Citizens must also be empowered to visu-
alize the components of ecosystems and the
processes associated with ecosystems.
This means innovative use of multimedia
technologies and perhaps virtual reality tools
to help citizens explore, learn, and under-
stand. Let a citizen see the inventory of
known biota of her place. Let her see what
each looks like, how it moves, what it
sounds like, what it eats, what eats it, what
is its habitat, and so forth.

Fostering "Friendly Competition"
  Americans believe in the positive power of
competition in the marketplace, on the
sports fields, and often in other contexts as
well.  I believe friendly competition could be
of great value for place-based management.
Give each political jurisdiction the visualiza-
tion tools to see what natural resources it
possesses now,  as well as in the past and
the predicted future. Do the same for indica-
tors of ecosystem service levels.  Empower
citizens to compare their ecosystem and
sustainability statistics -- their "batting aver-
ages," if you will. Use advanced visualiza-
tion tools to make it compelling and fun --
and it had better be compelling, and it
should be fun.
   We'll be enabled to make tremendous
progress if we make it possible to foster
friendly competitions among towns, coun-
ties, states, and nations to gauge who's
doing the best jobs of managing their
places.

Eco-Feedback or Self-Regulation
  Just as bio-feedback can enable individu-
als to regulate parameters such as blood
pressure, eco-feedback could enable com-
munities  and societies to self-regulate their
management of ecosystems. For this to
work, citizens need readily-visualizable,
unambiguous, real- or near-real-time indica-
tor information about their places or ecosys-
tems.  I know this is challenging, but I am
optimistic because we have seen this done
in other contexts that our society deems
important, such as weather and financial
information.  Through national partnerships,
I believe  we could construct usable indica-
tors of such things as land use/land cover
and ecosystem service levels that were con-
sciously designed to empower and enable
self-regulation to protect, restore, and man-
age ecosystems, conserve biodiversity, and
pursue sustainable development. If the citi-
zenry is then bathed in this information, and
we are doing other things like promoting
friendly competition, I believe we will  see
increased self-regulation and less need for
costly after-the-fact controls,  mitigation, and
restoration.
   I have spoken elsewhere about the vision
for an "Environmental Channel" on the infor-
mation superhighway -- a two-way, interac-
tive capability to empower citizens to do
more to protect the environment.  You in the
visualization community are out in front in
demonstrating technical capabilities that will
be essential in making the Environmental
Channel a reality.
1 Written by Steve Young, Chief, Client Support Branch, Program Systems Division, Office of Information Resources
    Management. Presented by Bill Laxton at EPA's International Environmental Visualization Workshop, Cleveland,
    OH, 30 August 1994.
   Experimental and Calculated Stabilities of Hydrocarbons
   and Carbocations1
 Introduction
   Mass spectral analysis of environmental
 samples depends critically upon the quanti-
 tative sensitivity of detection of a molecule in
 question. Analytical strategies for detection
 of new molecules can be designed most
 effectively when there is a basis for predic-
 tion of the detection sensitivity for that sam-
 ple and the type of mass spectral analysis to
 be done: EI, CI, positive or negative ion
 detection, FAB, electrospray, etc. The gas-
 phase basicities or proton affinities  (PA) and
 ionization potentials (IP) of molecules may
 be correlated with positive ion sensitivities,
 and electron affinities (EA) with negative ion
 sensitivities, for example with polynuclear
 aromatic hydrocarbons (PAH's). The quan-
 tum mechanical calculation of these proper-
 ties would make predictions of mass
 spectral sensitivities possible even in cases
 where experimental properties are not
 known. The results obtained might also per-
 mit the design of experiments for increased
 efficiency of detection with particular CI
 gases tailored to match the properties of the
 molecule to be analyzed.
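   For reference, these quantities have the
standard thermochemical definitions below
(the usual sign conventions, stated here as
a reminder rather than taken from this
article):

    \mathrm{PA}(B) = -\Delta H^{\circ}\ \text{for}\ B + \mathrm{H}^{+} \rightarrow B\mathrm{H}^{+}
    \mathrm{IP}(M) = E(M^{+}) - E(M)
    \mathrm{EA}(M) = E(M) - E(M^{-})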
   As both gas-phase experimental tech-
 niques and theoretical methods have devel-
 oped to permit accurate determination of the
 energetics of chemical processes, careful
 comparisons of experimental and theoretical
 results provide important insights into the
 current state of theoretical methodology and
 into the interpretation of experimental data.
 Heats of formation of many stable neutral
 molecules have long been well known,1 but
 experimental developments in the area of
 gas-phase ion energetics have particularly
 accelerated in the last 20 years. With the
 advent of equilibrium methods based on
 high-pressure mass spectrometry and ion
 cyclotron resonance spectrometry for deter-
 mination of relative ion stability and comple-
 mentary methods for determination of
 absolute ion stabilities,2,3 we now have a set
 of experimental thermodynamic data on
 over 1000 positive and negative ions.
 These data now rival those for neutral mole-
 cules in quantity,  although their quantitative
 accuracy is often somewhat less reliable,
 and entropy and heat capacity data are less
 abundant or accurate.  The wide variety of
 charge and structural types among ions
 makes these species especially critical in
 testing the limits of quantum theoretical
 methodology.  Much effort has ensued,
 therefore, to calculate energies of ions now
 known experimentally, with some success in
 reproducing relative trends with lower levels
 of theory4 and more success in reproducing
 absolute ion energies with higher levels of
 theory on small ions. One particular aim of
 our current work is to extend previous stud-
 ies on proton affinities of heteroatom bases
 to proton affinities of hydrocarbon bases and
 hydride affinities of corresponding carboca-
 tions by investigating systematically the
 effects of (1) geometry optimization, (2)
 basis set, and (3)  correlation method on
 these computed reaction energies.
  We begin here by concentrating attention
 on a series of C1, C2, C3, and C4 hydrocar-
 bon ions and related neutral molecules with
 diverse structures typical of those found in
 larger carbocations and hydrocarbons of
 general interest, such as PAH's.  The car-
 bocation and hydrocarbon set chosen pro-
vides a particularly critical test of the ability
of theory to handle difficult problems with
structures containing pi-bonds, strained
rings, and resonance-stabilized, aromatic,
H-bridged, and non-classical carbocations.5
The experimental reaction energies are well-
known in most cases by multiple experimen-
tal methods, so that we can achieve a "cali-
bration" of different levels of theory for
application to larger systems. Systematic
calibration on a wide variety of structures of
known energy is important in assessing the
accuracy that can be expected as the level
of theory is improved. This calibration also
might allow the identification of some lower-
level theoretical methods that are computa-
tionally feasible for larger ions, but, at the same
time, are capable of giving results to "chemi-
cal accuracy" on the order of 1-3 kcal/mol,
comparable to the  experimental  errors
expected for many larger hydrocarbon
ions.  In addition, these calibrations also
might justify sufficient confidence in the
accuracy of the theoretical energies (and
structures) to be able to use them to help
interpret experimental results, either by fill-
ing in data inaccessible experimentally or
clarifying cases where the experimental
results are in question. For carbocations,
experimental problems are sometimes espe-
cially difficult, because of the possibility of
rearrangement, problems in generation of
radicals and assignment of their photoelec-
tron spectra, slow rates of proton-transfer,
lack of gas-phase structural data and
entropy data, and an incomplete knowledge
of the dynamics of ion-molecule reactions
used to determine thermochemical limits or
equilibrium data. From the results reported
here and as yet unpublished results on
larger ions, we are especially encouraged by
the prospect that currently accessible quan-
tum theoretical models are now capable of
making major contributions in the solution or
reliable circumvention of such experimental
problems.

Discussion
  In this work we have used calculations on
EPA's Cray supercomputer and other com-
puters with the GAUSSIAN program.6 In
detailing basis set effects on hydrogenolysis
energies of small hydrocarbons and car-
bocations, we  have found5 some limitations
of augmented 6-31G(d,p) basis sets for
high-accuracy work that can be overcome
by employing triple-zeta Dunning basis sets,
which give a better description of the s and p
shell than is possible with 6-31G or 6-311G bases.
We have also found that the use of 5d func-
tions instead of the usual 6d function default
in the Gaussian package gives no  reduction
in accuracy and more internally consistent
basis set convergence as the basis sets are
augmented with further polarization and dif-
fuse functions. For the Pople basis sets, dif-
fuse functions are critical in achieving high
accuracy in comparing energies of neutrals
with carbocations, as noted earlier for proto-
nation of lone pair neutral bases and anionic
bases. Diffuse functions were not  needed
with the Dunning triple-zeta basis set, how-
ever, presumably because of the better sp
description of the diffuse region.
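  The reference reactions themselves are
not spelled out here; hydrogenolysis ener-
gies of this kind are conventionally taken
relative to methane (and, for cations, the
methyl cation), so the following balanced
reactions are offered only as illustrative
examples of that convention, not as equa-
tions quoted from this work:

    \mathrm{C_2H_6} + \mathrm{H_2} \rightarrow 2\,\mathrm{CH_4}
    \mathrm{C_2H_5^{+}} + \mathrm{H_2} \rightarrow \mathrm{CH_4} + \mathrm{CH_3^{+}}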
  We have observed that very high level cal-
culations at the MP4/cc-pVTZ level give
hydrogenolysis energies closely comparable
to experimental values as illustrated in Table
1, page 62.7 The differences between
experiment and theory are small, almost all
positive, and increase somewhat regularly in
magnitude as the size of the ion or molecule
increases. The direction of this error is such
that the theory treats the molecule in ques-
tion less well than the simple symmetrical
reference molecule, methane, as might have
been expected. Significantly for our interest
in PAH's, however, the benzene molecule is
exceptional, showing a 3 kcal/mol negative
error, for reasons that are, as yet, unclear.
Raising the level of correlation treatment to
CCSD(T) brings the benzene molecule
closer to experiment, but we have found in
other cases that such an effect is generally
offset by basis set effects in going to quadru-
ple-zeta basis sets.  A solution to this prob-
lem will have to await further calculations on
benzene at larger basis sets and calcula-
tions at high level on other aromatic mole-
cules. In Table 2, page 63, are shown the
effects of calculations at various levels of
electron correlation on the proton affinities of
small molecules.7 The MP4 values are
nearly identical with experimental proton
  affinities and all lower levels of correlation
  treatment are less satisfactory than MP4.
  For larger molecules, however, this level of
  theory is not practical, and we have looked
  at the effectiveness of an additivity scheme
  for approximation of the MP4/cc-pVTZ result
  from MP2/cc-pVTZ and MP4/6-31+G(d,p)
  values. The calculations at these levels of
  theory are achievable with reasonable com-
  putational resources for molecules of mod-
  erate size and have been within 0.8 kcal/mol
  of the known MP4/cc-pVTZ results for all
  cases tested in smaller molecules. Calcula-
  tions of the other properties of ionization
  potential and electron affinity have been car-
  ried out on a few small test molecules using
 the same methods outlined above and are
 giving results that agree closely with experi-
 ment, for the cases studied.
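  The additivity scheme itself is not written
out here; one common form, consistent with
the levels of theory named above and stated
here as an assumption rather than quoted
from this work, is

    E_{\text{MP4/cc-pVTZ}} \approx E_{\text{MP4/6-31+G(d,p)}} + E_{\text{MP2/cc-pVTZ}} - E_{\text{MP2/6-31+G(d,p)}}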
   Thus, we have found that the Gaussian
 program for quantum mechanical calculation
 at the MP4/cc-pVTZ level of theory repro-
 duces experimental energies within 1-2 kcal/
 mol for a test series of C1, C2, C3, and C4
 hydrocarbon ions and neutral molecules.
 Further tests show that Hartree-Fock level calcula-
 tions of ionization potentials and electron
 affinities via Koopmans' theorem correlate
 with experimental results.
 tions justify sufficient confidence in the accu-
 racy of the theoretical energies (and
 structures) to be able to use them to help
 interpret experimental results and make pre-
 dictions for new molecules.
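  For reference, the Koopmans' theorem
estimates mentioned above are the stan-
dard textbook relations (not results from this
work), with the electron-affinity form being
the cruder of the two approximations:

    \mathrm{IP} \approx -\varepsilon_{\mathrm{HOMO}}, \qquad \mathrm{EA} \approx -\varepsilon_{\mathrm{LUMO}}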

 References
 1  (a) Cox, J. D.; Pilcher, G. Thermochemis-
    try of Organic and Organometallic Com-
    pounds, Academic Press: New York,
    1970. (b) Pedley, J. B.; Rylance, J. "Sus-
    sex - N.P.L.  Computer Analyzed Ther-
    mochemical Data: Organic and
    Organometallic Compounds"; University
    of Sussex, 1977.
2 Lias, S.G.; Liebman, J.F.; Levin, R.D. J.
     Phys. Chem. Ref. Data 1984, 13, 695.
     Lias, S.G.; Bartmess, J. E.; Liebman,
     J.F.; Holmes, J. L.; Levin, R.D.; Mallard,
     W. G. J. Phys. Chem. Ref. Data 1988,
     17, Suppl. 1.
 3 (a) Aue, D.H.; Bowers, M.T. in Gas-Phase
     Ion Chemistry, Bowers, M.T., Ed.; Aca-
     demic Press: New York, 1979; Vol 2.  (b)
      Bartmess, J.E.; McIver, R.T., Jr. Gas-
     Phase Ion Chemistry Bowers, M.T., Ed.,
     Academic Press, New York, 1979, Vol 2;
     (c) Brauman, J.L. Gas-Phase Ion Chem-
     istry Bowers, M. T., Ed., Academic
     Press, New York, 1979, Vol 2. (d)
     Kebarle,  P. Annu. Rev. Phys.  Chem.
     1977, 28, 445. (e) Taft, R.W. In Progress
     in Physical Organic Chemistry, Taft,
     R.W., Ed.; Wiley-Interscience: New York,
     1983; Vol. 14, p. 247. (f) Wolf, J.F.; Sta-
     ley, R.H.; Koppel, I.; Taagepera, M.;
     McIver, R.T., Jr.; Beauchamp, J.L.; Taft,
     R.W. J. Am. Chem. Soc. 1977, 99, 5417.
     (g) Richert, J., Ph.D. Thesis, University of
     California, Santa Barbara, 1989.
4 Hehre, W. J.; Radom, L.; Schleyer, P. v.
    R.; Pople, J. A. Ab Initio Molecular
    Orbital Theory, Wiley: New York, 1986.
5 Del Bene, J. E.; Aue, D. H.; Shavitt, I. J.
    Am. Chem. Soc. 1992, 114, 1631.
6 Frisch, M. J.; Head-Gordon, M.; Trucks,  G.
   W.; Foresman, J. B.; Schlegel, H.B.;
   Raghavachari, K.; Robb, M. A.; Binkley,
   J. S.; Gonzalez, C.; DeFrees, D. J.; Fox,
   D. J.; Whiteside, R. A.; Seeger, R.;
   Melius, C. F.; Baker, J.; Martin, R.; Kahn,
   L. R.; Stewart, J. J. P.; Topiol, S.; Pople,
   J. A. Gaussian 90; Gaussian, Inc.; Pitts-
   burgh, PA, 1990.
7 Del Bene, J. E.; Aue, D. H.; Shavitt, I., to
    be published.
Table 1: Hydrogenolysis Energies of Hydrocarbons and Carbocations (kcal/mol).

                              Calcd      Exptl(a)      E Difference
Data for most reliable exptl energies:
  C2H2                       -107.2      -107.3           -0.1
  C2H4                        -58.7       -58.3            0.4
  C2H6                        -18.6       -18.4            0.2
  Propyne                    -118.1      -118.2           -0.1
  Allene                     -120.6      -120.0            0.6
  Propene                     -72.5       -71.4            1.1
  Cyclopropane                -79.6       -78.7            0.9
  Propane                     -35.0       -35.0            0.0
  CH3+                         86.1        85.1           -1.0
  C2H3+                        50.8        50.9            0.1
  C2H5+                       109.9       109.5           -0.4
  2-Propyl                    110.6       111.8            1.2
  Allyl                        71.5        72.7            1.2
  1,3-Butadiene              -122.8      -120.7            2.1
  2-Methylpropene             -85.3       -84.0            1.3
  Cyclopentadiene            -155.3      -153.7            1.6
  Methylallyl                  72.1        75.2            3.1
  Cyclopentenyl                48.9        51.2            2.3
  1-Butyne                   -134.7      -133.9            0.8
  Benzene                    -166.4      -169.2           -2.8
Less well-known experimental energies:
  Cyclopropene               -142.4      -139.6(?)         2.8
  CH5+                        133.0       135.4(?)         2.4
  C2H7+                       127.7       129.1            1.4
  Ethyl+H2                    110.9       115.5(??)
  Cyclopropenyl                35.4        35.0           -0.4
  Propargyl                     7.3         8.9            1.6
  2-Propenyl                   60.1        57.0(?)        -3.1
  Cyclopropyl                  34.2        41.8(????)
  Corner-prot c-propyl        105.2       106.8(?)         1.6
  1-Propyl                     90.6        92.2(?)         1.6

a. Experimental values are corrected to 0 K, with experimental ZPE's where available. All geome-
    tries optimized at MP2/6-31+G(d,p).



 Table 2. Differences in Proton Affinities from Calculations at Different Moller-Plesset Correlation
                             Levels using the cc-pVTZ Basis Set.(a)

                          [column headings for the six correlation-level comparisons are not legible in the source]
  CH4                       2.6     0.4     0.0     0.4     2.9     0.1
  C2H2                     -8.1     4.1    -2.2     1.9    -6.2    -1.9
  C2H4                     -8.4     2.4    -1.4     1.1    -7.4    -1.5
  (name not legible)       -2.8     1.8     0.3     2.1    -0.7    -0.1
  allene (to allyl)       -16.4     2.9    -1.1     1.9   -14.5    -1.4
  propyne                 -17.0     7.8    -3.0     4.8   -12.2    -3.2
  propene                 -13.6     3.7    -1.6     2.1   -11.5    -2.2
  cyclopropane             -7.1     2.3    -0.1     2.2    -4.9    -0.4

a. All energies in kcal mol-1. Experimental values are corrected to 0 K, with experimental ZPE's
    where available. All geometries optimized at MP2/6-31+G(d,p).


1  Donald H. Aue* and James W. Caras, Department of Chemistry, University of California, Santa Barbara, Santa
    Barbara, CA 93106.
   Infrared Frequencies and Intensities Calculated from MM3,
   Semiempirical, and Ab Initio Methods for Organic
   Molecules1,2
 Introduction
   Gas-phase experimental vibrational
 frequencies1 have been compiled from the
 literature for comparison with calculated fre-
 quencies.  A goal of our work is to evaluate
 the quality of the correlation of frequencies
 calculated at different levels of theory with
 experimental vibrational frequencies for a
 large number of molecules. This correlation
 will allow one to choose a calculational
 method for new molecules, taking into
 account the accuracy that can be expected,
 balanced against the calculational cost for
 different levels of theory. This work includes
 a nearly complete set of organic molecules
 for which reliable gas-phase experimental
 spectra have been assigned and is much
 more extensive than widely-quoted compila-
 tions of this type.2,3 The quality of the fit with
 experimental infrared spectra is now suffi-
 ciently good that these methods promise to
 be of practical use in the assignment or
 confirmation of infrared spectra of unknown
 molecules of interest in environmental ana-
 lytical problems.

 Methods
  In this work we have used calculations on
 EPA's Cray supercomputer and other com-
 puters with the GAUSSIAN program.3 The
 geometries of all of the molecules have
 been optimized at the single-determinant
 Hartree Fock level with the  3-21G and
 6-31G(d) basis sets.3 Many of these struc-
 tures are available from the Carnegie-Mellon
 Quantum Chemistry Archive.4 Harmonic
 vibrational frequencies were computed for
 each structure to identify and distinguish
 equilibrium structures from saddle-point
 structures on  the potential surfaces.
 Discussion
   A sample of calculated and experimental
 results is compared in Table 1, page 67. A
 linear least squares correlation analysis with
 the intercept constrained to be zero gives
 scaling factors for the HF/3-21G and HF/
 6-31G* levels of theory that give a best fit to
 experimental fundamental frequencies of
 0.9023 and 0.9000, respectively, which are signifi-
 cantly different from the 0.89 scaling factor
 most widely used in the literature for correc-
 tion of Hartree Fock frequencies.2,3  Scaled
 frequencies are tabulated using these scal-
 ing factors and the differences from  experi-
 mental values are tabulated alongside.
 Regression analysis with an unconstrained
 linear least squares method gives nearly
 identical results and intercepts close to zero.
 The correlation coefficients (r2) are 0.9963
 and 0.9980 with standard deviations of the
 frequencies in the two correlations of 56.8
 and 41.6 cm-1. Maximum errors are as high
 as 100-300 cm-1 in a few cases. The
 regression analysis results are summarized
 in Table 2, page 69. Included in these corre-
 lations are a large number of molecules of
 interest for environmental analysis, for
 example, aromatic hydrocarbons, oxirane,
 formaldehyde, acrolein,  aziridine, aniline,
 pyrrole, and the pyridines.
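  The zero-intercept least-squares fit
described above has a closed form,
k = Σ(ν_calc · ν_expt) / Σ(ν_calc²). The short
Python sketch below illustrates the proce-
dure; the eight CH4 and C2H6 frequencies
from Table 1 are used only as a small dem-
onstration set, and the residual standard
deviation and r² shown follow one common
convention rather than necessarily the exact
definitions used in this study.

    import numpy as np

    def zero_intercept_scale(calc, expt):
        """Least-squares scale factor k minimizing sum((expt - k*calc)**2)."""
        calc = np.asarray(calc, dtype=float)
        expt = np.asarray(expt, dtype=float)
        k = np.sum(calc * expt) / np.sum(calc * calc)
        residuals = expt - k * calc
        std_dev = np.sqrt(np.sum(residuals**2) / (len(calc) - 1))
        ss_tot = np.sum((expt - expt.mean())**2)
        r2 = 1.0 - np.sum(residuals**2) / ss_tot
        return k, std_dev, r2

    # CH4 and C2H6 HF/6-31G* and experimental frequencies (cm-1) from Table 1.
    calc = [3197, 1703, 3302, 1488, 3206, 1580, 1062, 326]
    expt = [2917, 1534, 3019, 1306, 2954, 1388, 995, 289]
    k, sd, r2 = zero_intercept_scale(calc, expt)
    print(f"scale factor = {k:.4f}, std dev = {sd:.1f} cm-1, r^2 = {r2:.4f}")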
  The results indicate, encouragingly, that
 these theoretical methods work reliably for a
 wide variety of molecular skeletons and
 functional groups and should, therefore, be
 widely applicable for calculations on other
 molecules. We have looked at whether dif-
 ferent classes of molecules lead to better
 correlations or different scaling factors and
 have found no major differences among CH,
 CHO, and  CHN molecules.  Dividing  the cal-
culated frequencies into  high (above  2800
cm-1), medium (600-2800 cm-1), and low
(below 600 cm-1) ranges leads to a signifi-
cant modification of the scaling factor and
reduction in the errors in some ranges, for
example to 27.7, 42.4, and 36.1 cm-1 for the
three frequency ranges at the HF/6-31G*
level with scaling factors of 0.9052, 0.8881,
and 0.8969, respectively. Separate treat-
ment of the CH stretching frequencies was
especially effective in reducing the standard
errors.  In Table 1, page 67, the vibrational
assignments for frequencies are illustrated
showing the potential for separate scaling of
other frequency types to reduce the errors in
predicting experimental frequencies from
calculated values. The CC stretching fre-
quencies show the most dramatic differ-
ences with scaling factors closer to 1.0.
  In most cases the frequencies have also
been  calculated at the MM3, AM1, HF/STO-
3G levels, to see if lower levels of theory can
give usable results.  The results in Table 2,
page  69, show that the quality of the correla-
tions decreases regularly with smaller basis
sets and with the semiempirical and molecu-
lar mechanics methods, but the correlations
could still be useful when more expensive
ab initio calculations are not possible.
  A variety of molecules have been studied
at higher levels of theory to see if these
errors can be minimized.  MP2 frequencies
and intensities have been calculated for a
variety of molecules with the 6-31G(d), 6-
31+G(d,p), and 6-311+G(2df,2pd) basis
sets.  Curiously, the frequencies calculated
at the two smaller basis sets give no better
correlations with experiment than the HF/6-
31G(d) frequencies, but the MP2/6-
311+G(2df,2pd) frequencies give what
appear to be better fits from the limited data
thus far. A summary table of the results  of
regression analyses is shown in Table 2.
With the MP2 calculations, the fits with har-
monic experimental frequencies are better
and slopes closer to 1.0 than with funda-
mental frequencies, as expected. Curiously,
the HF/6-31G* frequencies do not, however,
give a closer fit with harmonic experimental
values.
  Infrared intensities have been calculated
and tabulated for each frequency reported
and at each level of theory and compared
with what experimental data has been found
to be available thus far. Where quantitative
experimental integrated intensity data is
available, the fits with calculated intensities
are good, and semiquantitative intensity
comparisons in experimental spectra appear
to be well predicted by calculated intensi-
ties.
  The results of this study permit us now to
calculate the infrared frequencies for mole-
cules of concern in the environment whose
spectra are not well known or properly
assigned with confidence in  the accuracy
that can be expected from such calculations.
The compilation of theoretical and experi-
mental infrared intensity data in spreadsheet
form can be used to do a full simulation of
experimental spectra.
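  Such a simulation can be as simple as
placing a broadened band at each scaled
frequency with height proportional to the
calculated intensity. The Python sketch
below does this with Lorentzian line shapes;
the scaling factor, band width, grid range,
and the example intensities are assumptions
for illustration, not values taken from the
compiled spreadsheet.

    import numpy as np

    def simulate_ir_spectrum(freqs, intensities, scale=0.90, width=10.0,
                             grid=None):
        """Sum Lorentzian bands at scaled frequencies to build a model spectrum.

        freqs       -- calculated harmonic frequencies (cm-1)
        intensities -- calculated IR intensities (arbitrary units)
        scale       -- empirical frequency scaling factor (assumed ~0.90 here)
        width       -- half-width at half-maximum of each band (cm-1, assumed)
        """
        if grid is None:
            grid = np.linspace(400.0, 4000.0, 3601)  # 1 cm-1 spacing
        spectrum = np.zeros_like(grid)
        for f, inten in zip(freqs, intensities):
            f_scaled = scale * f
            spectrum += inten * width**2 / ((grid - f_scaled)**2 + width**2)
        return grid, spectrum

    # Two example bands; frequencies are from Table 1, intensities are invented.
    grid, spec = simulate_ir_spectrum([1703.0, 3197.0], [25.0, 60.0])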
References
1 Shimanouchi, T. "Tables of Molecular
    Vibrational Frequencies", National Stan-
    dard Reference  Data Series, No. 39,
    National Bureau of Standards1, Washing-
    ton, D. C., 1972.; Shimanouchi, T.  J.
    Phys. Chem. Ref. Data 1977,! 6,1.
2 Pople, J. A.; Schlegel, H.B.; Kri'shnan, R.;
    DeFrees, D. J.; Binkley, J. S.; Frisch, M.
    J.; Whiteside, R. A.; Hout, R. F.; Hehre,
    W. J. Int. J. Quantum Chem., Duantum
'    Chem. Symp. 1981,15,  269. |Pople, J.
    A.; Head-Gordon, M.; Fox, D. J.; Ragha-
    vachari,  K.; Curtiss, L. A. J. Chem. Phys.
    1989, 90, 5622.
3 Hehre, W. J.; Radom, L; Schleyer, P. v. R.;
    Pople, J. A. Ab Initio Molecular Orbital
    Theory, Wiley: New York, 1986.
4 Whiteside, R. A.; Frisch, M.J.; Pople, J. A.
    The Carnegie-Mellon Quantum Chemis-
    try Archive; Carnegie-Mellon University:
    Pittsburgh, 1983.
Table 1. Comparison of Calculated and Assigned Experimental Frequencies (cm-1).

Molecule   Symmetry   Mode              HF/6-31G*    Exptl     Ratio
CH4        a1         sym stretch          3197       2917    0.9124
           e          deg deform           1703       1534    0.9008
           t2         deg stretch          3302       3019    0.9143
           t2         deg deform           1488       1306    0.8777
C2H6       a1g        ch3 s-stretch        3206       2954    0.9214
           a1g        ch3 s-deform         1580       1388    0.8785
           a1g        cc stretch           1062        995    0.9369
           a1u        torsion               326        289    0.8865
           a2u        ch3 s-stretch        3200       2896    0.9050
           a2u        ch3 s-deform         1548       1379    0.8908
           eg         ch3 d-stretch        3249       2969    0.9138
           eg         ch3 d-deform         1644       1468    0.8929
           eg         ch3 rock             1338       1190    0.8894
           eu         ch3 d-stretch        3274       2985    0.9117
           eu         ch3 d-deform         1650       1469    0.8903
           eu         ch3 rock              890        822    0.9236
C3H8       a1         ch3 d-stretch        3262       2977    0.9125
           a1         ch3 s-stretch        3201       2962    0.9252
           a1         ch2 s-stretch        3194       2887    0.9039
           a1         ch3 d-deform         1658       1476    0.8902
           a1         ch2 scis             1636       1462    0.8936
           a1         ch3 s-deform         1571       1392    0.8860
           a1         ch3 rock             1286       1158    0.9001
           a1         cc stretch            930        869    0.9347
           a1         ccc deform            392        369    0.9418
           a2         ch3 d-stretch        3250       2967    0.9129
           a2         ch3 d-deform         1635       1451    0.8875
           a2         ch3 s-deform         1436       1292    0.8997
           a2         ch3 rock              985        940    0.9543
           a2         torsion               234        208    0.8889
           b1         ch3 d-stretch        3259       2968    0.9108
           b1         ch3 s-stretch        3193       2887    0.9042
           b1         ch3 d-deform         1642       1464    0.8916
           b1         ch3 s-deform         1561       1378    0.8828
           b1         ch2 wag              1503       1338    0.8905
           b1         cc stretch           1126       1054    0.9359
           b1         ch3 rock             1014        922    0.9090
           b2         ch3 d-stretch        3264       2973    0.9107
           b2         ch2 a-stretch        3217       2968    0.9227
           b2         ch3 d-deform         1652       1472    0.8910
           b2         ch3 rock             1331       1192    0.8952
           b2         ch2 rock              808        748    0.9259
           b2         torsion               293        223    0.7611
    Table 2. Standard Errors in Predicted Infrared Frequencies of Organic (C,H,N,O) Molecules.

                                  Standard Error in Frequency (cm-1)
    Level of theory               Zero Intercept   Non-zero Intercept      N
    MM3                                85.3              85.0             949
    AM1                                87.9              87.9            1084
    HF/STO-3G                          66.9              63.7            1086
    HF/3-21G                           56.8              56.1            1085
    HF/6-31G(d)                        41.6              41.0            1085
    MP2/6-31G(d)                       61.4              58.9             290
    MP2/6-31+G(d,p)                    59.3              51.7             402
    HF/6-31G(d)                        49.5              48.6             402
    MP2/6-31+G(d,p)                    47.4              36.9              54
    MP2/6-311+G(2df,2pd)               34.4              26.5              54
    Fundamental Freqs:
    HF/6-31G(d)                        37.0              35.4             133
    MP2/6-31+G(d,p)                    54.8              46.8             133
    Harmonic Freqs:
    HF/6-31G(d)                        51.1              44.0             133
    MP2/6-31+G(d,p)                    43.7              41.4             133
    High Frequencies (above 2800 cm-1):
    HF/6-31G(d)                        27.7                               255
    MP2/6-31+G(d,p)                    39.7                               108
1 Donald H. Aue*, Michele Guidoni and James W. Caras, Department of Chemistry, University of California, Santa
     Barbara, Santa Barbara, CA 93106.
2 Donald Gurka, United States Environmental Protection Agency, Environmental Monitoring Systems Laboratory, Las
     Vegas, NV 89119.
   Ab Initio Calculations of Proton Affinities, Ionization
   Potentials, and Electron Affinities of Polynuclear Aromatic
   Hydrocarbons and Correlation with Mass Spectral Positive
   and Negative Ion Sensitivities1,2
 Background
   The field of environmental analytical
 chemistry has been slow to utilize the power
 of computational chemistry.  Recently, statis-
 tical treatment of the data and statistical
 experimental design have been computer-
 ized and have been a great help in methods
 development and methods evaluation.
 However, limited use has been made of
 structure and energy optimization programs
 for the benefit of analytical methods
 development.
  Part of a feasibility study of the potential
of computational chemistry in environmental
analytical chemistry to support U.S. EPA
RCRA and Superfund legislation has been
our effort to predict the relative intensities of
environmental pollutants under both positive
and negative ion Chemical Ionization (CI)
conditions in the mass spectrometer.
This is important because of the possibility
of achieving enhancement of the ion signal
under CI compared with electron impact
ionization for trace analysis. The parameters
important for predicting these sensitivities
are Ionization Potentials (IPs), Proton Affinities
(PAs), and Electron Affinities (EAs). It is
possible to calculate these parameters from
ab initio calculations.
  Let us see how these parameters affect
the sensitivities in the positive and negative
ion modes. Siegel modeled the chemical
ionization source, considering both positive
and negative ion reactions in an ion source
and accounting for recombination rates and
ambipolar diffusion1. He arrived at a simple
expression showing that the ratio of the
negative- and positive-ion sensitivities is just
the ratio of the rate constants for the electron
attachment and charge transfer (or ion-molecule
reaction) processes.
  Classical theories2 of ion-molecule interactions
predict a collision or capture-rate coefficient (kc)
that varies inversely with the square root of the
reduced mass (μ) of the colliding pair. For example,
for HCN the polarizability α = 2.59 Å³ and the
dipole moment μD = 2.98 D. For CH5+ + HCN,
μ = 10.4 and kc = 3.38 × 10⁻⁹; for e⁻ + HCN,
μ = 0.000545 and kc = 4.67 × 10⁻⁷.
  This shows that theoretically one can
 have a two-orders-of-magnitude enhance-
 ment of the negative ion over the positive ion
 mode. On a more practical side, laboratory
 determinations of rate coefficients kexp have
 indicated that most exothermic proton trans-
 fer reactions proceed extremely fast at ther-
 mal energies3. The transfer of a proton,
 when energetically allowed, often proceeds
 on nearly every collision. There are few
 exothermic reactions for which kexp/kc < 0.5.
The exothermicities of most of these reactions
are relatively small, ΔH < 15 kcal/mol.
Therefore,  the proton affinities or relative
proton affinities seem to have an influence
on the rate of reaction and consequently the
intensity of the protonated molecule.
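
The two-orders-of-magnitude argument can be checked directly from the numbers quoted above, since the classical capture-rate coefficient scales as the inverse square root of the reduced mass. The short sketch below (illustrative only) performs that arithmetic.

    # Back-of-the-envelope check: scale the CH5+ + HCN capture rate by
    # sqrt(mu_ion / mu_electron) and compare with the quoted electron value.
    import math

    k_ion   = 3.38e-9      # capture-rate coefficient for CH5+ + HCN (from the text)
    mu_ion  = 10.4         # reduced mass for CH5+ + HCN (from the text)
    mu_elec = 0.000545     # reduced mass for e- + HCN (from the text)

    k_elec_est = k_ion * math.sqrt(mu_ion / mu_elec)
    print(f"estimated k(e- + HCN) = {k_elec_est:.2e}  (text: 4.67e-7)")
    print(f"enhancement factor    = {k_elec_est / k_ion:.0f}x")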
  For the electron attachment process,
A + e⁻ ⇌ A⁻· (forward rate constant k1,
reverse rate constant k-1), k1 is generally
large for molecules with EA > 0. If EA is
sufficiently large, the molecular anion A⁻·
will be stable against electron detachment
(i.e., k-1 will be small). Therefore, an intense
response for A⁻· will be observed, assuming
that A does not undergo a dissociative capture
process. If EA is small (less than 10 kcal/mol),
k-1 will be large, and the intensity of A⁻· will
be weak4.
  Let us now compare the responses (Table
1, page 73) that one obtains with Negative
Ion Chemical Ionization (NICI) with those
obtained by Electron Impact Ionization (EI).
These responses, normalized to the electron
impact ionization response of pyrene, show two
trends: (1) the EI response is fairly constant
for these PAHs, ranging from 0.75 for
anthanthrene to 1.12 for benzo[j]fluoranthene;
and (2) the NICI response variation is extreme,
spanning four orders of magnitude. Since the
EI response does not show much variation, the
influence of IPs on the mass spectral intensities,
at least for the PAHs, is minimal in the common
EI ion source that utilizes 70-eV electrons. In
contrast, the effect of EAs on sensitivities is
likely to be great. A similar table that compares
the negative to the positive ion intensity under
chemical ionization conditions is presented as
Table 2, page 74.6 Also, EAs7 for most of the
PAHs listed are given. There is a loose
relationship between EAs and the N/P ratio.
Generally, the compounds with EAs above
0.42 eV show an enhancement in the negative
ion mode. However, the size of the enhancement
does not correlate well with the magnitude of
the EAs. Other factors are most likely involved,
such as the proton affinities. Also, some of the
reported EAs may have to be reexamined.
Method / Approach
  The calculations were performed with
Gaussian software, mainly on the National
Environmental Supercomputing Center's
(NESC) Cray computer. The calculations
were generally attempted up to the 6-31G*
level. This level will not give good absolute
values (MP2 or MP4 would be better), but
for a series of compounds, such as the PAHs,
it should give good relative values (within
1-2.5 kcal/mol).8 This level of calculation
should be satisfactory for IPs using Koopmans'
theorem9 because of the cancellation of errors
when using Hartree-Fock (HF) calculations for
the highest occupied molecular orbitals (HOMOs).
The same cannot be said about the EAs, where
there is not a cancellation of errors in the
calculations.
  The following expressions show the "true"
values of the IP and EA, respectively:

       IP = -εi + E+(R) + ΔEIP(corr)
                  and
       EA = -εa - E-(R) + ΔEEA(corr)

where εi and εa are, respectively, the Hartree-Fock
energies of the appropriate occupied and unoccupied
orbitals of the neutral molecule; E+(R) and E-(R)
are the relaxation (reoptimization) energies of the
cation and anion, respectively, and are the differences
between the energy of the ion using the relaxed
orbitals and the frozen neutral-molecule orbitals;
and ΔEIP(corr) and ΔEEA(corr) are the electron
correlation corrections to the IP and EA, respectively.
The correlation energy difference ΔEIP(corr) is
usually positive, whereas the relaxation energy of
the cation, E+(R), is negative, and they are of
approximately equal magnitude. As a result, the
correlation and relaxation corrections tend to cancel
in the calculation of IPs. The anion case is different:
the correlation energy difference ΔEEA(corr) is again
usually positive, but now the relaxation energy enters
as a positive quantity, -E-(R). The result is that the
correlation and relaxation corrections are of the same
sign and magnitude and, therefore, -εa is not a good
approximation to the EA.
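
The sign pattern described above can be illustrated numerically. In the sketch below, all orbital, relaxation, and correlation energies are hypothetical values chosen only to show how the corrections roughly cancel for the IP but reinforce for the EA; they are not results from this study.

    # Hypothetical numbers illustrating why Koopmans' theorem works better for IPs than EAs.
    eps_homo, eps_lumo = -7.5, -0.8              # orbital energies of the neutral (eV), assumed
    E_relax_cation, dE_corr_ip = -0.9, +1.0      # cation relaxation (negative) vs correlation (positive)
    E_relax_anion,  dE_corr_ea = -0.7, +0.8      # for the anion both corrections enter with a positive sign

    ip_koopmans = -eps_homo                                  # 7.5 eV
    ip_true     = -eps_homo + E_relax_cation + dE_corr_ip    # 7.6 eV: corrections nearly cancel
    ea_koopmans = -eps_lumo                                  # 0.8 eV
    ea_true     = -eps_lumo - E_relax_anion + dE_corr_ea     # 2.3 eV: corrections reinforce

    print(f"IP: Koopmans {ip_koopmans:.1f} eV vs corrected {ip_true:.1f} eV")
    print(f"EA: Koopmans {ea_koopmans:.1f} eV vs corrected {ea_true:.1f} eV")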
     Table 1. Comparison of Response Factors for PAHs Obtained by NICI (with Methane) and
                                 Electron Impact Ionization5.

    COMPOUND                   EI      NICI
    Phenanthrene               0.94    0.04
    Anthracene                 0.89    0.54
    2-Methylanthracene         0.78    0.35
    1-Methylphenanthrene       0.84    0.30
    Fluoranthene               0.99    8.90
    Pyrene                     1.00    0.03
    Benzo[a]fluorene           0.99    0.12
    Benzo[b]fluorene           1.07    0.16
    Benzo[a]anthracene         1.09    6.50
    Benzo[b]fluoranthene       1.07    38
    Benzo[j]fluoranthene       1.12    42
    Benzo[k]fluoranthene       1.02    36
    Benzo[e]pyrene             0.80    0.01
    Benzo[a]pyrene             0.89    120
    Perylene                   0.90    1.5
    Indeno[1,2,3-cd]pyrene     0.81    120
    Benzo[ghi]perylene         0.87    64
    Anthanthrene               0.75    170

 Results and Discussion
  The IPs that were calculated using Koop-
 mans' theorem correlated linearly to the
 experimentally derived IPs. For the 18 cal-
 culated IPs, the average difference from the
experimental numbers was 0.27 eV. The
 EAs calculated from Koopmans' theorem did
not agree with the experimental values.
However, the experimental values showed
some correlation with the energies of the
lowest unoccupied molecular orbitals.
  The proton affinities calculated from the
Hartree-Fock energies of the neutrals and
the protonated species are listed in Table 3,
page 75. The proton affinities calculated
with the 6-31G* basis set at the 3-21G
geometry (6-31G*//3-21G, single point cal-
culation) are approximately equal to the
                     Table 2: NICI/PICI Response (N/P) for Selected PAHs.

    COMPOUND                   EA        N/P
    Naphthalene               -0.06      0.006
    Acenaphthylene             0.77      6.0
    Fluorene                    —        0.31
    Phenanthrene               0.03      0.11
    Anthracene                 0.49      2.9
    Fluoranthene               0.63      8.1
    Pyrene                     0.45      0.23
    Benzo[a]anthracene         0.42      13
    Chrysene                   0.26      0.12
    Benzo[b]fluoranthene        —        220
    Benzo[k]fluoranthene        —        150
    Benzo[a]pyrene             0.64      400
    Benzo[e]pyrene             0.35      0.46
    Indeno[1,2,3-cd]pyrene      —        380
    Benzo[ghi]perylene         0.51      210
    Dibenzo[a,h]anthracene     0.37      27
proton affinities calculated at the fully optimized
6-31G* geometry. The proton affinities at the
3-21G level are approximately 5 kcal/mol less
than the other values. Although these values are
closer to the experimental numbers10, this is most
likely fortuitous. We are investigating the trend as
we go to higher levels. At the MP2 level for
benzene and toluene, the proton affinities decrease
by approximately 14 kcal/mol from the 6-31G*
values. Figure 1, page 76, plots the experimental
proton affinities against the proton affinities
calculated at the 6-31G* level. The theoretical
values correlate fairly well (correlation
coefficient = 0.956) with the experimentally
determined proton affinities.
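
For readers unfamiliar with the bookkeeping, the sketch below shows how a proton affinity is obtained from ab initio total energies, PA = E(M) - E(MH+) with E(H+) = 0, converted from hartrees to kcal/mol. The two total energies are hypothetical placeholders (chosen so that the difference reproduces the 195.5 kcal/mol benzene value in Table 3), not energies taken from this study.

    # Proton affinity from total electronic energies (hartree -> kcal/mol).
    HARTREE_TO_KCAL = 627.5095

    def proton_affinity(E_neutral_hartree, E_protonated_hartree):
        """Return PA in kcal/mol; E(H+) = 0 at the Hartree-Fock level."""
        return (E_neutral_hartree - E_protonated_hartree) * HARTREE_TO_KCAL

    E_benzene  = -230.7031   # hypothetical HF total energy of the neutral (hartree)
    E_benzeneH = -231.0147   # hypothetical HF total energy of the protonated species
    print(f"PA ~ {proton_affinity(E_benzene, E_benzeneH):.1f} kcal/mol")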

Conclusion
  At the levels of theory attempted for the
PAHs, the correlation is satisfactory. However,
the absolute proton affinities and electron
affinities are significantly different from the
experimentally determined values. The
enhancements in sensitivities in the negative
ion chemical ionization mode over those in the
EI mode and in the positive ion chemical
ionization mode are mainly due to the magnitude
of the EAs for these compounds and the reaction
rate for the electron capture process. Further
work will concentrate on
                Table 3. Comparison of Proton Affinities: Theory and Experiment.

    COMPOUND               PA(3-21G)   PA(6-31G*//3-21G)   PA(6-31G*)
    Benzene                  190.0          195.5             195.5
    Toluene                  198.9            —               204.2
    Naphthalene              205.2          210.4             210.4
    Azulene                  245.7          248.1             248.0
    Biphenylene              209.2          214.4             214.9
    Acenaphthylene           221.3            —               225.2
    Anthracene               225.1            —               229.4
    Phenanthrene             207.4            —               213.1
    Pyrene                   208.9          214.4               —
    Triphenylene             206.3          212.4               —
    Chrysene                 213.6          219.4               —
    Naphthacene              235.1            —               239.4
    Coronene                 214.3          220.5               —
    1-Methylnaphthalene      211.4          216.1             216.1
    Acenaphthene             217.2            —               222.8
    Fluorene                 199.3          205.3             205.3
    9-Methylanthracene       229.5          233.6             233.4
             Figure 1: Correlation of Theoretical Proton Affinities with Experiment.
4 Valkenberg, C.A.; Knighton, W.B.; Grimsrud, E.P. J. High Resolut. Chromatogr.
    1986, 9, 320.
5 Oehme, M. Anal. Chem. 1983, 55, 2290.
6 Daishima, S.; Iida, Y.; Shibata, A.; Kanda, F. Org. Mass Spectrom. 1992, 27, 571.
7 Younkin, J.M.; Smith, L.J.; Compton, R.N. Theoret. Chim. Acta 1976, 41, 157.
8 Uggerud, E. Mass Spectrom. Reviews 1992, 11, 389.
9 Koopmans, T. Physica 1933, 1, 104.
10 Lias, S.G.; Liebman, J.F.; Levin, R.D. J. Phys. Chem. Ref. Data 1984, 13, 695.
1  L.D. Betowski and Steve Pyle, U.S. EPA, EMSL-Las Vegas, P.O. Box 93478, Las Vegas, NV 89193-3478.
2  Donald H. Aue and James W. Caras, Department of Chemistry, University of California, Santa Barbara, Santa
     Barbara, CA 93106.
  Parameterization of Octanol / Water Partition Coefficients
  (LogP) Using 3D Molecular Properties: Evaluation of Two
  Published Models for LogP Prediction
 Disclaimer
   This manuscript has been reviewed by the
 Health Effects Research  Laboratory, U.S.
 Environmental Protection Agency (EPA) and
 approved for publication.  Approval does not
 signify that the contents necessarily reflect
 the views and policies of the Agency, nor
 does mention of trade names or commercial
 products constitute endorsement or recom-
 mendation for use.

 Background
   The water/lipid partitioning process that
 governs transport and bioavailability in bio-
 logical systems plays a central role in modu-
 lating and determining the relative activity of
 many xenobiotics. The development of
 approximate computational means for mod-
 eling this partitioning capability has contrib-
 uted greatly to the field of quantitative
 structure-activity relationships. The most
 extensively parameterized, tested, and
 widely used computational approach for
 estimating such quantities is the empirical,
 fragment-based approach of Hansch and
 Leo1 (and the automated CLOGP version2)
 that is heavily relied upon within the Agency
 for toxicity screening. CLOGP treats a mol-
 ecule as a collection of fragments and mod-
 els the log of the octanol/water partition
 coefficient (logP) as a sum of parameterized
 fragment constants, where the contribution
 of each fragment to the total partitioning is
 assumed to be invariant to 3-dimensional
 molecular environment. As a result, the
 CLOGP method is unable to account for
 many types of non-bonded interactions and,
 thus, is incapable of distinguishing 3D con-
 formational isomers or closely related posi-
 tional isomers.  The method also relies on
 an incomplete set of fragment coefficients
 and on a large number of correction factors
 to account for cases where the strict additiv-
 ity assumption does not hold.
   An alternative approach to the estimation
 of partition coefficients explicitly incorpo-
 rates properties derived from the 3D molec-
 ular structure. The present study contrasted
 two published models for logP estimation
 based on 3D molecular properties. The
  goals were to evaluate the overall predictivity,
  relative performance, and ability of these
  models to meaningfully distinguish logP values
  for conformational and positional isomers.

 Methods and Models
  The published Bodor and Huang3 logP
 prediction model was derived from a training
 set of 302 compounds. A large number of
 potential parameters, including higher order
 non-linear terms, were screened for statisti-
cal association with experimental logP val-
ues by a stepwise linear regression
 procedure3. The final 18-term function has
 the form shown in Equation 1:

      LogP = Const. + a1·μ + a2·Ialk + a3·SQON + a4·ABSQ + a5·MW + a6·NC
             + a7·SQN + a8·SQN² + a9·SQN⁴
             + a10·SQO + a11·SQO² + a12·SQO⁴
             + a13·SA + a14·SA²
             + a15·OVAL + a16·OVAL² + a17·OVAL⁴                    (Equation 1)
where a1-a17 are regression coefficients3; μ is
the total dipole moment; Ialk = 1 for alkanes and
0 for non-alkanes; SQN, SQO, and SQON are sums
and products of the partial atomic charges on
nitrogen and oxygen atoms; ABSQ is the sum of
the absolute charges of all atoms; MW is the
molecular weight; NC is the number of carbon
atoms; SA is the van der Waals surface area; and
OVAL is an ovality function. All models of this
general form are referred to here as BODOR
models; the original published model is referred
to as BODOR0.
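
As an illustration of how a BODOR-type function is evaluated, the sketch below computes Equation 1 for a single molecule. The coefficients, the constant, and the descriptor values are placeholders, not the published BODOR0 parameters.

    # Evaluate an Equation 1-style function from a dictionary of 3D descriptors.
    def bodor_logp(desc, a, const=0.35):
        return (const
                + a[0] * desc["mu"] + a[1] * desc["Ialk"] + a[2] * desc["SQON"]
                + a[3] * desc["ABSQ"] + a[4] * desc["MW"] + a[5] * desc["NC"]
                + a[6] * desc["SQN"] + a[7] * desc["SQN"]**2 + a[8] * desc["SQN"]**4
                + a[9] * desc["SQO"] + a[10] * desc["SQO"]**2 + a[11] * desc["SQO"]**4
                + a[12] * desc["SA"] + a[13] * desc["SA"]**2
                + a[14] * desc["OVAL"] + a[15] * desc["OVAL"]**2 + a[16] * desc["OVAL"]**4)

    coeffs = [0.1] * 17                      # placeholder a1..a17 (not the published values)
    descriptors = {"mu": 1.8, "Ialk": 0, "SQON": 0.0, "ABSQ": 2.1, "MW": 122.0, "NC": 7,
                   "SQN": 0.0, "SQO": 0.3, "SA": 145.0, "OVAL": 1.2}  # placeholder descriptors
    print(f"logP estimate: {bodor_logp(descriptors, coeffs):.2f}")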
  The published Kantola, Villar and Loew4
logP prediction model function was proposed
prior to statistical analysis and, subsequently,
was "empirically calibrated" on a training set
of 90 chemicals. Three terms involving
summations over all atoms were chosen to
represent hydrophobic cavity formation,
hydrophilic charge contributions to the
interaction energy, and atomic polarizability
contributions to the hydrophobic interaction.
The 18-term function has the form shown in
Equation 2:

      LogP = Σi [ αi(R)·SAi + βi(R)·SAi·(Δqi)² + … ]          (Equation 2)

where i sums over all atoms of the six atom
types R = C, H, N, O, Cl, and F.
  Parameters were recalculated for the
TRAIN90 data set. This was successfully
accomplished for Equation 2, page 78, with
the resulting model referred to as LOEW1.
LOEW1 and published LOEW0 logP values
were in good agreement. However, the effort
to derive new regression coefficients for
Equation 1, page 77, based on TRAIN90 was
unsuccessful due to the highly collinear nature
of the regression parameters. To test the
significance of the non-linear terms in
Equation 1 and to effect a solution to the
regression problem, four new condensed
parameters - SQNX, SQOX, SAX, and OVALX -
were defined in terms of the weighted
polynomials in the original published BODOR0
model3, e.g., SQNX = a7·SQN + a8·SQN² + a9·SQN⁴.
The resulting 11-term equation was denoted
BODOR1X. In addition, the performance of models
was examined in which the VDW surface area was
replaced by the solvent-accessible surface area
(defined by the addition of 1.5 Å to the VDW
surface), denoted BODOR1XS and LOEW1S, and in
which no higher-order polynomial terms were
included in Equation 1, denoted BODOR1.
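
The condensed-parameter idea can be sketched as follows; the published weights, the synthetic training data, and the reduced design matrix below are illustrative assumptions, not the actual TRAIN90 fit.

    # Collapse a weighted polynomial into a single condensed descriptor, then refit
    # the reduced, less collinear design matrix by ordinary least squares.
    import numpy as np

    a7, a8, a9 = 0.5, -0.2, 0.01          # placeholder published weights for the SQN terms

    def sqnx(sqn):
        return a7 * sqn + a8 * sqn**2 + a9 * sqn**4   # condensed parameter SQNX

    rng = np.random.default_rng(0)
    sqn = rng.uniform(0.0, 2.0, size=90)              # pretend TRAIN90 descriptor values
    mw = rng.uniform(80.0, 400.0, size=90)
    logp = 0.8 * sqnx(sqn) + 0.01 * mw + rng.normal(0.0, 0.2, size=90)  # synthetic response

    X = np.column_stack([np.ones(90), sqnx(sqn), mw])  # intercept + condensed terms
    coefs, *_ = np.linalg.lstsq(X, logp, rcond=None)
    print("refit coefficients (const, SQNX, MW):", np.round(coefs, 3))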

 Results and Discussion
  Summary statistics for application of
CLOGP, BODOR0, and various models derived
from TRAIN90 to prediction of the TRAIN90
and TEST102 data sets are provided in Table 1.
Not surprisingly, all models performed
reasonably well on the TRAIN90 set. In contrast,
all models, with the sole exception of CLOGP,
performed significantly worse when extrapolated
to prediction of the TEST102 set. The predictive
capability of these models varies considerably
within subclasses of the TEST102 data set
(results not shown), with predictions for some
chemical classes (steroids and glucopyranosides)
much worse than for others.
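
The summary statistic reported in Table 1 is the adjusted R-squared of a linear regression of predicted onto experimental logP; the sketch below shows the calculation on placeholder data (not values from the study).

    # Adjusted R2 of a fit of predicted to experimental logP.
    import numpy as np

    pred = np.array([0.9, 4.8, 5.2, 1.5, 3.1, 2.0])   # model-predicted logP (placeholder)
    expt = np.array([0.5, 5.1, 5.6, 1.6, 3.3, 1.8])   # experimental logP (placeholder)

    A = np.column_stack([np.ones_like(expt), expt])
    coef, *_ = np.linalg.lstsq(A, pred, rcond=None)
    resid = pred - A @ coef
    r2 = 1.0 - np.sum(resid**2) / np.sum((pred - pred.mean())**2)
    n, p = len(pred), 1                                # one regressor
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    print(f"R2 = {r2:.3f}, adjusted R2 = {adj_r2:.3f}")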
  In order to assess the impact of the nature
 and size of the TRAIN set on the predictivity
of the various models, 30 chemicals, repre-
sentative of the various chemical classes in
TEST102, were added to the TRAIN90 set
to yield the expanded TRAIN120 set.
             Table 1: Summary Statistics (Adj. R2)* for Overall Model Performances.

                                     CLOGP   BODOR0   BODOR1X   LOEW1   LOEW1S
    TRAIN90 modelsb
      applied to TRAIN90             0.909   0.884    0.919     0.921   0.927
      applied to TEST102             0.957   0.729    0.748     0.731   0.711
    TRAIN120 modelsc
      applied to TEST102             0.957   0.729    0.857     0.901   0.884

* Adjusted R squared of the linear regression fit of predicted to experimental logP values.
b BODOR1X, LOEW1 and LOEW1S models developed from TRAIN90 set; all models applied to prediction of logP for
     the TRAIN90 set and the TEST102 set.
c BODOR1X, LOEW1 and LOEW1S models developed from TRAIN120 set (TRAIN90 + 30 from TEST102); all models
     applied to prediction of logP for TEST102 set.
Application of these models to prediction of
the TEST102 set is presented in the last row of
Table 1. [Note that this is no longer a strict
extrapolation, since TRAIN120 includes 30 of
the TEST102 chemicals.] The performance of
the BODOR and LOEW models markedly
improves, with the LOEW models realizing
greater overall improvement than the BODOR1X
model.
  LogP predictions using the various models
developed from the TRAIN120 set are presented
in Table 2 for a variety of positional and
conformational isomers from TEST102.
          Table 2: LogP Model Predictionsa for Positional and Conformational Isomers.

    COMPOUND                                  EXP    CLOGP   BODOR0   BODOR1X   LOEW1    LOEW1S
    2-cyanopyridine#                          0.50   0.23    1.56     1.13      0.93     0.91
    3-cyanopyridine                           0.36   0.23    1.45     1.14*b    0.89     0.79
    2,5-dichlorobiphenyl#                     5.06   5.46    5.43     5.07      4.98     4.99
    2,6-dichlorobiphenyl                      4.84   5.46    5.38     4.98      4.88     4.72
    2,2'-dichlorobiphenyl                     4.65   5.46    5.30     4.83      4.79     4.60
    2,4'-dichlorobiphenyl                     5.07   5.46    5.28*    4.74*     4.91     4.96
    2,4,4'-trichlorobiphenyl                  5.67   6.17    5.69     5.20      5.29     5.43
    2,3,4-trichlorobiphenyl#                  5.51   6.17    5.32     4.82      4.87     4.91
    2,4,5-trichlorobiphenyl                   5.60   6.17    5.66     5.15      5.46     5.44
    2,4,6-trichlorobiphenyl                   5.44   6.17    5.74*    5.32*     5.35     5.25
    2,3',4'-trichlorobiphenyl                 5.60   6.17    5.30*    4.76*     5.10*    5.19*
    2,3,4,5-tetrachlorobiphenyl               6.04   6.88    5.89     5.30      5.73     5.58
    2,3',4',5'-tetrachlorobiphenyl            6.13   6.88    5.61*    5.04*     5.44*    5.62
    3-fluorophenylacetic acid#                1.55   1.56    1.66     0.97      1.53     1.36
    4-fluorophenylacetic acid                 1.65   1.56    1.74     1.11      1.50*    1.38
    2-propylamino benzonorbornene (exo)       3.30   3.22    3.56     3.39      3.07     3.16
    2-propylamino benzonorbornene (endo)      3.13   3.22    3.72*    3.57*     3.09*    3.14
    6-trifluoromethyl-2-aminobenzonorbornene (exo)    3.21   2.86    3.34     3.09      2.86     2.84
    6-trifluoromethyl-2-aminobenzonorbornene (endo)   2.91   2.86    3.28     3.05      2.85     2.72
    7-trifluoromethyl-2-aminobenzonorbornene (exo)    3.19   2.86    3.35     3.13      2.86     2.81
    7-trifluoromethyl-2-aminobenzonorbornene (endo)   2.85   2.86    3.21     2.91      2.76     2.74
    dexamethasone#                            1.83   1.41    1.94     1.75      1.38     1.72
    betamethasone                             1.94   1.41    1.97     1.78      1.34*    1.68*

a All models developed from TRAIN120 set; the set includes the 7 chemicals marked with a # symbol.
b LogP predictions marked with an * represent an incorrect direction of change from the previous logP value in the
     column; e.g., logP should decrease from 2- to 3-cyanopyridine but is incorrectly predicted to increase in the
     BODOR1X model.
  Although the absolute errors in logP
predictions for all models are greater, on
average, than the small differences in
experimental logP values between closely
related isomers, cancellation of errors may
occur for similar chemicals. The results in
Table 2, page 80, clearly indicate that the
BODOR and LOEW models are able to
distinguish between closely related positional
and conformational isomers, but it is not clear
that such models can correctly predict relative
changes in logP values between such isomers.
In Table 2 and results not shown, the LOEW1S
model has the best overall success in predicting
the correct direction of logP change between
isomer pairs, approximately 80%, while the
success rate of the BODOR models is near
random for some chemical classes and much
better than random for others.
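
The isomer-pair comparison can be made explicit as in the sketch below, which checks whether a model reproduces the sign of the experimental logP difference. The four pairs are taken from the experimental and LOEW1S columns of Table 2; the tally is illustrative rather than the full analysis.

    # Count isomer pairs for which the predicted direction of logP change matches experiment.
    pairs = [  # (exp_1, exp_2, pred_1, pred_2)
        (0.50, 0.36, 0.91, 0.79),   # 2- vs 3-cyanopyridine
        (5.06, 4.84, 4.99, 4.72),   # 2,5- vs 2,6-dichlorobiphenyl
        (1.55, 1.65, 1.36, 1.38),   # 3- vs 4-fluorophenylacetic acid
        (1.83, 1.94, 1.72, 1.68),   # dexamethasone vs betamethasone
    ]
    correct = sum(
        1 for e1, e2, p1, p2 in pairs
        if (e2 - e1) * (p2 - p1) > 0          # same direction of change
    )
    print(f"correct direction for {correct}/{len(pairs)} pairs")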

 Conclusions
   Both the BODOR and LOEW model func-
 tions provide reasonably accurate means for
 predicting logP values in cases where
 CLOGP values are unavailable, with
 adjusted R2 values near 0.9. The results of
 this study are not conclusive but do suggest
 that these models have some capability for
 meaningfully distinguishing between closely
 related isomers,  although their accuracy and
 reliability for this  purpose should be demon-
 strated for the chemical class of interest.
 For the chemicals considered in this study,
 the LOEW models seem to provide more
 accurate prediction of relative logP changes
 for closely related isomers, and realize
 greater improvement in  performance than
 the BODOR models as the size of the train-
 ing set increases. Other significant advan-
 tages of the LOEW model functional form
 are its a-priori rational design that could be
 refined by a more rigorous consideration of
 the solvation process, and its atom-based
 parameterization that provides a convenient
 means for examining fragment contributions
 to the total hydrophobicity.  Due to these
 advantages, future investigations would
 likely focus on refinement of the LOEW
 model approach for logP prediction.
   The Agency relies on the CLOGP method
 for estimation of logP values for use in a
 variety of toxicity estimation models.  In
 cases where CLOGP is unable to make a
 prediction, or is unable to provide the logP
 discrimination required for a given study,
 alternative logP calculation methods based
 on 3D structure should prove extremely
 useful.

    A full manuscript discussing this topic is
 in preparation for journal publication.

 References

 1  Hansch, C. and Leo, A. (1979) Substituent Constants for Correlation Analysis in
    Chemistry and Biology, Wiley-Interscience, New York.
 2  Leo, A. and Weininger, D. (1992) CLOGP: Medchem Software Release 3.5,
    Medicinal Chemistry Project, Pomona College, Claremont, CA.
 3  Bodor, N. and Huang, M.-J. (1992) J. Pharm. Sci., 81, p. 272.
 4  Kantola, A., H. O. Villar, and G. Loew (1991) J. Comput. Chem., 12, p. 681.
 5  Hawker, D. W. and D. W. Connell (1988) Environ. Sci. Technol., 22, p. 382.
 6  Gobas, F. (1987) in: QSAR in Environmental Toxicology II (K. L. E. Kaiser, ed.),
    D. Reidel Publishing Co., p. 112.
 7  Voogt, P. D., J. W. M. Wegener, J. C. Klamer, G. A. Van Zijl and H. Govers (1988)
    Biomed. and Environ. Sci., 1, p. 194.
 8  Pleiss, M. A. and G. L. Grunewald (1983) J. Med. Chem., 26, p. 1760.
 9  Mannhold, R., K. P. Dross and R. F. Rekker (1990) Quant. Struct.-Act. Relat., 9, p. 21.
 10 Kim, K. H. and Y. C. Martin (1986) J. Pharm. Sci., 75, p. 637.


1  Ann M. Richard, Carcinogenesis and Metabolism Branch, Health Effects Research Laboratory, U.S. Environmental
     Protection Agency, Research Triangle Park, NC 27711, and Phillip F. Boone, Environmental Health Research and
     Testing, Inc., Research Triangle Park, NC 27711. Current address: Integrated Laboratory Systems, Research
     Triangle Park, NC 27711.
   Regional Acid Deposition Model (RADM) Evaluation1
 EPA Research Objectives
  Regional air quality models are needed
and used to extrapolate outside current
conditions; therefore, these advanced models
are developed with parameterizations and
physical and chemical mathematical
descriptions as close to first principles as
possible. The purpose of the evaluation is to
test the science incorporated into the
advanced models. Evaluation is diagnostic,
to explore the quality of predictions and
develop an appraisal of model strengths and
weaknesses. The data used were specially
collected for the Regional Acid Deposition
Model (RADM) evaluation as part of the
National Acid Precipitation Assessment
Program (NAPAP) and a bi-national effort,
the Eulerian Model Evaluation Field Study
(EMEFS). The data were collected over a
two-year period with special, several-week
intensives that used very advanced instruments
to measure air concentrations, providing data
that would support the most diagnostic testing.

 Overview of Project
  Early evaluation research concentrated on
 examining the predictions for the sulfur
 cycle. Significant improvements to the
 RADM were accomplished (see references).
 Current research continues to investigate
 the nitrogen cycle, which is much more com-
 plex. This investigation focuses on testing
 the ability of the model to accurately repli-
 cate in time and space the conversion (or
 oxidation) of nitrogen oxides  (NOX) to their
 oxidized products,  PAN and nitrates (particu-
late nitrate, NO3-, and nitric acid, HNO3).
 Measurements aloft taken by aircraft carry-
 ing sophisticated instruments to measure air
 quality in the EMEFS's 1990  aircraft inten-
 sive and measurements at the surface at
 two special sites, Scotia, PA and Egbert,
 Ontario, are used for the diagnostic testing.

 Background and Approach
  The observations were developed from
measurements taken during ten aircraft
flights over a 45-day period from April 15 to
May 30, 1990. The standard 80-km version
of RADM 2.6 was used to simulate the
45-day period, and a data probe was "flown"
through the model. These "data" from the
model are compared to equivalent data from
the aircraft measurements. Work from
Fiscal Year 1993 showed that reducing grid
 size had little effect  on the rate of conversion
 of NOX to PAN and HNO3, at least for grid
 meshes ranging from 20 to 80 km.  Thus,
 the runs prepared for the April 15 to May 30,
 1990 period focused on use of the 80-km
 RADM that encompassed the aircraft flights.
 Due to uncertainties in  the NOX emissions
 inventory, comparison runs were also made
 with the new 1990 interim inventory.
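
Conceptually, "flying" a data probe through the model amounts to sampling the gridded output at the aircraft positions and times, as in the rough sketch below; the array shapes, grid spacing, and flight track are invented for illustration and do not describe the actual RADM post-processing code.

    # Nearest-cell, nearest-hour sampling of a gridded model field along a flight track.
    import numpy as np

    nt, ny, nx = 24, 10, 10
    conc = np.random.rand(nt, ny, nx)                 # hourly concentration field (fake data)
    x0, y0, dx = 0.0, 0.0, 80.0                       # assumed grid origin and 80-km spacing

    track = [(3.2, 215.0, 410.0), (3.5, 260.0, 395.0)]   # (hour, x km, y km) aircraft samples

    for hour, x, y in track:
        it = int(round(hour))                         # nearest output time step
        ix = min(int((x - x0) / dx), nx - 1)          # nearest grid column
        iy = min(int((y - y0) / dx), ny - 1)          # nearest grid row
        print(f"t={hour:4.1f} h  cell=({iy},{ix})  model value={conc[it, iy, ix]:.3f}")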

Accomplishments Using the NESC's
Cray
  The 80-km RADM runs for the April 15 to
May 30, 1990 period required approximately
 65 hours on a single-processor Cray Y-MP.
 The reruns using the 1990 Interim Emis-
 sions Inventory for a sub-period required
 approximately 35 hours on a single proces-
 sor of the NESC's Cray Y-MP. It required
 about eight weeks to prepare the initial eval-
 uation emissions and three weeks to pre-
 pare the sub-set for  testing the emissions
 inventory. The model runs on the Cray were
able to  be completed in about two weeks
and one week, respectively.


Scientific Results and Relevance to
EPA Mission
  The full set of comparisons showed simi-
lar results to those from the 1988 period.
However, the meteorological model per-
formed much better for 1990 because it was
a more normal year, whereas 1988 was a
drought year with a significant portion of
rainfall being convective. The 1990 aircraft
results were very similar to those from 1988,
showing that photochemistry issues regard-
ing model performance are similar across
several seasons. Also, the results from the
surface sites during summer of 1988 were
very consistent with the aircraft
measurements.
  This study enhances our understanding of
the working of regional model photochemis-
try for rural ambient concentration condi-
tions. This in situ understanding is critical
because smog chambers cannot be used to
test the chemical mechanisms at the low
concentrations representative of regional
conditions. Proper computation of the pho-
tochemistry for rural conditions is important
to the ability of models to support explora-
tion of and establishment of appropriate
emissions controls to reduce and eliminate
violations of the ozone health standard.
Examination of nitrogen chemistry  is impor-
tant because it is a central part of the oxida-
tion process forming ozone and because
rural oxidant production is generally
believed to be NOx-limited.
Future Objectives and Plans
  The evaluation will continue with addi-
tional sensitivity studies directed at under-
standing biogenic emissions influences on
the RADM chemistry. Preliminary indica-
tions are that to improve ambient profiles of
isoprene concentrations, vertical resolution
will need to be increased from 15 to 30 lay-
ers. In addition, a new nested grid mesh
resolution set of 54-km coarse-grid and 18-
km fine-grid will be tested. Once the newly
adapted meteorological model has been
tested, roughly 400 Cray Y-MP hours will
be required to regenerate new meteorology
for the 1988 evaluation period, 60 Cray
Y-MP hours to generate new 54-km RADM
results, and 300 Cray Y-MP hours to gen-
erate new 18-km HR-RADM results for the
next round of diagnostic testing of the chem-
istry in RADM.

Publications and Reports
Dennis, R.L., J.N. McHenry, W.R. Barchet,
    F.S. Binkowski, and D.W. Byun, 1993:
    Correcting RADM's sulfate underpredic-
    tion: discovery and correction of model
    errors and testing the corrections
    through comparisons against field data.
    Atmospheric Environment 27A, 975-997.
Cohn, R.D. and R.L. Dennis, 1994: The
    evaluation of acid deposition models
    using principal component spaces.
    Atmospheric Environment 28A(15),
    2531-2543.
1  Robin L. Dennis, Talat Odman, Richard D. Cohn, and Daewon Byun, U. S. EPA Regional Acid Deposition Model
    Evaluation; EPA Atmospheric Research and Exposure Assessment Laboratory, Research Triangle Park, NC.
   Atmospheric Deposition of Nitrogen to Chesapeake Bay
 EPA Research Objectives
   Nitrogen is the primary cause of eutrophi-
 cation in Chesapeake Bay. Nitrogen input
 from the atmosphere represents a signifi-
 cant source of nitrogen to the Bay (25-35%)
 of the nitrogen loading. Water quality models
 have incorporated atmospheric nitrogen, but
 in a very simple manner.  One objective of
 this research is to provide more accurate
 estimates of the quantity and the pattern of
 nitrogen loading from the atmosphere to the
 Chesapeake Bay watershed and the Bay
 itself. These estimates will be provided as
 inputs to the water quality models for the
 watershed (the HSPF model adapted by the
 Chesapeake Bay Program Office) and the
 Bay (the 3-D Bay Water Quality model
 developed by the Army Corps of Engineers).
 Another objective of this research is to
 determine the extent of the airshed that is
 primarily responsible for the atmospheric
 nitrogen affecting the Bay watershed. The
 airshed will be larger than the watershed.
 The overall purpose is to develop an under-
 standing of which controls of NOX emissions
 to the atmosphere will have the greatest
 benefit on reducing the nitrogen loading to
 coastal estuaries. This work is important to
 the Chesapeake Bay Program Office's
 efforts to achieve a 40% reduction in control-
 lable nitrogen loading to the Bay by the year
 2000 and to the upcoming 1996 Agency
decision on the amount of Phase 2 NOX
 controls required by the 1990 Clean Air Act
 Amendments.

 Overview of Project
  Development of more accurate spatial
fields of nitrogen loading estimates involves
estimation of annual average nitrogen depo-
sition to coastal areas using the Regional
 Acid Deposition Model. These deposition
 estimates are made for the new 1990 interim
 emissions and representative meteorology.
 Development of an understanding of the air-
 shed influencing the Chesapeake Bay
 watershed involves using the Regional Acid
 Deposition Model (RADM) as a laboratory of
 the real world to carry out sensitivity studies
 that elucidate the  contributions of different
 emissions sources to the Bay watershed.
 This source-receptor understanding is very
 difficult to nearly impossible to develop from
 empirical data and requires the designing of
 sensitivity studies that will extract that infor-
 mation from a mathematical model.

 Background and Approach
  Because the RADM is very computation-
ally intensive, it is not feasible, with today's
computing power, to simulate an entire
year's worth of meteorology to develop
 annual average  estimates of deposition
 loading.  Instead, annual averages are
 developed from a weighted average of a sta-
 tistical sample of 30 five-day model runs.
 The average is representative of meteorol-
 ogy for the 1982 to 1985 period, which has a
 rainfall pattern very close to a 30-year aver-
 age. Meteorological events (synoptic pro-
 gressions of high and low pressure systems)
 with similar 850-mb wind-flow patterns were
 grouped or classified by applying cluster
 analysis to them. This resulted in 19 sam-
 pling groups or strata. Meteorological cases
 were randomly selected from each stratum,
 based on the number of wind-flow patterns
 in that stratum and on the number in each of
the other strata.  This procedure approxi-
 mates proportionate sampling. The number
of cases, 30, was set after carrying out a
sampling-error analysis on wet sulfur
deposition and taking into consideration
computer resource limitations. These are
termed the aggregation cases.
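
The aggregation procedure can be sketched as follows: cases are allocated to strata in proportion to stratum size, and the annual average is a weighted average of the sampled cases, each weighted by the number of events it represents. The stratum sizes and deposition values below are invented for illustration.

    # Proportionate stratified sampling of meteorological cases and a weighted annual average.
    import numpy as np

    rng = np.random.default_rng(1)
    stratum_sizes = np.array([40, 25, 15, 10, 6, 4])        # events per stratum (fake)
    n_cases = 30
    # proportionate allocation of the 30 five-day cases across strata (at least 1 each)
    alloc = np.maximum(1, np.round(n_cases * stratum_sizes / stratum_sizes.sum()).astype(int))

    case_dep, case_wgt = [], []
    for size, n in zip(stratum_sizes, alloc):
        dep = rng.gamma(2.0, 1.0, size=n)                   # deposition per sampled case (fake)
        case_dep.extend(dep)
        case_wgt.extend([size / n] * n)                     # each case represents size/n events

    annual = np.average(case_dep, weights=case_wgt)
    print(f"{sum(alloc)} sampled cases -> weighted annual-average deposition = {annual:.2f}")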
  Producing an estimate of the airshed
affecting the Bay watershed requires devel-
opment of a source-receptor understanding
on an annual basis. This development
requires an experimental design that will
extract this information from sensitivity stud-
ies with RADM. Because NOX emissions
contribute to oxidant production and there is
a dynamic interplay between the production
of ozone and nitric acid, the dominant form
by which nitrogen is deposited to the Earth's
surface, the modeling of nitrogen must incor-
porate full photochemistry as is done in
RADM. As a first approximation to the
source-receptor relations implicit in the
model calculations, a portion of the emis-
sions from sources of interest is subtracted
from the emissions fields. The 30 aggrega-
tion cases are run and the results subtracted
from results obtained with unperturbed
emissions fields.  For this study, the objec-
tive was to develop an understanding of the
range of influence of ground-level NOX
emissions (such as automobiles) and upper-
level NOx emissions (such as power plants)
with regard to nitrogen deposition. From the
range of influence of different subregions,
the extent of the airshed can be estimated.
In addition, the same approach is used to
develop an estimate of the responsibility
of designated Bay states to deposition
across the Bay watershed.
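
The emission-subtraction "tagging" amounts to a simple first-order attribution, sketched below; the removal fraction and deposition values are invented for illustration.

    # First-order source attribution from a base run and a perturbed (emissions-reduced) run.
    fraction_removed = 0.5          # e.g., remove 50% of a subregion's emissions
    base_deposition = 12.4          # base-case nitrogen deposition at a receptor (fake units)
    perturbed_deposition = 11.1     # deposition with the subregion's emissions reduced (fake)

    # Difference scaled to a 100% removal
    contribution = (base_deposition - perturbed_deposition) / fraction_removed
    percent_of_base = 100.0 * contribution / base_deposition
    print(f"estimated contribution: {contribution:.1f} ({percent_of_base:.0f}% of base)")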

Accomplishments Using NESC's Cray
  Base 1990 emissions together with sub-
tractions of emissions from ten different
emissions areas roughly 200 km on a side
formed an effective "tagging" of the emis-
sions that was the core of the study.  The ten
subregions span the potential regions of
influence on watershed deposition. The
determination of the range of influence of
the ten "tagged" regions required 1,400
hours on a single-processor Cray Y-MP.
Runs to establish state-level responsibility
for nitrogen deposition required 850 hours
on a single processor Cray Y-MP.  Initiation
of work for next year that will depend on 20
km resolution runs required 200 hours on a
single processor Cray Y-MP.  The quantity of
CPU hours was large and was obtained over
two quarters, requiring close coordination
and cooperation between the NESC and the
RADM modeling team, a cooperation that
was successful. This project also benefited
from 850 Cray Y-MP CPU hours of runs from
the Feasibility of Deposition Standards
Study.                         |

Scientific Results and Relevance to
EPA Mission                 :
  The "tagged" subregions study on the
range of influence of NOX emissions on
nitrogen deposition as a function of the
height of the emissions (surface or tall-
stack) produced a somewhat unexpected
result. The range of influence of surface
emissions appears to be roughly 70-100% of
the range from tall stacks. This is different
than the "conventional" wisdom which would
"predict" that the range of influence from tall
stacks would be much greater. It is possible
that conventional wisdom has been influ-
enced by study of the sulfur system where
the primary species, SO2, plays a significant
role in the total deposition. In the nitrogen
system, nitrogen deposition is almost
entirely due to the secondary species nitric
acid, HNO3 (ignoring ammonia for the
moment).  It is also likely that design effects
reduce the range computed by the model for
tall stacks from what it should be.
  A comparison of the nitrogen "tagged"
sub-regions with the tagged sulfur model for
the exact same subregion indicated the
range of influence of SO2 and NOX emis-
sions is very similar. This is  not what was
expected, producing another surprise. This
result will need to be investigated further.
The distance over which NOX emissions
appear to  noticeably influence the nitrogen
deposition is approximately 600 to 800 kilo-
meters. The comparability of the SO2 and
NOX range of influence allowed us to use
the sulfur results to more precisely estimate
 or define the airshed affecting the Bay
 watershed.  The results of this study are that
 the airshed affecting Chesapeake Bay is sig-
 nificantly larger than the watershed. The air-
 shed includes many sources along the
 middle and upper Ohio River and it was sur-
 prising how far down the Ohio River utility
 sources were influencing the Chesapeake
 Bay watershed. The airshed is roughly
200,000 to 220,000 mi2, more than three
 times the watershed's 64,000 mi2. These
 results are very important and, as intended,
 have effectively set the stage for Fiscal Year
 1995 work.

 Future Objectives and Plans
   Future plans call for a  major shift to devel-
 oping annual average nitrogen deposition
 using the 20 km High Resolution RADM.
 Higher resolution is needed to better resolve
 urban influences, major point source influ-
 ences and deposition to water surfaces for
 more accurate linkage with the water quality
 models. This shift will quadruple the CPU
 requirements for each study. Two major
 investigations are needed by the Chesa-
 peake Bay Program  Office. The first relates
 to estimating the reductions in deposition
 that are expected to  occur due to the 1990
 Clean Air Act and to implementation of more
 stringent reductions in NOx emissions due
 to requirements stemming from oxidants,
 rather than acid rain. This study is expected
 to require on the order of 2,000 Cray Y-MP
 CPU hours. The second  investigation
 relates to defining the influence of urban
Atmospheric Deposition of Nitrogen to Chesapeake Bay


      areas on nearby estuaries and defining the
      portion of the nitrogen deposition that is  due
      to the urban areas.  A modeling study is
      required because monitoring data are lack-
      ing. The second study is also expected  to
      require 2,000 Cray Y-MP hours.

      Publications and Reports
       Results from the 1994 studies are being
      presented at the Annual  SETAC  Meeting in
      November 1994. A book will be produced
      after the meeting with a chapter describing
      the results of these studies.

      Figures
      Figure 1, page 88. Range of integrated total
         nitrogen deposition from a NOX source
         region at the watershed boundary
         (Southwest Pa/Northern WV), showing
         the overlap with the Chesapeake Bay
         watershed, as simulated by the Regional
         Acid Deposition Model.
      Figure 2, page 89. Range of integrated
        total nitrogen deposition from a NOx
        source region far from the watershed
        boundary (Cincinnati Area), showing the
        overlap with the Chesapeake Bay water-
        shed, as simulated by the Regional Acid
        Deposition Model.
      Figure 3, page 90.  Estimated boundary of
        the airshed within which NOx emissions
        significantly affect the Chesapeake Bay
        watershed compared to the watershed
        boundary.
1 Robin L. Dennis and Lewis C. Linker, U. S. EPA, Regional Acid Deposition Model Applications; EPA Atmospheric
    Research and Exposure Assessment Laboratory, Research Triangle Park, NC 27711.
       Figure 1: Range of Integrated Total Nitrogen Deposition From a NOX Source at the
                                 Watershed Boundary.
       [Map: RADM total nitrogen deposition, integrated percent contribution from
       Subregion 13, Southwest PA/Northern WV; 1985 (100% reduction in NOX and SO2).]
     Figure 2: Range of Integrated Total Nitrogen Deposition From a NOX Source Region Far
                               From the Watershed Boundary.
     [Map: RADM total nitrogen deposition, integrated percent contribution from
     Subregion 22, Cincinnati Area; 1985 (100% reduction in NOX and SO2).]
    Figure 3: Estimated Boundary of the Airshed Within Which NOX Emissions Significantly
                         Affect the Chesapeake Bay Watershed.
    [Map: watershed and airshed outlines.]
   Study of the Feasibility of Acidic Deposition Standards1
 EPA Research Objectives
  Developing accurate estimates of the
impact of the 1990 Clean Air Act Amend-
ments (CAAA) on acidic deposition and
atmospheric sulfate (key to visibility degra-
dation in the Eastern United States) is
important to the Agency. The amount of
 reduction in sulfur deposition to be antici-
 pated by 2005 or 2010 due to implementa-
 tion of Title IV,  Phases I and II sets an
 important baseline for understanding how
 much mitigation in deposition is expected
 and how much farther we might need to go
 to provide protection to ecological
 resources.  The reduction in deposition load-
 ing in Canada that is likely coming from the
 United States and vice versa is important to
 the U.S. Canada Air Quality Accord. These
 estimates are important to the Canadians for
 them to project whether they will achieve
 their goal of wet sulfur deposition being
below 20 kg-SO4/ha. As well, the European
 community is interested in estimates of the
 long-range transport across the Atlantic of
 sulfur-related acidic deposition that could be
 affecting them.  The objective is to develop
 best estimates from evaluated and well-
 characterized models of changes in acidic
 deposition loading, visibility impacting  pollut-
 ants, and oxidants.  Also, the cross-program
 effects from the different Titles of the 1990
 CAAA need to be characterized.

 Overview of Project
  Development of estimates of future depo-
 sition involves creation of estimates of future
 emissions that account  for population and
 economic growth plus the incorporation of
 emissions controls called for by the 1990
 CAAA. The new emissions are input to
 RADM simulations to estimate the new
 deposition, assuming the same meteorology
 as today's. A difficult element for the projec-
 tion of future emissions is the estimation of
 power-plant retirements, and the installation
 of new generating capacity to make up the
 difference in on-line capacity and projected
 demand. Of special difficulty is locating or
 siting potential future plants for the model-
 ing. These projections are generated by
 experts in the field of emissions estimation
 and projection. While the project focused on
 sulfur deposition, in the course of the study it
 became clear that nitrogen deposition was
 also important. Therefore, defining the dep-
 osition caused by the utility sector separate
 from the other sectors became important.  In
 addition, the reduction that could be
 obtained by enhanced reduction in utility
 NOX emissions took on elevated importance
 for the study.      ;
                 i
 Background and Approach
   Because the RADM is very computation-
 ally intensive, it is not feasible, with today's
 computing power to simulate an entire years
 worth of meteorology to develop annual
 average estimates of deposition loading.
 Instead, annual averages are developed
 from a weighted average of a statistical
 sample of 30 five-day model runs. The aver-
 age is representative of meteorology for the
 1982 to 1985 period, which has a rainfall
 pattern very close to a 30-year average.
 Meteorological events (synoptic progres-
 sions of high and low pressure systems)
with similar 850-mb wind-flow patterns were
grouped or classified by applying cluster
analysis to them. This resulted in 19 sam-
pling groups or strata.  Meteorological cases
were randomly selected from each stratum,
based on the number of wind-flow patterns
in that stratum and on the number in each of
the other strata.  This procedure approxi-
mates proportionate sampling. The number
of cases, 30, was set after carrying out a
sampling-error analysis on wet sulfur depo-
sition and taking into consideration com-
puter resource limitations. These are
termed the aggregation cases.
  Several emissions projection scenarios
were developed for the year 2010. They
included reductions expected due to the
1990 CAAA under assumptions of trading
and no trading of SO2 emissions allocations.
They also included reductions beyond the
1990 CAA requirements for both utility and
industrial emissions.  Very importantly, the
tagged sulfur engineering model was run for
53 emissions sub-regions for both 1985
emissions and 2010 CAAA-projected
emissions.
  The 1990 Interim emissions were used as
a basis for assessing the  responsibility of
the utility sector for nitrogen deposition and
for estimating the reduction in deposition
that might occur due to a  50% reduction in
utility emissions. The same technique as
used for the Chesapeake Bay long-range
influence study was used for this study.  A
portion of the emissions from the sector of
interest was subtracted from the base and
the difference, scaled to 100%, defined the
magnitude of the responsibility for
deposition.

Accomplishments Using the NESC's
Cray
  While the tagged engineering model runs
did not use the Cray, it is noteworthy that
more than 3,800 model runs on EPA's IBM
were required  as part of this study. To
establish the sector responsibility for deposi-
tion (utilities, mobile sources, industry and
other) and to assess the effect of a 50%
reduction in utility and mobile emissions
required approximately 850 Cray Y-MP CPU hours.
Scientific Results and Relevance to
EPA Mission
  The reductions of total sulfur deposition
due to the Title IV acid rain controls are not
very protective and substantial reductions
beyond controls currently mandated may be
needed.  In addition, reduction of nitrogen
deposition will be required. The results with
the Tagged Engineering model showed that
the distribution of responsibility fqr deposi-
tion shifts from 1985 to 2010 because the
largest sources reduce most and Isome of
the smallest source regions actually
increase. The tagged  results were also
used to study the feasibility of geographi-
cally targeting emissions reductions while
still achieving deposition targets at the sen-
sitive aquatics regions. For modest deposi-
tion goals, geographic targeting was
feasible.  For  more stringent deposition
goals, geographic targeting became less
practical.
  For nitrogen deposition, the spatial distri-
bution of emissions is such that utility emis-
sions and mobile emissions have distinct
regions of influence. To address acidic dep-
osition at several sensitive aquatic regions,
control of both utility and mobile sources
may be needed. Therefore, if nitrogen dep-
osition is to be reduced at the right places,
then both acidic-deposition motivated and
ozone-motivated control of NOX emissions
may be needed. This is consistent with the
new emphasis in EPA on considering the full
range of multi-media and multi-program
effects and taking a more holistic perspec-
tive towards pollution control.
                              I

 Future Objectives and Plans
  Future plans call for a more explicit study
of the ability to trade SO2 reductions for
NOX reductions. In addition, the benefit to
acidic deposition of oxidant-motivated con-
trols of NOX emissions will be studied
together with their benefit to nitrogen loading
reduction for coastal estuaries.
  Publications and Reports
    A report to Congress is in the process of
  being reviewed for release early in FY 1995.

  Figures
  Figure 1.  Map of the percent contribution from utility NOX emissions to the anthropogenic portion of the total (wet + dry) annual nitrogen deposition for the RADM modeling domain.
  Figure 2, page 94.  Map of the percent contribution from mobile source NOX emissions to the anthropogenic portion of the total (wet + dry) annual nitrogen deposition for the RADM modeling domain.
                       Figure 1:  Utility NOX Emissions Contributions.


1  Robin L. Dennis, U. S. EPA, Regional Acid Deposition Model Applications; EPA Atmospheric Research and
    Exposure Assessment Laboratory, Research Triangle Park, NC.
                     Figure 2:  Mobile NOX Emissions Contributions.
   1990 Clean Air Act Section 812 Retrospective Study1
 EPA Research Objectives
   In the 1990 Clean Air Act Amendments, in
 Section 812, Congress asked for a retro-
 spective assessment of the benefits and
 costs of the 1970 Clean Air Act. A multi-pol-
 lutant assessment is called for that focuses
 on criteria pollutants (SO2, NOX, and O3),
 particulates, sulfates, visibility and acidic
 deposition.  The basic objective is to develop a retrospective assessment: estimates are to be developed of the changes in pollutant concentrations and loadings that would have occurred had the 1970 CAA not been enacted, and these are to be contrasted with the concentrations and loadings that historically occurred.  The pollutant loading estimates are to be linked to effects models to generate estimates of effects.  Costs and benefits associated with implementation versus non-implementation are then to be compared.

 Overview of Project
  The 812 Retrospective project involves a
 coordinated effort between the Office of Pol-
 icy Analysis Research (OPAR), the Office of
 Program Planning and Evaluation (OPPE),
 and the Office of Research and Develop-
 ment (ORD). It is being coordinated and
 tracked at the Assistant Administrator Level
 within the Agency.  This study has been
 mandated by Congress and is being tracked
 by the General Accounting Office
 (GAO).
  All  pollutants are being addressed in this
 investigation at both the urban and regional
 scale. Models are required, both of emis-
 sions and of air quality, to develop estimates
 of improvements expected due to full imple-
 mentation of the 1990 CAAA, and estimates
 of increases in air pollution that would have
 accrued had the CAAA not been enacted.
   The RADM model is being used because
 it provides the advantage of being able to
 consistently address in the same modeling
 system regional SO2, sulfate, visibility deg-
 radation due to sulfates, acidic deposition for
 sulfur and nitrogen species, and ozone.
 RADM's inclusion of clouds and precipitation
 makes it a tool of choice for predicting
 regional ozone during summer months for
 welfare (crop and terrestrial) effects.

 Background and Approach
   This project requires estimates of annual
 averages for acidic deposition (sulfur and
 nitrogen), annual pollutant distributions for
 sulfate and sulfate-associated visibility deg-
 radation, and seasonal distributions of
 oxidants (ozone).
   Because the RADM is very computation-
 ally intensive, it is not feasible with today's computing power to simulate an entire year's
 worth of meteorology to develop annual
 average estimates of deposition loading or
 annual concentration distributions. Instead,
 annual averages and distributions are devel-
 oped from a weighted average of a statisti-
 cal sample of 30 five-day model runs. The
 average is representative of meteorology for
 the 1982 to 1985 period, which has a rainfall
 pattern very close to a 30-year average.
 Meteorological events (synoptic progres-
 sions of high and low pressure systems)
 with similar 850-mb wind-flow patterns were
 grouped or classified by applying cluster
 analysis to them. This resulted in  nineteen
 sampling groups or strata. Meteorological
 cases were randomly selected from each
stratum, based on the number of wind-flow
patterns in that stratum and on the number
in each of the other strata. This procedure
approximates proportionate sampling. The
number of cases, 30, was set after carrying
out a sampling-error analysis on wet sulfur
deposition and  taking into consideration
computer resource limitations. These are
termed the aggregation cases.
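
The proportionate sampling described above can be illustrated with a small stand-alone sketch. The per-stratum pattern counts are random placeholders and allocate_cases is a hypothetical helper, not the procedure actually used to select the 30 aggregation cases.

```python
# Illustrative sketch of proportionate stratified sampling of meteorological
# cases.  The 19 strata sizes are invented stand-ins for the counts of 850-mb
# wind-flow patterns produced by the cluster analysis.
import math
import random

def allocate_cases(stratum_sizes, total_cases):
    """Allocate a fixed number of model-run cases across strata roughly in
    proportion to the number of wind-flow patterns in each stratum."""
    total_patterns = sum(stratum_sizes)
    ideal = [total_cases * n / total_patterns for n in stratum_sizes]
    counts = [math.floor(x) for x in ideal]
    # Hand out the remaining cases to the strata with the largest remainders.
    remainders = [x - c for x, c in zip(ideal, counts)]
    for i in sorted(range(len(counts)), key=lambda i: remainders[i], reverse=True):
        if sum(counts) >= total_cases:
            break
        counts[i] += 1
    return counts

rng = random.Random(1)
stratum_sizes = [rng.randint(5, 40) for _ in range(19)]   # hypothetical pattern counts
allocation = allocate_cases(stratum_sizes, 30)

# Randomly pick the allocated number of cases from each stratum.
selected = []
for stratum, (size, k) in enumerate(zip(stratum_sizes, allocation)):
    cases = [f"stratum{stratum:02d}_case{j:02d}" for j in range(size)]
    selected.extend(rng.sample(cases, k))

print(allocation, sum(allocation))
print(selected[:5], "...")
```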
  Annual average sulfur deposition, sulfate
distributions, and visibility degradation distri-
butions are developed with the RADM sul-
fur-only Engineering Model together with
aggregation. Annual average nitrogen dep-
osition estimates are developed using the
full RADM with aggregation. A sample of
the 1988 ozone season simulated with the
High Resolution RADM is used to estimate
the seasonal distribution of ozone. Histori-
cal data is used to establish the actual distri-
butions. The model is used to provide an
estimate of the relative change in the distri-
bution by simulating both the control and no-
control cases for the same average meteo-
rology.  Because biogenic emissions are highly uncertain, and because a new estimate of isoprene emission fluxes approximately three times higher than the previous one appeared during the study, a special sensitivity study of the ozone simulations was also carried out as part of the Retrospective Study to characterize the uncertainties involved.

Accomplishments Using the  NESC's
Cray
  The RADM Engineering Model was run on
EPA's IBM mainframe. The RADM and High
Resolution RADM were run on the NESC's
Cray. To simulate the change in acidic dep-
osition for 1980 and 1990 required 550 Cray
Y-MP CPU hours. To create the seasonal
estimates of the ozone distributions for 1980
and 1990 required approximately 500 Cray
Y-MP CPU hours. The ozone sensitivity
study with respect to isoprene emissions
required approximately 600 Cray Y-MP
hours during FY 1994. This sensitivity study
is not yet complete. More than a third of the
CPU hours were needed by the sensitivity
study. It is estimated that the ozone sensi-
tivity study will eventually take half of the
total hours spent on the 812 Retrospective
study.  Such uncertainty analysis is crucial to the credibility of the overall study.  However, sufficient CPU resources often are not available to allow an appropriate development of uncertainty estimates.

Scientific Results and Relevance to
EPA Mission
  The main conclusion thus far is from the
uncertainty study. The relative change in the
distribution of ozone simulated by the
regional model is fairly insensitive to the
uncertainty in biogenic isoprene emissions,
even though the absolute ozone concentra-
tions that were predicted are sensitive to the
uncertainty.  Other results are awaiting guid-
ance from the effects groups as to how the
RADM outputs should be expressed. The
dose-response functions that will be used
are under Agency review.

Future Objectives and Plans
  Once the 812  Retrospective Study is com-
pleted, the Agency expects to learn from the
experience and design a Prospective Study,
also mandated by Congress.  We expect the RADM model to be involved here as well.
Publications and Reports
  None as yet.
1  Robin L. Dennis and Jim DeMocker, U. S. EPA, Regional Acid Deposition Model Applications; EPA Atmospheric
    Research and Exposure Assessment Laboratory, Research Triangle Park, NC.
   The Role of Supercomputers in Pollution Prevention:
   Predicting Bioactivation for the Design of Safer Chemicals1
   Chemical design has traditionally focused
 upon developing chemicals that perform a
 specific function (e.g. solvent, reagent, dye,
 etc.), with essentially no consideration given
 to avoiding the inherent toxicity or hazard-
 ous nature that chemicals often have. Years
 of this reasoning have resulted in the synthe-
 sis and release into the environment of enor-
 mous quantities of many toxic chemicals.
 The toxic effects that many of these chemi-
 cals have  had on human health and the
 environment as a result of their production
 and use have become a major societal con-
 cern. From this concern, the concept of pol-
 lution prevention evolved and the Pollution
 Prevention Act was passed by Congress in
 1990. Pollution Prevention has become the
 central environmental ethic of the Environ-
 mental Protection Agency (EPA) and the
 nation.
   The ultimate approach to pollution preven-
 tion is source prevention or, more simply, not
 to create toxic chemicals in the first place.
 The most desirable and efficient way of pre-
 venting toxic effects that chemicals may
 have on  human health and the  environment
 is during the design of new chemicals or the
 redesign of existing chemicals.  That is, one
 of the best ways to prevent pollution is to
 design chemicals such that they are not only
 useful industrially or commercially, but that
 they are not toxic as well. Not only is human
 and environmental harm reduced, avoided,
 or alleviated, but the cost of any regulatory
 action in  terms of job loss and capital invest-
 ment  is minimized. In responding to the
 Administration's and EPA Administrator
 Carol Browner's requests for EPA to incor-
 porate pollution prevention whenever and
 wherever possible, scientists of EPA's Office
 of Pollution Prevention and Toxics (OPPT)
 have recently initiated a new source preven-
 tion initiative called "Designing Safer Chemi-
 cals".  The underlying principles of this
 initiative are: 1) consideration of the toxicity
 of existing chemicals already in commerce
 and of new chemicals before they are manu-
 factured and used; and 2) when necessary,
 making structural modifications that reduce
 toxicity without affecting overall commercial
 usefulness (i.e., efficacy).
   The toxicity of many chemicals is attribut-
 able to their metabolism. That is, following
 absorption, most chemicals are converted
 (metabolized) in the body to other chemi-
 cals. These latter chemicals (metabolites)
 can be nontoxic, but are often toxic. The
 metabolism of a chemical to toxic sub-
 stances is known as bioactivation, and
 accounts for the toxicity of most chemicals
 that are known to be toxic.  Knowledge of
 the metabolites that are formed, and the
 structural requirements that lead to their for-
 mation can be highly useful for designing
 chemicals such that they will not form toxic
 metabolites. Using EPA's supercomputer to
 perform ab initio calculations that otherwise
 would be impossible on large databases,  we
 are developing models that can be used to
 predict the tendency of a chemical to be bio-
 activated.  Using the supercomputer, we
 have already developed and validated a
 model that is not only useful for predicting
the hazard potential of new, untested chemi-
cals, but also for the design of less toxic and
equally efficacious analogs of the chemical.
Other models are currently being developed
using the supercomputer.
'  Stephen C. DeVito, Office of Pollution Prevention and Toxics (OPPT), U.S. Environmental Protection Agency (Mail
    Code 7406), Washington, DC 20460.
   Application of Molecular Orbital Methods to QSAR and LFER:
   Understanding and Characterizing Interactions of Solutes
   and Solvents1,2
 Introduction
   Quantitative Structure Activity Relation-
 ships (QSAR) and linear free energy rela-
 tionships (LFER) have been used widely to
 correlate molecular structural features with
 their known biological, physical and chemi-
 cal properties1-7. QSAR and LFER assume
 there is a quantifiable relationship between
 the microscopic (molecular) and macro-
 scopic (empirical) properties in a compound
 or set of compounds.  Based on the
 assumption that properties can be related to
 the change in free energy (ΔG), and that this ΔG is dependent on structure, the original
 relationships were identified and quantified
 by Burkhardt and Hammett8-11.  Hansch
 eventually applied this technique to medici-
 nal chemistry with the realization that the
 octanol/water partition coefficient would pro-
 vide reasonable correlations with many bio-
 logical activities2.
   The application of QSAR and LFER meth-
 ods to predicting and  understanding solute/
 solvent interactions has been a relatively
 recent development, primarily conceptual-
 ized by Kamlet and Taft with the linear solva-
 tion energy relationship (LSER) and the
 solvatochromic parameters12-17. A number of
 additional regressions have also appeared,
 using additional descriptors to correlate sol-
 vation effects18.
    The LSER methodology, however, does provide one advantage that most QSAR and LFER regressions lack: a single set of four descriptors that is used in every correlation.  The advantage here is obvious: one is able to immediately and directly compare different data sets and activities and have a base of reference.  The generalized LSER, formalized by Kamlet and Taft, is shown in Equation 1.
    The LSER parameters (molar volume, Hildebrand solubility parameter, and three spectroscopically derived parameters describing polarizability, hydrogen bond acidity, and hydrogen bond basicity) have been used to successfully correlate over 250 different properties involving solute/solvent interactions.  The LSER has led to a much better understanding of the effects of solvent on properties.
    Based on the LSER methodology, we have developed a new set of theoretically derived parameters for correlations in QSAR and LFER relationships.  Termed the theoretical linear solvation energy relationship (TLSER), this methodology has been applied to a number of diverse data sets correlating general toxicity, specific receptor-based toxicity, solute/solvent-based physical properties, and UV-visible spectral shifts19-26.  In each case, the resulting TLSER regression is equivalent to or better than the corresponding LSER or classical QSAR regression in correlative capability.
    This paper provides a general overview of the TLSER and presents three examples of how the TLSER descriptors have been used to provide insight into solute/solvent interactions.  In particular, three different solute/solvent properties are presented that illustrate the correlative ability of the TLSER.
    LOG (Property) = bulk/cavity + polarizability/dipolarity + hydrogen bonding
                                        Equation 1
 Methods
  All geometries were optimized using the
MNDO algorithm as contained within
MOPAC (version 6.0)27-28.  Table 1 lists the com-
pounds used in this study. The experimental
data is taken directly from the original
sources29-35. Visualization and structure
entry were performed using the in-house
developed Molecular Modeling Analysis
and Display System and PCModel36-37.  All
multiple regressions were performed using
Minitab38.
  The TLSER descriptors were taken
directly from the MNDO calculations. These
descriptors consist of six molecular parame-
ters that attempt to describe the important
features involved in solute/solvent interac-
tions. These parameters were developed
from and are modeled after the LSER meth-
odology.  The same generalized equation as
the LSER, as shown in Equation 1, page 99,
is applicable to the TLSER.
  The bulk/steric term of the TLSER is described by the molecular van der Waals volume (Vmc), given in cubic angstroms, and is computed by the method of Hopfinger39.  The dipolarity/polarizability term uses the polarizability index (π1), obtained by dividing the polarization volume40-41 by the molecular volume to obtain a unitless quantity42.  The resulting π1 is not generally correlated with Vmc and describes the ability of the electron cloud to be polarized by an external field.
  The hydrogen bonding term from Equation 1 is divided into two effects in the LSER approach: a hydrogen bond acidity (HBA) and a hydrogen bond basicity (HBB).  In the TLSER, the HBA and HBB effects are further subdivided into two contributions each: covalent acidity and basicity, and electrostatic acidity and basicity.  The covalent HBB contribution is defined as the molecular orbital basicity (εb), and is computed by subtracting the energy of the highest occupied molecular orbital (HOMO) of the substrate from the energy of the lowest unoccupied molecular orbital (LUMO) of water.  The covalent HBA contribution is the molecular orbital acidity (εa), and is computed in an analogous manner, with the energy of the HOMO of water being subtracted from the energy of the LUMO of the substrate.  The electrostatic HBB, or electrostatic basicity (q−), is the magnitude of the most negative formal charge in the molecule.  The electrostatic acidity (q+) is the value of the charge on the most positive hydrogen.  Both q+ and q− are taken directly from the Mulliken charges.  The general form of the equation, then, is shown in Equation 2, where P is the property of interest and P0 is the intercept.  It is important to note that although there are six descriptors in the generalized model, most correlations reduce to two to four parameters.

    LOG P = P0 + aVmc + bπ1 + cεb + dq− + eεa + fq+
                                        Equation 2
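
As a rough illustration of how a regression of the form of Equation 2 can be fit, the sketch below uses ordinary least squares on invented descriptor values; the original correlations in this work were run in Minitab, and none of the numbers below come from the study.

```python
# Illustrative sketch of fitting the generalized TLSER form of Equation 2 by
# ordinary least squares.  All descriptor and property values are invented
# placeholders, not data from this study.
import numpy as np

# Columns: Vmc (cubic angstroms), pi_1, eps_b, q_minus, eps_a, q_plus
descriptors = np.array([
    [52.3, 0.091, 0.152, 0.331, 0.118, 0.062],
    [68.9, 0.104, 0.160, 0.402, 0.121, 0.071],
    [75.1, 0.112, 0.148, 0.356, 0.131, 0.055],
    [90.4, 0.120, 0.171, 0.388, 0.127, 0.080],
    [61.7, 0.098, 0.144, 0.310, 0.115, 0.049],
    [83.2, 0.116, 0.166, 0.379, 0.125, 0.074],
    [70.5, 0.107, 0.157, 0.345, 0.122, 0.066],
    [57.8, 0.095, 0.150, 0.322, 0.117, 0.058],
    [95.6, 0.124, 0.175, 0.410, 0.133, 0.084],
    [64.4, 0.101, 0.146, 0.318, 0.119, 0.052],
])
log_property = np.array([1.21, 1.87, 2.05, 2.60, 1.50, 2.34, 1.92, 1.38, 2.81, 1.61])

# Design matrix with a leading column of ones for the intercept P0.
X = np.column_stack([np.ones(len(log_property)), descriptors])
coeffs, _, _, _ = np.linalg.lstsq(X, log_property, rcond=None)

for name, value in zip(["P0", "Vmc", "pi_1", "eps_b", "q-", "eps_a", "q+"], coeffs):
    print(f"{name:>5s}: {value: .4f}")

# In practice, insignificant descriptors would be dropped; as noted above,
# most TLSER correlations reduce to two to four parameters.
```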

Results and  Discussion
  The TLSER can be used to characterize
three different types of data sets: a) multiple
solutes in a single solvent, b) multiple sol-
vents with a single solute, and c) multiple
solutes in multiple solvents.  In  each  case,
the information derived from the solute/sol-
vent interaction is different.  In case (a), the
resulting coefficients describe the effect of
the solvent in the process, and  the parame-
ters  provide the details of the solute. In
case (b) the reverse is true;  the coefficients
provide the effects of the solute in the pro-
cess, and the descriptors provide informa-
tion about the solvents.  Finally, in case (c), which is the most general, the coefficients describe the process independent of solute
or solvent, as descriptors are present to
describe both solute and solvent.
Case A:  Solubility in Supercritical CO2

  As an example of case (a), let us examine the solubility of solutes in supercritical carbon dioxide24,31-33,43-45.  Solubility in supercritical media has recently been reviewed by a number of researchers.  Reactions in supercritical media are currently being investigated as a means of enhancing the reactivity of hazardous materials.  The resulting regression, using literature data for solubility in supercritical CO2, is shown in Equation 3.

    LOG S(CO2, 14 MPa) = −6.037π1 + 10.440εb − 22.098q− + 24.350q+ − 8.370
         N = 19   R = 0.928   sd = 0.477   F = 22
                                        Equation 3

  The regression agrees very favorably with the published regression of Politzer.  By examining the coefficients, it is possible to speculate about the important aspects of solubility in supercritical carbon dioxide.  The coefficient of the electrostatic basicity, q−, is negative, suggesting an inverse relationship between basicity and solubility.  Likewise, the coefficient of the molecular orbital basicity is positive, indicating a similar inverse relationship.  The electrostatic acidity is also important in this regression, suggesting increased solubility as the solute hydrogen bond acidity is increased.  The final term, π1, is also negative, indicating that harder solutes (in the Pearson or Drago sense) would be better solvated.  This is entirely consistent with accepted models of acidity and basicity.

Case B:  Solubility of C60 in Various Solvents

  The carbon structure of C60 and the method by which it is solvated by different solvents provide a unique probe with which to study solute/solvent interactions35,46.  The data set from Ruoff et al. was used to generate a TLSER35.  It is important to note that although the TLSER descriptors now relate to the solvent (whereas in case (a) the descriptors referred to the solute), it is the same set that is used in case (a).  The resulting regression is shown in Equation 4.

    LOG S = 1.515(Vmc/100) + 45.761εb − 3.081q− + 51.810εa − 16.895
         N = 45   R = 0.890   sd = 0.789   F = 37
                                        Equation 4

  From this regression, increased solvent volume and covalent acidity and basicity increase the solubility of C60, whereas increased electrostatic basicity decreases solubility.  In this case, the covalent terms have been slightly modified, with εb and εa being divided by 100 and subtracted from 0.3 in order to present an increasing scale (larger numbers indicate better acids or bases, depending on the scale).  The dependence on volume can be seen as an indication of the effect of the Hildebrand solubility parameter (δh is proportional to 1/V).  This does not explain, however, why V is significant although δh is not.  The dependence on both epsilons indicates that the solubility depends primarily on interactions between orbitals, and not on electrostatic interactions.  This is further reinforced by the negative dependence on solvent electrostatic basicity.

Case C:  ΔHS of t-Butyl Halides in Alcohols and Water

  The final case to be described is the enthalpy of solution of t-butyl chloride, bromide, and iodide in water and 13
alcohols34,47-48.  This presents a unique data
set for two reasons. First, it is one of only a
few thermodynamic properties we have
examined using the TLSER.  Second, the
data set consists of information on three
solutes in 14 solvents, providing the possibil-
ity of a multiple solute/multiple solvent
regression.  Goncalves and coworkers mea-
sured this data and correlated it against a
series of more "conventional" LFER type
descriptors, including dipole moment, the
Kirkwood dielectric function, molar volume,
and the Reichardt-Dimroth ET.  Both individ-
ual correlations for each halide and for the
combination were presented by Goncalves.
The corresponding TLSER regression for
the chloride, bromide and iodide is shown in
Equations 5 through 7, respectively.
  These regressions do show the similarity
in the behavior of each of the halides in the
set of hydroxylic solvents. In each case, the
same descriptors are significant, and in each
case, the same descriptor is the "most
important" (based on the t statistic, not
shown).  In each case, there is an inverse relationship with the molecular orbital acidity (εa is inversely related to the acidity, so a positive coefficient indicates an inverse relationship).  Further, solvent electrostatic basicity leads to lower values of ΔHS, and increased solvent electrostatic hydrogen acidity leads to a higher value of ΔHS.  Based on the assumptions and previous findings of Goncalves, each of these descriptors, along with the sign of its coefficient, behaves as expected.  Further, the inclusion of each of these descriptors makes "chemical sense."  This is a vital part of the TLSER, and one to which we have attempted to adhere.
  In addition to regressions based on the individual solutes, a single regression can be obtained from the combination of the enthalpies of all three solutes.  In this case, the generalized TLSER equation is modified to account for solute as well as solvent.  This form is shown in Equation 8.
  This equation is identical to Equation 2, except that descriptors with subscript 1 refer to the solvent and descriptors with subscript 2 refer to the solute.  As with the individual solute equations, not all descriptors are statistically significant (at the 0.95 confidence level).  Further, those terms that are significant in Equations 5-8 are, for the most part, also significant in the combined equation, which is shown as Equation 9.
    ΔHS = 4.827(Vmc/100) + 7.637εa − 749.35q− + 895.70q+ − 37.05
         N = 14   R = 0.965   sd = 0.513   F = 31
                                        Equation 5

    ΔHS = 4.056(Vmc/100) + 8.651εa − 898.90q− + 934.90q+ − 35.14
         N = 14   R = 0.959   sd = 0.599   F = 26
                                        Equation 6

    ΔHS = …(Vmc/100) + 9.343εa − 913.10q− + 1114.30q+ − 48.92
         N = 14   R = 0.961   sd = 0.608   F = 27
                                        Equation 7

    LOG P = a δh1Vmc2 + b π1 + …
                                        Equation 8

    ΔHS = −0.502(Vmc/100) − 0.764εb + 0.531εa − 3049.0q− + 43.12
                                        Equation 9
  The only aspect of the overall equation that is not apparent in Equations 5 through 8 is the inclusion of the solvent molecular orbital basicity.  In addition, the solvent electrostatic basicity is no longer significant in the generalized equation.  A more detailed description of this data set, as well as a complete review of the Goncalves regressions, is currently in preparation and is being submitted to the Journal of Physical Organic Chemistry for publication.

  Conclusions
    The TLSER methodology attempts to provide an alternate formalism to the classical LFER and QSAR approaches while still staying within the concepts advanced by Kamlet and Taft in the LSER.  Further, our efforts
 have been geared toward identifying a con-
 sistent set of parameters calculated solely
  from molecular orbital methods that can be used as a replacement for the LSER-based
 solvatochromic parameters. The TLSER
 has been shown to provide correlations on
 the same order as the LSER when direct
 comparisons are made, and significantly
 better than classical QSAR approaches.
   The examples provided here further rein-
 force the usefulness of the TLSER, and
 describe three very diverse types of interac-
 tions dependent on solute/solvent interac-
 tions. The TLSER provides a description of
 macroscopic properties in terms of molecu-
 lar orbital derived microscopic descriptors.
 Further, due to the nature of these descrip-
 tors, they usually do not cross-correlate (although this is not always the case).  Finally, the TLSER provides a fundamental capability for the a priori prediction of properties of new compounds.

 References
 1  Gupta, S. Chem Rev 1987, 87, 1183.
 2  Hansch, C.; Fujita, T. J Am Chem Soc 1964, 86, 1616.
 3  Kier, L. B.; Hall, L. H. Connectivity in Structure-Activity Analysis; Research Studies Press: Letchworth, 1986.
 4  Lewis, D. F. V. In Progress in Drug Metabolism; Plenum Press.
 5  Reichardt, C. Solvents and Solvent Effects in Organic Chemistry, Second ed.; VCH: New York, 1988.
 6  Shorter, J. Quart Rev 1970, 27, 44.
 7  Silipo, C.; Vittoria, A. QSAR: Rational Approaches to the Design of Bioactive Compounds; Elsevier: New York, 1991; Vol. 16, pp 573.
 8  Burkhardt, G. N. Nature (London) 1935, 17, 684.
 9  Hammett, L. P. Chem Rev 1935, 17, 125.
 10 Hammett, L. P. J Am Chem Soc 1937, 59, 96.
 11 Exner, O. In Correlation Analysis of Chemical Data; J. Shorter, Ed.; Plenum Press: New York, 1988; pp 25.
 12 Kamlet, M. J.; Taft, R. W.; Abboud, J.-L. M. JACS 1977, 91, 8325.
 13 Kamlet, M. J.; Taft, R. W. Prog Org Chem 1983, 48, 2877.
 14 Kamlet, M. J.; Abraham, M. A.; Doherty, R. M.; Taft, R. W. JACS 1984, 106, 464.
 15 Kamlet, M. J.; Abraham, M. A.; Doherty, R. M.; Taft, R. W. Nature 1985, 313, 384.
 16 Kamlet, M. J.; Taft, R. W. Acta Chem Scand 1986, Part B 40, 619.
 17 Kamlet, M. J.; Doherty, R. M.; Abraham, M. H.; Taft, R. W. QSAR 1988, 7, 71.
 18 Kamlet, M. J.; Taft, R. W.; Famini, G. R.; Doherty, R. M. Acta Chem Scand 1987, 41, 589.
 19 Famini, G. R. "Using Theoretical Descriptors in Structure Activity Relationships V", U.S. Army Chemical Research, Development and Engineering Center, 1989.
 20 Famini, G. R.; Penski, C. A.; Wilson, L. Y. J Phys Org Chem 1992, 5, 395.
 21 Famini, G. R.; Ashman, W. P.; Mickiewicz, A. P.; Wilson, L. Y. QSAR 1992, 11, 162.
 22 Famini, G. R.; DeVito, S. C.; Wilson, L. Y. In ACS Symposium Series on Biomarkers; M. Saleh and J. Blancato, Ed.; 1992.
 23 Famini, G. R.; Marquez, B. C.; Wilson, L. Y. J Chem Soc P2 1993, 773.
 24 Famini, G. R.; Wilson, L. Y. J Phys Org Chem 1993, 6, in press.
 25 Wilson, L. Y.; Famini, G. R. J Med Chem 1991, 34, 1668.
 26 Cramer, C. J.; Famini, G. R.; Lowrey, A. H. Accts Chem Res 1993, in press.
 27 Dewar, M. J. S.; Thiel, W. JACS 1977, 99, 4899.
 28 Stewart, J. J. P. "MOPAC Manual, Sixth Edition", U.S. Air Force Academy, 1990.
 29 Bartle, K. D.; Clifford, A. A.; Jafar, S. A.; Shilstone, G. F. J Phys Chem Ref Data 1991, 20, 713.
 30 Dobbs, J. M.; Wong, J. M.; Lahiere, R. J.; Johnston, K. P. Ind Eng Chem Res 1987, 26, 56.
 31 Nakatani, T.; Ohgaki, K.; Katayama, T. J Supercrit Fluids 1989, 2, 9.
 32 Sako, S.; Ohgaki, K.; Katayama, T. J Supercrit Fluids 1988, 1, 1.
 33 Sako, S.; Ohgaki, K.; Katayama, T. J Supercrit Fluids 1989, 2, 3.
 34 Goncalves, R. M. C.; Albuquerque, L. M. P. C.; Martins, R. E. L.; Simoes, A. M. N.; Ramos, J. J. M. J Phys Org Chem 1992, 5, 93.
 35 Ruoff, R. S.; Tse, D. S.; Malhotra, R.; Lorents, D. C. J Phys Chem 1993, 97, 3379.
 36 Leonard, J. M.; Famini, G. R. "A User's Guide to the Molecular Modeling Analysis and Display System (MMADS)", U.S. Army Chemical Research, Development and Engineering Center, 1989.
 37 PCModel; Serena Software, PO Box 3076, Bloomington, IN 47402.
 38 Minitab; Minitab Inc., 3081 Enterprise Dr, State College, PA 16801.
 39 Hopfinger, A. J. JACS 1980, 102, 7126.
 40 Dewar, M. J. S.; Stewart, J. J. P. Chem Phys Lett 1984, 111, 416.
 41 Kurtz, H. A.; Stewart, J. J. P.; Dieter, K. M. J Comp Chem 1990, 11, 82.
 42 Famini, G. R. "Using Theoretical Descriptors in Structure Activity Relationships II: Polarizability Index", U.S. Army Chemical Research, Development and Engineering Center, 1988.
 43 Politzer, P.; Lane, P.; Murray, J. S.; Brinck, T. J Phys Chem 1992, 96, 7938.
 44 Politzer, P.; Murray, J. S.; Concha, M. C.; Brinck, T. J Mol Struc (THEOCHEM) 1992, submitted.
 45 Politzer, P.; Murray, J. S.; Lane, P.; Brinck, T. J Phys Chem 1992, in press.
 46 Smith, A. L.; Li, D.; King, B.; Romanow, W. J. J Phys Chem 1993, submitted.
 47 Goncalves, R. M. C.; Albuquerque, L. M. P. C.; Simoes, A. M. N. Port Electrochim Acta 1991, 9, 487.
 48 Goncalves, R. M. C.; Simoes, A. M. N.; Leitao, A. S. E.; Albuquerque, L. M. P. C. J Chem Res 1992, 330.
1  George R. Famini, U.S. Army Edgewood Research, Development and Engineering Center, Aberdeen Proving
    Ground, MD 21010.
2  Leland Y. Wilson, Department of Chemistry, La Sierra University, Riverside, CA 92515.
   Calculated Infrared Spectra for Halogenated
   Hydrocarbons1,2
  Background
   The U.S. Environmental Protection
  Agency (EPA) requires high quality refer-
  ence spectra for reliable identification and
  quantification of pollutants in environmental
  monitoring. Costs involved in acquiring such
  spectra, which include preparation and puri-
  fication of standards, physical measure-
  ments of spectra, safety considerations, as
  well as waste disposal, can be substantial.
  Consequently, the current experimental
  methods for obtaining reference spectra in
  environmental analysis are not economical
 and cannot keep pace with the proliferation
 of new chemicals requiring such
 identification.
   In an alternative approach, the U.S. EPA
 (EMSL-LV) has begun  a pilot project to eval-
 uate the application of ab initio computa-
 tional chemistry  methods in the
 determination of infrared spectra (frequen-
 cies and intensities) for molecules of envi-
 ronmental interest.  In this  study,
 experimental vibrational frequencies for
 halogenated hydrocarbons are correlated
 with frequencies determined  computation-
 ally  in order to ascertain a suitable level of
 theoretical treatment with a desired level of
 accuracy that does not result in excessive
 computational demands.  Results from this
 work are promising and indicate strong pos-
 sibilities for the role of computational meth-
 ods  in the determination of infrared spectra
 for molecules whose infrared properties are
 not known, or to aid in the interpretation of
 experimental spectra that have not been
 fully assigned.

 Method / Approach
  In  this work, halogenated (fluorinated and
chlorinated) aliphatic and aromatic com-
pounds were treated with the  computational
  prescription for determining infrared fre-
  quencies and intensities of compounds out-
  lined by Aue et al. in their compilation study
  of organic compounds1. The Hartree-Fock
  level of theory was employed in combination
  with a range of basis sets (STO-3G, 3-21G,
  3-21G*, and 6-31G*).  This level of treat-
  ment is suitable for larger molecular systems
  that would be of interest to  EPA.  All calcula-
 tions were performed using the GAUSSIAN
 92 electronic structure program on the
 NSCEE supercomputer at Las  Vegas, NV.

 Results / Discussion
   The computed fundamental vibrational
 frequencies are obtained from the harmonic
 oscillator approximation, while the experi-
 mentally determined values are best charac-
 terized by an anharmonic potential.
 Scaling factors multiplying the theoretical
 frequencies have been utilized in the past to
 compensate for this comparison of unlike
 quantities as well as to account for system-
 atic errors resulting from the neglect of elec-
 tron correlation2.  The standard textbook scaling factor of 0.89, developed by Pople and coworkers, has been routinely
 applied to frequencies computed at the Har-
 tree-Fock level of theory3.
   In this study, linear regression analysis
 (with an intercept forced through the origin)
 was used to compare experimental funda-
 mental frequencies with those that had been
 determined computationally in order to more
 accurately correlate subsets of the haloge-
 nated hydrocarbon compounds  (Table 1,
 page 107).   Four groups of compounds
 were defined: aliphatic fluorinated, aliphatic
 chlorinated, a combination of aliphatic chlori-
 nated and fluorinated, and aromatic chlori-
 nated.  Experimental gas phase
frequencies were used for comparison with
theoretically assigned frequencies for all ali-
phatic compounds4. The only available
chlorinated aromatic assigned frequencies
were characterized in the liquid phase and,
in some cases, the solid phase5.
  Using the best theoretical treatment employed, HF/6-31G*, the aliphatic fluorinated compounds produced a slope of 0.8991 with a standard error (defined as a measure of the error in the prediction of y = experimental frequency for an individual x = theoretical value) of 48.4 cm-1 (R2, the squared Pearson product-moment correlation coefficient, equaled 0.9971).  The scaling factors obtained at the same level of theory for the aliphatic and aromatic chlorinated compounds were 0.8937 and 0.9017, with standard errors of 28.3 cm-1 and 27.3 cm-1, respectively (R2 for the aliphatic chlorinated compounds equaled 0.9991, and for the chlorinated aromatics the R2 value was 0.9988).  We therefore conclude that our scaling factors are in reasonable agreement with the previously accepted value of 0.89.  The HF/3-21G (HF/3-21G* for chlorinated aromatics) and HF/STO-3G results produced larger standard errors than would typically be desired for computing IR spectra, but both methods can be systematically employed if higher-level ab initio calculations are not computationally feasible.
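
The scaling factors above come from least-squares lines forced through the origin. A minimal sketch of that calculation is given below; the frequency pairs are invented placeholders, not the data behind Table 1.

```python
# Minimal sketch of a zero-intercept least-squares fit, y = m*x, with
# x = computed harmonic frequency and y = experimental fundamental frequency.
# The values below are invented placeholders.
import numpy as np

computed = np.array([3180.0, 1655.0, 1290.0, 1105.0, 842.0, 610.0])      # cm-1
experimental = np.array([2985.0, 1482.0, 1160.0, 990.0, 755.0, 548.0])   # cm-1

# With the intercept forced through the origin, the slope (scaling factor)
# is sum(x*y) / sum(x*x).
scale = np.dot(computed, experimental) / np.dot(computed, computed)

residuals = experimental - scale * computed
std_error = np.sqrt(np.sum(residuals ** 2) / (len(computed) - 1))  # one fitted parameter

r = np.corrcoef(computed, experimental)[0, 1]
print(f"scale factor = {scale:.4f}, std. error = {std_error:.1f} cm-1, R2 = {r*r:.4f}")
```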
  Coupling the scaled fundamental frequen-
cies with the calculated intensities provides
a theoretical representation of an IR spec-
trum that can be  compared with an experi-
mental vapor phase spectrum.   A typical
problem encountered by EPA is the identifi-
cation of a specific structural isomer for a
given compound. In this work, the spectra
(frequencies and intensities) for the three
structural isomers of dichlorobenzene were
determined at the HF/6-31G* level and com-
pared with the experimental vapor phase
spectra6 (Figure 1, page  108, Figure 2,
page 109, and Figure 3, page 110).
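
The comparison step just described (scale the computed frequencies, pair them with the calculated intensities, and set the result beside the measured spectrum) can be sketched as follows. The frequencies, intensities, and scale factor below are illustrative placeholders only, not values from this study.

```python
# Illustrative sketch: build a simple "stick" representation of a computed IR
# spectrum from scaled frequencies and relative intensities.  Values are
# invented placeholders; 0.90 is merely a round number near the Table 1 factors.
import numpy as np

scale = 0.90
computed_freqs = np.array([3395.0, 1665.0, 1245.0, 1130.0, 850.0])   # cm-1, unscaled
intensities = np.array([12.0, 85.0, 40.0, 110.0, 60.0])              # arbitrary units

scaled_freqs = scale * computed_freqs
relative = 100.0 * intensities / intensities.max()   # strongest band = 100

for freq, inten in sorted(zip(scaled_freqs, relative), reverse=True):
    bar = "#" * int(inten // 5)                       # crude text rendering
    print(f"{freq:8.1f} cm-1  {inten:5.1f}  {bar}")
```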
  Results indicate that the calculated spec-
tra are representative of the experimental
spectra for  each structural isomer, providing
a unique identification of the fundamental
frequencies and their relative intensities.
Spectra computed for the remaining chlori-
nated aromatics gave similar results.  We
would conclude that for this particular set of
compounds, theory provides a "fingerprint"
spectrum that can be compared with the
experimental gas phase spectrum for the
complete identification of a given compound.

Conclusion
  Computational IR spectra (frequencies
and intensities) for a number of halogenated
aliphatic and aromatic compounds have
been determined.  A linear regression anal-
ysis between experimental gas phase fun-
damental frequencies (where available) and
the calculated frequencies provides a scal-
ing factor in reasonable agreement with the
currently accepted value of 0.89.  Computed spectra for the chlorinated aromatics provide good agreement with the frequen-
cies and intensities of the experimental
vapor phase spectra, allowing for a com-
plete structural identification.   Our results
indicate strong possibilities for the role of
computational chemistry techniques coupled
with experimental methods for the identifi-
cation of environmentally relevant
compounds.

References
1 D. H. Aue, M. Guidoni, J. W. Caras, & D. Gurka, "Infrared Frequencies and Intensities Calculated from MM3, Semiempirical, and Ab Initio Methods for Organic Molecules", Proceedings from the Computational Chemistry Workshop sponsored by the Environmental Protection Agency, Bay City, MI, Sept. 27-29, 1993.
2 W. J. Hehre, L. Radom, P.v.R. Schleyer, & J. A. Pople, Ab Initio Molecular Orbital Theory, Wiley, New York, 1986.
3 J. A. Pople, R. Krishnan, H. B. Schlegel, D. DeFrees, J. S. Binkley, M. J. Frisch, R. F. Whiteside, R. F. Hout, and W. J. Hehre, Int. J. Quantum Chem. Symp., 15, 269 (1981).
            Table 1: Linear Regression Statistics Comparing Theoretical (HF/6-31G*)
         Fundamental Frequencies (x-axis) and Experimental Frequencies (y-axis) for the
                                Halogenated Hydrocarbons.

  Molecules          N    Method/Basis   Std. Error   Slope    Intercept   R2
  Aliphatic (F)      75   HF/6-31G*        48.4       0.8991       0       0.9971
  Aliphatic (Cl)     57   HF/6-31G*        28.3       0.8937       0       0.9991
  Aliphatic (Cl&F)   132  HF/6-31G*        41.1       0.8967       0       0.9980
  Aromatic (Cl)      322  HF/6-31G*        27.3       0.9017       0       0.9988

  Aliphatic (F)      75   HF/3-21G         62.3       0.8980       0       0.9952
  Aliphatic (Cl)     57   HF/3-21G         64.5       0.8959       0       0.9955
  Aliphatic (Cl&F)   132  HF/3-21G         63.0       0.8967       0       0.9953
  Aromatic (Cl)      322  HF/3-21G*        43.6       0.8970       0       0.9969

  Aliphatic (F)      75   HF/STO-3G        75.3       0.8340       0       0.9930
  Aliphatic (Cl)     57   HF/STO-3G        40.4       0.8259       0       0.9982
  Aliphatic (Cl&F)   132  HF/STO-3G        61.9       0.8303       0       0.9954
  Aromatic (Cl)      322  HF/STO-3G        58.0       0.8366       0       0.9944

  Standard errors are in cm-1.  The currently accepted scaling factor (slope) is 0.89 (see reference 3).
  a Experimental spectrum (reference 6).
  b Scaled HF/6-31G* frequencies and calculated relative intensities.

            Figure 1:  Experimental and theoretical spectra for 1,2-dichlorobenzene.
  a Experimental spectrum (reference 6).
  b Scaled HF/6-31G* frequencies and calculated relative intensities.

            Figure 2:  Experimental and theoretical spectra for 1,3-dichlorobenzene.
  a Experimental spectrum (reference 6).
  b Scaled HF/6-31G* frequencies and calculated relative intensities.

            Figure 3:  Experimental and theoretical spectra for 1,4-dichlorobenzene.
  4 T. Shimanouchi, "Tables of Molecular Vibrational Frequencies", National Standard Reference Data Series, No. 39, National Bureau of Standards, Washington, DC, 1972.  T. Shimanouchi, J. Phys. Chem. Ref. Data, 6, 1 (1977).
  5 J. R. Scherer & J. C. Evans, Spectrochimica Acta, 19, 1739 (1963); J. R. Scherer, Spectrochimica Acta, 23A, 1489 (1967).
  6 C. J. Pouchert, The Aldrich Library of FT-IR Spectra, Edition 1, Volume 3, Aldrich Chemical Company, Milwaukee, WI, 1989.
                I
Disclaimer

   The U.S.  Environmental Protection
Agency (EPA), through its Office of
Research and Development (ORD), pre-
pared this extended abstract for publication
in a conference proceedings.  It does not
necessarily reflect the views of EPA or ORD.
 1  Kathleen Robins, U.S. EPA, EMSL-Las Vegas, P.O. Box 93478, Las Vegas, NV 89193-3478 and Department of
     Chemistry, University of Nevada, Las Vegas, 4505 South Maryland Parkway, Las Vegas, NV 89154.
 2  Donald H. Aue and James W. Caras, Department of Chemistry, University of California, Santa Barbara, Santa
     Barbara, CA 93106.
   Development of Physiologically Based-Pharmacokinetic and
   Biologically Based-Dose Response (PB-PK/BB-DR) Models
   for Dioxins and Related Compounds Incorporating
   Structure-Activity Considerations1,2
 Introduction / History
    It is possible to understand the structure-activity relationships underlying the toxicokinetic properties of environmental chemicals by relating their effects on biological systems to the molecular properties encoded in their structure.  It was
 previously reported (Waller and McKinney,
 1992) for polychlorinated dibenzo-p-dioxins,
 dibenzofurans, and biphenyls that differ-
 ences between the three-dimensional nature
 of their steric and electrostatic molecular
 fields were indicative of their relative affini-
 ties for the Ah receptor.  These field differ-
 ences were not fully descriptive of the
 associated in vitro enzyme induction data. It
 was hypothesized that additional parame-
 ters (i.e., hydrophobicity) should be consid-
 ered for this purpose. The initial results of
 this approach are reported herein.

 Method / Approach

 3D-QSAR

  Comparative molecular field analysis
 (CoMFA) (Cramer et al., 1988), a three-
 dimensional quantitative structure-activity
 relationship (QSAR) paradigm, was used to
 examine the physicochemical properties
 underlying the toxicity of polyhalogenated
 dibenzo-p-dioxins, dibenzofurans, and
 biphenyls. The steric (van der Waals) and
 electrostatic (coulombic) molecular field
 characteristics were represented as point
 values on a regularly-spaced grid surround-
 ing each molecule. The potential utility of
 these values as predictive descriptors of
 receptor binding and enzyme induction was
 then examined using partial least squares
 regression in conjunction with cross-
 validation.
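
As a rough, modern illustration of the statistical step just described (not the software actually used for these analyses), the sketch below fits a partial least squares model to a matrix of made-up grid-point field values and reports a leave-one-out cross-validated q2. The data are random placeholders, and scikit-learn is assumed to be available.

```python
# Hypothetical sketch of PLS regression with leave-one-out cross-validation on
# CoMFA-style field values.  Data are random stand-ins, not the dioxin set.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
n_compounds, n_grid_points = 20, 300
fields = rng.normal(size=(n_compounds, n_grid_points))      # steric + electrostatic grid values
activity = fields[:, :5].sum(axis=1) + 0.3 * rng.normal(size=n_compounds)  # toy binding data

pls = PLSRegression(n_components=3)

# Leave-one-out cross-validation: predict each compound from a model fit to
# the remaining compounds, then form the cross-validated q^2.
predictions = np.empty(n_compounds)
for train_idx, test_idx in LeaveOneOut().split(fields):
    pls.fit(fields[train_idx], activity[train_idx])
    predictions[test_idx] = pls.predict(fields[test_idx]).ravel()

press = np.sum((activity - predictions) ** 2)
ss_total = np.sum((activity - activity.mean()) ** 2)
q2 = 1.0 - press / ss_total
print(f"cross-validated q^2 = {q2:.3f}")
```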
 Hydrophobicity
   Recent developments in the computa-
 tional chemistry/molecular modeling arena
 have allowed the hydrophobic nature of a
 molecule to be represented as point values
 on a grid surrounding the molecule. This
 technique is the basis of the HINT program
 (Kellogg et al., 1991). These values are
 particularly well-suited for inclusion as
 regressors in a CoMFA/QSAR study and
 can be analyzed in a manner analogous to
 that previously described.

 Results and Discussion
   The statistical results of all analyses are
 listed in Table 1, page 114.  Using the
 CoMFA steric and electrostatic fields as
 regressors, significant relationships were
 discovered with respect to  Ah receptor bind-
 ing affinity.  In addition, graphical represen-
 tation of the results of the regression
 analyses as p-coefficient contour plots
 allowed one to visually display those areas
 in space surrounding the molecules under
 study in which either increased or decreased
 steric bulk or positive electrostatic character
 was desired for increased binding affinity.
 The steric and electrostatic regressors alone
 were not as predictive of induction data for
 the associated enzymes  AHH and EROD
 (data taken from in vitro studies). This was
 thought to be primarily due to the lack of a partitioning, or transport, term.  Preliminary
 attempts to include calculated log
 octanol:water partition coefficient (clogP)
values as QSAR regressors indicated a
severe limitation.  The clogP algorithm is an additive fragment-based method, and as such it does not account for substitution patterns.  Therefore, structural isomers are indistinguishable (1,2,3,4-TCDD and 2,3,7,8-TCDD have the same clogP value).

                          Table 1:  CoMFA Statistical Summary.

  Independent Variable: Ah Receptor Binding Data
  Regressor(s)                              r2cv (comp.)   SEP     r2      s      Contributions (%)
  Steric and Electrostatic                  0.783 (8)      0.724   0.916   0.452  57, 43
  Hydrophobic                               0.672 (4)      0.866   0.802   0.674  100
  Steric, Electrostatic, and Hydrophobic    0.771 (9)      0.750   0.910   0.471  35, 39, 26

  Independent Variable: AHH Induction Data
  Regressor(s)                              r2cv (comp.)   SEP     r2      s      Contributions (%)
  Steric and Electrostatic                  0.612 (3)      1.047   0.793   0.766  49, 51
  Hydrophobic                               0.504 (4)      1.197   0.778   0.801  100
  Steric, Electrostatic, and Hydrophobic    0.580 (3)      1.091   0.787   0.776  30, 30, 40

  Independent Variable: EROD Induction Data
  Regressor(s)                              r2cv (comp.)   SEP     r2      s      Contributions (%)
  Steric and Electrostatic                  0.546 (3)      1.158   0.743   0.871  57, 43
  Hydrophobic                               0.430 (1)      1.271   0.563   1.113  100
  Steric, Electrostatic, and Hydrophobic    0.501 (2)      1.201   0.694   0.940  29, 25, 46
  The addition of three-dimensional hydro-
phobicity parameters to the existing analy-
ses based on steric and electrostatic
qualities alone typically did not affect the sta-
tistical significance of the models. However,
if one considers the individual contributions
of the steric, electrostatic, and hydrophobic
fields in the combined models, they can be
seen to vary depending on the biological
endpoint being modeled. Hydrophobic fields
contributed significantly more to the induc-
tion response models than to the binding
affinity models.
Scientific Accomplishments and EPA
Mission Relevance
  Even in the absence of increased statis-
tics, the resultant hydrophobic field coeffi-
cient contour plots from the QSAR equation
provide for additional insight into the under-
lying physicochemical properties responsi-
ble for the toxicokinetic properties of this
class of compounds, facilitating the ultimate
development of PB-PK/BB-DR models for
use in risk assessment.

Future Objectives
  It is possible to represent binding affini-
ties, and other experimentally-determined
114
              NESC Annual Report - FY1994

-------
   Development of Physiologically Based-Pharmacokinetic and Biologically Based-Dose Response (PB-PK/
       BB-DR) Models for Dioxins and Related Compounds Incorporating Structure-Activity Considerations
  quantitative analytical data, for this conge-
  neric series of molecules as toxic equiva-
  lency factors (TEFs) relative to TCDD (TEF
  = 1.0).  It is hoped that models of this type
  will be useful in the prediction of TEFs for
  untested compounds.
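
    A minimal sketch of that bookkeeping is given below; it is not the authors' code, and the predicted EC50 values are invented purely to illustrate how relative potencies become TEFs with 2,3,7,8-TCDD fixed at 1.0.

      # Hypothetical predicted potencies (EC50, nM); lower values mean greater potency.
      predicted_ec50 = {
          "2,3,7,8-TCDD": 10.0,
          "1,2,3,7,8-PeCDD": 12.5,
          "2,3,7,8-TCDF": 100.0,
      }

      tcdd_ec50 = predicted_ec50["2,3,7,8-TCDD"]
      tefs = {name: tcdd_ec50 / ec50 for name, ec50 in predicted_ec50.items()}
      for name, tef in tefs.items():
          print(f"{name:16s} TEF = {tef:.2f}")   # TCDD itself comes out as 1.0
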

  Acknowledgments
    One author (CLW) acknowledges support
  from NIH Training Grant #T32HLO7275.
Relevant Publications and Reports
(References)
C.L. Waller and J.D. McKinney, J. Med.
  Chem.,1993, 35, 3660.
R.D. Cramer, D.E. Patterson, J.D. Bunce, J.
  Am. Chem. Soc., 1988, 110, 5959.
G.E. Kellogg, S.E. Semus, D.J. Abraham, J.
  Comput.-Aided Mol. Des., 1991, 5, 545.
       Figure 1:  Hydrophobicity field coefficient contour plot. Black polyhedra represent areas where increased hydrophobic
              character is desired. Gray polyhedra represent areas where less hydrophobic character is desired.
1 Chris L. Waller, Center for Molecular Design, School of Medicine, Washington University, St. Louis, MO 63130.
2 James D. McKinney, Health Effects Research Laboratory, Environmental Toxicology Division, Pharmacokinetics
     Branch, U.S. EPA, RTP, NC 27711.
NESC Annual Report - FY1994
                                     115

-------
Development of Physiologically Based-Pharmacokinetic and Biologically Based-Dose Response (PB-PK/
BB-DR) Models for Dioxins and Related Compounds Incorporating Structure-Activity Considerations
116
NESC Annual Report - FY1994

-------
   The Chesapeake Bay Program Simulation Models:  A
   Multimedia Approach 1,2
 Background:
   In 1983 the Chesapeake Bay Program
 identified excess nutrients, or eutrophica-
 tion, as the primary reason for the water
 quality decline in the Chesapeake (Gillelan
 et al., 1983). Several water quality models
 have been developed and successfully
 applied to help identify the nutrients contributing
 to the eutrophication and to quantify the
 nutrient reductions necessary to restore
 Chesapeake Bay resources.  Efforts are
 underway to link the water quality models of
 the estuary and its watershed with an air
 deposition model and simulation models of
 key living resources.
   The Chesapeake Bay Program  was
 formed in 1983 for the purpose of restoring
 the Chesapeake, and includes the states of
 Maryland, Pennsylvania, Virginia,  the Dis-
 trict of Columbia, the Chesapeake Bay Com-
 mission, and the U.S. Environmental
 Protection Agency. These jurisdictions real-
 ized that the Bay's deterioration could not be
 arrested by  any one of them acting alone.
 They acknowledged that the Bay was
 endangered because of changes in its entire
 watershed, a 64,000 square mile area
 extending from  a northern boundary  of Coo-
 perstown, NY; south to Virginia Beach, VA;
 and west to the Ohio River basin.
   Between 1983 and 1987, the Chesapeake
 Bay  Program entered into a long-term plan-
 ning and implementation phase. Nutrient
 reductions were underway, but answers
 were needed for two central questions:
 What should the total nutrient reductions be,
 and where should these reductions be
 made?  Water quality models were used to
 help  answer these questions.  The models
 provided an inventory of the sources of nutri-
 ents  from the watershed and quantified the
possible reductions of these sources,  as well
 as the water quality improvements resulting
 from different nutrient reduction actions.
   The first Chesapeake Bay Program model
 application was the Watershed Model, which
 quantified the magnitude and source of the
 nutrient loads to the Bay for wet, dry, and
 average hydrology years. Four major
 upgrades to this model have been com-
 pleted since 1983 (Hartigan, 1983; Donigian
 et al., 1985; Donigian et al., 1990; Donigian
 et al., 1994).
   A steady-state water quality model of the
 Bay was completed in 1987 to examine the
 impact of nutrient loads on the mainstem
 Bay.  Using simplified loading estimates and
 simulation procedures, the steady-state
 model calculated the average or steady-
 state summer (June - September) conditions
 in the mainstem Bay.  Results from the
 steady-state model indicated that a 40%
 reduction in controllable nutrient loads would
 eliminate anoxia (dissolved oxygen concen-
 trations less than 1.0 mg/L) in the mainstem
 (U.S.  EPA Chesapeake Bay Program,
 1987).
  The planning phase was brought to a
 close with the signing of the 1987 Bay
 Agreement. This document called for reduc-
 ing the controllable amount of nutrients
 reaching the Bay by 40% by the turn of the
 century. Of the many commitments  in the
 1987 Bay Agreement, the nutrient issue was
the only one of such consequence that the
goals were formulated in quantitative and
not merely qualitative terms. The Agree-
ment also called for a major reevaluation of
the nutrient  reduction goal in 1992 using
refined computer simulation models.  Con-
sequently, an extensive program to effect
significant reductions of nutrients entering
the Bay was instituted along with increas-
ingly sophisticated water quality models to
NESC Annual Report - FY1994
                                   117

-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
guide decision-making on cost effective
water quality management in the Chesa-
peake Bay.  Work began on an integrated
set of Chesapeake Bay water quality models
in 1987 as shown in Figure 1.
  In 1992 the 40% reduction goal was con-
firmed by the results of the integrated water-
shed and estuarine computer models, the
first application of multimedia models in the
Chesapeake (Thomann et al. 1994).

The Water Quality Models
  The Chesapeake Bay Watershed Model
(Linker et al., 1994; Donigian et al., 1994) is
designed to simulate nutrient loads deliv-
ered to the estuary under different manage-
ment scenarios.  The model divides the Bay
basin  into sixty-three model segments with
an average area of 260,300 hectares as
illustrated in Figure 2, page 119.  Hydrology,
sediment, and nonpoint source loads
(including atmospheric deposition) are simu-
lated on nine land uses. In the river
reaches, sediment, nonpoint source loads,
point source loads, and water supply diver-
sions are simulated on a one hour time-step.
Particulate and dissolved nutrients are
transported sequentially through each seg-
ment, and ultimately to the tidal Chesapeake
Bay. This model was used to 1) determine
the distribution of the point and nonpoint
source loads and the controllable and
uncontrollable portions of each; 2) deter-
mine the quantity of loads reduced under dif-
ferent management actions including
reductions in nitrogen atmospheric deposi-
tion; and 3) quantify the loads under future
(year 2000) conditions. These loads were
used as input conditions for the Chesapeake
Bay Water Quality Model  (CBWQM). The
Watershed Model is based on EPA sup-
ported model HSPF,  Version 10.
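
  The sketch below is only a toy illustration of that idea (it is not HSPF): nutrient loads generated in a chain of hypothetical river-reach segments are passed downstream on an hourly time step, attenuated by an assumed constant delivery factor, and accumulated as an annual delivery to the tidal Bay. Segment loads and the delivery factor are invented.

    HOURS_PER_YEAR = 24 * 365

    def annual_delivery_to_bay(hourly_loads_kg, delivery_factor=0.9):
        """Route each segment's hourly load through the segments downstream of it,
        attenuating by a constant per-segment delivery factor."""
        delivered = 0.0
        for _hour in range(HOURS_PER_YEAR):
            for position, load in enumerate(hourly_loads_kg):
                hops_to_bay = len(hourly_loads_kg) - position   # segments left to traverse
                delivered += load * delivery_factor ** hops_to_bay
        return delivered

    # three hypothetical segments, upstream to downstream (kg N per hour)
    print(f"{annual_delivery_to_bay([0.4, 0.7, 1.1]):,.0f} kg N delivered per year")
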
  The Chesapeake Bay Water Quality
Model (CBWQM) takes loads from the
Watershed Model as input.  The CBWQM is
a time variable, three dimensional water
quality model of the tidal Bay coupled with a
                       Figure 1: The Current Chesapeake Bay Integrated Modeling
118
              NESC Annual Report - FY1994

-------
                           The Chesapeake Bay Program Simulation Models: A Multimedia Approach
NESC Annual Report - FY1994
119

-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
model of estuarine sediment processes.
The CBWQM computational grid is shown in
Figure 3, page 121.  The sediment model
provides simulation of sediment nutrient
sources and sinks. An ocean boundary sub-
model simulates the expected coastal
exchange of loads with the Chesapeake
under different nutrient management condi-
tions. The CBWQM is driven by a hydrody-
namic model simulating the movement of
Bay waters over the three year (1984 -
1986) simulation period on a five minute
time-step. The details of model develop-
ment, structure, calibration, and sensitivity
are given in separate reports (Cerco and
Cole, 1993; DiToro and  Fitzpatrick, 1993;
Johnson et al., 1991). The CBWQM is
based on an extensively modified and
expanded WASP code.

The Findings So Far
  Several key management scenarios were
carried out using the linked Watershed
Model and CBWQM. These key scenarios
provided a basic inventory of loads under
specific management conditions.

Base Case Scenario
  This scenario is the base case year (1985)
loads to the Chesapeake Bay. The 1985
loads are the benchmark loads against
which all reductions, particularly the 40%
Bay Agreement reduction, are measured.

Bay Agreement Scenario
  This scenario represents the nutrient
loads to be reduced by the year 2000 under
the Bay Agreement.  The reduction is a 40%
reduction of controllable nutrient loads.
Controllable loads were defined as the base
case loads  minus the loads delivered to the
Bay under an all forest condition.  In other
words, controllable loads are defined as
everything over and above the total phos-
phorus and total nitrogen loads that would
have come from an entirely forested water-
shed. Point source loads are considered, in
this definition, to be entirely controllable.
After the year 2000, the 40% reduction goal
becomes a cap on nutrient loads. Post year
2000 efforts in Chesapeake resource man-
agement will focus on maintenance of the
nutrient cap in the face of load increases
due to growth.
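
  The load accounting described above can be written down compactly. The sketch that follows is only an illustration of the definition (the basin numbers are placeholders, not Bay Program estimates): the controllable load is the base case nonpoint load minus the all-forest load, plus all point source loads, and the Bay Agreement target is the base case total minus 40% of that controllable load.

    def bay_agreement_target(nonpoint_base, point_base, all_forest, reduction=0.40):
        """Target load = base case total minus 40% of the controllable load,
        where controllable = (nonpoint base - all-forest load) + point sources."""
        controllable = (nonpoint_base - all_forest) + point_base
        return (nonpoint_base + point_base) - reduction * controllable

    # hypothetical basin loads, million kg of nitrogen per year
    print(bay_agreement_target(nonpoint_base=60.0, point_base=40.0, all_forest=25.0))
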

Limit of Technology Scenario
  The limit of current technology scenario is
defined as having all cropland in conserva-
tion tillage; the Conservation Reserve Pro-
gram fully implemented; nutrient
management, animal waste controls, and
pasture stabilization systems implemented
where needed; a 20% reduction in urban
loads; and point source effluent controlled to
a level  of 0.075 mg/L total phosphorus and
3.0 mg/L total nitrogen.  It is important to
note that the limit of technology scenario
was used to determine the feasibility of the
40% nutrient reduction.  In no basins did the
reduction called for in the Bay Agreement
exceed what could be achieved by current
technology.

"No Action" Option Scenario
  This scenario represents the growth in
population and the projected changes in
land use by the year 2000 with no additional
controls after 1985. Only the controls in
place in the base case scenario are applied
to the year 2000 point source flows and land
use. This scenario represents what the
loading conditions might be without the nutri-
ent reductions of the Bay Agreement. Bay
Program projections of land use changes
and growth in sewage treatment plant loads
by the year 2000 add 14.2 million kilograms
of nitrogen to the 33.7 million kilograms that
must already be removed to reach the year
2000 nutrient cap. In other words, for every
kilogram we remove, approximately one-half
kilogram returns simply as a result of popu-
lation growth.
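
  The "one-half kilogram" figure follows directly from the two numbers quoted above, as the short calculation below shows.

    growth_added_mkg = 14.2     # million kg N added by projected year-2000 growth
    to_be_removed_mkg = 33.7    # million kg N that must already be removed for the cap
    print(round(growth_added_mkg / to_be_removed_mkg, 2))   # ~0.42, roughly one-half kg per kg removed
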

1993 Progress Scenario
  This scenario applies the actual reduc-
tions made in nitrogen and phosphorus by
120
              NESC Annual Report - FY1994

-------
                           The Chesapeake Bay Program Simulation Models: A Multimedia Approach
                                                                                   
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
the year 1993, the midpoint between the
base year of 1985 and the year 2000 goal.
  Figures 4 and 5, page 123, show the 1985
Base Case total phosphorus and total nitro-
gen loads in each major Bay basin relative
to the 1993 Progress loads, the Bay Agree-
ment loads, the loads under the current limit
of technology, and the "No Action" Option
loads.
  In all basins, the limit of technology phos-
phorus loads are less than the Bay Agree-
ment loads. For nitrogen loads, only the
point source dominated basins of Potomac,
James, and the West Shore have limit of
technology loads appreciably less than Bay
Agreement loads. For the nonpoint source
dominated basins of the Susquehanna,
Rappahannock/York, and the East Shore,
the limit of technology loads are essentially
equivalent to the Bay Agreement loads.
  The "No Action" scenario has increased
loads of phosphorus and nitrogen for all
basins.  The point source dominated basins
of Potomac, James, and West Shore, which
experience the greatest urbanization in the
Year 2000 scenario, have the greatest
increase in nutrient loads.
  The 1993 Progress Scenario charts phos-
phorus reductions as being on track for the
year 2000 goal. The nitrogen reductions
posted by the 1993 Progress Scenario are
meager, in fact in many basins the Bay Pro-
gram has lost ground due to load increases
from growth. In the West Shore basin,
where progress is being made in point
source nitrogen reductions, the 1993 reduc-
tion in nitrogen is significant. Reductions in
nitrogen are expected to increase in the
other basins as nitrogen reductions are
made in more facilities. Given the proven
technologies and increasing cost-effective-
ness of new biological nutrient removal pro-
cesses, the point source role in the clean-up
is becoming more significant than previously
thought.  Point source reductions are vital to
the restoration effort due to the inevitable
increase in point source loads to the Bay
with increases in population.
  Given the difficulty of controlling nonpoint
source nitrogen and questions about the
level of participation achievable under volun-
tary programs, the nonpoint source role in
the clean-up may be relatively more chal-
lenging than previously believed. Neverthe-
less, nonpoint source control actions are a
key component in all tributary strategies.
  A series of runs was conducted to test and
explore the Bay response to nutrient loading
scenarios.  These runs indicate that anoxia
in the Bay, brought  on by the excessive
nutrient loading, will be reduced by 20%
under the Bay Agreement nutrient caps.
The maximum nutrient reductions achieved
under the limit of technology loads produce
a 32% reduction in  anoxia.  Under the "No
Action" loads the Bay anoxia would increase
by about 120%.
  Phytoplankton growth is limited by the
availability of phosphorus in the upper Bay
and by nitrogen in the lower Bay, with a tran-
sition in the mid-Bay.  Consequently, load
reductions of both phosphorus and nitrogen
are necessary to reduce eutrophication in
the Bay. However,  reductions of phospho-
rus do not have as significant an effect on
bottom anoxia as do nitrogen reductions.
The impact of load reductions on bottom
water anoxia also varies geographically, with
the greatest anoxia reductions from load
reductions in the upper Bay and mid-Bay
areas.
  The calculated ocean input load com-
prises 30 to 35% of the total nitrogen load
and 45% to 65% of the total phosphorus
load to the Bay. Excluding these uncontrol-
lable ocean loads, the upper limit of  overall
total nutrient load reduction (current limit of
technology) is 20 to 30% for nitrogen and
30% to 55% for phosphorus.
  Feasible reductions in nutrient loadings of
about 20 - 30% for both nitrogen  and phos-
phorus result in an improvement in bottom
dissolved oxygen of about 0.2 to  0.4 mg/L
above  the base case summer average bot-
tom dissolved oxygen. Load reductions of
50% or more result in minimum dissolved
122
              NESC Annual Report - FY1994

-------
                               The Chesapeake Bay Program Simulation Models: A Multimedia Approach
                            Scenario legend: 1985 Base Case; Bay Agreement; Limit of Technology; "No Action"
                            Figure 4: Chesapeake Bay Total Phosphorus Loads by Scenario
                            Figure 5: Chesapeake Bay Total Nitrogen Loads by Scenario
-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
oxygen concentrations above 1 mg/L as
shown in Figure 6.

Looking to the Future
  The Bay Agreement reductions become
nutrient load caps after the year 2000.  Pop-
ulation continues to grow in the watershed,
so control efforts will have to  be strength-
ened to maintain progress already achieved
by assuring that the caps are maintained.
The Baywide caps are 104.3  and 7.00 mil-
lion kilograms per year for nitrogen and
phosphorus respectively. The Watershed
Model will annually track changes in water-
shed loads to ensure that the caps are not
exceeded in any basin.
  The nutrient reductions and caps will be
achieved through the implementation of trib-
utary strategies. The strategies examine the
mix of nutrient management controls for the
different tributaries and apply controls on
waste water treatment plants, agricultural
runoff, or effluent from urban areas. Alloca-
tion of the nutrient caps to each of ten major
tributary systems was achieved through the
application of the Watershed Model.  Exist-
ing, modified, or in some cases, new imple-
mentation mechanisms are applied in point
source programs,  nonpoint source pro-
grams, and in associated incentive or disin-
centive programs.  Most  importantly, citizen
action in the review, refinement, and imple-
mentation of the tributary strategies is key.
  Model refinements are now being devel-
oped for the Watershed Model. One such
modification is a finer scale segmentation of
the watershed, dividing it into 86 segments
to provide better spatial resolution of model
                    Figure 6: Water Quality Response to Nitrogen and Phosphorus Reductions
124
              NESC Annual Report - FY1994

-------
                           The Chesapeake Bay Program Simulation Models: A Multimedia Approach
  results. Another refinement that will be used
  in future Watershed Model runs is an
  updated GIS Chesapeake Bay Program
  Land Use Data Base.  This data base was
  developed using satellite imagery available
  through EPA's Environmental Monitoring
  and Assessment Program (EMAP) and
  NOAA's Coastal Change Assessment Pro-
  gram (CCAP). USDA Agricultural Census
  Data were also used to provide crop and
  animal information.  The Chesapeake Bay
  Program Land Use map is shown in Figure
  7, page 126.
    Further  model refinements are under way
 for the CBWQM. These refinements will
 examine the relationship among air deposi-
 tion, water quality and key living resource
 areas including SAV, benthos, and phy-
 toplankton/zooplankton. The refined model
 analysis of air deposition and water quality/
 living resource interactions will be com-
 pleted in 1997 as illustrated in Figure 8,
 page 127.
   Throughout the 1994-96 period the Bay
 Program will improve modeling of atmo-
 spheric loads. These activities will move
 toward estimates of the controllable atmo-
 spheric load delivered to the tidal Bay.
 Inherent in an improved understanding of
 atmospheric loads are estimates of the con-
 trollable and uncontrollable atmospheric
 sources, the boundaries of the Chesapeake
 airshed, and the transformations and losses
 of deposited atmospheric loads. The nutri-
 ent sources of point,  nonpoint, and air depo-
 sition will be simulated along with the
 attainable controls of these sources to
 develop a least cost pollutant reduction plan.
 Estimates of the growth of atmospheric
 sources of  nitrogen are also important
 because while current estimates of these
 loads show an initial reduction through
 implementation  of the Clean Air Act, atmo-
 spheric loads beyond the year 2005 will
 increase unless further controls are initiated.
 A major review of the goals and progress of
 the tributary strategies will occur in 1997.
 This evaluation will use the computer mod-
 els now under development to examine con-
  nections among watershed, airshed, estuary
  water quality, and living resources. Under-
  water grasses and benthic organisms will be
  simulated, providing tributary specific goals
  for nutrients based on habitat
  improvements.

  Conclusion

   The participants in the Chesapeake Bay
  Program have consistently marshalled the
  resources needed to continue the effort to
  restore and protect the Chesapeake Bay.
 The challenge today is to finance, plan,
 implement, and  construct the needed nutri-
 ent control measures by the year 2000 and
 then to maintain these loadings even as
 population and development continue to
 expand.
   We are making progress. Bay Program
 tracking of nutrient reductions shows a reduc-
 tion of phosphorus by 1992 of 1.86 million
 kilograms, an achievement of 48% of the
 phosphorus reduction goal.  A major factor
 in the phosphorus reductions was the phos-
 phate detergent ban, an excellent example
 of pollution prevention in the Bay basin.
 Reductions in nitrogen are coming more
 slowly.  By 1992, 2.95 million kilograms of
 nitrogen were reduced, achieving 9% of the
 nitrogen reduction goal.
  And there are encouraging developments.
 Recent advances in biological nutrient
 removal, supported by CBP funding, dem-
 onstrate that cost effective technologies for
 year-round nutrient removal can achieve
 significant reductions in nitrogen effluent at
 municipal Sewage Treatment Plants (STPs).
 Most tributary strategies contain biological
 nutrient removal as a key element. The chal-
 lenge ahead is to achieve similar technologi-
 cal  breakthroughs for controlling nutrients,
 particularly nitrogen, from nonpoint sources.
  Our improved understanding of atmo-
spheric nitrogen pollutants is also encourag-
ing. We have learned that about a quarter of
the nitrogen load  entering the Bay comes
from atmospheric sources. These sources
originated from the tailpipes of cars and from
NESC Annual Report - FY1994
                                    125

-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
           Legend: Emergent Wetland, Forest, Forested Urban, Herbaceous Urban, Low Intensity Urban, High Intensity Urban
                            Figure 7: Chesapeake Bay Program Land Use
126
                           NESC Annual Report - FY1994

-------
                            The Chesapeake Bay Program Simulation Models: A Multimedia Approach
        Figure 8: The Cross-media Watershed, Estuarine, Airshed, and Living Resource Model Now Being Developed
NESC Annual Report - FY1994
                                                   127

-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
the smokestacks of power plants and indus-
tries.  These sources may have originated in
the watershed or even from outside the
watershed boundaries. Accordingly, we
have learned to add a new word to our lexi-
con of Bay restoration - the airshed.  Not
enough is yet known about air deposition as
a source of nutrients and how to control it.
This is one reason why air deposition reduc-
tions are not included in the nutrient caps.
Although air reductions are not counted in
the caps, the Clean Air Act is expected to
reduce nitrogen entering the Bay by air dep-
osition and to provide another 4% reduction
in anoxia in addition to the 20% reduction in
anoxia brought about by the Bay Agree-
ment. Unfortunately, as with point sources, pop-
ulation increases will begin to erode gains
made in reducing the atmospheric source
after 2005.  Further understanding of atmo-
spheric deposition and how to control it will
be obtained by the inclusion of Chesapeake
airshed simulation models into the inte-
grated water quality models.
  Many challenges lie ahead. The Chesa-
peake Bay Program is about to enter a new
phase, which will focus first on tracking nutri-
ent reductions as we move toward the year
2000 goal, and then on maintenance of the
nutrient caps. The multimedia simulation
models now being developed will track
nutrient loads as the Chesapeake Basin
moves toward sustainable development. A
key element in attaining sustainable devel-
opment will be an analysis of the best and
least cost solutions to achieve and maintain
the nutrient caps.
  An administrative challenge will be to
develop consistent and reliable methods to
assess progress in implementing tributary
strategies and determining movement
towards the 40% nutrient reduction goal.
The Bay Program partners will complete
annual tracking of the nutrient load reduc-
tions through computer model progress sce-
narios. Coordinated and targeted
monitoring efforts will verify model predic-
tions and provide a real world measure of
water quality and living resource response to
our efforts.

Acknowledgment
  The author wishes to acknowledge the
state, regional, and federal members of the
Chesapeake Bay Program Modeling Sub-
committee for their essential guidance and
direction throughout the development of the
Chesapeake Bay Watershed Model, and in
particular Dr. Robert Thomann for his exper-
tise and experience gained through twenty
three years of water quality modeling work
on the Chesapeake and its tributaries.

References
Cerco, C.F., and T. Cole, 1993.  Application
   of the Three-Dimensional Eutrophica-
   tion Model CE-QUAL-ICM to Chesa-
   peake Bay.  U.S.  Corps of Engineers
   Waterways Experiment Station. Vicks-
   burg, MS.
DiToro, D.M., and J.J. Fitzpatrick, 1993.
   Chesapeake Bay Sediment Flux Model.
   U.S. Corps of Engineers Waterways
   Experiment Station. Vicksburg, MS.
Donigian, Jr., A.S., B.R. Bicknell, and J.L. Kittle,
   Jr., 1986.  Conversion of the Chesa-
   peake Bay Basin Model to HSPF Opera-
   tion. U.S. EPA Chesapeake Bay
   Program. Annapolis, MD.    ;
Donigian, Jr., A.S., and H.H. Davis, 1978.
   Users Manual for Agricultural Runoff
   Management (ARM) Model.  U.S. EPA
   Environmental Research Laboratory.
   Athens, GA.
Donigian, Jr., A.S., B.R. Bicknell, L.C.
   Linker, J. Hannawald, C.H. Chang, and
   R. Reynolds, 1990. Watershed Model
   Application to Calculate Bay Nutrient
   Loads: Phase I Findings and Recom-
   mendations. U.S. EPA Chesapeake Bay
   Program. Annapolis, MD.
Donigian, Jr., A.S., B.R. Bicknell, L.C.
   Linker, C.H. Chang, and R.  Reynolds,
   1994. Watershed Model Application to
   Calculate Bay  Nutrient Loads:  Phase II
   Findings and Recommendations. U.S.
128
              NESC Annual Report - FY1994

-------
                          The Chesapeake Bay Program Simulation Models: A Multimedia Approach
     EPA Chesapeake Bay Program. Annap-
     olis, MD.
  Gillelan, M.E., D. Haberman, G.B. Mackier-
     nan, J. Macknis, and H.W. Wells, Jr.,
     1983. Chesapeake Bay: A Framework
     for Action. U.S. EPA Chesapeake Bay
     Program.  Annapolis, MD.
  Hartigan, J.P., 1983. Chesapeake Bay
     Basin Model. Final Report prepared by
      the Northern Virginia Planning District
     Commission for the U.S. EPA Chesa-
     peake Bay Program. Annapolis, MD.
  Johnson, B.C., J.C. Imhoff, J.L. Kittle, Jr.,
     and A.S. Donigian, Jr., 1993. Hydrologic
     Simulation Program - Fortran (HSPF):
     User's Manual for Release 10.0. U.S.
     EPA Environmental Research Labora-
     tory. Athens, GA.
  Johnson, B.H., et al., 1991.  Users Guide for
     a Three-Dimensional Numerical Hydro-
     dynamic, Salinity, and Temperature
    Model of Chesapeake Bay. U.S. Corps
    of Engineers Waterways Experiment
    Station. Vicksburg, MS.
Linker, L.C., G.E. Stigall, and C.H. Chang,
    1994.  The Chesapeake Bay Water
     Quality Model. Accepted for publication
     in Environmental Science and Technol-
     ogy.
Thomann, R.V., J.R. Collier, A. Butt, E. Gas-
    man, and L.C. Linker, 1994.  Response
    of the Chesapeake Bay Water Quality
    Model to Loading Scenarios. U.S. EPA
    Chesapeake Bay Program. Annapolis,
    MD.
U.S. EPA Chesapeake Bay Program, 1987.
   A Steady-State Coupled Hydrodynamic/
   Water Quality Model of the Eutrophica-
   tion and Anoxia Process in the Chesa-
   peake Bay. Prepared by HydroQual, Inc.
   for the U.S. EPA Chesapeake Bay Pro-
   gram. Annapolis, MD.
 1 Lewis C. Linker, U.S. EPA, Chesapeake Bay Program.
 2 Katherine E. Bennett, Chesapeake Research Consortium.
NESC Annual Report - FY1994
                                  129

-------
The Chesapeake Bay Program Simulation Models: A Multimedia Approach
130                                                          NESC Annual Report - FY1994

-------
   Visualization Techniques for the Analysis and Display of
   Chemical, Physical, and Biological Data Across Regional
   Midwestern Watersheds'2
 Problem Description
   An important goal of the Federal Water
 Quality Act is that of defining conditions nec-
 essary to maintain the biological integrity in
 the nation's surface waters. Past studies
 have largely relied on either biosurvey or
 toxicological approaches to define water-
 shed health. However, these approaches do
 not supply comprehensive data for identify-
 ing relationships among physical, chemical
 and biological watershed variables. A series
 of joint studies by the Natural Resources
 Research Institute (NRRI) and the U.S. EPA
 Environmental Research Laboratory -
 Duluth  (ERL-D) relate instream physical
 (habitat), chemical (surface and substrate)
 and biological (macroinvertebrate) proper-
 ties to landscape and land use attributes
 (Johnson and Richards, 1992; Richards et
 al., 1993a; Richards et al. 1993b; Johnson
 et al., 1994; Richards and Host, 1994).  The
 intention of this research is to identify biocri-
 teria and ecocriteria for midwestern agricul-
 tural watersheds.  Important landscape
 attributes are land cover, topography, and
 soils. The degree to which the agricultural
 land was used was found to be a factor influ-
 encing the health of watersheds.
   Comparing physical, chemical, and bio-
 logical measurements to stream health can
 be difficult because the measured variables
 are often done on different temporal and
 spatial scales, with varied levels of accuracy
 and precision. In addition, it is difficult with
 large multi-dimensional data sets to identify
 and explain patterns and trends in the data
 outside  of relatively sophisticated multivari-
 ate procedures.  However, there is clearly a
 need to  present this data in a format that can
 be interpreted by policy-makers and the
 general  public. Data visualization tech-
 niques provide such a means of interpretation.
 Objectives
   Our objective is to analyze, model, and
 visualize the relationships found between
 landscape/land use attributes and water-
 shed physical, chemical, and biological
 properties in the Saginaw River basin. The
 analyses include calculation of spatial statis-
 tics from the existing Geographic Informa-
 tion System (GIS) databases. The modeling
 component focuses on predicting either pos-
 itive or negative changes in stream health as
 a function of landscape/land use changes.
 Data visualization is being used to rapidly
 synthesize model outputs in formats to
 enable researchers to gain further under-
 standings about watershed properties and
 interactions.  A second advantage of using
 visualization procedures is to place the
 watershed research results into formats eas-
 ily comprehensible to policy-makers and
 public review groups.

 Research Approach
  As part of our studies to develop biocrite-
 ria and ecocriteria for a midwestern agricul-
 tural watershed, a (database spanning the
 macroinvertebrate community, water chem-
 istry and physical habitat has been gathered
 representing over 60 stations distributed
 throughout the Saginaw River Basin (Rich-
 ards et al., 1993a, 1993b). We have also
 assembled GIS databases in an ARC/INFO
 format relating hydrology, landscape/land
 use, elevation, soils, geomorphology, and
climate features (Johnson and Richards,
 1992).  Results using redundancy, principal
 component, and multiple regression tech-
 niques have shown predictive relationships
 among the watershed database variables
 and landscape/land use features (Johnson
 et al., 1994).
NESC Annual Report - FY1994
                                    131

-------
Visualization Techniques for the Analysis and Display of Chemical, Physical, and Biological Data Across
Regional Midwestern Watersheds
Preliminary Results/Progress
  George Host, one of NRRI's principal
investigators on this project, attended a data
visualization training session at the National
Environmental Supercomputing Center
(NESC) during the summer of 1994.
Selected landscape data was uploaded onto
NESC workstations. Using the Cass River
watershed as a subset of the Saginaw River
basin database, features of geology,  land
use, elevation, and hydrology were trans-
ferred into the FAST visualization package.
Monthly temperature for the entire basin was
also incorporated into the AVS visualization
package.  These data were stored in a raster
format with a 1 km2 resolution.
  The modular programming capabilities of
AVS were used to generate an animation of
temperature changes across the Saginaw
basin from January through December.
Features revealed by this analysis included
lake effects on the basin and the influence of
topography, in terms of river drainage and
morainal systems, on seasonal temperatures.
These effects would be difficult to discern
from the flat-file databases that serve as
inputs to the visualization routines.
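
  The sketch below is not the AVS network built at the NESC; it is only a generic illustration, in Python, of the same workflow: twelve monthly 1-km2 temperature rasters are rendered to one image per month, which can then be strung together into an animation. The raster values here are synthetic.

    import numpy as np
    import matplotlib.pyplot as plt

    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
    rng = np.random.default_rng(1)
    for i, month in enumerate(months):
        # synthetic 120 x 160 cell raster (deg C) standing in for the basin grid
        grid = 10 * np.sin(np.pi * i / 11) + rng.normal(size=(120, 160))
        plt.figure(figsize=(4, 3))
        plt.imshow(grid, cmap="coolwarm", vmin=-10, vmax=25)
        plt.title(f"Saginaw basin surface temperature - {month}")
        plt.colorbar(label="deg C")
        plt.savefig(f"temp_{i:02d}_{month}.png", dpi=100)   # one animation frame per month
        plt.close()
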
  FAST was used to visualize geology, land
use, hydrology, and elevation features in the
Cass River watershed. A three-minute
video was developed representing an aerial
"fly-over" of the watershed. The basin's fea-
tures were first generated as a flat image,
then rotated to show elevation from a three-
dimensional perspective.  A flight path was
traced from the river's mouth, proceeding
upstream  along the main stem and into the
headwater areas. Elevation changes were
rendered both as distance from a surface
and with color. The fly-over video, synthe-
sized from 15 megabytes of elevation data,
gave a clear picture of the Cass River water-
shed landscape features including relative
size of the moraines, outwash plains, low-
lands, and drainage channels.
  These results will provide a template for
future analyses.  By developing the data
transfer and transformation routines neces-
sary for incorporating GIS data into the visu-
alization systems, we can readily subject
new data to similar analyses. Several new
tasks are planned for next year. Other
watersheds from our existing database
along with new fish community data
obtained during the 1994 field season will be
incorporated into the AVS system.  The
biotic data will be explored to quantify
trends.  This would be a new application for
visualization, which historically has been
used to interpret data that vary continuously
(e.g., temperature) rather than discretely
(e.g., biological population
data). We also intend to use supercomput-
ers for quantifying fragmentation, lacunarity,
and other landscape spatial statistics. Cur-
rently the routines necessary for generating
these statistics are extremely computer
intensive, taking literally weeks of worksta-
tion CPU time. By taking advantage  of the
parallel processing capabilities of supercom-
puting environments, we hope to greatly
reduce computational times and conse-
quently be able to analyze landscapes at
larger spatial scales.

References
Johnson, L.B. and C. Richards. 1992.
   Investigating  landscape influences on
   stream macroinvertebrate communi-
   ties. Water Resources Update 87:41-
   48.
Richards, C., G.E. Host, and J.W. Arthur.
   1993a. Identifications of predominant
   environmental factors structuring
   stream macroinvertebrate communities
   within a large agricultural catchment.
   Freshwater Biology 29: 285-294.
Richards, C., et al., 1993b. Landscape influ-
   ences on habitat, water chemistry, and
   macroinvertebrate  assemblages in mid-
   western stream ecosystems. 29 pp +
   appendices, October, ERL-Duluth
   Report Number 5815.
Johnson, L, C. Richards, G. Host, and J.
   Arthur.  1994. Landscape influences on
   water chemistry in  midwestern stream
132
              NESC Annual Report - FY1994

-------
     Visualization Techniques for the Analysis and Display of Chemical, Physical, and Biological Data Across
                                                                 Regional Midwestern Watersheds
      ecosystems. Internal report for publica-
      tion. 28 p.
  Richards, C. and G.E. Host. 1994.  Examin-
      ing land use influences on stream habi-
tats and macroinvertebrates: A GIS
Approach. Water Resources Bulletin
30:729-738.
  1 Q
-------
Visualization Techniques for the Analysis and Display of Chemical, Physical, and Biological Data Across
Regional Midwestern Watersheds
134                                                             NESC Annual Report - FY1994

-------
                                  " 1 •', i -i. - -W • "*•• ,'.-:
   Estimation of Global Climate Change Impacts on Lake
   and Stream Environmental Conditions and Fishery
    Resources 1,2,3
 Abstract
   Mathematical models have been devel-
 oped for estimating the effects of global cli-
 mate change on lake and stream thermal
 structure, dissolved oxygen concentrations
 and on fishery resources. These models
 require the development of lake and stream
 classification systems to define waterbody
 types, which in turn require the availability of
 extensive regional databases for these
 resources. Fishery resource response pre-
 dictions require the development of large
 field temperature and fish distribution data-
 bases from which species and guild thermal
 requirements can be derived. Supercom-
 puting capabilities are being utilized in the
 development and manipulation of the large
 databases to integrate the various data and
 program modules, and to make the calcula-
 tions required to perform regional impact
 estimates.

 EPA Research Objectives
  According to a 1992 technical review of
 several general circulation models of ocean
 atmosphere heat budgets by the Intergov-
 ernmental Panel on Climate Change, dou-
 bling atmospheric concentrations of CO2
 could increase global mean air temperatures
 by 1.5 to 4.5°C in the next 50 years. This is
 likely to have many environmental conse-
 quences, for example, changes in water
 temperature and dissolved oxygen concen-
 trations which, in turn, are likely to affect fish
 populations. The fact that such changes
 would occur many times faster than have
 occurred previously has resulted in requests
 for information on effects and response
 options to the climate changes. The Envi-
 ronmental  Research Laboratory - Duluth
 and the University of Minnesota have initi-
 ated a cooperative study to determine the
 impacts of global warming on lake and
 stream environmental conditions and fishery
 resources. In order to conduct this study,
 fish thermal requirements need to be esti-
 mated using a historical fish presence/tem-
 perature record database.

 Background/Approach
   The Fish and Temperature Database
 Management System (FTDMS) is a national
 database system that spatially and tempo-
 rally associates discrete fish sample records
 with water temperature data. Recent efforts
 have concentrated on the expansion of data-
 base content by assembling information
 from a multitude of sources, including Fed-
 eral agencies (i.e., EPA/STORET and
 USGS) and private museum and university
 collections. The assimilation of data from
  many sources necessitates automated spatial
  and temporal matching of fish records with
  water temperature data.
 Prior versions of several program modules
 have been converted from a PC database
  platform to C and have been run on the
  National Environmental Supercomputing
  Center's (NESC) Cray.
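
  The production modules are written in C for the Cray, as noted above; the small Python sketch below only illustrates the matching idea with invented records: each fish collection is paired with the temperature observations from the same station in the same calendar week.

    from datetime import date

    def iso_week(d):
        year, week, _ = d.isocalendar()
        return (year, week)

    fish_records = [                       # hypothetical (station, date, species)
        ("MN-001", date(1988, 7, 12), "brook trout"),
        ("MN-002", date(1990, 8, 3), "smallmouth bass"),
    ]
    temperature_records = [                # hypothetical (station, date, temperature C)
        ("MN-001", date(1988, 7, 10), 18.5),
        ("MN-001", date(1988, 7, 14), 19.2),
        ("MN-002", date(1990, 8, 1), 24.0),
    ]

    for station, fdate, species in fish_records:
        temps = [t for s, tdate, t in temperature_records
                 if s == station and iso_week(tdate) == iso_week(fdate)]
        if temps:
            print(f"{species}: weekly mean {sum(temps) / len(temps):.1f} C at {station}")
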

 Scientific Accomplishments
  A modeling approach has been developed
 for estimating the effects of global warming
 on the environmental conditions of lakes and
 streams and fisheries resources. The initial
 phase of the work was partially supported by
 EPA Office of Policy, Planning, and Evalua-
tion. Results from the studies have  yielded
data that are currently being applied in an
economic impact analysis of global climate
impacts on the U.S.  One ongoing program
is a component of the Committee on Envi-
ronmental and Natural Resources (CENR),
Global Climate Research Program. At last
count, over twenty project reports, most in
NESC Annual Report - FY1994
                                   135

-------
Estimation of Global Climate Change Impacts on Lake and Stream Environmental Conditions and Fishery
Resources
the form of technical journal articles, had
been produced by the project.

Results
  The program module that calculates raw
temperature data into weekly mean values
has been run on the NESC's Cray for 28
states. The results have been used to cal-
culate the maximum (warmest throughout
the year) 95th percentile temperature where
a fish species was collected for over 50 spe-
cies of North American freshwater fish. This
temperature is used as an approximation of
the lethal limit for that species and allows us
to estimate the distribution of fish after glo-
bal climate change. The data has also been
integrated with GIS to visually examine the
relationship between fish presence and
maximum stream temperatures.
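
  A simplified sketch of that statistic is shown below (the real modules, again, run in C on the Cray and handle far larger datasets): for one species, the weekly mean temperatures at every site and week where it was collected are pooled, and the 95th percentile of that distribution is reported as the approximate upper thermal limit. The temperatures are invented.

    import numpy as np

    # hypothetical weekly mean temperatures (deg C) where one species was collected
    weekly_means = np.array([14.2, 18.7, 21.3, 19.9, 22.8, 17.5, 20.4, 23.1, 16.8, 21.9])

    upper_limit = np.percentile(weekly_means, 95)
    print(f"approximate upper thermal limit: {upper_limit:.1f} C")
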
  The speed with which the weekly mean
temperatures are calculated on the super-
computer makes it possible to perform the
temporal matching in a number of different
ways. This feature has the potential for
improving estimates of thermal requirements
for fish.  For example, the southern range of
distribution of cool-water fish is generally
near 40° north latitude. Maximum weekly
mean values from south of this parallel
would be expected to provide a better esti-
mate of thermal tolerances than values from
all of North America. Temporal matching cri-
teria can be restricted in other ways (fish
and temperatures sampled in the same year,
season, or month). Comparing "monthly"
and "yearly" datasets provides a means of
examining the importance of the temporal
relationship of temperature and fish records.

Future Objectives
  Currently, the weekly mean temperature is
used to describe surface water conditions
for the week in which a fish sample was
taken. The weekly maximum temperature,
daily means and daily maximums have also
been calculated and matched to fish collec-
tions to re-calculate the maximum 95th per-
centile temperature.  These values provide a
unique and valuable look at the relationship
between various expressions of a fish's ther-
mal regime and its geographic distribution.
Most laboratory-derived measures of ther-
mal tolerance, the source of most past tem-
perature effects information, have employed
constant temperature exposure conditions
when short-term peaks might be equally or
more important in nature. The relationship
of cold temperatures to the distribution of
warm-water fishes has only been examined
superficially. The data storage and manipu-
lation requirements and modeling demands
will be greatly increased to accommodate
these additional analyses. Research is
underway to incorporate functional ecosys-
tem responses (e.g. system productivity)
into models projecting climate change
impacts.  The area of ecological processes
and effects research as related to aquatic
ecosystems is large and can benefit greatly
from enhanced computational capabilities.

Bibliography
Hondzo, M., and H.G. Stefan. 1993.
   Regional water temperature characteris-
   tics of lakes  subjected to climate
   change. Climate Change 24:187-211.
Stefan, H.G., M. Hondzo, J.G. Eaton, and
   J.H. McCormick.  1994. Predicted effects
   of climate change on fishes in  Minnesota
   lakes. Canadian Spec. Publ. Fisheries
   and Aquatic Sci. 121.
Stefan, H.G., M. Hondzo, J.G. Eaton, and
   J.H. McCormick. 1994. Validation of fish
   habitat models for lakes. Ecological
   Modeling.
Stefan, H.G., and B.A. Sinokrot. 1993. Pro-
   jected global climate change impact on
   water temperature in five north central
   U.S. streams. Climate Change 24:353-
   381.
136
              NESC Annual Report - FY1994

-------
  Estimation of Global Climate Change Impacts on Lake and Stream Environmental Conditions and Fishery
                                                                                          Resources

  1 J.G. Eaton, U.S. EPA, ERL-Duluth, 6201 Congdon Blvd., Duluth, MN 55804 218-720-5357.
  2 H.G. Stefan, St. Anthony Falls Hydraulic Lab, Dept. Civil & Mineral Eng., Mississippi R. & 3rd Ave. S.E., Minneapolis,

  3 R.M. Scheller, Science Applications International Corporation, ERL-Duluth, 6201 Congdon Blvd., Duluth, MN 55804
NESC Annual Report-FY1994                                           •>                       137

-------
Estimation of Global Climate Change Impacts on Lake and Stream Environmental Conditions and Fishery
Resources                                                                    i
138                                                           NESC Annual Report - FY1994

-------
   Integration of Computational and Theoretical Chemistry
   with Mechanistically-Based Predictive Ecotoxicology
    Modeling 1,2,3,4
 Background
   In the field of environmental toxicology,
 and especially in aquatic toxicology, a vari-
 ety of computer-based models have been
 developed that are scientifically-credible
 tools for use in predicting and characterizing
 the ecological effect and fate of chemicals
 when little or no  empirical data is available.
 These models are used in ecological risk
 assessments by EPA as well as other Fed-
 eral and State agencies. In some instances
 these models are used to predict the effects
 of new chemicals being proposed for
 release in the environment (e.g., under
 TSCA approximately 2,000 new chemicals
 per year must be reviewed), while in other
 cases the models are used to diagnose pos-
 sible cause and effect relationships for
 chemicals at impacted sites (e.g., CERCLA,
 CAAA, CWA). Diagnostic risk assessments
 are especially challenging when it is realized
 that over 50,000  chemicals exist in commer-
 cial production; however, rudimentary toxic-
 ity data is available only for about 10% of the
 compounds. Finally, predictive toxicology
 models are also used to help predict the
 future outcome of ecosystems, based on dif-
 ferent environmental management scenar-
 ios (e.g., CERCLA,  CAAA, CWA).

 Research
   Reports from the early  1980s, in both the
 U.S. and the Netherlands, established that
 the majority of industrial organic chemicals
 (excluding pesticides and pharmaceutical
 agents) elicit their acute toxic effects through
 a narcosis mechanism. With the develop-
 ment of initial toxicity data sets, numerous
 QSAR investigations were undertaken, and
 the findings of Veith et al.  (1983) and Konne-
 mann (1981) established that the potency of
 narcotics was entirely dependent upon
 xenobiotic hydrophobicity. With subsequent
 experimental studies and modeling efforts it
 has been generally accepted that the rela-
 tionships reported by Veith et al. (1983) and
 Konnemann (1981) represent the minimum,
 or baseline, toxicity that a compound can
 elicit in the absence of a more specific mode
 of toxic action.  With additional study it
 became clear that there were subclasses of
 narcotics - more potent than would be pre-
 dicted from the baseline narcosis QSARs -
 that could be classified by either acute
 potency and/or physiological and behavioral
 characteristics of the narcosis response.
 Further, it was obvious that some industrial
 chemicals were significantly more toxic than
 would  be predicted from narcosis QSARs
 because they were capable of acting as oxi-
 dative  phosphoryla.tion uncouplers, respira-
 tory chain blockers, or other more specific
 mechanisms (e.g., see Bradbury, 1994).
   Although many modes of toxic action can
 be reasonably predicted using non-elec-
  tronic descriptors, methods for predicting the
  likelihood that a compound will act as a reactive
  toxicant have not been well developed.  The specific issue of
 classifying reactive toxicants, and subse-
 quently predicting their acute toxicity, has
 been an area of interest because these
 compounds are typically among the most
 potent  industrial chemicals; their identifica-
 tion also raises concern over possible
 chronic effects.  Clearly, the ability to  predict
 chemical reactivity requires the use of
 QSARs that employ stereoelectronic
  descriptors.

 Research Objectives
  To reduce uncertainties in ecological risk
assessments of chemical stressors, a sec-
ond generation of advanced predictive mod-
eling techniques is required. The second
NESC Annual Report - FY1994
                                   139

-------
Integration of Computational and Theoretical Chemistry with Mechanistically-Based Predictive
Ecotoxicology Modeling
generation models must be based on funda-
mental principles of chemistry, biochemistry
and toxicology, and designed in such a way
that will efficiently assess the thousands of
chemicals in commercial use. Research
must be directed towards developing mech-
anistically-based QSARs for reactive chemi-
cals for the purpose of improving toxicity
predictions, estimates of metabolism, and
ultimately, chemical similarity.
  Studies have been undertaken to explore
specific toxicological processes and  associ-
ated chemical reactivity parameters that will
establish a mechanistically-based approach
for screening compounds and stereoelec-
tronic indices for QSAR models, thereby
focusing future three-dimensional calcula-
tions. Based on hypotheses concerning
toxic mechanisms and metabolic activation
pathways, several studies have been con-
ducted to explore the use of stereoelectronic
descriptors and to identify potentially reac-
tive toxicants. Descriptors of soft electrophi-
licity and one electron reduction potential
have been calculated for a diverse group of
aromatic compounds and used to discrimi-
nate the narcosis mode(s) of toxic action
from mechanisms associated with covalent
binding to soft nucleophiles in biomacromol-
ecules and oxidative stress, respectively.
These studies are providing some insights
into ways to develop a mechanistically-
based strategy for selecting and using elec-
tronic indices in QSARs for biochemical and
cellular toxicity.

Approach and Results to Date
  Recently, a series of studies done  at the
Environmental Research Laboratory in
Duluth, MN  (ERL-Duluth), have described
exploratory approaches through which
diverse groups of compounds in the  ERL-
Duluth fathead minnow acute mode of action
database, are being identified using global
and local measures of reactivity. These
studies deal with modes of toxic action asso-
ciated with the bonding of soft electrophiles
and oxidative stress. Through these initial
investigations - generally based on semi-
empirical methods - approaches to predict-
ing soft electrophilicity (Mekenyan et al.,
1993; Mekenyan and Veith, 1994; Veith and
enyan and Veith, 1994; Veith and Mekenyan, 1993), photo-activation potential
(Mekenyan et al., 1994a,b), and one electron
reduction potential (Bradbury et al., 1994)
have been established. In some cases,
these descriptors are being related to mode
of toxic action classifications and potency.
The results from these studies are briefly
discussed below and suggest that a system-
atic approach can be employed whereby
specific stereoelectronic parameters can be
calculated from structure and used to predict
toxic responses associated with reactive
chemicals.

Photoactivation of Polycyclic Aromatic
Hydrocarbons (PAHs)
  Research with a variety of aquatic species
has shown that while PAHs generally are not
toxic in conventional laboratory tests, many
are extremely toxic in the presence of sun-
light.  The increased toxicity is attributed to
the excitation of the parent PAH to an acti-
vated intermediate through the absorption of
UV radiation and the subsequent generation
of reactive oxygen species when the acti-
vated species returns to the ground state.
Photo-induced toxicity is the result of com-
peting processes, such as stability and light
absorbance which interact to  produce a
complex, multilinear relationship between
toxicity and chemical structure. Initial
research undertaken through ERL-Duluth
(Mekenyan et al., 1994a) established that for
a series of 28 compounds a measure of
energy stabilization in the ground-state
(the HOMO-LUMO gap) provided a useful
index to explain persistence, light absorption
and photo-induced toxicity. An additional
study was then conducted, in which 'HOMO-
LUMO gaps' for excited states of PAHs were
determined and also related to phototoxicity
(Mekenyan et al., 1994b).

One Electron Reduction Potentials
  Benzoquinones, within an appropriate one
electron reduction potential range, can be
140
              NESC Annual Report - FY1994

-------
            Integration of Computational and Theoretical Chemistry with Mechanistically-Based Predictive
                                                                      Ecotoxicology Modeling
  reduced by flavoproteins to semiquinones
  that subsequently react with molecular oxy-
  gen to form superoxide anion and the regen-
  eration of the parent compounds. This
  redox cycling, a form of futile metabolism,
  produces reactive oxygen species and
  depletes the reducing equivalents of cells
  without concomitant energy production. The
  resulting cytotoxic effects have been termed
  'oxidative stress.'  Because the ability of a
  quinone to undergo redox cycling is related
  to its one electron reduction potential, a
  study with eight benzoquinones was con-
  ducted to determine if this property can be
  estimated from structure. The results of this
  study suggest that one electron reduction
  potentials between +99 mV and -240 mV
 can be estimated by several electronic indi-
 ces obtained from semi-empirical quantum
 chemistry models.
   QSARs for reduction potential were
 derived from electronic properties of the par-
 ent quinones as well as the semiquinone
 radical anions.  Charge delocalization was
 judged to be an important factor in interpret-
 ing relationships between molecular descrip-
 tors and reduction potentials. From the set
 of eight benzoquinones, it is apparent that
 EHOMO for the parent compound, the
 charge on the carbonyl atoms of the parent
 molecules and localization of electron den-
 sity assessed by aromaticity indices, were
 molecular descriptors for estimating poten-
 tial. Additional studies with naphthoquino-
 nes, nitro aromatics, and phenols have also
 indicated that descriptors associated with
 charge delocalization are critical (Bradbury
et al., 1994).
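
  A generic sketch of fitting such a QSAR is given below; it is not the ERL-Duluth model, and every number is invented. It simply regresses a one-electron reduction potential on the kinds of descriptors named above (EHOMO, carbonyl charge, and an aromaticity index) by ordinary least squares.

    import numpy as np

    # rows: hypothetical benzoquinone analogues
    # columns: EHOMO (eV), charge on carbonyl carbons, aromaticity index
    X = np.array([[-9.8, -0.31, 0.62],
                  [-9.5, -0.28, 0.58],
                  [-9.9, -0.33, 0.65],
                  [-9.3, -0.27, 0.55],
                  [-9.6, -0.30, 0.60],
                  [-9.7, -0.29, 0.63]])
    y = np.array([55.0, 10.0, 80.0, -120.0, -20.0, 30.0])   # reduction potentials (mV), invented

    A = np.column_stack([X, np.ones(len(y))])                # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    print("coefficients (EHOMO, carbonyl charge, aromaticity, intercept):", coef.round(1))
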

 Soft Electrophilicity
  Initial insights reported by our research
  group indicated that for α,β-unsaturated
  alcohols, aldehydes, and ketones and
  related allene derivatives, descriptors of soft
  electrophilicity, such as the average superdelo-
  calizability (Sav), could differentiate those
  compounds classified as narcotics from
  those classified as reactive toxicants (Mek-
  enyan et al., 1993). Isoelectrophilic
  windows were established that separated
  alcohols that act as narcotics (average Sav
  values of approximately 0.285) from reactive
  aldehydes and ketones (average Sav values
  of approximately 0.305). It was also
  reported in subsequent investigations (Mek-
  enyan and Veith,  1994; Veith and Meken-
  yan, 1993) with a larger set of substituted
  benzenes, phenols, and anilines (identified
  as narcotics-l, narcotics-ll, oxidative phos-
  phorylation uncouplers, and proelectro-
  philes/electrophiles in the ERL-Duluth mode
  of action knowledge base) that there is a
  tendency for modes of toxic action to cluster
  according to soft electrophilicity.  In this
  case, narcotic-l and narcotic-ll compounds
  had average  S^v values of 0.280, and com-
 pounds typically classified as uncouplers or
 proelectrophiles/electrophiles had average
  S^v values of 0.345.
  The clustering of uncouplers and electrophiles within a common range of Sav suggests
that highly reactive compounds with a dissociable proton are capable of disrupting
oxidative phosphorylation, while those with-
 out dissociable protons produce toxic
 responses through covalent binding with soft
 nucleophiles within biomacromolecules
 (Veith and Mekenyan, 1993). This classifi-
 cation is, of course, not applicable for dis-
 cerning hard nucleophiles.  Within the soft
 isoelectrophilic windows, QSARs based on
 log P have been established. These results
 have led to the suggestion (Mekenyan and
 Veith, 1994; Veith and Mekenyan,  1993) that
 the acute toxicity of chemicals can be
 described by a nearly orthogonal relation-
 ship between molecular descriptors for
 hydrophobicity and soft electrophilicity.
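The isoelectrophilic-window idea can be illustrated with a simple threshold rule on the
average superdelocalizability. In the sketch below the compounds and Sav values are
placeholders, and the 0.30 cutoff is an assumption chosen between the cluster means
quoted above, not a boundary taken from the studies.

    # Hedged sketch of classification by isoelectrophilic window.
    # The cutoff and the compound/Sav values are assumptions for
    # illustration only.
    NARCOTIC_WINDOW_UPPER = 0.30   # assumed boundary, not from the study

    compounds = {
        "substituted benzene A": 0.281,
        "phenol B": 0.279,
        "alpha,beta-unsaturated ketone C": 0.306,
        "aniline D": 0.344,
    }

    for name, s_av in compounds.items():
        mode = ("narcotic-type" if s_av < NARCOTIC_WINDOW_UPPER
                else "reactive (electrophile/uncoupler)")
        print(f"{name}: Sav = {s_av:.3f} -> {mode}")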

 Future Objectives
  Although the results of studies completed to date that address predictions associated
with soft electrophilicity, photoactivation potential, and one electron reduction potential
are encouraging, several important issues remain unresolved. Future efforts will involve
a systematic extension of the hypotheses developed thus far and will
permit further examination of several impor-
tant topics. Specifically, on-going investiga-
tions include: (1) an assessment of the
predictive capability of current models using
ab initio, rather than semi-empirical methods
and (2) an assessment of whether con-
former distributions rather than single-opti-
mized geometries are more appropriate for
predicting toxic modes of action and
potency.
  These studies are essential for the
advancement of the current state of the sci-
ence of predictive ecotoxicology. Efforts
associated with the first objective will permit
a further substantiation of preliminary find-
ings concerning important modes of toxic
action associated with reactive chemicals.  If
further substantiated by additional research,
new QSARs could subsequently be included
in an ERL-Duluth supported QSAR system (ASTER), which is used throughout the EPA
and by other Federal, state, and international government organizations (Russom et al.,
1991). Because
the calculation of optimized geometries can
be CPU intensive and because the assump-
tion that chemicals behave as a single opti-
mized geometry in biological systems may
not be valid, a study has also been under-
taken to test hypotheses that distributions of
probable conformers, rather than single opti-
mized structures, are adequate for some
QSAR applications (Mekenyan et al.,
1994c). This second objective of our ongo-
ing investigations will be addressed through
semi-empirical calculations of stereoelec-
tronic parameters associated with photoacti-
vation, one electron reduction potentials and
soft electrophilicity.  By using the ERL-
Duluth fathead minnow database, which
contains toxicity data for approximately 650
chemicals, these studies are contributing to
an overall assessment of CPU-effective
strategies to calculate stereoelectronic properties for large datasets of industrial
chemicals that require ecotoxicological evaluations.
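The conformer-distribution question can be illustrated by Boltzmann-averaging a descriptor
over several conformers rather than taking the value for the single lowest-energy geometry.
The relative energies and descriptor values in the sketch below are invented for
illustration only.

    # Hedged sketch: Boltzmann-weighted averaging of a descriptor over
    # conformers.  Energies (kcal/mol, relative) and descriptor values
    # are placeholders, not data from the study.
    import math

    R_KCAL = 0.0019872  # gas constant, kcal/(mol*K)
    T = 298.15          # temperature, K

    # (relative energy in kcal/mol, descriptor value for that conformer)
    conformers = [(0.0, 0.281), (0.4, 0.297), (0.9, 0.310)]

    weights = [math.exp(-e / (R_KCAL * T)) for e, _ in conformers]
    z = sum(weights)
    avg_descriptor = sum(w * d for w, (_, d) in zip(weights, conformers)) / z

    print(f"single-geometry descriptor : {conformers[0][1]:.3f}")
    print(f"Boltzmann-averaged value   : {avg_descriptor:.3f}")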
References
Bradbury, S.P. 1994. Predicting modes of
   toxic action from chemical structure: An
   overview. SAR QSAR Environ. Res. 2, 89-104.
Bradbury, S.P., Mekenyan, O.G., Veith, G.D.
   and Kamenska, V. 1994. SAR models for
   futile metabolism: II. One-electron
   reduction of quinones, phenols and
   nitrobenzenes.  SAR QSAR Environ.
   Res. (Submitted).
Könemann, H. 1981. Quantitative struc-
   ture-activity relationships in fish toxicity
   studies. Part 1:  relationship for industrial
   pollutants. Toxicology 19, 209-221.
Mekenyan, O.G., Veith,  G.D., Bradbury, S.P.
   and Russom, C.L. 1993. Structure-toxic-
   ity relationship for α,β-unsaturated alco-
   hols in fish. Quant. Struct.-Act. Relat. 12,
   132-136.
Mekenyan, O.G. and Veith, G.D.  1994. The
   electronic factor in QSAR: MO-parame-
   ters, competing interactions, reactivity
   and toxicity. SAR QSAR Environ. Res.  2,
   129-143.
Mekenyan, O.G., Ankley, G.T., Veith, G.D.
   and Call, D.J. 1994a. QSARs for photo-
   induced toxicity: I. Acute lethality of
   polycyclic aromatic hydrocarbons to
   Daphnia magna. Chemosphere 28, 567-582.
Mekenyan, O.G., Ankley, G.T., Veith, G.D.
   and Call, D.J. 1994b. QSAR estimates
   of excited states and photoinduced
   acute toxicity of polycyclic aromatic
   hydrocarbons. SAR QSAR Environ. Res.
   (Submitted).
Mekenyan, O.G., Ivanov, J.M., Veith, G.D.
   and Bradbury, S.P. 1994c. Dynamic
   QSAR:  A new search for active confor-
   mations and significant stereoelectronic
   indices. Quant. Struct.-Act. Relat. (In press).
Russom, C.L., Anderson, E.B., Greenwood,
    B.E. and Pilli, A. 1991. ASTER: An inte-
    gration of the AQUIRE data base and
    the QSAR system for use in ecological
    risk assessments. In: J.L.M. Hermens
    and A. Opperhuizen (Eds.), QSAR in
    Environmental Toxicology-IV. Elsevier,
    Amsterdam, pp. 667-670.
Veith, G.D., Call, D.J. and Brooke, L.T.
     1983. Structure-Toxicity relationships for
    the fathead minnow, Pimephales
    promelas: Narcotic industrial chemi-
    cals. Can. J. Fish. Aquat. Sci. 40, 743-
    748.
Veith, G.D. and Mekenyan, O.G.  1993. A
    QSAR approach for estimating the
    aquatic toxicity of soft electrophiles.
    Quant. Struct.-Act. Relat. 12, 349-356.
1 S.P. Bradbury and G.D. Veith, U.S. EPA, Environmental Research Laboratory-Duluth, Duluth, MN.
2 O. Mekenyan, Lake Superior Research Institute, University of Wisconsin-Superior, Superior, WI.
3 R. Hunter, Natural Resources Research Institute, University of Minnesota-Duluth, Duluth, MN.
4 E. Anderson and E. Heide, Computer Sciences Corporation, Duluth, MN.
   Evaluation and Refinement of the Littoral Ecosystem Risk
   Assessment Model (LERAM) Year Two1
 Abstract
   The goal of this research is to define the
 domain of application of the Littoral Ecosys-
 tem Risk Assessment Model (LERAM) using
 a number of littoral enclosure and pond field
 studies.  These field studies tested the eco-
 logical fate and effects of different types of
 pesticides and toxic chemicals at different
 times and in different geographic regions.
 Once the best uses for LERAM have been
 delineated and its ecological risk assess-
 ment capabilities defined, it will be made
 available to the regulatory and risk manage-
 ment communities. It is anticipated that
 LERAM will be used by the Office of Preven-
 tion, Pesticides and Toxic Substances
 (OPPTS) during the registration of pesti-
 cides and industrial chemicals and by risk
 managers for predicting changes in ecologi-
 cal risk associated with watershed manage-
 ment options.
   The NESC supercomputing facility is
 being used for three tasks:
 1)  Calibrating the parameters of the
    LERAM. This is done using an optimiza-
    tion algorithm to minimize a function that
    measures the distance between the
    LERAM simulation of population bio-
    masses and biomass data from a littoral
    ecosystem.
 2)  Creating a user-friendly interface for
    LERAM that will allow the user to make
    predictions of ecological risk to a littoral
    ecosystem from exposure to specified
    stressors and display the results in real
    time.
3)  Evaluating LERAM by simulating the
    effects of a number of concentrations of
    selected chemical stressors and com-
    paring these simulations to field data.
    The supercomputing environment is
    used to run 500 (or more) Monte Carlo
    iterations of the model at each treatment
    concentration for the purpose of sensitiv-
    ity and uncertainty analysis. A minimal
    sketch of the calibration and Monte
    Carlo steps appears after this list.
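The following sketch illustrates, with a toy growth model rather than LERAM itself, the two
computational steps named above: minimizing a distance between simulated and observed
biomasses to calibrate parameters, and repeating stochastic simulations in a Monte Carlo
loop. The data, the model form, and the use of the Nelder-Mead optimizer are all
assumptions made for illustration.

    # Hedged sketch of (1) parameter calibration by least-squares
    # minimization and (2) Monte Carlo iterations.  The "model" is a
    # stand-in, not LERAM.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    observed_biomass = np.array([1.0, 1.8, 2.9, 3.6, 4.1])   # placeholder data
    times = np.arange(len(observed_biomass))

    def simulate(params, noise=0.0):
        """Toy biomass trajectory: logistic-like growth with two parameters."""
        growth, capacity = params
        traj = capacity / (1.0 + np.exp(-growth * (times - 2.0)))
        return traj + noise * rng.standard_normal(times.shape)

    def distance(params):
        """Objective for calibration: sum of squared differences."""
        return float(np.sum((simulate(params) - observed_biomass) ** 2))

    # Step 1: calibrate the parameters.
    fit = minimize(distance, x0=[1.0, 4.0], method="Nelder-Mead")
    print("calibrated parameters:", fit.x)

    # Step 3: Monte Carlo iterations at one "treatment" (here just noise).
    runs = np.array([simulate(fit.x, noise=0.2) for _ in range(500)])
    print("mean final biomass:", runs[:, -1].mean(), "+/-", runs[:, -1].std())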

 Research Objectives
   At the present time, laboratory tests and
 mathematical models form the basis of the
 ecological risk assessment paradigm used
 by the U.S. EPA. Until the early 1980s, sin-
 gle species tests were used almost exclu-
 sively to provide hazard assessments of
 chemicals. At that time, the National Acad-
 emy of Sciences (1981) and others (Levin
 and Kimball, 1984) documented the need for
 supplementary information from field tests,
 microcosm experiments, and mathematical
 models to better assess chemical hazards
 for different geographic regions, seasons,
 application methods, spatial scales, and lev-
 els of biological organization. Along with the
increased interest in using field tests, microcosm experiments, and mathematical models
to predict system responses to
 perturbations, it became apparent that little
 was known about the accuracy of predic-
 tions made by these techniques. EPA's
 objectives for the research proposed here
 include evaluating and refining one ecologi-
 cal risk assessment technique using field
 data from controlled experiments in natural
systems.

 Background / Approach
  With this in mind, Lake Superior Research Institute (LSRI) and U.S. EPA Environmental
Research Laboratory - Duluth (ERL-Duluth) researchers began to develop LERAM in
June 1989. LERAM is a bioenergetic eco-
system effects model that links single spe-
cies toxicity data to a bioenergetic model of
the trophic structure of an ecosystem in
order to simulate community and ecosystem
level effects of chemical stressors. It uses
Monte Carlo iterations of these simulations
in order to calculate probable ecological risk
assessments of chemical stressors. To
date, LSRI and ERL-D researchers have
developed LERAM to the point where it
models the unperturbed behavior of a littoral
ecosystem (i.e., the "behavior" of control
communities), and the response of that sys-
tem to the insecticide chlorpyrifos,  with a
high degree of both accuracy and precision
(Hanratty and Stay, 1994).  Additionally,
LERAM's ability to predict the ecological
effects of the insect growth regulator
diflubenzuron has also been evaluated
(Hanratty and Liber, 1994). This evaluation
revealed the need for some minor improve-
ments to the model.
  During fiscal year 1994, LERAM  has been
modified to improve its efficiency on a vector
processor, to increase the flexibility of the
code for representing different food webs
and ecosystems, and to increase the stabil-
ity and accuracy of its predictions. After the
changes were completed, the model param-
eters were recalibrated to give the  best pos-
sible representations of the ecosystems in
the control littoral enclosures of two different
field studies performed in Duluth, Minnesota
(Rowan, 1990). These two representations of littoral ecosystems could then be used to
predict the effects of toxic chemicals, to examine the impact of the introduction of an
exotic species, or be modified to represent a similar ecosystem with a somewhat different
set of organisms and population structure (e.g., channel catfish were added to the
parameters that best represented the Duluth littoral enclosures in 1988) in order to
simulate the littoral enclosure system used to measure the effects of trifluralin during
the 1994 field season. Current work devoted to evaluating
the improvement of LERAM's ability to pre-
dict ecological effects using data from a lit-
toral enclosure study of the fate and effects
of nonylphenol (Liber, pers. comm.) appears
quite promising.  However,  further work is
required to complete the evaluation of these
refinements of the model, to validate its out-
put using data from littoral enclosure studies
in other geographic regions, and to develop
a more user-friendly interface for the model.

Comparison with Pre-Cray Results
  When the calibration program is run on
other computers, such as a DEC VAX or a
486 PC, it takes so long that the program
becomes impractical to use. The model
itself can be run on other computers, but at
the loss of the advantage of being able to sit
and wait about a minute for the results — an
advantage that allows the user to concen-
trate on the ecological risk assessment or
modeling problem at hand. When the simu-
lations are performed on other computers,
the model can take any where from twenty
minutes to three hours, depending on the
computer and the number of Monte Carlo
iterations performed, so the user must find
something else to do while waiting for the
results.

Future Objectives
  The tasks proposed for evaluating and
refining LERAM using the NESC supercom-
puting facilities are as follows:

Fiscal year  1995:
1) Create a user-friendly interface for
   LERAM.
2) Calibrate the LERAM parameters using
   the control experimental units from the
   trifluralin littoral enclosure study per-
   formed at Auburn University in 1994.
3) Simulate the effect of trifluralin using the
   parameters that best represent a Duluth
   littoral ecosystem and using the parame-
   ters that best represent an Auburn littoral
   ecosystem in order to further evaluate
   LERAM, to investigate methods for using
   LERAM to extrapolate to different geo-
   graphic regions, and to further define its
   domain of application.
  Fiscal year 1996:
  1)  Develop a model similar to LERAM that
     will simulate a wetland ecosystem.
  2)  Investigate methods for simulating eco-
     logical risk at larger spatial and temporal
     scales.
  3)  Apply these methods to make predic-
     tions concerning ecosystems contained
     in the Great Lakes.

  References
  Bartell, S.M. 1987. Technical Reference
     and User Manual for Ecosystem Uncer-
     tainty Analysis (EUA): 1) - The Pascal
     PC Demonstration Program, 2) - The
     Standard Water Column Model
     (SWACOM), 3) - The Comprehensive
     Aquatic System Model (CASM). U.S.
     EPA Office of Toxic Substances Report.
Bowie, G.L., W.B. Mills, D.B. Porcella, C.L.
     Campbell, J.R. Pagenkopf, G.L. Rupp,
     K.M. Johnson, P.W.H. Chan,  S.A.
    Gherini, Tetra Tech, Inc. and C.E. Cham-
    berlin. 1985. Rates, Constants, and
    Kinetics Formulations in Surface Water
    Quality Modeling (Second Edition). EPA
    600/3-85/040.  U.S. Environmental Pro-
    tection Agency, Athens, Georgia.
 Brazner, J.C., L.J. Heinis, and D.A. Jensen.
    1989. A littoral enclosure design for rep-
    licated field experiments. Environmental
    Toxicology and Chemistry 8:1209-1216.
Hanratty, M.P. and K. Liber. 1994. Evalua-
    tion of model predictions of the persis-
    tence and ecological effects of
    diflubenzuron in littoral enclosures. In:
    M.F. Moffet (ed.) Effects, Persistence
    and Distribution of Diflubenzuron in Lit-
    toral Enclosures.  U.S. EPA Environmen-
    tal Research Laboratory- Duluth, Duluth,
    Minnesota.
 Hanratty, M.P. and F.S. Stay.  1994.  Field
    evaluation of the Littoral Ecosystem Risk
    Assessment Model's predictions of the
    effects of chlorpyrifos.  J. Appl. Ecol.
    31:439-453.
 Hanratty, M.P., F.S. Stay, and S.J. Lozano.
    1992. Field Evaluation of LERAM: The
    Littoral Ecosystem Risk Assessment
    Model, Phase I.  U.S. EPA Environmen-
    tal Research Laboratory- Duluth, Duluth,
    Minnesota.
 Hanratty, M.P. and S.J. Lozano. Field evalu-
    ation of modeling and laboratory risk
    assessment methods. In preparation.
 Levin, S. A. and K. D.  Kimball. 1984. New
    Perspectives in Ecotoxicology. Environ-
    mental Management 8:375-442.
 Lozano, S.J., J.C. Brazner, M.L. Knuth, L.J.
    Heinis, K.W. Sargent, O.K. Tanner, L.E.
    Anderson, S.L. O'Halloran, S.L. Ber-
    telsen, D.A. Jensen, E.R. Kline, M.D.
    Balcer, F.S. Stay, and R.E. Siefert.
    1989. Effects,  Persistence and Distribu-
    tion of Esfenvalerate  in Littoral Enclo-
    sures. Final Report 7592A. U.S.
    Environmental Research Laboratory-
    Duluth, Duluth, MN 55804. Revised
    April 1990.
Rowan, T. 1990. Functional Stability Analy-
    sis of Numerical Algorithms, Ph.D. The-
    sis, University of Texas, Austin.
1 M.P. Hanratty and K.A. McTavish, Lake Superior Research Institute, University of Wisconsin-Superior, Superior, WI.
2 F.S. Stay, U.S. EPA Environmental Research Laboratory-Duluth, Duluth, MN.
The Reactivities of Classes of Environmental Chemicals -
Understanding Potential Health Risks1,2
   The U.S. Environmental Protection
 Agency is often faced with the problem of
 making regulatory decisions about chemi-
 cals for which there is little or no information
 on possible health effects. The Office of
 Toxic Substances receives applications for
 the registration of approximately 2,000 new
chemicals a year. Of these, acute toxicity data are available for only about 35% of the
chemicals. There are some data on physical properties and little data on animal toxicity. We
 are learning about the many chemicals pro-
 duced by secondary processes in the atmo-
 sphere for which health data is only now
 beginning to be collected. Additionally, there
 are chemicals in the atmosphere produced
 by combustion, chemicals in drinking water
 that result from the purification processes
 and chemicals in waste dumps for which
 there is a paucity of human health data.
   While the necessary data for the evalua-
 tion of the potential hazard of these chemi-
 cals is being developed, structure activity
 relationships developed from the molecular
 modeling  paradigms may be used to provide
 a tool for initial evaluation and provide a
 rational basis for the prioritization of testing
 needs. In order to more efficiently use these
 techniques, it is necessary to understand the
 molecular basis for the toxicity of chemical
 classes of environmental importance.
  There are two general strategies for
 obtaining these insights:
  • Postulation and development of a molec-
    ular model for a specific mechanism of
    action from existing biological data and
    related chemical information. This
    model may be quantified and tested
    using computational molecular tech-
    niques.
  • Determination of correlations between
    the structure or properties of molecules
    and their biological activity. These corre-
    lations are determined using various
    types of statistical or artificial intelligence
    techniques from compiled data bases for
    chemicals tested in sufficiently similar
    protocols.
   In both of these strategies for the develop-
 ment of computational structure activity rela-
 tionships, the model is derived from
 available experimental information and its
quantitative results suggest additional experiments for the verification and improvement
of the model. In practical situations, elements of both the causal and the correlative
strategies are often used, as the sketch below illustrates for the correlative case.
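A minimal illustration of the correlative strategy is a check of whether a calculated
molecular property tracks measured biological activity across a set of chemicals. The
descriptor and activity values below are invented, and a real analysis would use larger
compiled databases and more rigorous statistics.

    # Hedged sketch of the correlative strategy: descriptor vs. activity.
    # All values are placeholders for illustration only.
    import numpy as np

    descriptor = np.array([0.21, 0.35, 0.42, 0.55, 0.61])   # e.g., a reactivity index
    activity   = np.array([3.1, 3.8, 4.4, 5.0, 5.6])        # e.g., log(1/LC50)

    r = np.corrcoef(descriptor, activity)[0, 1]
    slope, intercept = np.polyfit(descriptor, activity, 1)
    print(f"Pearson r = {r:.3f}; activity ~ {slope:.2f} * descriptor + {intercept:.2f}")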
  Recent advances in: 1) the computational methods available for molecular modeling,
2) the speed and storage capacity of
 computers, and 3) the experimentally
 acquired knowledge of the specific molecu-
 lar targets for xenobiotics, make the applica-
 tion of these methods to environmental
 problems more practical. These methods
have made it possible, in some instances, to perform quantum mechanical calculations on
the xenobiotic and its biomolecular target in aqueous surroundings. They make it
 possible to use molecular dynamics and
 mechanics methods to compute the struc-
 tural changes in a biopolymer induced by
 the binding of a xenobiotic to relevant
 sequences in the biopolymer.
   Polycyclic Aromatic Hydrocarbons
 (PAHs) are a prevalent class of man-made
 chemicals of varying mutagenic and carcino-
genic potency. There is a great deal of experimental evidence indicating that the
mechanism through which chemicals in this class are active requires metabolic activation
to an epoxide or polyol-epoxide. In association with protonation, the epoxide opens and
binds to nucleophilic sites in
DNA.  Because the metabolic formation of
the epoxide is enzymatic, both the electronic
configuration of the PAH and its interaction
with the enzyme active site determine the
site of epoxidation. However, the direction
of the opening of the protonated epoxide is
spontaneous.
   Experimental scientists at the Health
Effects Research Laboratory (HERL) are
studying this  class of chemicals, including a
subclass, the cyclopenta-PAHs (cPAH) (mol-
ecules that in addition to the six membered
rings of other PAHs also contained a five
membered ring). In the cPAHs under con-
sideration, the five membered ring was
formed by the addition of two carbon atoms
with an unsaturated bond to a backbone that
contained only six membered rings. We
were asked if we could determine the direc-
tion of ring opening of an epoxide formed
between the carbon atoms that complete the
five membered ring as shown in Figure 1.
The direction of ring opening determines the
atom that would bind to nucleophilic sites in
DNA.
  AM1 was used to determine the three
dimensional structures of the carbocations
that could be formed by ring opening.  The
electronic structures, energies and frontier
orbitals were determined using the Gaussian series of programs with a 3-21G basis set
for those carbocation geometries. The Mulliken definition of charge was used for
determining group charges and atomic LUMO
densities.  Molecular electrostatic potentials
were computed using our local multipole
method.
Figure 1: A schematic diagram for the ring opening of aceanthrylene 1,2-epoxide
  The 2-hydroxy-carbocation was favored over the 1-hydroxy-carbocation for aceanthrylene
by 11.8 kcal/mol. This indicates that the 1-position (the nominally charged position for
the 2-hydroxy-carbocation) will
 bind to DNA to form adducts. The surprising
 result of these calculations is that the CH6
 group is more positively charged than the
 nominally charged CH1 group.  The local
 LUMO density is also greater at the 6 posi-
 tion than the 1 position. Computation of the
 molecular electrostatic potential of the 2-
 hydroxy-carbocation also showed that the
 region near the CH6 is more likely to attract
 positively charged species than the CH1
 position. All these local reactivities suggest
 the possibility of binding to C6 and not the
 nominally charged C1. However, similar cal-
 culations of the binding of that carbocation
 to small  model nucleophiles show that the
 binding to the 1 position is significantly
 favored. The  electronic effects are more
 important than the electrostatic effects, com-
 puted by these single molecule descriptors,
 in determining the binding position.
  A series of 13 similar cPAHs were investi-
 gated. They all contained a backbone of six
 membered rings (2-4) and a five-membered
 ring completed in a similar manner to acean-
 thrylene.  It was found that these cPAHs
 could be divided into two classes.  Those
that had at least three linear six-membered
 rings and the five membered ring configured
 so that one of  the carbon atoms was
 attached to a middle ring, like acean-
 thrylene, and those that did not.  For the first
 class the energy difference between the two
 carbocations was large (>7.5 kcal/mol) and
 favored the formation of the carbocation with
 the nominal charge on the CH group
 attached to the middle ring (like the two-car-
 bocation  of aceanthrylene).  For those car-
 bocations a large amount of the positive
charge was localized on the  CH group in the
middle ring para to the nominally charged
group (like the CH6 group in aceanthrylene).
For the second class, the energy difference between the two possible carbocations was
small (<4.0 kcal/mol) and the nominally charged CH group was always the most highly
charged.
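The two-class division described above reduces to a threshold rule on the calculated energy
difference between the alternative carbocations. In the sketch below the compound labels
and most energy differences are placeholders (only the 11.8 kcal/mol value for the
aceanthrylene-like case comes from the text), while the 7.5 and 4.0 kcal/mol boundaries are
taken from the paragraph above.

    # Hedged sketch of the two-class division of cPAH carbocation pairs
    # by the energy difference between the alternative ring-opened
    # carbocations.  Compound labels and most dE values are placeholders.
    pairs = {
        "cPAH-1 (aceanthrylene-like)": 11.8,   # value quoted in the text
        "cPAH-2": 8.3,
        "cPAH-3": 3.1,
        "cPAH-4": 2.4,
    }

    for name, delta_e in pairs.items():
        if delta_e > 7.5:
            label = "class 1: one carbocation strongly favored"
        elif delta_e < 4.0:
            label = "class 2: both carbocations plausible"
        else:
            label = "intermediate (not assigned in this sketch)"
        print(f"{name}: dE = {delta_e:.1f} kcal/mol -> {label}")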
   Experimental information was available
 for four of the cPAHs studied, two from each
 group. For the two from the first group, the
 experimental data was consistent with the
 formation of only one carbocation, the one
 predicted by the computations. For the two
 from the second group, the experimental
 data indicates that both carbocations are
 formed.  This information suggests that
 while our methods are capable of correctly
 dividing the class into two subclasses based
 on reactivity, the energy differences between
 the two potential carbocations may be over-
 estimated.
   In order to understand the basis for this
 overestimation of the energy difference
 between the carbocations, ab initio methods
 with both the 3-21 g and 6-31 g basis set and
 a semiempirical  method that includes the
 bulk effects of water (AMSOL/SM2) were
 used to obtain the carbocation geometries
 and energies. It was found that the geome-
 try changes introduced by these methods
 had an insignificant effect on the difference
in energy between the carbocation pairs, even though they changed the total energy of the
individual cations by the same order of magnitude as the energy difference between the
carbocations. The
 introduction of the effects of water in the
 semiempirical Hamiltonian (the AMSOL/
 SM2 calculation) significantly decreased the
 energy difference between the carbocation
 pairs.  The division of the cPAHs into two
 classes remained unchanged. The inclusion
 of bulk water stabilizes charge separation in
 the carbocations and  makes all atomic and
 group charges larger. Within the carboca-
 tion pairs, the higher energy carbocation has
 greater charge separation and therefore is
 more  stabilized by the inclusion of water and
the energy difference  between the carboca-
tion pairs becomes smaller.
Relevant Publications
Weinstein, H., J.R. Rabinowitz, M.N. Lieb-
    man and R. Osman, 1985.  Determi-
    nants of molecular reactivity as criteria
    for predicting toxicity:  Problems and
    approaches, Environ. Health Perspect. 61, 147-162.
Rabinowitz, J.R. and S.B. Little, 1991. Pre-
    dictions of the reactivities of cyclopenta-
    polynuclear aromatic hydrocarbons by
    quantum mechanical methods, Xenobi-
    otica 21, 263-275.
Venegas, R.E., P.M. Reggio and J.R.
    Rabinowitz, 1992. Computational Stud-
    ies of the 3-dimensional structure of
    cyclopenta polycyclic aromatic hydrocar-
    bons containing a gulf region, Int. J.
    Quant. Chem. 41, 497-516.
Rabinowitz, J.R., and S.B. Little, 1992.
    Epoxide ring opening and related reac-
    tivities of cyclopenta polycyclic aromatic
    hydrocarbons:  quantum mechanical
    studies, Chem. Res. in Tox. 5, 286-292.
1 James R. Rabinowitz and Lan Lewis-Sevan, Carcinogenesis and Metabolism Branch, Health Effects Research
    Laboratory, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711.
2 Stephen B. Little, Integrated Laboratory Systems, Research Triangle Park, NC 27709.

NOTE: Lan Lewis-Sevan is a postdoctoral fellow in the Program in Toxicology of the University of North Carolina. This
    report does not reflect U.S. EPA policy. The research described in this paper has been reviewed by the Health
    Effects Research Laboratory of the U.S. Environmental Protection Agency and approved for publication.
    Approval does not signify that the contents necessarily reflect the views and policy of the Agency nor does
    mention of trade names or commercial products constitute endorsement or recommendation for use.