1999 EPA Conference on Environmental Statistics and Information
Repository Material
Permanent Collection
-------
Welcome Letter
Agenda
EPA Headquarters and Chemical Libraries
EPA West Bldg Room
Mailcode 3404T
1301 Constitution Ave
Washington DC
202-566-0556
EJBD ARCHIVE
EPA 230-R-99-002
Abstracts
Evaluation Form
Registrants
-------
Welcome to the 1999 EPA Conference on Environmental Statistics and Information
It is my pleasure to welcome you on behalf of EPA's Center for Environmental
Information and Statistics to the 1999 EPA Conference on Environmental Statistics and
Information. For many "old timers" to this conference, you will immediately recognize that we
have changed the name and scope of the conference. Just as EPA is currently planning a new
Office of Information, we have expanded our coverage to information in addition to statistics.
We welcome new attendees to the conference which covers all aspects of data, information,
computer technology, data systems and statistics. We have the data gatherers, the data
administrators, the data users, the data analyzers, the data simulators, the data reinventors, the
data modelers, the data miners, the data assessors, and those who still can't figure out if data are
singular or plural. We hope that all of the attendees will better understand the interrelationships
among these groups as the new Office of Information will encompass so many of these activities.
This year's theme is "EPA's Vision for the 21st Century". I am well aware that many of
this year's conferences on any topic will have similar names. (I also fear any ho-hum building
constructed this year will shortly be labeled "turn of the century" architecture.) However, as
EPA is undergoing major changes with the creation of the Office of Information, the Agency is
trying to position itself to meet the burgeoning information challenges of the new century, so I
feel the theme is truly appropriate. I hope that this conference will enable us to better understand
those challenges and approaches to meeting them.
We have an exciting collection of plenary sessions, featured talks, concurrent
presentations, training sessions, poster/computer sessions and panel discussions. However, as
with most meetings, some of the informal opportunities to meet and chat with your colleagues
can frequently be the most productive aspect of the conference. I encourage you to take full
advantage of this year's campus-type setting to continue your dialogues with your associates. I
owe a great deal of thanks to the planning and arrangements committees for their efforts to
organize this conference. Special thanks also go to Margaret Conomos and Connie Lorenz who
assisted me in putting it all together, and to Temple University's Institute for Survey Research
that handled the details and coordination. We encourage you to have a good time, learn a lot,
and tell us about any enhancements you would like to see in the future.
Barry D. Nussbaum
1999 Conference Chair
Conference Planning Committee
Henry Kahn
Elizabeth Margosches
George Flatman
Ruth Allen
John Warren
Arrangements Committee
Susan Auby
Joan Bundy
Trudy McCoy
Ed Lloyd
-------
Agenda for the 1999 EPA Conference on
Environmental Statistics and Information
Monday, May 10, 1999
3:00-6:00 REGISTRATION AND CHECK-IN
Foyer
4:00-6:00 CONCURRENT TRAINING SESSIONS
Woollcott Smith (Temple University) and Peter Petraitis
(University of Pennsylvania) - Workshop on Monte
Carlo Methods in Environmental Statistics Dining Room
. Joe Anderson (EPA/OIRM) - EPA's Web Site and You Room D
6:00-7:00 Cash Bar
Foyer
Tuesday, May 11, 1999
8:30-9:00 Welcoming Remarks and Introduction of Speakers
. Wendy Cleland-Hamnett, Director, EPA Center for
Environmental Information and Statistics
Peter Goodwin, Dean, Graduate School, Temple
University
Dining Room
9:00-9:30 Al Morris, Director, Office of Environmental Data, EPA Region III,
Philadelphia, PA - Information, Statistics and the Region Dining Room
9:30-10:30 Keynote Address
Jay Hakes, Administrator, Energy Information
Administration
Dining Room
10:00-12:00 Statistical Training Session (Videotapes)
Room C
-------
Tuesday, May 11, 1999 (Continued)
10:30-10:45 Break
10:45-12:00 CONCURRENT PRESENTATIONS
Statistics, Information, and GPRA (Chair: George Bonina, EPA)
. Judith Calem Lieberman, (EPA/OCFO), Analytic
Challenges and the Government Performance and
Results Act
George Bonina (EPA) - Reinventing Environmental
Information
Room D
Local Applications of EPA Data (Chair: Ron Shafer, EPA/CEIS)
. Henry Topper (EPA/OPPT) - The Baltimore Community
Environmental Partnership: Lessons Learned
Kimberly Nelson (Pennsylvania Department of
Environmental Protection) - The Department of
Environmental Protection Compliance Reporting
System
. N. Bouwes, Steven M. Hassur (EPA/OPPT), S. Keane,
E. Fechner Levy, B. Firlie, and R. Walkling (Abt
Associates, Inc.) - Risk-Screening Environmental
Indicators Model
Dining Room
Statistical Methods for Lab and Air Quality Data Analysis
(Chair: Larry Cox, EPA/ORD/NERL)
. Mary Lou Thompson and Kerrie Nelson (University of
Washington) - Statistical Modeling of Multiply-
Censored Data
. Peter Craigmile (University of Washington) - Trend
Estimation Using Wavelets
. Joel H. Reynolds (University of Washington) -
Meteorological Adjustment of Surface Ozone for Trend
Analysis: Pick an Answer, Any Answer
Room B
12:00-1:15 Lunch
-------
Tuesday, May 11, 1999 (Continued)
1:15-2:30 CONCURRENT PRESENTATIONS
Databases: The Manager's View (Chair: Phil Lindenstruth, EPA) Room D
. Panel: Phil Lindenstruth - STORET, Abraham Siegel -
SDWIS, Mike A. Mundell - PCS (EPA)
Ensuring the Quality of Environmental Information Dining Room
(Chair: Nancy Wentworth, EPA/ORD)
Nancy Wentworth (EPA/ORD) - Quality Assurance and
Environmental Information
Malcolm Bertoni (Research Triangle Institute) - Using
SimSITE to Illustrate Sampling Techniques
Models and Model Assessment of Environmental Data Room B
(Chair: Mary Lou Thompson, University of Washington)
Rafael Ponce (University of Washington) - Development
of a Linked Pharmacokinetic-Pharmacodynamic Model
of Methylmercury-Induced Developmental
Neurotoxicity
Samantha Bates, Cullen, A. C., and A. E. Raftery
(University of Washington) - Bayesian Model
Assessment
. Marianne Turley, E. David Ford, and Joel Reynolds
(University of Washington) - Pareto Optimal Multi-
Criteria Model Assessment
2:30-4:30 Statistical Training Session (Videotapes) Room C
2:30-2:45 Break
-------
Tuesday, May 11, 1999 (Continued)
2:45-4:00 CONCURRENT PRESENTATIONS
Use of the Internet for Sharing Statistics Dining Room
(Chair: Steve Hufford, EPA/CEIS)
Pat Garvey (EPA/OIRM) - Envirofacts Warehouse:
Environmental Data On The Internet - Empowering the
Citizen toward Environmental Protection and
Awareness
Anne Frondorf (USGS) - National Biological Information
Infrastructure
Chris Miller (NOAA) - National Environmental Data
Index
. Bob Shepanek (EPA/ORD) - EPA's Environmental
Information Management System
Epidemiology and Risk Assessment Cumulative and/or Aggregate
Risk Assessment (Chair: Ruth Allen, EPA) Room B
. David Miller (EPA/OPPTS/OPP) - Food Quality
Protection Act and its Implementation: an Overview of
Statistical and Probabilistic Issues Facing the Office of
Pesticide Programs
Hans D. Allender (EPA/OPP/HED) - Finding a Statistical
Distribution to Use in the Monte Carlo Exposure
Assessment of Livestock Commodities
. Breeda Reilly (EPA/CEPPO) - Applying Epidemiology to
Study the Prevention of Major Chemical Accidents
Statistical Research Issues in Quality Assurance (Chair: John
Warren, EPA/ORD) Room D
John Warren - Integrating Data Quality Indicators (DQIs)
into Data Quality Objectives (DQOs)
Charles White (EPA/OW/OST) - A Performance
Evaluation of the Method Detection Limit
-------
Tuesday, May 11, 1999 (Continued)
4:15-5:15 PLENARY SESSION (Chair: Wendy Cleland-Hamnett, EPA/CEIS) Dining Room
. Corrinne Caldwell, Acting Provost & Vice President,
Temple University - Welcoming Remarks
. Tom Curran (EPA/OAR) - Data, Information and
Statistics: Putting it All Together for Decision-Making
5:15-8:00 POSTER AND COMPUTER SESSIONS
. George T. Flatman (EPA/ORD/NERL) - Satellite data for
Landscape Ecology
. Lawrence Lehrman (EPA/RMD/OIS) - Cluster analysis of
fish species and land use
. Connie Lorenz (EPA/CEIS) - All the Stats that are Fit to
Surf
. Brand Neimann (EPA) - Digital Library Demonstration
. Stuart H. Kerzner (EPA/Region III) - Information
Visualization - Turning Data into Information You Can
Easily Understand
. Maliha S. Nash (EPA/LEB) - Geostatistical Analysis of
Ecological Indicators
. Arthur Lubin (EPA/OSEA) - Environmental Random
Stratified Sampling Designs Developed Via Cluster
Analysis
. Heather Case (EPA Customer Service) - What the CEIS
National Telephone Survey will be able to Tell EPA's
Information Providers
. Susannah Dillman (EPA/OPPTS/OPPT) - Methods to
Minimize Human Error in Reporting Analysis Results
. John S. Graves (EPA/Region III) - Using Perl Scripts to
Import Data into GIS: An Example Using USGS
Ground Water Site Inventory Data
Rich Heiberger (Temple University) - Demonstration of
ESS, S-Plus, and Trellis Graphics
. William P. Smith (EPA/OPPE/CES) - CD Toxic Release
Inventory (TRI) Data Explorer
Room A
5:30-6:30 Wine and Cheese Party - Hosted by William Tash, Vice Provost
for Research, Temple University
Pool House
-------
Wednesday, May 12, 1999
8:00-9:00 CONCURRENT PRESENTATIONS
Data Integration and Quality: Vision for the Future (Chair: Ruth
Allen, EPA)
. Susan Devesa, D. Grauman, W. Blot, G. Pennello, R.
Hoover, and J. Fraumeni (NCI) - Atlas of Cancer
Mortality in the United States, 1950-94
Ruth Allen (EPA) - Surveillance Improvement Report
Dining Room
Analysis of Cleanups (Chair: Michael J. Messner, EPA/OGWDW) Room B
. Michael J. Messner (EPA/OGWDW) - Cryptosporidium
Occurrence in the Nation's Drinking Water Sources
. Bimal Sinha (EPA/OPPE/CES) - Statistical Estimation of
Average Reid Vapor Pressure of Regular Gasoline
Sampling and Design Issues in Environmental Studies (Chair:
Tony Olsen, EPA/NHEERL)
. David Marker (WESTAT) - Sample Designs for
Environmental Data Collection: Ranked Set Sampling
and Composite Sampling
. Paul D. Sampson (University of Washington) -
Monitoring network design with applications to regional
air quality
Room D
9:00-9:15 Break
9:15-11:15 Statistical Training Session (Videotapes)
Room C
-------
Wednesday, May 12, 1999 (Continued)
9:15-10:45 CONCURRENT PRESENTATIONS
Some Analyses and Potential Analyses at EPA (Chair: Doreen
Sterling, EPA/CEIS)
. Mike Barrette (EPA/OE) - Integrating Data for Planning
and Targeting
. Tom DeMoss and Tom Pheiffer (EPA) - The Mid Atlantic
Integrated Assessment Program (MAIA)
. John Moses (EPA/CEIS) - Strategy to Address Evolving
Environmental Information Needs
Room D
Measurement Issues Related to our Water Supply (Chair: Barnes
Johnson, EPA/OSWER/OSW)
. Andrew Schulman, Jennifer Wu, and Benjamin Smith
(EPA/OGWDW) - Forays into the Unforgiving -
Occurrence Estimation in the Realm of Data with
Multiple Censoring Points (arsenic in the public water
supply)
. Henry Kahn, Helen L. Jacobs, and Kathleen A. Stralka
(EPA/OW/EAD) - Estimated Water Consumption In
The U.S. Based On The CSFII
. Virginia A. Colten-Bradley (EPA/OSWER/OSW) -
Development of a Neural Network Tool for Evaluation
of Waste Management Unit Designs
Dining Room
Assessing Risk (Chair: Elizabeth Margosches, EPA/OPPTS)
. Mary Marion (EPA/OPPT) - Simulation and Acute
Dietary Risk Assessments
. David Pawel (EPA/OAR) - Proposed EPA Methodology
for Assessing Risks from Indoor Radon
. Elizabeth H. Margosches, Ph.D., Jennifer Seed, Ph.D.,
and Khoan T. Dinh, Ph.D. (EPA/OPPTS) - Health Data:
How Do We Use It To Protect the Public/Environment?
Margaret Conomos (EPA/CEIS) - Discussant
Room B
-------
Wednesday, May 12, 1999 (Continued)
10:45-11:00 Break
11:00-11:30 PLENARY SESSION (Chair: Barry Nussbaum, EPA/CEIS)
Woollcott Smith (Temple University) - A Walk on the
Wild Side of Statistical Communication
Dining Room
11:30-12:30 PLENARY SESSION (Chair: Doreen Sterling, EPA/CEIS)
Robert English (EPA) - Proposed Information
Management Office
Dining Room
12:30-2:00 Lunch
2:00-4:00 Statistical Training Session (Videotapes)
Room C
2:00-3:30 CONCURRENT PRESENTATIONS
The Visual Presentation of Data (Chair: Al Morris, EPA,
Region III) Dining Room
. Al Morris (EPA) - Enviroviz-Turning Numbers into
Visual Relationships
. David Mintz (EPA/OAR/OAQPS) - Methods for
Displaying Temporal and Spatial Trends
. Daniel Carr (George Mason University) - Two Templates
for Visualizing Georeferenced Statistical Summaries
Listening To Our Information Customers (Co-Chairs: Brendan
Doyle (EPA/CEIS) and Margaret Morgan-Hubbard (EPA/Office of
Communications)) Room D
. Panel: Margaret Morgan-Hubbard, Director, EPA Office
of Communications, Brendan Doyle, (Acting) Director,
CEIS Customer Survey and Access Division,
Emma McNamara, (Acting) Director, EIMD, OIRM,
and Pat Bonner, EPA Customer Service
-------
Wednesday, May 12, 1999 (Continued)
Application of Sampling in Aquatic Resources (Chair: Henry
Kahn, EPA/OW/EAD)
. Henry Kahn and Silvestre Colon (EPA/OW/EAD) -
Composite Sampling Analysis of Contaminant Levels
in Fish
. Anthony R. Olsen (EPA/NHEERL) - National Fish Tissue
Contaminant Lake Survey: A New Spatially-Restricted
Survey Design
Barnes Johnson (EPA/OSWER) - How to Survey Water
Designs
Room B
3:30-3:45 Break
3:45-5:15 PLENARY SESSION: Statistics and Information at EPA as we
Start a New Century: Where Are We Going? (Chair: Phil Ross,
EPA/CEIS)
. Larry Cox (EPA/ORD/NERL)
. Karen Klima (EPA/IWI)
. Heather Case (EPA/CEIS)
G.P. Patil (Pennsylvania State University)
Dining Room
Thursday, May 13, 1999
8:30-10:30 TRAINING
Steven P. Millard (PSI) - Applying Monte Carlo Simulation
Techniques with S-PLUS
Dining Room
9:00-10:30 PRESENTATIONS (Note this time overlaps with above training)
The Data Come In, the Data Go Out Room D
. Rick Westlund (EPA/OP) - Reducing Paperwork Burdens
at EPA
. Charlotte Cottrill (EPA/ORD) - EMPACT's Role in the
21st Century
-------
Thursday, May 13, 1999 (Continued)
10:30-10:45 Break
10:45-11:45 WRAP-UP SESSION (Chair: Barry Nussbaum, EPA/CEIS) Dining Room
. William Raub, Deputy Assistant Secretary for Science
Policy, Department of Health and Human Services -
Perspectives on Data and Information from the
Department of Health and Human Services
11:45-12:00 Door Prize and Closing Remarks Dining Room
-------
SCHEDULE FOR CEIS EXHIBIT SUPPORT
FOR THE CEIS STATS CONFERENCE IN PHILLY MAY 10-13
Note: The last people manning the exhibit each day will need to assure the PC's are safely
secured and that the people manning the booth the next morning will know where the PC's are
and will have access to them.
MONDAY, MAY 10
Exhibit set up - 2:00 - 4:00 p.m.
Connie Lorenz
*Margaret Conomos
TUESDAY, MAY 11
Staff for Exhibit
(Including assuring PC set up, Internet access, etc.)
10:00 -12:00 a.m.
*Nathan Wilkes
Connie Lorenz
1:15-3:00 p.m.
Lee Ellis
Margaret Conomos
3:00-5:15 p.m.
Ed Brandt
Lee Ellis
5:15-8:00 p.m.
(Will allow for runs to the wine and cheese as needed)
Connie Lorenz
Nathan Wilkes
Margaret Conomos
WEDNESDAY, MAY 12
Staff for Exhibit
8:00 -10:00 a.m.
Lee Ellis
Ed Brandt
10:00-12:00 a.m.
Connie Lorenz
Nathan Wilkes
WEDNESDAY, MAY 12 (Cont'd)
2:00 -5:15 p.m.
Margaret Conomos
Lee Ellis
THURSDAY, MAY 13
Staff for Exhibit
8:30 -10:30 a.m.
Lee Ellis
10:30-12:00 a.m.
*Ed Brandt
*Margaret Conomos
Note: Ed and Margaret will break the exhibit down, package it up and assure its return, along
with OP's lap top PC's, to Waterside Mall.
-------
ABSTRACTS
1999 EPA Conference on
Environmental Statistics and Information
4:00-6:00 Monday, May 10
CONCURRENT TRAINING SESSIONS
WORKSHOP ON MONTE CARLO METHODS IN ENVIRONMENTAL
STATISTICS
Woollcott Smith
Statistics Department, Temple University
Peter Petraitis
Biology Department, University of Pennsylvania
This workshop is divided into two parts:
1. Smith will present an overview of modern computer intensive Monte Carlo methods. The
review will include the statistical motivation as well as technical and philosophical
advantages and disadvantages in using these methods in administrative and legal settings.
We will briefly describe how these methods are used to attack hard statistical problems in
missing data imputation, measurement error and Bayesian analysis. Finally the details of
randomization and simulation methods will be illustrated using a basic paired comparison
design.
2. Petraitis will present a case study on the pros and cons of using randomization methods as an
alternative to analysis of variance and the analysis of covariance.
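The randomization idea described in part 1 can be sketched for a paired comparison as follows. This is only an editorial illustration, not the workshop's own material: the function name `paired_randomization_test`, the site values, and the number of resamples are all assumptions made for the example. Under the null hypothesis of no effect, each within-pair difference is equally likely to be positive or negative, so the observed mean difference is compared with the means obtained after random sign flips.

```python
import random

def paired_randomization_test(before, after, n_resamples=10000, seed=0):
    """Two-sided Monte Carlo randomization test for paired data.

    Returns the observed mean difference and an approximate p-value.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    observed = sum(diffs) / n

    extreme = 0
    for _ in range(n_resamples):
        # Randomly flip the sign of each difference (valid under the null).
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(flipped) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value

# Hypothetical concentrations at eight sites before and after a treatment.
before = [12.1, 9.8, 11.4, 10.7, 13.2, 9.5, 10.9, 12.6]
after = [10.9, 9.1, 10.2, 10.8, 11.7, 8.9, 10.1, 11.5]
obs, p = paired_randomization_test(before, after)
print(f"mean difference = {obs:.3f}, p = {p:.4f}")
```

Unlike a paired t-test, this approach makes no normality assumption; its cost is that the p-value is itself a simulation estimate that varies slightly with the seed and the number of resamples.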
-------
10:45-12:00 Tuesday, May 11, 1999
STATISTICS, INFORMATION, AND GPRA
(CHAIR: GEORGE BONINA, EPA)
ANALYTIC CHALLENGES AND THE GOVERNMENT PERFORMANCE
AND RESULTS ACT
Judith Calem Lieberman
OCFO, US Environmental Protection Agency
The Government Performance and Results Act (GPRA) of 1993 set into motion a spate of
activity in Agency strategic planning and accountability. In essence a legal constitution for good
management, the GPRA requires federal agencies to set goals, measure performance, and report
on the degree to which goals are met. It also places emphasis on attaining results rather than
tracking program activities. The Office of the Chief Financial Officer has been leading EPA's
effort to meet GPRA's statutory requirements, which include development of a 5-year strategic
plan, annual performance plans (and budgets), and annual performance reports. During the first
cycle of GPRA implementation, several analytical challenges have been revealed. The most
significant ones relate to identification of outcome goals, development of performance measures,
validation/verification of performance data, and comparison of performance with annual goals.
Working through these challenges will require a good understanding of the Agency's mission, a
little creativity and the analytical skills to understand the impact of program activities on
environmental results.
LOCAL APPLICATIONS OF EPA DATA (CHAIR: RON
SHAFER, EPA/CEIS)
THE BALTIMORE COMMUNITY ENVIRONMENTAL PARTNERSHIP:
LESSONS LEARNED
Henry Topper
US Environmental Protection Agency
In this case study, participants in the Baltimore Community Environmental Partnership will
describe their experiences and present lessons they have learned. The experience presented will
be based on a three-year project involving a Partnership among the residents, governments, and
businesses in south Baltimore and northern Anne Arundel County. This Partnership worked
together to begin addressing the long term environmental and economic concerns in four
neighborhoods in south Baltimore and northern Anne Arundel County. For many years, both
residents and businesses in this heavily industrialized section of the metropolitan Baltimore area
have expressed concerns about health and the environment in their neighborhoods. By working
together in a Partnership, the community completed a comprehensive review of all aspects of its
environment and has begun work to implement a plan to make real improvements.
The Partnership has taken a holistic view of community problems and has developed efforts to
address a broad range of issues facing the community including health concerns, housing issues,
illegal dumping, subsistence fishing, park restoration and enhancement, community gardening,
economic development, air quality, and crime. Based on this holistic approach, the Partnership
has begun to develop an understanding of the complexity of the environmental stresses facing the
community and the need for a multifaceted approach to improving community health and
building a sustainable community. In the area of community health, Partnership committees are
now working to address the issues of indoor air, fish consumption, truck traffic, and industrial
toxic releases. As a part of this effort, the Air Committee of the Partnership completed a
comprehensive screening analysis of air releases from all the businesses and facilities in and
around the Partnership area. This analysis, based on exposure modeling, has given the
community information on the cumulative concentrations of toxics from all sources in each of
the four Partnership neighborhoods. The Air Committee has developed a protocol to compare
these modeled concentrations with established health effect values to determine areas for
pollution prevention. The committee has also developed a protocol and screened for potential
combined effects of multiple chemicals that have similar target organs, e.g. all the chemicals that
are respiratory tract irritants. As a result of the work of the Air Committee, the community now
has some key parts of the information it needs to monitor and improve the local environment.
THE DEPARTMENT OF ENVIRONMENTAL PROTECTION
COMPLIANCE REPORTING SYSTEM
Kimberly Nelson
Pennsylvania Department of Environmental Protection
The Pennsylvania Department of Environmental Protection (DEP) has made significant strides in
improving data management. PA DEP has successfully integrated data across more than 12
programs to present a holistic view of the people and places it regulates. The data reside in the DEP
client/site database which is fully integrated with departmentwide application processing and
compliance reporting systems. The DEP compliance reporting system is one of the few systems
in the country that can track multi-media inspections, violations, penalties and enforcement
actions for a single facility and is the only system in the country that is on-line for citizens to
track compliance activities. The client/site system also is integrated with the department's new
Pennsylvania Facility Analysis System, a web based GIS application that went on-line for the
public in March. Currently, the department is focusing priority attention on an Environmental
Futures Team whose charge it is to develop a plan for measuring environmental outcomes.
-------
OPPT'S RISK-SCREENING ENVIRONMENTAL INDICATORS MODEL*
Bouwes, N. and Hassur, S.
Office of Pollution Prevention and Toxics, U.S. Environmental Protection Agency
S. Keane, E. Fechner Levy, B. Firlie, and R. Walkling
Abt Associates, Inc.
The Toxics Release Inventory (TRI) provides raw data on the quantities of chemicals released by
US manufacturing facilities, but these raw data alone do not provide information about the
relative toxicity or exposure potential of these releases. The Office of Pollution Prevention and
Toxics (OPPT) of the US EPA has created the Risk-Screening Environmental Indicators Model
to provide a risk-based perspective of these releases, in a PC-based model. The Indicators Model
integrates toxicity scores with a measure of exposure potential and the size of the potentially
exposed population to calculate individual Indicator Elements for each combination of facility,
chemical, and release media reported under TRI. Each year of reporting generates
approximately 250,000 of these Elements which are summed to provide overall Indicator Values.
The Indicator Elements can also be summed to create sub-Indicators that rank relative impacts by
medium, chemical, geographic area, industry sector or a combination of these and other
variables. This flexibility provides the analyst with the opportunity to examine trends year-to-
year, and to rank and prioritize chemicals, industries and regions for strategic planning, risk-
related targeting for enforcement and compliance purposes, and community-based environmental
protection. The model also permits the user to investigate the relative influence of toxicity,
exposure and population on the results.
*Work supported under EPA Contract Number 68-W6-0021, WA#3-02.
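The aggregation described in this abstract (one Indicator Element per facility, chemical, and medium, summed into overall values and sub-Indicators) can be sketched as below. This is an editorial illustration under stated assumptions, not the model's actual algorithm: the abstract does not say how toxicity, exposure potential, and population are combined, so a simple product is assumed here, and every field name and number is invented.

```python
from collections import defaultdict

# Hypothetical records, one per (facility, chemical, medium) combination.
releases = [
    {"facility": "A", "chemical": "toluene", "medium": "air",
     "toxicity": 10.0, "exposure": 0.5, "population": 2000},
    {"facility": "A", "chemical": "benzene", "medium": "water",
     "toxicity": 100.0, "exposure": 0.1, "population": 500},
    {"facility": "B", "chemical": "benzene", "medium": "air",
     "toxicity": 100.0, "exposure": 0.2, "population": 1000},
]

def indicator_element(rec):
    # Assumed combination rule: product of the three components.
    return rec["toxicity"] * rec["exposure"] * rec["population"]

def sub_indicators(records, key):
    """Sum Indicator Elements by one grouping variable (e.g. chemical)."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += indicator_element(rec)
    return dict(totals)

overall = sum(indicator_element(r) for r in releases)
by_chemical = sub_indicators(releases, "chemical")
# Rank chemicals from highest to lowest relative score.
ranking = sorted(by_chemical, key=by_chemical.get, reverse=True)
print(overall, by_chemical, ranking)
```

Because the elements are additive, the same records can be regrouped by medium, facility, or any other recorded variable without recomputing the elements themselves.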
STATISTICAL METHODS FOR LAB AND AIR QUALITY
DATA ANALYSIS (CHAIR: LARRY COX, EPA/ORD/NERL)
STATISTICAL MODELING OF MULTIPLY CENSORED DATA
Mary Lou Thompson and Kerrie Nelson
The National Research Center for Statistics and the Environment, University of
Washington
Laboratory analyses in a variety of contexts may result in doubly left censored measurements,
i.e. amounts of contaminants of concern may be reported by the laboratory as "non-detects" or
"trace". The analysis of singly censored observations has received attention in the biostatistical
(e.g. in the context of survival analysis) and in the environmental literature. We consider
maximum likelihood and semi-parametric approaches to linear models in the doubly censored
setting.
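A maximum likelihood fit with two left-censoring points can be sketched as follows. This is an editorial illustration, not the authors' method: it assumes a normal model with no covariates, assumes that values below a detection limit `dl` are reported as "non-detect" and values between `dl` and a quantitation limit `ql` as "trace", and uses a crude grid search in place of a proper optimizer. All names and numbers are hypothetical.

```python
import math
import random

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(mu, sigma, detects, n_trace, n_nondetect, dl, ql):
    """Normal log-likelihood with two left-censoring points, dl < ql."""
    ll = 0.0
    for x in detects:  # fully quantified values contribute the density
        z = (x - mu) / sigma
        ll += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    p_nd = norm_cdf((dl - mu) / sigma)          # P(X < dl)
    p_tr = norm_cdf((ql - mu) / sigma) - p_nd   # P(dl <= X < ql)
    # Floor the probabilities to avoid log(0) at poor parameter values.
    ll += n_nondetect * math.log(max(p_nd, 1e-300))
    ll += n_trace * math.log(max(p_tr, 1e-300))
    return ll

def fit_by_grid(detects, n_trace, n_nondetect, dl, ql):
    """Crude grid-search maximizer over (mu, sigma)."""
    best = (-math.inf, None, None)
    for i in range(41):          # mu from 0.00 to 2.00
        mu = 0.05 * i
        for j in range(21):      # sigma from 0.50 to 1.50
            sigma = 0.5 + 0.05 * j
            ll = log_likelihood(mu, sigma, detects, n_trace,
                                n_nondetect, dl, ql)
            if ll > best[0]:
                best = (ll, mu, sigma)
    return best[1], best[2]

# Simulated data with known mean 1.0 and standard deviation 1.0.
rng = random.Random(1)
dl, ql = 0.0, 0.5
sample = [rng.gauss(1.0, 1.0) for _ in range(400)]
detects = [x for x in sample if x >= ql]
n_trace = sum(1 for x in sample if dl <= x < ql)
n_nondetect = sum(1 for x in sample if x < dl)
mu_hat, sigma_hat = fit_by_grid(detects, n_trace, n_nondetect, dl, ql)
print(mu_hat, sigma_hat)
```

The key point is that censored observations are not replaced by substitute values; each censoring category contributes the probability of falling in its interval.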
-------
TREND ESTIMATION USING WAVELETS
Peter Craigmile
Department of Statistics, University of Washington
A common problem in the analysis of environmental time series is how to deal with a possible
trend component, which is usually thought of as large scale (or low frequency) variations or
patterns in the series that might be best modeled separately from the rest of the series. Trend is
often confounded with low frequency stochastic fluctuations, particularly in the case of models
such as fractionally differenced processes (FDPs), which can account for long memory
dependence (slowly decaying auto-correlation) and can be extended to encompass non-
stationary processes exhibiting quite significant low frequency components. In this talk we
assume a model of polynomial trend plus fractionally differenced noise and apply the discrete
wavelet transform (DWT) to separate a time series into pieces that can be used to estimate both
the FDP parameters and the trend. The estimation of the FDP parameters is based on an
approximate maximum likelihood approach that is made possible by the fact that the DWT
decorrelates FDPs approximately. Once the FDP parameters have been estimated, we can then
test for a non-zero trend. After outlining the work that we have done to date on testing for non-
zero trends, we demonstrate our methodology by applying it to an air quality time series.
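One reason a DWT can separate a polynomial trend from the remainder of a series is that wavelet filters with enough vanishing moments produce detail coefficients that are blind to low-order polynomials. The sketch below is an editorial illustration only and does not reproduce the estimation procedure of the talk: it applies one level of the Daubechies D4 transform (two vanishing moments) with periodic boundary handling to a purely linear trend. The choice of D4, the function name `dwt_d4`, and the test series are assumptions.

```python
import math

SQRT3 = math.sqrt(3.0)
DENOM = 4.0 * math.sqrt(2.0)
# Daubechies D4 scaling (low-pass) filter coefficients.
H = [(1 + SQRT3) / DENOM, (3 + SQRT3) / DENOM,
     (3 - SQRT3) / DENOM, (1 - SQRT3) / DENOM]
# Matching wavelet (high-pass) filter: G[k] = (-1)^k * H[3 - k].
G = [H[3], -H[2], H[1], -H[0]]

def dwt_d4(x):
    """One level of the D4 DWT with periodic boundary handling.

    Returns (smooth, detail) lists, each of length len(x) // 2.
    """
    n = len(x)
    if n < 4 or n % 2:
        raise ValueError("length must be even and at least 4")
    smooth, detail = [], []
    for i in range(n // 2):
        window = [x[(2 * i + k) % n] for k in range(4)]
        smooth.append(sum(h * v for h, v in zip(H, window)))
        detail.append(sum(g * v for g, v in zip(G, window)))
    return smooth, detail

# A purely linear trend: interior detail coefficients vanish.
trend = [2.0 + 0.3 * t for t in range(32)]
smooth, detail = dwt_d4(trend)
interior = detail[:-1]  # the last coefficient wraps around the boundary
print(max(abs(d) for d in interior))
```

Only the final detail coefficient, whose window wraps from the end of the series back to the start, is affected by the trend. The trend information is carried by the smooth coefficients, while the interior detail coefficients of a trend-plus-noise series reflect the noise alone.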
METEOROLOGICAL ADJUSTMENT OF SURFACE OZONE FOR
TREND ANALYSIS: PICK AN ANSWER, ANY ANSWER
Joel H. Reynolds
NRCSE, Department of Statistics, University of Washington
A variety of statistical methods for meteorological adjustment of surface ozone have been
proposed in the literature over the last decade. As part of a larger review of the literature, we
summarize and compare six different methods applied to the analysis of surface ozone
observations in the Chicago region from the 1981-1991 period: nonlinear regression, regression
tree models, extreme events models, time-series filtering, nonlinear additive time-series models,
and canonical covariance analysis. Differences in the resulting trend analyses are discussed in
terms of differences in each analysis' spatial domain and choice of ozone statistic. The review
highlights the need for development of techniques for extreme value analysis of space-time
processes.
-------
1:15-2:30 Tuesday, May 11, 1999
DATABASE: THE MANAGER'S VIEW (PANEL SESSION)
Philip Lindenstruth, Michael A. Mundell, and Abraham Siegel
US Environmental Protection Agency
The Panel will present for discussion several issues involved in the administration of a national
database. These issues start with requirements for the database and address optional data
fields, data quality, data ownership, database management issues, and support for the system
during its life cycle. Those on the Panel would like their initial presentations to stimulate a
discussion of these issues with the attendees.
ENSURING THE QUALITY OF ENVIRONMENTAL
INFORMATION (CHAIR: NANCY WENTWORTH, EPA/ORD)
USING SIMSITE TO ILLUSTRATE SAMPLING TECHNIQUES
Malcolm J. Bertoni
Center for Environmental Measurements and Quality Assurance, Research Triangle
Institute
The Simulated Site Interactive Training Environment (SimSITE) is a computer-based training
support system that helps environmental scientists and engineers learn how to plan a field
investigation at a hazardous waste site. Through the use of a graphical user interface provided
by the ArcView geographic information system (GIS), training participants apply concepts such
as Data Quality Objectives (DQOs), Data Quality Indicators (DQIs), statistical sampling design,
and Data Quality Assessment (DQA). SimSITE contains statistical design and analysis tools and
sampling simulation routines that allow the participants to develop and implement sampling
plans that satisfy their DQOs. SimSITE then generates a data set (including sampling and
measurement errors), and allows the participants to make decisions about whether or not to clean
up areas of the artificial site, based on their statistical analysis of the data. At the end of the
simulation, the features of the underlying true contamination are revealed to illustrate the
phenomenon of decision errors. During this interactive presentation, the features and classroom
uses of SimSITE will be demonstrated.
-------
MODELS AND MODEL ASSESSMENT OF ENVIRONMENTAL
DATA (CHAIR: MARY LOU THOMPSON, UNIVERSITY OF
WASHINGTON)
DEVELOPMENT OF A LINKED PHARMACOKINETIC-
PHARMACODYNAMIC MODEL OF METHYLMERCURY-INDUCED
DEVELOPMENTAL NEUROTOXICITY
T.A. Lewandowski, S.M. Bartell, R.A. Ponce, C.H. Pierce, and E.M. Faustman
Department of Environmental Health, University of Washington
Methyl mercury (MeHg) has been shown to cause adverse developmental effects in human and
animal conceptuses exposed in utero. A toxicological model of the disposition and cellular
action of MeHg in the developing fetus can be used to estimate health outcomes for various
levels of exposure. Modeling can also incorporate differences in dose rate, chemical species, or
inter-species variability. A linked toxicokinetic and toxicodynamic model for MeHg has been
developed for the rat based on work performed in our laboratory. The toxicokinetic model
incorporates many of the changes in organ size and blood flow associated with gestation.
In the toxicodynamic model, changes in the population of committed fetal neural cells have been
estimated based on the observed effects of MeHg on rates of cellular death, proliferation and
differentiation in vitro. We are currently determining these rates in vivo using BrdU-Hoechst
flow cytometry. The toxicokinetic model demonstrates an adequate fit to experimental
toxicokinetic data. For example, 3 days after a dose of 1 mg/kg (given on day 16 of gestation),
the model predicts fetal brain and fetal blood levels within 10% of the values observed by
Wannag (1976). In terms of toxicodynamic effects, the model predicts 20% and 65% decreases
in the number of committed neural cells (on gestational day 15, relative to untreated baseline) at
fetal brain concentrations of 10 and 50 umol/kg. It is anticipated that the existing model can be
extended to address other species (i.e., humans) and other developmental toxicants which act by
similar mechanisms (i.e., cell cycle disruption).
Sponsored by the following grants: USEPA R825358 and CR825173 and NIEHS T32ESO-7032.
BAYESIAN MODEL ASSESSMENT
Samantha Bates and A. E. Raftery
Department of Statistics, University of Washington,
Cullen, A.C.
Graduate School of Public Affairs, University of Washington
In this paper we discuss a Bayesian method of analysis which incorporates both prior knowledge
of the distributions of the inputs to a deterministic model and any available data on the model
inputs and outputs. This method uses Monte Carlo simulation from the prior distributions for the
inputs and resampling of these simulations with weights determined by the observed data under
the sampling importance resampling scheme of Rubin. The method yields posterior distributions
for the output from which to find distributions for quantities of interest. The method also allows
the separation of the contributions of variability and uncertainty to the posterior distribution of
soil concentration.
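The three steps named above (simulate inputs from their priors, weight each simulation by the observed data, resample) can be sketched as follows. This is an editorial illustration, not the authors' model: the deterministic function `model`, the uniform priors, the single observed output, and the normal measurement error are all hypothetical choices made so the example is self-contained.

```python
import math
import random

def model(rate, depth):
    """Hypothetical deterministic model: output = rate / depth."""
    return rate / depth

def sir(observed, obs_sd, n_prior=20000, n_post=2000, seed=0):
    rng = random.Random(seed)
    # Step 1: simulate inputs from their (assumed uniform) priors.
    inputs = [(rng.uniform(1.0, 20.0), rng.uniform(1.0, 4.0))
              for _ in range(n_prior)]
    outputs = [model(r, d) for r, d in inputs]
    # Step 2: weight each draw by the likelihood of the observed output.
    weights = [math.exp(-0.5 * ((observed - y) / obs_sd) ** 2)
               for y in outputs]
    # Step 3: resample with probability proportional to the weights.
    keep = rng.choices(range(n_prior), weights=weights, k=n_post)
    return outputs, [outputs[i] for i in keep]

def mean(values):
    return sum(values) / len(values)

def std_dev(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

prior_out, post_out = sir(observed=5.0, obs_sd=0.5)
print(mean(prior_out), std_dev(prior_out))
print(mean(post_out), std_dev(post_out))
```

The resampled outputs approximate the posterior distribution of the model output; keeping the matching input pairs instead would give the posterior for the inputs.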
We will present an application of this method to modeling poly-chlorinated biphenyl (PCB)
concentrations in various media at a Superfund site in New Bedford Harbor (NBH), MA.
Dredging during this clean-up of the Harbor exposes inhabitants of the surrounding region to
PCB contaminated air, soil and plants. A deterministic model for PCB concentration in soil was
developed by Cullen (1992). The Bayesian method is used to find distributions for the PCB
concentration in soil at this site. In addition we will contrast the results of this Bayesian method
with those of a traditional Monte Carlo approach and a trial-and-error approach.
PARETO OPTIMAL MULTI-CRITERIA MODEL ASSESSMENT
Marianne Turley, E. David Ford, and Joel Reynolds
University of Washington
Evolutionary computation (EC) is an optimization technique for finding Pareto optimal solutions
to multiple objective functions. It borrows ideas from evolutionary theory to direct the
optimization search through the parameter space. We applied this optimization to process models
to improve model assessment by requiring a solution, a model parameterization, to achieve
multiple criteria simultaneously. In this talk, I will discuss the algorithm, two alternative search
errors and some examples.
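For readers unfamiliar with Pareto optimality, the sketch below shows only the dominance test that defines the Pareto optimal set; it does not attempt the evolutionary search itself. It assumes every criterion is an error measure to be minimized, and the example scores are invented.

```python
def dominates(a, b):
    """True if a is no worse than b on every criterion and strictly better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (criterion 1 error, criterion 2 error) scores for five
# candidate model parameterizations.
scores = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
front = pareto_front(scores)  # (3, 3) and (4, 4) are dominated by (2, 2)
```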
2:45-4:00 Tuesday, May 11, 1999
USE OF THE INTERNET FOR SHARING STATISTICS (CHAIR:
STEVE HUFFORD, EPA)
ENVIROFACTS WAREHOUSE: ENVIRONMENTAL DATA ON THE
INTERNET - EMPOWERING THE CITIZEN TOWARD
ENVIRONMENTAL PROTECTION AND AWARENESS
Pat Garvey
US Environmental Protection Agency
Governments and the courts are acknowledging more and more that the Public has a right to
know what is being discharged and released to the environment. The U.S. Congress and the
Executive Branch have taken decisive action to ensure this public right to access of data and
information.
The U.S. EPA created the Envirofacts Warehouse to provide the public with direct access to the
vast amounts of information and data in its national program environmental data systems. The
Envirofacts Warehouse helps EPA fulfill its responsibility to make information available to the
public, as required by federal legislation and Executive Order.
Envirofacts is available on the Internet (www.epa.gov/enviro), allowing EPA to disseminate
information quickly and easily. Envirofacts Warehouse contains:
a relational database of the national databases on Superfund (abandoned hazardous waste)
sites, hazardous waste handlers, discharges to water, toxic releases, air releases, and drinking
water suppliers,
the relational database also contains the facility index system, the Envirofacts Master
Chemical Integrator, locational reference tables, and,
spatial data and demographic data from the other sources.
Internet applications are available and part of the Envirofacts Warehouse Internet site to provide
easily designed queries to the databases and to create maps and other reports.
The presentation shows the capabilities of, and reasons for, the Envirofacts Warehouse. It
demonstrates the features and principles behind the design of the Web site, the
database design and model, and the various application features and query options
available from the Web.
The presentation will demonstrate:
How On-line Queries and Results are useful to the concerned public, interested
organizations, governmental regulatory staff, and the Environmental Officer of a plant, facility
or company;
GIS Mapping capabilities and Outputs that are On-line, and what the GIS capabilities will be in
the future;
Data refresh schedules and the importance of On-line Documentation; and
Customer Feedback procedures for data quality and user needs.
The presentation will address the US EPA directions and program initiatives in public access of
governmental data and community empowerment with environmental data.
NATIONAL BIOLOGICAL INFORMATION INFRASTRUCTURE
Anne Frondorf
U.S. Geological Survey
This presentation will provide a brief description/overview of the National Biological
Information Infrastructure (NBII) program, a collaborative effort to build a distributed, Internet-
based federation of biological science data, information and analytical tools. Examples of the
types of data and information available from the NBII and the types of different agencies and
organizations and partnerships involved in building the NBII will be provided.
Two key elements of the NBII "infrastructure" (i.e. the standards-related activities that help to
support and pull together this distributed data network) will be highlighted. These are the
development of a biological metadata content standard (and an accompanying biological
metadata clearinghouse network) and the continued development of the Integrated Taxonomic
Information System (ITIS) as a standard reference for biological nomenclature and taxonomy.
ITIS is a partnership among USGS, EPA, NOAA, USDA, and the Smithsonian Institution.
EPA'S ENVIRONMENTAL INFORMATION MANAGEMENT SYSTEM
Bob Shepanek
Office of Research and Development, US Environmental Protection Agency
Presented is an integrated vision for scientific information management approaches supporting
monitoring and assessment activities within the US EPA's Office of Research and Development
(ORD). This vision was developed based upon lessons-learned from the implementation of
several scientific information management systems and from development of the ORD's strategic
and implementation plans for scientific information management. The vision reflects that
effective management of scientific information must address technical, cultural and management
challenges. Technical challenges include management and integration of metadata, data, and the
modeling, analysis, and visualization tools used as part of assessment activities. Cultural
challenges relate mainly to the protection of intellectual capital produced by individual
investigators. Management issues include commitment of adequate resources for systems
development and operation, support for related policies and procedures, and appropriate
incentives for involvement by staff and project participants.
EPIDEMIOLOGY AND RISK ASSESSMENT CUMULATIVE
AND/OR AGGREGATE RISK ASSESSMENT (CHAIR: RUTH
ALLEN, EPA)
FOOD QUALITY PROTECTION ACT AND ITS IMPLEMENTATION: AN
OVERVIEW OF STATISTICAL AND PROBABILISTIC ISSUES FACING
THE OFFICE OF PESTICIDE PROGRAMS
David Miller
US Environmental Protection Agency
With the passage of the Food Quality Protection Act, the Agency's Office of Pesticide Programs
is now required to aggregate risks from pesticides across exposure pathways and to accumulate
risks from pesticides across chemicals. As a result and in an attempt to develop better risk and
exposure estimates that consider the probabilities associated with simultaneous exposures, the
Office of Pesticide Programs is now using probabilistic (Monte Carlo) techniques in its risk and
exposure assessments. This has necessitated that OPP develop further refinements to its risk
assessment procedures. This presentation will provide an overview of FQPA and discuss its
major science impacts. It will review the traditional (deterministic) type methods used by OPP
in exposure and risk assessments as well as the probabilistic techniques now being used with
increasing frequency. Finally, it will review some of the statistical and policy issues which are
now being considered by the Office as it implements the probabilistic risk analysis framework
now in place.
FINDING A STATISTICAL DISTRIBUTION TO USE IN THE MONTE
CARLO EXPOSURE ASSESSMENT OF LIVESTOCK COMMODITIES
Hans D. Allender, Ph.D., P.E.
US Environmental Protection Agency
The presentation develops a methodology to find a frequency distribution of animals'
contamination because of the ingestion of pesticide-contaminated food. Given the percentage of
crops treated (%CT), the methodology calculates the distribution of animals that will be exposed.
Determination of the frequency distribution can be used later in connection with the application
of a Monte Carlo Analysis to the Exposure Assessment of humans to contaminated animal
products. The flexibility of the method allows the construction of frequency distributions to
multiple cases with different %CT. A non-agricultural example explains the process in a way
that everyone can relate to the calculations. The ubiquitous spreadsheet is used as the preferred
medium to obtain random numbers, recalculate probabilities, generate totals, and produce
graphics. A detailed explanation of how the spreadsheet is constructed ensures the audience the
possibility of duplicating the exercise. The simplicity of the methodology makes the process
easy to replicate and to extend to similar situations. It also allows the study of severe
contamination by pointing out the percentage of animals whose diet has been contaminated from
different sources. In summary, the article indicates a way of calculating a realistic statistical
distribution of animal contamination based on ingestion of contaminated food. Also, the
procedure can be extended to non-agricultural situations.
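One way to picture the calculation is the Monte Carlo sketch below, written in Python rather than in the spreadsheet the presentation uses. It rests on assumptions not stated in the abstract: each animal's diet is taken to consist of a few feed components, and each component is assumed to be treated independently with probability equal to that crop's %CT. The %CT values shown are invented.

```python
import random
from collections import Counter


def simulate_exposure(pct_treated, n_animals, seed=0):
    """Estimate the frequency distribution of the number of contaminated
    feed components per animal.

    pct_treated: hypothetical %CT value for each feed component in the diet.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_animals):
        # One random number per feed component decides whether it was treated.
        k = sum(rng.random() < pct / 100.0 for pct in pct_treated)
        counts[k] += 1
    return {k: counts[k] / n_animals for k in range(len(pct_treated) + 1)}


# Hypothetical diet of three feed crops with 20%, 50% and 10% of crop treated.
distribution = simulate_exposure([20.0, 50.0, 10.0], n_animals=50000)
```

The entry for the highest count gives the share of animals whose diet is contaminated from every source, which corresponds to the severe-contamination case mentioned above.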
APPLYING EPIDEMIOLOGY TO STUDY THE PREVENTION OF
MAJOR CHEMICAL ACCIDENTS
Breeda Reilly
Chemical Emergency Preparedness and Prevention Office, US Environmental Protection
Agency
Mandated by the Clean Air Act Amendments of 1990, accident histories from some 69,000
chemical facilities in the United States will become available in the fall of 1999. This
presentation describes the challenges of using the tools of epidemiology with this data to
investigate drivers of severity and frequency of accidents. This study was proposed by the Center
for Risk Management and Decision Processes at the Wharton School and is a major focus of an
EPA cooperative agreement. The Major Accident Epidemiology Project aims to contribute to
the process of determining which plants are most likely to incur major events, by ascertaining
whether certain predictors (characteristics of manufacturing plants or of the companies that own
them) are associated with increased probability of a major event. This knowledge can be helpful
in two ways: (1) plants with such risk factors can be monitored more closely (by the companies
themselves as well as by regulators and other stakeholders); and (2) these associations may
provide clues about characteristics of companies' organizational systems that act as underlying
causes of major events.
STATISTICAL RESEARCH ISSUES IN QUALITY ASSURANCE
(CHAIR: JOHN WARREN, EPA/ORD)
INTEGRATING DATA QUALITY INDICATORS (DQIS) INTO DATA
QUALITY OBJECTIVES (DQOS)
John Warren
Quality Assurance Division, Office of Research and Development, US Environmental
Protection Agency
EPA Order 5360.1 CHG 1 (July 1998) requires all EPA organizations to use a systematic
planning process to develop acceptance or performance criteria for the collection, evaluation, or
use of environmental data. Systematic planning identifies the expected outcome of the project,
the technical goals, the cost and schedule, and the acceptance criteria for the final result. The
Data Quality Objectives (DQO) Process is the Agency's recommended planning process when
data are being used to select between two opposing conditions, such as decision-making or
determining compliance with a standard. The outputs of this planning process (the data quality
objectives themselves) define the performance criteria. The DQO Process is a seven-step
planning approach based on the scientific method that is used to prepare for data collection
activities such as environmental monitoring efforts and research. It provides the criteria that a
data collection design should satisfy, including where to collect samples, tolerable decision error rates, and
the number of samples to collect.
Data Quality Indicators (DQIs) are the individual performance characteristics specified in the
mandatory Quality Assurance Project Plan (QAPP) that accompanies any environmental data
collection. Typical DQIs include precision, completeness, comparability, and sensitivity. This
discussion centers on how the Agency can effectively make the link between DQOs and DQIs.
A PERFORMANCE EVALUATION OF THE METHOD DETECTION
LIMIT
Charles White
US Environmental Protection Agency
Performance criteria specified in the original (1981) publication are evaluated using EPA data.
Data available for preliminary evaluation include over thirty combinations of pollutant by
chemical analytical technique.
5:15-8:00 Tuesday, May 11, 1999
POSTER AND COMPUTER SESSIONS
INFORMATION VISUALIZATION - TURNING DATA INTO
INFORMATION YOU CAN EASILY UNDERSTAND
Stuart H. Kerzner
US Environmental Protection Agency, Region III
The poster shows "EnviroSnax", which are graphics showing tidbits of environmental
information in ways that are easy to understand and highlight past or future environmental
impacts on the Region. They are used for management briefings, public use, press releases and
presentations.
WHAT THE CEIS NATIONAL TELEPHONE SURVEY WILL BE ABLE
TO TELL EPA'S INFORMATION PROVIDERS
Heather Case
EPA Customer Service, US Environmental Protection Agency
This presentation will describe the potential uses of the results from a national telephone survey
recently completed by the CEIS. The national telephone survey, which began in February 1999,
was designed to:
identify and describe environmental information customers within the U.S. population;
identify the public's high interest environmental topics; and
determine the public's access preferences for obtaining and using information.
The survey results will be used to guide CEIS information product and service development.
The survey results will be available for peer review in mid-August 1999.
This presentation will highlight potential uses by information providers in the Programs and
Regions.
METHODS TO MINIMIZE HUMAN ERROR IN REPORTING ANALYSIS
RESULTS
Susannah Dillman
US Environmental Protection Agency
Using "Paste Special", multiple graphs and tables in Excel can be linked to the report in
WordPerfect 8 and updated all at once.
USING PERL SCRIPTS TO IMPORT DATA INTO GIS: AN EXAMPLE
USING USGS GROUND WATER SITE INVENTORY DATA
John S. Graves
US Environmental Protection Agency, Region III
One of the primary tools in EPA Region III for evaluating environmental data is the Geographic
Information System or GIS. A difficulty in using a GIS is that environmental data is not always
readily available in a GIS format. The Perl computer language was used to translate U.S.
Geological Survey ground water data into a format that could then be imported into a GIS.
This poster presents relevant portions of the Perl script used with explanations of the data
processing steps undertaken as well as examples of GIS generated plots from the resulting data
in EPA Region III.
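The kind of translation step described above might look like the sketch below. It is written in Python, not the Perl of the poster, and the input layout is purely an assumption for illustration: tab-delimited records holding a site identifier, a latitude packed as DDMMSS, and a west longitude packed as DDDMMSS. The actual Ground Water Site Inventory layout and the author's processing steps are not given in the abstract.

```python
import csv
import io


def dms_to_decimal(dms, negative=False):
    """Convert a packed DDMMSS or DDDMMSS string to decimal degrees."""
    seconds = int(dms[-2:])
    minutes = int(dms[-4:-2])
    degrees = int(dms[:-4])
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if negative else value


def convert(raw_text):
    """Turn hypothetical tab-delimited site records into CSV text that a
    GIS could import as point locations."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["site_id", "latitude", "longitude"])
    for line in raw_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        site_id, lat, lon = line.split("\t")[:3]
        # Longitudes are assumed to be west of Greenwich, hence negative.
        writer.writerow([site_id,
                         f"{dms_to_decimal(lat):.5f}",
                         f"{dms_to_decimal(lon, negative=True):.5f}"])
    return out.getvalue()
```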
DEMONSTRATION OF ESS, S-PLUS, AND TRELLIS GRAPHICS
Richard M. Heiberger
Department of Statistics, Temple University
ESS [Emacs Speaks Statistics] is a GNU Emacs interface for interactive statistical programming
and data analysis. Languages supported include S-Plus, XLispStat, and SAS. ESS provides a
standard interface between statistical programs and statistical processes and has as one of its
goals an increase in efficiency for statistical programming and data analysis, over the usual tools.
ESS displays source code in these languages with syntactic indentation and highlighting of
source code. ESS interacts "directly" with the statistical package. ESS allows intelligent
interaction with the transcript of a previous interactive session.
Trellis is a graphical display system that uses multiple panels to simultaneously view
relationships between different variables in your multivariate dataset through conditioning.
Trellis was developed at Bell Labs as part of S-Plus.
We will have a live demonstration of ESS, S-Plus, and trellis graphics. I will analyze and display
several examples of continuous and discrete multivariate and time series data sets.
CD TOXIC RELEASE INVENTORY (TRI) DATA EXPLORER
William P. Smith
Center for Environmental Information and Statistics, US Environmental Protection
Agency
The TRI Data Explorer is a web product designed to provide the user quick and easy queries to
EPA's TRI Chemical release data for years 1988-1997. The Explorer's portal to TRI chemical
release data is through multiple data views which provide detailed and comprehensive chemical
reports at all geographic levels down to the facility level by year or across years. In addition, for
each chemical the explorer provides interesting information such as factoids and information on
the top 100 releasing facilities and counties.
The TRI Explorer will help our customers find information on topics such as: the chemicals
released in their county during the year; the facilities that are releasing these chemicals in the
county, state or the nation; the top chemicals released in their county, the state, or the nation;
and, the top 100 ranking facilities and counties in the nation that release a given chemical, or all
chemicals. And much more.
The application runs on the web at http://athena.was.epa.gov:2002/~wsmith/tri2/explorer.htm
or on CD for running off-line without the Internet. The CD application will be demonstrated.
8:00-9:00 Wednesday, May 12
DATA INTEGRATION AND QUALITY: VISION FOR THE
FUTURE (CHAIR: RUTH ALLEN, EPA)
ATLAS OF CANCER MORTALITY IN THE UNITED STATES, 1950-94
Susan Devesa, D. Grauman, W. Blot, G. Pennello, R. Hoover, and J. Fraumeni
Division of Cancer Epidemiology and Genetics, National Cancer Institute
The geographic patterns of cancer around the world and within countries have provided
important clues to the environmental determinants of cancer. In the mid-1970s the NCI prepared
county-based maps of cancer mortality in the U.S. that identified distinctive variations and hot-
spots for specific tumors, thus prompting a series of analytic studies of cancer in high-risk areas
of the country. We have prepared an updated atlas of cancer mortality in the United States during
1950-94, based on mortality data from the National Center for Health Statistics and population
estimates from the Census Bureau. Rates per 100,000 person-years, directly standardized using
the 1970 US population, were calculated by race (whites, blacks) and gender for 40 forms of
cancer. The new atlas includes more than 140 computerized color-coded maps showing
variation in rates during 1970-94 at the county (more than 3000 counties) or State Economic
Area (more than 500 units) level. Summary tables and figures are also presented. Selected
maps for the 1950-69 period are also included. Accompanying text describes the observed
variations and suggests explanations based in part on the findings of analytic studies stimulated
by the previous atlases. The geographic patterns of cancer displayed in this atlas should help to
target further research into the causes and control of cancer.
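Direct standardization, as mentioned above, weights stratum-specific rates by a fixed standard population. The sketch below assumes the strata are age groups, which is the usual practice but is not stated in the abstract, and uses invented counts rather than atlas data or actual 1970 US population figures.

```python
def directly_standardized_rate(deaths, person_years, standard_pop, per=100_000):
    """Weighted average of stratum-specific rates, with weights taken from
    the standard population's share in each stratum."""
    total_standard = sum(standard_pop)
    rate = sum((d / py) * (w / total_standard)
               for d, py, w in zip(deaths, person_years, standard_pop))
    return rate * per


# Hypothetical county with two age strata.
rate = directly_standardized_rate(
    deaths=[10, 50],
    person_years=[100_000, 100_000],
    standard_pop=[3, 1],  # standard population is 75% younger, 25% older
)
```

Because every area is weighted by the same standard population, differences between mapped rates reflect differences in stratum-specific mortality rather than in population age structure.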
ANALYSIS OF CLEANUPS (CHAIR: MIKE MESSNER,
EPA/OGWDW)
CRYPTOSPORIDIUM OCCURRENCE IN THE NATION'S DRINKING
WATER SOURCES
Michael J Messner, Ph.D.
US Environmental Protection Agency
Cryptosporidium is a microbial pathogen which occurs in most of the nation's surface waters.
Information on Cryptosporidium occurrence will be used in estimating the costs and benefits of
future drinking water regulations. A recently completed survey generated monthly estimates of
Cryptosporidium concentrations in the source waters of over 400 large drinking water utilities.
With only two months of validated data in hand, it appears that 80 to 90 percent of the water
volumes analyzed yielded zero oocysts. On its face, this sparsity of nonzero results appears to
severely limit the data's usefulness. In this presentation, a Bayesian approach is outlined for
estimating hierarchical model parameters and their uncertainties. Time permitting, the approach
will be illustrated using a small simulated data set.
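To illustrate why a data set dominated by zero counts can still be informative, the sketch below uses a single-level Gamma-Poisson model. This is a deliberate simplification and not the hierarchical model of the presentation, whose structure is not given in the abstract; the prior parameters and the sample data are hypothetical.

```python
def gamma_poisson_posterior(counts, volumes, prior_shape=0.5, prior_rate=1.0):
    """Posterior Gamma(shape, rate) for a mean concentration in oocysts per
    liter, assuming each count is Poisson with mean concentration * volume
    and a Gamma(prior_shape, prior_rate) prior on the concentration."""
    shape = prior_shape + sum(counts)
    rate = prior_rate + sum(volumes)
    return shape, rate


# Hypothetical source water: ten 10-liter samples, eight of which show zero oocysts.
counts = [0, 0, 0, 0, 1, 0, 0, 0, 2, 0]
volumes = [10.0] * 10
shape, rate = gamma_poisson_posterior(counts, volumes)
posterior_mean = shape / rate  # oocysts per liter
```

Each zero count still adds its analyzed volume to the posterior rate parameter, so samples with no oocysts tighten the estimate rather than being wasted.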
SAMPLING AND DESIGN ISSUES IN ENVIRONMENTAL
STUDIES (CHAIR: TONY OLSEN, EPA/NHEERL)
SAMPLE DESIGNS FOR ENVIRONMENTAL DATA COLLECTION:
RANKED SET SAMPLING AND COMPOSITE SAMPLING
David Marker
Westat
Historically, environmental statistics and survey sampling have had
relatively limited interaction. Most environmental studies use pre-existing data collection
locations, collect from known hot spots, and/or purposively select data collection locations.
Efficient survey sampling that can support the evaluation of a wide range of hypotheses has been
used to a lesser degree with environmental data than in health, education, or many other types of
data.
This talk will describe two NRCSE-funded research activities that try to bridge this gap between
survey sampling and environmental statistics.
Ranked set sampling (RSS) is a method to potentially increase precision and reduce costs by
using "rough but cheap" information to obtain a more representative sample before the real, more
expensive sampling is done. We have explored under what conditions RSS becomes cost-
effective for ecological and environmental field studies where the "rough but cheap"
measurement has a cost.
We are continuing to explore when alternative forms of two-phase sampling are preferable to
RSS.
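The basic balanced RSS selection procedure can be sketched as follows. The `cheap` ranking function stands in for the "rough but cheap" measurement; the function names and arguments are illustrative assumptions, and the cost comparisons studied in the talk are not modeled here.

```python
import random


def ranked_set_sample(population, set_size, cycles, cheap, seed=0):
    """Select set_size * cycles units for expensive measurement.

    In each cycle, set_size independent sets of set_size units are drawn;
    each set is ranked using the cheap measurement, and the unit holding
    the i-th rank is kept from the i-th set.
    """
    rng = random.Random(seed)
    chosen = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = rng.sample(population, set_size)
            candidates.sort(key=cheap)
            chosen.append(candidates[rank])
    return chosen


# Hypothetical use: units are plot identifiers, ranked by a cheap visual score.
plots = list(range(100))
sample = ranked_set_sample(plots, set_size=3, cycles=4, cheap=lambda p: p)
```

Only the selected units receive the expensive measurement, so the sample above requires 12 expensive measurements but 36 cheap rankings.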
Composite sampling has been proposed in environmental settings where the costs of
measurement are high. It is hoped that by compositing data collected from multiple locations the
cost savings will outweigh the loss of information on the individual locations. Unfortunately it is
not clear how often this trade-off is successful. NRCSE has funded the collection of side-by-side
individual and composite samples so that this trade-off can be explored with real data from a
national survey of over 800 houses. The data collection protocol and types of planned analyses
will be discussed for this ongoing activity.
9:15-10:45 Wednesday, May 12
SOME ANALYSES AND POTENTIAL ANALYSES AT EPA
(CHAIR: DOREEN STERLING, EPA/CEIS)
INTEGRATING DATA FOR PLANNING AND TARGETING
Michael Barrette
US Environmental Protection Agency
For each major regulatory program implemented by EPA, the program office has designed
databases to house the information critical to the program's needs. In a changing world, data
users are now interested in looking at environmental information holistically, which means that
databases must relate to each other.
To plan its enforcement and compliance activities, EPA makes use of integrated data within the
Integrated Data for Enforcement Analysis (IDEA) system. This system provides access to more
than 15 databases maintained by EPA and other government agencies. When trying to compare
across databases, of course many discrepancies and data errors are found. In this presentation
several topics related to data quality and integration will be examined:
What is the critical step needed in order to integrate information across databases at the
facility level? Discussion will focus on EPA's data integration strategies.
What are key methods that have used existing data to find high-priority sector and
geographic issues? Discussion will focus on recent efforts to identify priority areas and
sectors for inspection targeting.
How can data integration be used to find violators? Discussion will focus on some concrete
examples showing how comparison of databases can lead to facilities that are improperly
regulated.
THE MID ATLANTIC INTEGRATED ASSESSMENT PROGRAM (MAIA)
Tom DeMoss
Environmental Services, U.S. Environmental Protection Agency, Region III
Tom Pheiffer
Atlantic Ecology Division, NHEERL, U.S. Environmental Protection Agency
The MAIA program is an integrated environmental assessment program being conducted by
USEPA, Region III, and US EPA's Office of Research and Development, in partnership with other
Federal and State Agencies.
Objectives of the MAIA program are to build partnerships and get all stakeholders involved in
helping to (1) identify questions needed for assessing major ecological resource areas, such as
ground water, surface water, forests, estuaries, wetlands, and landscapes; (2) characterize the
health of each resource area, based upon exposure and effect information; (3) identify possible
associations with stressors, including landscape attributes, that may explain impaired conditions
for both specific resources and the overall ecosystem; (4) target geographic areas and critical
resources for protection and restoration, and (5) monitor environmental management progress.
Our experience with partners uncovered certain key principles of effective watershed
management. They were (1) agreement on geologic boundaries and/or units of assessment; (2)
conduct an assessment of the biological condition of resources; (3) target management to real
impairment based upon the biological assessments including TMDL, nutrients and habitat
restoration; (4) have watershed approach be holistic or segment by segment based upon nature of
problem; (5) have five-year rotation to monitoring and to assessments to allow time for change
of environment and for progress from management action; (6) buy-in stakeholders so assessment
and monitoring plans use all available resources and innovative options; (7) success will be more
cost-effective monitoring and management fixes.
Successful State partnering involves early buy-in well before products are developed. MAIA's
emphasis on aquatic biology and habitat is a departure from the water quality standards/TMDL
mentality and requires open dialogue with state biologists who must educate their managers on
the importance of habitat preservation and restoration as the new wave of management of their
aquatic resources.
STRATEGY TO ADDRESS EVOLVING ENVIRONMENTAL
INFORMATION NEEDS
John Moses
Center for Environmental Information and Statistics (CEIS), US Environmental Protection
Agency
While primarily a regulatory agency, the U.S. Environmental Protection Agency is devoting an
increasing amount of its resources to responding to public requests for information about
environmental quality, pollution sources, and human health and ecosystem concerns.
Additionally, the Agency must report annually to Congress on its progress in protecting human
health and safeguarding the natural environment, as required under the Government Performance
and Results Act (GPRA). Yet, in many cases, the data EPA needs to respond to public questions
and to report on its progress are not readily available. The Evolving Information Needs Strategy
addresses the gaps between the data the Agency needs and the data it currently has.
Working with EPA Regional and Program Offices and external stakeholders, CEIS developed a
two-phase strategy to identify and address some of the Agency's key environmental information
gaps. Phase I is a general screening analysis for identifying major gaps in 26 key environmental
problem areas and for setting priorities among these problem areas. Phase II is a methodology
for performing a more in-depth analysis of and recommendations to address the gaps associated
with each environmental problem area. This paper reports on the Phase I screening analysis,
conducted from June through April 1999.
MEASUREMENT ISSUES RELATED TO OUR WATER SUPPLY
(CHAIR: BARNES JOHNSON, EPA/OSWER/OSW)
FORAYS INTO THE UNFORGIVING - OCCURRENCE ESTIMATION IN
THE REALM OF DATA WITH MULTIPLE CENSORING POINTS
Andrew Schulman, Jennifer Wu, and Ben Smith
US Environmental Protection Agency
Under the Safe Drinking Water Act, the Agency is charged with establishing standards for
allowable levels of contaminants in the Nation's public water systems. Central to the selection
of the regulatory level is the determination of the relative benefits and costs likely to be
achieved. Benefits and costs are directly proportional to the level of current occurrence.
Consequently, sound decision making requires the best possible estimation of occurrence be
utilized.
In developing a new regulation for arsenic, the Agency has data from over twenty States
covering a time span of up to twenty years. Because the current regulation is at a much higher
concentration than new options under investigation, however, many State data sets are heavily
censored by detection limits within the range of required estimation. This paper will discuss the
data and the approaches EPA is considering for the assimilation of the data into national and
intra-system occurrence estimation.
ESTIMATED WATER CONSUMPTION IN THE U.S. BASED ON THE
CSFII
Henry D. Kahn, Helen L. Jacobs, and Kathleen A. Stralka
US Environmental Protection Agency
Knowledge of drinking water intake is fundamental to the mission of the Office of Water and an
important component of a number of programs at EPA. This presentation provides a summary of
our recent efforts to generate up-to-date estimates of water intake by the population of the United
States. To obtain current estimated water consumption distributions, we have analyzed the
United States Department of Agriculture's (USDA's) Combined 1994-96 Continuing Survey of
Food Intake by Individuals (CSFII) data set. Per capita water intake is estimated for three
sources of water: municipal/tap, bottled, and other sources of water (i.e., private well, private
cistern, or private or public well). For each source of water, distributions are generated for direct
and indirect water consumption. The distributions by age, gender, race, socioeconomic status,
and geographical region and separately for pregnant and lactating women are also estimated.
Survey design and statistical methodology are discussed. We anticipate that the water
consumption distributions will be used in a wide range of applications including: rules limiting
amounts of microbes; disinfectant by-products (DBP) rules; radon and other drinking water
contaminant rules; protection of sensitive populations and other exposure assessments.
DEVELOPMENT OF A NEURAL NETWORK TOOL FOR EVALUATION
OF WASTE MANAGEMENT UNIT DESIGNS
Virginia Cohen-Bradley
Economics, Methods, and Risk Analysis Division, Office of Solid Waste, US Environmental
Protection Agency
Samuel Figuli, Julia Lewis, and Katrin Arnold
HydroGeoLogic, Inc.
The Office of Solid Waste recently completed a neural network software tool designed for
evaluating leachate concentrations in four different waste management units, with three different
liner types. The purpose of the tool is to help non-hazardous industrial waste facilities determine
the concentration for the constituent of concern that can be disposed of safely in a specific waste
management unit design. The neural network software, EPA's Industrial Waste Management
Evaluation Model (IWEM), is based upon EPA's ground-water fate-and-transport model,
EPACMTP. EPACMTP was designed for national-level risk assessments. It is run in Monte
Carlo mode, using hydrologic data representative of the United States. Seven parameters judged
to be the most significant in EPACMTP were used to build four different neural network tools,
one for each of the waste management units: landfill, surface impoundment, waste piles, and
land application units.
IWEM has a multi-layer perceptron architecture and was trained in back-propagation mode from
target output generated by the Monte Carlo-style analyses with EPACMTP. Several different
approaches to producing training- and test-data sets were used. In general, the comparison
between the neural network and the EPACMTP results is good. The accuracy of the neural
networks varies with the location of the EPACMTP response surface that is being simulated.
ASSESSING RISK (CHAIR: ELIZABETH MARGOSCHES,
EPA/OPPTS)
PROPOSED EPA METHODOLOGY FOR ASSESSING RISKS FROM
INDOOR RADON
David Pawel, Ph.D.
US Environmental Protection Agency
Radon has been determined to be the second leading cause of lung cancer after cigarette smoking
(NAS 1998). Based on methodology published by the National Academy of Sciences (NAS) in
its BEIR IV report (NAS 1988) and in its "Comparative Dosimetry" report (NAS 1991), EPA
has previously estimated that 13,600 lung cancer deaths in the U.S. each year are radon related
(EPA 1992). Subsequently, the Agency sponsored a study by the NAS, which reviewed the large
body of evidence about radon that has become available since their earlier reports. The new
NAS study, BEIR VI (NAS 1998), confirmed that radon is a serious public health problem, and
provided new estimates of radon risk and of radon-attributable lung cancer deaths, which were
somewhat higher than EPA had projected previously, particularly for never smokers. The BEIR
VI committee concluded, moreover, that about one-third of these cases are preventable if all
homes above 4 pCi/L are remediated.
We will discuss proposed revisions to EPA's methodology for calculating radon-related risk
estimates in light of BEIR VI and the Agency's own previous analysis. These include estimates
of attributable risk and risk per working level month (WLM). Attributable risk is the proportion
of lung cancer deaths attributable to radon. Risk per WLM is the number of expected radon-
induced cancer deaths for the current population divided by the corresponding total of past and
future exposures. We will describe life table methods for calculating these quantities, and show
how changes in smoking patterns might impact these estimates of risk. It is anticipated that this
methodology would be used by EPA in a number of contexts, including: (1) updating its public
information aimed at reducing residential radon exposures; (2) its assessment of risk from radon
in drinking water; and (3) its assessment of risks associated with radium contaminated sites.
HEALTH DATA: HOW DO WE USE IT TO PROTECT THE
PUBLIC/ENVIRONMENT?
Elizabeth H. Margosches, Ph.D., Jennifer Seed, Ph.D., and Khoan T. Dinh, Ph.D.,
US Environmental Protection Agency
This talk will describe the types of data typically available for analysis by the EPA's Office of
Pollution Prevention and Toxics, and how they are used. These data are submitted under various
statutes or gathered from the open literature and are used to help decide to what degree the public
or the environment may be at risk of incurring adverse effects if certain exposures occur. The
decisions include such considerations as whether the available studies are experimental or
observed in situ and how inferences may be made from various animal studies to the wild or to
humans as well as inferences from one effect to another. Sampling and data collection issues,
missing data, and data modeling are all critical statistical aspects of this activity. An example
will be given that focuses on generalizing inferences from a dose-response model.
2:00-3:30 Wednesday, May 12
THE VISUAL PRESENTATION OF DATA (CHAIR: AL
MORRIS, REGION III, EPA)
ENVIROVIZ-TURNING NUMBERS INTO VISUAL RELATIONSHIPS
Alvin R. Morris
Director, Office of Environmental Data, US Environmental Protection Agency, Region III
"We're drowning in data": ever hear that plaintive wail? While we may not be drowning, we are
faced with under-utilizing data. Another, more recent challenge facing us in the spring of next
year is to prove to Congress, the public, and others that we are using the funds they provide
to actually improve the environment: how much, where, and for what price.
Data visualization can help solve both those challenges. This presentation will be the first
presentation of outputs of a prototype program we named EnviroViz, a program that
dynamically links air and water ambient and major point sources to:
where they are located: in the Region, state, county, and watershed
the 6-year trend for each of 7 air and 46 water parameters (stressors)
the GPRA goals to the sub-objective level
for each sub-objective, the associated FTE, contract $, and state and tribal grants
It's a new approach to more easily understanding the meanings embedded in environmental data
and can be applied in many areas. Please come see and comment.
METHODS FOR DISPLAYING TEMPORAL AND SPATIAL TRENDS
David Mintz
Air Quality Trends Analysis Group, US Environmental Protection Agency
EPA's Office of Air Quality Planning and Standards is tasked with developing an annual report
on the nation's air quality. This report, entitled National Air Quality and Emissions Trends
Report, uses various graphing techniques to present temporal and spatial trends in the data. This
paper discusses the methods employed in the report, their strong points, and their limitations.
Much of the graphical design is based on the principles of Edward Tufte and other leading
authorities on the visual display of information.
-------
TWO TEMPLATES FOR VISUALIZING GEOREFERENCED
STATISTICAL SUMMARIES
Daniel B. Carr
Center for Computational Statistics, George Mason University
This paper presents two new templates for visualizing spatially indexed statistical summaries. The
first template, called conditioned choropleth (CC) maps, represents a powerful interactive
extension of classed choropleth maps. The basic layout is a 3 x 3 matrix of panels containing
nine juxtaposed maps. One conditioning variable corresponds to rows and the other to columns.
The analyst controls the highlighting of map regions by manipulating row and column sliders
that define acceptable intervals for the conditioning variables. A small tab in each panel shows a
value summarizing the highlighted region values. The presence or absence of main effects and
interactions is evident at a glance. Other analyst interactions include dynamic class interval
selection and simultaneous pan and zoom for all panels. The examples emphasize study of
human mortality rates for health service areas conditioned on environmental and demographic
variables.
The second template, called linked micromap (LM) plots, provides an alternative to traditional
classed choropleth maps. The new design trades off region boundary resolution for more
accurate or extensive statistical summaries. These summaries can be bar plots, dot plots, box
plots, time series, line high plots for over a hundred variables and so on. Color provides a local
link between each region's (or site's) statistical summary and its spatial position in the micromap.
Examples show numerous variations of this template. The discussion addresses pattern
discovery and work in progress for drilling down from state to county to census tract.
LISTENING TO OUR INFORMATION CUSTOMERS (PANEL SESSION)
Margaret Morgan-Hubbard, Director, EPA Office of Communications
Brendan Doyle, (Acting) Director, CEIS Customer Survey and Access Division
Emma McNamara, (Acting) Director, EIMD, OIRM (invited), and
Pat Bonner, EPA Customer Service
US Environmental Protection Agency
This session will give participants an overview of how EPA and CEIS are surveying the
Agency's current and potential environmental information customers to better understand their
needs and access preferences. Several examples of how customer feedback is helping to shape
various EPA information products and services will be introduced. CEIS will re-cap lessons
learned from the Center's customer surveys over the past two years and give an update on their
national customer telephone survey (results due this fall). Session participants will have an
opportunity to express their interests in using the Center's survey data for their own analyses and
programs. The basics of using customer feedback on your products or services will also be
covered.
The panel agenda will include:
-------
Introductions: Brendan Doyle, CEIS
Margaret Morgan-Hubbard, OCEMR: The importance of focussing your information product or
service on your customers' needs and a vision for serving EPA's environmental information
customers in the future.
Brendan Doyle: Overview of what we've learned so far by implementing the CEIS customer
survey plan and what we hope to learn from our national information customer telephone survey
this fall.
Emma McNamara: EPA's Web sites, incorporating customer input and customer service
principles into developing and maintaining a Web site.
Pat Bonner: EPA Customer Feedback 101, a discussion of how "Hearing the Voice of the
Customer" guidelines can help you obtain useful customer feedback on your products,
processes and services.
APPLICATION OF SAMPLING IN AQUATIC RESOURCES
(CHAIR: HENRY KAHN, EPA/OW/EAD)
COMPOSITE SAMPLING ANALYSIS OF CONTAMINANT LEVELS IN
FISH
Henry D. Kahn and Silvestre Colon
US Environmental Protection Agency
Samples of fish formed by physically mixing, i.e., grinding together, a number of fish into a
combined, aggregate sample are referred to as "composite samples". Chemical analysis of
composite samples is a cost-effective mechanism for estimating mean levels when the cost of
analysis is high and the cost of obtaining sample units, such as individual fish, is relatively low.
A possible concern in the analysis of composite sampling data is the absence of measurement
results on individual units that comprise the composite. This presentation considers a set of data
on contaminant levels measured in composite samples of fish and in the individual fish that
constitute the composite samples. The results allow for comparison of composite and individual
analyses. Additional topics discussed are: estimation of variance components associated with the
composite samples, using measurements made on subsamples of the composites and effects of
fish length and weight on contaminant levels.
-------
NATIONAL FISH TISSUE CONTAMINANT LAKE SURVEY: A NEW
SPATIALLY-RESTRICTED SURVEY DESIGN
Anthony R. Olsen
NHEERL Western Ecology Division, U.S. Environmental Protection Agency
In 1998, the U.S. Environmental Protection Agency initiated a national study of fish tissue
contaminants in lakes and reservoirs. The study requires the development of a survey design to
meet the study objectives. For the national lake study, a list frame of waterbodies greater than 1
hectare is available. The frame provides information on the lake surface area and its geographic
location, in the form of a geographic information system (GIS) coverage. However, the frame
includes waterbodies that do not meet the definition of the target population. The frame
includes 270,761 waterbodies. This paper develops the survey designs for the study and
discusses how an underlying discrete global grid can be used to control the spatial distribution of
the sample and to address the imperfection of the frame. The survey design does not use finite
population sampling theory, but a continuous population in a bounded area theory that parallels
it. The spatially-restricted design enables the concept of a systematic sample to be implemented
while maintaining the ability to obtain design-based estimates and variance estimates.
8:30-10:30 Thursday, May 13
APPLYING MONTE CARLO SIMULATION TECHNIQUES WITH S-
PLUS
Steven P. Millard
Probability Statistics and Information (PSI)
Monte Carlo Simulation covers a broad range of topics, including simply generating random
numbers, probabilistic risk assessment, bootstrapping to obtain the distribution of (and hence
confidence intervals for) some statistic for which the distribution is unknown or not assumed,
and permutation tests. This talk will discuss the concepts behind each of these main topics, then
use examples to show you how to implement these methods using S-PLUS and
ENVIRONMENTAL STATS for S-PLUS.
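As one illustration of the bootstrapping idea mentioned in this abstract, the following is a minimal Python sketch (not the S-PLUS code used in the talk) of a percentile bootstrap confidence interval for a sample mean. The function name `bootstrap_ci`, its defaults, and the sample values are assumptions made up for the example.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    boot_stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return boot_stats[lo_idx], boot_stats[hi_idx]


# Made-up concentrations, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 6.1, 3.9, 4.4]
low, high = bootstrap_ci(sample)
print(low, high)
```

The percentile interval is only one of several bootstrap intervals; it is used here because it needs no distributional assumption and no extra formulas.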
-------
9:00-10:30 8:30-10:30 Thursday, May 13
THE DATA COME IN, THE DATA GO OUT
REDUCING PAPERWORK BURDENS AT EPA
Rick Westlund
Office of Policy, US Environmental Protection Agency
In the March 1995 Reinventing Environmental Regulation report, EPA established a long term
commitment to identify and eliminate obsolete, duplicative, and unnecessary monitoring,
reporting, and record keeping requirements. To date, EPA has removed more than 25 million
baseline burden hours, and built an internal watchdog culture dedicated to avoiding unnecessary
new paperwork burdens. Although total burden has continued to creep upward due to new
statutory requirements and new right-to-know collections, EPA programs continue to develop
creative approaches to chip away at burden without endangering environmental objectives. In
addition, EPA is developing many enterprise-wide initiatives designed as strategic investments
with the potential for much larger burden reductions three to five years from now.
In the last several years the Agency has accelerated its efforts to improve information collection
management, with a particular focus on reducing burdens associated with reporting and record
keeping, while at the same time enhancing data quality, coordinating our data activities with
States, improving our collection and display technologies, and compiling our data into a single
Internet site. We have taken major steps, but there is still more to do. The public's
right-to-know is now a fundamental cornerstone of our work at EPA, and we have all worked
hard to put information into the hands of the American people in the belief that this is one of the
best ways to protect public health and the environment. In the course of doing so, we have
learned that the Agency's effective management of its data is central to the measurement of our
progress in delivering the protections the American people expect. As we embark on a new era
of information technology and enhanced public access to data, we are committed to minimizing
our paperwork burden on the public while ensuring that our data are timely, accurate, useful to
the public, and able to effectively inform our own decision making.
The Agency has several initiatives underway to redesign or refocus the way we manage
information collection with primary goals to reduce burden on the public while accomplishing
our environmental protection mission. The most encompassing initiative is the recently launched
reorganization plan involving the formation of a new information organization that will bring
together all Agency information programs to better manage our information resources with an
expressed goal of reducing burden on the public while enhancing the data quality and integrity as
it is used within the Agency and made available to others outside the Agency.
Another major initiative, started over a year ago after the 1997 Information Streamlining Plan, is
the continued development of the Reinventing Environmental Information (REI) initiative. In its
early stages, the plan focuses on data quality and building infrastructure, but burden reduction
savings will become more apparent as the efficiencies in reporting options become available.
-------
The Agency has been very active working with the States on burden reduction especially through
partnership workgroups with the Environmental Council of States (ECOS). The workgroup is
identifying burden reduction opportunities by defining what information is and should be
collected, how information is transmitted, and how information is used. The workgroup is also
engaging industry, the public and others to help draft a tactical approach to burden reduction.
Within the Agency, the program offices are developing a range of streamlining and reinvention
initiatives to reduce burdens. They range from whole program streamlining as in the Office of
Solid Waste's comprehensive review of the RCRA program to the Office of Air's reengineering
of the pre-production certification program for new motor vehicles.
10:45-11:45 Thursday, May 13
PERSPECTIVES ON DATA AND INFORMATION FROM THE
DEPARTMENT OF HEALTH AND HUMAN SERVICES
William F. Raub, Ph.D.
Deputy Assistant Secretary for Science Policy
Department of Health and Human Services
The Department of Health and Human Services (DHHS) employs a wide variety of data and
information systems as it seeks to enhance the well-being of Americans by providing for
effective health and human services and by fostering strong, sustained advances in the sciences
underlying medicine, public health, and social services. DHHS data-oriented efforts range from
(a) collection of national vital and health statistics to (b) systematic surveillance focused on
specific diseases and disorders to (c) special surveys oriented to particular public health issues
and/or particular population groups. A major contemporary challenge is to improve surveillance
for new and reemerging infectious diseases in general while improving preparedness to detect
and respond to potential acts of biological terrorism.
-------
Analytic Challenges and
the Government
Performance and Results
Act
Judith Calem Lieberman
EPA/OCFO
13* EPA CONFERENCE ON STATISTICS
AND INFORMATION
May 10-13, 1999 Philadelphia, PA
OUTLINE
The GPRA: definition, objectives,
requirements, who's involved
4 Analytic Tasks
Analytic Challenges
Efforts to Improve EPA's Data Quality
-------
Government Performance
and Results Act (GPRA)
Legislation requiring agencies to set goals,
measure performance, and report on the degree to
which goals are met
A "legal constitution for good management"
Seeks to improve the efficiency, effectiveness,
and public accountability of federal agencies
Promotes transparency in decision-making with a
focus on program results (environmental
outcomes)
GPRA REQUIREMENTS
Strategic Plan
Description.
Mission and Long-term Goals
Activities and Resources
External Factors
After 'Managing for Results: Analytic Challenges in Measuring Performance' (GAO, 5/97)
-------
GPRA REQUIREMENTS
Performance Plan
Description.
Annual Performance Goals/Targets
Links Budget with Goals
Performance Measures to Assess
Progress
GPRA REQUIREMENTS
Performance Report
Description:
Assessment of Performance
Unmet Goals
-------
WHO'S INVOLVED IN
GPRA IN EPA?
Managers and staff in the Program
Offices who are directly involved in
strategic planning, accountability and
budget formulation/execution
All staff
EPA's Planning Architecture
Goals (10)
Objectives (41)
Sub-objectives (118)
Annual Performance Goals (APGs)
Annual Performance Measures (APMs)
-------
ANALYTIC STAGES
GPRA Requirement: Strategic Plan, Performance Plan, Performance Report
Analytic Stages:
1. Identify Goals
2. Develop Performance Measures
3. Validate and Verify Performance Data
4. Analyze and Report Results
ANALYTIC STAGES
GPRA Requirement: Strategic Plan, Performance Plan
Analytic Stage 1: Identify Goals
Challenges:
Program mission makes it difficult to define outcomes
APGs describe annual progress; difficult to characterize as outcomes
-------
EXAMPLES
Environmental Outcome Sub-objective: By
2010, visibility in some eastern national parks and
wilderness areas (Class I areas) will improve by
as much as 30% from 1995 levels
Sub-objective written as an outcome and APG as
an activity:
Sub-objective: By 2010, make the air
safer to breathe for an additional 74
million Americans living in areas
expected to violate the revised standards
by attaining and maintaining the new
NAAQS for PM2.5
APG (1999): Deploy PM2.5 ambient
monitors at 1500 sites
ANALYTIC STAGES
GPRA Requirement: Performance Plan
Analytic Stage 2: Develop Performance Measures
Challenges:
Outcomes may take years to develop
Need data
Clear relationship to goal
Cover key aspects of program
-------
ANALYTIC STAGES
GPRA Requirement: Performance Plan, Performance Report
Analytic Stage 3: Validate and Verify Performance Data
Challenges:
Data limitations
Quality of 3rd party data
EXAMPLE
FY 2000 APG: Air toxics emissions nationwide
from stationary and mobile sources combined will
be reduced by 5% from 1999 (for a cumulative
reduction of 30% from the 1993 level of 1 3
million tons)
Corresponding Performance Measures:
Combined stationary and mobile source
reductions in air toxics emissions (5%)
Reductions in national highway vehicle
benzene emissions (21,871 tons)
Reductions in national highway vehicle
butadiene emissions (3,498 tons)
-------
ANALYTIC STAGES
GPRA Requirement: Performance Report
Analytic Stage 4: Analyze and Report Results
Challenges:
Understanding impact of program activities on results
Understanding roles of different players
EFFORTS TO IMPROVE
EPA'S DATA QUALITY
Greater reliance on electronic data interchange
Trend in making data available electronically so
data can be reviewed by its source(s)
Use of external review Boards to review
environmental analyses
Development of standardized guidance or
regulatory definitions of key terms to promote
consistency
Use of customer surveys to identify data quality
and data management problems and action plans
-------
CONCLUSION
GPRA emphasizes goals and objectives
that have environmental outcomes
GPRA encourages EPA to rely more
heavily on performance data to inform
program and resource allocation decisions
Poor data quality will limit the usefulness
of the data in informing planning decisions
Program evaluation is an important
analytical tool to understand the impact of
program activities on results
REFERENCES
Managing for Results: Analytic
Challenges in Measuring Performance
(GAO, May 1997)
The Results Act: An Evaluator's Guide
to Assessing Agency Annual Performance
Plans (GAO, April 1998)
Managing for Results: Measuring
Program Results that are Under Limited
Federal Control (GAO, December 1998)
-------
Monte Carlo and Permutation Methods in Statistics
Woollcott K Smith
EPA Statisticians Meeting
Sugarloaf, Temple University
May 10, 1999
-------
Monte Carlo and Permutation
Methods in Statistics
Outline
What's Monte Carlo about Monte Carlo methods?
An Example: Paired Comparisons
Rules and Guidelines for Simulation Based Methods
Some Powerful Applications
Missing data: multiple imputation
Nonlinear errors in variables, SIMEX
Smith's statistical ecology applications - NOT
Rattlesnakes
Generic la-la Monte Carlo Diagram
Analysis of Paired Data
Getting Semi-Real
Minitab Lakes Data Set Consisting of 149 Lakes
Pairs
Alkalinity reading around 1930
Alkalinity reading around 1980
Exercise Trellis Graphics
Alkalinity data by Lake Type
Spring fed Lakes Paired Data
-------
qq-Plot for Paired pH Differences
qq-Plot for Paired Alkalinity Differences
Paired Analysis
1980     1930     Difference   Rank
428.91   442.56     -13.65       2
442.42   469.63     -27.21       5
714.98   639.68      75.3       14
717.72   669.63      48.09      10
980.18   787.71     193.47      16
718.25   715.05       3.2        1
785.34   750.47      14.87       3
791.93   690.97     100.96      15
593.85   578.8       15.05       4
371.4    308.32      65.08      12
672.55   711.87     -39.32       7
492.18   460.73      31.45       6
594.1    546.93      47.17       9
345.05   415.31     -70.26      13
441.55   397.15      44.4        8
663.84   608.32      55.52      11
Ranks of the negative differences: 2, 5, 7, 13
Paired t-test
Assumptions: d's are iid normal random variables
$t = \bar{d} / (s_d / \sqrt{n}) = 2.1996$
p-value = Pr(|t| > 2.1996) = 0.0439
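The paired t statistic is the mean difference divided by its standard error. A minimal Python sketch follows; the function name `paired_t` and the four illustrative pairs are assumptions for the example, not the lake data from the slide. Turning the statistic into a p-value requires the t distribution with n - 1 degrees of freedom, which is left to a statistics package.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)   # mean of the paired differences
    s_d = statistics.stdev(diffs)    # sample standard deviation of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1


# Hypothetical readings, for illustration only.
earlier = [10.0, 12.0, 9.0, 11.0]
later = [12.0, 15.0, 10.0, 14.0]
t_stat, df = paired_t(earlier, later)
print(round(t_stat, 3), df)
```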
Exact Wilcoxon signed-rank test
Assumptions: d's are iid symmetric random variables
Rank the absolute differences
V = sum of ranks associated with positive differences
= 109
p-value = Pr(V ≤ 27 or V ≥ 109) = 0.0335
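The exact signed-rank p-value can be obtained by counting, for every possible rank sum, how many of the 2^n sign assignments produce it. The sketch below assumes no ties and no zero differences; the function name `signed_rank_exact_p` is made up for the example. For V = 109 with n = 16 differences it returns about 0.0335.

```python
def signed_rank_exact_p(v_obs, n):
    """Two-sided exact p-value for the Wilcoxon signed-rank statistic (no ties)."""
    total = n * (n + 1) // 2
    # counts[s] = number of subsets of the ranks {1..n} whose sum is s
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    lower = min(v_obs, total - v_obs)
    upper = total - lower
    tail = sum(counts[: lower + 1]) + sum(counts[upper:])
    return min(1.0, tail / 2 ** n)


p_value = signed_rank_exact_p(109, 16)
print(round(p_value, 4))
```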
Exact binomial test
sign test
Assumptions: d's are iid AND Pr(d > 0) = 0.5
X = number of positive differences
X_observed = 12
p-value = Pr(X ≤ 4 or X ≥ 12) = 0.0768
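The sign test p-value is a two-sided binomial tail with success probability 0.5. A minimal sketch, with the hypothetical function name `sign_test_p`: for 16 differences of which 12 are positive it gives about 0.0768, the p-value printed on the slide.

```python
from math import comb


def sign_test_p(n_pos, n):
    """Two-sided exact sign-test p-value under Pr(d > 0) = 0.5."""
    k = max(n_pos, n - n_pos)                      # the more extreme of the two counts
    upper_tail = sum(comb(n, j) for j in range(k, n + 1))
    return min(1.0, 2 * upper_tail / 2 ** n)       # double the tail by symmetry


p_sign = sign_test_p(12, 16)
print(round(p_sign, 4))
```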
-------
Finally a Monte Carlo
Randomization Test
Assumptions: d's are i.i.d symmetric random variables
Assign signs at random to |d|
Choose a good test statistic, sample mean of d
Repeat procedure n-1 times
Paired Comparison Monte Carlo Hypothesis Test Diagram
Finally a Monte Carlo Result
Repeat procedure 19 times:
p-value = (1 + 0)/n = 1/20 = 0.05
Repeat procedure 999 times:
p-value = …/1000
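The randomization procedure on these slides (random signs, sample mean as the test statistic, n - 1 repetitions) can be sketched in Python as follows. The function name `sign_flip_test`, the two-sided comparison on the absolute mean, and the example differences are assumptions for illustration; the p-value counts the observed data set as one of the n outcomes, so it can never be smaller than 1/n.

```python
import random


def sign_flip_test(diffs, n_sims=999, seed=0):
    """Monte Carlo randomization test for paired differences (two-sided)."""
    rng = random.Random(seed)
    observed = abs(sum(diffs) / len(diffs))
    extreme = 0
    for _ in range(n_sims):
        # Under symmetry about zero, each difference is equally likely to be + or -.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            extreme += 1
    # The observed data set counts as one of the n = n_sims + 1 outcomes.
    return (1 + extreme) / (n_sims + 1)


# Hypothetical paired differences, for illustration only.
diffs = [4.1, -1.3, 2.7, 5.0, -0.8, 3.6, 2.2, 1.9]
p_mc = sign_flip_test(diffs, n_sims=999)
print(p_mc)
```

With 19 repetitions the smallest attainable p-value is 1/20 = 0.05, which is why a larger number such as 999 is usually preferred.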
From the Trivial to an Impossible
Multivariate
Ecological Paired Comparison
Three replicate Hester-Dendy collectors
were placed at both downstream and upstream locations
to study the effect of a municipal outfall.
Complex, highly variable data set, total of 96 taxonomic groups.
The 1996 upstream collectors contained 1,080
individuals and 32 taxonomic groups,
while a year earlier the same location
had only 65 individuals and 13 taxonomic groups.
-------
Paired Comparison over Time
[Figure: totals over time for Reach 1 and Reach 2 (downstream)]
Ecological Distance Measure
The distance between two samples
is defined as the number of species not shared:
that is, the number of species that are present
in exactly one of the two samples.
False absence: A species is absent from the sample but
present in the population.
Smith, Solow and Preston ( Biometrics, 1996)
and many others have detailed the statistical problems
associated with presence-absence measures.
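The distance defined above is the size of the symmetric difference between the two species lists. A minimal Python sketch, with a hypothetical function name and made-up taxa:

```python
def species_distance(sample_a, sample_b):
    """Number of species present in exactly one of the two samples."""
    return len(set(sample_a) ^ set(sample_b))


# Made-up taxa, for illustration only.
upstream = {"mayfly", "caddisfly", "midge"}
downstream = {"midge", "leech"}
print(species_distance(upstream, downstream))  # mayfly, caddisfly, leech -> 3
```

Because the measure uses presence and absence only, a species missed by the collector (a false absence) changes the distance by one.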
Observed between Location
and within Location Distance
Paired Comparison Monte Carlo Hypothesis Test Diagram
All possible permutations
Random permutations (paired example)
Jackknife
Single observation removed in each sample
Blocks of observations removed
Bootstrap
Sampling with replacement
Parametric Bootstrap
Sample from the estimated parametric model
Gibbs' Sampling and Markov Chain Monte Carlo
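As an illustration of the jackknife entry in this list (single observation removed in each sample), the sketch below computes a leave-one-out jackknife standard error. The function name `jackknife_se` and the data are assumptions for the example. For the sample mean the result coincides with the usual s/sqrt(n), which makes a convenient check.

```python
import math
import statistics


def jackknife_se(data, stat=statistics.mean):
    """Leave-one-out jackknife standard error of stat(data)."""
    n = len(data)
    # Recompute the statistic with each observation removed in turn.
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    return math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in loo))


values = [1.0, 2.0, 4.0, 7.0]  # made-up data
print(jackknife_se(values))
```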
All permutations.
Any nonparametric text, e.g. Lehmann, E. L. (1975)
An Example: Exact test on Markov chains
Smith, W. and A.R. Solow (1996)
An Exact McNemar Test for Paired Binary Markov
Chains. Biometrics.
Software: StatXact
Random Permutation Methods
Sokal, R.R. and F.J. Rohlf (1995). Biometry: Chapter 18
Manly, B.F.J. (1995) Randomization and Monte Carlo Methods
in Biology.
Software: S-PLUS, SAS, Resampling Stats, Fortran
-------
Jackknife and Bootstrap
Efron, B. and Tibshirani, R. (1993) An Introduction to the Bootstrap
Efron, B. (1982) The Jackknife, the Bootstrap and Other
Resampling Plans
Gray, H. L. and Schucany, W. R. (1972) The Generalized
Jackknife Statistic
An Application to Ecological Measures
Confidence Intervals for Similarity Measures using the
Two Sample Jackknife: Smith, Kravitz and Grassle (1979)
Multivariate Methods in Ecological Work.
Expected Species Shared- ESS
Normalized Expected Species Shared - NESS
Jackknifed NESS- JNESS
Hidden Species Area Curves: Clara Chu's Temple Thesis
Properties of Samplers
Repeatable
Will the sampler produce the same set of samples each time?
That is, it's not a Monte Carlo Sampler.
Checkable
Could a reasonable analyst duplicate the procedure?
But not the exact results.
Could a reasonable analyst read and understand the
procedure?
Computationally Inexpensive
SIMEX- Simulation
Extrapolation
Raymond Carroll (1998). Measurement Error in
Epidemiologic Studies. Encyclopedia of Biostatistics
Simple Measurement Error Model
$Y = \beta_1 + \beta_2 x + \varepsilon$
Example of Errors in Variable Model
Paired Comparison Monte Carlo Hypothesis Test Diagram
SIMEX Plot
Means of simulated
estimates with error
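The SIMEX idea can be sketched as follows: add extra measurement error of increasing size to the error-prone covariate, track how the fitted slope degrades, and extrapolate that trend back to the no-error case (lambda = -1). This is a simplified illustration, not the procedure from the talk: the measurement error standard deviation `sigma_u` is assumed known, a linear extrapolant is used for brevity where a quadratic is more common, and all names and simulated data are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), n_rep=20, seed=0):
    """Return (naive slope, SIMEX slope) using a linear extrapolant."""
    rng = random.Random(seed)
    lam_pts = [0.0]
    est_pts = [ols_slope(w, y)]  # naive estimate: no extra error added
    for lam in lambdas:
        sd_extra = (lam ** 0.5) * sigma_u
        reps = []
        for _ in range(n_rep):
            # Simulation step: inflate the measurement error variance by lam * sigma_u^2.
            w_star = [wi + rng.gauss(0.0, sd_extra) for wi in w]
            reps.append(ols_slope(w_star, y))
        lam_pts.append(lam)
        est_pts.append(sum(reps) / n_rep)
    # Extrapolation step: fit a line to (lambda, mean estimate) and evaluate at -1.
    b = ols_slope(lam_pts, est_pts)
    a = sum(est_pts) / len(est_pts) - b * sum(lam_pts) / len(lam_pts)
    return est_pts[0], a - b


# Simulated data with true slope 2 and known error SD 0.5 (illustration only).
gen = random.Random(1)
x_true = [gen.gauss(0.0, 1.0) for _ in range(2000)]
w_obs = [xi + gen.gauss(0.0, 0.5) for xi in x_true]
y_obs = [1.0 + 2.0 * xi + gen.gauss(0.0, 0.5) for xi in x_true]
naive, corrected = simex_slope(w_obs, y_obs, sigma_u=0.5)
print(naive, corrected)
```

The linear extrapolant removes only part of the attenuation; it is used here to keep the sketch short.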
-------
-------
Notes
-------
REGISTRANTS
The 1999 EPA Conference on
Environmental Statistics and Information
SugarLoaf Conference Center
Philadelphia, Pennsylvania May 10-13, 1999
RUTH ALLEN
OPP/HED/CEB
US EPA
Phone 703-305-7191
Fax 301-402-4279
E-mail Allen.ruth@epamail.epa.gov
HANS ALLENDER
US EPA
Phone 703-305-7883
Fax 703-605-0645
E-mail Allender.hans@epamail.epa.gov
JOSEPH ANDERSON
US EPA
Phone 202-260-3016
LARA P. AUTRY
OAR/OAQPS/EMAD
US EPA
Phone 919-541-5544
Fax 919-541-1039
E-mail Autry.lara@epa.gov
MICHAEL BARRETTE
US EPA
Phone 202-564-7019
E-mail Barrette.michael@epamail.epa.gov
SAMANTHA BATES
UNIV OF WASHINGTON
MALCOLM BERTONI
RESEARCH TRIANGLE INSTITUTE
Phone 202-728-2067
Fax 202-728-2095
E-mail MJB@rti.org
CINDY BETHELL
US EPA
GEORGE BONINA
OIRM
Phone 202-260-6227
E-mail Bonina.george@epa.gov
PATRICIA BONNER
US EPA
Phone 202-260-0599
E-mail Bonner.patricia@epamail.epa.gov
ED BRANDT
CEIS/IAIAD
US EPA
Phone 202-260-6217
E-mail Brandt.edward@epamail.epa.gov
LORI BRUNSMAN
OPPTS/OPP/HED
US EPA
Phone 703-308-2902
Fax 703-605-0645
E-mail Brunsman.lori@epamail.epa.gov
-------
DANIEL CARR
GEORGE MASON UNIV
Phone 703-993-1671
Fax 703-993-1521
HEATHER ANNE CASE
OP/CEIS
US EPA
Phone 202-260-2360
Fax 202-260-4903
E-mail Case.heather@epamail.epa.gov
WENDY CLELAND-HAMNETT
OP/CEIS
US EPA
Phone 206-260-4030
Fax 202-260-0275
E-mail Cleland-Hamnett.wendy@epa.gov
SILVESTRE COLON
OFFICE OF WATER
US EPA
Phone 202-260-3066
Fax 202-260-7185
E-mail Colon.silvestre@epamail.epa.gov
VIRGINIA A. COLTEN-BRADLEY
OSWER/EMRAD
US EPA
Phone 703-308-8613
Fax 703-308-0509
E-mail Colten-bradley.virginia@epamail.epa.gov
MARGARET CONOMOS
OPPE/CEIS
US EPA
Phone 202-260-3958
Fax 202-260-4968
E-mail Conomos.margaret@epa.gov
LAWRENCE COX
ORD/NERL
US EPA
Phone 919-541-2648
Fax 919-541-7588
E-mail Cox.larry@epamail.epa.gov
PETER CRAIGMILE
UNIV OF WASHINGTON
DAVID CROSBY
AMERICAN UNIVERSITY
Phone 202-885-3155
E-mail Dcrosby@american.edu
THOMAS CURRAN
OAR/OAQPS
US EPA
Phone 919-541-5694
Fax 919-541-4028
E-mail Curran.thomas@epamail.epa.gov
THOMAS DEMOSS
US EPA MAIA
Phone 410-305-2739
Fax 410-305-3095
E-mail Demoss.tom@epa.gov
SUSAN DEVESA
NATIONAL CANCER INSTITUTE
NIH
Phone 301-496-8104
Fax 301-402-0081
E-mail Devesas@epndce.nci.nih.gov
SUSAN DILLMAN
OPPTS/OPPT/NPCD
US EPA
Phone 202-260-5375
Fax 202-260-0001
E-mail Dillman.susan@epa.gov
KHOAN TAN DINH
US EPA
Phone 202-260-3891
Fax 202-260-1283
E-mail Dinh.khoan@epamail.epa.gov
DONALD DOERFLER
ORD/ERC/NHEERL
US EPA
Phone 919-541-7741
E-mail Doerfler.donald@epamail.epa.gov
BRENDAN DOYLE
US EPA
Phone 202-260-2693
Fax 202-260-4968
E-mail Doyle.brendan@epamail.epa.gov
-------
LEE ELLIS
CEIS
US EPA
Phone 202-260-6123
Fax 202-260-4968
E-mail Ellis.lee@epamail.epa.gov
ROBERT ENGLISH
INFO TRANS/ORG PLANNING
US EPA
Phone 202-260-5995
Fax 202-260-3655
E-mail English.robert@epamail.epa.gov
DAVID FARRAR
OPP
US EPA
Phone 703-305-5721
Fax 703-305-6309
E-mail Farrar.david@epamail.epa.gov
TERENCE FITZ-SIMONS
US EPA
Phone 919-541-0889
GEORGE T. FLATMAN
ORD/NERL-CRD
US EPA
Phone 702-798-2528
Fax 702-798-2208
E-mail George.flatman@epamail.epa.gov
JOHN F. FOX
OST
US EPA
Phone 202-260-9889
Fax 202-260-7185
E-mail Fox.john@epamail.epa.gov
MARY FRANKENBERRY
OPPTS/OPP/EFED
US EPA
Phone 703-305-5694
Fax 703-305-6309
E-mail Frankenberry.mary@epamail.epa.gov
ANNE FRONDORF
US GEOLOGICAL SURVEY
Phone 703-648-4205
Fax 703-648-4224
E-mail Anne_frondorf@usgs.gov
WILLIAM GARETZ
OPPE/CEIS
US EPA
Phone 202-260-2684
E-mail Garetz.william@epamail.epa.gov
PAT GARVEY
OIRM/EIMD
US EPA
Phone 202-260-3103
Fax 202-401-8390
E-mail Garvey.pat@epamail.epa.gov
SUSAN P. GEYER
CEIS
US EPA
Phone 202-260-6637
E-mail Geyer.susan@epa.gov
MELISSA GONZALES
ORD/NHEERL
US EPA
Phone 919-966-7549
Fax 919-966-7584
E-mail Gonzales.melissa@epa.gov
PETER GOODWIN
DEAN GRADUATE SCHOOL
TEMPLE UNIVERSITY
BRIAN GREGORY
OAR/ORIA/IED/CHB
US EPA
Phone 202-564-9024
Fax 202-565-2038
E-mail Gregory.brian@epamail.epa.gov
JAY HAKES
ENERGY INFORMATION ADMINISTRATION
-------
STEVEN M. HASSUR
OPPT
US EPA
Phone 202-260-1735
Fax 202-260-0981
E-mail Hassur.steven@epamail.epa.gov
RICHARD HEIBERGER
TEMPLE UNIVERSITY
KAREN KLIMA
US EPA
JAMES HEMBY
OAQPS
US EPA
Phone 919-541-5459
Fax 919-541-2464
E-mail Hemby.james@epa.gov
DAVID M. HOLLAND
ORD/NHEERL
US EPA
Phone 919-541-3126
Fax 919-541-1486
E-mail Holland.david@epamail.epa.gov
STEVE HUFFORD
CEIS
US EPA
Phone 202-260-9732
Fax 202-260-4968
E-mail Hufford.steve@epamail.epa.gov
BARNES JOHNSON
OSWER/OSW
US EPA
Phone 703-308-8855
Fax 703-308-0511
E-mail Johnson.barnes@epamail.epa.gov
HENRY KAHN
OW/EAD
US EPA
Phone 202-260-5408
Fax 202-260-7185
E-mail Kahn.henry@epamail.epa.gov
R. CATHERINE KING
US EPA OECEJ
Phone 215-814-0871
Fax 215-814-2905
E-mail King.catherine@epamail.epa.gov
ARTHUR T. KOINES
OP/CEIS
US EPA
Phone 202-260-4030
Fax 202-260-0275
E-mail Koines.arthur@epamail.epa.gov
MEL KOLLANDER
INSTITUTE FOR SURVEY RESEARCH
Phone 202-537-6845
Fax 202-537-6873
LEE KYLE
OGWDW
US EPA
Phone 202-260-1154
Fax 202-401-3041
E-mail Kyle.lee@epamail.epa.gov
PEPI HERBERT LACAYO
CEIS
US EPA
Phone 202-260-2714
Fax 202-260-4968
E-mail Lacayo.pepi@epamail.epa.gov
RASHMI LAL
OP/CEIS
US EPA
Phone 202-260-3007
Fax 202-260-8550
E-mail Rashmi.lal@epamail.epa.gov
JADE LEE
EPA OFFICE OF WATER
Phone 202-260-1996
Fax 202-260-7185
E-mail Lee.jade@epa.gov
JUDY LEE
WASTE/CHEM MGMT DIV
Phone 215-814-3401
Fax 215-814-3113
E-mail Lee.judy@epamail.epa.gov
-------
LAWRENCE LEHRMAN
RMD/OIS
US EPA
E-mail Lehrman.lawrence@epamail.epa.gov
ELEANOR LEONARD
OP/CEIS
US EPA
Phone 202-260-9753
Fax 703-525-3455
E-mail Elleonard@aol.com
JUDITH C. LIEBERMAN
OCFO
US EPA
Phone 202-260-8638
Fax 202-401-1515
E-mail Lieberman.judy@epamail.epa.gov
PHILIP LINDENSTRUTH
OFFICE OF WATER
US EPA
Phone 202-260-6549
Fax 202-260-7024
E-mail Lindenstruth.phil@epamail.epa.gov
CONNIE LORENZ
OP/CEIS/CSAD
US EPA
Phone 202-260-4660
Fax 202-260-4903
ARTHUR LUBIN
OSEA
US EPA
Phone 312-886-6226
Fax 312-353-0374
E-mail Lubin.arthur@epamail.epa.gov
ALLAN MARCUS
NCEA
US EPA
Phone 919-541-0643
Fax 919-541-1818
E-mail Marcus.allan@epa.gov
ELIZABETH MARGOSCHES
OPPTS/OPPT
US EPA
Phone 202-260-1511
Fax 202-260-1279
E-mail Margosches@epamail.epa.gov
MARY A. MARION
OPPTS/OPP/HED
US EPA
Phone 703-308-2854
E-mail Marion.mary@epamail.epa.gov
DAVID MARKER
UNIV OF WASHINGTON
ETHAN MCMAHON
OP/CEIS
US EPA
Phone 202-260-8549
E-mail Mcmahon.ethan@epamail.epa.gov
MICHAEL MESSNER
OGWDW
US EPA
Phone 202-260-8107
E-mail Messner.michael@epamail.epa.gov
STEVEN P. MILLARD
PSI
Phone 206-528-4877
Fax 206-528-4802
E-mail Smillard@probstatinfo.com
CHRISTOPHER MILLER
NOAA
DAVID MILLER
OPPTS/OPP
US EPA
Phone 703-305-5352
Fax 703-305-5147
E-mail Miller.davidJ@epamail.epa.gov
-------
DAVID MINTZ
OAR/OAQPS
US EPA
Phone 919-541-5224
Fax 919-541-1903
E-mail Mintz.david@epa.gov
MARGARET MORGAN-HUBBARD
DIRECTOR, OFFICE OF COMMUNICATION
US EPA
Phone 202-260-5965
E-mail Morgan-hubbard.margaret@epamail.epa.gov
AL MORRIS
OFFICE OF ENVIRON DATA
Phone 215-814-5701
Fax 215-814-5718
E-mail morris.alvin@epa.gov
REBECCA MOSER
CEIS
US EPA
Phone 202-260-6780
Fax 202-260-4903
JOHN MOSES
OP/CEIS
US EPA
Phone 202-260-6380
Fax 202-401-7617
E-mail Moses.john@epamail.epa.gov
NICK NAPOLI
US EPA
Phone 215-816-2621
Fax 215-814-2783
E-mail Napoli.nick@epamail.epa.gov
MALIHA S. NASH
ORD/NERL-CRD
US EPA
Phone 702-798-2528
Fax 702-798-2208
E-mail Nash.maliha@epamail.epa.gov
KIMBERLY NELSON
PA DEPT OF ENVIR PROTECTION
Phone 717-787-3534
Fax 717-783-8926
E-mail Nelson.kimberly@dep.state.pa.us
BARRY NUSSBAUM
OPPE/CEIS
US EPA
Phone 202-260-1493
Fax 202-460-4968
E-mail Nussbaum.barry@epamail.epa.gov
ROB O'BRIEN
BATTELLE
Phone 509-375-6769
Fax 509-375-2604
E-mail Robert.obrien@pnl.gov
ANTHONY R. OLSEN
US EPA NHEERL
Phone 541-754-4790
Fax 541-754-4716
E-mail Tolsen@mail.cor.epa.gov
G. P. PATIL
PENNSYLVANIA STATE UNIV
Phone 814-865-9442
Fax 814-863-7114
E-mail Gpp@stat.psu.edu
ROBERT M. PATTERSON
COLLEGE OF ENGINEERING
TEMPLE UNIVERSITY
Phone 215-204-1665
Fax 215-204-6936
E-mail rpatterson@thunder.temple.edu
DAVID PAWEL
ORIA
US EPA
Phone 202-564-9202
PETER PETRAITIS
UNIVERSITY OF PENNSYLVANIA
-------
ANNE POLITIS
CEIS/IAIAD
US EPA
Phone 202-260-5345
Fax 202-260-4903
E-mail Politis.anne@epamail.epa.gov
RAFAEL PONCE
UNIV OF WASHINGTON
WILLIAM F. RAUB
DEPUTY ASST SECTY SCI POLICY
DEPT HEALTH & HUMAN SERV
BREEDA REILLY
CEPPO
US EPA
Phone 202-260-0716
E-mail Reilly.breeda@epamail.epa.gov
JOSEPH RETZER
OP
US EPA
Phone 202-260-2472
E-mail Retzer.joseph@epamail.epa.gov
JOEL REYNOLDS
UNIV OF WASHINGTON
EDNA RODRIGUEZ
OP/CEIS/CSAD
US EPA
Phone 202-260-3301
Fax 202-260-4903
E-mail Rodriguez.edna@epamail.epa.gov
N. PHILLIP ROSS
CEIS
US EPA
Phone 202-260-5244
Fax 202-260-8550
E-mail Ross.Nphillip@epamail.epa.gov
KRISTEN RYDING
OEA
US EPA
Phone 206-553-6918
PAUL SAMPSON
UNIV OF WASHINGTON
DINA SCHREINEMACHERS
EBB/HSD/NHEERL/ORD
US EPA
Phone 919-966-5875
Fax 919-966-7584
E-mail Schreinemachers.dina@epamail.epa.gov
ANDREW SCHULMAN
OGWDW/SRMD/TAB
US EPA
Phone 202-260-4197
Fax 202-260-3762
E-mail Schulman.andrew@epamail.epa.gov
RONALD SHAFER
OP/CEIS
US EPA
Phone 202-260-6766
Fax 202-260-4968
E-mail Shafer.ronald@epamail.epa.gov
BOB SHEPANEK
ORD/NCEA
US EPA
Phone 202-564-3348
Fax 202-565-0061
E-mail Shepanek.robert@epamail.epa.gov
CAROLYN SHETTLE
INSTITUTE FOR SURVEY RESEARCH
Phone 202-537-6793
Fax 202-537-6873
E-mail cshettle@ioip.com
ABRAHAM SIEGEL
OW/OGWDW
US EPA
Phone 202-260-2804
E-mail Siegel.abraham@epamail.epa.gov
-------
BIMAL SINHA
OPPE/CEIS
US EPA
Phone 202-260-2681
E-mail Sinha.bimal@epamail.epa.gov
BENJAMIN SMITH
US EPA
Phone 202-260-3026
Fax 202-260-3762
E-mail Smith.ben@epamail.epa.gov
WILLIAM P. SMITH
OPPE/CEIS
US EPA
Phone 202-260-2697
Fax 202-260-4968
E-mail Smith.will@epamail.epa.gov
WOOLLCOTT SMITH
TEMPLE UNIVERSITY
MINDI SNOPARSKY
HYDROGEOLOGY
EPA
Phone 215-814-3316
E-mail Snoparsky.mindi@epamail.epa.gov
JOHN A. SORRENTINO
TEMPLE UNIVERSITY
Phone 215-204-8164
E-mail Soirento@astro.ocis.temple.edu
DOREEN STERLING
CEIS
US EPA
Phone 202-260-2766
Fax 202-260-8550
E-mail Sterling.doreen@epamail.epa.gov
WILLIAM TASH
VICE-PROVOST
TEMPLE UNIVERSITY
MARY LOU THOMPSON
UNIV OF WASHINGTON
Phone 206-616-2723
Fax 206-616-2724
E-mail Mlt@biostat.washington.edu
HENRY TOPPER
OPPT
US EPA
Phone 202-260-6750
Fax 202-260-2217
E-mail topper.henry@epa.gov
MARIANNE TURLEY
UNIV OF WASHINGTON
Phone 206-616-9288
Fax 206-616-9443
E-mail Marianne@cqs.washington.edu
DIANNE WALKER
US EPA REGION III
Phone 215-814-3297
Fax 215-814-2134
E-mail Walker.dianne@epamail.epa.gov
JOHN WARREN
ORD/NCERQA/QAD
US EPA
Phone 202-260-9464
Fax 202-401-7922
E-mail Warren.john@epamail.epa.gov
NANCY WENTWORTH
ORD/NCERQA/QAD
US EPA
Phone 202-564-6830
Fax 202-565-2441
E-mail Wentworth.nancy@epamail.epa.gov
ELLEN WERNER
INSTITUTE FOR SURVEY RESEARCH
Phone 202-537-6735
Fax 202-537-6873
E-mail Ewerner@ioip.com
RICK WESTLUND
US EPA OFFICE OF POLICY
Phone 202-260-2745
Fax 202-260-9322
E-mail Westlund.rick@epa.gov
-------
CHARLES WHITE
OW/OST/EAD
US EPA
Phone 202-260-5411
Fax 202-260-7185
E-mail White.chuck@epamail.epa.gov
NATHAN WILKES
OFFICE OF POLICY
US EPA
Phone 202-260-4910
Fax 202-260-4903
E-mail Wilkes.nathan@epa.gov
JENNIFER WU
OW/OGWDW/SBMD
US EPA
Phone 202-260-0425
Fax 202-260-3762
E-mail Wu.jennifer@epamail.epa.gov
-------
-------
Notes
-------
Welcome to the 1999 EPA Conference on Environmental Statistics and Information
It is my pleasure to welcome you on behalf of EPA's Center for Environmental
Information and Statistics to the 1999 EPA Conference on Environmental Statistics and
Information. For many "old timers" to this conference, you will immediately recognize that we
have changed the name and scope of the conference. Just as EPA is currently planning a new
Office of Information, we have expanded our coverage to information in addition to statistics.
We welcome new attendees to the conference which covers all aspects of data, information,
computer technology, data systems and statistics. We have the data gatherers, the data
administrators, the data users, the data analyzers, the data simulators, the data reinventors, the
data modelers, the data miners, the data assessors, and those who still can't figure out if data are
singular or plural. We hope that all of the attendees will better understand the interrelationships
among these groups as the new Office of Information will encompass so many of these activities.
This year's theme is "EPA's Vision for the 21st Century". I am well aware that many of
this year's conferences on any topic will have similar names. (I also fear any ho-hum building
constructed this year will shortly be labeled "turn of the century" architecture.) However, as
EPA is undergoing major changes with the creation of the Office of Information, the Agency is
trying to position itself to meet the burgeoning information challenges of the new century, so I
feel the theme is truly appropriate. I hope that this conference will enable us to better understand
those challenges and approaches to meeting them.
We have an exciting collection of plenary sessions, featured talks, concurrent
presentations, training sessions, poster/computer sessions and panel discussions. However, as
with most meetings, some of the informal opportunities to meet and chat with your colleagues
can frequently be the most productive aspect of the conference. I encourage you to take full
advantage of this year's campus-type setting to continue your dialogues with your associates. I
owe a great deal of thanks to the planning and arrangements committees for their efforts to
organize this conference. Special thanks also go to Margaret Conomos and Connie Lorenz who
assisted me in putting it all together, and to Temple University's Institute for Survey Research
that handled the details and coordination. We encourage you to have a good time, learn a lot,
and tell us about any enhancements you would like to see in the future.
Barry D. Nussbaum
1999 Conference Chair
Conference Planning Committee
Elizabeth Margosches
George Flatman
Ruth Allen
John Warren
Arrangements Committee
Susan Auby
Joan Bundy
Trudy McCoy
Ed Lloyd
-------
Agenda for the 1999 EPA Conference on
Environmental Statistics and Information
Monday, May 10, 1999
3:00-6:00 REGISTRATION AND CHECK-IN
Foyer
4:00-6:00 CONCURRENT TRAINING SESSIONS
. Woollcott Smith (Temple University) and Peter Petraitis
(University of Pennsylvania) - Workshop on Monte
Carlo Methods in Environmental Statistics
. Joe Anderson (EPA/OIRM) - EPA's Web Site and You Room p
6:00-7:00 Cash Bar
Foyer
Tuesday, May 11, 1999
8:30-9:00 Welcoming Remarks and Introduction of Speakers
Wendy Cleland-Hamnett, Director, EPA Center for
Environmental Information and Statistics
Peter Goodwin, Dean, Graduate School, Temple
University
Dining Room
9:00-9:30 Al Morris, Director, Office of Environmental Data, EPA Region Dining Room
III, Philadelphia, PA - Information, Statistics and the Region
9:30-10:30 Keynote Address
Jay Hakes, Administrator, Energy Information
Administration
Dining Room
10:00-12:00 Statistical Training Session (Videotapes)
Room C
-------
Tuesday, May 11, 1999 (Continued)
10:30-10:45 Break
10:45-12:00 CONCURRENT PRESENTATIONS
Statistics, Information, and GPRA (Chair: George Bonina, EPA) Room D
. Judith Calem Lieberman (EPA/OCFO) - Analytic
Challenges and the Government Performance and
Results Act
George Bonina (EPA) - Reinventing Environmental
Information
Local Applications of EPA Data (Chair: Ron Shafer, EPA/CEIS) Dining Room
. Henry Topper (EPA/OPPT) - The Baltimore Community
Environmental Partnership: Lessons Learned
Kimberly Nelson (Pennsylvania Department of
Environmental Protection) - The Department of
Environmental Protection Compliance Reporting
System
. N. Bouwes, Steven M. Hassur (EPA/OPPT), S. Keane,
E. Fechner Levy, B. Firlie, and R. Walkling (Abt
Associates, Inc.) - Risk-Screening Environmental
Indicators Model
Statistical Methods for Lab and Air Quality Data Analysis Room B
(Chair: Larry Cox, EPA/ORD/NERL)
. Mary Lou Thompson and Kerrie Nelson (University of
Washington) - Statistical Modeling of Multiply-
Censored Data
. Peter Craigmile (University of Washington) - Trend
Estimation Using Wavelets
Joel H. Reynolds (University of Washington) -
Meteorological Adjustment of Surface Ozone for Trend
Analysis: Pick an Answer, Any Answer
12:00-1:15 Lunch
-------
Tuesday, May 11, 1999 (Continued)
1:15-2:30 CONCURRENT PRESENTATIONS
Databases: The Manager's View (Chair: Phil Lindenstruth, EPA) Room D
. Panel: Phil Lindenstruth - STORET, Abraham Siegel -
SDWIS, Mike A. Mundell - PCS (EPA)
Ensuring the Quality of Environmental Information Dining Room
(Chair: Nancy Wentworth, EPA/ORD)
Nancy Wentworth (EPA/ORD) - Quality Assurance and
Environmental Information
Malcolm Bertoni (Research Triangle Institute) - Using
SimSITE to Illustrate Sampling Techniques
Models and Model Assessment of Environmental Data Room B
(Chair: Mary Lou Thompson, University of Washington)
Rafael Ponce (University of Washington) - Development
of a Linked Pharmacokinetic-Pharmacodynamic Model
of Methylmercury-Induced Developmental
Neurotoxicity
Samantha Bates, A. C. Cullen, and A. E. Raftery
(University of Washington) - Bayesian Model
Assessment
- Marianne Turley, E. David Ford, and Joel Reynolds
(University of Washington) - Pareto Optimal Multi-
Criteria Model Assessment
2:30-4:30 Statistical Training Session (Videotapes) Room C
2:30-2:45 Break
-------
Tuesday, May 11, 1999 (Continued)
2:45-4:00 CONCURRENT PRESENTATIONS
Use of the Internet for Sharing Statistics Dining Room
(Chair: Steve Hufford, EPA/CEIS)
. Pat Garvey (EPA/OIRM) - Envirofacts Warehouse:
Environmental Data On The Internet Empowering the
Citizen toward Environmental Protection and
Awareness
Anne Frondorf (USGS) - National Biological Information
Infrastructure
. Chris Miller (NOAA) - National Environmental Data
Index
. Bob Shepanek (EPA/ORD) - EPA's Environmental
Information Management System
Epidemiology and Risk Assessment: Cumulative and/or Aggregate Room B
Risk Assessment (Chair: Ruth Allen, EPA)
. David Miller (EPA/OPPTS/OPP) - Food Quality
Protection Act and its Implementation: an Overview of
Statistical and Probabilistic Issues Facing the Office of
Pesticide Programs
. Hans D. Allender (EPA/OPP/HED) - Finding a Statistical
Distribution to Use in the Monte Carlo Exposure
Assessment of Livestock Commodities
. Breeda Reilly (EPA/CEPPO) - Applying Epidemiology to
Study the Prevention of Major Chemical Accidents
Statistical Research Issues in Quality Assurance (Chair: John Room D
Warren, EPA/ORD)
John Warren - Integrating Data Quality Indicators (DQIs)
into Data Quality Objectives (DQOs)
. Charles White (EPA/OW/OST) - A Performance
Evaluation of the Method Detection Limit
-------
Tuesday, May 11, 1999 (Continued)
4:15-5:15 PLENARY SESSION (Chair: Wendy Cleland-Hamnett, EPA/CEIS) Dining Room
Corrinne Caldwell, Acting Provost & Vice President,
Temple University - Welcoming Remarks
. Tom Curran (EPA/OAR) - Data, Information and
Statistics: Putting it All Together for Decision-Making
5:15-8:00 POSTER AND COMPUTER SESSIONS Room A
. George T. Flatman (EPA/ORD/NERL) - Satellite data for
Landscape Ecology
Lawrence Lehrman (EPA/RMD/OIS) - Cluster analysis of
fish species and land use
. Connie Lorenz (EPA/CEIS) - All the Stats that are Fit to
Surf
Brand Niemann (EPA) - Digital Library Demonstration
Stuart H. Kerzner (EPA/Region III) - Information
Visualization - Turning Data into Information You Can
Easily Understand
. Maliha S. Nash (EPA/LEB) - Geostatistical Analysis of
Ecological Indicators
Arthur Lubin (EPA/OSEA) - Environmental Random
Stratified Sampling Designs Developed Via Cluster
Analysis
. Heather Case (EPA Customer Service) - What the CEIS
National Telephone Survey will be able to Tell EPA's
Information Providers
. Susannah Dillman (EPA/OPPTS/OPPT) - Methods to
Minimize Human Error in Reporting Analysis Results
John S. Graves (EPA/Region III) - Using Perl Scripts to
Import Data into GIS: An Example Using USGS
Ground Water Site Inventory Data
Rich Heiberger (Temple University) - Demonstration of
ESS, S-Plus, and Trellis Graphics
. William P. Smith (EPA/OPPE/CEIS) - CD Toxic Release
Inventory (TRI) Data Explorer
5:30-6:30 Wine and Cheese Party - Hosted by William Tash, Vice Provost Pool House
for Research, Temple University
-------
Wednesday, May 12, 1999
8:00-9:00 CONCURRENT PRESENTATIONS
Data Integration and Quality: Vision for the Future (Chair: Ruth
Allen, EPA)
Susan Devesa, D. Grauman, W. Blot, G. Pennello, R.
Hoover, and J. Fraumeni (NCI) - Atlas of Cancer
Mortality in the United States, 1950-94
Ruth Allen (EPA) - Surveillance Improvement Report
Dining Room
Analysis of Cleanups (Chair: Michael J. Messner, EPA/OGWDW) Room B
. Michael J. Messner (EPA/OGWDW) - Cryptosporidium
Occurrence in the Nation's Drinking Water Sources
. Bimal Sinha (EPA/OPPE/CEIS) - Statistical Estimation of
Average Reid Vapor Pressure of Regular Gasoline
Sampling and Design Issues in Environmental Studies (Chair:
Tony Olsen, EPA/NHEERL)
. David Marker (WESTAT) - Sample Designs for
Environmental Data Collection: Ranked Set Sampling
and Composite Sampling
Paul D. Sampson (University of Washington) -
Monitoring network design with applications to regional
air quality
Room D
9:00-9:15 Break
9:15-11:15 Statistical Training Session (Videotapes)
Room C
-------
Wednesday, May 12, 1999 (Continued)
9:15-10:45 CONCURRENT PRESENTATIONS
Some Analyses and Potential Analyses at EPA (Chair: Doreen
Sterling, EPA/CEIS)
. Mike Barrette (EPA/OE) - Integrating Data for Planning
and Targeting
. Tom DeMoss and Tom Pheiffer (EPA) - The Mid Atlantic
Integrated Assessment Program (MAIA)
. John Moses (EPA/CEIS) - Strategy to Address Evolving
Environmental Information Needs
Room D
Measurement Issues Related to our Water Supply (Chair: Barnes
Johnson, EPA/OSWER/OSW)
Andrew Schulman, Jennifer Wu, and Benjamin Smith
(EPA/OGWDW) - Forays into the Unforgiving -
Occurrence Estimation in the Realm of Data with
Multiple Censoring Points (arsenic in the public water
supply)
Henry Kahn, Helen L. Jacobs, and Kathleen A. Stralka
(EPA/OW/EAD) - Estimated Water Consumption In
The U.S. Based On The CSFII
. Virginia A. Colten-Bradley (EPA/OSWER/OSW) -
Development of a Neural Network Tool for Evaluation
of Waste Management Unit Designs
Dining Room
Assessing Risk (Chair: Elizabeth Margosches, EPA/OPPTS)
. Mary Marion (EPA/OPPT) - Simulation and Acute
Dietary Risk Assessments
. David Pawel (EPA/OAR) - Proposed EPA Methodology
for Assessing Risks from Indoor Radon
. Elizabeth H. Margosches, Ph.D., Jennifer Seed, Ph.D.,
and Khoan T. Dinh, Ph.D. (EPA/OPPTS) - Health Data:
How Do We Use It To Protect the Public/Environment?
Margaret Conomos (EPA/CEIS) - Discussant
Room B
-------
Wednesday, May 12, 1999 (Continued)
10:45-11:00 Break
11:00-11:30 PLENARY SESSION (Chair: Barry Nussbaum, EPA/CEIS)
. Woollcott Smith (Temple University) - A Walk on the
Wild Side of Statistical Communication
Dining Room
11:30-12:30 PLENARY SESSION (Chair: Doreen Sterling, EPA/CEIS)
- Robert English (EPA) - Proposed Information
Management Office
Dining Room
12:30-2:00 Lunch
2:00-4:00 Statistical Training Session (Videotapes)
Room C
2:00-3:30 CONCURRENT PRESENTATIONS
The Visual Presentation of Data (Chair: Al Morris, EPA, Dining Room
Region III)
Al Morris (EPA) - Enviroviz - Turning Numbers into
Visual Relationships
. David Mintz (EPA/OAR/OAQPS) - Methods for
Displaying Temporal and Spatial Trends
. Daniel Carr (George Mason University) - Two Templates
for Visualizing Georeferenced Statistical Summaries
Listening To Our Information Customers (Co-Chairs: Brendan Room D
Doyle (EPA/CEIS) and Margaret Morgan-Hubbard (EPA/Office of
Communications))
Panel: Margaret Morgan-Hubbard, Director, EPA Office
of Communications, Brendan Doyle, (Acting) Director,
CEIS Customer Survey and Access Division,
Emma McNamara, (Acting) Director, EIMD, OIRM,
and Pat Bonner, EPA Customer Service
-------
Wednesday, May 12, 1999 (Continued)
Application of Sampling in Aquatic Resources (Chair: Henry
Kahn, EPA/OW/EAD)
. Henry Kahn and Silvestre Colon (EPA/OW/EAD) -
Composite Sampling Analysis of Contaminant Levels
in Fish
. Anthony R. Olsen (EPA/NHEERL) - National Fish Tissue
Contaminant Lake Survey: A New Spatially-Restricted
Survey Design
. Barnes Johnson (EPA/OSWER) - How to Survey Water
Designs
3:30-3:45 Break
Room B
3:45-5:15 PLENARY SESSION: Statistics and Information at EPA as we
Start a New Century: Where Are We Going? (Chair: Phil Ross,
EPA/CEIS)
. Larry Cox (EPA/ORD/NERL)
. Karen Klima (EPA/IW1)
. Heather Case (EPA/CEIS)
. G.P. Patil (Pennsylvania State University)
Dining Room
Thursday, May 13, 1999
8:30-10:30 TRAINING
Steven P. Millard (PSI) - Applying Monte Carlo Simulation
Techniques with S-PLUS
Dining Room
9:00-10:30 PRESENTATIONS (Note this time overlaps with above training)
The Data Come In, the Data Go Out Room D
Rick Westlund (EPA/OP) - Reducing Paperwork Burdens
at EPA
. Charlotte Cottrill (EPA/ORD) - EMPACT's Role in the
21st Century
-------
Thursday, May 13, 1999 (Continued)
10:30-10:45 Break
10:45-11:45 WRAP-UP SESSION (Chair: Barry Nussbaum, EPA/CEIS) Dining Room
William Raub, Deputy Assistant Secretary for Science
Policy, Department of Health and Human Services -
Perspectives on Data and Information from the
Department of Health and Human Services
11:45-12:00 Door Prize and Closing Remarks Dining Room
-------
ABSTRACTS
1999 EPA Conference on
Environmental Statistics and Information
4:00-6:00 Monday, May 10
CONCURRENT TRAINING SESSIONS
WORKSHOP ON MONTE CARLO METHODS IN ENVIRONMENTAL
STATISTICS
Woollcott Smith
Statistics Department, Temple University
Peter Petraitis
Biology Department, University of Pennsylvania
This workshop is divided into two parts:
1. Smith will present an overview of modern computer intensive Monte Carlo methods. The
review will include the statistical motivation as well as technical and philosophical
advantages and disadvantages in using these methods in administrative and legal settings.
We will briefly describe how these methods are used to attack hard statistical problems in
missing data imputation, measurement error and Bayesian analysis. Finally the details of
randomization and simulation methods will be illustrated using a basic paired comparison
design.
2. Petraitis will present a case study on the pros and cons of using randomization methods as an
alternative to analysis of variance and the analysis of covariance.
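As a hedged illustration of the randomization idea for a paired comparison design, the sketch below runs an exact sign-flip randomization test. It is not taken from the workshop materials: the function name, the choice of test statistic (absolute sum of within-pair differences), and the data values are all assumptions made for illustration.

```python
import itertools

def paired_randomization_test(first, second):
    """Exact two-sided sign-flip randomization test for paired observations.

    Under the null hypothesis of no treatment effect, each within-pair
    difference is equally likely to have had the opposite sign.
    """
    diffs = [b - a for a, b in zip(first, second)]
    observed = abs(sum(diffs))
    n_extreme = 0
    n_total = 0
    # Enumerate all 2**n sign assignments (feasible only for small n;
    # for larger n, sample sign assignments at random instead).
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total

# Hypothetical paired measurements (illustrative values only).
upstream = [10.1, 9.8, 11.2, 10.5, 9.9, 10.7]
downstream = [11.0, 10.4, 11.9, 11.1, 10.2, 11.5]
p_value = paired_randomization_test(upstream, downstream)
```

In this made-up data set all six differences share the same sign, so only the two extreme sign assignments match the observed statistic and the p-value is 2/64.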
-------
10:45-12:00 Tuesday, May 11, 1999
STATISTICS, INFORMATION, AND GPRA
(CHAIR: GEORGE BONINA, EPA)
ANALYTIC CHALLENGES AND THE GOVERNMENT PERFORMANCE
AND RESULTS ACT
Judith Calem Lieberman
OCFO, US Environmental Protection Agency
The Government Performance and Results Act (GPRA) of 1993 set into motion a spate of
activity in Agency strategic planning and accountability. In essence a legal constitution for good
management, the GPRA requires federal agencies to set goals, measure performance, and report
on the degree to which goals are met. It also places emphasis on attaining results rather than
tracking program activities. The Office of the Chief Financial Officer has been leading EPA's
effort to meet GPRA's statutory requirements, which include development of a 5-year strategic
plan, annual performance plans (and budgets), and annual performance reports. During the first
cycle of GPRA implementation, several analytical challenges have been revealed. The most
significant ones relate to identification of outcome goals, development of performance measures,
validation/verification of performance data, and comparison of performance with annual goals.
Working through these challenges will require a good understanding of the Agency's mission, a
little creativity and the analytical skills to understand the impact of program activities on
environmental results.
LOCAL APPLICATIONS OF EPA DATA (CHAIR: RON
SHAFER, EPA/CEIS)
THE BALTIMORE COMMUNITY ENVIRONMENTAL PARTNERSHIP:
LESSONS LEARNED
Henry Topper
US Environmental Protection Agency
In this case study, participants in the Baltimore Community Environmental Partnership will
describe their experiences and present lessons they have learned. The experience presented will
be based on a three-year project involving a Partnership among the residents, governments, and
businesses in south Baltimore and northern Anne Arundel County. This Partnership worked
together to begin addressing the long term environmental and economic concerns in four
neighborhoods in south Baltimore and northern Anne Arundel County. For many years, both
residents and businesses in this heavily industrialized section of the metropolitan Baltimore area
have expressed concerns about health and the environment in their neighborhoods. By working
-------
together in a Partnership, the community completed a comprehensive review of all aspects of its
environment and has begun work to implement a plan to make real improvements.
The Partnership has taken a holistic view of community problems and has developed efforts to
address a broad range of issues facing the community including health concerns, housing issues,
illegal dumping, subsistence fishing, park restoration and enhancement, community gardening,
economic development, air quality, and crime. Based on this holistic approach, the Partnership
has begun to develop an understanding of the complexity of the environmental stresses facing the
community and the need for a multifaceted approach to improving community health and
building a sustainable community. In the area of community health, Partnership committees are
now working to address the issues of indoor air, fish consumption, truck traffic, and industrial
toxic releases. As a part of this effort, the Air Committee of the Partnership completed a
comprehensive screening analysis of air releases from all the businesses and facilities in and
around the Partnership area. This analysis, based on exposure modeling, has given the
community information on the cumulative concentrations of toxics from all sources in each of
the four Partnership neighborhoods. The Air Committee has developed a protocol to compare
these modeled concentrations with established health effect values to determine areas for
pollution prevention. The committee has also developed a protocol and screened for potential
combined effects of multiple chemicals that have similar target organs, e.g. all the chemicals that
are respiratory tract irritants. As a result of the work of the Air Committee, the community now
has some key parts of the information it needs to monitor and improve the local environment.
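The screening logic described above (compare each modeled concentration with a health effect value, then combine chemicals that share a target organ) can be sketched as follows. This is only an assumed form of the protocol: the ratio-and-sum rule, the threshold of 1, and every name and number in the code are hypothetical and are not the Air Committee's actual method or data.

```python
from collections import defaultdict

# Hypothetical modeled concentrations and health reference values, both in
# the same units. Names and numbers are illustrative only.
chemicals = [
    # (name, target organ, modeled concentration, reference value)
    ("chemical A", "respiratory tract", 6.0, 10.0),
    ("chemical B", "respiratory tract", 3.0, 5.0),
    ("chemical C", "liver", 1.0, 4.0),
]

def screen(chemicals):
    """Return per-chemical ratios and per-organ sums of those ratios."""
    ratios = {}
    organ_totals = defaultdict(float)
    for name, organ, conc, ref in chemicals:
        ratio = conc / ref  # above 1 means the reference value is exceeded
        ratios[name] = ratio
        organ_totals[organ] += ratio
    return ratios, dict(organ_totals)

ratios, organ_totals = screen(chemicals)
# Assumed screening rule: flag an organ when the summed ratio exceeds 1.
flagged = sorted(organ for organ, total in organ_totals.items() if total > 1.0)
```

Here no single chemical exceeds its reference value, yet the two respiratory irritants together do, which is the kind of combined effect a chemical-by-chemical comparison would miss.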
THE DEPARTMENT OF ENVIRONMENTAL PROTECTION
COMPLIANCE REPORTING SYSTEM
Kimberly Nelson
Pennsylvania Department of Environmental Protection
The Pennsylvania Department of Environmental Protection (DEP) has made significant strides in
improving data management. PA DEP has successfully integrated data across more than 12 programs
to present a holistic view of the people and places it regulates. The data reside in the DEP
client/site database which is fully integrated with departmentwide application processing and
compliance reporting systems. The DEP compliance reporting system is one of the few systems
in the country that can track multi-media inspections, violations, penalties and enforcement
actions for a single facility and is the only system in the country that is on-line for citizens to
track compliance activities. The client/site system also is integrated with the department's new
Pennsylvania Facility Analysis System, a web based GIS application that went on-line for the
public in March. Currently, the department is focusing priority attention on an Environmental
Futures Team whose charge it is to develop a plan for measuring environmental outcomes.
-------
OPPT'S RISK-SCREENING ENVIRONMENTAL INDICATORS MODEL*
Bouwes, N. and Hassur, S.
Office of Pollution Prevention and Toxics, U.S. Environmental Protection Agency
S. Keane, E. Fechner Levy, B. Firlie, and R. Walkling
Abt Associates, Inc.
The Toxics Release Inventory (TRI) provides raw data on the quantities of chemicals released by
US manufacturing facilities, but these raw data alone do not provide information about the
relative toxicity or exposure potential of these releases. The Office of Pollution Prevention and
Toxics (OPPT) of the US EPA has created the Risk-Screening Environmental Indicators Model
to provide a risk-based perspective of these releases, in a PC-based model. The Indicators Model
integrates toxicity scores with a measure of exposure potential and the size of the potentially
exposed population to calculate individual Indicator Elements for each combination of facility,
chemical, and release media reported under TRI. Each year of reporting generates
approximately 250,000 of these Elements which are summed to provide overall Indicator Values.
The Indicator Elements can also be summed to create sub-Indicators that rank relative impacts by
medium, chemical, geographic area, industry sector or a combination of these and other
variables. This flexibility provides the analyst with the opportunity to examine trends year-to-
year, and to rank and prioritize chemicals, industries and regions for strategic planning, risk-
related targeting for enforcement and compliance purposes, and community-based environmental
protection. The model also permits the user to investigate the relative influence of toxicity,
exposure and population on the results.
*Work supported under EPA Contract Number 68-W6-0021, WA#3-02.
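To make the roll-up of Indicator Elements concrete, here is a minimal sketch. The abstract does not state how toxicity, exposure potential, and population are combined, so the simple product used below is an assumption, and the records, scores, and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical TRI-style records; all scores are invented for illustration.
# (facility, chemical, medium, toxicity score, exposure score, exposed population)
records = [
    ("Plant 1", "toluene", "air", 10, 2, 1000),
    ("Plant 1", "lead", "water", 100, 1, 500),
    ("Plant 2", "toluene", "air", 10, 3, 2000),
]

def indicator_element(toxicity, exposure, population):
    # Assumed combination rule: a simple product of the three components.
    return toxicity * exposure * population

def sub_indicators(records, field):
    """Sum Indicator Elements grouped by one field (0=facility, 1=chemical, 2=medium)."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec[field]] += indicator_element(*rec[3:])
    return dict(totals)

# Overall Indicator Value: the sum of every element.
overall = sum(indicator_element(*rec[3:]) for rec in records)
# Sub-Indicators by chemical, ranked from largest to smallest.
by_chemical = sub_indicators(records, 1)
ranking = sorted(by_chemical, key=by_chemical.get, reverse=True)
```

Grouping by a different field (facility or medium) reuses the same elements, which is what lets one set of element scores support several rankings.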
STATISTICAL METHODS FOR LAB AND AIR QUALITY
DATA ANALYSIS (CHAIR: LARRY COX, EPA/ORD/NERL)
STATISTICAL MODELING OF MULTIPLY CENSORED DATA
Mary Lou Thompson and Kerrie Nelson
The National Research Center for Statistics and the Environment, University of
Washington
Laboratory analyses in a variety of contexts may result in doubly left censored measurements,
i.e. amounts of contaminants of concern may be reported by the laboratory as "non-detects" or
"trace". The analysis of singly censored observations has received attention in the biostatistical
(e.g. in the context of survival analysis) and in the environmental literature. We consider
maximum likelihood and semi-parametric approaches to linear models in the doubly censored
setting.
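A minimal sketch of the maximum likelihood idea for doubly left-censored data follows, assuming normally distributed (for example, log-scale) measurements with a detection limit `dl` and a quantitation limit `ql`. The abstract concerns linear models; this sketch fits only a mean and standard deviation, and the function names, the grid search, and the data are all hypothetical.

```python
import math

def normal_cdf(z):
    # Standard normal CDF written with erfc so small tail values stay positive.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def censored_loglik(mu, sigma, quantified, n_trace, n_nondetect, dl, ql):
    """Log-likelihood for normal data with two left-censoring points.

    quantified  : values reported as numbers (at or above ql)
    n_trace     : count reported only as "trace" (between dl and ql)
    n_nondetect : count reported only as "non-detect" (below dl)
    """
    p_nondetect = normal_cdf((dl - mu) / sigma)
    p_trace = normal_cdf((ql - mu) / sigma) - p_nondetect
    loglik = sum(
        -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)
        for x in quantified
    )
    loglik += n_trace * math.log(p_trace)
    loglik += n_nondetect * math.log(p_nondetect)
    return loglik

def grid_mle(quantified, n_trace, n_nondetect, dl, ql):
    """Crude grid search for (mu, sigma); a real analysis would use an optimizer."""
    candidates = ((i / 20, j / 20) for i in range(0, 61) for j in range(4, 41))
    return max(candidates, key=lambda p: censored_loglik(
        p[0], p[1], quantified, n_trace, n_nondetect, dl, ql))

# Hypothetical log-scale concentrations (illustrative values only).
quantified = [1.2, 1.5, 0.9, 2.1, 1.7]
mu_hat, sigma_hat = grid_mle(quantified, n_trace=3, n_nondetect=2, dl=0.0, ql=0.8)
```

The key point is that censored observations contribute interval probabilities rather than densities, so "trace" and "non-detect" reports still inform the fit instead of being dropped or replaced by substituted values.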
-------
TREND ESTIMATION USING WAVELETS
Peter Craigmile
Department of Statistics, University of Washington
A common problem in the analysis of environmental time series is how to deal with a possible
trend component, which is usually thought of as large scale (or low frequency) variations or
patterns in the series that might be best modeled separately from the rest of the series. Trend is
often confounded with low frequency stochastic fluctuations, particularly in the case of models
such as fractionally differenced processes (FDPs), which can account for long memory
dependence (slowly decaying auto-correlation) and can be extended to encompass
non-stationary processes exhibiting quite significant low frequency components. In this talk we
assume a model of polynomial trend plus fractionally differenced noise and apply the discrete
wavelet transform (DWT) to separate a time series into pieces that can be used to estimate both
the FDP parameters and the trend. The estimation of the FDP parameters is based on an
approximate maximum likelihood approach that is made possible by the fact that the DWT
decorrelates FDPs approximately. Once the FDP parameters have been estimated, we can then
test for a non-zero trend. After outlining the work that we have done to date on testing for non-
zero trends, we demonstrate our methodology by applying it to an air quality time series.
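The way a discrete wavelet transform splits a series into smooth and detail pieces can be sketched with the Haar wavelet, the simplest case. This is an illustrative assumption, not the method of the talk: the abstract does not name a wavelet, Haar removes only constant levels from the detail coefficients (polynomial trends call for longer wavelet filters), and no FDP parameters are estimated here. Function names and data are hypothetical.

```python
import math

def haar_step(series):
    """One level of the Haar DWT for an even-length series.

    Returns (smooth, detail): scaled pairwise sums and differences.
    """
    if len(series) % 2:
        raise ValueError("series length must be even")
    root2 = math.sqrt(2.0)
    pairs = list(zip(series[0::2], series[1::2]))
    smooth = [(a + b) / root2 for a, b in pairs]
    detail = [(b - a) / root2 for a, b in pairs]
    return smooth, detail

def haar_dwt(series, levels):
    """Apply haar_step repeatedly to the smooth part; return (smooth, [details...])."""
    details = []
    smooth = list(series)
    for _ in range(levels):
        smooth, detail = haar_step(smooth)
        details.append(detail)
    return smooth, details

# Hypothetical series: a constant level plus an alternating fluctuation.
series = [5.0 + (0.5 if t % 2 else -0.5) for t in range(8)]
smooth, details = haar_dwt(series, levels=2)
```

In this toy series the rapid fluctuation is captured entirely by the first-level detail coefficients, while the constant level survives only in the smooth coefficients; the transform is orthonormal, so the total sum of squares is preserved across the pieces.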
METEOROLOGICAL ADJUSTMENT OF SURFACE OZONE FOR
TREND ANALYSIS: PICK AN ANSWER, ANY ANSWER
Joel H. Reynolds
NRCSE, Department of Statistics, University of Washington
A variety of statistical methods for meteorological adjustment of surface ozone have been
proposed in the literature over the last decade. As part of a larger review of the literature, we
summarize and compare six different methods applied to the analysis of surface ozone
observations in the Chicago region from the 1981-1991 period: nonlinear regression, regression
tree models, extreme events models, time-series filtering, nonlinear additive time-series models,
and canonical covariance analysis. Differences in the resulting trend analyses are discussed in
terms of differences in each analysis' spatial domain and choice of ozone statistic. The review
highlights the need for development of techniques for extreme value analysis of space-time
processes.
-------
1:15-2:30 Tuesday, May 11, 1999
DATABASE: THE MANAGER'S VIEW (PANEL SESSION)
Philip Lindenstruth, Michael A. Mundell, and Abraham Siegel
US Environmental Protection Agency
The Panel will present for discussion several issues involved in the administration of a national
database. These issues start with requirements for the database and address optional data
fields, data quality, data ownership, database management issues, and support for the system
during its life cycle. Those on the Panel would like their initial presentations to stimulate a
discussion of these issues with the attendees.
ENSURING THE QUALITY OF ENVIRONMENTAL
INFORMATION (CHAIR: NANCY WENTWORTH, EPA/ORD)
USING SIMSITE TO ILLUSTRATE SAMPLING TECHNIQUES
Malcolm J. Bertoni
Center for Environmental Measurements and Quality Assurance, Research Triangle
Institute
The Simulated Site Interactive Training Environment (SimSITE) is a computer-based training
support system that helps environmental scientists and engineers learn how to plan a field
investigation at a hazardous waste site. Through the use of a graphical user interface provided
by the ArcView geographic information system (GIS), training participants apply concepts such
as Data Quality Objectives (DQOs), Data Quality Indicators (DQIs), statistical sampling design,
and Data Quality Assessment (DQA). SimSITE contains statistical design and analysis tools and
sampling simulation routines that allow the participants to develop and implement sampling
plans that satisfy their DQOs. SimSITE then generates a data set (including sampling and
measurement errors), and allows the participants to make decisions about whether or not to clean
up areas of the artificial site, based on their statistical analysis of the data. At the end of the
simulation, the features of the underlying true contamination are revealed to illustrate the
phenomenon of decision errors. During this interactive presentation, the features and classroom
uses of SimSITE will be demonstrated.
-------
MODELS AND MODEL ASSESSMENT OF ENVIRONMENTAL
DATA (CHAIR: MARY LOU THOMPSON, UNIVERSITY OF
WASHINGTON)
DEVELOPMENT OF A LINKED PHARMACOKINETIC-
PHARMACODYNAMIC MODEL OF METHYLMERCURY-INDUCED
DEVELOPMENTAL NEUROTOXICITY
T.A. Lewandowski, S.M. Bartell, R.A. Ponce, C.H. Pierce, and E.M. Faustman
Department of Environmental Health, University of Washington
Methyl mercury (MeHg) has been shown to cause adverse developmental effects in human and
animal conceptuses exposed in utero. A toxicological model of the disposition and cellular
action of MeHg in the developing fetus can be used to estimate health outcomes for various
levels of exposure. Modeling can also incorporate differences in dose rate, chemical species, or
inter-species variability. A linked toxicokinetic and toxicodynamic model for MeHg has been
developed for the rat based on work performed in our laboratory. The toxicokinetic model
incorporates many of the changes in organ size and blood flow associated with gestation.
In the toxicodynamic model, changes in the population of committed fetal neural cells have been
estimated based on the observed effects of MeHg on rates of cellular death, proliferation and
differentiation in vitro. We are currently determining these rates in vivo using BrdU-Hoechst
flow cytometry. The toxicokinetic model demonstrates an adequate fit to experimental
toxicokinetic data. For example, 3 days after a dose of 1 mg/kg (given on day 16 of gestation),
the model predicts fetal brain and fetal blood levels within 10% of the values observed by
Wannag (1976). In terms of toxicodynamic effects, the model predicts 20% and 65% decreases
in the number of committed neural cells (on gestational day 15, relative to untreated baseline) at
fetal brain concentrations of 10 and 50 µmol/kg. It is anticipated that the existing model can be
extended to address other species (e.g., humans) and other developmental toxicants which act by
similar mechanisms (e.g., cell cycle disruption).
Sponsored by the following grants: USEPA R825358 and CR825173 and NIEHS T32ESO-7032.
BAYESIAN MODEL ASSESSMENT
Samantha Bates and A. E. Raftery
Department of Statistics, University of Washington,
Cullen, A.C.
Graduate School of Public Affairs, University of Washington
In this paper we discuss a Bayesian method of analysis which incorporates both prior knowledge
of the distributions of the inputs to a deterministic model and any available data on the model
inputs and outputs. This method uses Monte Carlo simulation from the prior distributions for the
inputs and resampling of these simulations with weights determined by the observed data under
the sample importance resampling scheme of Rubin. The method yields posterior distributions
for the output from which to find distributions for quantities of interest. The method also allows
the separation of the contributions of variability and uncertainty on the posterior distribution of
soil concentration.
We will present an application of this method to modeling poly-chlorinated biphenyl (PCB)
concentrations in various media at a Superfund site in New Bedford Harbor (NBH), MA.
Dredging during this clean-up of the Harbor exposes inhabitants of the surrounding region to
PCB contaminated air, soil and plants. A deterministic model for PCB concentration in soil was
developed by Cullen (1992). The Bayesian method is used to find distributions for the PCB
concentration in soil at this site. In addition we will contrast the results of this Bayesian method
with those of a traditional Monte Carlo approach and a trial-and-error approach.
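The resampling idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy model (output = 2 x input), the Uniform(0, 10) prior, the Gaussian error with standard deviation 0.5, and the single observed output of 8.0 are all hypothetical.

```python
import math
import random

def sir(prior_draws, model, observed, noise_sd, n_resample, rng):
    """Sampling importance resampling for a deterministic model.

    Each prior draw is weighted by a Gaussian likelihood of the observed
    output given the model output, then draws are resampled by weight.
    """
    weights = []
    for theta in prior_draws:
        resid = (observed - model(theta)) / noise_sd
        weights.append(math.exp(-0.5 * resid * resid))
    return rng.choices(prior_draws, weights=weights, k=n_resample)

rng = random.Random(1999)
# Hypothetical setup: one input with a Uniform(0, 10) prior, a toy
# deterministic model (output = 2 * input), and one observed output of 8.0.
prior = [rng.uniform(0.0, 10.0) for _ in range(5000)]
posterior = sir(prior, lambda t: 2.0 * t, observed=8.0, noise_sd=0.5,
                n_resample=2000, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
```

In this toy setup the resampled inputs concentrate near 4, the value whose model output matches the observation, whereas a traditional Monte Carlo run would simply propagate the prior draws without reweighting.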
PARETO OPTIMAL MULTI-CRITERIA MODEL ASSESSMENT
Marianne Turley, E. David Ford, and Joel Reynolds
University of Washington
Evolutionary computation (EC) is an optimization technique for finding Pareto optimal solutions
to multiple objective functions. It borrows ideas from evolutionary theory to direct the
optimization search through the parameter space. We applied this optimization to process models
to improve model assessment by requiring a solution, a model parameterization, to achieve
multiple criteria simultaneously. In this talk, I will discuss the algorithm, two alternative search
errors and some examples.
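The abstract does not give the details of the evolutionary algorithm, but the notion of Pareto optimality it relies on can be illustrated directly. The sketch below, with hypothetical error scores where smaller is better, keeps only the parameterizations that no other parameterization beats on every criterion at once.

```python
def dominates(a, b):
    """True if a is at least as good as b on every criterion and strictly
    better on at least one (smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical error scores (criterion 1, criterion 2) for five parameterizations.
scores = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(scores)
```

Here (3.0, 4.0) and (2.5, 3.5) are both dominated by (2.0, 3.0), so the front contains the remaining three points, each representing a different trade-off between the two criteria.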
2:45-4:00 Tuesday, May 11, 1999
USE OF THE INTERNET FOR SHARING STATISTICS (CHAIR:
STEVE HUFFORD, EPA)
ENVIROFACTS WAREHOUSE: ENVIRONMENTAL DATA ON THE
INTERNET - EMPOWERING THE CITIZEN TOWARD
ENVIRONMENTAL PROTECTION AND AWARENESS
Pat Garvey
US Environmental Protection Agency
Governments and the courts are acknowledging more and more that the Public has a right to
know what is being discharged and released to the environment. The U.S. Congress and the
Executive Branch have taken decisive action to ensure this public right to access of data and
information.
The U.S. EPA created the Envirofacts Warehouse to provide the public with direct access to the
vast amounts of information and data in its national program environmental data systems. The
Envirofacts Warehouse helps EPA fulfill its responsibility to make information available to the
public, as required by federal legislation and Executive Order.
Envirofacts is available on the Internet (www.epa.gov/enviro), allowing EPA to disseminate
information quickly and easily. Envirofacts Warehouse contains:
a relational database of the national databases on Superfund (abandoned hazardous waste)
sites, hazardous waste handlers, discharges to water, toxic releases, air releases, and drinking
water suppliers,
the relational database also contains the facility index system, the Envirofacts Master
Chemical Integrator, locational reference tables, and,
spatial data and demographic data from the other sources.
Internet applications are available and part of the Envirofacts Warehouse Internet site to provide
easily designed queries to the databases and to create maps and other reports.
The presentation shows the capabilities of, and reasons for, the Envirofacts Warehouse. It
demonstrates the features and principles behind the design of the Web site, the
database design and model, and the various application features and query options
available from the Web.
The presentation will demonstrate:
How On-line Queries and Results are useful to the concerned public, interested
organizations, governmental regulatory staff and to the Environmental Officer of a plant, facility
or company;
GIS Mapping capabilities and Outputs that are On-line and what the GIS capabilities will be in
the future;
Data refresh schedules and the importance of On-line Documentation; and
Customer Feedback procedures for data quality and user needs.
The presentation will address the US EPA directions and program initiatives in public access of
governmental data and community empowerment with environmental data.
NATIONAL BIOLOGICAL INFORMATION INFRASTRUCTURE
Anne Frondorf
U.S. Geological Survey
This presentation will provide a brief description/overview of the National Biological
Information Infrastructure (NBII) program, a collaborative effort to build a distributed, Internet-
based federation of biological science data, information and analytical tools. Examples of the
types of data and information available from the NBII and the types of different agencies and
organizations and partnerships involved in building the NBII will be provided.
Two key elements of the NBII "infrastructure" (i.e. the standards-related activities that help to
support and pull together this distributed data network) will be highlighted. These are the
development of a biological metadata content standard (and an accompanying biological
metadata clearinghouse network) and the continued development of the Integrated Taxonomic
Information System (ITIS) as a standard reference for biological nomenclature and taxonomy.
ITIS is a partnership among USGS, EPA, NOAA, USDA, and the Smithsonian Institution.
EPA'S ENVIRONMENTAL INFORMATION MANAGEMENT SYSTEM
Bob Shepanek
Office of Research and Development, US Environmental Protection Agency
Presented is an integrated vision for scientific information management approaches supporting
monitoring and assessment activities within the US EPA's Office of Research and Development
(ORD). This vision was developed based upon lessons-learned from the implementation of
several scientific information management systems and from development of the ORD's strategic
and implementation plans for scientific information management. The vision reflects that
effective management of scientific information must address technical, cultural and management
challenges. Technical challenges include management and integration of metadata, data, and the
modeling, analysis, and visualization tools used as part of assessment activities. Cultural
challenges relate mainly to the protection of intellectual capital produced by individual
investigators. Management issues include commitment of adequate resources for systems
development and operation, support for related policies and procedures, and appropriate
incentives for involvement by staff and project participants.
EPIDEMIOLOGY AND RISK ASSESSMENT: CUMULATIVE
AND/OR AGGREGATE RISK ASSESSMENT (CHAIR: RUTH
ALLEN, EPA)
FOOD QUALITY PROTECTION ACT AND ITS IMPLEMENTATION: AN
OVERVIEW OF STATISTICAL AND PROBABILISTIC ISSUES FACING
THE OFFICE OF PESTICIDE PROGRAMS
David Miller
US Environmental Protection Agency
With the passage of the Food Quality Protection Act, the Agency's Office of Pesticide Programs
is now required to aggregate risks from pesticides across exposure pathways and to accumulate
risks from pesticides across chemicals. As a result and in an attempt to develop better risk and
exposure estimates that consider the probabilities associated with simultaneous exposures, the
Office of Pesticide Programs is now using probabilistic (Monte Carlo) techniques in its risk and
exposure assessments. This has necessitated that OPP develop further refinements to its risk
assessment procedures. This presentation will provide an overview of FQPA and discuss its
major science impacts. It will review the traditional (deterministic) type methods used by OPP
in exposure and risk assessments as well as the probabilistic techniques now being used with
increasing frequency. Finally, it will review some of the statistical and policy issues which are
now being considered by the Office as it implements the probabilistic risk analysis framework
now in place.
FINDING A STATISTICAL DISTRIBUTION TO USE IN THE MONTE
CARLO EXPOSURE ASSESSMENT OF LIVESTOCK COMMODITIES
Hans D. Allender, Ph.D., P.E.
US Environmental Protection Agency
The presentation develops a methodology to find a frequency distribution of animals'
contamination because of the ingestion of pesticide-contaminated food. Given the percentage of
crops treated (%CT), the methodology calculates the distribution of animals that will be exposed.
Determination of the frequency distribution can be used later in connection with the application
of a Monte Carlo Analysis to the Exposure Assessment of humans to contaminated animal
products. The flexibility of the method allows the construction of frequency distributions to
multiple cases with different %CT. A non-agricultural example explains the process in a way
that everyone can relate to the calculations. The ubiquitous spreadsheet is used as the preferred
medium to obtain random numbers, recalculate probabilities, generate totals, and produce
graphics. A detailed explanation of how the spreadsheet is constructed ensures the audience the
possibility of duplicating the exercise. The simplicity of the methodology makes the process
easy to replicate and to extend to similar situations. It also allows the study of severe
contamination by pointing out the percentage of animals whose diet has been contaminated from
different sources. In summary, the article indicates a way of calculating a realistic statistical
distribution of animal contamination based on ingestion of contaminated food. Also, the
procedure can be extended to non-agricultural situations.
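The abstract does not spell out the calculation, so the following is only a sketch of one way such a frequency distribution could be simulated. It assumes, hypothetically, that each animal's diet draws on a fixed number of feed sources and that each source is treated independently with probability equal to the %CT; the function name and all numbers are illustrative.

```python
import random
from collections import Counter

def simulate_exposure(pct_treated, n_sources, n_animals, rng):
    """Frequency distribution of the number of contaminated feed sources
    per animal, assuming each source is treated independently."""
    p = pct_treated / 100.0
    counts = Counter()
    for _ in range(n_animals):
        k = sum(1 for _ in range(n_sources) if rng.random() < p)
        counts[k] += 1
    return {k: counts[k] / n_animals for k in range(n_sources + 1)}

rng = random.Random(42)
# Hypothetical case: 30 %CT, three feed sources per animal, 20,000 animals.
freq = simulate_exposure(pct_treated=30.0, n_sources=3, n_animals=20000, rng=rng)
```

Under these assumptions the fraction of animals with no contaminated source should be close to 0.7 cubed (about 0.34), and the upper entries of `freq` give the share of animals whose diet is contaminated from several sources at once.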
APPLYING EPIDEMIOLOGY TO STUDY THE PREVENTION OF
MAJOR CHEMICAL ACCIDENTS
Breeda Reilly
Chemical Emergency Preparedness and Prevention Office, US Environmental Protection
Agency
Mandated by the Clean Air Act Amendments of 1990, accident histories from some 69,000
chemical facilities in the United States will become available in the fall of 1999. This
presentation describes the challenges of using the tools of epidemiology with this data to
investigate drivers of severity and frequency of accidents. This study was proposed by the Center
for Risk Management and Decision Processes at the Wharton School and is a major focus of an
EPA cooperative agreement. The Major Accident Epidemiology Project aims to contribute to
the process of determining which plants are most likely to incur major events, by ascertaining
whether certain predictors (characteristics of manufacturing plants or of the companies that own
them) are associated with increased probability of a major event. This knowledge can be helpful
in two ways: (1) plants with such risk factors can be monitored more closely (by the companies
themselves as well as by regulators and other stakeholders); and (2) these associations may
provide clues about characteristics of companies' organizational systems that act as underlying
causes of major events.
-------
STATISTICAL RESEARCH ISSUES IN QUALITY ASSURANCE
(CHAIR: JOHN WARREN, EPA/ORD)
INTEGRATING DATA QUALITY INDICATORS (DQIS) INTO DATA
QUALITY OBJECTIVES (DQOS)
John Warren
Quality Assurance Division, Office of Research and Development, US Environmental
Protection Agency
EPA Order 5360.1 CHG 1 (July 1998) requires all EPA organizations to use a systematic
planning process to develop acceptance or performance criteria for the collection, evaluation, or
use of environmental data. Systematic planning identifies the expected outcome of the project,
the technical goals, the cost and schedule, and the acceptance criteria for the final result. The
Data Quality Objectives (DQO) Process is the Agency's recommended planning process when
data are being used to select between two opposing conditions, such as decision-making or
determining compliance with a standard. The outputs of this planning process (the data quality
objectives themselves) define the performance criteria. The DQO Process is a seven-step
planning approach based on the scientific method that is used to prepare for data collection
activities such as environmental monitoring efforts and research. It provides the criteria that a
data collection design should satisfy: where to collect samples, tolerable decision error rates, and
the number of samples to collect.
Data Quality Indicators (DQIs) are the individual performance characteristics specified in the
mandatory Quality Assurance Project Plan (QAPP) that accompanies any environmental data
collection. Typical DQIs include precision, completeness, comparability, and sensitivity. This
discussion centers on how the Agency can effectively make the link between DQOs and DQIs.
A PERFORMANCE EVALUATION OF THE METHOD DETECTION
LIMIT
Charles White
US Environmental Protection Agency
Performance criteria specified in the original (1981) publication are evaluated using EPA data.
Data available for preliminary evaluation include over thirty combinations of pollutant by
chemical analytical technique.
-------
5:15-8:00 Tuesday, May 11, 1999
POSTER AND COMPUTER SESSIONS
INFORMATION VISUALIZATION - TURNING DATA INTO
INFORMATION YOU CAN EASILY UNDERSTAND
Stuart H. Kerzner
US Environmental Protection Agency, Region III
The poster shows "EnviroSnax", which are graphics showing tidbits of environmental
information in ways that are easy to understand and highlight past or future environmental
impacts on the Region. They are used for management briefings, public use, press releases and
presentations.
WHAT THE CEIS NATIONAL TELEPHONE SURVEY WILL BE ABLE
TO TELL EPA'S INFORMATION PROVIDERS
Heather Case
EPA Customer Service, US Environmental Protection Agency
This presentation will describe the potential uses of the results from a national telephone survey
recently completed by the CEIS. The national telephone survey, which began in February 1999,
was designed to:
identify and describe environmental information customers within the U.S. population;
identify the public's high interest environmental topics; and
determine the public's access preferences for obtaining and using information.
The survey results will be used to guide CEIS information product and service development.
The survey results will be available for peer review in mid-August 1999.
This presentation will highlight potential uses by information providers in the Programs and
Regions.
METHODS TO MINIMIZE HUMAN ERROR IN REPORTING ANALYSIS
RESULTS
Susannah Dillman
US Environmental Protection Agency
Using "Paste Special", multiple graphs and tables in Excel can be linked to a report in
WordPerfect 8 and updated all at once.
-------
USING PERL SCRIPTS TO IMPORT DATA INTO GIS: AN EXAMPLE
USING USGS GROUND WATER SITE INVENTORY DATA
John S. Graves
US Environmental Protection Agency, Region III
One of the primary tools in EPA Region III for evaluating environmental data is the Geographic
Information System or GIS. A difficulty in using a GIS is that environmental data is not always
readily available in a GIS format. The Perl computer language was used to translate U.S.
Geological Survey ground water data into a format that could then be imported into a GIS.
This poster presents relevant portions of the Perl script used with explanations of the data
processing steps undertaken as well as examples of GIS generated plots from the resulting data
in EPA Region III.
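The poster's Perl script is not reproduced in this abstract. As a rough sketch of the kind of translation involved, written in Python rather than Perl, the example below assumes hypothetical tab-delimited records holding a site identifier and coordinates packed as degrees-minutes-seconds (DDMMSS and DDDMMSS), and writes comma-separated output with decimal degrees, a form that GIS packages commonly import. The field layout and function names are assumptions, not the actual USGS file format.

```python
import csv
import io

def dms_to_decimal(packed):
    """Convert a packed degrees-minutes-seconds string (DDMMSS or DDDMMSS)
    to decimal degrees."""
    seconds = int(packed[-2:])
    minutes = int(packed[-4:-2])
    degrees = int(packed[:-4])
    return degrees + minutes / 60.0 + seconds / 3600.0

def to_gis_csv(tab_text):
    """Translate tab-delimited site records into CSV with decimal degrees.
    Longitudes are assumed to be west of Greenwich and are made negative."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["site_id", "latitude", "longitude"])
    for row in csv.reader(io.StringIO(tab_text), delimiter="\t"):
        site_id, lat, lon = row
        writer.writerow([site_id, round(dms_to_decimal(lat), 6),
                         round(-dms_to_decimal(lon), 6)])
    return out.getvalue()

# Hypothetical input records (site id, latitude, longitude).
sample = "A001\t393000\t0753000\nA002\t401530\t0770045\n"
result = to_gis_csv(sample)
```

The first hypothetical site, at 39 degrees 30 minutes north and 75 degrees 30 minutes west, becomes latitude 39.5 and longitude -75.5 in the output.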
DEMONSTRATION OF ESS, S-PLUS, AND TRELLIS GRAPHICS
Richard M. Heiberger
Department of Statistics, Temple University
ESS [Emacs Speaks Statistics] is a GNU Emacs interface for interactive statistical programming
and data analysis. Languages supported include S-Plus, XLispStat, and SAS. ESS provides a
standard interface between statistical programs and statistical processes and has as one of its
goals an increase in efficiency for statistical programming and data analysis, over the usual tools.
ESS displays source code in these languages with syntactic indentation and highlighting of
source code. ESS interacts "directly" with the statistical package. ESS allows intelligent
interaction with the transcript of previous interactive sessions.
Trellis is a graphical display system that uses multiple panels to simultaneously view
relationships between different variables in your multivariate dataset through conditioning.
Trellis was developed at Bell Labs as part of S-Plus.
We will have a live demonstration of ESS, S-Plus, and trellis graphics. I will analyze and display
several examples of continuous and discrete multivariate and time series data sets.
CD TOXIC RELEASE INVENTORY (TRI) DATA EXPLORER
William P. Smith
Center for Environmental Information and Statistics, US Environmental Protection
Agency
The TRI Data Explorer is a web product designed to provide the user quick and easy queries to
EPA's TRI Chemical release data for years 1988-1997. The Explorer's portal to TRI chemical
release data is through multiple data views which provide detailed and comprehensive chemical
reports at all geographic levels down to the facility level by year or across years. In addition for
each chemical the explorer provides interesting information such as factoids and information on
the top 100 releasing facilities and counties.
The TRI Explorer will help our customers find information on topics such as: the chemicals
released in their county during the year; the facilities that are releasing these chemicals in the
county, state or the nation; the top chemicals released in their county, the state, or the nation;
and, the top 100 ranking facilities and counties in the nation that release a given chemical, or all
chemicals. And much more.
The application runs on the web at http://athena.was.epa.gov:2002/~wsmith/tri2/explorer.htm
or on CD for running off-line without the Internet. The CD application will be demonstrated.
8:00-9:00 Wednesday, May 12
DATA INTEGRATION AND QUALITY: VISION FOR THE
FUTURE (CHAIR: RUTH ALLEN, EPA)
ATLAS OF CANCER MORTALITY IN THE UNITED STATES, 1950-94
Susan Devesa, D. Grauman, W. Blot, G. Pennello, R. Hoover, and J. Fraumeni
Division of Cancer Epidemiology and Genetics, National Cancer Institute
The geographic patterns of cancer around the world and within countries have provided
important clues to the environmental determinants of cancer. In the mid-1970s the NCI prepared
county-based maps of cancer mortality in the U.S. that identified distinctive variations and hot-
spots for specific tumors, thus prompting a series of analytic studies of cancer in high-risk areas
of the country. We have prepared an updated atlas of cancer mortality in the United States during
1950-94, based on mortality data from the National Center for Health Statistics and population
estimates from the Census Bureau. Rates per 100,000 person-years, directly standardized using
the 1970 US population, were calculated by race (whites, blacks) and gender for 40 forms of
cancer. The new atlas includes more than 140 computerized color-coded maps showing
variation in rates during 1970-94 at the county (more than 3000 counties) or State Economic
Area (more than 500 units) level. Summary tables and figures are also presented. Selected
maps for the 1950-69 period are also included. Accompanying text describes the observed
variations and suggests explanations based in part on the findings of analytic studies stimulated
by the previous atlases. The geographic patterns of cancer displayed in this atlas should help to
target further research into the causes and control of cancer.
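For readers unfamiliar with direct standardization, the calculation behind rates of this kind can be sketched as follows: each age-specific rate is weighted by the standard population's share of that age group, and the weighted rates are summed. The three age groups and all counts below are hypothetical and are not data from the atlas.

```python
def directly_standardized_rate(deaths, person_years, standard_pop, per=100_000):
    """Weight each age-specific rate by the standard population's share
    of that age group and sum the weighted rates."""
    total_std = sum(standard_pop)
    rate = 0.0
    for d, py, w in zip(deaths, person_years, standard_pop):
        rate += (w / total_std) * (d / py)
    return rate * per

# Hypothetical three-age-group example (not data from the atlas).
deaths = [10, 50, 200]
person_years = [100_000, 100_000, 50_000]
standard_pop = [500, 300, 200]  # standard population counts by age group
rate = directly_standardized_rate(deaths, person_years, standard_pop)
```

Using one fixed standard population for every county and period means that differences in the resulting rates reflect differences in age-specific mortality rather than differences in age structure.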
-------
ANALYSIS OF CLEANUPS (CHAIR: MIKE MESSNER,
EPA/OGWDW)
CRYPTOSPORIDIUM OCCURRENCE IN THE NATION'S DRINKING
WATER SOURCES
Michael J Messner, Ph.D.
US Environmental Protection Agency
Cryptosporidium is a microbial pathogen which occurs in most of the nation's surface waters.
Information on cryptosporidium occurrence will be used in estimating the costs and benefits of
future drinking water regulations. A recently completed survey generated monthly estimates of
cryptosporidium concentrations in the source waters of over 400 large drinking water utilities.
With only two months of validated data in hand, it appears that 80 to 90 percent of the water
volumes analyzed yielded zero oocysts. On its face, this sparsity of nonzero results appears to
severely limit the data's usefulness. In this presentation, a Bayesian approach is outlined for
estimating hierarchical model parameters and their uncertainties. Time permitting, the approach
will be illustrated using a small simulated data set.
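The hierarchical model itself is not described in this abstract. As a much simpler, non-hierarchical illustration of how a Bayesian analysis can still extract information from mostly zero counts, the sketch below assumes that each count is Poisson with mean equal to concentration times volume analyzed and places a conjugate gamma prior on the concentration of a single source. The prior parameters and the data are hypothetical.

```python
def gamma_poisson_update(prior_shape, prior_rate, counts, volumes):
    """Posterior (shape, rate) for a concentration c when each count is
    Poisson with mean c * volume and c has a Gamma(shape, rate) prior."""
    return prior_shape + sum(counts), prior_rate + sum(volumes)

# Hypothetical monthly samples from one source: oocysts counted and litres analyzed.
counts = [0, 0, 0, 1, 0, 0, 0, 0, 0, 2]
volumes = [10.0] * 10
shape, rate = gamma_poisson_update(0.5, 1.0, counts, volumes)
posterior_mean = shape / rate  # oocysts per litre
```

Every zero count still adds its volume to the posterior rate parameter, so a long run of zeros tightens the upper bound on the concentration instead of being uninformative.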
SAMPLING AND DESIGN ISSUES IN ENVIRONMENTAL
STUDIES (CHAIR: TONY OLSEN, EPA/NHEERL)
SAMPLE DESIGNS FOR ENVIRONMENTAL DATA COLLECTION:
RANKED SET SAMPLING AND COMPOSITE SAMPLING
David Marker
Westat
Historically environmental statistics and survey sampling have had
relatively limited interaction. Most environmental studies use pre-existing data collection
locations, collect from known hot spots, and/or purposively select data collection locations.
Efficient survey sampling that can support the evaluation of a wide range of hypotheses has been
used to a lesser degree with environmental data than in health, education, or many other types of
data.
This talk will describe two NRCSE funded research activities that try to bridge this gap between
survey sampling and environmental statistics.
Ranked set sampling (RSS) is a method to potentially increase precision and reduce costs by
using "rough but cheap" information to obtain a more representative sample before the real, more
expensive sampling is done. We have explored under what conditions RSS becomes cost-
effective for ecological and environmental field studies where the "rough but cheap"
measurement has a cost.
We are continuing to explore when alternative forms of two-phase sampling are preferable to
RSS.
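The precision gain from ranked set sampling can be checked with a small simulation. The sketch below is a simplified illustration under stated assumptions: ranking is perfect and free (the abstract is concerned with the case where the cheap measurement has a cost), the population is standard normal, and the set size, number of cycles, and replication count are arbitrary choices.

```python
import random
import statistics

def srs_mean(population_draw, n, rng):
    """Mean of a simple random sample of n measured units."""
    return statistics.fmean([population_draw(rng) for _ in range(n)])

def rss_mean(population_draw, set_size, cycles, rng):
    """Mean of a ranked set sample with perfect ranking.

    In each cycle, for each rank r, draw set_size units, rank them,
    and measure only the unit holding rank r.
    """
    measured = []
    for _ in range(cycles):
        for r in range(set_size):
            ranked = sorted(population_draw(rng) for _ in range(set_size))
            measured.append(ranked[r])
    return statistics.fmean(measured)

rng = random.Random(7)
draw = lambda g: g.gauss(0.0, 1.0)  # hypothetical population
reps = 2000
# Both designs measure nine units per replication.
srs_var = statistics.pvariance([srs_mean(draw, 9, rng) for _ in range(reps)])
rss_var = statistics.pvariance([rss_mean(draw, 3, 3, rng) for _ in range(reps)])
```

With the same number of expensive measurements, the RSS mean shows a smaller variance across replications than the simple random sample mean; whether that gain pays for the ranking effort is the cost question the talk addresses.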
Composite sampling has been proposed in environmental settings where the costs of
measurement are high. It is hoped that by compositing data collected from multiple locations the
cost savings will outweigh the loss of information on the individual locations. Unfortunately it is
not clear how often this trade-off is successful. NRCSE has funded the collection of side-by-side
individual and composite samples so that this trade-off can be explored with real data from a
national survey of over 800 houses. The data collection protocol and types of planned analyses
will be discussed for this ongoing activity.
9:15-10:45 Wednesday, May 12
SOME ANALYSES AND POTENTIAL ANALYSES AT EPA
(CHAIR: DOREEN STERLING, EPA/CEIS)
INTEGRATING DATA FOR PLANNING AND TARGETING
Michael Barrette
US Environmental Protection Agency
For each major regulatory program implemented by EPA, the program office has designed
databases to house the information critical to the program's needs. In a changing world, data
users are now interested in looking at environmental information holistically, which means that
databases must relate to each other.
To plan its enforcement and compliance activities, EPA makes use of integrated data within the
Integrated Data for Enforcement Analysis (IDEA) system. This system provides access to more
than 15 databases maintained by EPA and other government agencies. When trying to compare
across databases, of course many discrepancies and data errors are found. In this presentation
several topics related to data quality and integration will be examined:
What is the critical step needed in order to integrate information across databases at the
facility level? Discussion will focus on EPA's data integration strategies.
What are key methods that have used existing data to find high-priority sector and
geographic issues? Discussion will focus on recent efforts to identify priority areas and
sectors for inspection targeting.
How can data integration be used to find violators? Discussion will focus on some concrete
examples showing how comparison of databases can lead to facilities that are improperly
regulated.
-------
THE MID ATLANTIC INTEGRATED ASSESSMENT PROGRAM (MAIA)
Tom DeMoss
Environmental Services, U.S. Environmental Protection Agency, Region III
Tom Pheiffer
Atlantic Ecology Division, NHEERL, U.S. Environmental Protection Agency
The MAIA program is an integrated environmental assessment program being conducted by
USEPA Region III and US EPA's Office of Research and Development, in partnership with other
Federal and State Agencies.
Objectives of the MAIA program are to build partnerships and get all stakeholders involved in
helping to (1) identify questions needed for assessing major ecological resource areas, such as
ground water, surface water, forests, estuaries, wetlands, and landscapes; (2) characterize the
health of each resource area, based upon exposure and effect information; (3) identify possible
associations with stressors, including landscape attributes, that may explain impaired conditions
for both specific resources and the overall ecosystem; (4) target geographic areas and critical
resources for protection and restoration, and (5) monitor environmental management progress.
Our experience with partners uncovered certain key principles of effective watershed
management. They were (1) agreement on geologic boundaries and/or units of assessment; (2)
an assessment of the biological condition of resources; (3) management targeted to real
impairment based upon the biological assessments, including TMDL, nutrients and habitat
restoration; (4) a watershed approach that is holistic or segment by segment, based upon the
nature of the problem; (5) a five-year rotation of monitoring and assessments to allow time for
change of environment and for progress from management action; (6) stakeholder buy-in so that
assessment and monitoring plans use all available resources and innovative options; and (7)
success defined as more cost-effective monitoring and management fixes.
Successful State partnering involves early buy-in well before products are developed. MAIA's
emphasis on aquatic biology and habitat is a departure from the water quality standards/TMDL
mentality and requires open dialogue with state biologists who must educate their managers on
the importance of habitat preservation and restoration as the new wave of management of their
aquatic resources.
STRATEGY TO ADDRESS EVOLVING ENVIRONMENTAL
INFORMATION NEEDS
John Moses
Center for Environmental Information and Statistics (CEIS), US Environmental Protection
Agency
While primarily a regulatory agency, the U.S. Environmental Protection Agency is devoting an
increasing amount of its resources to responding to public requests for information about
environmental quality, pollution sources, and human health and ecosystem concerns.
Additionally, the Agency must report annually to Congress on its progress in protecting human
health and safeguarding the natural environment, as required under the Government Performance
and Results Act (GPRA). Yet, in many cases, the data EPA needs to respond to public questions
and to report on its progress are not readily available. The Evolving Information Needs Strategy
addresses the gaps between the data the Agency needs and the data it currently has.
Working with EPA Regional and Program Offices and external stakeholders, CEIS developed a
two-phase strategy to identify and address some of the Agency's key environmental information
gaps. Phase I is a general screening analysis for identifying major gaps in 26 key environmental
problem areas and for setting priorities among these problem areas. Phase II is a methodology
for performing a more in-depth analysis of and recommendations to address the gaps associated
with each environmental problem area. This paper reports on the Phase I screening analysis,
conducted from June through April 1999.
MEASUREMENT ISSUES RELATED TO OUR WATER SUPPLY
(CHAIR: BARNES JOHNSON, EPA/OSWER/OSW)
FORAYS INTO THE UNFORGIVING: OCCURRENCE ESTIMATION IN
THE REALM OF DATA WITH MULTIPLE CENSORING POINTS
Andrew Schulman, Jennifer Wu, and Ben Smith
US Environmental Protection Agency
Under the Safe Drinking Water Act, the Agency is charged with establishing standards for
allowable levels of contaminants in the Nation's public water systems. Central to the selection
of the regulatory level is the determination of the relative benefits and costs likely to be
achieved. Benefits and costs are directly proportional to the level of current occurrence.
Consequently, sound decision making requires that the best possible estimate of occurrence be
used.
In developing a new regulation for arsenic, the Agency has data from over twenty States
covering a time span of up to twenty years. Because the current regulation is at a much higher
concentration than new options under investigation, however, many State data sets are heavily
censored by detection limits within the range of required estimation. This paper will discuss the
data and the approaches EPA is considering for the assimilation of the data into national and
intra-system occurrence estimation.
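As a rough illustration of the censoring problem described in this abstract, the sketch below fits a lognormal distribution by maximum likelihood to data that are left-censored at more than one detection limit. The lognormal model, the function names, and the crude pattern-search optimizer are all assumptions made for illustration; the abstract does not say which approaches EPA was considering.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def neg_log_lik(mu, sigma, detects, limits):
    """Negative log-likelihood for a lognormal sample (hypothetical helper).

    detects: measured concentrations at or above their detection limit
    limits:  one detection limit per non-detect (left-censored value)
    Terms that do not depend on mu or sigma are dropped.
    """
    total = 0.0
    for x in detects:
        z = (math.log(x) - mu) / sigma
        total += math.log(sigma) + 0.5 * z * z
    for dl in limits:
        z = (math.log(dl) - mu) / sigma
        # A non-detect only tells us the value lies below its limit.
        total -= math.log(max(norm_cdf(z), 1e-300))
    return total


def fit_censored_lognormal(detects, limits, rounds=400):
    """Crude pattern search for the (mu, sigma) minimising neg_log_lik."""
    logs = [math.log(x) for x in detects]
    mu = sum(logs) / len(logs)
    sigma = max(math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs)), 0.1)
    best = neg_log_lik(mu, sigma, detects, limits)
    step = 1.0
    for _ in range(rounds):
        improved = False
        for d_mu, d_sigma in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            new_mu, new_sigma = mu + d_mu, sigma + d_sigma
            if new_sigma <= 0.0:
                continue
            value = neg_log_lik(new_mu, new_sigma, detects, limits)
            if value < best:
                mu, sigma, best, improved = new_mu, new_sigma, value, True
        if not improved:
            step /= 2.0  # shrink the search pattern once no move helps
    return mu, sigma
```

Each non-detect enters the likelihood with its own detection limit, which is what lets data sets with different limits be pooled. A production analysis would normally use a proper numerical optimizer and check the distributional assumption.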
ESTIMATED WATER CONSUMPTION IN THE U.S. BASED ON THE
CSFII
Henry D. Kahn, Helen L. Jacobs, and Kathleen A. Stralka
US Environmental Protection Agency
Knowledge of drinking water intake is fundamental to the mission of the Office of Water and an
important component of a number of programs at EPA. This presentation provides a summary of
our recent efforts to generate up-to-date estimates of water intake by the population of the United
States. To obtain current estimated water consumption distributions, we have analyzed the
United States Department of Agriculture's (USDA's) Combined 1994-96 Continuing Survey of
Food Intake by Individuals (CSFII) data set. Per capita water intake is estimated for three
sources of water: municipal/tap, bottled, and other sources of water (i.e., private well, private
cistern, or private or public well). For each source of water, distributions are generated for direct
and indirect water consumption. The distributions by age, gender, race, socioeconomic status,
and geographical region and separately for pregnant and lactating women are also estimated.
Survey design and statistical methodology are discussed. We anticipate that the water
consumption distributions will be used in a wide range of applications including: rules limiting
amounts of microbes; disinfectant by-products (DBP) rules; radon and other drinking water
contaminant rules; protection of sensitive populations and other exposure assessments.
DEVELOPMENT OF A NEURAL NETWORK TOOL FOR EVALUATION
OF WASTE MANAGEMENT UNIT DESIGNS
Virginia Cohen-Bradley
Economics, Methods, and Risk Analysis Division, Office of Solid Waste, US Environmental
Protection Agency
Samuel Figuli, Julia Lewis, and Katrin Arnold
HydroGeoLogic, Inc.
The Office of Solid Waste recently completed a neural network software tool designed for
evaluating leachate concentrations in four different waste management units, with three different
liner types. The purpose of the tool is to help non-hazardous industrial waste facilities determine
the concentration for the constituent of concern that can be disposed of safely in a specific waste
management unit design. The neural network software, EPA's Industrial Waste Management
Evaluation Model (IWEM), is based upon EPA's ground-water fate-and-transport model,
EPACMTP. EPACMTP was designed for national-level risk assessments. It is run in Monte
Carlo mode, using hydrologic data representative of the United States. Seven parameters judged
to be the most significant in EPACMTP were used to build four different neural network tools,
one for each of the waste management units: landfill, surface impoundment, waste piles, and
land application units.
IWEM has a multi-layer perceptron architecture and was trained in back-propagation mode from
target output generated by the Monte Carlo-style analyses with EPACMTP. Several different
approaches to producing training- and test-data sets were used. In general, the comparison
between the neural network and the EPACMTP results is good. The accuracy of the neural
networks varies with the location of the EPACMTP response surface that is being simulated.
ASSESSING RISK (CHAIR: ELIZABETH MARGOSCHES,
EPA/OPPTS)
PROPOSED EPA METHODOLOGY FOR ASSESSING RISKS FROM
INDOOR RADON
David Pawel, Ph.D.
US Environmental Protection Agency
Radon has been determined to be the second leading cause of lung cancer after cigarette smoking
(NAS 1998). Based on methodology published by the National Academy of Sciences (NAS) in
its BEIR IV report (NAS 1988) and in its "Comparative Dosimetry" report (NAS 1991), EPA
has previously estimated that 13,600 lung cancer deaths in the U.S. each year are radon related
(EPA 1992). Subsequently, the Agency sponsored a study by the NAS, which reviewed the large
body of evidence about radon that has become available since their earlier reports. The new
NAS study, BEIR VI (NAS 1998), confirmed that radon is a serious public health problem, and
provided new estimates of radon risk and of radon-attributable lung cancer deaths, which were
somewhat higher than EPA had projected previously, particularly for never smokers. The BEIR
VI committee concluded, moreover, that about one-third of these cases are preventable if all
homes above 4 pCi/L are remediated.
We will discuss proposed revisions to EPA's methodology for calculating radon-related risk
estimates in light of BEIR VI and the Agency's own previous analysis. These include estimates
of attributable risk and risk per working level month (WLM). Attributable risk is the proportion
of lung cancer deaths attributable to radon. Risk per WLM is the number of expected radon-
induced cancer deaths for the current population divided by the corresponding total of past and
future exposures. We will describe life table methods for calculating these quantities, and show
how changes in smoking patterns might impact these estimates of risk. It is anticipated that this
methodology would be used by EPA in a number of contexts, including: (1) updating its public
information aimed at reducing residential radon exposures; (2) its assessment of risk from radon
in drinking water; and (3) its assessment of risks associated with radium contaminated sites.
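The two quantities defined in this abstract can be illustrated with a very simplified life table calculation. The sketch below is only that: the function name, the constant exposure rate, the way expected exposure is accumulated, and all input rates are hypothetical assumptions, and it is not the methodology proposed by EPA or by the BEIR VI committee.

```python
import math


def life_table_risks(lung_rates, all_cause_rates, excess_rel_risk, wlm_per_year):
    """Follow a cohort through a life table, one year of age at a time.

    lung_rates[i]      baseline annual lung cancer death rate at age i
    all_cause_rates[i] baseline annual death rate from all causes at age i
    excess_rel_risk[i] assumed radon excess relative risk at age i
    wlm_per_year       assumed constant exposure rate (WLM per year)
    All inputs are hypothetical placeholders supplied by the caller.
    """
    alive = 1.0         # probability of being alive at the start of the year
    lung_deaths = 0.0   # lifetime probability of lung cancer death
    radon_deaths = 0.0  # part of lung_deaths due to the radon excess
    exposure = 0.0      # expected lifetime exposure in WLM
    for h, h_all, err in zip(lung_rates, all_cause_rates, excess_rel_risk):
        excess = h * err            # radon-induced lung cancer death rate
        total = h_all + excess      # all-cause rate including the excess
        p_die = 1.0 - math.exp(-total)
        if total > 0.0:
            lung_deaths += alive * p_die * (h + excess) / total
            radon_deaths += alive * p_die * excess / total
        exposure += alive * wlm_per_year
        alive *= 1.0 - p_die
    attributable_risk = radon_deaths / lung_deaths if lung_deaths else 0.0
    risk_per_wlm = radon_deaths / exposure if exposure else 0.0
    return attributable_risk, risk_per_wlm
```

Re-running the function with baseline lung cancer rates that reflect a different smoking pattern gives a rough sense of how both estimates respond to such changes.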
HEALTH DATA: HOW DO WE USE IT TO PROTECT THE
PUBLIC/ENVIRONMENT?
Elizabeth H. Margosches, Ph.D., Jennifer Seed, Ph.D., and Khoan T. Dinh, Ph.D.,
US Environmental Protection Agency
This talk will describe the types of data typically available for analysis by the EPA's Office of
Pollution Prevention and Toxics, and how they are used. These data are submitted under various
statutes or gathered from the open literature and are used to help decide to what degree the public
or the environment may be at risk of incurring adverse effects if certain exposures occur. The
decisions include such considerations as whether the available studies are experimental or
observed in situ and how inferences may be made from various animal studies to the wild or to
humans as well as inferences from one effect to another. Sampling and data collection issues,
missing data, and data modeling are all critical statistical aspects of this activity. An example
will be given that focuses on generalizing inferences from a dose-response model.
2:00-3:30 Wednesday, May 12
THE VISUAL PRESENTATION OF DATA (CHAIR: AL
MORRIS, REGION III, EPA)
ENVIROVIZ: TURNING NUMBERS INTO VISUAL RELATIONSHIPS
Alvin R. Morris
Director, Office of Environmental Data, US Environmental Protection Agency, Region III
We're drowning in data: ever hear that plaintive wail? While we may not be drowning, we are
faced with under-utilizing data. Another, more recent challenge facing us in the spring of next
year is to prove to the Congress, the public and others that we are using the funds they provide
to actually improve the environment: how much, where and for what price.
Data visualization can help solve both those challenges. This presentation will be the first
presentation of outputs of a prototype program we named EnviroViz, a program that
dynamically links air and water ambient and major point sources to:
where they are located: in the Region, state, county, and watershed
the 6-year trend for each of 7 air and 46 water parameters (stressors)
the GPRA goals to the sub-objective level
for each sub-objective, the associated FTE, contract $, state and tribal grants
It's a new approach to more easily understanding the meanings embedded in environmental data
and can be applied in many areas. Please come see and comment.
METHODS FOR DISPLAYING TEMPORAL AND SPATIAL TRENDS
David Mintz
Air Quality Trends Analysis Group, US Environmental Protection Agency
EPA's Office of Air Quality Planning and Standards is tasked with developing an annual report
on the nation's air quality. This report, entitled National Air Quality and Emissions Trends
Report, uses various graphing techniques to present temporal and spatial trends in the data. This
paper discusses the methods employed in the report, their strong points, and their limitations.
Much of the graphical design is based on the principles of Edward Tufte and other leading
authorities on the visual display of information.
TWO TEMPLATES FOR VISUALIZING GEOREFERENCED
STATISTICAL SUMMARIES
Daniel B. Carr
Center for Computational Statistics, George Mason University
This paper presents two new templates for visualizing spatially indexed statistical summaries. The
first template, called conditioned choropleth (CC) maps, represents a powerful interactive
extension of classed choropleth maps. The basic layout is a 3 x 3 matrix of panels containing
nine juxtaposed maps. One conditioning variable corresponds to rows and the other to columns.
The analyst controls the highlighting of map regions by manipulating row and column sliders
that define acceptable intervals for the conditioning variables. A small tab in each panel shows a
value summarizing the highlighted region values. The presence or absence of main effects and
interaction is evident at a glance. Other analyst interactions include dynamic class interval
selection and simultaneous pan and zoom for all panels. The examples emphasize study of
human mortality rates for health service areas conditioned on environmental and demographic
variables.
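The panel logic described for conditioned choropleth maps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's software: the dictionary keys ("name", "rate", "row_var", "col_var"), the function names, and the choice of the mean as the tab summary are all hypothetical.

```python
from statistics import mean


def slider_class(value, low_cut, high_cut):
    """Return 0, 1 or 2 for values below, inside, or above the slider interval."""
    if value < low_cut:
        return 0
    if value <= high_cut:
        return 1
    return 2


def conditioned_panels(regions, row_cuts, col_cuts):
    """Assign each region to one cell of the 3 x 3 panel layout.

    regions:  list of dicts with hypothetical keys "name", "rate",
              "row_var" and "col_var"
    row_cuts, col_cuts: (low_cut, high_cut) pairs set by the two sliders
    """
    panels = {(r, c): [] for r in range(3) for c in range(3)}
    for region in regions:
        r = slider_class(region["row_var"], *row_cuts)
        c = slider_class(region["col_var"], *col_cuts)
        panels[(r, c)].append(region)
    return panels


def panel_tabs(panels):
    """Summary shown in each panel's tab: mean rate of its highlighted regions."""
    return {
        cell: (mean([reg["rate"] for reg in members]) if members else None)
        for cell, members in panels.items()
    }
```

Moving a slider corresponds to calling `conditioned_panels` again with new cut points; comparing the tab values across rows and columns is what makes main effects and interaction visible.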
The second template, called linked micromap (LM) plots, provides an alternative to traditional
classed choropleth maps. The new design trades off region boundary resolution for more
accurate or extensive statistical summaries. These summaries can be bar plots, dot plots, box
plots, time series, line high plots for over a hundred variables and so on. Color provides a local
link between each region's (or site's) statistical summary and its spatial position in the micromap.
Examples show numerous variations of this template. The discussion addresses pattern
discovery and work in progress for drilling down from state to county to census tract.
LISTENING TO OUR INFORMATION CUSTOMERS (PANEL SESSION)
Margaret Morgan-Hubbard, Director, EPA Office of Communications
Brendan Doyle, (Acting) Director, CEIS Customer Survey and Access Division
Emma McNamara, (Acting) Director, EIMD, OIRM (invited), and
Pat Bonner, EPA Customer Service
US Environmental Protection Agency
This session will give participants an overview of how EPA and CEIS are surveying the
Agency's current and potential environmental information customers to better understand their
needs and access preferences. Several examples of how customer feedback is helping to shape
various EPA information products and services will be introduced. CEIS will re-cap lessons
learned from the Center's customer surveys over the past two years and give an update on their
national customer telephone survey (results due this fall). Session participants will have an
opportunity to express their interests in using the Center's survey data for their own analyses and
programs. The basics of using customer feedback on your products or services will also be
covered.
The panel agenda will include:
Introductions: Brendan Doyle, CEIS
Margaret Morgan-Hubbard, OCEMR: The importance of focussing your information product or
service on your customers' needs and a vision for serving EPA's environmental information
customers in the future.
Brendan Doyle: Overview of what we've learned so far by implementing the CEIS customer
survey plan and what we hope to learn from our national information customer telephone survey
this fall.
Emma McNamara: EPA's web sites, incorporating customer input and customer service
principles into developing and maintaining a Web site.
Pat Bonner: EPA Customer Feedback 101. Will discuss how "Hearing the Voice of the
Customer" guidelines can help you to obtain useful customer feedback on your products,
processes and services.
APPLICATION OF SAMPLING IN AQUATIC RESOURCES
(CHAIR: HENRY KAHN, EPA/OW/EAD)
COMPOSITE SAMPLING ANALYSIS OF CONTAMINANT LEVELS IN
FISH
Henry D. Kahn and Silvestre Colon
US Environmental Protection Agency
Samples of fish formed by physically mixing, i.e., grinding together, a number of fish into a
combined, aggregate sample are referred to as "composite samples". Chemical analysis of
composite samples is a cost-effective mechanism for estimating mean levels when the cost of
analysis is high and the cost of obtaining sample units, such as individual fish, is relatively low.
A possible concern in the analysis of composite sampling data is the absence of measurement
results on individual units that comprise the composite. This presentation considers a set of data
on contaminant levels measured in composite samples of fish and in the individual fish that
constitute the composite samples. The results allow for comparison of composite and individual
analyses. Additional topics discussed are: estimation of variance components associated with the
composite samples, using measurements made on subsamples of the composites and effects of
fish length and weight on contaminant levels.
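The relationship between composite and individual measurements can be illustrated with a short simulation. This is a sketch under simplifying assumptions (perfect mixing, no measurement error, normally distributed individual values, equal fish masses in the simulation); the function names and all numbers are hypothetical and are not taken from the data set discussed in the presentation.

```python
import random
from statistics import mean, variance


def composite_concentration(concentrations, masses):
    """Concentration of a composite formed by grinding whole fish together.

    Each fish contributes in proportion to its mass, so the composite value
    is the mass-weighted mean of the individual concentrations.
    """
    total_mass = sum(masses)
    return sum(c * m for c, m in zip(concentrations, masses)) / total_mass


def simulate_composites(n_composites, fish_per_composite, mu, sigma, rng):
    """Simulate equal-mass composites from normally distributed fish values."""
    results = []
    for _ in range(n_composites):
        fish = [rng.gauss(mu, sigma) for _ in range(fish_per_composite)]
        results.append(composite_concentration(fish, [1.0] * fish_per_composite))
    return results


if __name__ == "__main__":
    rng = random.Random(0)
    composites = simulate_composites(2000, 5, mu=10.0, sigma=2.0, rng=rng)
    # Under these assumptions the variance between composites is close to
    # sigma**2 / fish_per_composite, while the mean is unchanged.
    print(mean(composites), variance(composites))
```

The simulation shows why composites estimate the mean cheaply but hide fish-to-fish variability, which is the concern the abstract raises about missing individual measurements.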
NATIONAL FISH TISSUE CONTAMINANT LAKE SURVEY: A NEW
SPATIALLY-RESTRICTED SURVEY DESIGN
Anthony R. Olsen
NHEERL Western Ecology Division, U.S. Environmental Protection Agency
In 1998, the U.S. Environmental Protection Agency initiated a national study of fish tissue
contaminants in lakes and reservoirs. The study requires the development of a survey design to
meet the study objectives. For the national lake study, a list frame of waterbodies greater than 1
hectare is available. The frame provides information on the lake surface area and its geographic
location, in the form of a geographic information system (GIS) coverage. However, the frame
includes waterbodies that do not meet the definition of the target population. The frame
includes 270,761 waterbodies. This paper develops the survey designs for the study and
discusses how an underlying discrete global grid can be used to control the spatial distribution of
the sample and to address the imperfection of the frame. The survey design does not use finite
population sampling theory, but a continuous population in a bounded area theory that parallels
it. The spatially-restricted design enables the concept of a systematic sample to be implemented
while maintaining the ability to obtain design-based estimates and variance estimates.
8:30-10:30 Thursday, May 13
APPLYING MONTE CARLO SIMULATION TECHNIQUES WITH S-
PLUS
Steven P. Millard
Probability Statistics and Information (PSI)
Monte Carlo Simulation covers a broad range of topics, including simply generating random
numbers, probabilistic risk assessment, bootstrapping to obtain the distribution of (and hence
confidence intervals for) some statistic for which the distribution is unknown or not assumed,
and permutation tests. This talk will discuss the concepts behind each of these main topics, then
use examples to show you how to implement these methods using S-PLUS and
ENVIRONMENTAL STATS for S-PLUS.
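Two of the techniques named in this abstract, bootstrapping and permutation tests, can be sketched briefly. The talk itself used S-PLUS; the Python below is a substitute chosen for illustration, and the function names, default resample counts, and the percentile method for the interval are assumptions rather than anything taken from the talk.

```python
import random
from statistics import mean


def bootstrap_ci(data, stat=mean, n_boot=2000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = rng or random.Random()
    n = len(data)
    # Resample the data with replacement and recompute the statistic each time.
    stats = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def permutation_test(x, y, n_perm=2000, rng=None):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = rng or random.Random()
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the two groups
        diff = abs(mean(pooled[:len(x)]) - mean(pooled[len(x):]))
        if diff >= observed:
            count += 1
    # Adding one to numerator and denominator keeps the p-value above zero.
    return (count + 1) / (n_perm + 1)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is useful when results have to be reported.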
9:00-10:30 8:30-10:30 Thursday, May 13
THE DATA COME IN, THE DATA GO OUT
REDUCING PAPERWORK BURDENS AT EPA
Rick Westlund
Office of Policy, US Environmental Protection Agency
In the March 1995 Reinventing Environmental Regulation report, EPA established a long term
commitment to identify and eliminate obsolete, duplicative, and unnecessary monitoring,
reporting, and record keeping requirements. To date, EPA has removed more than 25 million
baseline burden hours, and built an internal watchdog culture dedicated to avoiding unnecessary
new paperwork burdens. Although total burden has continued to creep upward due to new
statutory requirements and new right-to-know collections, EPA programs continue to develop
creative approaches to chip away at burden without endangering environmental objectives. In
addition, EPA is developing many enterprise-wide initiatives designed as strategic investments
with the potential for much larger burden reductions three to five years from now.
In the last several years the Agency has accelerated its efforts to improve information collection
management, with a particular focus on reducing burdens associated with reporting and record
keeping, while at the same time enhancing data quality, coordinating our data activities with
States, improving our collection and display technologies, and compiling our data into a single
Internet site. We have taken major steps, but there is still more to do. The public's
right-to-know is now a fundamental cornerstone of our work at EPA, and we have all worked
hard to put information into the hands of the American people in the belief that this is one of the
best ways to protect public health and the environment. In the course of doing so, we have
learned that the Agency's effective management of its data is central to the measurement of our
progress in delivering the protections the American people expect. As we embark on a new era
of information technology and enhanced public access to data, we are committed to minimizing
our paperwork burden on the public while ensuring that our data are timely, accurate, useful to
the public, and able to effectively inform our own decision making.
The Agency has several initiatives underway to redesign or refocus the way we manage
information collection with primary goals to reduce burden on the public while accomplishing
our environmental protection mission. The most encompassing initiative is the recently launched
reorganization plan involving the formation of a new information organization that will bring
together all Agency information programs to better manage our information resources with an
expressed goal of reducing burden on the public while enhancing the data quality and integrity as
it is used within the Agency and made available to others outside the Agency.
Another major initiative, started over a year ago after the 1997 Information Streamlining Plan, is
the continued development of the Reinventing Environmental Information (REI) initiative. In its
early stages, the plan focuses on data quality and building infrastructure, but burden reduction
savings will become more apparent as the efficiencies in reporting options become available.
The Agency has been very active working with the States on burden reduction especially through
partnership workgroups with the Environmental Council of States (ECOS). The workgroup is
identifying burden reduction opportunities by defining what information is and should be
collected, how information is transmitted, and how information is used. The workgroup is also
engaging industry, the public and others to help draft a tactical approach to burden reduction.
Within the Agency, the program offices are developing a range of streamlining and reinvention
initiatives to reduce burdens. They range from whole program streamlining as in the Office of
Solid Waste's comprehensive review of the RCRA program to the Office of Air's reengineering
of the pre-production certification program for new motor vehicles.
10:45-11:45 Thursday, May 13
PERSPECTIVES ON DATA AND INFORMATION FROM THE
DEPARTMENT OF HEALTH AND HUMAN SERVICES
William F. Raub, Ph.D.
Deputy Assistant Secretary for Science Policy
Department of Health and Human Services
The Department of Health and Human Services (DHHS) employs a wide variety of data and
information systems as it seeks to enhance the well-being of Americans by providing for
effective health and human services and by fostering strong, sustained advances in the sciences
underlying medicine, public health, and social services. DHHS data-oriented efforts range from
(a) collection of national vital and health statistics to (b) systematic surveillance focused on
specific diseases and disorders to (c) special surveys oriented to particular public health issues
and/or particular population groups. A major contemporary challenge is to improve surveillance
for new and reemerging infectious diseases in general while improving preparedness to detect
and respond to potential acts of biological terrorism.
-------
Evaluation Form
1999 EPA Conference on Environmental Statistics and Information
May 10-13, 1999
Please help us improve future conferences by taking a few moments to provide us with your
comments on this year's conference.
1 Overall Conference Evaluation
Did you broaden your EPA contacts?
Did you update your current knowledge?
Did you find exposure to new material?
Did you gain more agency-wide perspective?
Were you able to exchange technical methods?
Were you able to discuss problems and concerns?
Please check one box
Very Much    Some Extent    Limited Extent
2 Session Evaluation
Workshop on Monte Carlo Methods in Environmental Statistics
(Woollcott Smith and Peter Petraitis)
EPA's Web Site and You (Joe Anderson)
Information, Statistics and the Region (Al Morris)
Keynote Address (Jay Hakes)
Statistical Training Session (Videotapes)
Statistics, Information, and GPRA (George Bonina and Judith
Calem Lieberman)
Local Applications of EPA Data (Ron Shafer, Henry Topper,
Kimberly Nelson, N. Bouwes, and Steven M Hassur)
Statistical Methods for Lab and Air Quality Data Analysis (Larry
Cox, Mary Lou Thompson, Kerrie Nelson, Peter Craigmile, and
Joel Reynolds)
Databases: The Manager's View (Phil Lindenstruth, Abraham
Siegel, and Mike Mundell)
Ensuring the Quality of Environmental Information (Nancy
Wentworth and Malcolm Bertoni)
Please check one box
Highly Relevant    Fairly Relevant    Not Very Relevant
-------
2 Session Evaluation Continued
Models and Model Assessment of Environmental Data (Mary
Lou Thompson, Rafael Ponce, Samantha Bates, A. C. Cullen,
A E Raftery, Marianne Turley, E. David Ford, and Joel
Reynolds)
Use of the Internet for Sharing Statistics (Steve Hufford, Pat
Garvey, Anne Frondorf, Chris Miller, and Bob Shepanek)
Epidemiology and Risk Assessment: Cumulative and/or
Aggregate Risk Assessment (Ruth Allen, David Miller, Hans
Allender, and Breeda Reilley)
Statistical Research Issues in Quality Assurance (John Warren,
Rob O'Brien, and Charles White)
Data, Information and Statistics: Putting it All Together for
Decision-Making (Tom Curran)
Poster and Computer Sessions
Data Integration and Quality: Vision for the Future (Ruth Allen,
Susan Devesa, D Grauman, W Blot, G Pennello, R Hoover,
and J Fraumeni)
Analysis of Cleanups (Michael J Messner and Bimal Sinha)
Sampling and Design Issues in Environmental Studies (Tony
Olsen, David Marker, and Paul D Sampson)
Some Analyses and Potential Analyses at EPA (Doreen Sterling,
Mike Barrette, Tom DeMoss, Tom Pheiffer, and John Moses)
Measurement Issues Related to our Water Supply (Barnes
Johnson, Andrew Schulman, Jennifer Wu, Benjamin Smith,
Henry Kahn, Helen L Jacobs, Kathleen A.
Stralka, and Virginia A. Cohen-Bradley)
Assessing Risk (Elizabeth Margosches, Mary Marion, David
Pawel, and Margaret Conomos)
A Walk on the Wild Side of Statistical Communication
(Woollcott Smith)
Proposed Information Management Office (Robert English)
The Visual Presentation of Data (Al Morris, David Mintz, and
Daniel Carr)
Listening to Our Information Customers (Brendan Doyle,
Margaret Morgan-Hubbard, Pat Bonner, and Emma McNamara)
Please check one box
Highly Relevant    Fairly Relevant    Not Very Relevant
-------
2 Session Evaluation Continued
Application of Sampling in Aquatic Resources (Henry Kahn,
Silvestre Colon, Anthony R Olsen, and Barnes Johnson)
Statistics and Information at EPA as we Start a New Century:
Where Are We Going? (Phil Ross, Larry Cox, Karen Klima,
Heather Case, and G.P. Patil)
Applying Monte Carlo Simulation Techniques with S-PLUS
(Steven P Millard)
The Data Come In, the Data Go Out (Rick Westlund and
Charlotte Cottrill)
WRAP-UP SESSION (Barry Nussbaum and William Raub)
Please check one box
Highly Relevant    Fairly Relevant    Not Very Relevant
3 What were the greatest strengths of the conference? What aspects did you like the most?
4 What were the greatest weaknesses of the conference? What aspects and sessions did you
like the least?
5 Would you be interested in other training sessions that would introduce you to a new
development in applied statistical methodology?
Yes No Unsure
Suggestions for topics
6 Are you planning to attend next year's conference on environmental statistics and information?
Yes No Unsure
7 Other Comments
-------
REGISTRANTS
The 1999 EPA Conference on
Environmental Statistics and Information
SugarLoaf Conference Center
Philadelphia, Pennsylvania May 10-13, 1999
RUTH ALLEN
OPP/HED/CEB
US EPA
Phone 703-305-7191
Fax 301-402-4279
E-mail Allen.ruth@epamail.epa.gov
HANS ALLENDER
US EPA
Phone 703-305-7883
Fax 703-605-0645
E-mail Allender.hans@epamail.epa.gov
JOSEPH ANDERSON
US EPA
Phone 202-260-3016
LARA P. AUTRY
OAR/OAQPS/EMAD
US EPA
Phone 919-541-5544
Fax 919-541-1039
E-mail Autry.lara@epa.gov
MICHAEL BARRETTE
US EPA
Phone 202-564-7019
E-mail Barrette.michael@epamail.epa.gov
SAMANTHA BATES
UNIV OF WASHINGTON
MALCOLM BERTONI
RESEARCH TRIANGLE INSTITUTE
Phone 202-728-2067
Fax 202-728-2095
E-mail MJB@rti.org
CINDY BETHELL
US EPA
GEORGE BONINA
OIRM
Phone 202-260-6227
E-mail Bonina.george@epa.gov
PATRICIA BONNER
US EPA
Phone 202-260-0599
E-mail Bonner.patricia@epamail.epa.gov
ED BRANDT
CEIS/IA1AD
US EPA
Phone 202-260-6217
E-mail Brandt.edward@epamail.epa.gov
LORI BRUNSMAN
OPPTS/OPP/HED
US EPA
Phone 703-308-2902
Fax 703-605-0645
E-mail Brunsman.lori@epamail.epa.gov
-------
DANIEL CARR
GEORGE MASON UNIV
Phone 703-993-1671
Fax 703-993-1521
HEATHER ANNE CASE
OP/CEIS
US EPA
Phone 202-260-2360
Fax 202-260-4903
E-mail Case.heather@epamail.epa.gov
WENDY CLELAND-HAMNETT
OP/CEIS
US EPA
Phone 206-260-4030
Fax 202-260-0275
E-mail Cleland-Hamnett.wendy@epa.gov
SILVESTRE COLON
OFFICE OF WATER
US EPA
Phone 202-260-3066
Fax 202-260-7185
E-mail Colon.silvestre@epamail.epa.gov
VIRGINIA A. COLTEN-BRADLEY
OSWER/EMRAD
US EPA
Phone 703-308-8613
Fax 703-308-0509
E-mail Colten-bradley.virginia@epamail.epa.gov
MARGARET CONOMOS
OPPE/CEIS
US EPA
Phone 202-260-3958
Fax 202-260-4968
E-mail Conomos.margaret@epa.gov
LAWRENCE COX
ORD/NERL
US EPA
Phone 919-541-2648
Fax 919-541-7588
E-mail Cox.larry@epamail.epa.gov
PETER CRAIGMILE
UNIV OF WASHINGTON
DAVID CROSBY
AMERICAN UNIVERSITY
Phone 202-885-3155
E-mail Dcrosby@american.edu
THOMAS CURRAN
OAR/OAQPS
US EPA
Phone 919-541-5694
Fax 919-541-4028
E-mail Curran.thomas@epamail.epa.gov
THOMAS DEMOSS
US EPA MAIA
Phone 410-305-2739
Fax 410-305-3095
E-mail Demoss.tom@epa.gov
SUSAN DEVESA
NATIONAL CANCER INSTITUTE
NIH
Phone 301-496-8104
Fax 301-402-0081
E-mail Devesas@epndce.nci.nih.gov
SUSAN DILLMAN
OPPTS/OPPT/NPCD
US EPA
Phone 202-260-5375
Fax 202-260-0001
E-mail Dillman.susan@epa.gov
KHOAN TAN DINH
US EPA
Phone 202-260-3891
Fax 202-260-1283
E-mail Dinh.khoan@epamail.epa.gov
DONALD DOERFLER
ORD/ERC/NHEERL
US EPA
Phone 919-541-7741
E-mail Doerfler.donald@epamail.epa.gov
BRENDAN DOYLE
US EPA
Phone 202-260-2693
Fax 202-260-4968
E-mail Doyle.brendan@epamail.epa.gov
-------
LEE ELLIS
CEIS
US EPA
Phone 202-260-6123
Fax 202-260-4968
E-mail Ellis.lee@epamail.epa.gov
ROBERT ENGLISH
INFO TRANS/ORG PLANNING
US EPA
Phone 202-260-5995
Fax 202-260-3655
E-mail English.robert@epamail.epa.gov
DAVID FARRAR
OPP
US EPA
Phone 703-305-5721
Fax 703-305-6309
E-mail Farrar.david@epamail.epa.gov
TERENCE FITZ-SIMONS
US EPA
Phone 919-541-0889
GEORGE T. FLATMAN
ORD/NERL-CRD
US EPA
Phone 702-798-2528
Fax 702-798-2208
E-mail George flatman@epamail.epa.gov
JOHN F. FOX
OST
US EPA
Phone 202-260-9889
Fax 202-260-7185
E-mail Fox.john@epamail.epa.gov
MARY FRANKENBERRY
OPPTS/OPP/EFED
US EPA
Phone 703-305-5694
Fax 703-305-6309
E-mail Frankenberry.mary@epamail.epa.gov
ANNE FRONDORF
US GEOLOGICAL SURVEY
Phone 703-648-4205
Fax 703-648-4224
E-mail Anne_frondorf@usgs.gov
WILLIAM GARETZ
OPPE/CEIS
US EPA
Phone 202-260-2684
E-mail Garetz.william@epamail.epa.gov
PAT GARVEY
OIRM/EIMD
US EPA
Phone 202-260-3103
Fax 202-401-8390
E-mail Garvey.pat@epamail.epa.gov
SUSAN P. GEYER
CEIS
US EPA
Phone 202-260-6637
E-mail Geyer.susan@epa.gov
MELISSA GONZALES
ORD/NHEERL
US EPA
Phone 919-966-7549
Fax 919-966-7584
E-mail Gonzales.melissa@epa.gov
PETER GOODWIN
DEAN GRADUATE SCHOOL
TEMPLE UNIVERSITY
BRIAN GREGORY
OAR/ORIA/IED/CHB
US EPA
Phone 202-564-9024
Fax 202-565-2038
E-mail Gregory.brian@epamail.epa.gov
JAY HAKES
ENERGY INFORMATION
ADMINISTRATION
-------
STEVEN M. HASSUR
OPPT
US EPA
Phone 202-260-1735
Fax 202-260-0981
E-mail Hassur.steven@epamail.epa.gov
RICHARD HEIBERGER
TEMPLE UNIVERSITY
KAREN KLIMA
US EPA
JAMES HEMBY
OAQPS
US EPA
Phone 919-541-5459
Fax 919-541-2464
E-mail Hemby.james@epa.gov
DAVID M. HOLLAND
ORD/NHEERL
US EPA
Phone 919-541-3126
Fax 919-541-1486
E-mail Holland.davidi@epamail.epa.gov
STEVE HUFFORD
CEIS
US EPA
Phone 202-260-9732
Fax 202-260-4968
E-mail Hufford.steve@epamail.epa.gov
BARNES JOHNSON
OSWER/OSW
US EPA
Phone 703-308-8855
Fax 703-308-0511
E-mail Johnson.barnes@epamail.epa.gov
HENRY KAHN
OW/EAD
US EPA
Phone 202-260-5408
Fax 202-260-7185
E-mail Kahn.henry@epamail.epa.gov
R. CATHERINE KING
US EPA OECEJ
Phone 215-814-0871
Fax 215-814-2905
E-mail King.catherine@epamail.epa.gov
ARTHUR T. KOINES
OP/CEIS
US EPA
Phone 202-260-4030
Fax 202-260-0275
E-mail Koines.arthur@epamail.epa.gov
MEL KOLLANDER
INSTITUTE FOR SURVEY RESEARCH
Phone 202-537-6845
Fax 202-537-6873
LEE KYLE
OGWDW
US EPA
Phone 202-260-1154
Fax 202-401-3041
E-mail Kyle.lee@epamail.epa.gov
PEPI HERBERT LACAYO
CEIS
US EPA
Phone 202-260-2714
Fax 202-260-4968
E-mail Lacayo.pepi@epamail.epa.gov
RASHMI LAL
OP/CEIS
US EPA
Phone 202-260-3007
Fax 202-260-8550
E-mail Rashmi.lal@epamail.epa.gov
JADE LEE
EPA OFFICE OF WATER
Phone 202-260-1996
Fax 202-260-7185
E-mail Lee.jade@epa.gov
JUDY LEE
WASTE/CHEM MGMT DIV
Phone 215-814-3401
Fax 215-814-3113
E-mail Lee.judy@epamail.epa.gov
-------
LAWRENCE LEHRMAN
RMD/OIS
US EPA
E-mail Lehrman.lawrence@epamail.epa.gov
ELEANOR LEONARD
OP/CEIS
US EPA
Phone 202-260-9753
Fax 703-525-3455
E-mail Elleonard@aol.com
JUDITH C. LIEBERMAN
OCEO
US EPA
Phone 202-260-8638
Fax 202-401-1515
E-mail Lieberman.judy@epamail.epa.gov
PHILIP LINDENSTRUTH
OFFICE OF WATER
US EPA
Phone 202-260-6549
Fax 202-260-7024
E-mail Lindenstruth.phil@epamail.epa.gov
CONNIE LORENZ
OP/CEIS/CSAD
US EPA
Phone 202-260-4660
Fax 202-260-4903
ARTHUR LUBIN
OSEA
US EPA
Phone 312-886-6226
Fax 312-353-0374
E-mail Lubin.arthur@epamail.epa.gov
ALLAN MARCUS
NCEA
US EPA
Phone 919-541-0643
Fax 919-541-1818
E-mail Marcus.allan@epa.gov
Phone
Fax
E-mail
Phone
Fax
E-mail
Phone
Fax
E-mail
Phone
Fax
E-mail
Phone
Fax
E-mail
Phone
Fax
E-mail
Phone
Fax
E-mail
ELIZABETH MARGOSCHES
OPPTS/OPPT
US EPA
Phone 202-260-1511
Fax 202-260-1279
E-mail Margosches@epamail.epa.gov
MARY A. MARION
OPPTS/OPP/HED
US EPA
Phone 703-308-2854
E-mail Marion.mary@epamail.epa.gov
DAVID MARKER
UNIV OF WASHINGTON
ETHAN MCMAHON
OP/CEIS
US EPA
Phone 202-260-8549
E-mail Mcmahon.ethan@epamail.epa.gov
MICHAEL MESSNER
OGWDW
US EPA
Phone 202-260-8107
E-mail Messner.michael@epamail.epa.gov
STEVEN P. MILLARD
PSI
Phone 206-528-4877
Fax 206-528-4802
E-mail Smillard@probstatinfo.com
CHRISTOPHER MILLER
NOAA
DAVID MILLER
OPPTS/OPP
US EPA
Phone 703-305-5352
Fax 703-305-5147
E-mail Miller.davidJ@epamail.epa.gov
-------
DAVID MINTZ
OAR/OAQPS
US EPA
Phone 919-541-5224
Fax 919-541-1903
E-mail Mintz.david@epa.gov
MARGARET MORGAN-HUBBARD
DIRECTOR, OFFICE OF COMMUNICATION
US EPA
Phone 202-260-5965
E-mail Morgan-hubbard.margaret@epamail.epa.gov
AL MORRIS
OFFICE OF ENVIRON DATA
Phone 215-814-5701
Fax 215-814-5718
E-mail morris.alvin@epa.gov
REBECCA MOSER
CEIS
US EPA
Phone 202-260-6780
Fax 202-260-4903
JOHN MOSES
OP/CEIS
US EPA
Phone 202-260-6380
Fax 202-401-7617
E-mail Moses.john@epamail.epa.gov
NICK NAPOLI
US EPA
Phone 215-816-2621
Fax 215-814-2783
E-mail Napoli.nick@epamail.epa.gov
MALIHA S. NASH
ORD/NERL-CRD
US EPA
Phone 702-798-2528
Fax 702-798-2208
E-mail Nash.maliha@epamail.epa.gov
KIMBERLY NELSON
PA DEPT OF ENVIR PROTECTION
Phone 717-787-3534
Fax 717-783-8926
E-mail Nelson.kimberly@dep.state.pa.us
BARRY NUSSBAUM
OPPE/CEIS
US EPA
Phone 202-260-1493
Fax 202-460-4968
E-mail Nussbaum.barry@epamail.epa.gov
ROB O'BRIEN
BATTELLE
Phone 509-375-6769
Fax 509-375-2604
E-mail Robert.obrien@pnl.gov
ANTHONY R. OLSEN
US EPA NHEERL
Phone 541-754-4790
Fax 541-754-4716
E-mail Tolsen@mail.cor.epa.gov
G. P. PATIL
PENNSYLVANIA ST UNIV
Phone 814-865-9442
Fax 814-863-7114
E-mail Gpp@stat.psu.edu
ROBERT M. PATTERSON
COLLEGE OF ENGINEERING
TEMPLE UNIVERSITY
Phone 215-204-1665
Fax 215-204-6936
E-mail rpatterson@thunder.temple.edu
DAVID PAWEL
ORIA
US EPA
Phone 202-564-9202
PETER PETRAITIS
UNIVERSITY OF PENNSYLVANIA
-------
ANNE POLITIS
CEIS/IAIAD
US EPA
Phone 202-260-5345
Fax 202-260-4903
E-mail Politis.anne@epamail.epa.gov
RAFAEL PONCE
UNIV OF WASHINGTON
WILLIAM F. RAUB
DEPUTY ASST SECTY SCI POLICY
DEPT HEALTH & HUMAN SERV
BREEDA REILLY
CEPPO
US EPA
Phone 202-260-0716
E-mail Reilly.breeda@epamail.epa.gov
JOSEPH RETZER
OP
US EPA
Phone 202-260-2472
E-mail Retzer.joseph@epamail.epa.gov
JOEL REYNOLDS
UNIV OF WASHINGTON
EDNA RODRIGUEZ
OP/CEIS/CSAD
US EPA
Phone 202-260-3301
Fax 202-260-4903
E-mail Rodriguez.edna@epamail.epa.gov
N. PHILLIP ROSS
CEIS
US EPA
Phone 202-260-5244
Fax 202-260-8550
E-mail Ross.Nphillip@epamail.epa.gov
KRISTEN RYDING
OEA
US EPA
Phone 206-553-6918
PAUL SAMPSON
UNIV OF WASHINGTON
DINA SCHREINEMACHERS
EBB/HSD/NHEERL/ORD
US EPA
Phone 919-966-5875
Fax 919-966-7584
E-mail Schreinemachers.dina@epamail.epa.gov
ANDREW SCHULMAN
OGWDW/SRMD/TAB
US EPA
Phone 202-260-4197
Fax 202-260-3762
E-mail Schulman.andrew@epamail.epa.gov
RONALD SHAFER
OP/CEIS
US EPA
Phone 202-260-6766
Fax 202-260-4968
E-mail Shafer.ronald@epamail.epa.gov
BOB SHEPANEK
ORD/NCEA
US EPA
Phone 202-564-3348
Fax 202-565-0061
E-mail Shepanek.robert@epamail.epa.gov
CAROLYN SHETTLE
INSTITUTE FOR SURVEY RESEARCH
Phone 202-537-6793
Fax 202-537-6873
E-mail cshettle@ioip.com
ABRAHAM SIEGEL
OW/OGWDW
US EPA
Phone 202-260-2804
E-mail Siegel.abraham@epamail.epa.gov
-------
BIMAL SINHA
OPPE/CEIS
US EPA
Phone 202-260-2681
E-mail Sinha.bimal@epamail.epa.gov
BENJAMIN SMITH
US EPA
Phone 202-260-3026
Fax 202-260-3762
E-mail Smith.ben@epamail.epa.gov
WILLIAM P. SMITH
OPPE/CEIS
US EPA
Phone 202-260-2697
Fax 202-260-4968
E-mail Smith.will@epamail.epa.gov
WOOLCOTT SMITH
TEMPLE UNIVERSITY
MINDI SNOPARSKY
HYDROGEOLOGY
EPA
Phone 215-814-3316
E-mail Snoparsky.mindi@epamail.epa.gov
JOHN A. SORRENTINO
TEMPLE UNIVERSITY
Phone 215-204-8164
E-mail Sorrento@astro.ocis.temple.edu
DOREEN STERLING
CEIS
US EPA
Phone 202-260-2766
Fax 202-260-8550
E-mail Sterling.doreen@epamail.epa.gov
WILLIAM TASK
VICE-PROVOST
TEMPLE UNIVERSITY
MARY LOU THOMPSON
UNIV OF WASHINGTON
Phone 206-616-2723
Fax 206-616-2724
E-mail Mlt@biostat.washington.edu
HENRY TOPPER
OPPT
US EPA
Phone 202-260-6750
Fax 202-260-2217
E-mail topper.henry@epa.gov
MARIANNE TURLEY
UNIV OF WASHINGTON
Phone 206-616-9288
Fax 206-616-9443
E-mail Marianne@cqs.washington.edu
DIANNE WALKER
US EPA REGION III
Phone 215-814-3297
Fax 215-814-2134
E-mail Walker.dianne@epamail.epa.gov
JOHN WARREN
ORD/NCERQA/QAD
US EPA
Phone 202-260-9464
Fax 202-401-7922
E-mail Warren.john@epamail.epa.gov
NANCY WENTWORTH
ORD/NCERQA/QAD
US EPA
Phone 202-564-6830
Fax 202-565-2441
E-mail Wentworth.nancy@epamail.epa.gov
ELLEN WERNER
INSTITUTE FOR SURVEY RESEARCH
Phone 202-537-6735
Fax 202-537-6873
E-mail Ewerner@ioip.com
RICK WESTLUND
US EPA OFFICE OF POLICY
Phone 202-260-2745
Fax 202-260-9322
E-mail Westlund.rick@epa.gov
-------
CHARLES WHITE
OW/OST/EAD
US EPA
Phone 202-260-5411
Fax 202-260-7185
E-mail White.chuck@epamail.epa.gov
NATHAN WILKES
OFFICE OF POLICY
US EPA
Phone 202-260-4910
Fax 202-260-4903
E-mail Wilkes.nathan@epa.gov
JENNIFER WU
OW/OGWDW/SBMD
US EPA
Phone 202-260-0425
Fax 202-260-3762
E-mail Wu.jennifer@epamail.epa.gov
-------