An Evaluation of the Approach for
Assessing Risks to the Benthic
Invertebrate Community at the
Portland Harbor Superfund Site

Preliminary Draft
Prepared for:

U.S. Environmental Protection Agency
Oregon Operations Office
805 SW Broadway, Suite 500
Portland, Oregon 97205

and

Parametrix, Inc.
33972 Texas Street SW
Albany, Oregon 97321
Prepared - September, 2008 - by:

D.D. MacDonald
MacDonald Environmental Sciences Ltd.
#24 - 4800 Island Highway North
Nanaimo, British Columbia V9T 1W6

P.F. Landrum
Landrum and Associates
6829 Earhart Road
Ann Arbor, Michigan 48105

           MESL DOCUMENT No. MESL-PHR-BICRA-0908-V2

Table of Contents

      List of Figures

      List of Acronyms

      1.0    Introduction

      2.0    Background

      3.0    Terms of Reference for this Evaluation

      4.0    Recommendations and Associated Rationale
            4.1    Scope of this Evaluation
            4.2    Recommended Framework for Assessing Risks to the Benthic
                  Invertebrate Community
            4.3    Recommended Procedures for Designating Sediment Samples
                  as Toxic or Not Toxic
            4.4    Recommended Procedures for Developing a Reference
                  Envelope for Interpreting Data from Whole-Sediment Toxicity
                  Tests
            4.5    Recommended Procedures for Integrating Data on Multiple
                  Toxicity Test Endpoints
            4.6    Recommended Procedures for Evaluating Relationships
                  Between Sediment Chemistry and Sediment Toxicity
            4.7    Recommended Procedures for Developing Toxicity
                  Thresholds
            4.8    Procedures for Evaluating Concentration-Response Models
            4.9    Recommended Procedures for Assessing Risks to Benthic
                  Invertebrates

      5.0    Summary and Conclusions

      6.0    References

Addendum 1 Further Evaluation of the Approach for Assessing Risks to
      the Benthic Invertebrate Community at the Portland Harbor
      Superfund Site
             A1.0  Introduction
             A2.0  Responses to Additional Questions
             A3.0  Application of Regional Sediment Evaluation Team
                   (RSET) Process to the Portland Harbor Site
             A4.0  Development of a Reference Envelope for Portland Harbor
             A4.1  Approaches to Selecting Reference Locations
             A4.2  Criteria for Identifying Reference Sediment Samples
             A5.0  Development of Clean-up Goals for Portland Harbor
             A6.0  References

      Table A1    Reliability of the sediment toxicity thresholds (STTs)
                   that were derived based on the results of 28-day toxicity
                   tests with the amphipod, Hyalella azteca, and the
                   mussel, Lampsilis siliquoidea (Endpoints: survival and
                   biomass)
      Table A2    Predictive ability of the sediment toxicity thresholds
                   (STTs) that were derived based on the results of 28-day
                   toxicity tests with the amphipod, Hyalella azteca, and
                   the mussel, Lampsilis siliquoidea (Endpoints: survival
                   and biomass)
      Table A3    Incidence of toxicity to Ampelisca abdita and Hyalella
                   azteca exposed to whole-sediment samples with various
                   mean probable effect concentration-quotient (PEC-Q)
                   distributions
      Table A4    Biological conditions that occur within the three
                   categories of risk to the benthic invertebrate community
                   in the Calcasieu Estuary, identified using the risk
                   designations assigned to each sample
      Figure A1   Relationship between the geometric mean of the mean
                   PEC-Q and the average survival of the freshwater
                   amphipod, Hyalella azteca, in 28-d toxicity tests (data
                   source: MacDonald et al. 2002; dashed lines represent
                   95% prediction limits)

List of Figures

      Figure 1    Scatter plot showing the relationship between amphipod
                  (Hyalella azteca) survival and biomass (n = 76)

      Figure 2    Scatter plot showing the relationship between amphipod
                  (Hyalella azteca) survival and midge (Chironomus dilutus)
                  survival (n = 76)

      Figure 3    Scatter plot showing the relationship between amphipod
                  (Hyalella azteca) survival and midge (Chironomus dilutus)
                  biomass (n = 76)

List of Acronyms

      BERA      -  baseline ecological risk assessment
      CERCLA    -  Comprehensive Environmental  Response,  Compensation,  and
                     Liability Act
      COPC      -  chemical of potential concern
      DW        -  dry weight
      ERA        -  ecological risk assessment
      ESB-TU     -  equilibrium partitioning-based sediment benchmark-toxic unit
      foc          -  fraction organic carbon
      FPM        -  floating percentile model
      iAOPC      -  initial area of potential concern
      LOE        -  line-of-evidence
      LRM        -  logistic regression model
      LWG       -  Lower Willamette Group
      MSD        -  minimum significant difference
      NOAA      -  National Oceanic and Atmospheric Administration
      PAH        -  polycyclic aromatic hydrocarbon
      PCB        -  polychlorinated biphenyl
      PEC-Q      -  probable effect concentration-quotient
      PEL        -  probable effect level
      PRG        -  preliminary remediation goal
      PRP        -  potentially responsible party
      QAPP      -  Quality Assurance Project Plan
      RSET       -  Regional Sediment Evaluation Team
      RI/FS       -  remedial investigation/feasibility study
      SEM-AVS   -  simultaneously extracted metals minus acid volatile sulfide
      SFF        -  Sustainable Fisheries Foundation
      SQG        -  sediment quality guideline
      SQV        -  sediment quality value
      TEL        -  threshold effect level
      TIE         -  toxicity identification evaluation
      TMDL      -  total maximum daily load
      USEPA     -  United States Environmental Protection Agency
      WOE       -  weight-of-evidence


1.0  Introduction
      The Portland Harbor Comprehensive Environmental Response, Compensation, and
      Liability Act (CERCLA) site is located in Portland, Oregon and includes about 11
      miles of the lower Willamette River and surrounding upland areas that discharge to
      the river. The Willamette River is a major tributary to the Columbia River.  As part
      of the overall remedial investigation/feasibility study (RI/FS) that is being conducted
      at the site,  assessments  of the nature and extent of contamination, of risks to
      ecological receptors, and of risks to  human health have been ongoing for some time.
      These assessment activities are being led by the potentially responsible parties (PRPs)
      through work conducted by the Lower Willamette Group (LWG).

      As  part of the RI/FS  process, the  LWG is conducting  a baseline ecological  risk
      assessment (BERA) of the Portland Harbor site. According to the baseline problem
      formulation that has been developed for the site, the BERA is intended to assess risks
      to aquatic plants,  benthic macroinvertebrates, bivalves, decapods, fish, amphibians,
      aquatic-dependent  birds, and  aquatic-dependent  mammals  (USEPA  2008).
      Importantly, the problem formulation document identifies the assessment endpoints
      and the measurement  endpoints that will be evaluated in the BERA. For benthic
      macroinvertebrates, the BERA is intended to provide a basis for assessing effects on
      the survival, growth,  and reproduction of benthic invertebrates associated with
      exposure to contaminated sediments and transition zone water  (i.e., pore water) in
      Portland Harbor.   The measurement endpoints  that were  identified to support
      evaluation of the status of the  assessment endpoint include (USEPA 2008):

          •   Whole-sediment toxicity;
          •   Whole-sediment chemistry;
          •   Surface-water chemistry;
          •   Pore-water chemistry; and,
          •   Invertebrate-tissue chemistry.

 A number of procedures have been identified for interpreting the data collected in the
 study area relative to evaluation of this assessment endpoint. For example, the LWG
 (2004) identified provisional toxicity reference values for use in the ecological risk
 assessment process. In addition, LWG described procedures for estimating risks to
 benthic invertebrates using sediment toxicity tests (LWG 2005a) and using predictive
 models based on sediment toxicity tests (LWG 2006).  More recently, United States
 Environmental Protection Agency (USEPA) identified specific analytical procedures
 for interpreting these  data in the problem formulation document and supporting
 documentation (USEPA 2008). While there are many similarities among the various
 data interpretation procedures that have been identified to date, LWG and USEPA
 have had some difficulty in coming to agreement on the details of these approaches
 to data analysis.

 Both LWG and USEPA recognize  that resolving differences regarding the data
 analysis process for assessing risks to benthic invertebrates could be challenging. For
 this reason, LWG and USEPA have agreed to solicit an independent evaluation of the
 various approaches that  have been proposed to date to provide a perspective that
 could help to  identify a mutually-acceptable path forward.  More specifically, Don
 MacDonald and Peter  Landrum were retained by Parametrix, Inc., on behalf of the
 LWG and USEPA, to conduct such an evaluation of approaches for assessing risks
 to the benthic community at the Portland Harbor site.  This document presents the
 background information (Section 2.0) and terms of reference (Section 3.0) that were
 provided by USEPA. In addition, this document summarizes the recommendations
 that are offered to LWG and USEPA for assessing risks to benthic invertebrates using
 the data and information that have been collected at the site (Section 4.0). Responses
 to each of the seven questions posed by USEPA in the terms of reference are provided
 in the Summary and Conclusions (Section 5.0) of this document.

2.0  Background

      As indicated above, LWG and USEPA agreed to have Don MacDonald and Peter
      Landrum conduct an independent evaluation of the various approaches for assessing
      risks to benthic invertebrates at the Portland Harbor site. To facilitate this evaluation,
      the various documents pertaining to the benthic invertebrate portion of the BERA,
      prepared by LWG or USEPA, were provided to these reviewers.  In addition, the
      reviewers were provided with access to the  data and information that have been
      collected to date at  the site.  Furthermore, additional background  information was
      provided by USEPA, as follows:

      Portland Harbor Work Plan:  Due to the large size of the Portland Harbor site
      (approximately 11 river miles), USEPA and the LWG agreed to use
      sediment and bioassay results to "develop a predictive  model of chemical-to-effects
      to assess risk from bulk sediment."   This  approach was not described  in the
      programmatic work plan (April 2004) but rather in the technical memorandum -
      Estimating Risks to Benthic Organisms using Sediment Bioassays (March 18, 2005).
      This technical memorandum specified the sediment bioassay tests that would be used
      at the site (10-day Chironomus and 28-day Hyalella), the endpoints (growth and
      mortality), the hit/no-hit designations (10% and 25% difference from control for the
      two mortality endpoints, 25% and 40% difference from control for the Hyalella growth
      endpoint, and 20%  and 30%  difference from control  for the Chironomus growth
      endpoint),  and  the  approaches  that would be considered to develop predictive
      relationships [1) sediment quality values (SQVs) derived using database percentiles,
      2) SQVs derived using consensus-based values, 3) a quotient method, 4) the floating
      percentile method, and 5) logistic  regression analysis].  It was agreed that each
      predictive relationship would be evaluated using measures such as false positive and
      false negative reliability rates.
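
      The hit/no-hit designation and the reliability measures described above can be
      illustrated with a short sketch. It is not taken from the technical memorandum:
      the function and variable names are hypothetical, responses are assumed to be
      expressed so that a higher value is better (e.g., percent survival or growth),
      a difference equal to a threshold is assumed to count as a hit, and the error
      rates are assumed to be computed relative to the observed (measured) outcome.

```python
# Illustrative sketch only; names and boundary handling are assumptions.

# Thresholds (percent difference from control) as listed in the paragraph above:
# (lower threshold, upper threshold) for each test organism and endpoint.
THRESHOLDS = {
    ("Chironomus", "mortality"): (10.0, 25.0),
    ("Hyalella", "mortality"): (10.0, 25.0),
    ("Hyalella", "growth"): (25.0, 40.0),
    ("Chironomus", "growth"): (20.0, 30.0),
}


def percent_difference(sample_response, control_response):
    """Percent reduction of the sample response relative to the control."""
    if control_response <= 0:
        raise ValueError("control response must be positive")
    return 100.0 * (control_response - sample_response) / control_response


def hit_level(organism, endpoint, sample_response, control_response):
    """Return 0 (no hit), 1 (lower threshold met) or 2 (upper threshold met)."""
    lower, upper = THRESHOLDS[(organism, endpoint)]
    diff = percent_difference(sample_response, control_response)
    if diff >= upper:
        return 2
    if diff >= lower:
        return 1
    return 0


def error_rates(predicted_hits, observed_hits):
    """Return (false positive rate, false negative rate).

    Assumed definitions: the false positive rate is the share of observed
    no-hit samples that were predicted to be hits; the false negative rate
    is the share of observed hit samples that were predicted to be no-hits.
    """
    pairs = list(zip(predicted_hits, observed_hits))
    false_pos = sum(1 for p, o in pairs if p and not o)
    false_neg = sum(1 for p, o in pairs if not p and o)
    n_no_hit = sum(1 for _, o in pairs if not o)
    n_hit = len(pairs) - n_no_hit
    fp_rate = false_pos / n_no_hit if n_no_hit else float("nan")
    fn_rate = false_neg / n_hit if n_hit else float("nan")
    return fp_rate, fn_rate
```

      For example, under these assumptions a Hyalella growth response of 70 against a
      control response of 100 (a 30% difference) meets the lower threshold but not the
      upper one.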

      Round 2 Data Collection:  In 2004, 233 sediment bioassay tests were performed on
      sediment samples collected from the Portland Harbor  site. Sample locations were
      selected to ensure that bioassay tests were performed across a range of contaminant
 concentrations and sources. Results were presented in the Round 2A Data Report -
 Sediment Toxicity Testing (April 8, 2005) and are also available in Query Manager,
 a database developed and maintained by the National Oceanic and Atmospheric
 Administration (NOAA).

 Preliminary Evaluation of Benthic Toxicity Results: Once the Round 2 Bioassay
 results were received, USEPA and the LWG embarked on a series of discussions to
 determine  which predictive model(s) to apply at the site.  The LWG presented an
 analysis that suggested that the Probable Effect Concentration-Quotient  (PEC-Q)
 approach was not a reliable predictor of sediment toxicity at the site and that the
 predictive models should focus on the floating percentile and logistic regression
 models.  It was agreed  that the  models would  consider three different hit/no-hit
 thresholds - 10%, 20% and 30% difference  from control.  The LWG also raised
 concerns  about the reliability of the Hyalella growth  endpoint in  the floating
 percentile model.

 Benthic Interpretive  Report:   On March  17, 2006, the LWG  submitted the
 Interpretive Report:  Estimating Risks to Benthic Organisms using Predictive Models
 Based on Sediment Toxicity Tests. This report presented an evaluation of the floating
 percentile and logistic regression  models as well as a comparison to existing SQVs.
 The stated  goal of the predictive model is "to  derive SQVs that are sufficiently
 reliable for predicting benthic toxicity within the study area" and to develop  a line-of-
 evidence "for identifying areas where chemical concentrations in sediment may pose
 a risk to benthic invertebrates."

 On July 6, 2006, USEPA commented on the Benthic Interpretive Approach.  The
 LWG responded to these comments on September 1, 2006. In the LWG response to
 comments, there were a number of comments that the LWG identified as category 1
 - strongly disagree; cannot accept. In particular, the LWG disagreed with USEPA's
 comment to include the Hyalella growth endpoint in the floating percentile model and
 to consider effects level  1 (10% difference from  control) in the development of the
 predictive models.  In addition, the LWG agreed to the use of the alternative logistic
 regression model using a larger, non-site-specific freshwater database for the
 Hyalella 28-day growth and survival test as a complementary line-of-evidence (LOE)
 to the floating percentile model.  The LWG also agreed to use the revised logistic
 regression model based on the Hyalella pooled endpoint and the floating percentile
 model based on Chironomus growth, Chironomus mortality, and Hyalella mortality
 endpoints as separate LOEs in assessing risks to the benthic community.

 Round 2 Report:  On February  21, 2007, the LWG submitted the Comprehensive
 Round 2 Site Characterization Summary and Data  Gaps Report.   In the Round 2
 Report, the evaluation of benthic risks considered  the floating percentile model -
 effect levels 2 and 3 for the Chironomus growth, Chironomus  mortality and Hyalella
 mortality endpoints and the logistic regression model at the effect level 2 for the
 pooled Hyalella and Chironomus endpoints. Although the Round 2 report utilized the
 logistic regression model for the identification of Round 2 Chemicals of Potential
 Concern (COPCs; see Table  9.3-1 of the Round 2  Report),  the logistic  regression
 model was not used to develop initial areas of potential concern (iAOPCs) due to the
 following concerns:

    •   Irreproducibility of the logistic regression model;
    •   The predictive ability of the Hyalella growth endpoint; and,
    •   The reduction in predictive accuracy when combining the two models.

 In addition, the logistic regression model as applied by Jay Field of NOAA relied on
 approximately 400 samples  collected outside  Portland  Harbor.   The  LWG has
 objected to the inclusion of these data in the logistic regression model - especially if
 the data cannot be made available to the LWG.  USEPA has stated that the non-site
 data must be made available to the LWG if we are to use them for site decision making.

 USEPA considered the logistic regression model and the Hyalella growth endpoint
 in our evaluation of benthic risks for the purpose of identifying Round 3B  data gaps.
 However, during the finalization of the field sampling plan for sediment toxicity
 testing, USEPA and the LWG could not reach agreement on the use of the Hyalella
 growth endpoint in the application of the predictive models and instead agreed to
 identify sediment sampling locations, in part, based on an evaluation of the empirical
 Hyalella growth toxicity testing.  It should be noted that approximately 50 additional
 samples were collected for toxicity testing in the fall of 2007.  These data are
 available but have not yet been evaluated.

 BERA Problem Formulation: On February 15, 2008, USEPA submitted the Problem
 Formulation for the Baseline Ecological Risk Assessment to the LWG. The purpose
 of the problem formulation was to guide the development of the baseline ecological
 risk assessment.  Relevant risk hypotheses from the Problem  Formulation include:

    •  Do  contaminant concentrations in bulk sediments from Portland Harbor
       exceed sediment quality  benchmarks  for the survival, reproduction or
       growth of benthic macroinvertebrates?

    •  Is the survival or growth of benthic macroinvertebrates as predicted from
       bulk sediment chemistry below acceptable thresholds as determined by the
       use of modeling techniques such as logistic regression modeling or floating
       percentile modeling?

    •  Is the survival of benthic invertebrates, as indicated by the survival of the
       amphipod Hyalella azteca and the midge Chironomus  tentans exposed to
       whole sediments from Portland Harbor below biological effect thresholds
       which represent minor,  moderate, or severe levels of unacceptable effect?

    •  Is the growth  or biomass of benthic invertebrates  (Hyalella azteca and
       Chironomus tentans) exposed to bulk sediments from Portland Harbor
       below biological effect thresholds which represent minor,  moderate, or
       severe levels of unacceptable effect?

      The problem formulation required evaluation of the empirical toxicity results at the
      10%, 20% and 30% difference from control level and the floating percentile model
      at the 20% and 30% effect level.  In addition, the problem formulation required a
      substitution of the Hyalella growth endpoint with a total biomass endpoint, suggested
      pooling of endpoints to improve model performance, recommended incorporation of
      the Round 3 Data into the models, and recommended reconciling the chemicals
      evaluated in the two models to the extent possible.
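
      As an illustration of the tiered evaluation described above, the following sketch
      assigns an effect level of 0 to 3 from the percent difference from control, using
      the 10%, 20% and 30% levels named in the text. The function name is hypothetical,
      and treating a difference exactly equal to a threshold as meeting that level is an
      assumption rather than something stated in the problem formulation.

```python
# Illustrative sketch only; the boundary handling (>=) is an assumption.

# Percent-difference-from-control levels named in the text (effect levels 1, 2, 3).
EFFECT_THRESHOLDS = (10.0, 20.0, 30.0)


def effect_level(percent_difference, thresholds=EFFECT_THRESHOLDS):
    """Return the number of thresholds met: 0 (none) through 3 (all three)."""
    return sum(1 for t in thresholds if percent_difference >= t)
```

      Under these assumptions, a sample with a 25% difference from control would be
      assigned effect level 2.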

      Current Status -  Post Problem Formulation Discussions: Following submittal of
      the problem formulation by USEPA, a series of discussions took place in an effort to
      resolve discrepancies between the Round 2 Report, the Problem Formulation, and
      previously submitted documents, such as the benthic interpretation report and the
      2005 Technical Memorandum - Estimating Risks to the Benthic Community using
      Sediment Toxicity Tests.  A number of approaches were considered including
      adjusting the effect levels for the Hyalella growth endpoint and incorporation of the
      RSET one-hit/two-hit approach into the floating percentile model.

      Ultimately, USEPA and the LWG  have  not been able to reach agreement on the
      hit/no-hit threshold for application of the predictive models. USEPA and the LWG
      have agreed to  substitute the total biomass endpoint for the growth endpoint for both
      Hyalella and Chironomus. Further, USEPA and the LWG have a tentative agreement
      to use the 10%, 20% and 30% difference from control for the empirical data but even
      this agreement is  tied to agreements on the use of the predictive models.
3.0  Terms of Reference for this Evaluation

      Because the LWG and USEPA have not been able to reach agreement, we have
      requested your assistance as an impartial reviewer to review the existing data and
      make recommendations about the evaluation of the empirical toxicity.  Specifically,
      we request that you evaluate the existing data and the state of the science to answer
      the following questions:

    •  What hit/no-hit criteria should be applied to the empirical sediment toxicity
       tests?
    •  What pooling of endpoints, if any, should be applied for use in each of the
       predictive  models?   Pooling  may  include pooling the growth  (total
       biomass) and mortality endpoints for each test organism (2 endpoints) or
       both  test  organisms  (1  endpoint)  and the application of the RSET
        one-hit/two-hit criteria.
    •  What hit/no-hit criteria should be applied for the logistic regression and
       floating percentile models?  Note that one, two or three criteria may be
       applied to each endpoint and each model. However, this will increase the
       amount of work required to develop the models.
    •  Should non-site data be considered in  the development of the logistic
       regression model?
    •  Once the models have been run, what analysis, if any, should be performed
       to optimize model performance?
    •  Should the predictive models be used at all given their reliability?
    •  How  should  the results of the  predictive models be used, in conjunction
       with other site data, in a weight-of-evidence (WOE) evaluation aimed at
       assessing risk to the benthic community?

 Please provide supporting information for all recommendations.

4.0  Recommendations and Associated Rationale

      Ecological risk assessment (ERA) represents an essential element of the overall RI/FS
      process,  which  is  designed  to  support risk  management decision-making for
      Superfund sites.   More specifically,  ERA provides  risk managers with  key
      information for managing contaminated sites by estimating and describing risks to
      ecological receptors associated with exposure to contaminated environmental media.
      Such information helps  risk managers and other interested parties understand the
      ecological significance of environmental contamination at the site.  The ERA process
      also results in determination of the concentrations of COPCs that represent thresholds
      for adverse effects on the selected assessment endpoints.  This latter information is
      essential for evaluating the efficacy of the remedial alternatives that are proposed to
      address concerns regarding risks to ecological receptors utilizing habitats  in the
      vicinity of Superfund sites.

      At many  Superfund sites, concerns relative to effects on human health and ecological
      receptors associated with exposure to contaminated media are focused primarily on
      contaminated sediments.  While surface-water resources may also be contaminated,
      the COPCs in this medium  generally originate from sediments or upland activities
      (e.g., point-source  discharges of wastewater and  non-point  source  releases of
      COPCs). When the COPCs originate from upland sources, other programs (e.g., total
      maximum  daily load; TMDL)  represent the  most direct  means of addressing
      contamination issues.  Otherwise, active sediment management is needed to improve
      water quality conditions  (i.e., when surface water is being degraded by sediment
      quality conditions). In addition, the tissues of aquatic organisms can be contaminated
      to such an extent that their consumption poses risks to ecological receptors and/or
      human health. In these cases, sediment-associated COPCs are frequently the primary
      source of the tissue contamination.  Therefore, aquatic ERAs need to be designed to
      provide risk managers with the  information they need to manage  contaminated
      sediments. From our perspective, the Portland Harbor site does not appear to be an
      exception to this rule.  That is, the BERA for the Portland Harbor site must be
      designed and implemented in a manner that provides risk managers with the
      information needed to effectively manage contaminated sediments.
4.1  Scope of this Evaluation

      Contaminated sediments can pose unacceptable risks to ecological receptors for two
      main reasons. First, contaminated sediments can be directly toxic to the organisms
      that utilize benthic habitats  at the site  (i.e.,  microbiota, aquatic plants, benthic
      invertebrates, benthic fish, sediment-probing birds).   Second, sediment-associated
      COPCs can accumulate in the tissues of aquatic organisms and, in so doing, adversely
      affect the organisms that  feed on these prey species, either directly or indirectly
      through food web transfer. We understand that procedures  for assessing the risks
      associated with exposure to bioaccumulative COPCs at the Portland Harbor site have
      been developed and are currently under review.  Accordingly, this review is focused
      on evaluating the approaches that have been proposed by LWG and/or USEPA for
      assessing risks to benthic invertebrates at the Portland Harbor site (i.e., risks
      of toxicity to benthic invertebrates exposed to contaminated
      sediments).  More specifically, this evaluation is intended to provide
      the LWG and USEPA with recommendations on the following topics:

          •  Framework for assessing risks to benthic invertebrates;
          •  Procedures for designating sediment samples as toxic and not toxic  (i.e.,
            hit and no hit);
          •  Procedures for integrating data on multiple toxicity test endpoints;
          •  Procedures for evaluating relationships between sediment chemistry and
            sediment toxicity;
          •  Procedures for developing toxicity thresholds for sediment;
          •  Procedures for evaluating concentration-response models (e.g., logistic
             regression and floating percentile models); and,
         •  Procedures for assessing risks to benthic invertebrates.

      Each of these topics is discussed in the following sections of this document. In
      addition, the recommendations offered on these topics were used to provide responses
      to each of the seven  questions  that were posed in the terms of reference for this
      evaluation.
4.2   Recommended  Framework  for Assessing  Risks  to  the
      Benthic Invertebrate Community

      The problem formulation document (USEPA 2008) describes the framework that is
      preferred by USEPA for assessing risks to benthic invertebrates associated with
      exposure to contaminated environmental media at the Portland Harbor site.  The
      preferred approach utilizes data on multiple measurement endpoints to assess risks to
      benthic invertebrates, including:

         •  Whole-sediment toxicity;
         •  Whole-sediment chemistry;
         •  Surface-water chemistry;
         •  Pore-water chemistry; and,
         •  Invertebrate-tissue chemistry.

      The analysis plan included in the problem formulation document describes how
      information from each LOE will be used to estimate risks to benthic invertebrates.
      This framework relies primarily on whole-sediment chemistry and whole-sediment
 toxicity data. More specifically, sediment samples are classified into one of four
 effects levels (i.e., 0, 1, 2, and 3) based on the observed control-adjusted response rate.
 In addition, each sediment sample is classified into one of four effects levels (i.e., 0,
 1, 2, and 3) based on the results of the logistic regression model (LRM) and based on
 the  floating percentile model (FPM).  Under certain circumstances, the framework
 calls for adding an additional point to the classification score generated using the
 LRM  or the FPM.  The highest score generated by evaluating the toxicity data, the
 LRM, or the FPM is then used to designate the potential risk to benthic invertebrates
 or potential for benthic toxicity, as follows:

    Classification Score         Potential for Benthic Toxicity
       Blank                           No Data
           0                            Unlikely
           1                            Low
           2                            Medium
           3                            High
           4                            Very High
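 The scoring logic described above can be sketched as follows. This is only a
 minimal illustration, not the procedure specified in the problem formulation
 document: the function name and the use of None for a missing score are
 assumptions made for the example, and the circumstances under which an
 additional point is added to the LRM or FPM score are not modeled.

```python
from typing import Optional

# Labels taken from the classification table; None stands for a blank score.
LABELS = {0: "Unlikely", 1: "Low", 2: "Medium", 3: "High", 4: "Very High"}


def potential_for_benthic_toxicity(tox: Optional[int],
                                   lrm: Optional[int],
                                   fpm: Optional[int]) -> str:
    """Return the label for the highest available score (hypothetical helper)."""
    scores = [s for s in (tox, lrm, fpm) if s is not None]
    if not scores:
        return "No Data"
    # The highest of the toxicity, LRM, and FPM scores drives the designation.
    return LABELS[max(scores)]
```

 In this sketch, a sample with scores of 1 (toxicity), 3 (LRM), and 2 (FPM) would
 be reported using the label for a score of 3.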

 A WOE framework  is also described  in the  problem formulation  document.
 Application of this framework is dependent on evaluating each LOE and assigning a
 weight that reflects scientific reliability and relevance.  This information will then be
 used to identify and  rank the  LOEs for  each  receptor that provide  the  most
 scientifically-reliable indication of the status of each assessment endpoint  from
 exposure to COPCs at the site and, hence, which might be the most useful for making
 management decisions (USEPA 2008).

 The approach for assessing risks to benthic invertebrates described in the problem
 formulation document is not unreasonable. However, the framework could be refined
 to simplify the process  for conducting the benthic risk assessment.  More specifically,
 we  recommend the following framework for classifying sediment samples into
 multiple categories based on the risks that they pose to benthic invertebrates:

    •  For sediment samples for which acceptable whole-sediment toxicity data
       are available (i.e., at minimum, the results of 10-d tests with midge,
       Chironomus dilutus, and 28-d tests with amphipods, Hyalella azteca;
       endpoints: survival and biomass), use only the existing toxicity data to
       classify samples into risk categories based on the observed effects on the
       toxicity test organisms used to evaluate the status of the benthic
       invertebrate community (i.e., the results of the predictive modeling should
       not be used to evaluate risks to benthic invertebrates for these samples).
       In this way, risks to benthic invertebrates can be evaluated directly based
       on the results of toxicity tests with midges or amphipods. This approach
       will eliminate the possibility that samples will be predicted to be toxic
       using one or both of the predictive models (and thereby assigned an
       elevated risk score) when toxicity test results demonstrate that the sample
       is not toxic. At any location where LWG or USEPA disagrees with the
       classification that is assigned using this approach, toxicity identification
       evaluation (TIE) and/or other procedures may be conducted to provide
       additional information for identifying the factors that are causing or
       substantially contributing to the observed toxicity.

    •  For sediment samples for which acceptable whole-sediment toxicity data
       are not available (i.e., only whole-sediment chemistry data are available),
       use the most reliable of the predictive models to predict toxicity to benthic
       invertebrates associated with exposure to Portland Harbor sediments. If
       only limited toxicity data are available for the sediment sample, select the
       higher of the risk classifications from the predictive model results and the
       toxicity test results.  This will provide a conservative basis for assessing
       risks to benthic invertebrates (i.e., one that would tend to over-estimate
       rather than under-estimate risks). For any location where LWG or USEPA
       disagrees with the classification that is assigned using this approach,
       supplementary toxicity testing may be conducted to provide a more reliable
       basis for assessing risks to benthic invertebrates at the site.
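
 The decision logic of the recommended framework can be sketched as follows.
 This is a hedged illustration only. The function name, the integer risk
 classes, and the flag indicating whether the toxicity data set is complete are
 all assumptions introduced for the example. The handling of a sample with
 limited toxicity data but no model prediction is also an assumption, because
 the text does not address that case.

```python
from typing import Optional


def classify_sample(tox_class: Optional[int],
                    model_class: Optional[int],
                    tox_data_complete: bool) -> Optional[int]:
    """Hypothetical risk class for one sediment sample.

    tox_class: class from observed toxicity test results (None if no tests).
    model_class: class from the most reliable predictive model (None if none).
    tox_data_complete: True if the minimum toxicity data set is available.
    """
    if tox_class is not None and tox_data_complete:
        # Acceptable toxicity data: model predictions are not used.
        return tox_class
    if tox_class is not None and model_class is not None:
        # Limited toxicity data: take the higher (more conservative) class.
        return max(tox_class, model_class)
    if tox_class is not None:
        # Assumption: fall back on the limited toxicity data alone.
        return tox_class
    # Chemistry only: rely on the predictive model (None if no data at all).
    return model_class
```

 Under these assumptions, a sample that is not toxic in a complete set of tests
 keeps its low class even if a model predicts toxicity.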

 This simplified approach to benthic risk assessment is based on the premise that
 whole-sediment toxicity tests are likely to provide more reliable information for
 evaluating effects  in  benthic invertebrates  associated with exposure to  Portland
 Harbor sediments than would predictive modeling.  It also recognizes that the two
 predictive models may have different capabilities for correctly classifying sediment
 samples from Portland Harbor as toxic or not toxic. Accordingly, the risks to benthic
 invertebrates are likely to be assessed more accurately if the most reliable predictive
 model is used  to predict  sediment  toxicity.  It is  important to  acknowledge the
 possibility that neither of the predictive models can  accurately classify sediment
 samples as toxic and not toxic across the entire site. In this  event, it may be necessary
 to develop supplementary predictive models that can be used to more accurately
 predict toxicity for the areas that the LRM and/or FPM are shown to be less reliable.
 Alternatively, supplemental toxicity testing could be conducted in  such  areas to
 provide the information needed to accurately assess risks to benthic invertebrates.

 At certain locations, risk managers may require additional information (i.e., beyond
 the  risk classification for  a sediment sample) to assist them  in making sediment
 management decisions. For example, additional information may be needed when
 sediment samples have elevated chemistry, but are found to be not toxic to the
 selected toxicity test organisms  and  endpoints. In these cases, further data analysis
 and/or further sampling may be required to explain the lack of toxicity  in these
 samples.  In other cases, sediment samples may have low chemistry, but are found to
 be toxic to the selected toxicity test organisms/endpoints. In these cases, further data
 analysis and/or further sampling may be required to identify the factor or factors that
 are causing or substantially contributing to the observed toxicity.

 4.3   Recommended   Procedures   for   Designating   Sediment
       Samples as Toxic or Not Toxic

       At the Portland Harbor site, a number of whole-sediment toxicity tests have been
       conducted to evaluate the effects on benthic invertebrates associated with exposure
       to contaminated sediments.  More specifically, 10-d whole-sediment toxicity tests
       with the midge, Chironomus dilutus, and 28-d whole-sediment toxicity tests with the
       amphipod, Hyalella azteca, have been conducted on over 300 sediment samples from
       the  study area (Endpoints:  survival and  growth for both tests).  In addition,
       information on the survival and growth of oligochaetes (Lumbriculus variegatus) and
       Asiatic clams (Corbicula fluminea) exposed to Portland Harbor sediments during 28-d
       bioaccumulation tests provides additional information for assessing sediment toxicity.
       Interpretation of the results of these toxicity tests requires a procedure for designating
       the samples as toxic (hit) or not toxic (no hit) to benthic invertebrates.

       A number of approaches can be used to interpret the results of whole-sediment
       toxicity tests with benthic invertebrates. These approaches can be classified into four
       general categories: the control comparison approach, the minimum significant
       difference (MSD) approach, the reference envelope approach, and the multiple category
       approach. Each of these approaches is briefly described below:

          •  Control Comparison Approach - Application of the control comparison
             approach involves statistical comparison of the responses of test organisms
             exposed to site sediments to the responses of test organisms exposed to
             control sediments.  Treatments that have responses that are significantly
             different from those observed in the control treatment(s) are designated as
             toxic.

          •  Minimum Significant Difference Approach - Application of the MSD
             approach is dependent on the completion of power analyses with data from
             multiple studies for a specific toxicity test.  These results are used to
       identify the MSD (or minimum detectable difference) from the control
       treatment.  Treatments with response  levels greater than the MSD are
       designated as toxic (Thursby et al.  1997; Phillips et al. 2001).

    •  Reference Envelope Approach  - Application of the reference envelope
       approach  involves collection and  testing of sediment samples from a
       number of reference sites within or near the study area.  In this context,
       a reference sediment sample is considered to be whole-sediment obtained
       near an area of concern used to assess sediment conditions exclusive of the
       materials of interest (i.e., COPCs; ASTM 2007). The results of the toxicity
       testing conducted on these samples can be used to develop a reference
       envelope (i.e., normal  range of responses of test organisms exposed to
       reference sediments, as defined by  ASTM 2007). Sediment samples with
       response levels  that fall outside the  normal range  of responses (e.g.,
       survival below the 5th percentile for the reference samples) are designated
       as toxic.

    •  Multiple Category Approach  - Application of the multiple category
       approach involves classifying sediment samples into various groups (e.g.,
       not toxic,  low toxicity, moderate toxicity, or high toxicity), based on the
       magnitude of the observed response. The results of statistical comparisons
       to the negative control  results are also used to classify sediment samples
       into the various categories.

 According to the information presented in the problem formulation document, a
 multiple category approach has been selected for interpreting the results of whole-
 sediment toxicity tests conducted using sediments obtained from Portland Harbor.
 More specifically, sediment samples will be classified into effects level 0, 1, 2, or 3
 if control-adjusted response rates are >90%, 80 to 90%, 70 to 80%, or <70%,
 respectively.  In order for effects to be considered significant, the response must be
 statistically significantly different from the negative control response at the p < 0.05
 level.
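
 The effects-level assignment described above can be sketched as follows. This
 is an illustration under stated assumptions, not the procedure in the problem
 formulation document. The function name is hypothetical. Values that fall
 exactly on a shared boundary (e.g., 80%) are assumed to belong to the lower
 effects level, and a response that is not significantly different from the
 negative control is assumed to receive level 0. Neither point is specified in
 the text.

```python
def effects_level(control_adjusted_pct: float,
                  p_value: float,
                  alpha: float = 0.05) -> int:
    """Assign effects level 0-3 from a control-adjusted response rate (%)."""
    if p_value >= alpha:
        # Assumption: a non-significant difference from control is level 0.
        return 0
    if control_adjusted_pct > 90:
        return 0
    if control_adjusted_pct >= 80:
        # Assumption: exactly 80% is placed in the 80 to 90% class.
        return 1
    if control_adjusted_pct >= 70:
        return 2
    return 3
```

 Because the magnitude of the response is an input, the underlying value can be
 retained alongside the assigned level for later analyses.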

 Recently (2007), the Sustainable Fisheries Foundation (SFF) convened a workshop
 in Victoria on behalf of the B.C. Ministry of the Environment to explore the question
 of how to interpret the results of sediment  toxicity tests (SFF 2007).   At this
 workshop, participants agreed that site-wide ecological risk assessments represent the
 most important applications of whole-sediment toxicity data. More specifically, it
 was agreed that the results of the toxicity testing program that is implemented at a site
 should support the development of site-specific toxicity thresholds (i.e., to support
 development of preliminary remediation goals and/or clean-up goals). In this context,
 workshop participants agreed that designation of samples as toxic or not toxic is  not
 necessarily required early in the site assessment process.  Rather, the magnitude of
 effect data can be used directly in  the  development  of concentration-response
 relationships for COPCs at the site. The magnitude of effect data can also be used to
 classify sediment samples into risk categories, without having to designate individual
 sediment samples as toxic or not toxic. This approach to the interpretation of
 whole-sediment toxicity data was considered to be desirable because no information
 is lost during the interpretation process.  Hence, workshop participants generally
 agreed with the approach that has been described for use in Portland Harbor (USEPA
 2008).

 Workshop participants also recognized that interpretation of toxicity test results may
 necessitate designation of individual sediment samples as toxic or not toxic (e.g., hot
 spot  identification, evaluation of the spatial extent  of toxicity).    In these cases,
 workshop participants agreed that a step-wise approach should be used to  interpret
 the results of individual toxicity tests.  We have reviewed the approach suggested by
 workshop participants and refined it to recommend a toxicity designation process  for
 the Portland Harbor site that consists of the following steps:

    •  Conduct whole-sediment toxicity tests in accordance with standardized
       protocols, as described in the project Quality Assurance Project Plan
       (QAPP);

    •  Evaluate the validity of each whole-sediment toxicity test.  The project
       data quality objectives, which are documented in the QAPP, should define
       the performance criteria for measurement data that will be used to evaluate
       toxicity test acceptability. At minimum, such performance criteria should
       define the acceptable range of negative control and positive control (i.e.,
       reference toxicant) results.  Evaluation of potential test interferences
       should also be conducted during this step in the process (e.g., comparison
       of ammonia and hydrogen sulfide levels to lowest observed effect
       concentrations for the test species, conducting Spearman Rank correlation
       analysis);

    •  Compare the results obtained for each sediment sample to the negative
       control results for the corresponding batch of samples. Sediment samples
       for which the measured response is significantly lower than that for the
       negative control (i.e., a one-tailed statistical test would be used) should be
       tentatively identified as toxic;

    •  Compare the toxicity test results obtained for each sediment sample to the
       reference envelope developed for the corresponding toxicity test endpoint.
       Sediment samples that were tentatively identified as toxic based on the
       previous step of the process (i.e., based on comparison to the results for the
       negative control treatment) would be designated as toxic if the measured
       response is lower than the lower limit of responses for reference sediment
       samples (e.g., if the reference envelope for amphipod survival in a 28-d
       whole-sediment toxicity test is 77 to 98%, then sediment samples for
       which amphipod survival is less than 77% would be designated as toxic).
       In general, control-adjusted response rates for reference sediment samples
       should be used to develop the reference envelope because the negative
       control results for multiple batches of samples are likely to be different;
       and,

    •  Sediment samples that are  designated as toxic using both the reference
       envelope and control comparison approaches should be identified as those
       that pose the  highest  risks to the benthic  invertebrate community.
       Sediment samples for which the response of the test organism falls within
       the reference envelope should not be designated as toxic and should be
       considered to pose the lowest risks to the benthic invertebrate community.
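
 The step-wise designation process listed above can be sketched as follows.
 This is a minimal illustration only. The function and argument names are
 hypothetical, and the outcome of the one-tailed comparison to the negative
 control is assumed to have been determined elsewhere and is passed in as a
 flag. Returning None for a test that fails the validity check is also an
 assumption.

```python
from typing import Optional


def designate_sample(test_valid: bool,
                     significantly_lower_than_control: bool,
                     response_pct: float,
                     envelope_lower_pct: float) -> Optional[str]:
    """Designate one sample for one endpoint as 'toxic' or 'not toxic'."""
    if not test_valid:
        # Assumption: invalid tests are not designated at all.
        return None
    # Control comparison: tentative identification as toxic.
    tentatively_toxic = significantly_lower_than_control
    # Reference envelope: response falls below the lower limit.
    below_envelope = response_pct < envelope_lower_pct
    # Both criteria must be met for a toxic designation.
    if tentatively_toxic and below_envelope:
        return "toxic"
    return "not toxic"
```

 Using the example in the text, a lower limit of 77% for amphipod survival would
 be passed as envelope_lower_pct, and a sample with lower survival that also
 differs significantly from the control would be designated as toxic.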

 Participants at the SFF workshop also indicated that the MSD approach can be used
 to designate sediment samples as toxic or not toxic.  While the MSD approach could
 also be applied at the Portland Harbor site, MSDs have not yet been developed for the
 four toxicity tests that have been used to evaluate the toxicity of sediments at the site.
 While such MSDs are currently under development, they are unlikely to be available
 within the time frame required to support the Portland Harbor BERA (C.G. Ingersoll,
 United States Geological Survey.  Personal communication).

 All of the participants at the SFF workshop recognized that the results of individual
 whole-sediment and pore-water toxicity tests may be used within a WOE framework
 for evaluating risks to the benthic invertebrate community associated with exposure
 to contaminated sediments. Workshop participants agreed that such WOE evaluations
 require information on the magnitude of toxicity in addition to, or instead of, toxicity
 designation information.  Hence, it was generally agreed that the information on the
 magnitude of the response be retained to support further analyses of the toxicity data
 (i.e., WOE evaluations). Such WOE evaluations can be used to classify sediment
 samples into  categories  based on  the magnitude  of risk that they pose to benthic
 invertebrates. However, such categories are not relevant for determining if individual
 samples are toxic or not toxic.

 4.4   Recommended  Procedures  for Developing  a  Reference
       Envelope  for  Interpreting  Data  from  Whole-Sediment
       Toxicity Tests

       Based on the information that was provided to  support this evaluation, a multiple
       category approach has been proposed by USEPA (2008) for the Portland Harbor site.
       We believe that the reference envelope approach will complement the multiple
       category approach by providing a robust and defensible basis for designating sediment
       samples from the study area as toxic or not toxic.  Therefore, it is recommended that
       LWG and USEPA include the reference envelope approach in the process that will
       be used to interpret the results of whole-sediment toxicity tests conducted with
       sediment samples from Portland Harbor (as described in Section 4.3).

       In general, application of the reference envelope approach necessitates identification
       of candidate reference  sites as part  of  the overall sampling program  design.
       Accordingly, LWG (2005b) indicated that whole-sediment toxicity testing would be
       conducted on a total of six upstream ambient stations "to place the results for the
       study area in a regional context". While these data represent an important element
       of the overall sediment  sampling  program, they may not be sufficient to define
       reference conditions for the  Portland Harbor site.   Our experience at other sites
       suggests that about  15 sediment samples  are needed to adequately characterize
       variability in the responses of toxicity test  organisms associated with exposure to
       reference sediments.  It is understood that three rounds of toxicity testing have already
       been completed and that both LWG and USEPA have an interest in completing the
       BERA in a timely manner. Therefore, the following procedure is recommended for
       developing reference envelopes for the toxicity test endpoints that have been used to
       characterize sediment quality conditions at the Portland Harbor site:

          •   Identify sediment samples from the study area that are representative of
             reference conditions.  Candidate reference sediment samples  can be
             identified on an a posteriori basis by applying a series of criteria for
             sediment chemistry and sediment toxicity.  More specifically, the
             following criteria for whole-sediment chemistry are recommended for
             identifying candidate reference samples (USEPA 2003; 2005; MacDonald
             et al. 2007):
             -  All measured metals, polycyclic aromatic hydrocarbons (PAHs), and
                polychlorinated biphenyls (PCBs) occur at concentrations below
                conservative sediment quality guidelines (SQGs);
             -  Mean PEC-Q_DW < 0.1;
             -  ΣESB-TU_PAH < 0.1; and,
             -  (ΣSEM-AVS)/f_OC < 130 µmol/g.

             Candidate reference samples that meet the criteria for whole-sediment
             chemistry should be further evaluated to confirm that they were not toxic
             to sediment-dwelling organisms.  More specifically:
             -  Control-adjusted response rate should not exceed the MSD for each
                toxicity test endpoint; or,
             -  In the absence of MSD values, control-adjusted response rate should
                not exceed the Tier II levels applied in the National Sediment Inventory
                (USEPA 2004).

             These biological criteria should be applied to ensure that samples for
             which the biological response may have been adversely affected due to the
             presence of unmeasured COPCs (or COPCs for which SQGs are not
             available) are not used in the reference envelope calculation.  Sediment
             samples that meet both the chemical and biological criteria should be
             selected as reference samples for the study area.

          •   Determine the normal range of toxicological responses for each toxicity
             test conducted and endpoint measured.  The reference envelope is
             commonly calculated in a manner such that it encompasses 95% of the
             variability in the response data. While several procedures can be used to
             calculate the reference envelope, we recommend calculating the lower limit
             of the reference  envelope as  the 5th percentile of the control-adjusted
             response data for each toxicity test and endpoint.  It is recommended that
             the response data be log-transformed prior to calculating the 5th percentile
             response level. The normal range of reference responses spans the range
             from the 5th percentile value to the maximum value in the data set.

          •   Designate sediment samples with control-adjusted effect values lower than
             the  lower  limit of the normal range of control-adjusted responses  in
             reference samples (i.e., lower than the 5th percentile) as toxic for the
             endpoint under consideration (see Appendix E2 of MacDonald et al.
             2002 for a more detailed description of these procedures).
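
 The reference envelope procedure outlined above can be sketched as follows.
 This is an illustration under stated assumptions rather than the calculation
 described in MacDonald et al. (2002). All function names are hypothetical. The
 5th percentile is computed here by linear interpolation between ranked values,
 which is only one of several possible percentile conventions, and the
 biological screening criteria (MSD or Tier II levels) are not modeled.

```python
import math


def meets_chemistry_criteria(all_below_sqgs, mean_pec_q, sum_esb_tu, sem_avs_foc):
    """Chemistry screen for a candidate reference sample (thresholds from text)."""
    return (all_below_sqgs
            and mean_pec_q < 0.1
            and sum_esb_tu < 0.1
            and sem_avs_foc < 130.0)  # (SEM-AVS)/f_OC in umol/g


def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def reference_envelope(reference_responses):
    """Return (lower, upper) limits from control-adjusted responses (> 0)."""
    # Log-transform, take the 5th percentile, then back-transform.
    logs = [math.log10(r) for r in reference_responses]
    lower = 10 ** percentile(logs, 5)
    # The normal range extends to the maximum reference response.
    return lower, max(reference_responses)


def below_envelope(response, envelope):
    """True if a response falls below the lower limit of the envelope."""
    return response < envelope[0]
```

 An envelope would be computed separately for each toxicity test and endpoint,
 using only those samples that pass both the chemical and biological screens.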

      As indicated in Section 4.3, the  criteria for statistical difference from the control
      would also need to be met to designate a sediment sample as toxic using the reference
      envelope approach.  It is important to note that application of this approach results in
      the designation of toxicity on an endpoint-by-endpoint basis. Therefore, a single
      sample can be designated as toxic  for certain endpoints and not toxic  for other
      endpoints.  This reflects differences in species sensitivity and response to different
      mechanisms of toxic action, as represented by the mixture of contaminants in the
      sediments.
4.5  Recommended Procedures for Integrating Data on Multiple
      Toxicity Test Endpoints

      The concept of pooling multiple  endpoints  for a toxicity  test and/or multiple
      endpoints from multiple toxicity tests has been proposed for interpreting the whole-
      sediment toxicity data for the Portland Harbor site, particularly for use in predictive
      modeling  of sediment toxicity.  It is our recommendation that multiple endpoints
      should not be pooled, either to support interpretation of the whole-sediment toxicity
 data or to support the development of predictive models. Rather, we believe that each
 endpoint provides unique information that can be used to support assessment of risks
 to benthic invertebrates, the development of predictive models, and the derivation of
 site-specific toxicity  thresholds  [including preliminary remediation goals (PRGs)
 and/or clean-up goals].

 From a toxicological  perspective,  organisms can be differentially  sensitive to
 contaminants  because of differences  in exposure conditions, differences  in
 biotransformation rates, and differences in receptor sensitivities to the active toxicant.
 This suggests that each endpoint provides information on the response of the toxicity
 test organism to the mixture of COPCs in the  sediments at the site.  Such responses
 may be different from those of other species or toxicity test endpoints, thereby
 representing a unique response to the exposure. Examples of this can be found in the
 literature where a  species shows responses to different contaminants at different
 concentration levels, even without considering the differences in exposure conditions
 (Hwang et al. 2004).  Figures  1 to 3 provide plots of the relationships between
 amphipod survival and amphipod biomass, midge  survival, and midge biomass at
 another site in the U.S. These results indicate that the responses of the toxicity test
 organisms are not well correlated with one another.  That is, these toxicity test
 endpoints frequently provide unique  information on the toxicity of sediment samples.
 By refining these plots in a way that conveys information on the COPC mixture in
 each sample (e.g., which class of COPC has the largest hazard quotient) or geographic
 location (e.g., area of interest), patterns can emerge that can help interpret the toxicity
 test results. Such information could  be lost if the test results are pooled for different
 endpoints or different toxicity tests.

 Information from multiple toxicity  tests and  multiple toxicity test endpoints can,
 however, be considered together to help prioritize areas of interest within a site that
 may be considered for source control or other sediment management actions. In such
 evaluations, each toxicity test endpoint can provide a unique LOE for assessing
 sediment quality conditions. Sediment samples that are found to be toxic for more
 than one toxicity test endpoint may be assigned a higher priority than those that are
 found to be toxic relative to  a  single toxicity test endpoint.  However, it  is also
 important to consider the endpoint measured and the magnitude of the response in
 such a prioritization process.  It is also important to remember that certain COPCs
 and/or COPC mixtures can be especially toxic to certain test organisms (Schuler et
 al. 2006). Therefore, finding a single significant toxic response using the criteria of
 significant  difference from control and the reference envelope approach  would
 suggest  that there are conditions of concern in the sediment (i.e., exposure to such
 sediments poses potential risks to benthic invertebrates). Risk managers must utilize
 this information when  considering  alternatives for addressing such  risks (e.g.,
 collecting additional  information  to  further evaluate the nature  and extent  of
 contamination, to further evaluate sediment toxicity, to identify the factors that are
 causing or substantially contributing to the  observed effects, monitored natural
 attenuation, active remediation).

 From a modeling perspective, focusing on a single endpoint for each model provides
 a more consistent data set than an approach that attempts to combine endpoints. Such
 pooling  of endpoints could easily result in conflicting results, where one endpoint
 provides no hit data and another endpoint provides a hit. This makes the modeling
 less reliable and more variable than would be the case if each endpoint is considered
 separately in the development and evaluation of the various models. This problem
 was clearly evident in the data presented in the LWG (2006) report.

 For the purpose of modeling, survival and biomass are the two toxicity test endpoints
 that should be considered for the amphipod and midge tests. The use of biomass as
 a substitute for the growth endpoint corrects for the problem that occurs with the
 growth endpoint when changes in nutrient availability due to a reduction in the number
 of organisms in a replicate influence the growth of surviving organisms in that
 replicate (these types of data are evident in the Round 2 data report). Thus, by
 making a series of models for the different endpoints, each model can be compared
 to  the existing data to determine which performs  the best in terms of correctly
 predicting the presence and absence of toxicity for each sample  (on an endpoint-by-
 endpoint basis).
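
 One way to compare model performance on an endpoint-by-endpoint basis is
 sketched below. This is a hedged illustration: the function name and the
 choice of summary measures (overall correct classification rate, false
 positive rate, and false negative rate) are assumptions introduced here and
 are not prescribed by the documents under review.

```python
def classification_rates(observed, predicted):
    """Compare observed and predicted hits (True = toxic) for one endpoint."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # toxic, predicted toxic
    tn = sum(1 for o, p in pairs if not o and not p)  # not toxic, predicted not
    fp = sum(1 for o, p in pairs if not o and p)      # predicted toxic in error
    fn = sum(1 for o, p in pairs if o and not p)      # toxicity missed

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as None.
        return num / den if den else None

    return {
        "overall_correct": ratio(tp + tn, len(pairs)),
        "false_positive": ratio(fp, fp + tn),
        "false_negative": ratio(fn, fn + tp),
    }
```

 The same function would be applied to each model and each endpoint in turn,
 with the same hit and no-hit designations used as the observed values for
 every model being compared.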

      As indicated above, each endpoint should be evaluated separately for each model. In
      addition, both modeling approaches should use  the same criteria (i.e.,  modified
      reference envelope approach) for what constitutes a hit or no-hit for the toxicity test
      endpoint under consideration.  In this way, the models will be generated using
      comparable data sets and the outputs of the models  can be  directly compared.
      Subsequently, the more reliable models can be identified and selected for use in the
      BERA.  The use of different terms of reference for the two modeling approaches can
      lead to predictions that have different meanings. There is no toxicological reason to
      believe that the criteria for selecting endpoints or designating samples as toxic or not
      toxic should be different for the two models. Thus, for consistency in comparing the
      utility of the models and for understanding the predictions, we recommend that the
      same criteria,  as outlined above, be employed for both modeling efforts.
4.6   Recommended  Procedures  for  Evaluating  Relationships
      Between Sediment Chemistry and Sediment Toxicity

      There are a number of approaches that could be used to evaluate the relationships
      between whole-sediment chemistry and whole-sediment toxicity at the Portland
      Harbor site. Based on the information presented in LWG (2006) and USEPA (2008),
      the logistic regression model and the floating percentile model are the two approaches
      that are currently being considered and tested for the Portland Harbor  site.  These
      models are being developed to provide accurate predictions of sediment toxicity for
      sediment samples for which only whole-sediment chemistry data are available to
      evaluate sediment quality conditions. That is, each model must identify
      toxicity thresholds for COPCs and/or COPC mixtures that provide
      a reliable basis for classifying such sediment samples as toxic or not toxic.
      Accordingly, these models must be able to incorporate all the identified COPCs and
      toxicity test endpoints within the modeling framework.

 The two models that have been identified for use at the Portland Harbor site both have
 the potential to provide risk assessors with the tools needed to support the BERA (i.e.,
 toxicity thresholds that accurately classify sediment samples from the Portland Harbor
 site as toxic and not toxic). Therefore, it is recommended that predictive modeling
 be included in the overall framework that is used to evaluate risks to the benthic
 invertebrate community at the Portland Harbor site.

 The use of matching whole-sediment chemistry and whole-sediment toxicity data
 from the Portland Harbor site in the development of such predictive models represents
 a reasonable approach for deriving toxicity thresholds for COPCs and COPC mixtures
 at the site.  However, there is no reason to believe that data from other freshwater
 sites cannot be used to generate relationships between sediment chemistry and
 sediment toxicity.  While  certain data from other sites could be  fundamentally
 different from those for the site (i.e., due to differences in the underlying geology or
 due to differences in  the binding phases that alter contaminant bioavailability),  the
 toxicity thresholds that are derived using the predictive models will be evaluated to
 determine their performance in terms of predicting toxicity at the Portland Harbor
 site.  The toxicity thresholds that perform the best (i.e., that provide the most accurate
 basis for classifying sediment samples as toxic and not toxic) should be selected to
 support the BERA. Therefore, the use of non-site data in model development does
 not represent a substantive issue relative to application of the various models. On the
 contrary, by using additional data  in model development, the potential for variation
 in response due to differences in habitat or other factors can be incorporated into  the
 model. Therefore, use of non-site data could improve the models that are developed
 for the site.

 In addition  to the two modeling approaches that have been explicitly identified to
 date, there are other modeling approaches that could be used to describe the matching
 sediment chemistry and  sediment toxicity data from the site (see MacDonald et al.
 2003; 2005a; 2005b; 2008 for examples). In addition, it may be necessary to develop
 Area of Interest-specific models to  describe such relationships in areas within the site
 that have unique COPCs, COPC mixtures, or COPC concentration gradients. The
      need for additional models should be evaluated following the evaluation of the site-
      wide models that are developed using the LRM and FPM approaches.

4.7   Recommended Procedures for Developing Toxicity
      Thresholds

      There are a wide variety of approaches that can be used to develop toxicity thresholds
      for COPCs and/or COPC mixtures in sediments. The LRM and FPM approaches that
      have been selected for use at the Portland  Harbor site  both have established
      procedures for deriving toxicity  thresholds based on the modeling results.  These
      procedures are reasonable and can be used to establish candidate toxicity thresholds
      for use in the BERA.

      At this stage of the process, it is important to explicitly identify the narrative intent
      of any toxicity thresholds that are developed using the predictive models.  For
      example, MacDonald  et al. (2003) developed  two types of toxicity thresholds for
      selected COPCs and COPC mixtures.  More specifically, these investigators derived
      low risk and high risk toxicity thresholds for selected COPCs and COPC mixtures.
      The  low risk toxicity thresholds were intended to identify the  concentrations of
      COPCs or COPC mixtures below which adverse effects on benthic invertebrates were
      unlikely to be observed (i.e., fewer than 20% of the sediment samples would be toxic
      to benthic invertebrates).  These low risk toxicity thresholds were established at
      COPC/COPC mixture concentrations that corresponded to a 10% increase in the
      magnitude of toxicity to selected toxicity test organisms, relative to the average
      response rates for toxicity test organisms exposed to reference sediment samples. In
      contrast, the high risk toxicity thresholds were intended to identify the concentrations
      of COPCs or COPC mixtures above which adverse effects on benthic  invertebrates
      were likely to be observed frequently (i.e., more than 50% of the sediment samples
      would be toxic to benthic invertebrates). These high risk toxicity thresholds were
      established at COPC/COPC  mixture  concentrations that corresponded to  a 20%
      increase in the magnitude of toxicity to selected toxicity test organisms, relative to the
      average response rates for toxicity test organisms exposed to reference sediment
      samples.  By explicitly establishing the narrative intent of the toxicity thresholds, it
      is possible to develop criteria for evaluating the performance of the resultant toxicity
      thresholds that directly reflect the intended uses of the toxicity thresholds. Therefore,
      it is recommended that the narrative intent of the toxicity thresholds for the Portland
      Harbor site be explicitly described.  In general, the remedial action objectives that are
      established for the  site will provide a relevant basis for determining the narrative
      intent of the toxicity thresholds.
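
      The idea of setting low risk and high risk thresholds at fixed increases in
      toxicity can be illustrated with a minimal Python sketch. It is not the
      procedure used by MacDonald et al. (2003). The log-logistic curve, its
      parameters (ec50, slope), the function names, and the reading of "increase in
      the magnitude of toxicity" as a fractional effect relative to the reference
      mean are all assumptions made for illustration only.

```python
import math


def find_concentration(effect, target, lo, hi, tol=1e-10):
    """Bisect on log10(concentration) for the point at which an increasing
    effect curve reaches the target effect size."""
    if not effect(lo) <= target <= effect(hi):
        raise ValueError("target effect is not bracketed by [lo, hi]")
    a, b = math.log10(lo), math.log10(hi)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if effect(10.0 ** mid) < target:
            a = mid
        else:
            b = mid
    return 10.0 ** (0.5 * (a + b))


def hypothetical_effect(conc, ec50=1.0, slope=2.0):
    """ASSUMED concentration-response curve: fractional increase in toxicity
    relative to the mean response in reference sediments."""
    return 1.0 - 1.0 / (1.0 + (conc / ec50) ** slope)


# Low risk threshold: 10% increase relative to reference (from the text above).
low_risk = find_concentration(hypothetical_effect, 0.10, 1e-3, 1e3)
# High risk threshold: 20% increase relative to reference (from the text above).
high_risk = find_concentration(hypothetical_effect, 0.20, 1e-3, 1e3)
print(f"low risk threshold:  {low_risk:.4f}")
print(f"high risk threshold: {high_risk:.4f}")
```

      In practice the effect curve would come from the fitted site model for a given
      endpoint, and the concentration units would be those of the COPC or COPC
      mixture metric being modeled.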

4.8   Procedures for Evaluating Concentration-Response Models

      LWG (2006) identified seven reliability parameters for evaluating existing SQVs and
      the  model  predictions,  including  false  positives,  false  negatives, sensitivity,
      efficiency, predicted hit reliability, predicted no-hit reliability, and overall reliability.
      However, it is not clear that the narrative intent of these SQVs was considered during
      the evaluation process. For example, the threshold effect levels (TELs) and similar
      values are intended to identify the concentrations of COPCs or COPC mixtures below
      which adverse effects on benthic invertebrates would be infrequently observed (i.e.,
      in fewer than 10% of the samples). In contrast, the probable effect levels (PELs) and
      similar values are intended to identify the concentrations of COPCs or COPC mixtures
      above which adverse effects on benthic invertebrates would be frequently observed
      (i.e., in more than 50% of the sediment samples). Without information on the
      narrative intent of the SQVs, it is not possible to determine how applicable certain
      SQVs could be for predicting the presence or absence of sediment toxicity at the
      Portland Harbor site. Therefore, a suite of candidate SQVs should be identified that
      are consistent with the narrative intent of toxicity thresholds for the Portland Harbor
 site and these candidate SQVs should be evaluated using the same criteria and data
 that are used to evaluate the site-specific toxicity thresholds derived using the LRM
 and the FPM.
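
 The seven reliability parameters named above can all be computed from a
 two-by-two table of observed versus predicted toxicity. The following Python
 sketch uses the conventional definitions of these terms. The exact formulas
 used in LWG (2006) are not reproduced in this document, so the definitions in
 the code comments, and the function name, should be read as assumptions.

```python
def reliability_metrics(observed_hits, predicted_hits):
    """Return the seven reliability parameters (as percentages) for paired
    lists of observed and predicted hit (True) / no-hit (False) designations."""
    pairs = list(zip(observed_hits, predicted_hits))
    tp = sum(1 for obs, pred in pairs if obs and pred)          # correct hits
    tn = sum(1 for obs, pred in pairs if not obs and not pred)  # correct no-hits
    fp = sum(1 for obs, pred in pairs if not obs and pred)      # false positives
    fn = sum(1 for obs, pred in pairs if obs and not pred)      # false negatives

    def pct(numerator, denominator):
        return 100.0 * numerator / denominator if denominator else float("nan")

    return {
        # ASSUMED definitions (conventional usage, not quoted from LWG 2006):
        "false_positives": pct(fp, fp + tn),               # no-hits predicted as hits
        "false_negatives": pct(fn, fn + tp),               # hits predicted as no-hits
        "sensitivity": pct(tp, tp + fn),                   # hits correctly predicted
        "efficiency": pct(tn, tn + fp),                    # no-hits correctly predicted
        "predicted_hit_reliability": pct(tp, tp + fp),     # predicted hits that are hits
        "predicted_no_hit_reliability": pct(tn, tn + fn),  # predicted no-hits that are no-hits
        "overall_reliability": pct(tp + tn, len(pairs)),   # all correct predictions
    }


# Synthetic example: 10 samples, 3 observed hits.
observed = [True, True, True, False, False, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False, False, False]
for name, value in reliability_metrics(observed, predicted).items():
    print(f"{name}: {value:.1f}%")
```

 Running the same function for each endpoint and for each candidate SQV or
 model-derived threshold would keep the comparison on a common footing.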

 As indicated in LWG (2006), evaluation of the toxicity thresholds that are developed
 using the  LRM and  FPM represents the  most important part of the predictive
 modeling process. However, it is essential to establish the narrative  intent of the
 toxicity thresholds that are developed using the predictive models to ensure that the
 evaluation process is fair and relevant. That is,  information on the narrative intent of
 the toxicity thresholds should be used to establish the criteria that will be used in the
 evaluation process.

 Once the evaluation criteria have been established, the models can be developed and
 their performance  can be evaluated relative to the criteria.  Two general types of
 evaluations  are recommended,  including reliability of the toxicity thresholds and
 predictive ability of the toxicity thresholds.  In this context, reliability  is defined as
 the ability of the toxicity thresholds to correctly  classify the sediment samples that are
 used to develop the model as toxic and not toxic.  In contrast, predictive ability is
 defined as the ability of the toxicity thresholds to correctly classify sediment samples
 as toxic and not toxic for an independent data set (i.e., data that were not used in the
 model development process).

 For Portland Harbor, matching sediment chemistry and sediment toxicity data are
 available for more than 300 sediment samples.  Most of these data have been used to
 develop the existing FPMs and LRMs. However, a new set of data (50 samples) has
 been collected that could be excluded from model formulation and used as a
 validation data set. Alternatively, the entire data set could be split
 into two sub-sets, one of which could be used  to re-develop the models (i.e., using
 data for about 200 sediment samples) and the second could be  used to evaluate the
 predictive ability of the models (i.e., using the data for about 100 sediment samples).
 If the second approach is used, it may be useful to stratify the data into quartiles
 based on sediment chemistry (e.g., mean PEC-Qs) and randomly select 25 sediment
 samples from each quartile for use in the predictive-ability evaluation. The remainder
 of the data could be used to develop the models and evaluate their reliability.
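
 The stratified split described above can be sketched in Python as follows. The
 function name, the sample identifiers, and the synthetic mean PEC-Q values are
 hypothetical. The sketch assumes one mean PEC-Q value per sample and uses a
 fixed random seed so that the split is reproducible.

```python
import random


def stratified_validation_split(pecq_by_sample, per_quartile=25, seed=0):
    """Rank samples by mean PEC-Q, cut the ranking into four equal-sized
    strata, and randomly draw `per_quartile` samples from each stratum
    for the validation (predictive-ability) set."""
    ranked = sorted(pecq_by_sample, key=pecq_by_sample.get)
    n = len(ranked)
    rng = random.Random(seed)
    validation = []
    for q in range(4):
        stratum = ranked[q * n // 4:(q + 1) * n // 4]
        validation.extend(rng.sample(stratum, min(per_quartile, len(stratum))))
    held_out = set(validation)
    development = [sample for sample in ranked if sample not in held_out]
    return development, validation


# Synthetic data: 300 hypothetical samples with made-up mean PEC-Q values.
synthetic = {f"S{i:03d}": 0.01 * i for i in range(300)}
development, validation = stratified_validation_split(synthetic)
print(len(development), "development samples;", len(validation), "validation samples")
```

 With about 300 samples, this yields roughly 200 samples for model development
 and 100 for validation, consistent with the proportions discussed above.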

 The criteria that were established by LWG (2006) could be refined prior to evaluating
 the reliability and predictive ability of the models. More specifically, it may be useful
 to refine the evaluation criteria to  align  them better with the  remedial action
 objectives for the site.  In this case, a low risk toxicity threshold would be considered
 to be reliable and predictive if, for example, the incidence of sediment toxicity is low
 (e.g., <  10%) for sediment samples with COPC or COPC mixture concentrations
 below the toxicity threshold.  In contrast, a high risk toxicity threshold would  be
 considered to be reliable and predictive if,  for example, the incidence of sediment
 toxicity  is high  (e.g., > 50%) for sediment  samples with COPC or COPC mixture
 concentrations above the  toxicity threshold. An intermediate incidence of toxicity
 might be expected  at concentrations of COPCs or COPC mixtures between the low
 risk and high risk toxicity thresholds. In short, multiple toxicity thresholds may be
 required to provide risk assessors and risk managers with the tools that they need to
 evaluate and manage contaminated sediments at the Portland Harbor site. The results
 of the reliability and predictive-ability evaluations will provide risk assessors and risk
 managers with the information that they need to select the tools required to support
 the RI/FS.
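
 The evaluation criteria suggested above can be checked by tabulating the
 incidence of toxicity in each concentration range. The Python sketch below is
 illustrative only. The function names and the synthetic data are hypothetical,
 and the 10% and 50% limits are the example values given in the text, not
 adopted criteria.

```python
def incidence_by_band(concentrations, toxic_flags, low_threshold, high_threshold):
    """Percent of toxic samples below the low risk threshold, between the
    two thresholds, and above the high risk threshold."""
    bands = {"below_low": [], "between": [], "above_high": []}
    for conc, toxic in zip(concentrations, toxic_flags):
        if conc < low_threshold:
            bands["below_low"].append(toxic)
        elif conc > high_threshold:
            bands["above_high"].append(toxic)
        else:
            bands["between"].append(toxic)
    return {
        name: (100.0 * sum(flags) / len(flags) if flags else None)
        for name, flags in bands.items()
    }


def meets_narrative_intent(incidence, low_max=10.0, high_min=50.0):
    """Compare observed incidence with the example criteria from the text."""
    below, above = incidence["below_low"], incidence["above_high"]
    return {
        "low_risk_ok": below is not None and below < low_max,
        "high_risk_ok": above is not None and above > high_min,
    }


# Synthetic example: 40 samples in three concentration ranges.
concs = [0.1] * 20 + [0.5] * 10 + [2.0] * 10
toxic = [True] + [False] * 19 + [True] * 3 + [False] * 7 + [True] * 8 + [False] * 2
incidence = incidence_by_band(concs, toxic, low_threshold=0.2, high_threshold=1.0)
print(incidence)
print(meets_narrative_intent(incidence))
```

 Applying the same tabulation to the development data and to the independent
 validation data gives the reliability and predictive-ability results,
 respectively.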

 It should also be noted that neither model is without limitations. Neither model
 provides direct information about cause and effect, although the Pmax logistic
 regression model does provide some insight. Both models correlate a gross chemistry
 value with the observed toxicity response without regard to issues such as
 bioavailability or the mixture of chemicals at the various stations. This is particularly
 true for the floating percentile model, which does not attempt to address mixture
 response in any manner but
 uses the correlations for each chemical to produce a separate acceptable value  for a
 specific  chemical.  The logistic regression model can use either a sum probability or
 the more usual  probability max  approach to incorporate response addition as the
 likely interaction of compounds in the sediment (Field et al. 2002). It would be
      helpful to present the results of the LRM for both Pmax and Pavg because the two
      versions of the COPC mixture model can provide different information about the
      sediment samples. The logistic regression approach has been peer reviewed and
      published, which lends additional confidence in its acceptability. However, the model that
      is  selected for use in Portland Harbor should be the one that provides  the best
      predictions of toxicity after fully developing the models and comparing the results to
      a validation data set.
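
      The difference between Pmax and Pavg can be illustrated with a short Python
      sketch. It assumes the general form of the individual-chemical logistic models
      described by Field et al. (2002), in which the probability of toxicity is a
      logistic function of log10 concentration. The chemical names, coefficients,
      and concentrations shown are hypothetical and are not fitted values.

```python
import math


def chemical_probability(conc, b0, b1):
    """Probability of toxicity for one chemical:
    p = 1 / (1 + exp(-(b0 + b1 * log10(conc))))."""
    x = b0 + b1 * math.log10(conc)
    return 1.0 / (1.0 + math.exp(-x))


def mixture_probabilities(sample, coefficients):
    """Return (Pmax, Pavg) across all chemicals that have a model."""
    probs = [
        chemical_probability(conc, *coefficients[chem])
        for chem, conc in sample.items()
        if chem in coefficients
    ]
    if not probs:
        raise ValueError("no modeled chemicals in sample")
    return max(probs), sum(probs) / len(probs)


# HYPOTHETICAL coefficients (b0, b1) and concentrations, for illustration only.
coefficients = {"chem_A": (0.0, 1.0), "chem_B": (-2.0, 2.0)}
sample = {"chem_A": 1.0, "chem_B": 1.0}
p_max, p_avg = mixture_probabilities(sample, coefficients)
print(f"Pmax = {p_max:.3f}, Pavg = {p_avg:.3f}")
```

      Pmax reflects the single chemical with the highest predicted probability of
      toxicity, whereas Pavg is lowered by chemicals present at concentrations with
      low predicted probabilities, which is why the two can lead to different
      conclusions for the same sample.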

4.9   Recommended Procedures for Assessing Risks to Benthic
      Invertebrates

      A WOE approach is recommended for assessing risks to benthic invertebrates at the
      Portland Harbor site (as described in Section 4.1).  Models are not perfect and all
      LOEs should be employed to make the best decision possible about the status of a
      station.  It is particularly important to consider the spatial data if the model predicts
      a different result than is observed  at nearby stations.  Then, depending on the
      importance of the decision to be made, additional sampling and analysis (including
      additional toxicity testing) may be required.

5.0   Summary and Conclusions

      Over the past few years, the LWG and USEPA have prepared a variety of technical
      reports and engaged in a number of technical discussions in an effort to come to
      agreement on  the  procedures that should  be  used to  evaluate risks to  benthic
      invertebrates at the Portland Harbor Superfund Site.  While substantial progress has
      been made in certain areas (e.g., sediment sampling and characterization), there are
      several issues that have not yet been resolved. This is important because both LWG
 and USEPA are interested in completing  the BERA component of the RI/FS and
 uncertainty regarding these outstanding issues is likely to impede progress towards
 this goal.

 Recognizing that several key issues need to be resolved in the near-term to keep the
 project on schedule, LWG and USEPA agreed to have Don MacDonald and Peter
 Landrum conduct an independent evaluation of the various approaches for assessing
 risks to benthic invertebrates at the Portland Harbor site. To facilitate this evaluation,
 the various documents pertaining to the benthic invertebrate portion of the BERA,
 prepared by LWG or USEPA, were provided to these reviewers. In addition, the
 reviewers  were provided with access to the data and information that have  been
 collected  to date at the site.  Furthermore,  the reviewers were provided  with
 background information considered to be particularly relevant to understanding the
 unresolved issues.

 This document summarizes the recommendations that are offered by Don MacDonald
 and Peter Landrum for assessing risks to benthic invertebrates at the Portland Harbor
 Site. More specifically, Sections 4.1 to 4.8 of this document outline the recommended
 procedures for assessing risks to the benthic  invertebrate community  at the site.
 These recommendations are summarized  in the following responses to the seven
 questions that were posed to help structure this review:

    1.  What hit/no-hit  criteria  should be applied  to the empirical sediment
       toxicity tests?

       Response: The whole-sediment toxicity data should be designated as toxic
        (hit) or not toxic (no hit) using the modified reference envelope approach
       (as  described in Section 4.3). In this approach, the toxicity of sediment
       samples is evaluated on an endpoint-by-endpoint basis.  A sediment sample
       is designated as toxic for a specific endpoint if the response of the toxicity
       test organism exposed to sediment from the site is significantly greater than
        the response of toxicity test organisms exposed to negative control
   sediment and if the response falls outside the normal range of responses for
   reference  sediment samples (i.e., outside the reference envelope).  It is
   clear from the information in LWG (2006) that the Level 1 hit/no hit
   criteria include samples in the hit category that are not statistically
   different from reference conditions. This  decision likely added variability
   to the modeling exercise.

2.  What pooling of endpoints, if any, should be applied for use in each of the
   predictive models? Pooling may include pooling the growth (total
   biomass) and mortality endpoints for each test organism (2 endpoints) or
   both test organisms (1 endpoint) and the application of the RSET
   one-hit/2-hit criteria.

   Response: Endpoints  should not be pooled,  either for  the  purpose of
   interpreting toxicity test results or for the purpose of developing predictive
   models and the associated toxicity thresholds.  Each endpoint provides
   potentially unique  information  about the station and a hit from  one
   endpoint should  be sufficient to question the character of the station.
   Therefore, survival and biomass of midge and survival  and  biomass of
   amphipods are the four endpoints that should be evaluated in the predictive
   modeling process.

3.  What hit/no-hit criteria should be applied for the logistic regression and
   floating percentile models? Note that one, two or three criteria may be
   applied to each endpoint and each model. However, this will increase the
   amount of work required to develop the models.

   Response: The toxicity designations that are used to support interpretation
   of the results of the empirical whole-sediment toxicity tests should be used
   in evaluating both of the predictive models (i.e., LRM and FPM) because
   there is no toxicological justification for selecting different criteria for
   different modeling structures.

4.  Should non-site data be considered in the development of the logistic
   regression model?

   Response: There is no reason why non-site data cannot be used to develop
   either the LRM or the FPM. The most important step in the process is to
   evaluate the performance of the models utilizing the site-specific data.
   Only those models that have the best performance and least uncertainty
   should be used in the BERA. The data set for Portland Harbor is relatively
   small  for model development  purposes,  so it makes  sense to  use
   appropriate non-site  data  if this leads  to  improved model prediction
   (performance).

5.  Once the models have been run, what analysis, if any, should be performed
   to optimize model performance?

   Response:  The performance of the  models should be evaluated by
   determining the reliability and predictive ability of the toxicity thresholds
   that are derived using the models. While the reliability of the models was
   evaluated in the LWG (2006) document using seven criteria, these criteria
   should be refined to better reflect the narrative intent of the toxicity
   thresholds that are being evaluated and the remedial action objectives that
   are established for the site. Other candidate sediment quality values should
   also be evaluated using these site data to determine which ones may be the
   most reliable for evaluating risks to sediment-dwelling organisms at the
   Portland Harbor site. The results of such evaluations will provide a basis
   for determining  which  model  provides the  most  accurate  basis  for
   predicting toxicity at sampling locations for which sediment-chemistry data
   represent the principal LOE for assessing risks to benthic invertebrates.

   It is important to evaluate models equally and consistently using the data
   from the site.  Therefore, model performance should be evaluated on an
   endpoint-by-endpoint basis. Subsequently, these results can be integrated
   to determine overall model performance at the site. The uncertainty of the
   model predictions should be provided as part of the information to allow
   for improved interpretation of the model predictions.

   The reliability of the toxicity thresholds should be evaluated using the data
   that were used to develop the models. The predictive ability of the toxicity
   thresholds should be  evaluated using an independent data set.  In this
   respect, there should be a portion of the data set that is set aside for model
   validation that is not used for  model  development.   Testing on an
   independent data set is generally accepted as the appropriate approach to
   evaluating  model performance.  The independent  data set  should be
   representative of the data as a whole for both contaminant concentrations
   and organism response. We recognize that the data set for Portland Harbor
   is relatively small for the purpose of model development; however, it
   should be possible to set aside 20 to 30% of the data for a validation set.
   The size of the Portland Harbor data set is one of the reasons that inclusion
   of non-site data for the development of the model should be considered.

6.  Should the predictive models be used at all given their reliability?

   Response: Insufficient model development and evaluation have been
   completed to fully assess the reliability of the predictive models that are
   proposed  for  use at the site.  Therefore, it  is recommended  that a
   systematic model development process be undertaken to create high-quality
   models. Subsequently, the model results should be evaluated to determine
   how well the resultant toxicity thresholds predict the presence and absence
   of sediment toxicity at the Portland Harbor site.  If the results of these
   evaluations  show that one or both of the models cannot  be  applied to
   reliably predict the presence and absence of sediment toxicity throughout
   the site, additional toxicity testing should be conducted in the areas where
   the models  are thought to be unreliable.  Alternatively, area-specific
   models might be developed that provide a more reliable basis for predicting
   sediment toxicity in specific areas.

7.  How should the results of the predictive models be used, in conjunction
   with other site data, in a weight-of-evidence evaluation aimed at assessing
   risk to the benthic community?

   Response:  Risks to benthic invertebrates associated with exposure to
   sediments at the Portland Harbor site should  be  evaluated differently,
   depending on the types of data that are available for a sampling location.
   If the minimum whole-sediment toxicity data (i.e., survival and biomass of
   midge in 10-d exposures and survival and biomass of amphipods in 28-d
   exposures) are available for a sampling location, then these data should be
   used preferentially to assess risks to benthic invertebrates (as stated in
   LWG 2006). If the requisite whole-sediment toxicity data are not
   available for a sampling location, then the most reliable predictive model
   should be used, in conjunction with any toxicity data that are available, to
   assess risks to benthic invertebrates.  In addition, the prediction should be
   compared to nearby stations of similar characteristics (chemistry, geology,
   etc.) that include toxicity information to help inform whether to trust the
   prediction results.  Even comparison to stations that are some distance
   away, but have similar physical/chemical characteristics and have toxicity
   information, could lead to improved interpretation of the validity of the
   prediction.   Furthermore, the  potential for  a  station to  follow  a
   concentration/toxicity gradient can add information about the validity of
       the prediction.  Examination of the data for the samples where chemistry
       and toxicity are not well correlated can provide additional insights on the
       bioavailability  of COPCs.   In any  case  where  the  prediction seems
       questionable, additional chemical and/or toxicity testing is recommended
       to resolve the issue.
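
 The hit/no-hit designation described in the responses to questions 1 and 2 can
 be sketched in Python as follows. This is a simplified illustration, not the
 procedure specified in Section 4.3. It assumes that a statistical comparison
 with the negative control has already produced a p-value for each endpoint, and
 that each response is expressed as an adverse effect whose reference envelope
 has a known upper bound. The field names, the 0.05 significance level, and the
 example values are hypothetical.

```python
ENDPOINTS = (
    "midge_survival",
    "midge_biomass",
    "amphipod_survival",
    "amphipod_biomass",
)


def designate_hits(sample, alpha=0.05):
    """Designate each endpoint as a hit (True) or no-hit (False).

    A hit requires BOTH conditions described in the text:
      1. the adverse effect differs significantly from the negative control
         (ASSUMED here to be summarized by a precomputed p-value), and
      2. the adverse effect falls outside the reference envelope
         (ASSUMED here to mean above a supplied upper bound).
    Endpoints are evaluated separately and are not pooled.
    """
    hits = {}
    for endpoint in ENDPOINTS:
        result = sample[endpoint]
        significant = result["p_value"] < alpha
        outside_envelope = result["effect"] > result["envelope_upper"]
        hits[endpoint] = significant and outside_envelope
    return hits


# Hypothetical results for one station (effects as percent reduction).
station = {
    "midge_survival": {"p_value": 0.01, "effect": 30.0, "envelope_upper": 20.0},
    "midge_biomass": {"p_value": 0.20, "effect": 35.0, "envelope_upper": 25.0},
    "amphipod_survival": {"p_value": 0.01, "effect": 10.0, "envelope_upper": 15.0},
    "amphipod_biomass": {"p_value": 0.50, "effect": 5.0, "envelope_upper": 25.0},
}
hits = designate_hits(station)
# A hit for any single endpoint is enough to question the station.
station_flagged = any(hits.values())
print(hits, station_flagged)
```

 The same per-endpoint designations would then serve as the response variable
 for both the LRM and the FPM, so that the two models are developed and
 evaluated on a common basis.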

 In response to a preliminary review by USEPA personnel, an addendum was prepared
 to further clarify some of the responses included in this document.  This addendum
 is attached to this document.

6.0  References


      ASTM (American Society for Testing and Materials). 2007. Standard test method
         for measuring the toxicity of sediment-associated contaminants with freshwater
         invertebrates. E1706-00. Section Eleven: Water and Environment Technology.
         West Conshohocken, Pennsylvania.

      Field, L.J., D.D. MacDonald, S.B. Norton, C.G. Ingersoll, C. Severn, D.E. Smorong,
         and  R.A. Lindskoog.  2002.   Predicting amphipod  toxicity  from sediment
         chemistry using logistic  regression  models.  Environmental Toxicology and
         Chemistry 21:1993-2005.

      Hwang, H.,  S.W. Fisher, K. Kim, and P.F. Landrum. 2004.  Comparison of the
         toxicity using body residues of DDE  and  select  PCB  congeners to the midge,
         Chironomus riparius, in  partial-life cycle tests.  Archives of Environmental
         Contamination and Toxicology 46:32-42.

      LWG (Lower Willamette  Group).   2004.  Portland Harbor RI/FS.  Technical
         memorandum: Provisional toxicity reference value selection for the  Portland
         Harbor preliminary ecological risk assessment. Portland, Oregon.

      LWG (Lower Willamette Group). 2005a. Portland Harbor Superfund Site Ecological
         Risk Assessment: Estimating Risks  to Benthic  Organisms Using Sediment
         Toxicity  Tests.   Prepared by Windward Environmental, TerraStat Consulting
         Group, and Avocet Consulting. Seattle, Washington.

      LWG (Lower Willamette Group). 2005b.  Portland Harbor RI/FS. Round 2A data
         report. Sediment toxicity testing.  Prepared by Windward Environmental. Seattle,
         Washington.

      LWG (Lower Willamette Group). 2006. Portland Harbor Superfund Site Ecological
         Risk Assessment: Interpretative Report: Estimating Risks to Benthic Organisms
         Using Predictive Models Based on Sediment Toxicity Tests.   Prepared  by
         Windward Environmental, TerraStat Consulting Group, and Avocet Consulting.
         Seattle, Washington.

      MacDonald, D.D.,  C.G.  Ingersoll, D.R.J. Moore, M. Bonnell, R.L. Breton, R.A.
         Lindskoog, D.B. MacDonald, Y.K. Muirhead, A.V. Pawlitz, D.E. Sims, D.E.
         Smorong, R.S. Teed,  R.P. Thompson, and N. Wang. 2002.  Calcasieu Estuary
         remedial investigation/feasibility study (RI/FS): Baseline ecological risk
    assessment (BERA).   Technical  report plus  appendices.   Contract No.
    68-W5-0022. Prepared for CDM Federal Programs Corporation and United States
    Environmental Protection Agency.  Dallas, Texas.

 MacDonald, D.D., R.L. Breton, K. Edelmann, M.S. Goldberg, C.G. Ingersoll, R.A.
    Lindskoog, D.B. MacDonald, D.R.J. Moore, A.V. Pawlitz, D.E. Smorong, and
    R.P. Thompson. 2003. Development and evaluation of preliminary remediation
    goals for selected contaminants of concern at the Calcasieu Estuary cooperative
    site, Lake Charles,  Louisiana.   Prepared for  United  States Environmental
    Protection Agency, Region 6. Dallas, Texas.

 MacDonald, D.D., C.G. Ingersoll, D.E. Smorong, L. Fisher, C. Huntington, and G.
    Braun.   2005a.   Development  and  evaluation of risk-based preliminary
    remediation goals for selected sediment-associated contaminants of concern in the
    West Branch of the Grand Calumet River. Prepared for:  United States Fish and
    Wildlife Service.  Bloomington, Indiana.

 MacDonald, D.D., C.G. Ingersoll, A.D. Porter, S.B. Black, C. Miller, and Y.K. Muirhead.
    2005b. Development and evaluation of preliminary remediation goals for aquatic
    receptors in the Indiana Harbor Area of Concern.  Technical Report. Prepared for:
    United States Fish and  Wildlife Service.  Bloomington, Indiana and Indiana
    Department of Environmental Management. Indianapolis, Indiana.

 MacDonald, D.D., D.E.  Smorong, D.G. Pehrman, C.G. Ingersoll, J.J. Jackson, Y.K.
    Muirhead, S. Irving, and C. McCarthy.  2007. Conceptual field sampling design
    - 2007 sediment sampling program of the Tri-State Mining District.  Prepared for
    U.S. Environmental Protection Agency.  Region VI, Dallas, Texas. Region VII,
    Kansas City, Kansas.

 MacDonald, D.D., D.E. Smorong, C.G. Ingersoll, J.M. Besser, W.G. Brumbaugh, N.
    Kemble, T.W. May, S. Irving, and M. O'Hare. 2008. Evaluation of the matching
    sediment chemistry  and sediment toxicity  in the Tri-State Mining  District
    (TSMD), Missouri, Oklahoma, and Kansas.  Preliminary Draft.  Prepared for
    United States Environmental Protection  Agency Region  6, and Region  7 and
    United States Fish and Wildlife  Service, Columbia, Missouri. Prepared  by
    MacDonald Environmental Sciences Ltd. Nanaimo, British Columbia. United
    States  Geological Survey. Columbia, Missouri and CH2M Hill. Dallas, Texas.

 Phillips, B.M., J.W. Hunt, B.S. Anderson, H.M. Puckett, R. Fairey, C.J. Wilson, and
    R. Tjeerdema.  2001.  Statistical significance of sediment toxicity test results:
    Threshold values derived by the detectable significance approach. Environmental
    Toxicology and Chemistry 20:371-373.

 Schuler, L.J., P.F. Landrum, and M.J. Lydy. 2006. Comparative toxicity of
    fluoranthene and pentachlorobenzene to three freshwater invertebrates.
    Environmental Toxicology and Chemistry 25:985-994.

 SFF (Sustainable Fisheries Foundation). 2007. Workshop to support development
    of guidance on the assessment of contaminated sediments in British Columbia.
    Prepared for B.C. Ministry of Environment. Victoria, British Columbia.

 Thursby,  G.B., J. Heltshe, and K.J. Scott.  1997.  Revised approach to toxicity test
    acceptability criteria using a statistical performance assessment. Environmental
    Toxicology and Chemistry 16(6):1322-1329.

 USEPA (United States Environmental Protection Agency). 2003. Procedures for the
    derivation of equilibrium partitioning sediment  benchmarks (ESBs) for the
    protection of benthic organisms:  PAH mixtures. EPA-600-R-02-013. Office  of
    Research and Development. Washington, District of Columbia.

 USEPA (United States Environmental Protection Agency). 2004. The incidence and
    severity of sediment contamination in surface  waters  of the United  States.
    National sediment quality survey: Second edition (updated). EPA 823-R-02-013.
    Office of Research and Development. Washington, District of Columbia.

 USEPA (United States Environmental Protection Agency). 2005. Procedures for the
    derivation of equilibrium partitioning  sediment  benchmarks (ESBs)  for the
    protection of benthic organisms: Metal mixtures (cadmium, copper, lead, nickel,
    silver, and zinc).  EPA-600-R-02-11.  Office of  Research and Development.
    Washington, District of Columbia.

 USEPA  (United States  Environmental Protection  Agency).   2008.   Problem
    formulation for the baseline ecological risk assessment of the Portland Harbor site.
    USEPA Region X. Portland, Oregon.

-------
Figures

-------
Figure 1.  Scatter plot showing the relationship between amphipod
          (Hyalella azteca; HA) survival and biomass (n = 76).
          [Plot not reproduced. X-axis: Amphipod 28-d Control-adjusted Survival (%).
          Quadrant labels: Not Toxic to HA Survival / Not Toxic to HA Biomass;
          Toxic to HA Survival / Not Toxic to HA Biomass; Not Toxic to HA Survival /
          Toxic to HA Biomass; Toxic to HA Survival / Toxic to HA Biomass.]

-------
Figure 2.   Scatter plot showing the relationship between amphipod
           (Hyalella azteca; HA) survival and midge (Chironomus dilutus; CD)
            survival (n = 76).


-------
Figure 3.   Scatter plot showing the relationship between amphipod
            (Hyalella azteca; HA) survival and midge (Chironomus dilutus; CD)
            biomass (n = 76).
            [Plot not reproduced. X-axis: Amphipod 28-d Control-adjusted Survival (%).
            Quadrant labels: Toxic to HA / Not Toxic to CD; Toxic to HA / Toxic to CD;
            Not Toxic to HA / Not Toxic to CD; Not Toxic to HA / Toxic to CD.]

-------
Addendum 1

-------
ADDENDUM TO THE EVALUATION OF THE APPROACH FOR ASSESSING RISKS TO THE BENTHIC INVERTEBRATE COMMUNITY - PAGE A-1

Addendum 1      Further Evaluation of the Approach for Assessing Risks to the
                Benthic Invertebrate Community at the Portland Harbor Superfund Site
A1.0 Introduction
      In response to a request by the U.S. Environmental Protection Agency (USEPA) and
      the Lower Willamette Group (LWG), Don MacDonald and Peter Landrum conducted
      an  independent evaluation of the approach for assessing risks to  the benthic
      invertebrate community at the Portland Harbor Superfund site (MacDonald and
      Landrum 2008).  Following submission, the document was reviewed by several
      members of the USEPA Technical Team.  This review resulted in the identification
      of several additional questions that needed to be answered to enhance the clarity of the
      original document.  This addendum to the original report is intended to address the
      additional questions that were posed by the USEPA Technical Team, as well as
      several issues that were not sufficiently discussed in the original document.
A2.0 Responses to Additional Questions
      Four additional questions were posed by the USEPA Technical Team in an effort to
      achieve greater clarity in the recommendations offered by MacDonald and Landrum
      (2008). These questions are presented below, along with our responses.

      Question 1: In Section 4.6 (Recommended Procedures for Developing
         Toxicity Thresholds), you discuss the "narrative intent" of toxicity thresholds as
         an important element of developing the specific quantitative threshold values to
         be used in Portland Harbor.  Even though you mention some examples and
         provide a citation, it was not entirely clear to us what quantitative thresholds
         should be used to support the "low-risk" and "high-risk" toxicity thresholds,
         whether two risk thresholds are sufficient, and what specific steps, if any, would
         need to be taken to use the narrative intent to develop quantitative thresholds. Are
         different quantitative thresholds needed for each of the four empirical toxicity test
         results? Also, are these determinations made a priori or a posteriori to analysis
         of the Portland Harbor toxicity data, and to what extent are site data used? In
         general, additional detail regarding the scientific basis and specific steps needed
         would be helpful.
         Response: Section 4.6  of MacDonald  and Landrum (2008)  describes our
            recommendations relative to the development of toxicity thresholds for the
            Portland Harbor site. However, these follow-up questions make it clear
            that our original text was not sufficiently detailed to enable the reader to
            fully understand the recommended procedures. For this reason, we would
            like to offer the following clarifications to make our recommendations
            more accessible. More specifically, we believe that toxicity thresholds for
            the Portland Harbor site should be developed using a step-wise process.
            The steps in this process include:

            •   Develop remedial action objectives (RAOs);
            •   Define the purpose of the toxicity thresholds;
            •   Establish the narrative intent of the toxicity thresholds;
            •   Establish criteria for evaluating the toxicity thresholds;
            •   Establish procedures for designating sediment samples as toxic or not
                toxic;
            •   Apply the procedures for toxicity designation and assign toxicity
                designations for each endpoint;
            •   Develop concentration-response models using the matching sediment
                chemistry and toxicity data;
            •   Derive toxicity thresholds; and,
            •   Evaluate the reliability and/or predictive ability of the toxicity
                thresholds.

            Each of these steps in the process is briefly clarified in the following
            sections of this response.

            Develop Remedial Action Objectives - RAOs are narrative statements that
            describe the intent of any remedial actions that are undertaken to protect
            human health and the environment at a contaminated site.  For example,
            the  RAOs for whole sediment  at the  Portland  Harbor site might  be to
            minimize or prevent exposure to whole  sediments  that are sufficiently
            contaminated to pose moderate or high risks to the microbial or benthic
            invertebrate communities.  Such RAOs describe the desired future
            condition of sediments at the site relative to the risks that they pose to
            human health and/or ecological receptors. Therefore, the RAOs provide
            important guidance to risk assessors on the establishment of the narrative
            intent of the toxicity threshold that will be used in the Baseline Ecological
            Risk Assessment (BERA) and/or Feasibility Study (FS).
            Define the Purpose of the Toxicity Thresholds - For the Portland Harbor
            site, numerical toxicity thresholds  are required to satisfy two important
            needs.  First, toxicity thresholds are needed to support the BERA. In this
            application, the toxicity thresholds  are needed to classify chemistry-only
            sediment samples into categories based on  the risks  that they pose to
            benthic invertebrates. Second, toxicity thresholds are needed to support
            the FS. In this application, the toxicity thresholds are needed to establish
            preliminary remediation goals (PRGs; i.e., risk-based tools for evaluating
            remedial  options at the site) that can be used to evaluate the costs and
            benefits associated with various remedial options. At other sites, we have
            endeavored to establish toxicity thresholds that could be consistently
            applied within the BERA and the FS. In this way, there is a direct linkage
            between the toxicity thresholds that are used to evaluate risks to benthic
            invertebrates and the toxicity thresholds that are used to establish clean-up
            goals (e.g., PRGs). That is, the RAOs inform the narrative intent of the
            toxicity thresholds, which informs the selection of toxicity thresholds based
            on reliability and predictive ability analyses, which in turn informs the
            selection of PRGs.
            Establish the Narrative Intent of the Toxicity Thresholds - Virtually all
            approaches to the development of sediment quality guidelines (SQGs) are
            linked to a narrative that describes the purpose or intent of the resultant
            SQGs. This narrative intent has been described in various publications and
            summarized  for selected  national SQGs in Wenning  et  al. (2002).
            Importantly, the narrative intent of the SQGs provides risk assessors with
            essential guidance on the appropriate uses of the  SQGs and relevant
            information for establishing criteria for evaluating how well the SQGs
            work at specific sites.   For example, a threshold effect level (TEL) is
            intended to identify the concentration of a chemical of potential concern
            (COPC) below which adverse effects on benthic invertebrates are likely to
            be observed only infrequently.  Therefore, a TEL  should  be used to
            identify conditions where the concentrations  of a  specific  COPC are
            unlikely to cause or substantially  contribute to sediment toxicity.  In
            addition, TELs should  be  considered to be reliable if there is a low
            incidence of toxicity (IOT; i.e., <10%) for sediment samples that  have
            COPC concentrations below the TELs for all measured substances. A TEL
            should  not, necessarily,  be  evaluated to determine how well it predicts
            toxicity because TELs were not designed for this purpose.

            Numerical toxicity thresholds (i.e.,  site-specific sediment quality values;
            SQVs) have been identified as important tools for assessing risks to benthic
            invertebrates at the Portland Harbor site. As such, it would be beneficial
            to clearly articulate the narrative intent of the toxicity thresholds that will
            be used in the BERA process and/or to establish target clean-up goals (i.e.,
            PRGs).  The narrative intent of the SQVs should be consistent with the
            RAOs that are established for the site.  More specifically, numerical SQVs
            are required to identify sediment samples at the Portland Harbor site that
            pose low risks to benthic invertebrates (i.e.,  below which there would be
            a low IOT; e.g., <20% of the samples would be predicted to be toxic).
            Remedial measures are  unlikely to be required to address risks to the
            benthic invertebrate community at  locations with COPC concentrations
            below the  low-risk SQVs. In addition, numerical SQVs are  required to
            identify sediment samples that pose high risks to benthic invertebrates (i.e.,
            above which there would be a high IOT; e.g., > 50% of the samples would
            be predicted to be toxic). Remedial measures may be required to address
            risks to the benthic  invertebrate community at locations with COPC
            concentrations above the high-risk SQVs.  Such low-risk and high-risk
            SQVs would also result in the identification of COPC concentrations that
            would be predicted to be associated with a moderate IOT (e.g., 20 to 50%
            of the samples would be predicted to be toxic).  Additional data
            interpretation and/or toxicity testing may be required at locations  with
            COPC concentrations that fall between the low-risk and high-risk SQVs.
            This approach would be consistent with the one used in the Calcasieu
            Estuary  to  support the  derivation  of toxicity thresholds for use in the
            BERA and the FS (MacDonald et al. 2002; 2003).
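
            The three-category scheme described above can be sketched in a few lines
            of code. This is a minimal illustration only, not part of the recommended
            procedure: the function name classify_risk, the SQV values, and the
            sample concentrations are hypothetical, and treating a concentration
            equal to an SQV as moderate risk is an assumption.

```python
def classify_risk(concentration, low_risk_sqv, high_risk_sqv):
    """Assign a narrative risk category to one COPC concentration."""
    if low_risk_sqv > high_risk_sqv:
        raise ValueError("low-risk SQV must not exceed high-risk SQV")
    if concentration < low_risk_sqv:
        return "low"       # low IOT expected (e.g., <20%)
    if concentration > high_risk_sqv:
        return "high"      # high IOT expected (e.g., >50%)
    return "moderate"      # between the SQVs (e.g., 20 to 50% IOT)


# Hypothetical SQVs and concentrations, in arbitrary units.
samples = {"S1": 0.8, "S2": 3.5, "S3": 12.0}
categories = {
    sample_id: classify_risk(conc, low_risk_sqv=1.0, high_risk_sqv=10.0)
    for sample_id, conc in samples.items()
}
print(categories)
```

            Samples that fall in the moderate category would be candidates for the
            additional data interpretation and/or toxicity testing noted above.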

            The Calcasieu Estuary example illustrates one option for establishing the
            narrative intent of the SQGs. It may be that there  is a need to establish
            additional categories for assessing risks to benthic invertebrates in Portland
            Harbor.  For example, the State of California established a set of criteria
            for  placing sediment samples into each of four  categories, based on
            potential for toxicity to benthic invertebrates (i.e., non-toxic,  low toxicity,
            moderate toxicity, and high toxicity). The toxicity thresholds that were
            established for various COPCs and COPC mixtures reflected the narrative
            intent of the categories (see
            http://www.sccwrp.org/sqo/pubs/503_toxicity_indicator_methods.pdf).
            These thresholds were explicitly developed to facilitate classification of
            sediment samples into these categories using data on sediment toxicity
            and/or sediment chemistry (see
            http://www.sccwrp.org/sqo/pubs/543_ChemToxSQGComparison_Draft_10_24_07.pdf).
            While the two examples described here illustrate two options for
            describing the narrative intent of SQGs, the number of categories for
            which the narrative is established depends on the needs of the manager.

            It is recommended that SQVs  be  established for  all  four of the endpoints
            (i.e., amphipod  survival, midge survival, amphipod biomass, and midge
            biomass) examined at the Portland Harbor site because the organisms may
            be differentially sensitive by endpoint  and/or  by species  to  different
            mixtures of chemicals in the sediment. However, the narrative intent of the
            SQVs developed using the models for each endpoint should be similar, at
            least at the outset. Following model and SQV development, the reliability
            and predictive ability evaluations will provide the information needed to
            determine the relative  sensitivity  of each endpoint  and  the level of
            protection that SQGs derived for various endpoints will afford toxicity test
            organisms overall.

            Establish Criteria for Evaluating the Toxicity Thresholds - Once the
            narrative intent of the SQVs has been established, it is possible to establish
            criteria for evaluating the site-specific toxicity thresholds. For the above
            example, the low-risk SQVs would be considered to be reliable if there is
            a  low  IOT   (i.e.,  <20%)  for sediment  samples  that  have COPC
            concentrations below the low-risk SQVs for all measured substances.  In
            contrast, the high-risk SQVs would be considered to be reliable if there is
            a  high  IOT  (i.e.,  >50%) for sediment  samples  that  have COPC
            concentrations above the high-risk SQVs for all measured substances.  In
            addition, a low-risk/high-risk pair of SQVs for a COPC or COPC mixture
            would be considered to be reliable if there is a moderate IOT when COPC
            concentrations fall between the two SQVs (i.e.,  20 to 50% IOT).   This
            example illustrates the need to establish a  direct linkage between the
            narrative intent of the SQVs and the criteria that are used to evaluate the
            SQVs.
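
            The evaluation criteria in this example can be expressed as a short
            calculation. The sketch below assumes a list of (concentration, toxic)
            pairs for a single COPC and endpoint; the function names, the data, and
            the handling of concentrations exactly equal to an SQV are hypothetical.

```python
def incidence_of_toxicity(flags):
    """Percent of samples designated toxic; None if there are no samples."""
    if not flags:
        return None
    return 100.0 * sum(flags) / len(flags)


def evaluate_sqv_pair(records, low_sqv, high_sqv):
    """records: list of (concentration, is_toxic) tuples for one endpoint."""
    below = [tox for conc, tox in records if conc < low_sqv]
    between = [tox for conc, tox in records if low_sqv <= conc <= high_sqv]
    above = [tox for conc, tox in records if conc > high_sqv]
    iot = {
        "below": incidence_of_toxicity(below),
        "between": incidence_of_toxicity(between),
        "above": incidence_of_toxicity(above),
    }
    # Criteria from the example: <20% below, >50% above, 20 to 50% between.
    reliable = {
        "low_risk": iot["below"] is not None and iot["below"] < 20.0,
        "high_risk": iot["above"] is not None and iot["above"] > 50.0,
        "pair": iot["between"] is not None and 20.0 <= iot["between"] <= 50.0,
    }
    return iot, reliable


# Hypothetical matching chemistry and toxicity data (arbitrary units).
records = [
    (0.2, False), (0.3, False), (0.5, False), (0.6, True), (0.8, False), (0.9, False),
    (2.0, False), (4.0, True), (6.0, False), (8.0, False),
    (12.0, True), (15.0, True), (20.0, False), (30.0, True),
]
iot, reliable = evaluate_sqv_pair(records, low_sqv=1.0, high_sqv=10.0)
print(iot, reliable)
```
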
           Establish Procedures for Designating Sediment Samples as Toxic or Not
           Toxic - Both of the modeling approaches that have been selected for use
           at the Portland Harbor site rely on hit/no hit designations of the sediment
           samples used in the development of the predictive models. Section 4.3 of
           MacDonald and Landrum (2008) describes our recommended procedures
           for determining if individual sediment samples are toxic or not toxic to
           benthic invertebrates (i.e., reference envelope approach). This approach
           can be applied to designate sediment samples as toxic or not toxic for each
           of the toxicity test endpoints selected for assessing whole-sediment toxicity
           at the Portland  Harbor site.  Recommended approaches  for selecting
           reference stations  are described in Section A4.1 of this document.  In
           addition,  the recommended criteria for identifying  reference sediment
           samples are presented in Section A4.2 of this document. The criteria for
           evaluating candidate reference samples presented in Section A4.2
           supersede the criteria listed in Section 4.3 of MacDonald and Landrum
           (2008).
           Assign Toxicity Designations to Sediment Samples - As indicated above,
           MacDonald and Landrum (2008) recommended procedures for designating
           sediment  samples  from  Portland  Harbor  as  toxic  or  not toxic.
           Implementation of these and/or alternate procedures will facilitate
            designation of each sediment sample from the study area as toxic or not
            toxic on an endpoint-by-endpoint basis. That is, each sediment sample will
            have at least  four toxicity designations (i.e., based on amphipod survival,
            amphipod biomass, midge survival, and midge biomass). These toxicity
            designations  should directly support the development of predictive models
            for each of the four toxicity test endpoints and each of the COPCs/COPC
            mixtures that are relevant to the site.
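
            The endpoint-by-endpoint designation can be pictured as a small lookup
            per sample. The sketch below assumes, purely for illustration, that a
            sample is designated toxic for an endpoint when its control-adjusted
            response falls below a lower limit derived from reference samples. The
            limits and the sample values shown are hypothetical; the actual
            designation procedure is the one described in Section 4.3 of MacDonald
            and Landrum (2008) and Section A4.2 of this document.

```python
ENDPOINTS = (
    "amphipod_survival",
    "amphipod_biomass",
    "midge_survival",
    "midge_biomass",
)

# Hypothetical lower limits of the reference envelope (control-adjusted %).
REFERENCE_LOWER_LIMIT = {
    "amphipod_survival": 80.0,
    "amphipod_biomass": 70.0,
    "midge_survival": 75.0,
    "midge_biomass": 65.0,
}


def designate(sample_responses, lower_limits=REFERENCE_LOWER_LIMIT):
    """Return {endpoint: True if toxic, False if not toxic} for one sample."""
    return {ep: sample_responses[ep] < lower_limits[ep] for ep in ENDPOINTS}


# Hypothetical control-adjusted responses (%) for one sediment sample.
sample = {
    "amphipod_survival": 92.0,
    "amphipod_biomass": 55.0,
    "midge_survival": 60.0,
    "midge_biomass": 88.0,
}
designations = designate(sample)
print(designations)
```
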
            Develop Predictive Models (i.e., Concentration-Response Models) - As
            indicated by MacDonald and  Landrum (2008),  there are a variety of
            approaches that could be used  to evaluate relationships between whole-
            sediment chemistry and whole-sediment toxicity at the Portland Harbor site
            (See  Section 4.5).  The logistic  regression model  (LRM) and floating
            percentile model (FPM) are likely to provide useful tools for evaluating
            relationships between the concentrations of COPCs/COPC mixtures in
            Portland Harbor sediments and the responses of benthic  invertebrates
            (i.e., amphipod survival, amphipod biomass, midge survival, and midge
            biomass).  In addition, the site-specific sediment chemistry and sediment
            toxicity  data could be used  to develop  concentration-response  models
            based on magnitude of toxicity (MOT; e.g., control-adjusted survival of
            amphipods).  Furthermore, area of interest-specific  models  could be
            developed to better explain the relationships between sediment chemistry
            and sediment toxicity if the site-wide models are not sufficiently reliable
            to accurately predict the presence or absence of sediment  toxicity.

            Based on our review  of the existing models  and their performance, it
            appears  that grain size (i.e., percent fines) is the metric  that  is best
            correlated with the responses of benthic invertebrates exposed to Portland
            Harbor sediments. While these  results could reflect the physical effects of
            grain  size,  the  toxicity  test  organisms that were selected to evaluate
            Portland Harbor sediments are not highly sensitive to grain size (USEPA
            2000; ASTM 2007).  Therefore, it is more  likely that percent fines
            represents  a general surrogate for contamination  in Portland  Harbor
            sediments.  That percent fines is better correlated with sediment toxicity
            than any of the measured COPCs or COPC mixtures likely indicates that
            a  variety of measured and/or  unmeasured substances are causing or
            substantially contributing to the observed toxicity in these sediments. This
            information strengthens the position that multiple chemical concentration
            gradients occur within Portland Harbor sediments. If this is the case, then
            it is unlikely that site-wide predictive models for individual COPCs or
            simple COPC mixtures (e.g., tPAHs, tDDTs, tPCBs) will provide highly
            reliable bases for classifying sediment samples as toxic or not toxic to
            benthic invertebrates. Instead, area of interest-specific predictive
            models may be required to improve the reliability and predictive ability of
            the  models.  Alternatively, other  data  collection and/or interpretation
            approaches may be required to support remedial decisions at the site.
            Derive  Toxicity  Thresholds  - As  indicated  above, two modeling
            approaches have been selected to support evaluation of risks to benthic
            invertebrates  at the Portland Harbor site.  Both the logistic regression
            model (LRM) and floating percentile model (FPM) approaches can be used
            to derive numerical toxicity thresholds (i.e., SQVs) for individual COPCs
            and/or  COPC mixtures.  Both  approaches provide information on the
            probability of observing toxicity to benthic invertebrates  based on the
            measured  concentrations of COPCs/COPC mixtures in sediments (i.e.,
            these models  are IOT based rather than MOT based).
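
            To show how an IOT-based threshold can be read from a probability model,
            the sketch below inverts a logistic curve fitted on log10 concentration.
            The functional form and the coefficients B0 and B1 are assumptions
            chosen for illustration; they are not the models fitted for Portland
            Harbor, and the FPM is not represented here.

```python
import math


def probability_of_toxicity(conc, b0, b1):
    """Logistic model: predicted probability of toxicity at a concentration."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * math.log10(conc))))


def concentration_at_probability(p, b0, b1):
    """Invert the model to find the concentration with predicted probability p."""
    logit = math.log(p / (1.0 - p))
    return 10.0 ** ((logit - b0) / b1)


# Hypothetical coefficients for one COPC and one endpoint.
B0, B1 = -2.0, 2.0

low_risk_sqv = concentration_at_probability(0.20, B0, B1)   # 20% probability
high_risk_sqv = concentration_at_probability(0.50, B0, B1)  # 50% probability
print(low_risk_sqv, high_risk_sqv)
```

            Under these assumed coefficients, the two probabilities map to two
            concentrations that could serve as candidate low-risk and high-risk
            values, subject to the reliability evaluation.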

            At other sites that we have worked on (e.g., Calcasieu Estuary, Tri-State
            Mining District), two types of toxicity thresholds  were established to
            support the BERA and FS processes, including low-risk toxicity thresholds
            and   high-risk  toxicity  thresholds  [as  described in  Section  4.6  of
            MacDonald and Landrum (2008)].  Both of these toxicity threshold types
            were developed to correspond to pre-selected magnitudes of toxicity
            (MOT; i.e., 10% and 20%  increase in the MOT  relative to reference
            conditions, respectively).  The  MOTs were selected jointly by the risk
            assessors, the risk managers, and the Natural Resources Trustees, and were
            considered to be consistent with the RAOs for the sites.  The low-risk and
            high-risk toxicity thresholds were derived from the concentration-response
            relationships  developed  for each COPC/COPC mixture-toxicity test
            endpoint pair of interest at the site (Figure A1; see MacDonald et al. 2003;
            2005a; 2005b for more information).
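
            A MOT-based threshold of this kind can be illustrated with a simple
            concentration-response curve. The sketch below assumes a log-logistic
            curve for a response such as survival and solves for the concentrations
            at which the response is 10% and 20% lower than the reference response.
            The curve form and the parameter values (reference, ec50, slope) are
            hypothetical and are not taken from the cited studies.

```python
def response(conc, reference, ec50, slope):
    """Assumed log-logistic curve: response declines as concentration rises."""
    return reference / (1.0 + (conc / ec50) ** slope)


def threshold_for_reduction(fraction, ec50, slope):
    """Concentration at which the response is reduced by `fraction`.

    Solves reference / (1 + (c / ec50) ** slope) = reference * (1 - fraction).
    """
    return ec50 * (fraction / (1.0 - fraction)) ** (1.0 / slope)


# Hypothetical curve parameters (concentration in arbitrary units).
REFERENCE, EC50, SLOPE = 95.0, 100.0, 2.0

t10 = threshold_for_reduction(0.10, EC50, SLOPE)  # low-risk threshold
t20 = threshold_for_reduction(0.20, EC50, SLOPE)  # high-risk threshold
print(t10, t20)
```
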
            It is our understanding that the two  modeling approaches  selected to
            support evaluation of risks to benthic invertebrates provide information on
            the  probability of observing toxicity to benthic invertebrates (i.e., IOT
            rather than MOT). Our experience suggests that toxicity thresholds based
            on IOT and on MOT can be generally consistent, with a 10% increase in
            the  MOT roughly corresponding to a 20%  increase in the IOT.  Toxicity
            thresholds based on a 20% increase in the MOT generally correspond to
            those based on a 50% increase in the IOT.  Therefore, it would not be
            unreasonable to establish the narrative intent of SQGs for the Portland
            Harbor site as follows:

            •    Low-risk toxicity thresholds represent the concentrations of COPCs or
                 COPC mixtures below which there is less than 20% IOT to benthic
                 invertebrates;
            •    High-risk toxicity thresholds represent the concentrations of COPCs or
                 COPC mixtures above which there is greater than 50% IOT to benthic
                 invertebrates; and,
            •    A moderate IOT (i.e., 20 to 50%) should be observed at concentrations
                 of COPCs or COPC mixtures between the low-risk and high-risk
                 toxicity thresholds.  A moderate risk would be assigned to sediment
                 samples with concentrations of COPCs or COPC mixtures that fall
                 within this category.

            Such narrative objectives for the toxicity thresholds would provide clear
            guidance to the modelers relative to the development of toxicity thresholds
            from the models. In addition, establishment of such narrative objectives
            for  the toxicity  thresholds would  provide  important  information for
            establishing evaluation criteria for  determining  the  reliability  and
            predictive ability of the toxicity thresholds that are developed  from the
            models.
            Evaluate  the  Reliability  and/or  Predictive  Ability of the Toxicity
            Thresholds - The reliability of the various toxicity thresholds should be
            evaluated to determine if they can be used to accurately classify sediment
            samples from  the  site  as  toxic  or not toxic (i.e., using  the  matching
            sediment chemistry and toxicity data that were used to derive the toxicity
             thresholds).  In contrast, the evaluation of predictive ability is conducted
             using an independent data set (i.e., using matching sediment chemistry and
             toxicity data that were not used to derive the toxicity thresholds).

             At a metals-contaminated site, toxicity thresholds were developed using the
             results of 28-d toxicity tests with the amphipod, Hyalella azteca, and the
             mussel, Lampsilis siliquoidea (i.e., T10 and T20 values, based on MOT).
             The results of the evaluation of the reliability of these toxicity thresholds
             are presented in Table A1.  These results show the IOT below each toxicity
             threshold, the IOT above each toxicity threshold, and the overall correct
             classification rate for each toxicity threshold.  Similarly, the results of the
             predictive ability evaluation are presented in Table A2.
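
             The three quantities reported in such tables can be computed as in the
             sketch below. The same function serves both evaluations; only the data
             differ (the derivation data set for reliability, an independent data
             set for predictive ability). All names and values are hypothetical,
             and counting a concentration equal to the threshold as "below" is an
             assumption.

```python
def evaluate_threshold(records, threshold):
    """records: list of (concentration, is_toxic) tuples for one endpoint."""
    below = [tox for conc, tox in records if conc <= threshold]
    above = [tox for conc, tox in records if conc > threshold]
    # Correct calls: not toxic below the threshold, toxic above it.
    correct = sum(1 for tox in below if not tox) + sum(1 for tox in above if tox)
    return {
        "iot_below": 100.0 * sum(below) / len(below) if below else None,
        "iot_above": 100.0 * sum(above) / len(above) if above else None,
        "correct_rate": 100.0 * correct / len(records) if records else None,
    }


# Hypothetical data used to derive the threshold (reliability evaluation).
derivation = [
    (0.1, False), (0.2, False), (0.3, True), (0.4, False), (0.5, False),
    (0.6, True), (0.7, True), (0.8, False), (0.9, True), (1.0, True),
]
# Hypothetical independent data (predictive ability evaluation).
independent = [(0.2, False), (0.4, False), (0.6, True), (0.8, True)]

reliability = evaluate_threshold(derivation, threshold=0.5)
predictive_ability = evaluate_threshold(independent, threshold=0.5)
print(reliability, predictive_ability)
```
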

             The Calcasieu Estuary study also provides a useful example for illustrating
             the  importance  of conducting  the reliability and predictive  ability
             evaluations.  In this case, mean  probable effect concentration-quotients
             (PEC-Qs) of 0.24 and 0.45 for amphipod survival (Hyalella azteca) were
             selected as the low-risk and high-risk toxicity thresholds, respectively. The
              results of the reliability evaluation showed that the incidence of toxicity
             was generally low below the  selected low-risk toxicity  threshold (i.e.,
             18.7% of  the samples were toxic to Hyalella azteca in 28-d exposures;
             Table A3). Above the selected  high-risk threshold, 69% of the samples
             were toxic.   Because there was a  high IOT between the two  toxicity
             thresholds (i.e., 67%),  it was concluded that a  single toxicity threshold
             could be used to classify sediment samples into two categories, toxic or not
             toxic  to amphipods in 28-d  toxicity  tests (i.e.,  the  low-risk  toxicity
             threshold of 0.24 for mean PEC-Q was selected as the toxicity threshold).
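
             The reasoning in this example can be sketched in Python as follows.  The
             two mean PEC-Q thresholds and the three IOT values come from the text;
             the function names, the handling of samples that fall exactly on a
             threshold, and the 0.05 tolerance used to judge whether two categories
             behave alike are hypothetical choices made only for illustration.

```python
from collections import defaultdict

LOW_RISK = 0.24   # low-risk mean PEC-Q threshold cited in the text
HIGH_RISK = 0.45  # high-risk mean PEC-Q threshold cited in the text


def risk_category(mean_pec_q):
    """Assign a sample to a category (boundary handling is an assumption)."""
    if mean_pec_q < LOW_RISK:
        return "low"
    if mean_pec_q < HIGH_RISK:
        return "moderate"
    return "high"


def incidence_by_category(samples):
    """samples: iterable of (mean_pec_q, is_toxic) pairs -> {category: IOT}."""
    counts = defaultdict(lambda: [0, 0])  # category -> [toxic, total]
    for mean_pec_q, is_toxic in samples:
        tally = counts[risk_category(mean_pec_q)]
        tally[0] += int(is_toxic)
        tally[1] += 1
    return {cat: toxic / total for cat, (toxic, total) in counts.items()}


def upper_categories_similar(iot, tolerance=0.05):
    """True when the moderate and high IOT differ by no more than `tolerance`
    (a hypothetical cut-off), suggesting that one threshold may suffice."""
    return abs(iot["high"] - iot["moderate"]) <= tolerance


# IOT values reported in the text for the Calcasieu Estuary example.
print(upper_categories_similar({"low": 0.187, "moderate": 0.67, "high": 0.69}))
```

             With the reported values (67% between the thresholds and 69% above the
             high-risk threshold), the check returns True, which mirrors the
             conclusion that a single threshold of 0.24 was sufficient.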

             This example also provides important information on the predictive ability
             of the  toxicity thresholds (i.e.,  in terms of predicting toxicity to other
             toxicity test organisms  and endpoints and predicting responses of the
             benthic invertebrate community).  These results show that  the selected
             toxicity thresholds provided an  accurate basis  for classifying sediment
             samples from the  site  as  toxic  and not toxic based on the  survival of
             another amphipod species (Ampelisca abdita) and on the fertilization of sea
             urchins (Arbacia punctulata; Table A4).  In addition, many of the benthic
             invertebrate community structure endpoints showed graded responses for
            the groups of sediment samples identified by the toxicity thresholds (Table
            A4). Therefore, the results of the predictive ability evaluation confirmed
            that the toxicity thresholds could be used to accurately classify sediment
            samples into low, moderate, and high-risk categories.  Interestingly, these
            results showed that the growth (length) of Hyalella azteca did not provide
            additional information relative  to the  risks  that sediment-associated
            COPCs/COPC mixtures posed to benthic invertebrates.
             Selection and Application of the Toxicity Thresholds - As indicated in the
             previous  section, the results  of the  reliability and predictive ability
             evaluations provide essential information for selecting toxicity thresholds
             for use in the BERA and/or FS. For both the metals-contaminated site and
             the Calcasieu Estuary, these results can be used directly to identify the
             toxicity thresholds that meet the narrative intent established earlier in the
             process. This direct linkage between narrative intent and the performance
             of the toxicity thresholds makes the selection process relatively
             straightforward.

             For the Portland Harbor BERA, the results of the reliability and predictive
             ability evaluations will provide the information needed to decide which
             toxicity thresholds should be used in the BERA and FS processes and to
             decide how such toxicity thresholds should be used to assess risks to the
             benthic invertebrate  community and/or establish  clean-up  goals  (i.e.,
             PRGs) for the site. As indicated in the Calcasieu Estuary example, it is
             possible that a single toxicity threshold can be used to conduct risk
             assessments in the BERA and to  establish PRGs to  support the FS. The
             results of these evaluations for the Portland Harbor site could also suggest
             that it is reasonable to utilize toxicity thresholds  for  multiple endpoints to
             provide  multiple lines-of-evidence  for  evaluating risks  to  benthic
             invertebrates (i.e., in the sample-by-sample evaluation of sediment quality
             conditions). For example, the  State of California combined multiple lines
             of evidence to evaluate sediment  quality conditions at each sampling
             station  (for more information,   see  http://www.sccwrp.org/sqo/pubs/
             545_MLOE_FrameworkValidationDraft_10_15_07.pdf).  The same type
             of approach could be used for the various endpoints, organisms, and
             thresholds to provide  a framework for deciding the magnitude of concern
            about a station.  This does not mean that a station with only one threshold
            exceeded is ignored but rather it would be assigned a lower magnitude of
            concern than one with multiple thresholds exceeded. In contrast, it may be
            reasonable to select toxicity thresholds for only one endpoint during the
            development of PRGs (e.g., the most sensitive toxicity test endpoint, which
            would be expected to be protective of all other toxicity test endpoints).
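
             One way to picture the multiple lines-of-evidence idea is sketched below
             in Python.  The endpoint names, the threshold values, and the concern
             labels are hypothetical; the only point taken from the text is that a
             station exceeding one threshold is still flagged, but with a lower
             magnitude of concern than a station exceeding several.

```python
def thresholds_exceeded(index_value, endpoint_thresholds):
    """Return the endpoints whose toxicity threshold the station exceeds."""
    return sorted(name for name, limit in endpoint_thresholds.items()
                  if index_value > limit)


def magnitude_of_concern(index_value, endpoint_thresholds):
    """Map the number of exceeded thresholds to a concern label.

    The labels and the one-versus-many rule are illustrative assumptions.
    """
    n_exceeded = len(thresholds_exceeded(index_value, endpoint_thresholds))
    if n_exceeded == 0:
        return "none indicated"
    if n_exceeded == 1:
        return "lower"
    return "higher"


# Hypothetical thresholds for one chemistry index (e.g., mean PEC-Q).
example_thresholds = {
    "amphipod survival": 0.40,
    "amphipod biomass": 0.25,
    "midge biomass": 0.60,
}

for station, value in [("S1", 0.20), ("S2", 0.30), ("S3", 0.70)]:
    print(station, magnitude_of_concern(value, example_thresholds))
```
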
             Summary - In summary, we recommend that the RAOs and narrative intent
             of the SQVs be established prior to developing predictive models for the
             site.   This is important for  ensuring that the models  can be properly
             optimized to respond to the narrative intent articulated. Establishment of
             the narrative intent of the SQVs a priori will support the development of
             evaluation criteria that are consistent with management needs at the site (as
             articulated in the RAOs). In addition, we recommend using data from the
             Portland Harbor site and/or from other locations in the development of the
             two models.  We further recommend that a portion of the data from the site
             be set aside for use in evaluating the predictive ability of the models. By
             doing so, both the reliability and predictive  ability of the  SQVs can be
             evaluated. The results of these evaluations should be used to identify the
             toxicity threshold or toxicity thresholds that ought to be used to  classify
             sediment samples from the site in terms of the risks that they pose to the
             benthic invertebrate community. These results  should also be  used to
             identify the need  for area  of interest-specific toxicity thresholds and/or
             other  data interpretation approaches to  evaluate risks to the  benthic
             invertebrate  community   associated  with  exposure  to  contaminated
             sediments and to support remedial decisions at the site.
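
             The recommendation to set aside a portion of the site data can be
             sketched as a simple random split, shown below in Python.  The 30%
             hold-out fraction, the fixed seed, and the function name are
             assumptions; the text does not specify how large the independent data
             set should be or how it should be drawn.

```python
import random


def split_site_data(samples, holdout_fraction=0.3, seed=0):
    """Split matched chemistry/toxicity samples into two lists.

    Returns (derivation_set, independent_set).  The derivation set would be
    used to develop the models and evaluate reliability; the independent set
    would be reserved for the predictive ability evaluation.
    """
    rng = random.Random(seed)       # fixed seed so the split is repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Hypothetical sample identifiers.
derivation, independent = split_site_data([f"SED-{i:03d}" for i in range(10)])
print(len(derivation), len(independent))  # 7 3
```

             A stratified split (for example, by area of interest or by
             concentration range) may be preferable where contamination gradients
             differ across the site; that choice is left open here.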
      Question 2: In your answer to question #4 (should non-site data be considered in the
         development of the LRM?), you support use of non-site data. However, would
         you also support use of non-site data in the development of the floating point
         model? Most of the discussion regarding use of non-site data between EPA and
         LWG has focused on the LRM, but in the interests of full clarity, we wanted to
         know whether you suggested non-site data are also of value to the floating point
         model.

         Response: It would also be acceptable to use non-site data for developing the
            floating point model. The objective of the modeling process is to develop
            one or more tools that can be used to accurately classify sediment samples
            as toxic or not toxic, based on whole-sediment chemistry data alone. Such
            tools  can  include generic  SQGs or  site-specific  sediment toxicity
            thresholds for  individual COPCs and/or COPC mixtures.  From our
            perspective, the approach that is used to generate the models and the
            source of the underlying data that are applied in the modeling process are
            not particularly relevant.  What matters is whether or not the resultant
            model can be used to accurately classify sediment samples from Portland
            Harbor as toxic or not toxic (i.e., based on the results of the reliability and
            predictive ability evaluations).  We have described the procedures for
            evaluating the models in Section 4.7 of the document.

            There is one issue that we have some concern about with respect to the use
            of site data in the development of the models of toxic  response  versus
            chemical contamination. The sediment samples that have been collected
            at the Portland Harbor site include material present within the  0 to 30 cm
            sediment depth. Hence, the samples include material located beyond (i.e.,
            deeper than) the biologically-active zone [i.e., 9.8 ± 4.5 cm for marine
            organisms (Boudreau 1998), 0-2 cm to 0-15 cm for nearshore infauna, and
            0-2 cm to 0-12 cm for freshwater invertebrates
            (http://www.sediments.org/sedstab/germano.pdf)].  The biologically-active
            depth is tied to the rate of deposition of the sediments (White and
            Miller 2008).

            Inclusion of deeper material in the site sediment samples increases the
            likelihood that factors such  as ammonia and/or hydrogen sulfide have
            contributed to the observed responses of toxicity test organisms. Thus, the
            selection of the 0-30 cm sediment horizon in the sampling programs could lead
            to some  misleading information on the current surficial conditions and,
            because of the complications noted above, could result in variability in the
            development of the relationship between sediment chemistry and toxicity.
            This  issue is also relevant to the selection of sediment samples for
            inclusion in the reference envelope calculations (see Section A4.0 below).

      Question 3: Are there any problems with the Hyalella azteca biomass endpoint tests
          that  would preclude their use as an empirical line of evidence in the baseline
          ecological risk assessment for Portland Harbor?

          Response:   No. The biomass endpoint is a useful endpoint for evaluating
            effects on benthic invertebrates associated with exposure to contaminated
            sediments.  While we have  only recently started to use  the  biomass
            endpoint,  our experience at other sites indicates that this endpoint can be
            among the most useful endpoints relative to quantifying the relationships
            between COPC concentrations in sediments and the responses of toxicity
            test organisms.  By  integrating the survival and weight endpoints, the
            biomass endpoint can provide useful information for evaluating the effects
            on amphipods associated with exposure to contaminated sediments at the
            Portland Harbor site. This endpoint is particularly useful for evaluating
            sediment samples that have marginal hits for one or both of the underlying
            endpoints (survival and weight).
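
             The way the biomass endpoint integrates survival and weight can be shown
             with a small Python sketch.  It assumes that biomass is calculated as
             the number of survivors multiplied by their mean individual weight;
             test protocols may define the endpoint somewhat differently, and the
             replicate values below are hypothetical rather than site data.

```python
def biomass(n_survivors, mean_weight_mg):
    """Total weight of surviving organisms in a replicate (mg)."""
    return n_survivors * mean_weight_mg


def percent_of_control(test_value, control_value):
    """Express a test response as a percentage of the control response."""
    return 100.0 * test_value / control_value


# Hypothetical replicate results (not site data).
control = {"survivors": 10, "weight_mg": 0.40}
test = {"survivors": 8, "weight_mg": 0.32}

survival_pct = percent_of_control(test["survivors"], control["survivors"])
weight_pct = percent_of_control(test["weight_mg"], control["weight_mg"])
biomass_pct = percent_of_control(
    biomass(test["survivors"], test["weight_mg"]),
    biomass(control["survivors"], control["weight_mg"]),
)
print(f"survival {survival_pct:.0f}%, weight {weight_pct:.0f}%, "
      f"biomass {biomass_pct:.0f}%")
```

             In this hypothetical case survival and weight are each 80% of the
             control, which might be treated as marginal on their own, while biomass
             is 64% of the control.  This is the sense in which the combined endpoint
             can help with samples that are marginal hits on the underlying endpoints.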
      Question 4: Are there any reasons the Hyalella azteca biomass endpoint empirical
         results should not be used in the floating percentile models under development for
         Portland Harbor?

         Response:    No.    We  have  used the biomass endpoint  to  develop
             concentration-response relationships for a variety of COPCs and COPC
             mixtures. As indicated in Section 4.7 of MacDonald and Landrum (2008),
             the key is to evaluate the reliability and predictive ability of the resultant
             models and  the  associated toxicity thresholds.  The results  of  such
             evaluations will provide the information needed to determine if the models
             developed using this endpoint are appropriate for use in the BERA and/or
             the establishment of PRGs for the  site.
A3.0 Application of Regional Sediment Evaluation Team (RSET) Process
      to the Portland Harbor Site
      The RSET process was initiated in 2002  to update the Lower Columbia dredged
      material evaluation framework (DMEF).  More specifically, RSET was established
      to revise and develop sediment evaluation procedures for the region.  This process
      was intended to result in the development of a northwest regional sediment evaluation
      framework that could be used by federal and state agencies in Region 10.  As part of
      this effort, RSET is in the process of evaluating the protectiveness of the current suite
      of bioassays, reviewing and refining biological interpretive criteria, and reviewing and
      refining sediment screening levels.

      Based on our cursory review, the RSET process has the potential to provide useful
      advice and  guidance relative to the evaluation of dredged  materials  and  other
      sediments. Therefore, it is reasonable to review the results of the RSET process and
      assess their applicability to the Portland Harbor site.  However, it is important to
      remember that the narrative intent of the sediment screening levels that emerge from
      the RSET process may not be consistent with the remedial action objectives (RAOs)
      that are established for the Portland  Harbor site. Similarly, guidance provided by
      RSET relative to the interpretation of toxicity test results may not be consistent with
      the RAOs.  Therefore, the tools that are ultimately used to evaluate risks to  the
      benthic invertebrate  community should be selected to meet  site assessment and
      management needs at the Portland Harbor site. In our view, there is no need for site
      assessment activities to be entirely consistent with RSET guidance or RSET decisions
      regarding data utilization or interpretation.
A4.0 Development of a Reference Envelope for Portland Harbor
      Section 4.3  of MacDonald  and  Landrum (2008) describes the recommended
      procedures for developing a reference envelope for interpreting whole-sediment
      toxicity data from the Portland Harbor site.  This section of the original document did
      not provide sufficient detail to enable risk assessors to establish a reference envelope
      for the site.   The following information is provided  to assist  readers in better
      understanding our recommendations for developing a reference envelope for Portland
      Harbor.

A4.1 Approaches to  Selecting Reference Locations
      In general, candidate reference locations should be established on an a priori basis,
      based on an understanding of the water body under investigation and the existing data
      on sediment quality conditions. According to ASTM (2007), a reference sediment
      sample is defined as whole sediment obtained near an area of concern that is used
      to assess sediment conditions exclusive of the materials of interest.  Therefore,  candidate
      reference locations should be selected based on  their proximity to the study area,
      using, at minimum, information on whole-sediment chemistry.

      At the Portland Harbor site, several options are available for identifying  candidate
      reference locations.   First, the sediment samples  that were  collected at  the six
      locations in upstream areas can be considered for use as reference sediment samples.
      In addition, it may be possible to identify reference sediment samples from the
      samples  that have been collected to date from the Portland Harbor site.  Finally,
      additional candidate reference locations could be identified in upstream areas, within
      the site boundaries, in downstream areas, in tributaries, or in the Columbia River. In
      all cases, the whole-sediment chemistry and whole-sediment toxicity data collected
      at candidate reference locations would need to be reviewed to determine if the sample
      qualifies as a reference sample [see Section 4.3 of MacDonald and Landrum (2008)
      and below for criteria for evaluating candidate reference sediment samples].  Only
      those samples that meet the evaluation criteria should be included in the data set used
      to develop  the reference envelope.

      A tiered process is recommended for identifying candidate reference locations for the
      Portland Harbor BERA.  As a first step, the desired number of reference sediment
      samples  for developing the reference envelope should be selected. Based on our
      experience, about 15 sediment samples are  required to adequately characterize
      variability  in the responses of toxicity test organisms associated with exposure to
      reference sediments. Then, the six sediment samples that were collected upstream of
      the site should be evaluated to determine if they qualify  as reference  sediment
      samples. Subsequently, sediment samples from within the study area that meet the
      evaluation  criteria [presented  in Section 4.3 of MacDonald and Landrum (2008) and
      refined below] should be identified and their locations plotted. Clusters of samples
      with low chemistry should be selected preferentially as reference samples (i.e., rather
      than isolated  samples) because  such clustering increases confidence that  the
      sediments in that geographic area do not contain elevated levels of COPCs.  If
      sufficient numbers of reference samples are not identified using the first two
      approaches, then it may be necessary to collect additional sediment samples to obtain
      sufficient data to develop the reference envelope. Because additional sampling would
      require  additional time and resources, this  option would be pursued only if the
      requisite data are not already available from  within the existing data set.
A4.2 Criteria for Identifying Reference Sediment Samples
      The recommended criteria for identifying reference sediment samples are presented
      in Section 4.3 of MacDonald and Landrum (2008). These criteria specified the
      chemical and biological characteristics of sediment samples that would qualify for
      inclusion in a reference envelope. We have further reviewed these criteria and would
      like to offer the following refinements (Note: Refinements are shown in bold italics):

         Whole-Sediment Chemistry

         •  All measured metals, PAHs, DDTs, and PCBs occur at concentrations
            below conservative SQGs;
         •  Mean PEC-QDW < 0.1;
         •  ΣESB-TUPAH < 0.1; and
         •  (ΣSEM-AVS)/foc < 130.

         Whole-Sediment Toxicity

         •  Control-adjusted response rate should not exceed the minimum significant
            difference (MSD) for each toxicity test endpoint; or,
         •  In the absence of MSD values, control-adjusted response rate should not
            exceed the Tier II levels applied in the NSI (USEPA 2004);

         Pore-Water Chemistry

         •  Total ammonia (NH4+ + NH3), unionized ammonia (NH3), and hydrogen
            sulfide (H2S) concentrations in pore  water should not exceed lowest
            observed effect levels (LOELs) based on the results of water-only toxicity
            tests conducted with  each of the toxicity test organisms.

      Consideration of these  additional criteria is important for several reasons.  First,
      DDTs have been  identified  as  COPCs  in portions of the  Portland Harbor site.
      Therefore, concentrations of DDTs (i.e., sum DDD, sum DDE, sum DDT, and total
      DDTs) should be considered in the selection of reference sediment samples (i.e., DDT
      levels should not exceed conservative sediment quality guidelines). In addition,
      sediment  sampling at the Portland Harbor site targeted the 0 to 30 cm sediment
      horizon.  This horizon  likely encompasses both the biologically-active zone (i.e.,
      typically defined as the top 10 cm of material) and the zone of limited biological
      activity (i.e., deeper sediments; 10-30 cm).  Because anoxic sediments were likely
      included in many of the sediment samples collected at the site,  it is possible that
      toxicity test organisms could have responded to ammonia and/or hydrogen sulfide in
      a portion of the samples  (i.e., these substances could have contributed to the observed
      toxicity).  The reference sediment samples that are selected should reflect conditions
      in the biologically-active  zone at the site,  rather than  conditions that  benthic
      invertebrates at the site would not normally be exposed to.  Therefore, samples
      selected to represent reference conditions should not have elevated levels of ammonia
      or hydrogen sulfide in pore water.
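
      The screening criteria listed in this section can be sketched as a single check,
      shown below in Python.  The numeric limits (0.1, 0.1, and 130) are those listed
      above; the class and field names, the use of a ratio below 1.0 to represent
      "below conservative SQGs", and the example values are assumptions.  The fallback
      to the NSI Tier II levels when MSD values are unavailable is not implemented in
      this sketch.

```python
from dataclasses import dataclass


@dataclass
class CandidateSample:
    max_sqg_ratio: float    # highest concentration / conservative SQG ratio
    mean_pec_q: float       # mean PEC-Q (dry weight basis)
    sum_esb_tu_pah: float   # sum of ESB toxic units for PAHs
    sem_avs_per_foc: float  # (sum SEM - AVS) / fraction organic carbon
    effect: dict            # endpoint -> control-adjusted response
    msd: dict               # endpoint -> minimum significant difference
    pore_water: dict        # analyte -> measured pore-water concentration
    loel: dict              # analyte -> lowest observed effect level


def qualifies_as_reference(s):
    """Apply the chemistry, toxicity, and pore-water criteria to one sample."""
    chemistry_ok = (
        s.max_sqg_ratio < 1.0
        and s.mean_pec_q < 0.1
        and s.sum_esb_tu_pah < 0.1
        and s.sem_avs_per_foc < 130
    )
    toxicity_ok = all(s.effect[e] <= s.msd[e] for e in s.effect)
    pore_water_ok = all(s.pore_water[a] <= s.loel[a] for a in s.pore_water)
    return chemistry_ok and toxicity_ok and pore_water_ok


# Hypothetical candidate sample (values are placeholders, not site data).
sample = CandidateSample(
    max_sqg_ratio=0.6, mean_pec_q=0.05, sum_esb_tu_pah=0.02, sem_avs_per_foc=40,
    effect={"amphipod survival": 5.0}, msd={"amphipod survival": 10.0},
    pore_water={"unionized ammonia": 0.1}, loel={"unionized ammonia": 0.5},
)
print(qualifies_as_reference(sample))  # True
```
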
A5.0 Development of Clean-up Goals for Portland Harbor
      It is our observation that the LRM and FPM models that have been developed to date
      for the Portland Harbor site are explicitly intended to support evaluation of risks to
      benthic invertebrates associated with exposure to contaminated sediments.  That is,
      the toxicity thresholds developed using the models are intended to classify sediment
      samples into categories based on the probability that the sample will  be  toxic to
      benthic invertebrates. This is an appropriate use of the models. However, there is
      also a need to establish clean-up  goals for the site to support efforts under the FS
      (e.g., PRGs). It is not clear that the existing models will provide a reliable basis for
      establishing site-wide clean-up goals for Portland Harbor.  The models are likely to
      be limited in this  respect for several reasons, including:

          •  The  sampling strategy  selected  for the site may  have  resulted  in
            interferences that  complicate interpretation of the sediment toxicity data
            (e.g., elevated ammonia and/or hydrogen sulfide levels may occur in a
            portion of the samples); and,

         •  The site appears to have multiple concentration gradients for multiple
            COPCs. As a result, clear relationships between COPC concentrations and
            sediment toxicity may not be evident on a site-wide basis.

      For these reasons, an alternate approach may be required to establish clean-up goals for
      the site. For example, ammonia and hydrogen sulfide could be incorporated into the
      chemical mixture models that are developed for the site. In addition or alternatively,
      the site could be divided into multiple areas of interest, each of which has an apparent
      gradient for key COPCs and/or COPC mixtures (e.g., PCBs, DDT, PAHs, etc.).
      Then, area of interest-specific models could be developed for the key COPCs/COPC
      mixtures and the reliability of the toxicity thresholds developed using those models
      could be evaluated.  Another option involves selection of clean-up goals for key
      COPCs and COPC mixtures based on the clean-up goals that have been established
      for sites where these contaminants are the principal COPCs (e.g., 1 ppm for total
      PCBs).  Virtual remediation techniques could  be used to evaluate residual risks to
      ecological receptors if such clean-up goals were adopted at the Portland Harbor site
      (i.e., by calculating post-remediation surface-weighted average concentrations of key
      COPCs/COPC mixtures by area of interest). The point is that different approaches
      could and possibly should be used to develop toxicity thresholds for use in the BERA
      and PRGs for use in the FS.
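
      The virtual remediation calculation mentioned above can be sketched in Python as
      follows.  The sketch assumes that each sampled area is represented by a polygon
      with a known area and a single surface concentration, and that remediated
      polygons are assigned a replacement concentration; the polygon values, the
      replacement concentration, and the function names are hypothetical.  The 1 ppm
      clean-up goal for total PCBs is the example value given in the text.

```python
def swac(polygons):
    """Surface-weighted average concentration.

    polygons: list of (area, concentration) pairs for one area of interest.
    """
    polygons = list(polygons)
    total_area = sum(area for area, _ in polygons)
    return sum(area * conc for area, conc in polygons) / total_area


def virtual_remediation(polygons, cleanup_goal, replacement_conc):
    """Replace concentrations above the clean-up goal with a post-remedy value."""
    return [
        (area, replacement_conc if conc > cleanup_goal else conc)
        for area, conc in polygons
    ]


# Hypothetical polygons: (area in m2, total PCBs in ppm).
area_of_interest = [(100.0, 0.2), (50.0, 3.0), (50.0, 0.8)]

before = swac(area_of_interest)
after = swac(virtual_remediation(area_of_interest, cleanup_goal=1.0,
                                 replacement_conc=0.05))
print(f"SWAC before: {before:.2f} ppm, after: {after:.2f} ppm")
```

      The post-remediation value would then be compared against the toxicity
      thresholds or other benchmarks to judge residual risk for that area of interest.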
A6.0 References
      ASTM (American Society for Testing and Materials).  2007.  Test  method  for
         measuring  the toxicity of sediment-associated contaminants with freshwater
         invertebrates.  E1706-05.  In:  ASTM Annual Book of Standards, Vol. 11.06.
         West Conshohocken, Pennsylvania.

      Boudreau, B.P. 1998. Mean mixed depth of sediments: the wherefore and the why.
         Limnology and Oceanography 43:524-526.

      MacDonald, D.D. and P.F. Landrum. 2008.  An Evaluation of the Approach  for
         Assessing Risks to the Benthic Invertebrate Community at the Portland Harbor
         Superfund  Site.  Preliminary Draft.  Prepared for United States Environmental
         Protection Agency, Region 10, Seattle, Washington and Parametrix, Inc.,
         Albany, Oregon.  Prepared by MacDonald Environmental Sciences Ltd.,
         Nanaimo, British Columbia and Landrum and Associates, Ann Arbor, Michigan.

      MacDonald, D.D., R.L. Breton, K. Edelmann, M.S. Goldberg, C.G. Ingersoll, R.A.
         Lindskoog, D.B. MacDonald, D.R.J. Moore, A.V. Pawlitz, D.E. Smorong, and
         R.P. Thompson. 2003. Development and evaluation of preliminary remediation
         goals for selected contaminants of concern at the Calcasieu Estuary cooperative
         site,  Lake Charles,  Louisiana.  Prepared  for  United  States  Environmental
         Protection Agency, Region 6. Dallas, Texas.

      MacDonald, D.D., C.G. Ingersoll, D.E. Smorong, L. Fisher, C. Huntington, and G.
         Braun.    2005a.   Development  and  evaluation  of risk-based  preliminary
         remediation goals for selected sediment-associated contaminants of concern in the
         West Branch of the Grand Calumet River. Prepared for:  United States Fish and
         Wildlife Service.  Bloomington, Indiana.

      MacDonald, D.D., C.G. Ingersoll, A.D. Porter, S.B. Black, C. Miller, and Y.K. Muirhead.
         2005b. Development and evaluation of preliminary remediation goals for aquatic
         receptors in the Indiana Harbor Area of Concern. Technical Report. Prepared for:
         United States Fish and  Wildlife Service.   Bloomington, Indiana and Indiana
         Department of Environmental Management. Indianapolis, Indiana.

      Wenning, R.J., G.E. Batley, C.G. Ingersoll, and D.W. Moore. 2002. Use of sediment
         quality guidelines and related tools for the assessment of contaminated sediments.
         SETAC Press. Pensacola, Florida.

      White, D.S. and M.F. Miller.  2008.  Benthic invertebrate activity in lakes: linking
         present and historical bioturbation patterns.  Aquatic Biology 2:269-271.

      USEPA (United States Environmental Protection Agency). 2004. The incidence and
         severity of sediment contamination in surface waters of  the United States.
         National sediment quality survey: Second edition (updated). EPA 823-R-02-013.
         Office of Research and Development. Washington, District of Columbia.

Table A1. Reliability of the sediment toxicity thresholds (STTs) that were derived based on the results of 28-day toxicity tests with the amphipod,
          Hyalella azteca, and the mussel, Lampsilis siliquoidea (Endpoints: survival and biomass).

                       Toxicity Test                          Incidence of     Incidence of     Correct         Incidence of    Incidence of     Incidence of     Correct
COPC/COPC              Endpoint Used to                       Toxicity         Toxicity         Classification  Toxicity        Toxicity         Toxicity         Classification
Mixture                Derive STT        n    T10     T20     Below T10        Above T10        Rate for T10    T10-T20         Below T20        Above T20        Rate for T20

Basis for T10/T20: 28-d H. azteca Survival
Cadmium                Amphipod 28-d S                        11% (5 of 45)    61% (19 of 31)   78%             60% (6 of 10)   20% (11 of 55)   62% (13 of 21)   75%
Lead                   Amphipod 28-d S                                                                          33% (3 of 9)
Zinc                   Amphipod 28-d S                                                                          ND
ΣSEM-AVS               Amphipod 28-d S                                                                          29% (2 of 7)
Mean PEC-Q             Amphipod 28-d S                                                                          83% (5 of 6)    19% (10 of 53)   61% (14 of 23)   75%
Mean PEC-QMETALS       Amphipod 28-d S                                                                          67% (6 of 9)    19% (10 of 54)   64% (14 of 22)   76%

Basis for T10/T20: 28-d L. siliquoidea Survival
Copper                 Mussel 28-d S     48   116     141     37% (17 of 46)   100% (2 of 2)    65%             ND              37% (17 of 46)   100% (2 of 2)    65%
Zinc                   Mussel 28-d S     48   20600   23700   38% (17 of 45)   67% (2 of 3)     63%             ND              38% (17 of 45)   67% (2 of 3)     63%
ΣSEM-AVS               Mussel 28-d S     48   38.5    64.1    28% (11 of 40)   100% (8 of 8)    77%             100% (5 of 5)   36% (16 of 45)   100% (3 of 3)    67%
Mean PEC-QMETALS       Mussel 28-d S     48   6.03    10.7    33% (14 of 42)   83% (5 of 6)     69%             100% (2 of 2)   36% (16 of 44)   75% (3 of 4)     65%
Mean PEC-QMETALS(OC)   Mussel 28-d S     48   482     621     33% (14 of 43)   100% (5 of 5)    71%             100% (3 of 3)   37% (17 of 46)   100% (2 of 2)    65%

Basis for T10/T20: 28-d L. siliquoidea Biomass
Copper                 Mussel 28-d B     48   33.4    47.4    5% (2 of 41)     71% (5 of 7)     92%             75% (3 of 4)    11% (5 of 45)    67% (2 of 3)     88%
Lead                   Mussel 28-d B     48   1085    1351    7% (3 of 41)     57% (4 of 7)     88%             33% (1 of 3)    9% (4 of 44)     75% (3 of 4)     90%
ΣSEM-AVS               Mussel 28-d B     48   41.7    52.8    7% (3 of 42)     67% (4 of 6)     90%             0% (0 of 2)     7% (3 of 44)     100% (4 of 4)    94%
Mean PEC-QMETALS       Mussel 28-d B     48   7.57    10.3    9% (4 of 44)     75% (3 of 4)     90%             ND              9% (4 of 44)     75% (3 of 4)     90%
Mean PEC-QMETALS(OC)   Mussel 28-d B     48   449     490     5% (2 of 42)     83% (5 of 6)     94%             50% (1 of 2)    7% (3 of 44)     100% (4 of 4)    94%

-d S = -day survival; -d B = -day biomass; n = number of samples.
COPC = chemical of potential concern; PEC-Q = probable effect concentration-quotient; SEM-AVS = simultaneously extracted metals minus acid volatile sulfides; ND = No data.

Bolded results indicate that the toxicity threshold met the individual evaluation criteria for the T10-value, T20-value, or correct classification rate;
shaded results indicate toxicity thresholds that meet all three criteria.

-------
Table A2. Predictive ability of the sediment toxicity thresholds (STTs) that were derived based on the results of 28-day toxicity tests with the amphipod,
         Hyalella azteca, and the mussel, Lampsilis siliquoidea (Endpoints: survival and biomass).
COPC/COPC Mixture
Toxicity Test Endpoint Used to Derive STT
Basis for T10/T20: 28-d
Cadmium
Cadmium
Cadmium
Cadmium
Cadmium
Cadmium
Lead
Lead
Lead
Lead
Lead
Lead
Mean PEC-Q
Mean PEC-Q
Mean PEC-Q
Mean PEC-Q
Mean PEC-Q
Mean PEC-Q
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
H. azteca Survival
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Incidence of Toxicity
n

76
76
48
48
76
76
76
76
48
48
76
76
76
76
48
48
76
76
76
76
48
48
76
76
T10

11.1
11.1
11.1
11.1
11.1
11.1
150
150
150
150
150
150
0.556
0.556
0.556
0.556
0.556
0.556
1.
1.
1.
1.
1.
1.
T20

17.3
17.3
17.3
17.3
17.3
17.3
219
219
219
219
219
219
0.732
0.732
0.732
0.732
0.732
0.732
1.78
1.78
1.78
1.78
1.78
1.78
T10
Correct Classification Rate for T10
T10-T20

61% (19 of 31)
42% (13 of 31)
70% (14 of 20)
30% (6 of 20)
45% (14 of 31)
52% (16 of 31)
65% (20 of 31)
48% (15 of 31)
62% (13 of 21)
29% (6 of 21)
48% (15 of 31)
48% (15 of 31)
66% (19 of 29)
48% (14 of 29)
65% (13 of 20)
30% (6 of 20)
45% (13 of 29)
48% (14 of 29)
65% (20 of 31)
45% (14 of 31)
67% (14 of 21)
29% (6 of 21)
45% (14 of 31)
52% (16 of 31)

78%
72%
77%
69%
64%
74%
80%
78%
71%
67%
67%
71%
80%
78%
73%
69%
64%
71%
80%
75%
75%
67%
64%
74%

60% (6 of 10)
20% (2 of 10)
75% (3 of 4)
25% (1 of 4)
40% (4 of 10)
30% (3 of 10)
33% (3 of 9)
33% (3 of 9)
40% (2 of 5)
0% (0 of 5)
33% (3 of 9)
33% (3 of 9)
83% (5 of 6)
50% (3 of 6)
0% (0 of 2)
0% (0 of 2)
50% (3 of 6)
17% (1 of 6)
67% (6 of 9)
33% (3 of 9)
25% (1 of 4)
0% (0 of 4)
44% (4 of 9)
33% (3 of 9)
T20
Correct Classification Rate for T20

62% (13 of 21)
52% (11 of 21)
69% (11 of 16)
31% (5 of 16)
48% (10 of 21)
62% (13 of 21)
77% (17 of 22)
55% (12 of 22)
69% (11 of 16)
38% (6 of 16)
55% (12 of 22)
55% (12 of 22)
61% (14 of 23)
48% (11 of 23)
72% (13 of 18)
33% (6 of 18)
43% (10 of 23)
57% (13 of 23)
64% (14 of 22)
50% (11 of 22)
76% (13 of 17)
35% (6 of 17)
45% (10 of 22)
59% (13 of 22)

75%
80%
73%
73%
67%
79%
84%
82%
73%
77%
71%
75%
75%
78%
77%
73%
64%
76%
76%
79%
79%
75%
66%
78%

-------
Table A2. Predictive ability of the sediment toxicity thresholds (STTs) that were derived based on the results of 28-day toxicity tests with the amphipod,
         Hyalella azteca, and the mussel, Lampsilis siliquoidea (Endpoints: survival and biomass).1
COPC/COPC Mixture
Toxicity Test Endpoint Used to Derive STT
Basis for T10/T20: 28-d
ΣSEM-AVS
ΣSEM-AVS
ΣSEM-AVS
ΣSEM-AVS
ΣSEM-AVS
ΣSEM-AVS
Zinc
Zinc
Zinc
Zinc
Zinc
Zinc
Basis for T10/T20: 28-d
Copper
Copper
Copper
Copper
Copper
Copper
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Incidence of Toxicity
n
T10
T20
T10
Correct Classification Rate for T10
T10-T20
T20
Correct Classification Rate for T20
H. azteca Survival (cont.)
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
76
76
48
48
76
76
76
76
48
48
76
76
7.82
7.82
7.82
7.82
7.82
7.82
2083
2083
2083
2083
2083
2083
13.7
13.7
13.7
13.7
13.7
13.7
2949
2949
2949
2949
2949
2949
11% (5 of 44)
7% (3 of 44)
17% (5 of 29)
3% (1 of 29)
20% (9 of 44)
9% (4 of 44)
13% (7 of 53)
6% (3 of 53)
22% (7 of 32)
3% (1 of 32)
21% (11 of 53)
13% (7 of 53)
59% (19 of 32)
41% (13 of 32)
74% (14 of 19)
32% (6 of 19)
47% (15 of 32)
53% (17 of 32)
74% (17 of 23)
57% (13 of 23)
75% (12 of 16)
38% (6 of 16)
57% (13 of 23)
61% (14 of 23)
76%
71%
79%
71%
66%
75%
83%
83%
77%
77%
72%
79%
29% (2 of 7)
14% (1 of 7)
0% (0 of 2)
0% (0 of 2)
14% (1 of 7)
43% (3 of 7)
ND
ND
ND
ND
ND
ND
14% (7 of 51)
8% (4 of 51)
16% (5 of 31)
3% (1 of 31)
20% (10 of 51)
14% (7 of 51)
13% (7 of 53)
6% (3 of 53)
22% (7 of 32)
3% (1 of 32)
21% (11 of 53)
13% (7 of 53)
68% (17 of 25)
48% (12 of 25)
82% (14 of 17)
35% (6 of 17)
56% (14 of 25)
56% (14 of 25)
74% (17 of 23)
57% (13 of 23)
75% (12 of 16)
38% (6 of 16)
57% (13 of 23)
61% (14 of 23)
80%
78%
83%
75%
72%
76%
83%
83%
77%
77%
72%
79%
L. siliquoidea Survival
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
75
75
48
48
75
75
75
75
48
48
116
116
116
116
116
116
6.03
6.03
6.03
6.03
141
141
141
141
141
141
10.7
10.7
10.7
10.7
29% (21 of 73)
18% (13 of 73)
37% (17 of 46)
11% (5 of 46)
29% (21 of 73)
26% (19 of 73)
23% (15 of 66)
11% (7 of 66)
33% (14 of 42)
7% (3 of 42)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
89% (8 of 9)
89% (8 of 9)
83% (5 of 6)
67% (4 of 6)
72%
83%
65%
90%
72%
75%
79%
89%
69%
90%
ND
ND
ND
ND
ND
ND
100% (4 of 4)
100% (4 of 4)
100% (2 of 2)
50% (1 of 2)
29% (21 of 73)
18% (13 of 73)
37% (17 of 46)
11% (5 of 46)
29% (21 of 73)
26% (19 of 73)
27% (19 of 70)
16% (11 of 70)
36% (16 of 44)
9% (4 of 44)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
80% (4 of 5)
80% (4 of 5)
75% (3 of 4)
75% (3 of 4)
72%
83%
65%
90%
72%
75%
73%
84%
65%
90%

-------
Table A2. Predictive ability of the sediment toxicity thresholds (STTs) that were derived based on the results of 28-day toxicity tests with the amphipod,
         Hyalella azteca,  and the mussel, Lampsilis siliquoidea (Endpoints: survival and biomass).
COPC/COPC Mixture
Toxicity Test
Endpoint Used to
Derive STT
Incidence of Toxicity
n
T10
T20
T10
Correct Classification Rate for T10
T10-T20
T20
Correct Classification Rate for T20
Basis for T10/T20: 28-d H. azteca Survival (cont.)
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS(OC)
Mean PEC-QMETALS(OC)
Mean PEC-QMETALS(OC)
Mean PEC-QMETALS(OC)
Mean PEC-QMETALS(OC)
Mean PEC-QMETALS(OC)
ΣSEM-AVS
ΣSEM-AVS
ΣSEM-AVS
ΣSEM-AVS
ΣSEM-AVS
ΣSEM-AVS
Zinc
Zinc
Zinc
Zinc
Zinc
Zinc
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
75
75
75
75
48
48
75
75
76
76
48
48
76
76
75
75
48
48
75
75
6.03
6.03
482
482
482
482
482
482
38.5
38.5
38.5
38.5
38.5
38.5
20600
20600
20600
20600
20600
20600
10.7
10.7
621
621
621
621
621
621
64.1
64.1
64.1
64.1
64.1
64.1
23700
23700
23700
23700
23700
23700
26% (17 of 66)
21% (14 of 66)
21% (14 of 66)
9% (6 of 66)
33% (14 of 43)
5% (2 of 43)
23% (15 of 66)
18% (12 of 66)
26% (18 of 68)
16% (11 of 68)
28% (11 of 40)
5% (2 of 40)
29% (20 of 68)
25% (17 of 68)
28% (20 of 71)
17% (12 of 71)
38% (17 of 45)
11% (5 of 45)
28% (20 of 71)
25% (18 of 71)
67% (6 of 9)
78% (7 of 9)
100% (9 of 9)
100% (9 of 9)
100% (5 of 5)
100% (5 of 5)
89% (8 of 9)
100% (9 of 9)
75% (6 of 8)
63% (5 of 8)
100% (8 of 8)
63% (5 of 8)
50% (4 of 8)
50% (4 of 8)
75% (3 of 4)
75% (3 of 4)
67% (2 of 3)
67% (2 of 3)
75% (3 of 4)
75% (3 of 4)
73%
79%
81%
92%
71%
96%
79%
84%
74%
82%
77%
90%
68%
72%
72%
83%
63%
88%
72%
75%
50% (2 of 4)
75% (3 of 4)
100% (4 of 4)
100% (4 of 4)
100% (3 of 3)
100% (3 of 3)
100% (4 of 4)
100% (4 of 4)
60% (3 of 5)
40% (2 of 5)
100% (5 of 5)
40% (2 of 5)
20% (1 of 5)
20% (1 of 5)
100% (1 of 1)
100% (1 of 1)
ND
ND
100% (1 of 1)
100% (1 of 1)
27% (19 of 70)
24% (17 of 70)
26% (18 of 70)
14% (10 of 70)
37% (17 of 46)
11% (5 of 46)
27% (19 of 70)
23% (16 of 70)
29% (21 of 73)
18% (13 of 73)
36% (16 of 45)
9% (4 of 45)
29% (21 of 73)
25% (18 of 73)
29% (21 of 72)
18% (13 of 72)
38% (17 of 45)
11% (5 of 45)
29% (21 of 72)
26% (19 of 72)
80% (4 of 5)
80% (4 of 5)
100% (5 of 5)
100% (5 of 5)
100% (2 of 2)
100% (2 of 2)
80% (4 of 5)
100% (5 of 5)
100% (3 of 3)
100% (3 of 3)
100% (3 of 3)
100% (3 of 3)
100% (3 of 3)
100% (3 of 3)
67% (2 of 3)
67% (2 of 3)
67% (2 of 3)
67% (2 of 3)
67% (2 of 3)
67% (2 of 3)
73%
76%
76%
87%
65%
90%
73%
79%
72%
83%
67%
92%
72%
76%
71%
81%
63%
88%
71%
73%

-------
Table A2. Predictive ability of the sediment toxicity thresholds (STTs) that were derived based on the results of 28-day toxicity tests with the amphipod,
         Hyalella azteca,  and the mussel, Lampsilis siliquoidea (Endpoints: survival and biomass).1
COPC/COPC Mixture
Toxicity Test
Endpoint Used to
Derive STT
Incidence of Toxicity
n
T10
T20
T10
Correct Classification Rate for T10
T10-T20
T20
Correct Classification Rate for T20
Basis for T10/T20: 28-d L. siliquoidea Biomass
Copper
Copper
Copper
Copper
Copper
Copper
Lead
Lead
Lead
Lead
Lead
Lead
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS
Mean PEC-QMETALS(OC)
Mean PEC-QMETALS(OC)
Mean PEC-QMETALS(OC)
Mean PEC-QMETALS(OC)
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
Midge 10-d S
Midge 10-d B
Amphipod 28-d S
Amphipod 28-d B
Mussel 28-d S
Mussel 28-d B
75
75
48
48
75
75
75
75
48
48
75
75
75
75
48
48
75
75
75
75
48
48
33.4
33.4
33.4
33.4
33.4
33.4
1085
1085
1085
1085
1085
1085
7.57
7.57
7.57
7.57
7.57
7.57
449
449
449
449
47.4
47.4
47.4
47.4
47.4
47.4
1351
1351
1351
1351
1351
1351
10.3
10.3
10.3
10.3
10.3
10.3
490
490
490
490
19% (12 of 63)
10% (6 of 63)
32% (13 of 41)
5% (2 of 41)
25% (16 of 63)
21% (13 of 63)
24% (16 of 66)
12% (8 of 66)
34% (14 of 41)
7% (3 of 41)
26% (17 of 66)
21% (14 of 66)
25% (17 of 68)
13% (9 of 68)
36% (16 of 44)
9% (4 of 44)
26% (18 of 68)
22% (15 of 68)
20% (13 of 65)
8% (5 of 65)
31% (13 of 42)
5% (2 of 42)
92% (11 of 12)
75% (9 of 12)
86% (6 of 7)
71% (5 of 7)
58% (7 of 12)
67% (8 of 12)
78% (7 of 9)
78% (7 of 9)
71% (5 of 7)
57% (4 of 7)
67% (6 of 9)
78% (7 of 9)
86% (6 of 7)
86% (6 of 7)
75% (3 of 4)
75% (3 of 4)
71% (5 of 7)
86% (6 of 7)
100% (10 of 10)
100% (10 of 10)
100% (6 of 6)
83% (5 of 6)
83%
88%
71%
92%
72%
77%
76%
87%
67%
88%
73%
79%
76%
87%
65%
90%
73%
79%
83%
93%
73%
94%
100% (5 of 5)
80% (4 of 5)
100% (4 of 4)
75% (3 of 4)
60% (3 of 5)
60% (3 of 5)
75% (3 of 4)
75% (3 of 4)
67% (2 of 3)
33% (1 of 3)
50% (2 of 4)
50% (2 of 4)
100% (2 of 2)
100% (2 of 2)
ND
ND
50% (1 of 2)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
100% (2 of 2)
50% (1 of 2)
25% (17 of 68)
15% (10 of 68)
38% (17 of 45)
11% (5 of 45)
28% (19 of 68)
24% (16 of 68)
27% (19 of 70)
16% (11 of 70)
36% (16 of 44)
9% (4 of 44)
27% (19 of 70)
23% (16 of 70)
27% (19 of 70)
16% (11 of 70)
36% (16 of 44)
9% (4 of 44)
27% (19 of 70)
24% (17 of 70)
22% (15 of 67)
10% (7 of 67)
34% (15 of 44)
7% (3 of 44)
86% (6 of 7)
71% (5 of 7)
67% (2 of 3)
67% (2 of 3)
57% (4 of 7)
71% (5 of 7)
80% (4 of 5)
80% (4 of 5)
75% (3 of 4)
75% (3 of 4)
80% (4 of 5)
100% (5 of 5)
80% (4 of 5)
80% (4 of 5)
75% (3 of 4)
75% (3 of 4)
80% (4 of 5)
80% (4 of 5)
100% (8 of 8)
100% (8 of 8)
100% (4 of 4)
100% (4 of 4)
76%
84%
63%
88%
71%
76%
73%
84%
65%
90%
73%
79%
73%
84%
65%
90%
73%
76%
80%
91%
69%
94%

-------
Table A2.  Predictive ability of the sediment toxicity thresholds (STTs) that were derived based on the results of 28-day toxicity tests with the amphipod,
           Hyalella azteca, and the mussel,  Lampsilis siliquoidea  (Endpoints: survival and biomass).



Basis for T10/T20: 28-d L. siliquoidea Biomass (cont.)

                                                                  Incidence of Toxicity
COPC/COPC Mixture      Toxicity Test        n     T10     T20     T10               Correct            T10-T20         T20               Correct
                       Endpoint Used to                                             Classification                                       Classification
                       Derive STT                                                   Rate for T10                                         Rate for T20

Mean PEC-QMETALS(OC)   Midge 10-d S         75    449     490     80% (8 of 10)     77%                50% (1 of 2)    88% (7 of 8)      77%
Mean PEC-QMETALS(OC)   Midge 10-d B         75    449     490     90% (9 of 10)     83%                50% (1 of 2)    100% (8 of 8)     83%
ΣSEM-AVS               Amphipod 28-d S      76    41.7    52.8    67% (4 of 6)      71%                0% (0 of 2)     100% (4 of 4)     74%
ΣSEM-AVS               Amphipod 28-d B      76    41.7    52.8    50% (3 of 6)      79%                0% (0 of 2)     75% (3 of 4)      82%
ΣSEM-AVS               Mussel 28-d S        48    41.7    52.8    100% (6 of 6)     73%                100% (2 of 2)   100% (4 of 4)     69%
ΣSEM-AVS               Mussel 28-d B        48    41.7    52.8    67% (4 of 6)      90%                0% (0 of 2)     100% (4 of 4)     94%
ΣSEM-AVS               Midge 10-d S         76    41.7    52.8    50% (3 of 6)      68%                0% (0 of 2)     75% (3 of 4)      71%
ΣSEM-AVS               Midge 10-d B         76    41.7    52.8    50% (3 of 6)      72%                0% (0 of 2)     75% (3 of 4)      75%

-d S = -day survival; -d B = -day biomass; n = number of samples.
COPC = chemical of potential concern; PEC-Q = probable effect concentration-quotient; SEM-AVS = simultaneously extracted metals minus acid volatile sulfides; ND = no data;
OC = organic carbon.
1 Bolded results indicate that the toxicity threshold met the individual evaluation criteria for the T10-value, T20-value, or correct classification rate.
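
The correct classification rates in Table A2 appear consistent with treating a sample as correctly classified when it is not toxic below the threshold or toxic above it, and dividing that count by n. The following is a minimal sketch under that assumption; the function and argument names are illustrative and do not come from the report. It uses the zinc T10 row for the 28-d amphipod survival endpoint (7 of 53 samples toxic below T10, 17 of 23 toxic above T10, n = 76), for which the table reports 83%.

```python
def correct_classification_rate(toxic_below, n_below, toxic_above, n_above):
    """Percentage of samples whose observed toxicity matches the threshold prediction.

    Assumption: samples below the threshold are predicted to be not toxic,
    and samples above the threshold are predicted to be toxic.
    """
    total = n_below + n_above
    if total == 0:
        raise ValueError("no samples to classify")
    true_negatives = n_below - toxic_below  # not toxic and predicted not toxic
    true_positives = toxic_above            # toxic and predicted toxic
    return 100.0 * (true_negatives + true_positives) / total


# Zinc, T10 basis 28-d H. azteca survival, amphipod 28-d survival endpoint.
zinc_t10 = correct_classification_rate(
    toxic_below=7, n_below=53, toxic_above=17, n_above=23
)
print(f"{zinc_t10:.0f}%")  # 63 of 76 samples are correctly classified
```

The same calculation applied to other rows that report both the below-threshold and above-threshold incidence should give the tabulated rate after rounding to a whole percentage.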

-------
Table A3. Incidence of toxicity to Ampelisca abdita and Hyalella azteca exposed to whole-sediment
           samples with various mean probable effect concentration-quotient (PEC-Q) distributions.
Species Tested        Endpoint Measured    Mean PEC-Q Range    Number of Samples    Number of Toxic Samples    Proportion Toxic

Ampelisca abdita*     10-day survival      <0.24               124                  61                         48.4%
                                           0.24 to <0.45       16                   16                         100.0%
                                           >0.45               25                   23                         92.0%

Hyalella azteca**     28-day survival      <0.24               75                   14                         18.7%
                                           0.24 to <0.45       9                    6                          66.7%
                                           >0.45               16                   11                         68.8%

*Toxicity was determined based on comparisons to reference results for Phase II samples and to control results for historical sites.
**Toxicity was determined based on comparison to reference results.

-------
Table A4. Biological conditions that occur within the three categories of risk to the benthic invertebrate community in the Calcasieu Estuary,
            identified using the risk designations assigned to each sample.
Benthic Metric/Toxicity Test                  Endpoint Measured       Low                  Indeterminate        High
                                                                      mean ± SD (n)        mean ± SD (n)        mean ± SD (n)

Sediment Toxicity
  28-d Hyalella azteca                        % survival              91.6 ± 7.03 (54)     80.5 ± 19.5 (15)     53.6 ± 28.6 (20)
  28-d Hyalella azteca                        length (mm)             3.82 ± 0.487 (54)    3.80 ± 0.625 (15)    3.76 ± 0.555 (19)
  10-d Ampelisca abdita                       % survival              62.4 ± 17.3 (54)     43.1 ± 23.6 (15)     15.5 ± 17.6 (20)
  60-m Arbacia punctulata                     % fertilization         68.4 ± 25.8 (30)     56.2 ± 36.2 (10)     23.0 ± 29.1 (5)

Benthic Invertebrate Community Structure
  Mean total abundance (H/H)                  #/35.4 cm sq.           3.94 ± 3.38 (54)     1.48 ± 1.54 (15)     1.52 ± 2.63 (20)
  Mean total abundance (H/M)                  #/35.4 cm sq.           3.53 ± 5.04 (54)     0.787 ± 0.955 (15)   0.420 ± 0.908 (20)
  Mean total abundance (L/L)                  #/35.4 cm sq.           0.300 ± 1.18 (54)    0.760 ± 2.94 (15)    0 ± 0 (20)
  Mean total abundance (M/H)                  #/35.4 cm sq.           0.0667 ± 0.145 (54)  0.0667 ± 0.209 (15)  0.0200 ± 0.0894 (20)
  Mean total abundance (M/L)                  #/35.4 cm sq.           0.633 ± 1.78 (54)    0.587 ± 2.27 (15)    0 ± 0 (20)
  Mean total abundance (M/M)                  #/35.4 cm sq.           0.548 ± 0.734 (54)   0.293 ± 0.506 (15)   0.0800 ± 0.151 (20)
  Nonnormalized mIBI                          no units                9.15 ± 8.59 (54)     6.88 ± 14.0 (15)     2.56 ± 2.07 (20)
  Normalized mIBI                             no units                0.495 ± 0.177 (54)   0.354 ± 0.136 (15)   0.299 ± 0.058 (20)
  Pollution Indicator Spp. (H/H + H/M + M/H)  #/35.4 cm sq.           7.54 ± 7.65 (54)     2.33 ± 2.00 (15)     1.96 ± 3.03 (20)
  Pollution Sensitive (L/L + M/L)             #/35.4 cm sq.           0.933 ± 2.32 (54)    1.35 ± 5.22 (15)     0 ± 0 (20)
  Richness = total # sp.                      # species/35.4 cm sq.   6.72 ± 4.38 (54)     3.87 ± 3.64 (15)     2.45 ± 1.93 (20)
  Total Abundance                             #/35.4 cm sq.           9.03 ± 8.38 (54)     4.00 ± 7.35 (15)     2.07 ± 3.14 (20)

SD = standard deviation; n = number of samples; d = day; m = minute; H = high; M = medium; L = low; sp. = species; mIBI = macroinvertebrate index of biotic integrity;
cm sq. = square centimeters.

-------
Figure A1.  Relationship between the geometric mean of the mean PEC-Q
               and the average survival of the freshwater amphipod, Hyalella
               azteca, in 28-d toxicity tests (data source:  MacDonald et al.
               2002; dashed lines represent 95% prediction limits).
    [Plot: average survival (%) of Hyalella azteca, 0 to 100 (y-axis), versus geometric mean of mean PEC-Q (x-axis)]
    Fitted curve:  y = 109.8285/[1 + (x/0.9951)^0.6068]
                   (n = 100; r2 = 0.98; p = 0.0003)
    Values marked on plot:  PRG-IR = 0.244;  PRG-HR = 0.447
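
As an illustration of how the fitted curve in Figure A1 relates mean PEC-Q to predicted amphipod survival, the sketch below evaluates the logistic equation at the two PRG values marked on the plot. It assumes the exponent is 0.6068, since the superscript is not fully legible in the source. The function and parameter names are hypothetical and are not taken from the report.

```python
def predicted_survival(mean_pec_q, a=109.8285, x0=0.9951, b=0.6068):
    """Predicted average 28-d survival (%) from y = a / [1 + (x / x0)^b].

    a and x0 are the coefficients printed on Figure A1.
    b = 0.6068 is an assumed reading of the partly illegible exponent.
    """
    if mean_pec_q < 0:
        raise ValueError("mean PEC-Q must be non-negative")
    return a / (1.0 + (mean_pec_q / x0) ** b)


# Evaluate the curve at the two values marked on the figure.
for label, pec_q in (("PRG-IR", 0.244), ("PRG-HR", 0.447)):
    print(f"{label} = {pec_q}: predicted survival {predicted_survival(pec_q):.1f}%")
```

Under the assumed exponent, the curve gives roughly 77% survival at PRG-IR and roughly 68% at PRG-HR. These figures depend on that assumption and should not be read as values reported in the source.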

-------