EPA
United States
Environmental Protection
Agency
Office of Chemical Safety
and Pollution Prevention
(7101)      January 2012
       Ecological Effects
       Test Guidelines

       OCSPP 850.3000:
       Background and
       Special
       Considerations-
       Tests with Terrestrial
       Beneficial Insects,
       Invertebrates and
       Microorganisms

-------
                                     NOTICE

     This guideline is  one of a series of test guidelines established by the United States
Environmental  Protection Agency's Office of Chemical Safety and  Pollution  Prevention
(OCSPP) for use  in  testing pesticides  and chemical substances to develop  data  for
submission to the Agency under the Toxic Substances Control Act (TSCA) (15 U.S.C. 2601,
et seq.), the Federal Insecticide, Fungicide and Rodenticide Act (FIFRA) (7 U.S.C. 136, et
seq.), and section 408 of the Federal Food, Drug and Cosmetic Act (FFDCA) (21 U.S.C. 346a).
Prior to April 22, 2010,  OCSPP was known as the Office of Prevention, Pesticides and Toxic
Substances  (OPPTS).   To distinguish  these guidelines from  guidelines  issued  by other
organizations, the  numbering convention  adopted in 1994 specifically included  OPPTS as
part of the guideline's number.  Any test guidelines  developed after April 22, 2010 will use
the new acronym (OCSPP) in their title.

     The OCSPP harmonized test guidelines serve as a compendium of accepted  scientific
methodologies and protocols that are intended to provide data to inform regulatory decisions
under TSCA, FIFRA, and/or FFDCA.  This document provides guidance for conducting the
test,  and is  also used by  EPA, the public,  and  the companies that are subject to data
submission  requirements under TSCA,  FIFRA, and/or  the  FFDCA.   As a guidance
document, these guidelines are not binding on either EPA or any outside parties, and the
EPA may depart from the guidelines where circumstances warrant and without prior notice.
At places in this guidance, the Agency uses the word "should."  In this guidance, the use of
"should" with regard  to  an action means that the action is recommended rather than
mandatory.   The procedures contained  in this guideline are strongly recommended  for
generating the data that are the subject of the guideline, but EPA recognizes that departures
may   be  appropriate  in specific  situations.  You  may  propose  alternatives  to  the
recommendations  described in these guidelines,  and the Agency will assess them  for
appropriateness on a case-by-case basis.

     For additional information about these test guidelines and to access  these guidelines
electronically, please  go  to  http://www.epa.gov/ocspp   and select  "Test Methods &
Guidelines" on  the left side navigation  menu.  You may also access the  guidelines in
http://www.regulations.gov grouped by Series under Docket ID #s: EPA-HQ-OPPT-2009-0150
through EPA-HQ-OPPT-2009-0159, and EPA-HQ-OPPT-2009-0576.

-------
OCSPP 850.3000: Background and special considerations: tests with terrestrial
beneficial insects, invertebrates and microorganisms.

(a) Scope—
       (1) Applicability. This guideline is intended to be used to help develop data to submit to
       EPA under the Toxic Substances  Control Act (TSCA)  (15 U.S.C.  2601, et  seq.), the
       Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA) (7 U.S.C. 136, et  seq.), and
       the Federal Food, Drug, and Cosmetic Act (FFDCA) (21 U.S.C. 346a).
       (2) Background.  This guideline provides general information  applicable to conducting
       OCSPP Series  850, Group C toxicity tests with terrestrial  beneficial insects, invertebrates
       and soil microorganisms.  The source materials used in developing this harmonized
       OCSPP guideline are: OPP 140-1 General Information, OPP 140-2 Definitions, OPP 140-3
       Basic standards for testing, OPP 140-4 Reporting and evaluation of results, OPP 140-5
       Special test requirements (Pesticide Assessment Guidelines Subdivision L); Pesticide
       Reregistration Rejection Rate Analysis: Ecological Effects; and background materials
       included in the specific OCSPP Series 850, Group C guidelines.

       (3) General.

             (i) The OCSPP Series 850, Group C provides guidelines applicable to conducting
             laboratory toxicity tests with terrestrial beneficial insects, invertebrates and soil
             microorganisms.  Field tests are designed on a case-by-case basis. The guidelines
             in OCSPP Series 850, Group C are applicable to evaluating the hazards of
             industrial chemicals and pesticides to terrestrial beneficial insects, invertebrates
             and soil microorganisms exposed directly or indirectly. Data concerning the
             effects of pesticides on terrestrial beneficial insects, invertebrates, and
             microorganisms are used in ecological risk assessment of pesticides (see 40  CFR
             part 158, paragraph (k)(29) of this guideline).  These data are also of use in
             assessments of potential injury to endangered and threatened species listed by the
             Fish and Wildlife Service, Department of Interior,  and when toxicity concerns
             arise from incidents or during Special Review.  These data are  used for both
             deterministic and probabilistic risk assessments.

             (ii) Information is provided on the design and conduct of tests with  terrestrial
             beneficial  insects, invertebrates  and  soil  microorganisms, emphasizing the
             importance  of adequately  characterizing the  test  substance,  use  of suitable
             experimental design, as well as establishing the physical and chemical conditions
             of the test system for providing a scientifically sound understanding of how the
             test substance behaves under test conditions.  Also considered are the factors that
             can affect the test  outcome  and interpretation  of test results.  This general
             information is primarily applicable to the guidelines for laboratory toxicity tests,
             since field tests  are designed on a case-by-case basis.  However, the OCSPP
             850.3000 guideline lists  critical  quality assurance  and reporting  standards
             common to all the guidelines in the OCSPP Series  850, Group  C guidelines.

             (iii) The OCSPP  Series 850, Group C guidelines have generally been validated in
             formal round-robin tests or informally through repeated use.

(iv) Each submitted study should meet the data quality objectives for which the
test is designed.  Test validity elements  critical to determining the  scientific
soundness and acceptability of the study have been listed for each guideline in the
OCSPP Series 850, Group C.

(v) The guidelines contained in OCSPP Series 850, Group C recommend specific
procedures to be used in almost all circumstances to result in a satisfactory study
result, while they provide general  guidance that allows for some latitude, based
upon  study-specific circumstances.  It is recognized that certain problems, some
of which are unavoidable, may arise both before and during testing and provisions
have thus been made in the  guidelines for dealing with those that are commonly
encountered.  These guidelines provide  for exceptions, while at the same time
maintaining a high level of scientifically sound, state-of-the-art guidance so that
following  this  guidance will provide  ecological  effect  information  that is
scientifically defensible for its intended use, while also taking into consideration
the chemistry  and experimental fate of the test substance. For a satisfactory test,
the experimental  design, execution  of the experiments,  classification of the
organism, sampling, measurement, and data analysis should be accomplished by
use of sound scientific techniques  recognized by the scientific community. The
uniformity  of  procedures,   materials,  and  reporting  should be  maintained
throughout the toxicity evaluation process.  Refinements of the  procedures to
increase their accuracy and effectiveness are encouraged. When such refinements
include major modifications of any test  procedure,  the Agency  should be
consulted before implementation.  Also when in doubt, users of these guidelines
should  consult with the appropriate regulatory  authorities for clarification or
additional information before proceeding.  All references supplied with respect to
protocols or other test standards are provided as recommendations.

(vi) For pesticides, a tiered  testing approach  given in 40  CFR 158.630 for
nontarget insect data  requirements  provides for greater efficiency  of testing
resources while assuring data development as warranted to meet the objectives of
a hazard or risk assessment.  To reduce or eliminate unnecessary toxicity testing
for regulatory decision making, the specific test requirements for pesticides in 40
CFR part 158 depend upon the use pattern of the pesticide and the potential for
exposure of terrestrial beneficial insects and invertebrates.  In addition, there is a
hierarchical or tiered system that progresses from basic laboratory tests to applied
field tests, where the results of each tier of tests should be evaluated to determine
the potential of the pesticide to cause adverse effects, and to determine whether
further testing is warranted to meet the objectives of the hazard or risk assessment
(40 CFR part 202).  Generally, the decision as to whether to proceed to the second
tier, or to longer term higher tiered tests, is based on the potential toxicity
demonstrated in the first level tests, in conjunction with other pertinent
information such as use pattern and environmental fate profile.  For nontarget
insects the lower tier test is designed to screen test substances to determine the
potential to cause adverse effects on pollinators upon direct contact with the test
substance (the OCSPP 850.3020 guideline).  For pesticides, a Tier I test, referred
to as a limit test in the OCSPP 850.3020 guideline, tests a single concentration
and compares effects observed with appropriate controls.  Tier II testing for
              pesticides includes  the  multiple-concentration  definitive  test in  the  OCSPP
              850.3020  guideline.  The  multiple-concentration  definitive  test  provides for
              generation of the dose-response curve for test substances which are known insect
              toxicants or in which Tier I testing demonstrated the test substance was an insect
              toxicant. The higher tier foliar residue test with pollinators (the OCSPP 850.3030
              guideline) determines the length of time post-application that foliar residues are
              toxic to pollinators and is conducted for test substances which are known
              toxicants and whose use patterns result in exposure of honey bees. Higher tier
              nontarget insect tests include Field Testing for Pollinators (the OCSPP
              850.3040 guideline); these tests are designed on a case-by-case basis to address specific
              objectives concerning detrimental effects on nontarget insects, and are performed
              under simulated or actual field conditions.  Progression to higher tier tests would
              occur on a case-by-case  basis to further refine  and characterize the estimate of
              nontarget insect risk.

              (vii)  Data  on  toxicity  to  terrestrial  beneficial  insects,  invertebrates  and
              microorganisms may also  be used to evaluate the potential hazard and  risk of
              industrial  chemicals.  When the pattern of production, use,  or disposal indicates
              exposure to these terrestrial  organisms, these  tests are  strongly recommended.
              This testing is  part of the Tier I (base set) suite  of tests  in the OPPT  testing
              scheme developed for determining environmental effects (see the references in
              paragraphs (k)(12),  (k)(13), (k)(18), (k)(19), (k)(31) and  (k)(32) of this guideline
              for further details).  The testing scheme is deterministic for the most part, flexible,
              sequential, consistent, iterative,  transparent, discriminatory of  the  extent of
              toxicity, and applicable to all types of chemicals.

              (viii)   While performing field tests, all  necessary  measures should be taken to
              ensure that  nontarget plants  and animals, especially endangered  or  threatened
              species, will not be adversely affected  either by direct hazard or by  impact on
              food supply or food chain.

(b) Definitions.  Terms used in the OCSPP Series 850, Group C guidelines have the meanings
set forth  in Section  3  FIFRA regulations  at 40  CFR  152.3  (Pesticide  Registration  and
Classification Procedures); 40 CFR 158.300 (Product Chemistry Definitions); 40 CFR part 160
(Good Laboratory Practice Standards); and in TSCA Section 3 regulations 40  CFR part 792
(Good Laboratory Practice Standards); and the Agency's  "Terms of  Environment, Glossary,
Abbreviations and Acronyms" (see paragraph (k)(23) of this guideline). The definitions in this
section apply to the OCSPP  Series 850, Group  C test guidelines and where applicable, the
individual test guidelines contain additional or test-specific definitions.

       Acclimation  is the physiological  or  behavioral adaptation of test  organisms to new
       environmental conditions associated with the test procedure.

       Active Ingredient (a.i.) is any substance (or group of structurally  similar substances  if
       specified by  the Agency) that will prevent, destroy, repel or mitigate any pest,  or that
       functions as a plant regulator, desiccant, or defoliant within the meaning of FIFRA (40
       CFR 152.3).

Acute toxicity refers to the discernible adverse effects (lethal or sublethal) induced in an
organism within a short exposure period (usually not constituting a substantial portion of
the total life cycle or life span, e.g. single dose, hours, or days).

Acute toxicity test is a comparative study in which organisms are subjected to a severe,
short-term  stimulus (test substance).  The organisms, exposed to different concentrations
of the test  substance (except in a limit test), are observed for a short period usually not
constituting a substantial portion of the  total life cycle  or life span.  Acute exposure
typically includes a lethal biological response of relatively quick progression.

Adjuvant is a subsidiary ingredient or additive in a mixture which modifies, enhances or
prolongs by physical action the activity  of the active ingredient(s).   Examples  of
agricultural chemical adjuvants include but are not limited to surfactants, crop oils,  anti-
foaming agents, buffering compounds, drift control agents, compatibility agents,  stickers
and spreaders.

Axenic describes a culture of one organism free from other organisms.

Chronic toxicity test is a comparative study in which organisms are exposed to different
concentrations of the test substance generally for a relatively long period that constitutes
a substantial, nearly complete, or complete portion  of the total life cycle or life span.
Chronic exposure typically induces  a  sublethal biological response of relatively  slow
progression, or which is cumulative in nature. For some chemicals with certain modes-
of-action, shorter-term exposure may result in chronic or  latent effects, and continued or
cumulative exposure is therefore not necessary.

Concentration-response curve is the graphical and mathematical relationship between the
concentration of a substance and  a specific biological response produced from  toxicity
tests when percent response (e.g., mortality) values are plotted against concentration of
test substance for a given exposure duration. This is also referred to as the dose-response
curve or concentration-effect curve.

Control refers to test organisms exposed to test conditions and test matrix in the absence
of any introduced test substance as part of the test design for the purpose of establishing a
basis  of comparison  with  a  test  substance  for  known chemical  or biological
measurements.

Culture  (noun)  refers to the organisms which are raised  on-site or maintained under
controlled conditions to produce test organisms through reproduction.

Culture (verb)  is to grow, raise, or maintain organisms  under controlled conditions to
produce test organisms through reproduction.

Effect concentration (ECx) is the experimentally derived concentration of a test substance
in a test matrix (e.g., soil, feed) that would be expected to cause a specified effect in x
percent (x%) of a group of test organisms under specified exposure conditions.

Effect concentration, median (EC50) is the experimentally derived concentration of a test
substance in a test matrix (e.g., soil, feed) that would be expected to  cause  a defined
effect in 50% of a group of test organisms under specified exposure conditions.

Formulation, as used within these guidelines, is a packaged end use product (e.g., dust,
wettable powder, emulsifiable concentrate, ultra low volume) of the test substance
and may contain one or more active ingredients and one or more inert ingredients.

Holding is the period from the time test organisms are received in the laboratory until
they are used in testing or begin acclimation to test conditions.  Holding conditions may
include quarantine, lower temperatures to minimize disease, or other conditions that are
different from test conditions.   Where  holding conditions  are different  from  test
conditions, the test organisms should be acclimated to test conditions prior to testing so
as not to stress the organisms.

Inert ingredient is any substance (or group of structurally similar substances if designated
by the  Agency), other than an  active ingredient, which is intentionally included in a
pesticide product (40 CFR 152.3).

Inhibition concentration (ICx) is the experimentally derived concentration of a test
substance in a test matrix (e.g., soil, feed) that would be expected to cause a given
percent, x, inhibition or reduction in a non-quantal response from the smoothed mean
control response.  For example, the IC25 for growth is the concentration of test substance
that would cause a 25% reduction in growth in a test population from the control
response and the IC50 is the concentration of test substance that would cause a 50%
reduction in growth from the control response.

Lethal concentration (LCx) is the experimentally derived concentration of a test substance
in a test matrix (e.g., soil, feed) that would be expected to result in mortality of x% of a
group of test organisms under specified exposure conditions.  For example, the LC25 is
the concentration of test substance that would result in mortality of 25% of the exposed
test population.

Lethal concentration, median (LC50) is the experimentally derived concentration of test
substance in test matrix (e.g., soil, feed) that would be expected to result in mortality of
50% of a group of test organisms under specified exposure conditions.

Lethal dose, median (LD50) is the experimentally derived dose of the test substance that
would be expected to result in mortality of 50% of a population of test animals which is
treated with a single dose under specified exposure conditions.

Limit of detection (LOD) is the analytic level below which the qualitative presence of the
material is uncertain.  This is typically defined by the lowest concentration producing a
signal two standard deviations above the background noise from a matrix blank  sample.

Limit of quantification (LOQ) is the analytic level below which the quantitative amount
of the material is  uncertain.  This is typically defined by the lowest concentration of
fortified matrix successfully analyzed.

Limit test is a toxicity test performed with a single test substance concentration or dose
and a control to establish that the value for the measurement endpoint of concern (e.g.,
LC50, LD50) is greater than the test substance concentration or dose (limit concentration
or dose, respectively).

Lowest observed effect concentration  (LOEC) is the lowest concentration of a  test
substance to which  organisms  are  exposed under  specified  exposure conditions  that
causes a statistically significant adverse effect as compared to the control(s).
Throughout these guidelines, the terms LOEC  and lowest observed  adverse  effect
concentration (LOAEC) have the same meaning.

Lowest observed effect level (LOEL) is the lowest dose level of a test substance to which
organisms  are exposed under specified exposure conditions that causes a statistically
significant adverse effect as compared to the control(s).  Throughout these guidelines, the
terms LOEL and lowest observed adverse effect level (LOAEL) have the same meaning.

Maximum acceptable toxicant concentration (MATC) is the maximum concentration at
which a test substance can be present and not be toxic to the test organism.  The MATC
lies within  the  range  between  the  LOEC  and NOEC.   Operationally, for industrial
chemicals, the MATC is defined as the geometric mean of these values.  The MATC is
also referred to  (in the Pre-Manufacture Notification (PMN) program of OPPT) as the
chronic value or chronic no-effect-concentration (NEC).
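
The operational MATC described above can be sketched in a few lines. This is only an illustration of the geometric-mean calculation; the function name and the example concentrations are hypothetical and are not taken from any guideline.

```python
import math


def matc(noec: float, loec: float) -> float:
    """Return the geometric mean of the NOEC and LOEC.

    Both values must be positive and in the same concentration units
    (e.g., mg/kg soil). The NOEC must be below the LOEC.
    """
    if noec <= 0 or loec <= 0:
        raise ValueError("NOEC and LOEC must be positive")
    if noec >= loec:
        raise ValueError("NOEC must be lower than LOEC")
    return math.sqrt(noec * loec)


# Hypothetical example: NOEC = 10 mg/kg, LOEC = 40 mg/kg
print(matc(10.0, 40.0))  # 20.0
```

Because it is a geometric mean, the result always lies between the NOEC and the LOEC, consistent with the definition.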

Measured concentration is an analytically derived quantitative measure above the method
detection limit.

Measurement endpoint is a quantitative measurable response to a stressor that is used to
infer a measure of protection or evaluate risk to valued environmental entities. Examples
of measurement endpoints include, but are not limited to, mortality (e.g., LD50, LC50)
and growth (IC25, IC50).  Each test-specific guideline identifies the measurement
endpoint(s) to be determined by the prescribed study.  The term "measurement endpoint"
is used synonymously with the term "measures of effect".

Medium is the chemically-defined culture solution used in culturing and testing certain
organisms.

Method detection limit (MDL) is operationally defined as the concentration of constituent
that,  when processed  through  the  complete  method,  produces a  signal with  99%
probability that it is  different from the  blank.  It is computed as the standard deviation
multiplied  by the Student's t  constant  corresponding to the  appropriate degrees of
freedom (n-1).   Thus, for seven  spiked  samples prepared at the hypothetical LOQ, the
MDL  is  3.143  times the  standard deviation of  the  mean of the seven replicate
measurements.
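
As a rough sketch of the MDL arithmetic described above, the following computes the sample standard deviation of replicate spiked measurements and multiplies it by the Student's t value. The function name and the measurement values are hypothetical; the only t value built in is the 3.143 quoted above for seven replicates, and any other replicate count requires the caller to supply the appropriate value.

```python
import statistics

# Student's t value quoted in the definition for seven replicates
# (n - 1 = 6 degrees of freedom, 99% probability level).
T_SEVEN_REPLICATES = 3.143


def method_detection_limit(measurements, t_value=None):
    """Return t * s, where s is the sample standard deviation (n - 1)."""
    n = len(measurements)
    if n < 2:
        raise ValueError("at least two replicate measurements are needed")
    if t_value is None:
        if n != 7:
            raise ValueError("supply t_value for replicate counts other than 7")
        t_value = T_SEVEN_REPLICATES
    return t_value * statistics.stdev(measurements)


# Hypothetical spiked-sample results (mg/kg), seven replicates
spiked = [0.52, 0.48, 0.50, 0.55, 0.47, 0.51, 0.49]
print(round(method_detection_limit(spiked), 4))
```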

Microorganism  is  any  of those organisms  classified  as   fungi  (Myxomycota  and
Eumycota), and bacteria (Schizomycota).

No observed effect concentration (NOEC) is the highest concentration of a test substance
to which organisms are exposed under specified exposure conditions  that  does not cause
       a statistically significant adverse effect as compared to the control(s). The NOEC is the
       test concentration immediately below the LOEC and can only be defined in the presence
       of the LOEC.  Throughout these guidelines, the terms NOEC and no observed adverse
       effect concentration (NOAEC) have the same meaning.

       No observed effect level (NOEL) is  the highest dose level of a test  substance to which
       organisms are exposed under specified exposure  conditions that  does not cause a
       statistically significant adverse effect as compared to the control(s).  The NOEL is the
       test dosage immediately below the LOEL and can only be defined in the presence of the
       LOEL.  Throughout these guidelines, the terms NOEL and no observed adverse effect
       level (NOAEL) have the same meaning.
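
       The relationship between the NOEC and LOEC (or NOEL and LOEL) definitions above can be illustrated with a short sketch. It assumes the statistical comparison with the control has already been done and is supplied as a true/false flag per treatment level; the function name and the example data are hypothetical.

```python
def noec_loec(results):
    """Derive (NOEC, LOEC) from (concentration, is_significant) pairs.

    The LOEC is the lowest tested concentration flagged as significantly
    different from the control. The NOEC is the tested concentration
    immediately below the LOEC. Following the definitions above, the NOEC
    is only reported when a LOEC exists, so (None, None) is returned when
    no treatment is significant. If the lowest tested concentration is
    itself the LOEC, the NOEC is None.
    """
    ordered = sorted(results)
    for i, (conc, significant) in enumerate(ordered):
        if significant:
            noec = ordered[i - 1][0] if i > 0 else None
            return noec, conc
    return None, None


# Hypothetical test: four treatment levels (mg/kg) with significance flags
data = [(1.0, False), (2.5, False), (6.25, True), (15.6, True)]
print(noec_loec(data))  # (2.5, 6.25)
```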

       Reagent water is water that has  been prepared by deionization, glass  distillation,  or
       reverse osmosis.

       Replicate is the experimental unit within a toxicity test.  It is the smallest physical entity
       to which treatments can be independently assigned.

       Test substance is the specific form of a chemical substance or mixture being evaluated
       (e.g., pesticide active ingredient or formulation, or industrial chemical).

       Treatment group is the set of replicate test chambers that receive the same amount  (if
       any) of the test  substance; controls are treatment groups  that receive none of the test
       substance.

       Typical end-use product (TEP) is  a term used to convey direction to a data producer to
       use  a commonly  used end-use  product, a pesticide formulation for field or other end use
       (excludes  products with  labeling that allows use  of the product  to  formulate other
       pesticide products), as the test substance.  The term includes any physical apparatus used
       to deliver or apply the pesticide if distributed or sold with the pesticide.

       Vehicle is any  agent  (e.g.,  solvent) which facilitates  the  mixture,   dispersion,  or
       solubilization of  a test substance with a carrier (e.g., dust, spray solution) used to expose
       the test organisms (40 CFR 160.3, 40 CFR 792.3).

(c) Apparatus, facilities and equipment—

       (1)  Laboratory  facilities  and equipment.   The type of facilities  and  equipment for
       conducting the toxicity tests  with  the  organisms  in this group of guidelines varies
       depending upon  the nature of the test and the organism.  In general, these toxicity tests
       use  normal laboratory  glassware, supplies and  equipment,  as  well as equipment for
       maintaining the  organisms under the test conditions and controlling the test conditions
       (e.g., temperature, humidity, lighting).  Construction materials and  equipment  that are
       toxic, may affect toxicity,  or that may adsorb test substances should not be used.  See
       test-specific  OCSPP Series 850, Group  C guidelines for identification  of any  atypical
       facility, equipment, or supplies used in the test.

       (2) Maintenance and reliability.  All equipment used in conducting the test, including
       equipment used to prepare and administer the test substance, and equipment to maintain
       and record environmental conditions, should be of such design and capacity that tests
       involving  this  equipment  can  be  conducted in  a reliable  and scientific manner.
       Equipment should be inspected, cleaned,  and maintained regularly, and be properly
       calibrated.  All  materials that will come in contact with  the  test organisms and test
       substance should be cleaned before use.  Cleaning procedures should be appropriate to
       remove known or suspected contaminants.

(d) Experimental design and data analysis—

       (1) Design elements.   Elements of experimental design  such as the  number  of test
       treatments, progression factor between treatment levels, number of replicates, and
       number of organisms per replicate and per  treatment are based  upon the purpose of the
       test, variability expected in response measurements, and the type of statistical procedures
       that will be used to evaluate the results.  See the  test-specific guidelines for specific
       information relating to these aspects of test  design. General principles of test design are
       set forth in this guideline. General guidance on the statistical analysis of ecotoxicity tests
       can be found in the references in paragraphs (k)(1), (k)(2), (k)(15), (k)(16), (k)(17),
       (k)(26), (k)(27) and (k)(28) of this guideline.
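
       To make the idea of a progression factor between treatment levels concrete, the sketch below generates a series in which each level is a constant multiple of the one below it. The geometric spacing, the function name, and the example values are assumptions for illustration only; the number of levels and the factor for a given test come from the test-specific guideline.

```python
def concentration_series(highest, factor, n_levels):
    """Return n_levels concentrations, ascending, ending at `highest`.

    Each level is `factor` times the level below it (geometric spacing).
    """
    if highest <= 0:
        raise ValueError("highest concentration must be positive")
    if factor <= 1:
        raise ValueError("progression factor must be greater than 1")
    if n_levels < 1:
        raise ValueError("at least one treatment level is needed")
    return [highest / factor ** i for i in range(n_levels - 1, -1, -1)]


# Hypothetical design: five levels, factor of 2, top level 100 mg/kg
print(concentration_series(100.0, 2.0, 5))  # [6.25, 12.5, 25.0, 50.0, 100.0]
```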

       (2) Calculation of endpoints—

              (i) Background.

                    (A)  Data  generated in ecotoxicity tests with terrestrial beneficial insects,
                    invertebrates and microorganisms may be of three types:

                           (1) Quantal (dichotomous), where  the  variable  has  only  two
                           mutually  exclusive outcomes  (e.g.,  dead or alive)—note  that
                           quantal data are a special case of discrete data;

                           (2) Discrete, where there  is a finite number  of values possible or
                           there is a space on the number line between two possible values; or

                           (3) Continuous, where the  variable can  assume a continuum of
                           possible outcomes (e.g., respiration rate).

                    (B)  These data may be analyzed  using  regression-based techniques or
                    hypothesis-testing procedures depending on the objectives and endpoints
                    of a specific test guideline.  Traditionally, the results  of acute toxicity tests
                    have been expressed as point estimates (e.g., LC50 or LD50 for lethality, or
                    EC50 or IC50 for other effects), while the results of chronic tests have been
                    expressed as the results of hypothesis-testing procedures to determine the
                    NOEC and LOEC (or NOEL and LOEL).  Regarding terminology, the
                    term ICx is more appropriately used for continuous endpoints, rather than
                    ECx.  For information on the advantages and disadvantages of these
                    approaches, see the references in paragraphs (k)(5), (k)(8), (k)(16),  (k)(17)
                    and (k)(20) of this guideline.   Specific test guideline  objectives, either
       point estimate or hypothesis-based endpoints or both, are identified  in
       each specific test guideline.

(ii) Point estimates and concentration-response or dose-response tests. This
type of toxicity test is designed to allow calculation of a concentration- or dose-
response curve (mathematical model) and to estimate one or more specific points
(point estimates) on the curve, such as an LD10 and LD50.  Because of the normal
variation in  sensitivity of individuals within a group of test organisms, a measure
of the degree of certainty in the model parameters and the point estimate value(s)
should be determined.

       (A) No single statistical technique is appropriate for all data sets, and the
       assumptions and requirements of each method should be  known before
       using (see paragraphs (k)(1), (k)(4), (k)(6), (k)(7), (k)(9), (k)(10), (k)(11),
       (k)(14), (k)(20), and (k)(30) of this guideline).  Not all methods  suitable
       for continuous data  are appropriate for quantal data (see paragraphs (k)(4)
       and (k)(14)  of this guideline).   For point estimate tests, regression-based
       methods (e.g., probit) that model the full concentration- or  dose-response
       relationship  and provide error estimates of the model parameters and point
       estimate(s) are desired.  The regression model used to fit data should be
       recorded, and the error estimates of the model parameters (e.g., standard
       error of slope and intercept), and goodness-of-fit should be calculated and
       recorded.  For a point estimate (e.g., LD50) the 95% confidence interval
       and  standard error  are calculated  and  recorded.   If data  do  not fit a
       regression-based model,  other  point estimator methods (e.g.,  binomial,
       moving average,  trimmed Spearman-Karber, linear interpolation (e.g.,
       Boostrap ICp)) are  available (see paragraphs (k)(24), (k)(27) and (k)(28)
       of this guideline). Which of these other methods is selected is dependent
       upon the  shape  of the concentration-response  curve,  the number  of
       treatments with partial mortalities (i.e., where mortality is greater than 0%
       but less than 100%), the magnitude of these mortalities, and  the number of
       replicates.  The method used to estimate the endpoint and, if applicable,
       the 95% confidence interval for the point estimate should be  recorded.
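
As an illustration of the linear interpolation approach named above, the following Python sketch computes an ICp-style point estimate from treatment means. It is a simplified sketch, not the procedure in the cited references: it assumes equal replication in every treatment, a response that decreases with concentration, and interpolation on the untransformed concentration scale. The function names `smooth_monotone` and `icp` are hypothetical, and the bootstrap confidence interval is omitted.

```python
def smooth_monotone(means):
    """Pool adjacent treatment means until they are non-increasing.

    Assumes every treatment has the same number of replicates, so pooled
    means are simple averages of the original treatment means.
    """
    blocks = []  # each block is [pooled mean, number of treatments pooled]
    for m in means:
        blocks.append([float(m), 1])
        # Merge backwards while a later block exceeds the one before it.
        while len(blocks) > 1 and blocks[-1][0] > blocks[-2][0]:
            m2, n2 = blocks.pop()
            m1, n1 = blocks.pop()
            blocks.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    smoothed = []
    for m, n in blocks:
        smoothed.extend([m] * n)
    return smoothed


def icp(concentrations, means, p):
    """Concentration giving a p percent reduction from the control mean.

    concentrations[0] is the control (0). Returns None when the target
    response is not bracketed by the tested concentrations, so no value
    is extrapolated beyond the tested range.
    """
    sm = smooth_monotone(means)
    target = sm[0] * (1.0 - p / 100.0)
    for j in range(len(sm) - 1):
        if sm[j] >= target >= sm[j + 1] and sm[j] > sm[j + 1]:
            slope = (concentrations[j + 1] - concentrations[j]) / (sm[j + 1] - sm[j])
            return concentrations[j] + (target - sm[j]) * slope
    return None
```

For example, `icp([0, 1, 2, 4, 8], [10, 10.5, 8, 5, 2], 25)` interpolates between the 2 and 4 concentration levels after the first two means are pooled. Returning `None` rather than extrapolating is a design choice in keeping with the caution about estimates outside the tested range.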

       (B) Concentration-response models  are good estimating tools only for the
       range of concentrations  used to fit them; therefore, endpoints that are
       extrapolated beyond the  range  of the concentrations tested would be
       considered  to be  of  lower confidence  or potentially,  of such  low
       confidence that they would not be appropriate to estimate.

(iii) Hypothesis-based methods—

       (A) Multiple-concentration or dose definitive tests. In this type of test,
       the purpose  is to determine if the biological response to a treatment level
       differs from the response  of the  control.   Hypothesis  testing-based
       endpoints, expressed as the NOEC and LOEC (or NOEL and LOEL), are
       calculated by determining statistically  significant  differences  from the
       control. The null hypothesis is that no difference exists among the mean
(or median if nonparametric) control  and treatment responses.   The
alternative  hypothesis  is  that the  treatment(s)  result  in  an adverse
biological  effect  relative to  the  control  sample.    Parametric  and
nonparametric  analysis  of  variance  (ANOVA)  tests  and  multiple-
comparison tests are often appropriate for continuous data and for count
data and may be appropriate for some categorical data (rank, order, score).
Contingency  table  tests  are  usually appropriate for categorical  data.
Parametric tests are based on normal distribution theory and  assume that
the data within treatments are a random sample from an approximately
normal  distribution  and  that the  error variance is  constant  among
treatments.  These assumptions should  be  examined using  appropriate
tests,  and  data transformations  (see  paragraph (d)(2)(iv)(A) of  this
guideline)  or non-parametric  techniques  should  be used  where the
assumptions are not met.  Where possible, multiple comparison tests that
restrict the number of comparisons made should be used.  Generally, the
more  powerful multiple-comparison tests  are  those which assume  a
concentration-  or  dose-response  relationship  in the  data.   When the
assumption of a  monotonic  dose-response  holds,  Williams'   and
Jonckheere's   test,  respectively,  are   examples  of  parametric   and
nonparametric  tests  that  can be  used.    When  the assumption  of  a
monotonic dose-response  fails, Dunnett's t-test and either Steel's many-
one rank test or the Wilcoxon rank sum test with Bonferroni  adjustment,
respectively,  are  examples of parametric  and nonparametric multiple
comparison tests  requiring no assumption about the  dose-response but
which restrict comparisons of the treatments to a control. A measure of
the sensitivity  of the test, such as  the minimum significant difference
(parametric tests), should be calculated. Alternatively, a calculation of the
number of replicates necessary to achieve data quality objectives given the
actual measured test  responses and variability  should be made.   At a
minimum,  the percent change from the control for each treatment should
be calculated.
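
The two sensitivity measures mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the minimum significant difference is computed as `critical_value * sqrt(MSE) * sqrt(1/n_control + 1/n_treatment)`, where `critical_value` is a hypothetical parameter that the analyst takes from the appropriate table for the multiple-comparison test in use (e.g., Dunnett's), and `mse` is the error mean square from the ANOVA. The function names are illustrative only.

```python
import math


def percent_change_from_control(control_mean, treatment_means):
    """Percent change of each treatment mean relative to the control mean."""
    return [100.0 * (m - control_mean) / control_mean for m in treatment_means]


def minimum_significant_difference(critical_value, mse, n_control, n_treatment):
    """Smallest difference from the control mean that would be declared significant.

    critical_value: tabulated critical value for the chosen test (assumed input)
    mse:            error mean square (within-treatment variance) from the ANOVA
    """
    return critical_value * math.sqrt(mse) * math.sqrt(1.0 / n_control + 1.0 / n_treatment)
```

Dividing the minimum significant difference by the control mean expresses test sensitivity as a percentage of the control response, which makes it directly comparable with the percent changes.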

(B) Types of decisions and errors.

       (1)  Table 1 presents the two possible outcomes and decisions that
       can be  reached in the statistical hypothesis  tests discussed in
       paragraph (d)(3)(ii)(A) of this guideline:

              (a)  There is no difference among the  mean  control and
              treatment responses; or

              (b)  There is a difference among the  mean  control  and
              treatment  responses  (concerned  with direction,  where
              response is  adverse relative to the control).

       (2)  Statistical tests of hypothesis can be designed to control for the
       chances of making incorrect decisions.  The types of incorrect and
       correct decisions that can be made in a hypothesis-based test and
                           the probability of making these decisions are represented in Table
                           1. For multiple comparison tests the Type I error rate is controlled
                           to account for multiple test comparisons.

       Table 1. Types of Errors and the Probabilities of Making Correct and Incorrect
       Decisions Based on the Results of Testing

                                   Actual (or True) Condition
       Test Decision Outcome       Treatment Response >              Treatment Response <
                                   Control Response                  Control Response

       Treatment Response >        Correct decision                  Type II error (false negative)
       Control Response            probability = 1 - alpha (α)       probability = beta (β)

       Treatment Response <        Type I error (false positive)     Correct decision
       Control Response            probability = α                   probability (power of test) = 1 - β

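
To illustrate why the Type I error rate is controlled for multiple comparisons, the following Python sketch shows the Bonferroni adjustment and the family-wise error rate that results when several comparisons are each run at the same per-test level. The family-wise formula assumes the comparisons are independent, which is only an approximation when every treatment is compared with the same control; the function names are hypothetical.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni adjustment."""
    return alpha / n_comparisons


def familywise_error_rate(alpha_per_test, n_comparisons):
    """Chance of at least one false positive across independent comparisons."""
    return 1.0 - (1.0 - alpha_per_test) ** n_comparisons
```

With five treatment-versus-control comparisons each run at 0.05, the chance of at least one false positive is about 0.23; running each at the adjusted level of 0.01 keeps it just under 0.05.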
                     (C)  Power of the test.  Power of the test versus percent reduction in
                     treatment response relative to the control mean at various coefficients of
                     variation is provided in the reference in paragraph (k)(24) of this
                     guideline.  Examples are specifically given for 5 and 8 replicates for a
                     one-tailed test alpha (α) of 0.05 and 0.10.  Effects on the number of
                     replicates at various coefficients of variation are also provided in the
                     reference in paragraph (k)(24) of this guideline for various low α and beta
                     (β) values (i.e., α + β = 0.25).  See also the references in paragraphs
                     (k)(9) and (k)(25) of this guideline.
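
The relationship among the coefficient of variation, the size of the reduction to be detected, the error rates, and the number of replicates can be sketched with a normal approximation. This is not the calculation used in the cited references; it is a rough planning sketch that assumes a one-tailed comparison of one treatment with the control, equal replication, equal variances, and a coefficient of variation expressed relative to the control mean. Because it uses normal rather than t quantiles, it tends to understate the number of replicates when that number is small. The function name and defaults are hypothetical.

```python
import math
from statistics import NormalDist


def replicates_needed(cv_percent, reduction_percent, alpha=0.05, beta=0.20):
    """Approximate replicates per treatment for a one-tailed two-group test.

    cv_percent:        coefficient of variation, as a percent of the control mean
    reduction_percent: reduction from the control mean to be detected, in percent
    alpha, beta:       Type I and Type II error rates
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(1.0 - beta)
    n = 2.0 * ((z_alpha + z_beta) * cv_percent / reduction_percent) ** 2
    return math.ceil(n)
```

Under these assumptions, detecting a 25% reduction with a 20% coefficient of variation calls for about 8 replicates per treatment, and halving the coefficient of variation cuts the requirement to roughly a quarter.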

                     (D) Limit  test.  In a limit test it is only necessary to ascertain that: a fixed
                     standard (such as the LD50 for an acute contact) is greater than a given
                     threshold;  and/or the response at the limit dose or concentration does not
                     differ from the control response.   Only one treatment, the limit dose or
                     concentration, and the appropriate control(s) are tested.  This is referred to
                     as a limit test or maximum challenge concentration test.

                           (1)   Fixed standard.  For a fixed standard limit test,  the  null
                           hypothesis is that the estimated limit treatment parameter (e.g.,
                           percent survival) is  greater than or equal to the fixed threshold
                           value (e.g., 50% survival).  The  alternative hypothesis is that the
                           estimated  limit  parameter is less than the  fixed threshold value
                           (e.g., 50% survival).  (The test is concerned with direction; where the
                           response is inhibition relative to the control, switch the hypotheses around.)
                           Examples of statistical approaches are one sample binomial tests or
                           one sample t-tests.

                           (2)   Difference between two means (or medians).  For testing if
                           the treatment level affects the test organism, the null hypothesis is
                           that the treatment  mean  (or median) response is equal to the
                           control response mean (or median) level  and the  alternative
                           hypothesis is that the treatment  mean response differs from the
                           control response.  The  direction  of the  alternative  hypothesis
                           depends on what is considered an adverse direction for the specific
                           response being evaluated, such as decreased survival and weight or
                    increased mortality as compared to the control response.  Examples
                    of parametric and nonparametric two-group comparison tests are
                    Student's t-test and Wilcoxon rank sum test, respectively.
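
For the fixed standard limit test described above, a one-sample binomial test can be sketched in Python as follows. The sketch assumes the null hypothesis that the true survival proportion is at least the threshold `p0` (e.g., 0.5) and computes the one-sided probability of observing the recorded number of survivors or fewer. The function name is hypothetical, and treating each organism as an independent trial is an assumption that may not hold when organisms share a test chamber.

```python
from math import comb


def binomial_lower_tail_p(survivors, n_organisms, p0):
    """One-sided p-value: P(X <= survivors) when X ~ Binomial(n_organisms, p0)."""
    return sum(
        comb(n_organisms, k) * p0 ** k * (1.0 - p0) ** (n_organisms - k)
        for k in range(survivors + 1)
    )
```

With 10 organisms and 1 survivor, the p-value against a 50% survival threshold is about 0.011, so the null hypothesis would be rejected at an α of 0.05; with 2 survivors the p-value is about 0.055 and it would not.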

       (iv) Transformations, outliers, and non-detects—

              (A) Transformations.  Transformation of data (e.g., square root, log,
              arcsine-square root) may be useful for  a number of statistical  analysis
              purposes.  The two main reasons are to satisfy assumptions for statistical
              testing and to derive a linear relationship between two variables, so that
              linear regression analysis can  be  applied.    Added benefits  include
              consolidating data that may be spread out or that have several  extreme
              values (see reference  in paragraph (k)(25) of this guideline).  Once the
              data have been transformed, all statistical  analyses are performed on the
              transformed data.
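
As a small illustration, the arcsine-square root transformation of a proportion (such as the proportion surviving in a replicate) can be written in Python as below. The function name is hypothetical, and no adjustment for proportions of exactly 0 or 1 is applied in this sketch, although such adjustments are sometimes used.

```python
import math


def arcsine_sqrt(proportion):
    """Arcsine-square root transform of a proportion in [0, 1], in radians."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return math.asin(math.sqrt(proportion))
```

The transformed values range from 0 to about 1.571 (π/2); all subsequent statistical analyses would then be carried out on these values rather than on the raw proportions.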

              (B) Outliers. Outliers are measurements that are extremely large  or small
              relative  to  the  rest  of  the  data  and,  therefore,  are  suspected  of
              misrepresenting the  population from  which they were collected.   Unless
              there  is  a  known  documented  reason  for  the  outlier(s),  such  as
              measurement system problems  or instrument breakdown, the statistical
              analyses performed  should at a minimum include results using  the full
              data set (i.e., the suspected outlier(s) are not discarded). Outliers should
              not be discarded based on a  statistical  outlier  test  (see reference  in
              paragraph  (k)(25) of this guideline).   The  analyst  may  conduct  all
              statistical analysis of the data with both a full  and truncated  (presumed
              outliers are discarded) data set, however,  so that the effect of the presumed
              outlier(s) on the conclusion may be assessed.

              (C) Nondetects.  Data generated  from chemical analysis that  fall below
              the LOD  of the analytical procedure  are generally  described as  not
              detected, or nondetects (rather than as zero or not present), and the
              appropriate LOD should be reported.   There are a variety of  ways to
              evaluate data that include both detected  and non-detected values (see
              reference  in  paragraph  (k)(25)  of  this  guideline).   However,  for  a
              satisfactory test in a number of the  Group C guidelines,  test substance
              concentrations should  not be below the LOD  (see specific  OCSPP Series
              850, Group C guidelines), except in controls.

(3) Selection of test treatments—

       (i) Point estimate and concentration-response or dose-response test.  Toxicity
       tests where  the objective is  the concentration- or dose-response curve and a
       specific point response on the curve (e.g., LD50) usually consist of one or more
       control treatments and at least five test treatments, which should bracket the
       specific point(s) of concern for the test.  To obtain a reasonably precise estimate
       of the LC50 or LD50 using probit analysis, for example, one or more treatments
       should produce a response between, but not including, 0% and 50%, and one or more
       treatments should produce a response between, but not including, 50% and 100%.
       The spacing between test
       treatments  depends upon the expected  slope  of the concentration-  or  dose-
       response curve, information about which can be  gained  during a range-finding
       test.  The test treatment levels (doses or concentrations) are usually selected in a
       geometric series in which the ratio is between 1.5 and 3.2. When the objective of
       the test is to determine a regression-based estimate and sample size constraints
       apply, the use of more treatment levels is preferable to the use of more replicates.
       The  inclusion of  additional  treatment levels  rather  than  additional replicates
       results in better characterization of the overall concentration- or  dose-response
       relationship.
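
A geometric series of treatment levels, as described above, can be generated with a short Python sketch. The function name and the default of five levels are illustrative; the lowest level and the ratio would come from the range-finding test.

```python
def geometric_series(lowest, ratio, n_levels=5):
    """Treatment levels in a geometric series starting at `lowest`.

    Raises ValueError if the ratio is outside the usual 1.5 to 3.2 range.
    """
    if not 1.5 <= ratio <= 3.2:
        raise ValueError("ratio is usually between 1.5 and 3.2")
    return [lowest * ratio ** i for i in range(n_levels)]
```

For example, a lowest level of 1.0 with a ratio of 2.0 gives levels of 1, 2, 4, 8 and 16 in whatever dose or concentration units the test uses.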

       (ii) Hypothesis-based test—

              (A) Multiple-concentration or multiple-dose definitive test. Each test
              usually consists of one or more control treatments and  at least five test
              treatments  which  span the expected environmental concentrations and
              where at least the lowest treatment level is the NOEC (or NOEL).  The
              test treatments are usually selected in a geometric series in which the ratio
              is between  1.5 and 3.2. A key assumption is that the response data are
              monotonic  with increasing concentration  or  dose  (i.e., the degree  of
              biological effect increases as concentration or dose increases) or that there
              is  a threshold  response  such  that a NOEC (or NOEL) for a  given
              biological response should not occur at a treatment concentration or dose
              higher than one found to be statistically different from the control for the
              given biological response.  Where these assumptions do not hold, it is
              recommended that additional concentrations or doses be included to better
              characterize the relationship  of the  biological response with  exposure
              concentration or dose.  If the failure  is  suspected to  be due to high
              variability in a given response measurement, the number of replicates
              should be increased.

              (B) Limit test.  A limit test consists of a single treatment level and the
              appropriate control(s).  Individual OCSPP Series 850 Group C guidelines
              identify  the concentration or dose that  satisfies the  limit treatment level
              test for that guideline.

(4) Randomization. For test results to be satisfactory, test treatments should be randomly
assigned to individual test chambers and the test chambers randomly assigned to
locations. The locations may be randomly reassigned during the test.  Randomized block
designs or completely randomized designs may be used.  For test results to be
satisfactory, test organisms should ideally be randomly assigned to the test chambers;
where this is not practical, impartial assignment can be used (with the exception of
assignment intentionally according to sex).  (Note: random assignment as used here
implies a mathematically based unbiased assignment method, and impartial assignment
implies a non-mathematically based unbiased assignment procedure.) All test chambers
should be treated as similarly as  possible to eliminate  potential bias in test results.  The
methods used to randomize treatments among test chambers and test chambers among
locations should be recorded, as  well  as  methods of random  or impartial organism
assignment to test chambers.
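
A completely randomized assignment of treatments to test chambers and of chambers to locations can be sketched in Python as follows. This is one possible mathematically based method, not a prescribed one; the function name, the record layout, and the use of a recorded seed to document the randomization are assumptions made for illustration.

```python
import random


def randomize_layout(treatments, replicates, seed=None):
    """Randomly assign treatments to chambers and chambers to locations.

    Returns one record per chamber. Recording the seed allows the
    assignment to be reproduced and documented in the study records.
    """
    rng = random.Random(seed)
    # One entry per replicate chamber, then shuffle the treatment order.
    assigned = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(assigned)
    # Shuffle location numbers independently of the treatment assignment.
    locations = list(range(1, len(assigned) + 1))
    rng.shuffle(locations)
    return [
        {"chamber": i + 1, "treatment": t, "location": loc}
        for i, (t, loc) in enumerate(zip(assigned, locations))
    ]
```

Calling `randomize_layout(["control", "low", "high"], 4, seed=42)` yields twelve chamber records, four per treatment, each with a unique location. A randomized block design would instead shuffle treatments separately within each block.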

(5) Number of replicates.  The number of replicate test chambers for a given treatment
is dependent upon the objective of the specific guideline test.  Except for field tests which
are designed on a case-by-case basis, the minimum number of replicates for a given test
is described in each individual OCSPP Series 850 Group C guideline.

       (i) Regression-based  test.   When the objective of the test is to determine a
       regression-based  estimate and sample size  constraints  apply,  the  inclusion of
       additional  concentrations  rather than additional  replicates  results in  better
       characterization of the overall concentration-response relationship.  The objective
       of some  OCSPP Group C  guideline tests includes determination  of both a
       regression-based point estimate (e.g., LD50) and a hypothesis-based endpoint
       (e.g., NOEC) in which case the minimum number of replicates will be determined
       by the hypothesis-based method.

       (ii) Hypothesis-based test.  For  hypothesis-based tests,  the determination of the
       test-specific number of replicates depends upon the objectives of the  test, the
       statistical method(s) that may be used, the coefficient of variation, the size of
       effect to be detected, and the acceptable error rate.  (Note: several of the
       recommended nonparametric multiple-comparison tests cannot be performed
       without at least four replicates.) Individual testing facilities should
       consider variability observed  in their  laboratory  and  adjust  the  number of
       replicates upward where the minimum replication  number identified  in the test
       specific  guideline is not sufficient to provide  the  statistical  power to  detect
       adverse effects to the  test organisms or, if appropriate,  identify  and correct any
       environmental, handling, and culturing conditions, etc.  that  are resulting in the
       high variability.

(6) Controls.  Control groups are used to ensure that effects observed are associated with
or attributed only to the test substance exposure.  A control group should be similar in
every respect  to the test substance treatment groups  except for exposure to the test
substance. As described in paragraph (f)(1) of this guideline, in addition to blank (or
negative)  controls normally run, a vehicle control (solvent control) is  also tested if a
vehicle  was used to prepare the test substance.   To demonstrate satisfactorily that the
vehicle  has  no unacceptable  effect, the highest concentration  of the vehicle that was
added to any of the test chambers is used in the vehicle control.  It is recommended that
the vehicle  concentration be the same at each treatment level.   If either the control or
vehicle  control  results are not  satisfactory for the test, indicating problems with test
organisms or test procedures, the test results  should be considered unacceptable.  If both
the control  and the vehicle control results verify test  organism health and  status, the
control and vehicle  control results are compared using an appropriate  statistical method to
determine if there  is  an effect of  the  vehicle  on the test organisms.   If  there is a
statistically  significant difference between the control and the vehicle control, indicating
either a positive  or negative vehicle effect, for any of the measured response variables
using an α-level of 0.05, the study may be considered unacceptable.
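
The guideline leaves the choice of statistical method for the control versus vehicle control comparison to the analyst. As one hedged illustration, the Python sketch below runs an exact two-sided permutation test on the difference between the two group means. The permutation test is a stand-in chosen because it needs no distributional tables; it is not a method named by this guideline, and the function name is hypothetical.

```python
from itertools import combinations


def permutation_p_value(control, vehicle):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(control) + list(vehicle)
    n_control = len(control)
    n_vehicle = len(vehicle)
    total = sum(pooled)
    observed = abs(sum(control) / n_control - sum(vehicle) / n_vehicle)
    extreme = 0
    arrangements = 0
    # Enumerate every way of relabeling n_control values as "control".
    for idx in combinations(range(len(pooled)), n_control):
        s = sum(pooled[i] for i in idx)
        diff = abs(s / n_control - (total - s) / n_vehicle)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        arrangements += 1
    return extreme / arrangements
```

With four replicates per group there are only 70 possible arrangements, so the smallest attainable two-sided p-value is 2/70 (about 0.029); with three replicates per group it is 2/20 (0.10), which could never fall below an α of 0.05.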

(e) Test substance characterization—

       (1) Background  information on the test substance.  The information in paragraphs
       (e)(1)(i) through (e)(1)(vi) of this guideline should be known about the test substance
       prior to testing:

              (i) Chemical name; CAS number;  molecular structure;  source; lot or batch
              number;  purity  and/or percent  a.i.;  identities  and  concentrations  of major
              ingredients and major impurities; radiolabeling if any, location of label(s), and
              radiopurity; date of most recent assay and expiration date for sample.

              (ii) Appropriate storage and handling conditions for the test substance  to protect
              the integrity of the test substance.  (Note: health and safety precautions should
              also be known.  These considerations  are beyond the scope of these guidelines
              and depend upon the characteristics of the test substance).

              (iii) Physical and chemical properties of the test substance, including solubility in
              water and various  solvents; vapor pressure; hydrolysis at various pH;  pKa; etc.
              Of particular relevance are rates for processes such as hydrolysis,  photolysis, and
              volatilization.

              (iv) Stability and solubility as relevant, under the test conditions (see  paragraph
              (e)(2) of this guideline).

              (v) Physical and chemical properties and  stability information for the  analytical
              standard (if applicable).

              (vi) Analytical  method for quantification of the test substance  in the feed  or
              dosing solutions.   Analyses are conducted with the specific media for which it
              will be used during the test, i.e., under test conditions.

       (2) Preliminary analyses.

              (i) The Agency recommends preliminary testing  of the  test substance.   The
              information  about  stability  and  solubility  of the  test  substance should be
              developed under actual test conditions.  This information can be gained while
              doing the range-finding studies.

              (ii)  Information on the  behavior of  a  test  substance  should  be  based on
              experiments which are conducted under the same conditions  as those  occurring
              during the test.  These include but are not limited to:

                     (A) Test matrix characteristics (e.g., growth medium, soil, dust, spray,
                     etc.).

                     (B) Temperature, humidity, lighting, etc.

                     (C) With test organisms in place (when practical).

                     (D) Use of the same test containers.

              (iii) Recommended tests are listed in paragraphs (e)(2)(iii)(A) through
              (e)(2)(iii)(D) of this guideline:

                     (A) Stability trials should be conducted under actual test conditions.

                     (B) If relevant, solubility trials should be conducted under test conditions.

                     (C) Chemical analysis methods as  detailed in paragraph (g) of this
                     guideline.

                     (D) Storage stability of the test substance in the samples to be collected for
                     chemical analyses should  be determined.   This includes  determining
                     whether and how samples can be stored for future analysis.

       (3)  Sample storage.   If samples of the  exposure matrix (soil,  growth medium, dust,
       spray, etc.) collected for chemical analysis cannot be analyzed immediately, they should
       be handled and stored appropriately to minimize loss of the test substance.  Loss could be
       caused by  such processes  as  microbial degradation, hydrolysis, oxidation, photolysis,
       reduction, sorption, or volatilization. Stability determination under storage conditions,
       whether it refers to storing the test substance before testing or  storing samples awaiting
       analysis, is required by GLP regulation.  Test substance stability under storage conditions
       should be documented.

       (4) Analytical test substance determinations.

              (i) The media to be tested and the sampling frequency needed to document the
              concentration and stability of the test substance throughout exposure are defined
              in the test-specific OCSPP Series 850 Group C guidelines.

              (ii) For field tests, the media and frequency of testing depend on the objective of
              the study and on the stability and fate of the test substance, and are determined
              on a case-by-case basis.

(f) Preparation of test substances.

       (1)  The preferred choice for preparation  of the test substance is to use reagent water
       (deionized,  distilled or reverse osmosis  water), providing the test substance can  be
       dissolved in water  and does  not readily hydrolyze.  If the test substance cannot  be
       dissolved in reagent water, vehicles are often used.  If a vehicle, i.e., a solvent, is
       absolutely necessary to dissolve the  test substance, the amount used should not exceed
       the  minimum volume necessary  to  dissolve or suspend the test substance.  If the test
       substance is a mixture, formulation  or  commercial  product, none of the ingredients is
       considered  a vehicle unless an extra amount is used in its preparation for testing.

              (i) Preferred vehicles are specific to  the test and test organisms and are listed in
              each individual guideline in the OCSPP Series 850 Group C guidelines.

              (ii)  If a vehicle is used to prepare the  test substance,  a vehicle control is also
              included in the test, in addition to the no-treatment control.  The same batch of
              vehicle used to prepare the test treatment doses or concentrations is used in the
              vehicle control.  For a valid test, the selected vehicle should not affect the  test
              organisms at the concentration used.  A vehicle should not interfere with  the
              metabolism (degradation) of the test substance, alter the chemical properties of
              the test substance, or produce physiological or toxic effects to test organisms.

              (iii) Ideally, vehicle concentration should be kept constant in the vehicle control
              and all test treatments.  If the concentration of vehicle is not kept constant,  the
              highest concentration of vehicle used in any test treatment level should be used in
              the vehicle control.  Limits on the amount of vehicle that can be used are given in
              each guideline in OCSPP Series 850 Group C.

       (2)  All techniques used  in  stock  solution preparation  (shaking, stirring, sonication,
       heating, solvent, etc.) should be recorded. The appearance of the stock solution should be
       observed and recorded.

       (3)  If the test substance is a formulated preparation, the test concentrations  should be
       expressed in terms of the concentration of a.i.

(g) Analytical methods and sampling for verification of exposure—

       (1) Method validation.

              (i) The analytical method used to measure the amount  of test substance in  the
              exposure matrix (e.g., soil, growth medium, dust, spray, etc.) or stock solution
              should be validated by  appropriate laboratory practices before  beginning  the
              definitive test.   An analytical method  is not acceptable if likely degradation
              products  of the test substance give positive or negative interferences which cannot
              be systematically identified and mathematically corrected, unless it is shown that
              such degradation products are not present in the test system during the test.

              (ii)  Method validation is conducted for the  purpose of determining  the linear
              range, detection limit,  accuracy and precision (repeatability and reproducibility)
              of the method for analysis of the test substance under the conditions of the test.
              Thus, quality control (fortification) samples should be prepared at concentrations
              spanning the range of concentrations to be used in the definitive test, using the
              same procedures (vehicles, etc.) and in the same matrix (soil, etc.) representative
              of what will be used in the test.

              (iii) The  method validation  should include a determination of linearity between
              detector response and test substance concentration, the  LOQ,  the  MDL, method
              accuracy (average  percent recovery) and precision (relative standard deviation).
              The method validation should establish the  acceptance criteria  for the quality
              control (QC) samples that will be prepared and analyzed during the test.
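
Method accuracy (average percent recovery) and precision (relative standard deviation) can be computed from fortified sample results as in the Python sketch below. The function names are hypothetical, and the sketch assumes all fortified samples in a set share the same nominal concentration.

```python
from statistics import mean, stdev


def percent_recoveries(measured, nominal):
    """Percent recovery of each fortified sample relative to its nominal level."""
    return [100.0 * m / nominal for m in measured]


def accuracy_and_precision(measured, nominal):
    """Return (average percent recovery, relative standard deviation in percent)."""
    recoveries = percent_recoveries(measured, nominal)
    average = mean(recoveries)
    rsd = 100.0 * stdev(recoveries) / average
    return average, rsd
```

For fortified samples measured at 9.0, 10.0 and 11.0 against a nominal level of 10.0, the average recovery is 100% and the relative standard deviation is 10%. In practice the calculation would be repeated at each fortification level spanning the test concentration range.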

       (2) Collection of samples. Samples should be  collected in such a manner as to provide
       an accurate representation of the matrix being  sampled.  Samples should be  processed
       and analyzed immediately, or handled and stored in a manner which minimizes loss of
       test  substance  through microbial degradation,  photodegradation,  chemical  reaction,
       volatilization, sorption or other processes.

       (3) Analysis  of test samples.  Concurrent with each analysis  of test  samples, quality
       control (fortified) samples  should be analyzed.   QC samples  are prepared by adding
       known amounts of the test substance to the test matrix.   Minimally,  one QC sample
       should be at the low end of the test concentration range and one QC sample at the high
       end.   A control (zero-level fortification) sample should also be included.   Test sample
       recoveries may be corrected for inherent method bias as determined from the concurrent
       analysis of freshly fortified quality control samples.
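
Where test sample results are corrected for method bias, the correction can be sketched in Python as below. The sketch assumes a simple proportional correction using the mean percent recovery of the concurrently analyzed QC samples; the function name is hypothetical, and whether any correction is applied should follow the acceptance criteria set during method validation.

```python
def correct_for_recovery(measured_concentration, mean_qc_recovery_percent):
    """Adjust a measured concentration for the mean recovery of concurrent QC samples."""
    if mean_qc_recovery_percent <= 0:
        raise ValueError("mean QC recovery must be positive")
    return measured_concentration * 100.0 / mean_qc_recovery_percent
```

For example, a measured concentration of 4.5 with a mean concurrent QC recovery of 90% corresponds to a corrected concentration of 5.0 in the same units.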

(h) Reference toxicants.   Historically, reference toxicity testing has been thought to provide
three types of information relevant to the interpretation of toxicity test data: an indication of the
relative health of the organisms used in the test; a demonstration that the laboratory can perform
the test procedure in a reproducible manner over a period of time; and information to indicate
whether the sensitivity of a particular strain or population in use at a laboratory is comparable to
those used in other facilities and how sensitivity varies over time. However, performance of
control organisms over time may be a better indicator of success in handling  and  testing of at
least some organisms. Nonetheless, periodic reference toxicant testing can provide an indication
of the overall comparability of results within  and among laboratories.  Although a positive
control is not standard for each test, a quarterly or semiannual  positive control (on a guideline-
specific basis) can serve as a means of detecting possible interlaboratory or temporal variation.
A reference  toxicant might also be desirable when there is any significant change  in source or
maintenance of test organisms or in other test conditions.

(i) Monitoring of test conditions.  Test conditions are specified in each  test-specific guideline in
the OCSPP   Series  850 Group C.   These  conditions  include environmental factors such  as
temperature, humidity, and lighting.  Methods used for monitoring test conditions should be in
accordance with established methods (e.g., those published by U.S. EPA, ASTM, APHA et al.,
etc.).

       (1) Temperature.  Preferably, temperature should be monitored continuously (recorded
       at least hourly).  Alternatively, the maximum and minimum should  be measured daily
       (which is at least two measurements during each 24-hour period during the
       study).   Temperature  measurements should be  made in at  least  one representative
       location.

       (2) Humidity. Where applicable, humidity should be monitored continuously in at least
       one representative location.

       (3) Lighting.   Guidance for lighting in laboratory toxicity tests can  be  found in the
       reference in paragraph (k)(3) of this guideline.
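
The temperature monitoring guidance in paragraph (i)(1) can be checked against a logged record. The sketch below is a minimal illustration, assuming readings are stored as (timestamp, temperature) pairs and that each calendar day stands in for a 24-hour period; the function name, the one-hour gap threshold parameter, and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summarize_temperature_log(readings, max_gap=timedelta(hours=1)):
    """Summarize a temperature log.

    readings: list of (datetime, temperature) pairs.
    Returns:
      daily          - {date: (min, max, number_of_readings)}
      days_under_two - dates with fewer than two readings
      gaps           - (start, end) pairs where consecutive readings
                       are more than max_gap apart
    """
    readings = sorted(readings)
    by_day = defaultdict(list)
    for timestamp, temperature in readings:
        by_day[timestamp.date()].append(temperature)

    daily = {day: (min(t), max(t), len(t)) for day, t in by_day.items()}
    days_under_two = [day for day, (_, _, n) in daily.items() if n < 2]
    gaps = [
        (earlier, later)
        for (earlier, _), (later, _) in zip(readings, readings[1:])
        if later - earlier > max_gap
    ]
    return daily, days_under_two, gaps


# Hypothetical log with a missed interval and a sparsely sampled day.
log = [
    (datetime(2012, 1, 1, 0, 0), 20.0),
    (datetime(2012, 1, 1, 1, 0), 21.0),
    (datetime(2012, 1, 1, 4, 0), 19.5),
    (datetime(2012, 1, 2, 0, 0), 20.5),
]
daily, days_under_two, gaps = summarize_temperature_log(log)
print(daily, days_under_two, gaps, sep="\n")
```

The daily minimum and maximum, together with any flagged gaps, are the kind of record that paragraph (j)(2)(ix) asks to be reported.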

(j) Reporting—

       (1) Background information. In addition to the reporting requirements prescribed in the
       Good Laboratory Practices Standards  (40 CFR part 792  and 40 CFR part 160), the report
       should include the information in paragraphs (j)(1)(i) through (j)(1)(vi) of this guideline:

       (i) Test facility (name and location), test dates, and personnel.

       (ii) The name of the sponsor, study director, principal investigator, names of other
       scientists or professionals, and the names of all supervisory personnel involved in
       the study.

       (iii) Raw data sufficient to allow independent confirmation of the study authors'
       conclusions should be presented with the study report.  Raw data include all
       measurements recorded during the study, including but not limited to effects
       (mortality, growth, etc.), environmental conditions (temperature, etc.), and test
       substance concentration or dose measured as specified, and are used for the
       reconstruction and evaluation of the report of that study.  The absence of raw data
       may make the study incomplete and impossible to review for scientific soundness
       and thus can lead to rejection of the study as scientifically unsound.

       (iv) The  signed and dated reports of each of the individual  scientists or  other
       professionals involved in the  study, including each person who, at the request or
       direction of the testing facility or sponsor, conducted an analysis or evaluation of
       data or specimens from the  study after data generation was  completed.

       (v) The locations where all raw data and the final report are stored.

       (vi) The statement prepared and signed by the quality assurance unit identifying
       whether or not the study was conducted in compliance with Good Laboratory
       Practices Standards (40 CFR part 792 or 40 CFR part 160).  Alternatively, the
       statement  can indicate it  was conducted  under OECD Principles  of Good
       Laboratory Practice, in accordance with  the multilateral agreement with OECD
       member countries.

(2)  Data  elements.  The test report should include all information for providing a
complete and accurate description of test procedures and evaluation of test results.

       (i) Objectives and procedures stated in the guideline, including any changes or
       deviations or occurrences which may have influenced the results of the test.

       (ii) Identification of the test substance (including source, lot or batch number, and
       purity) and known physical and chemical properties that are pertinent to the test.
       As relevant, solubility and stability of the test substance under the test conditions,
       and stability of the test substance under storage conditions  if  stored prior to
       analysis.  It should be reported if a formulation is being tested. Where appropriate
       a cross-reference to OCSPP  Series  830 (Product Properties  Test  Guidelines)
       guideline study results can be used to report this data.

       (iii) Methods of preparation of the  test substance and the concentrations or doses
       used in definitive testing. If vehicles are used, the name and source of the vehicle,
       the nominal concentration  of the test substance in the  vehicle, and the  vehicle
       concentration(s) used in the test.

       (iv) Information about the test organisms.

(v) A description of the test system used in definitive and any preliminary testing.
This  includes a  description of the  test chambers,  method of test  substance
introduction, number  of organisms  per chamber,  number of replicates  per
treatment, all environmental parameters, description of any feeding during the test
(if applicable), including type of food, source, amount given and frequency.

(vi) Document and submit to the Agency the preliminary test results for review
with the study to which they apply.

(vii) Results of measurements  of test substance.   All analytical procedures and
results should be described.  Report  all  chemistry methods used in preliminary
trials, in range-finding tests,  in establishing percent purity of batches of test
substance, or in measuring concentrations in feed, dosing solutions, or animals.
Include in the  documentation  a complete description of the method so that  a
bench chemist can independently determine what  equipment to use and perform
the analysis. Also include the raw data, standards, quality control samples, and
chromatograms from samples taken during either definitive or range-finding tests,
not of standards or samples from recovery tests.  For a satisfactory test, the
accuracy of the method and its limit of detection (LOD), method detection limit (MDL),
and limit of quantitation (LOQ) should be given.

(viii) Any difficulties in maintaining constant test substance concentrations should
be reported.  If it is observed that the stability or homogeneity of the test
substance cannot be maintained, care should be taken in the interpretation of the
results, and note made that the results may not be reproducible.

(ix)  Methods, frequency, and  results of environmental monitoring performed
during the study (temperature, lighting, etc.) and other records of test conditions.

(x) Biological  observations should  be  reported  in sufficient detail to allow
complete independent evaluation of the results (see specific test guidelines in this
group for a description of what should be reported).

(xi) All data developed during the study that are suggestive or predictive of toxic
effects and all concomitant gross toxicological manifestations.

(xii) Calculated endpoints and a description of all statistical methods, including
software used, handling of outlier data points, handling of non-detect or zero
values, tests to validate the assumptions of the analyses, level of significance, any
data transformations, and, for hypothesis tests, a measure of the sensitivity of the test
(either the minimum significant difference or the percent change from the control
that this minimum difference represents).  Raw data should be reported to allow
independent verification of statistical procedures.

(xiii) Methods  used for  test chamber and treatment randomization  as well as
methods for impartial assignment of test organisms to test chambers.
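
Paragraph (j)(2)(xii) asks for a measure of test sensitivity, either the minimum significant difference (MSD) or the percent change from the control that it represents. The sketch below shows one way this arithmetic might be done for a single treatment compared with the control using a pooled-variance t statistic. This is an illustrative assumption, not a method prescribed by this guideline: multi-treatment designs would normally use a critical value from the chosen multiple-comparison procedure, and the function name, the data, and the tabulated critical value supplied by the user are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def minimum_significant_difference(control, treatment, t_crit):
    """Return (msd, percent_of_control) for a two-group comparison.

    control, treatment: replicate responses for each group.
    t_crit: critical t value taken from a table for the chosen
            significance level and (n_c + n_t - 2) degrees of freedom.
    """
    n_c, n_t = len(control), len(treatment)
    # Pooled sample variance across the two groups.
    pooled_var = (
        (n_c - 1) * variance(control) + (n_t - 1) * variance(treatment)
    ) / (n_c + n_t - 2)
    msd = t_crit * sqrt(pooled_var * (1 / n_c + 1 / n_t))
    percent_of_control = 100.0 * msd / mean(control)
    return msd, percent_of_control


# Hypothetical replicate responses; 1.943 is the one-sided 0.05
# critical t value for 6 degrees of freedom.
control = [10, 12, 11, 13]
treatment = [8, 9, 10, 9]
msd, pct = minimum_significant_difference(control, treatment, t_crit=1.943)
print(round(msd, 3), round(pct, 1))
```

Reporting both figures lets a reviewer judge how large an effect the test could have detected relative to control performance.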

(k) References. The references in this paragraph should be consulted for additional background
material on this test guideline.

       (1) American Public Health  Association,  American Water Works Association,  Water
       Environment Federation, 1998.  Standard Methods for the Examination of Water and
       Wastewater, 20th edition. Part 8010, Toxicity: Introduction.

       (2) American Society for Testing and Materials, 2003.  ASTM E 1847-96.  Standard
       practice for statistical analysis of toxicity tests conducted under ASTM guidelines.  In
       Annual Book of ASTM Standards, Vol. 11.06, West Conshohocken, PA.  Current edition
       approved December 10, 1996, Reapproved 2003.

       (3) American Society for Testing and Materials, 2002.  ASTM E 1733-95.  Standard
       guide for the use of lighting in laboratory testing. In Annual Book of ASTM Standards,
       Vol. 11.06, ASTM, West Conshohocken, PA.  Current edition approved September 10,
       1995; Reapproved 2002.

       (4) Bruce, R.D. and D.J. Versteeg, 1992.  A statistical procedure for modeling continuous
       toxicity data. Environmental Toxicology and Chemistry 11: 1485-1491.

       (5) Chapman, G.A., B.S. Anderson, A.J. Bailer, R.B. Baird, R. Berger, D.T. Burton, D.L.
       Denton, W.L. Goodfellow, M.A. Heber, L.L. McDonald, T.J. Norberg-King and P.J.
       Ruffier, 1996. Methods and appropriate endpoints.  In Whole Effluent Toxicity Testing,
       D.R. Grothe, K.L. Dickson and D.K. Reed-Judkins, eds., SETAC Press, Pensacola, FL.

       (6) Daum,  R.J., 1970. Revision of two  computer programs for probit analysis. Bulletin
       of the Entomological Society of America 16:10-15.

       (7) Daum, R.J. and W. Killcreas, 1966.  Two computer programs for probit analysis.
       Bulletin of the Entomological Society of America 12:365-369.

       (8) deBruijn, J.H.M. and  M. Hof,  1997.  How to measure no effect.  Part IV: How
       acceptable is the ECx from an environmental policy point of view?  Environmetrics
       8:263-267.

       (9) Fairweather, P.G., 1991.  Statistical power and design requirements for environmental
       monitoring. Australian Journal of Marine and Freshwater Research 42:555-567.

       (10) Finney, D.J., 1971.  Probit Analysis, 3rd ed., Cambridge University Press, London and New York.

       (11) Litchfield, J.T., Jr. and F.  Wilcoxon, 1949.  A simplified method of evaluating
       dose-effect experiments. Journal of Pharmacology and Experimental Therapeutics 96:99-133.

       (12) Nabholz, J.V., 1991. Environmental hazard and risk assessment under the Toxic
       Substances Control Act. Science of the Total Environment,  109/110: 649-665.

       (13) Nabholz, J.V., P. Miller and M. Zeeman, 1993. Environmental risk assessment of
       new chemicals under the Toxic Substances Control Act (TSCA) Section 5. In
       Environmental Toxicology and Risk Assessment, Landis, W.G., Hughes, J.S., and Lewis,
       M.A., eds., ASTM STP 1179, American Society for Testing and Materials, Philadelphia,
       PA, pp. 40-55.

(14)  Nyholm, N., P.S.  Sorenson, K.O. Kusk,  and  E.R. Christensen, 1992.  Statistical
treatment  of data  from  microbial  toxicity  tests.   Environmental  Toxicology  and
Chemistry 11:157-167.

(15)  Organization for Economic Co-operation  and  Development, 1998.  Report of the
OECD Workshop on Statistical  Analysis of Aquatic Toxicity Data.  OECD Series on
Testing and Assessment, No. 10. ENV/MC/CHEM(98)18

(16)  Organization  for  Economic Co-Operation  and Development,  2006.   Current
Approaches in the Statistical Analysis of Ecotoxicity Data: A Guidance to Application.
OECD Series on Testing and Assessment, No. 54. ENV/JM/MONO(2006)18.

(17)  Pack, S.,  1993.  A review  of statistical data analysis and experimental design in
OECD aquatic toxicology test guidelines. Report to OECD. Paris.

(18)  Smrchek, J.C., R. Clements, R. Morcock, and W. Rabert, 1993.  Assessing
ecological hazard under TSCA: methods and evaluation of data. In Environmental
Toxicology and Risk Assessment, Landis, W.G., Hughes, J.S., and Lewis, M.A., eds.,
ASTM STP 1179, American Society for Testing and Materials, Philadelphia, PA, pp. 22-39.

(19)  Smrchek, J.C. and M.G. Zeeman, 1998.  Assessing risks to ecological systems  from
chemicals. In Handbook of Environmental Risk Assessment and Management, P. Calow,
ed., Blackwell Science, Ltd., Oxford, UK, pp. 24-90, Chapter 3

(20)  Stephan, C.E., 1997.  Methods for calculating an LC50.  In Aquatic Toxicology and
Hazard Evaluation, ASTM STP 634, F.L.  Mayer  and  J.L. Hamelink, eds., American
Society for Testing and Materials, Philadelphia,  PA.

(21)  U.S.  Environmental Protection  Agency, 1982.  Pesticide Assessment Guidelines
Subdivision L—Hazard Evaluation:  Nontarget  Insects.  Office of Pesticides and Toxic
Substances, Washington, D.C.  EPA-540/9-82-019

(22)  U.S. Environmental Protection Agency, 1994.  Pesticides Reregistration Rejection
Rate Analysis: Ecological Effects, EPA 738-R-94-035, Office of Prevention, Pesticides
and Toxic Substances, December, 1994

(23)  U.S. Environmental Protection Agency, 1997.  Terms of Environment, Glossary,
Abbreviations, and Acronyms, Communications, Education, and Public  Affairs,  EPA
175-B-97-001, December 1997.

(24)  U.S. Environmental Protection Agency, 2000.  Methods for Measuring the  Toxicity
and   Bioaccumulation   of  Sediment-Associated   Contaminants  with   Freshwater
Invertebrates, Second Edition, EPA 600/R-99/064, March 2000.

(25) U.S.  Environmental Protection  Agency,  2000.   Guidance for Data  Quality
Assessment,  Practical  Methods  for  Data  Analysis.    EPA  QA/G9.   Office  of
Environmental Information, Washington, DC.  EPA/600/R-96/084, July.

(26) U.S.  Environmental  Protection Agency, 2002.  Methods for measuring the acute
toxicity of effluents  and  receiving waters to freshwater and  marine  organisms, Fifth
edition, Office of Water, Washington, DC. EPA-821-R-02-012

(27) U.S.  Environmental Protection Agency,  2002.  Short-term methods for estimating
the chronic toxicity of effluents and receiving waters to freshwater organisms, Fourth
edition, Office of Water, Washington, DC. EPA-821-R-02-013.

(28) U.S.  Environmental Protection Agency,  2002.  Short-term methods for estimating
the chronic toxicity of effluents and receiving waters to marine and  estuarine organisms,
Third edition, Office of Water, Washington, DC. EPA-821-R-02-014.

(29) U.S. Environmental Protection Agency, Code of Federal Regulations (CFR) Title 40
- Pesticide Programs Subchapter E—Pesticide Programs. Part 158—Data Requirements
for Pesticides.

(30) VanEwijk, P.H. and J.A. Hoekstra, 1993.  Calculation of the EC50 and its confidence
interval when a subtoxic stimulus is present. Ecotoxicology and Environmental Safety
25:25-32.

(31) Zeeman, M. and J. Gilford, 1993. Ecological hazard evaluation and risk assessment
under EPA's Toxic Substances Control Act (TSCA): an introduction.  In Environmental
Toxicology and Risk Assessment, Landis, W.G., Hughes, J.S., and Lewis,  M.A.,  eds.,
ASTM STP 1179, American Society for Testing and Materials, Philadelphia, PA, pp. 7-21.

(32) Zeeman, M.G., 1995. Ecotoxicity testing and estimation methods developed under
Section 5  of the Toxic Substances Control Act (TSCA),  In Fundamentals of Aquatic
Toxicology, 2nd Edition, G.M. Rand, ed., Taylor and Francis, Washington, DC, pp. 703-715.