EPA
United States
Environmental Protection
Agency
Office of Water
4304T
EPA-822-R-20-002
January 2020
EPA RESPONSE TO THE
EXTERNAL PEER REVIEW
REPORT ON THE CHRONIC
TOXICITY OF ALUMINUM
TO THE CLADOCERAN,
CERIODAPHNIA DUBIA:
EXPANSION OF THE EMPIRICAL
DATABASE FOR
BIOAVAILABILITY MODELING
(2018)

U.S. ENVIRONMENTAL PROTECTION AGENCY
OFFICE OF WATER
OFFICE OF SCIENCE AND TECHNOLOGY
HEALTH AND ECOLOGICAL CRITERIA DIVISION
WASHINGTON, D.C.
Table of Contents
1 Introduction
1.1 Background
1.2 Peer Reviewers
1.3 Review Materials Provided
1.4 Charge Questions
2 External Peer Reviewer Comments and EPA Responses, Organized by Charge Question
2.1 Charge Question 1
2.2 Charge Question 2
2.3 Charge Question 3
2.4 Charge Question 4
2.5 Charge Question 5
2.6 Charge Question 6
2.7 Charge Question 7
2.8 Charge Question 8
2.9 Charge Question 9
2.10 Charge Question 10
2.11 Charge Question 11
2.12 Charge Question 12
3 Additional Comments Provided
4 References Cited by Reviewers and EPA Responses

1 Introduction
EPA organized a contractor-led independent, external peer review of an aquatic life toxicity test
report entitled "Chronic Toxicity of Aluminum to the cladoceran, Ceriodaphnia dubia:
Expansion of the empirical database for bioavailability modeling" (OSU 2018). Oregon State
University (OSU) conducted the invertebrate toxicity tests for aluminum to expand the toxicity
test dataset that may be used for bioavailability model development to estimate the effects of
aluminum on aquatic organisms.
The external peer review was completed on July 31, 2018. The external peer reviewers provided
their independent responses to EPA's charge questions. This report documents EPA's response
to the external peer review comments provided to EPA.
This report presents the 12 peer review charge questions and five individual external peer
reviewer comments (verbatim) on the charge questions in Sections 2.1 through 2.12. Additional
comments outside of the charge questions are presented in Section 3. New information (e.g.,
references) provided by reviewers is presented in Section 4. Each reviewer's comments were
separated by charge question into distinct topics and EPA responded to each topic individually.
1.1	Background
Section 304(a)(1) of the Clean Water Act, 33 U.S.C. § 1314(a)(1), directs the Administrator of
EPA to publish water quality criteria that accurately reflect the latest scientific knowledge on
the kind and extent of all identifiable effects on health and welfare that might be expected from
the presence of pollutants in any body of water. In support of this mission, EPA is working to
update water quality criteria to protect aquatic life from the potential effects of aluminum in
freshwater environments. Invertebrate toxicity tests for aluminum have been conducted by
Oregon State University and are as yet unpublished in the peer-reviewed literature. EPA thus
funded a contractor-led focused, objective evaluation of these invertebrate toxicity tests, to
determine if their quality was sufficient for EPA to include them in the development of a
bioavailability model to calculate the effects of aluminum on aquatic organisms under a range of
water chemistry conditions.
1.2	Peer Reviewers
An EPA contractor identified and selected five expert external reviewers who met the technical
expertise criteria provided by EPA and who had no conflict of interest in performing this review.
The EPA contractor provided reviewers with instructions, the final report, and the charge to
reviewers prepared by EPA. Reviewers worked individually to develop written comments in
response to the charge questions.
1.3	Review Materials Provided
• OSU 2018 Final Report and Appendices
1.4	Charge Questions
1. Were an adequate number of concentrations tested to fully-characterize concentration-
response and determine an accurate and scientifically-defensible chronic effect
concentration (e.g., EC20)?
2.	Was there a sufficient number of replicates for each test concentration and control to pass
statistical rigor for the type of test and test conditions?
3.	Was the source, maintenance, and husbandry of test organisms well described?
4.	Were the control's survival rates acceptable?
5.	Were test organisms appropriately acclimated for the type of test and test water
conditions to represent their chronic sensitivity under those conditions?
6.	Were test endpoints and data acceptability criteria well defined and explained?
7.	Was preparation of test solutions fully described and target test concentrations verified
prior to testing?
8.	Were manipulated test water quality variables (e.g., pH, DOC, water hardness) measured
with sufficient frequency and accuracy to represent intended levels?
9.	Was the frequency and accuracy of chemical concentrations measured in test solutions
sufficient to represent intended exposure levels throughout the duration of the test(s)?
10.	Were any anomalies in the test explained or justified with additional information or
testing?
11.	Do the reported test results meet or exceed expectations for use in model development for
the derivation of ambient water quality criteria for the protection of aquatic life?
12.	Is there any reason to be concerned with the use of the test results in the criteria
derivation process?
2 External Peer Reviewer Comments and EPA Responses, Organized
by Charge Question
The following tables list the charge questions submitted to the external peer reviewers, the
external peer reviewers' comments regarding those questions (broken into distinct topics), and
EPA's responses to the peer reviewers' comments.
2.1 Charge Question 1
1. Were an adequate number of concentrations tested to fully-characterize concentration-response and determine an accurate and
scientifically-defensible chronic effect concentration (e.g., EC20)?
Reviewer
Comments
EPA Response to Comment
Reviewer 1
Yes. The test was conducted following standard US EPA chronic testing methodology
according to US EPA (2002). This reference is not provided in the reference list (it should
be), but presumably refers to EPA-821-R-02-013. According to this guidance, a minimum of
5 test concentrations and a control should be used in a definitive test. As each test in this
study included 5 exposure concentrations and a dilution water control (p. 2-2), it is judged to
be adequate for the test purpose. The range of concentrations chosen was also deemed
adequate to achieve estimates of the desired effect levels for reproduction (10, 20, and 50%
effect; Table 3-13). With the exception of one test in which effects on survival occurred, all
test concentrations could be used to estimate reproductive effects.
Thank you for your suggestion. The reference is not
in the main body of the report but is cited in
Appendix A of the report.
Reviewer 2
A total of nine different tests were conducted under different pH, hardness and DOC
conditions. Five total Al concentrations plus controls were generally used in the various tests.
This number of concentrations is generally considered adequate.
Thank you for your comment.
Reviewer 3
Yes, 5 concentrations of Al and a negative control were used for each test. This design
appeared to follow the EPA guidelines for toxicology testing with freshwater organisms. The
concentrations used were low that did not result in complete mortality at the highest
concentration of each test. Therefore, lethal effect concentrations (LCs) could not be
calculated.
Thank you for your comment. While lethal
concentrations were not observed in all tests, the
chronic endpoint for reproductive effects occurred in
all tests.
Reviewer 4
Comment:
In my opinion, an adequate number of concentrations were tested to allow full
characterization of the concentration response and allow determination of a scientifically-
defensible chronic effect concentration.
Rationale:
This research project evaluated the effects of multiple water quality variables on the toxicity
of Aluminum (Al) to the cladoceran Ceriodaphnia dubia. The goal of the study was to
increase the range of water quality variables under which a reasonable prediction of
invertebrate toxicity could be performed under a given set of water quality variables. The test
followed standard USEPA methodology (US EPA 2002). The methods included in this
manual are referenced in Table IA, 40 CFR Part 136 regulations and, therefore, constitute
approved methods for acute toxicity tests. These methods were used in the present study with
modifications to address different water types and pH levels. For example, concentrations
were based on previous studies shown to cause a negative impact on C. dubia survival and
reproduction. The standard EPA protocol calls for five test concentrations and a control and
this was mostly followed in the present study. For one test (Test #: Al 1185 CDC; p. 12,
Appendices (page 1, Appendix B) six concentrations of Al were used, plus a treatment
labeled "non pH"). This was apparently a confirmatory test for comparison to results obtained
at the Chilean Mining and Metallurgy Research Center (CIMM; Santiago, Chile) and
Universidad Adolfo Ibanez (UAI; Santiago, Chile) and reported in Gensemer et al. (2018) as
indicated on p. 29, paragraph 3. Five concentrations is the number usually followed by most
toxicity testing laboratories including those administered by the US EPA (such as the EPA
facility in Cincinnati, OH with which I am familiar). This allows the present study to be
compared to the results of other laboratories and have such results be incorporated into the
statistical model developed by the authors. This regression model can be used to develop a
scientifically defensible chronic effect concentration such as the EC20 (dose which causes a
20% change from control response of the test organisms).
Thank you for your comment.
Reviewer 5
The study was performed following the agreed to protocol. However, one study used a 45%
bisection of the test concentrations rather than the protocol specified 50% bisection. While I
do not believe that this is a fatal flaw in the analysis, I believe that it does warrant a section in
the report for protocol deviations (rather than as only noted in Section 2.5 [page 2-2]). This
would also provide an opportunity to offer the analytical issues (as identified in Section 3.2
[page 3-4]). I also believe the authors should assess whether the analytical anomalies bias the
results high, low, or neutral. This is very helpful in the use of these results. In my overall
opinion, all test concentrations were sufficiently characterized to provide a meaningful and
accurate description of the test results and the chronic toxicity of aluminum.
Thank you for your comment. Section 3.5, "Protocol
Deviations and Amendments," reports the authors'
statement that no protocol deviations occurred
during the toxicity tests that would affect the study
outcomes.
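Reviewer 5's comment above contrasts a 45% bisection with the protocol-specified 50% bisection. As a minimal sketch, and assuming that "bisection" refers to the dilution factor between adjacent test concentrations, the following Python snippet generates both five-concentration series. The function name `dilution_series` and the 1000 µg/L starting concentration are hypothetical and are not taken from the OSU report.

```python
def dilution_series(highest, factor=0.5, n=5):
    """Return n concentrations, highest first, each `factor` times the one before."""
    if not 0 < factor < 1:
        raise ValueError("factor must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return [highest * factor ** i for i in range(n)]


# Hypothetical highest concentration (ug/L); not a value from the study.
series_50 = dilution_series(1000.0, factor=0.50)
series_45 = dilution_series(1000.0, factor=0.45)

for label, series in (("50%", series_50), ("45%", series_45)):
    print(label, [round(c, 1) for c in series])
```

Under this assumption, a 0.45 factor places the lowest of five concentrations somewhat below the one produced by a 0.5 factor, so the series spans a slightly wider range.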
2.2 Charge Question 2
2. Was there a sufficient number of replicates for each test concentration and control to pass statistical rigor for the type of test and
test conditions?
Reviewer
Comments
Response to Comments
Reviewer 1
Yes. There were 10 replicate chambers for each exposure concentration and control, each
containing one cladoceran. This is consistent with US EPA guidance (EPA-821-R-02-013).
Thank you for your comment.
Reviewer 2
Yes. Ten replicates per treatment is adequate.
Thank you for your comment.
Reviewer 3
Yes, 10 replicates per treatment were usually used for this type of test. The report (section
2.9) did not clearly say the number of organisms used per replicate chamber.
Thank you for your comment. The number of
replicates, as stated in Section 2.5, is ten.
Reviewer 4
Comment:
Yes, the number of replicates (10 per Al treatment concentration and 10 in the non-treated
control) was sufficient to allow sufficient statistical rigor for a C. dubia chronic toxicity
evaluation under the stated test conditions.
Rationale:
Ten replicates of each toxicant concentration and the control is the number recommended by
the US EPA (2002). This number of replicates is used by most toxicity testing laboratories,
allowing comparison of the results of the present study with previous (and likely future)
results from other laboratories. Statistical dogma suggests that ~30 replicates is the optimal
number when evaluating biological data. However, in this (and most other toxicity testing
laboratories) the test conditions were carefully controlled, using 1) moderately hard diluent
water prepared in-house (please see question 7 below), 2) environmental chambers controlled
for pH and light regimen, and 3) neonates that were all less than 24 hours old. All of these
conditions will serve to reduce variability in organism response to exposure, which will
support rigorous statistical testing using 10 replicates.
Thank you for your comment.
Reviewer 5
The number of replicates (10) and test concentrations (minimally 5 plus a control) were
standard with in ecotoxicity testing with Ceriodaphnia dubia. These are acceptable.
Thank you for your comment.
2.3 Charge Question 3
3. Was the source, maintenance, and husbandry of test organisms well described?
Reviewer
Comments
Response to Comments
Reviewer 1
Partially. The source of the organisms was well described. They were obtained from in-house
cultures that had been maintained for over 10 years and originally obtained from Aquatic
BioSystems (Fort Collins, CO, USA) (p. 2-1). Maintenance and husbandry of the test
organisms were not described in the report, although the authors did indicate that they
conducted monthly tests with a reference toxicant (NaCl) to confirm that the organisms were
in good condition (p. 2-1).
Thank you for your comment. EPA confirmed with
the authors that C. dubia were cultured in-house on
brood boards according to standard methodology
(USEPA 2002). Cultures underwent 100% water
renewals (moderately hard reconstituted water) five
times per week, were fed daily, and reproduction was
tracked to ensure health acceptability for testing.
Reviewer 2
Not particularly. This section was remarkably brief and lacking details of animal performance
for the reference toxicant tests. The reporting of volumes of algal suspensions used for
feeding are not useful unless cell densities are reported.
Thank you for your comment. EPA confirmed with
the authors that C. dubia were cultured in-house
according to standard methodology (USEPA 2002).
Each test chamber was fed 0.3 mL of an algal
(Pseudokirchneriella subcapitata) and yeast/trout
chow/cereal leaf (YTC) suspension (1:1) at test
initiation (prior to test organism introduction) and
once daily prior to water renewal. The algal density
used in the food suspension was 3 × 10^7 cells/mL.
Reviewer 3
Organisms were originally from Aquatic Biosystems and cultured at OSU for more than 10
years. Organisms were cultured in moderately hard water. Other environmental conditions
and maintenance procedures were not described, such as temperature, photoperiod (light: dark
hours), food, feeding rates, biomass/water volume, water change, etc.
Thank you for your comment. EPA confirmed with
the authors that C. dubia were cultured in-house
according to standard methodology (USEPA 2002).
By following this protocol all maintenance and
husbandry conditions are deemed to be appropriate.
Reviewer 4
Comment:
No, an adequate description of the source, maintenance, and husbandry of the C. daphnia test
organism was not provided.
Rationale:
In the report, section 2.3.2 SOURCE, the authors state that the <24 hour old neonates were
obtained from in-house cultures which have been maintained successfully at the Aquatic
Toxicology laboratory at Oregon State University (Corvallis) for >10 years. In Appendix A,
section 2.2 and 2.3, feeding diet and feeding regimen during toxicity testing were described.
However, nowhere that I could find in the report was it explicitly stated that the test
organisms were cultured and maintained under these same conditions. I believe this is an
oversight in reporting, not a failure of procedure, and this oversight can be readily remedied
by the authors by providing the missing information. Husbandry of the test organisms during
culture and testing as described appeared to be adequate.
Thank you for your comment. EPA confirmed with
the authors that C. dubia were cultured in-house
according to standard methodology (USEPA 2002).
By following this protocol all maintenance and
husbandry conditions are deemed to be appropriate.
Reviewer 5
The description of the test animals was adequately presented in the report. Reference toxicant
testing was regularly performed as part of the quality assurance program.
Thank you for your comment.
2.4 Charge Question 4
4. Were the control's survival rates acceptable?
Reviewer
Comments
Response to Comments
Reviewer 1
Yes. The authors report that in all tests, control acceptability criteria (> 80 % survival and >
60% surviving females having 15 or more neonates) were met (p. 3-14). These fulfill the
criteria for test acceptability outlined in EPA-821-R-02-013.
Thank you for your comment.
Reviewer 2
The average number of neonates/female in controls ranged from 22 to 37 with 42.5 reported
from a "concurrent control". The test with the poor control reproductive output (Al 1199
CDC) should not be used.
According to USEPA 2002, "In Ceriodaphnia dubia
controls, 60% or more of the surviving females must
have produced their third brood in 7 ± 1 days, and the
number of young per surviving female must be 15 or
greater." Since the control group in Al 1199 CDC
met these conditions, EPA disagrees that it should
not be used.
Reviewer 3
The survival of the control organisms of each test was 100%. This meets the test acceptability
criteria of the test method (80-100%).
Thank you for your comment.
Reviewer 4
Response:
Yes, it appears that the survival rate of C. dubia used in the control (no aluminum) treatments
met the accepted survival rate for this type of toxicity testing.
Rationale:
The standard methodology as developed by the US EPA (1982) calls for at least 80% survival
of the control test organisms for the test to be considered valid. On p. 29, paragraph 2, the
authors state that, in all tests, control acceptability criteria (> 80 % survival and > 60%
surviving females having 15 or more neonates) were met. Table 3-12 (p. 30 of report) and
Appendix D Raw Data both indicate that control survival was uniformly 100%, clearly
meeting the EPA (2002) control standard for test acceptability.
Thank you for your comment.
Reviewer 5
Control survival rates were acceptable.
Thank you for your comment.
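The control acceptability criteria discussed in this section (at least 80% control survival, at least 60% of surviving females producing a third brood, and 15 or more young per surviving female) can be expressed as a simple check. The following Python sketch is illustrative only: the `ControlReplicate` record and `control_is_acceptable` function are hypothetical names, the example data are invented, and the 7 ± 1 day timing condition is assumed to be handled by counting only broods produced within the test period.

```python
from dataclasses import dataclass


@dataclass
class ControlReplicate:
    survived: bool   # female alive at test termination
    broods: int      # broods produced within the test period
    neonates: int    # total young produced


def control_is_acceptable(replicates):
    """Apply the three control criteria quoted from USEPA (2002) in this section."""
    survivors = [r for r in replicates if r.survived]
    if not replicates or not survivors:
        return False
    survival = len(survivors) / len(replicates)
    third_brood = sum(r.broods >= 3 for r in survivors) / len(survivors)
    mean_young = sum(r.neonates for r in survivors) / len(survivors)
    return survival >= 0.80 and third_brood >= 0.60 and mean_young >= 15


# Invented example: ten control females, all surviving, three broods each.
example_control = [ControlReplicate(True, 3, 22) for _ in range(10)]
print(control_is_acceptable(example_control))
```

A control averaging 22 neonates per surviving female, the low end of the range Reviewer 2 cites, passes this check provided the survival and third-brood conditions also hold.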
2.5 Charge Question 5
5. Were test organisms appropriately acclimated for the type of test and test water conditions to represent their chronic sensitivity
under those conditions?
Reviewer
Comments
Response to Comments
Reviewer 1
Yes, as far as hardness is concerned. Organisms cultured under standard conditions (100
mg/L as CaCO3) were used in the moderately hard water tests (120 mg/L as CaCO3).
Organisms were acclimated to the soft (60 mg/L as CaCO3) and hard water (250 and 400
mg/L as CaCO3) conditions for multiple generations (i.e., over two months), and survival and
reproduction were reported to be excellent (p. 2-2). As far as indicated in the report, there was
no acclimation for different pH (tested range: 6.3 - 8.8; standard culture at 7.8-8.0) or DOC
(tested range: 1-14 mg/L; standard culture unknown) conditions.
Thank you for your comment. EPA confirmed with
the authors that C. dubia cultures were not
acclimated to pH or DOC test conditions. However,
all control exposures met the data quality criteria
according to USEPA (2002). Additionally, OSU lab
data quality conditions (Appendix A Section 4.9)
were also met in all tests.
Reviewer 2
The report only mentions acclimation of cultures to different hardness levels, but not pH and
DOC or buffers.
Thank you for your comment. EPA confirmed with
the authors that C. dubia cultures were not
acclimated to pH or DOC test conditions. However,
all control exposures met the data quality criteria
according to USEPA (2002). Additionally, OSU lab
data quality conditions (Appendix A Section 4.9)
were also met in all tests.
Reviewer 3
Yes, the acclimation of the organisms to the hardness of test waters (250 and 400 mg/L as
CaCO3) for multiple generations and over more than 2 months should be adequate.
Thank you for your comment.
Reviewer 4
Comment:
It would appear that the C. dubia used in these toxicity tests were appropriately acclimated
for the stated test type and described test water conditions at the time the chronic toxicity
testing was performed.
Rationale:
The C. dubia used for the present study were reported (Section 2.3.4 ACCLIMATION p. 2-
2;) as being cultured at the Ohio State University AquaTox laboratory, in a "moderately hard"
reconstituted water that was prepared as detailed in standard USEPA methods (USEPA
2002). This diluent was reported to have a measured hardness of 100 mg/L as CaCO3 and pH
of 7.8 - 8.0, p. 2-2). All acclimated cultures for all of the toxicity tests were successfully
maintained in their respective laboratory water for multiple generations (2+ months).
Organism survival and reproduction were reported as excellent and organism health was
maintained over the period of acclimation. Note: In section 2.3.4, ACCLIMATION is
erroneously labeled, in section 2.3.2 SOURCE, as section 2.4.3).
Thank you for your comment.
Reviewer 5
I was quite impressed with the acclimation process used in this study. In many instances,
researchers do not go to the length of details used for the acclimation protocol performed in
this study. The researches should be commended on this practice.
Thank you for your comment.
2.6 Charge Question 6
6. Were test endpoints and data acceptability criteria well defined and explained?
Reviewer
Comments
Response to Comments
Reviewer 1
Yes. Test endpoints included NOEC and LOEC for survival and reproduction (if data met
assumptions of normality and homogeneity), as well as effect concentrations (i.e.,
LC10/LC20/LC50 for survival and EC10/EC20/EC50 for reproduction). The authors
mentioned that any concentrations for which significant survival effects occurred were not
included in the analysis of reproductive effects. Acceptability criteria for temperature (25 +/-
2°C) and dissolved oxygen (>60%) were indicated (p. 3-1) and met. The authors documented
the range of measured pH and DOC measurements (p. 3-1), but did not indicate what was
considered an acceptable range (Note: there are no acceptability criteria defined in EPA
guidance EPA-821-R-02-013 for these parameters). The authors report that Al concentrations
among all quality control samples were within acceptability criteria of 85-115%, whereas the
standard addition recoveries were within acceptability criteria of 116-102% with a few
exceptions (n=7) (p. 3-4).
Thank you for your comment.
Reviewer 2
Data acceptability criteria were not explicitly discussed but the software packages used to
assess data have built in tests for homogeneity of variance, etc. Control performance should
be explicitly discussed however.
Thank you for your comment. While control
performance is not discussed, all control information
(i.e., survival, reproduction) is reported.
Reviewer 3
Determination of NOEC, LOEC, LCs, and ECs were described in the statistical analysis
section. However, a separate section to define the measured endpoints of the test is
recommended.
Section 2.10.2 does state that live and dead counts
(i.e., survival), and the number of young (i.e.,
reproduction) was counted daily.
Reviewer 4
Comment:
Test endpoints were sufficiently defined and explained. Data acceptability criteria were not
well defined and explained.
Rationale:
Although rather brief, the authors state under section 2.10.2 BIOLOGICAL MONITORING
p. 2-5 that observations of live and dead organisms were conducted on a daily basis from
initiation to termination, and that the numbers of young were counted daily. This is sufficient
to understand the test endpoints used, but it would be useful to know under what conditions
the organisms were observed (light table? microscope? visual inspection only? time of day?)
and how the test organisms were determined to be either dead or alive. Data acceptability
criteria for this project were not offered. Most uses of data acceptance criteria involve some
type of comparison among the data groups to determine if variability falls within a
predetermined acceptable range but the predetermined acceptable range for normality and
homogeneity for these tests were not stated by the authors. The only data acceptability
evaluation offered was that if the data met the assumptions of normality and homogeneity, the
NOEC and LOEC were estimated using an analysis of variance to compare (p. 2-6, the
authors use "p = 0.05" as the threshold for accepting a significant effect but the correct
variable here would be "α = 0.05"). There was no explanation offered on how the data were
handled when the data did not meet assumptions of normality and homogeneity. If all data
met those assumptions it should be stated in the report.
Appendix A, Section 4.9 of the report does discuss
data acceptability. Data analysis followed the
statistical decision tree/flow chart according to
methodology described in USEPA 2002 and is
detailed in Appendix D of the study report.
Reviewer 5
The test endpoints and data acceptability criteria were well defined and explained in the text.
I would like the authors to further evaluate the pH 6.3, hardness 60, DOC 2 treatment as to
the appropriateness of the results. The 529 Al treatment had slightly better reproduction
average than the next lower concentration (264.5 Al treatment). While I know that this
sometimes happens, the control through the 529 Al treatment (represents 5 of the treatments)
ranged in reproduction from 32.6 to 26.0 neonates (Table 3-12, page 3-15). This represents a
wide range of treatment concentrations, with minimal change in neonate average production.
I couldn't further evaluate whether there was something in this test that might explain this
effect? All other tests looked adequate and were well defined and explained.
The concentration-response data for test Al 1185
CDC do not appear to be abnormal. It is not
uncommon for the measured endpoint (in this case,
the average number of neonates) to vary among the
higher test concentrations. Furthermore, this test was
replicated based on previous work (Gensemer et al.
2018), and a t-test found no significant difference
between the reported EC20s (see Section 3.3 of the
report). Additionally, while the reported NOEC-LOEC
of the test was 264.5-529.0 µg/L total aluminum with
very similar average reproduction (25.8 vs. 26.0), the
reported EC20 for the test was 828.6 µg/L. This
demonstrates that while the 529 Al treatment may
be significant as the LOEC, the 20% reduction in
reproduction is modeled to occur at a higher
aluminum concentration.
2.7 Charge Question 7
7. Was preparation of test solutions fully described and target test concentrations verified prior to testing?
Reviewer
Comments
Response to Comments
Reviewer 1
Yes. Preparation of the test solutions is described in detail at the top of p. 2-3. Analytical
samples from each treatment were collected for total Al and dissolved Al (<45 µm) analysis
from newly prepared waters (after the 3-hr equilibrium period) at test initiation, during the
tests, and from a composite of replicates at test termination (p. 2-5). Total Al concentrations
prior to addition to test chambers were between 93 and 115% of nominal spiked
concentrations, with four measurements outside of this range (with measurements of 75, 117,
120, and 130% of nominal). Total Al concentrations in test solutions measured in the
replicate chambers at the end of the tests were more variable and the authors explained that it
was more difficult to obtain homogeneous samples from the chambers and that these
measurements were therefore less reliable (p. 3-4). In addition, dissolved Al concentrations
were found to be highly variable, ranging from 0.1 to 111% of total Al. The authors explained
that this was expected because the majority of solutions were well above solubility limits.
There was some variability in the background levels of Al in the control water, presumably
due to differences in natural organic matter.
Thank you for your comment.
Reviewer 2
Test solutions that were aged 3 hours were taken on day 0 for both total and dissolved Al
concentrations. All tests except Al 1185 CDC also had test solutions measured on days 3 and
6. The Al 1185 tests did not have a day 3 sample reported.
Thank you for your comment. EPA agrees that this
is the only test where aluminum concentrations were
not measured on Day 3.
Reviewer 3
Yes, the preparation of the test solutions was fully described. The measured total Al were
closed to the nominal concentrations. Usually stock concentrations are verified prior to use.
However, it was not mentioned in the report.
Thank you for your comment. EPA confirmed with
the authors that stock solutions were not measured.
However, concentrations were measured in the test
chambers at appropriate intervals to verify
appropriate dosing.
Reviewer 4
Comment:
Yes, the methods of test solution preparation were fully described. The target test
concentrations (both of the treatment chemical, aluminum, and the evaluated water quality
variables) appears to have been extensively tested and verified during the study but there is
no indication that this occurred prior to the study.
Rationale:
It appears that great attention was paid to chemical analyses in this project. The report
provides an extensive description of the analytical methodology used, including composition
of sampling containers, commercial source, preparation, and storage of test substance
(p. 1-2), preparation and distribution of test concentrations (p. 2-1), method of pH control (p. 2-3),
timing of collection, treatment and holding time of samples after collection, calibration of
analytical instrumentation, use of blanks (p. 2-5), chain of custody documentation for samples
analyzed, and data handling and storage of results. Analytical samples for each treatment
were obtained from the newly prepared and equilibrated (3 hrs) test concentration prior to the
start of the test but there is no indication that concentrations were verified before testing.
Samples were taken for chemical analysis just prior to introduction of test organisms to the
test chambers. According to Section 2.11 ANALYTICAL CONFIRMATION, samples were
analyzed for total and dissolved Al (defined as sample water that has passed through a 0.45 µm
filter) using a Spectro Arcos ICP-OES according to US EPA Method 200.7, with quality
control samples and spiked samples to determine % recovery. Appendix A (Protocol)
indicates that this was a standard procedure for metal analysis to determine Al concentrations
using an Inductively Coupled Plasma with either Optical Emission Spectrometry or Mass
Spectrometry (p. 7). The raw data for these analyses are provided in APPENDIX B - Metals
Analytical Data and comprise the majority of the 405 pages of the appendices. Spiked
samples were used to determine accuracy of analyses by calculating metal recovery and were
shown to be within acceptable analytical limits.
Thank you for your comment.
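The spike-recovery check described in this comment can be illustrated with a minimal sketch. The function names, the example concentrations, and the 85-115% acceptance window are assumptions made for illustration only; the report's actual acceptance limits are not restated here.

```python
def percent_recovery(spiked_result, unspiked_result, spike_added):
    """Return spike recovery (%) from measured concentrations in the same units."""
    if spike_added <= 0:
        raise ValueError("spike_added must be positive")
    return 100.0 * (spiked_result - unspiked_result) / spike_added


def within_limits(recovery, low=85.0, high=115.0):
    """Compare a recovery to an acceptance window (hypothetical default limits)."""
    return low <= recovery <= high


# Hypothetical example: a 500 ug/L Al spike added to a sample measuring 1000 ug/L.
recovery = percent_recovery(spiked_result=1480.0, unspiked_result=1000.0, spike_added=500.0)
print(round(recovery, 1), within_limits(recovery))
```

A recovery near 100% suggests the analytical method is neither losing nor overstating the analyte in that sample matrix.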

Reviewer 5
The test solutions were well described and were sufficiently verified prior to testing.
Thank you for your comment.

-------
2.8 Charge Question 8
8. Were manipulated test water quality variables (e.g., pH, DOC, water hardness) measured with sufficient frequency and accuracy to
represent intended levels?
Reviewer
Comments
Response to Comments
Reviewer 1
Yes. Temperature, pH, conductivity, and dissolved oxygen (DO) were measured in each
concentration at test initiation, once daily, and at test termination. Hardness, alkalinity,
ammonia, and total residual chlorine (TRC) were measured in the control water of each test at
test initiation (p. 2-4). Other parameters (i.e., Calcium, magnesium, sodium, potassium,
chloride, sulfate, cations, anions, and DOC) were measured by outside labs using accepted
methods, but it is not entirely clear from the report how often these measurements were done.
Thank you for your comment. EPA confirmed with
the authors that analytes and DOC were measured in
the dilution water at test initiation and are reported
in Section 3.1 and Appendix C.
Reviewer 2
Temperature, pH, conductivity and DO were measured daily. Details of the frequency of
verification for DOC concentrations were not found.
According to Appendix A, Section 4.5 and verified
by the authors to EPA:
1.	Hardness, alkalinity, total ammonia, and total
residual chlorine were measured in the dilution
water control at test initiation.
2.	A sample of each control/dilution water (prior to
addition of buffer or pH adjustment) was sent to an
outside analytical laboratory for analysis of calcium,
magnesium, sodium, potassium, chloride, sulfate,
and dissolved organic carbon at test initiation.
3.	Dissolved oxygen, temperature, conductivity, and
pH were measured and recorded daily in the new
waters of each treatment. Dissolved oxygen,
temperature, and pH were measured daily in the old
waters of each treatment.
Reviewer 3
The procedure for controlling test water quality, such as pH was clearly described. It was
conducted carefully. Measurement of pH, DO, conductivity, and temperature were sufficient.
The measured values represent the target values. However, hardness and alkalinity were
measured only in the control water of each test at test initiation. This is weak rather than
sufficient. These parameters are usually measured at least in control, the lowest and highest
treatment concentrations at test initiation and termination to make sure the addition of
toxicant into the test treatments does not change the water quality of the test water.
Thank you for your comment. EPA determined that
measured hardness and alkalinity would not be
expected to vary greatly during a test exposure and
thus measurement only at the beginning of this test
would be sufficient.
Reviewer 4
Comment:
Yes, it appears that the manipulated test water quality variables (pH, hardness, and DOC;
incorrectly called parameters in the report) were measured with sufficient frequency and
accuracy to represent intended levels and allow incorporation into an updated predictive
model of aluminum toxicity under varying water quality conditions.
Rationale:
Under Section 2.10 TEST MONITORING, subsection 2.10.1 WATER QUALITY the
authors indicate that pH, hardness, and dissolved organic carbon (DOC) were measured
during toxicity testing. pH was measured in each concentration at test initiation, once daily,
and at test termination using a HACH HQ30d pH meter. Water hardness was measured in the
control water of each test at test initiation using a colorimetric titration method following
Standard Methods 2340B/C (APHA 2012). DOC was measured by an outside laboratory
(Oregon State University Cooperative Chemical Analytical Laboratory, Corvallis, OR, USA)
using a Shimadzu TOC-VCNS total organic carbon analyzer (Shimadzu Scientific
Instruments, Columbia, Maryland) following a combustion method (Standard Methods
5310B; APHA 2012). All of the analytical instrumentation used are of sufficient quality to
provide accurate, reproducible data results. Both water hardness and DOC would not be
expected to vary greatly during a test exposure and thus measurement only at the beginning
of the test would be sufficient. The mean and raw values for the data from these analyses are
presented in Tables 3-1 and 3-1 in the report, and the Appendices C and D, respectively.
Thank you for your comment.
Reviewer 5
Water quality variables were adequately manipulated. I believe that the use of the buffers as
well as CO2 headspace was warranted for keeping these tight conditions with regards to the
challenging pH parameter.
Thank you for your comment.

-------
2.9 Charge Question 9
9. Was the frequency and accuracy of chemical concentrations measured in test solutions sufficient to represent intended exposure
levels throughout the duration of the test(s)?
Reviewer
Comments
Response to Comments
Reviewer 1
Yes. Al concentrations were measured at test initiation and once during each test, and from a
composite of replicates at test termination. Samples were analyzed for total and dissolved
(<45 µm) Al using standard US EPA methods. Blanks and quality control samples were also
run (p. 2-5).
Thank you for your comment.
Reviewer 2
Generally, yes for total Al concentrations. Test Al 1199 CDC reported considerable variation
in total Al concentrations among days for a given nominal concentration. Dissolved Al
concentrations were all over the map and incredibly inconsistent.
Thank you for your comment. EPA notes that total
aluminum concentrations will be used for
determining toxicity effect concentrations, not
dissolved concentrations.
Reviewer 3
Total and dissolved Al were measured in new and old waters at test initiation and termination
and during the test period. This is sufficient. In addition, the measured concentrations of total
Al were close to the nominal concentrations, presenting an accuracy of preparation and
measurement of the test solutions. However, the measured dissolved Al concentrations were
far away from the total concentrations. This weakens the confidence of this study.
Thank you for your comment. EPA notes that total
aluminum concentrations will be used for
determining toxicity effect concentrations, not
dissolved concentrations. The results from this study
are similar to tests from other laboratories.
Reviewer 4
Comment:
The frequency and accuracy of chemical concentrations of the non-manipulated water quality
variables measured in test solutions appeared to be sufficient to represent intended exposure
levels throughout the duration of the tests.
Rationale:
Temperature, conductivity, and dissolved oxygen (DO) were measured in each concentration
at test initiation, once daily from one of the test chambers at each concentration of aluminum,
and at test termination. This frequency is standard protocol for water quality variables that
may exhibit some variation in concentration over the duration of a test exposure. They were
also measured in the renewal water prior to changing out the adult daphnids. These were
reported to be calibrated prior to starting a measurement in Appendix A Protocol following
Oregon State University Aquatic Toxicology Laboratory Standard Operating Procedures.
These were measured using calibrated digital instrumentation as described in Section 2.4
DILUTION WATERS and reported in Table 2-1. Alkalinity, ammonia, and total residual
chlorine (TRC) were measured in the control water of each test at test initiation using digital
meters. Temperature was measured with a standard laboratory thermometer. Test solution pH
was measured using a HACH (Loveland, CO, USA) HQ30d pH meter. These methods of
measurement usually provide highly accurate and reproducible results sufficient to ensure
determination of intended exposure levels.
Thank you for your comment.

Reviewer 5
I believe that the frequency and accuracy of the chemical concentrations were sufficiently
performed through the duration of the test, (see next charge question for additional input to
this charge question).
Thank you for your comment.

-------
2.10 Charge Question 10
10. Were any anomalies in the test explained or justified with additional information or testing?
Reviewer
Comments
Response to Comments
Reviewer 1
Yes. The only anomalies were variability in the total Al concentrations measured in the
chambers at the end of the test and in dissolved Al measurements. The authors explained
these results (see answer to question 7). There was one test in which significant effects on
reproduction occurred, and the authors addressed this by omitting the affected test
concentrations from the reproductive effects analysis.
Thank you for your comment. Since the highest
tested concentration in Test Al 1196R CDC
exhibited a significant effect on survival, EPA
(2002) recommends not including the concentration
in the analysis of reproduction effects. The authors
did not use this test in the reproductive effects
analysis.
Reviewer 2
No. Anomalies (see control reproduction in Al 1199 CDC) were not explained or justified
with additional testing.
The control group in Al 1199 CDC met data
acceptability conditions outlined in USEPA 2002.
EPA disagrees that it is an anomaly.
Reviewer 3
Not really, except for the procedure for controlling the pH of the test waters.
Thank you for your comment. Additional details
about testing were provided in Appendix A.
Reviewer 4
Comment:
The relatively few anomalous data were explained/justified without the need for additional
data or testing.
Rationale:
In Section 3. RESULTS AND CONCLUSIONS, subsection 3.1 TEST CONDITIONS the
authors observed some variability in measured DOC. This has been observed in their testing
laboratory previously and they believe it is due to using multiple batches of Suwannee Natural
Organic Matter (NOM) which shows some variation in % DOC among batches. They also
acknowledge that observed differences may be due to variability in analytical measurements.
Because the DOC concentrations are reported as measured and not nominal, they should be
acceptable for this project's goals of incorporation and expansion into the previously
established predictive model. pH was maintained within 0.2 SU of the target pH in freshly
prepared ("new") solutions after the equilibrium period. However, in some studies, an
increase in pH occurred in the "old" waters (pH up to 0.3 - 0.4 SU above the "new" waters)
between each 24-hr water renewal. Both the use of the buffer to control pH, and also slightly
adjusting the CO2 atmosphere, limited observed pH drift within limits that allowed
incorporation of mean pH values into the predictive model. Mean conductivity values
remained consistent over the 24-hr period between water renewals. But in certain cases the
range in conductivity was wide, primarily in the higher DOC tests (Table 3-2, p. 3-2). This is
likely due to the higher DOC and cannot be eliminated as a (slightly) confounding factor. The
authors also speculate that some increase in conductivity in the "old" water may be due to
addition of food to the test chambers. The authors observed some variability in total Al
recovery from "old" solutions and suggest this was primarily due to the difficulty in
removing the entire homogenized aliquot because it has been altered during final enumeration
of neonates by removing the organisms during counting (to prevent double counting). They
believe this may have resulted in the accidental removal of precipitates from the
non-homogeneous solution, potentially resulting in a misrepresentation of the entire fraction in the
test chamber. Therefore, they feel that the "new" solutions are the most appropriate
measurements for average exposure determination of Al. When comparing total Al to
dissolved Al in the same sample, dissolved Al was much more variable than total Al, ranging
from 0.1 to 111% of total Al. The authors expected this as the majority of solutions were
well above solubility limits. The observed trend in dissolved concentrations was that higher
percentages of dissolved/total were apparent in the lower exposure concentrations and
percentages decreased as total Al increased. A few dissolved Al measurements were elevated
and unexpected (and did not correspond to total dissolved Al samples from the identical
concentration). The authors feel this is most likely associated with breaching of the 0.45 µm
filter by insoluble Al clogging the filter and requiring additional pressure on the filter to
obtain sufficient sample volume. The authors addressed this by keeping pressure on the filter
at a minimum. Because (unlike most metals) the dissolved/free ion species of Al has
relatively less effect on toxicity than the Al hydroxide species at circumneutral pH (6-8), and
Al concentration-toxicity relationships correspond to total Al (Cardwell et al., 2017), total Al
was incorporated into the predictive model.
Thank you for your comment.
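The dissolved-to-total comparison summarized in this comment (dissolved Al ranging from 0.1 to 111% of total Al) reduces to a simple ratio. The sketch below is illustrative only: the function names and sample values are hypothetical, and flagging anything above 100% as a possible filter breach is an assumption drawn from the explanation quoted above, not a rule stated in the report.

```python
def dissolved_percent_of_total(dissolved_ug_l, total_ug_l):
    """Return dissolved Al as a percentage of total Al in the same sample."""
    if total_ug_l <= 0:
        raise ValueError("total_ug_l must be positive")
    return 100.0 * dissolved_ug_l / total_ug_l


def flag_sample(dissolved_ug_l, total_ug_l):
    """Flag samples whose dissolved result exceeds the total (hypothetical rule)."""
    pct = dissolved_percent_of_total(dissolved_ug_l, total_ug_l)
    return "check filtration" if pct > 100.0 else "ok"


# Hypothetical (dissolved, total) pairs in ug/L.
for dissolved, total in [(45.0, 10000.0), (111.0, 100.0)]:
    pct = dissolved_percent_of_total(dissolved, total)
    print(f"{pct:.2f}% of total: {flag_sample(dissolved, total)}")
```

A dissolved result cannot physically exceed the total in the same sample, so percentages above 100% point to a sampling or filtration artifact rather than to water chemistry.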

Reviewer 5
I believe that the anomalies observed during testing were well explained and the justification
was sufficiently presented and plausible (page 3-4). However, these anomalies can be
classified as deviations from protocol. I think this report would benefit from a section in the
report presenting these identified anomalies and also the researchers should attempt to assess
whether these anomalies potentially bias the results high, low, or neutral. I think that this
section will help strengthen the report and further demonstrate a transparent process.
Thank you for your comment. Section 3.5 "Protocol
Deviations and Amendments" provides a statement
that the authors noted that no protocol deviations
occurred during the toxicity tests which would affect
the study outcomes.

-------
2.11 Charge Question 11
11. Do the reported test results meet or exceed expectations for use in model development for the derivation of ambient water quality
criteria for the protection of aquatic life?
Reviewer
Comments
Response to Comments
Reviewer 1
As far as I can tell. The authors followed standard US EPA guidance for conducting chronic
toxicity tests with Ceriodaphnia dubia with some modifications to account for specific water
types and to achieve effective pH control. The general US EPA criteria for test design and
test acceptability were met, and the authors applied principles consistent with Good
Laboratory Practice (GLP). Although documentation on culture maintenance and husbandry
were not included in the report, the fact that the laboratory has been culturing this species
successfully for over a decade and that control organisms showed acceptable performance,
give little cause for concern related to maintenance and husbandry.
Thank you for your comment. EPA confirmed with
the authors that C. dubia were cultured in-house on
brood boards according to standard methodology
(USEPA 2002). Cultures underwent 100% water
renewals (moderately hard reconstituted water) five
times per week, were fed daily, and reproduction was
tracked to ensure health acceptability for testing.
Reviewer 2
Without seeing the entire package of how water chemistry parameters are going to be used to
model both dissolved and particulate/precipitate concentrations and link these to toxicity, it is
impossible to answer this question. The use of total recoverable Al as a descriptor for toxicity
seems to run counter to BLM principles. Without direct evidence and mechanistic
understanding of how Al precipitates are toxic to daphnids, it is going to be very difficult to
convince people that the dissolved concentrations reported in these tests can be predictive of
toxicity.
EPA agrees that dissolved aluminum concentrations
are not appropriate for use in criteria derivation.
EPA notes that total aluminum for toxicity test effect
concentrations will be used in model development.
The use of total recoverable aluminum does not run
counter to BLM principles; in fact, the aluminum
BLM also uses total aluminum concentrations
(Santore et al. 2018).
Reviewer 3
This study covered a wide range of water quality parameters that are suitable for BLM
development and calibration. Reproductive results showed concentration-response
relationships that are useful for determination of effect concentrations based on total
concentration basis but not for dissolved concentration basis.
Thank you for your comment.
Reviewer 4
Comment:
The reported test results do meet or exceed expectations for use in model development for the
derivation of ambient water quality criteria for the protection of aquatic life.
Rationale:
This study appears to have been carefully planned and executed and seems to compare well
with the results of other similar studies and laboratories. For instance, the authors compared
their EC10/EC20 (with 95% confidence interval) results with Gensemer et al. (2018) using a
one-sample paired-comparison t-test and found that the values were not statistically different
between laboratories. The authors also endeavored to make the study results appropriate for
inclusion in previously developed models. For example, the Biotic Ligand Model (BLM) uses
Ca and Mg (in mg/L) as input variables to calculate hardness values and the multiple linear
regression (MLR) for the Al toxicity prediction model on which the Water Quality Criterion
is based uses hardness (as mg/L CaCO3). The calculated hardness values in Table 3-1 were
used in the MLR analysis to maintain consistency between model input values derived from
other studies. The results of this study are directly applicable to the EPA-developed WQC
because that value is derived using an MLR model based on a site's pH, DOC, and hardness
(EPA 2017). These water quality variables are precisely those evaluated by manipulation in
this study and thus the datasets can be included as part of the model refinement effort.
Thank you for your comment.
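The conversion from Ca and Mg concentrations to hardness mentioned in this comment can be sketched as follows. The factors 2.497 and 4.118 are the conventional mass-ratio multipliers for expressing Ca and Mg as CaCO3; that the report's Table 3-1 values were computed with exactly these factors is an assumption, and the example concentrations are hypothetical.

```python
# Conventional conversion factors (mg CaCO3 per mg of metal); assumed, not
# confirmed, to match the calculation behind the report's hardness values.
CA_TO_CACO3 = 2.497
MG_TO_CACO3 = 4.118


def hardness_as_caco3(ca_mg_l, mg_mg_l):
    """Return total hardness (mg/L as CaCO3) from Ca and Mg in mg/L."""
    if ca_mg_l < 0 or mg_mg_l < 0:
        raise ValueError("concentrations must be non-negative")
    return CA_TO_CACO3 * ca_mg_l + MG_TO_CACO3 * mg_mg_l


# Hypothetical dilution water: 30 mg/L Ca and 10 mg/L Mg.
print(round(hardness_as_caco3(30.0, 10.0), 1))
```

Deriving hardness from the same Ca and Mg values supplied to the BLM keeps the MLR and BLM inputs consistent for a given test water.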

Reviewer 5
I believe that these test results will strengthen the aluminum water quality criteria, however, I
am not sure the results were meant to meet all of this charge question the way it was
described. I am confident that these results will be very useful to the application of the BLM
model and MLR model, however, the results presented in the report do not provide the details
to make this assessment.
Thank you for your comment.

-------
2.12 Charge Question 12
12. Is there any reason to be concerned with the use of the test results in the criteria derivation process?
Reviewer
Comments
Response to Comments
Reviewer 1
Three of the tests had very steep concentration-response relationships and were flagged by
the TRAP model as being useful for exploratory analysis only due to an inadequate number
of partial effects. It is difficult to judge what the effect of including these test results in the
Biotic Ligand Model and Multiple Linear Regression Model would be. Certainly the models
could be run with and without these data and a judgement made as to whether their precision
was sufficient for inclusion in the model refinements.
According to the authors, "As shown in Table 3-12
and Appendix D, modeling of reproduction data
resulted in qualifiers identified by the TRAP model,
in addition to undeterminable confidence intervals.
These were identified due to the lack of partial
effects concentrations in the datasets and associated
steep slopes between the concentration with no
effect (NOEC) and the concentration with a
reproductive effect (LOEC). According to available
TRAP guidance, datasets identified as "exploratory"
should be examined on a case-by-case basis to assess
the confidence around the result based upon the
exposure-response relationship. As the three tests
did show a quite significant reproductive effect at
the highest concentration, it is believed that these
datasets can be included as part of the model
refinement effort." EPA concurs with this
assessment.
Reviewer 2
The complexity of Al chemistry makes this very challenging. We do not appear to be closer
to understanding the effects of dissolved Al and its speciation on C. dubia as a result of these
studies, because the dissolved concentrations are not tractable due to precipitation issues. The
uncertainty of the kinetics of precipitate formation and the effects of those precipitates on
different forms of aquatic life bring a large amount of uncertainty into the equation. How
does a 3 hour equilibration period in the laboratory (with high buffer concentrations) translate
to animal exposures in nature? It is interesting that EPA is willing to consider Al solid phases
in toxicity characterization, but generally refuses to consider the effects of dietary exposures
of metals - which are known to cause deleterious effects in aquatic life.
Thus, there appears to be considerable uncertainty with respect to both dissolved and
particulate Al forms. It would appear that both dissolved criteria based on BLB type
principles and particulate criteria would be needed - or that a considerably large uncertainty
factor would be applied to a total Al measurement.
EPA used total aluminum for toxicity test effect
concentrations to account for these precipitated
forms and their potential presence in the
environment. Total aluminum effects are used in
criteria derivation; dissolved aluminum
concentrations are not appropriate for use. The
aluminum BLM also uses total aluminum (Santore et
al. 2018). EPA assumes organisms are exposed to
both dissolved and particulate aluminum in the
treatment concentrations and in the environment.
Reviewer 3
The concern about this study is the measured dissolved Al concentrations. Dissolved Al
concentrations were totally off the total concentrations, especially at high concentrations. A
few examples are the measured dissolved concentrations were below the detection limit or 7
or 45 ug/L at the total Al concentrations of 5000 and 10000 ug/L (Table 3-6, Test Al
1205CDC), or 80-217 ug/L at the total Al concentrations of 300-12000 ug/L (Table 3-8, Test
Al 1198CDC). Dissolved metal concentration has been used for evaluating metal
bioavailability, especially using the BLM approach. That said, I don't know how the
BLM can be applied to the dissolved concentration data set in this report.
EPA agrees that dissolved aluminum concentrations
are not appropriate for use in aluminum criteria
derivation and will use total aluminum for toxicity
test effect concentrations. Dissolved aluminum
concentrations have not been used for evaluating
bioavailability. The aluminum BLM also uses total
aluminum (Santore et al. 2018).
In numerous studies where both dissolved and total
concentrations were reported, the relationship
between total and dissolved aluminum varies. When
the total aluminum concentrations increase, the
dissolved aluminum concentrations do not increase
as expected. The total aluminum concentration is
used because it includes dissolved and particulate
aluminum and the dissolved aluminum fraction
varies and is not reliable.
Reviewer 4
Comment:
I do not believe there is any significant reason to be concerned with using the test results from
this report in the water quality criterion derivation process.
Rationale:
The main goal of this project was to increase understanding of the bioavailability and toxicity
of Al to aquatic organisms. To reach this goal, the main objectives of this project were 1) to
quantify the effects of water quality on Al toxicity and 2) to use the results to develop a
bioavailability-based model to predict Al toxicity across a wider range of certain water
quality variables (specifically pH, hardness, and dissolved organic carbon). I believe this
study has achieved these objectives and has increased the applicable range of previous
predictive models used to derive an Al WQC. The expansion included increasing pH from
8.10 up to 8.70, hardness (as CaCO3) up to 428 mg/L from 123 mg/L, and dissolved organic
carbon from 4.0 mg/L up to 12.30 mg/L. Comparison of the current model predicted effect
concentrations with observed effect concentrations, for water types outside the previous range
of model development, suggests very good predictive capabilities of this new model
(Table 3-13) and thus may be confidently used in the water quality criterion derivation process.
In terms of future Al toxicity testing with the goal of developing a new WQC, I would like to
see the following suggestions to be considered:
1)	Al toxicity tests performed with sodium aluminum sulfate (probably as
NaAl(SO4)2·12H2O). This would help address the massive problem with sulfuric
acid-derived acid mine drainage (AMD), of which elevated Al is often a constituent.
There are more than 500,000 abandoned and inactive mines in 32 states and AMD
has degraded more than 8,000 miles of streams in Appalachia alone.
2)	I would have preferred to see pH controlled in a flow-thru set-up, perhaps using a
digital controller (Grippo 1997) rather than by buffers, which introduce a possibly
confounding effect on the results. A flow-through protocol has not yet been
developed for fecundity of Ceriodaphnia dubia but development of such a protocol
would significantly increase environmental realism.
Thank you for your comment.
Reviewer 5
I have not [sic] concerns with regards to the use of the test results in the criteria derivation
process.
Thank you for your comment.

-------
3 Additional Comments Provided
Reviewer
Comments
Response to Comments
Reviewer 4
Suggestions to authors
-Authors frequently use the phrase "In order to". Reducing this phrase to simply "To" will
convey the same meaning with fewer words, enhancing the goal of preparing scientific prose
that exhibits clarity and brevity.
This comment is on the toxicity report for
invertebrates that was conducted by OSU, not on an
EPA document. The authors of the OSU report can
access these comments on our website.
Reviewer 4
-In Part 3.3 BIOLOGICAL RESULTS, paragraph 3 the authors state "The results were quite
comparable to those reported in Gensemer et al. (2018) (EC10/EC20 with 95% confidence
intervals: 504.4 (226 - 1126) µg/L total Al and 631.3 (362 - 1101) µg/L total Al, respectively).
A one sample t-test was performed and the values were not statistically different between
laboratories." Because the comparison was between two independent populations of test
results (ratio of EC10/EC20), a two-sample t-test may have been more appropriate.
This comment is on the toxicity report for
invertebrates that was conducted by OSU, not on an
EPA document. The authors of the OSU report can
access these comments on our website.
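The distinction the reviewer draws between a one-sample and a two-sample t-test can be sketched as follows. This is an illustration under stated assumptions, not the analysis performed in the OSU report: the function names are hypothetical, the effect concentrations are invented placeholder values, and Welch's unequal-variance form is chosen here as one common two-sample variant.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, reference):
    """t statistic comparing a sample mean to a single fixed reference value."""
    n = len(sample)
    return (mean(sample) - reference) / (stdev(sample) / sqrt(n))


def welch_two_sample_t(a, b):
    """Welch t statistic and approximate degrees of freedom for two independent samples."""
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical effect concentrations (ug/L total Al) from two laboratories.
lab_a = [480.0, 510.0, 530.0, 495.0]
lab_b = [600.0, 640.0, 655.0, 630.0]

print(one_sample_t(lab_a, reference=mean(lab_b)))  # treats lab B's mean as a fixed value
print(welch_two_sample_t(lab_a, lab_b))            # accounts for variability in both labs
```

The one-sample form ignores the uncertainty in the comparison value, while the two-sample form includes the variance of both sets of results, which is the reviewer's point about independent populations.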
Reviewer 4
-Table 3-12. Some of the data are set off by both asterisks and bold-type. In the text it is
stated that this indicates significant differences. I suggest including an explanation of what
the bold-face and asterisks denote in the table heading, rather than the text, so the reader does
not have to go searching in the text to determine the meaning of these highlighted results.
This comment is on the toxicity report for
invertebrates that was conducted by OSU, not on an
EPA document. The authors of the OSU report can
access these comments on our website.
Reviewer 5
General Comments:
I found this report to be well written and supported using the information in the appendices. I
support the use of these results for the derivation of the aluminum ambient water quality
criteria.
This comment is on the toxicity report for
invertebrates that was conducted by OSU, not on an
EPA document. The authors of the OSU report can
access these comments on our website.
Reviewer 5
Specific comments from reviewer:
• While the Ceriodaphnia tests followed the protocols as presented in Appendix A, the test as
described by US EPA is a 3-brood test. However as specified in the protocol, the tests were
carried out with 7-days of exposure (and potentially extended another day if 3-broods did not
occur) rather than as a 3-brood test. Thus, the average neonates were considerably higher than
normal 3-brood tests. I think that this should be mentioned in the results. Also, some of the
variability during testing might also be explained because the protocol did not specify that the
neonates are <24 hours old (from an 8-hour window). While the researchers followed the
protocol, these two issues are outside of the US EPA methods that were reported in the
Methods and Materials section (page 2-1).
Thank you for your comment. While the protocol
mentions this caveat, the raw data in Appendix D
verify that all tests were indeed seven-day
three-brood tests. Additionally, both the protocol in
Appendix A and the final report state that the
neonates are less than 24 hours old.
Reviewer 5
• What was the normality of the dilute NaOH and HCl? (Section 2.5, page 2-3)
Thank you for your comment. EPA confirmed with
the authors that the acids/bases used in the studies
are reported as molarity in the appendices for each
study (Appendix D). The molarities of the acids and
bases used for pH adjustment were: 0.01 M, 0.1 M,
1 M, and 5 M HCl and 0.1 M, 1 M, and 5 M NaOH.
Reviewer 5
• Section 2.8 it should be pH rather than all capital letters (page 2-3).
This comment is on the toxicity report for
invertebrates that was conducted by OSU, not on an
EPA document. The authors of the OSU report can
access these comments on our website.
Reviewer 5
• Good spike response, however, I think the dissolved Al observation needs its own
paragraph. It is buried in the middle of the second paragraph on page 3-4.
This comment is on the toxicity report for
invertebrates that was conducted by OSU, not on an
EPA document. The authors of the OSU report can
access these comments on our website. No change is
needed.
Reviewer 5
• The report states that there was no protocol deviations and amendments, however, there
were several deviations that were noted in the text (i.e., 45% bisections rather than 50%
bisections). This section needs revised as well as I recommend, as stated above, the
researchers should assess whether the deviations bias the results potentially high, low, or
neutral.
This comment is on the toxicity report for
invertebrates that was conducted by OSU, not on an
EPA document. Section 3.5 "Protocol Deviations
and Amendments" provides a statement that the
authors noted that no protocol deviations occurred
during the toxicity tests which would affect the
study outcomes. The authors of the OSU report can
access these comments on our website.

-------
4 References Cited by Reviewers and EPA Responses
American Public Health Association (APHA). 2012. Standard Methods for the Examination of
Water and Wastewater, 22nd edition. Washington, D.C.
Cardwell, A.S., W.J. Adams, R.W. Gensemer, E. Nordheim, R.C. Santore, A.C. Ryan and W.A.
Stubblefield. 2017. Chronic toxicity of aluminum, at a pH of 6, to freshwater organisms:
empirical data for the development of international regulatory standards/criteria. Environ.
Toxicol. Chem. 37: 36-48.
Gensemer, R., J. Gondek, P. Rodriguez, J.J. Arbildua, W.A. Stubblefield, A.S. Cardwell, R.C.
Santore, A. Ryan, W.J. Adams and E. Nordheim. 2018. Evaluating the effects of pH, hardness,
and dissolved organic carbon on the toxicity of aluminum to freshwater aquatic organisms under
circumneutral conditions. Environ. Toxicol. Chem. 37: 49-60.
Grippo, R.S. 1997. A gravity-based system for controlling pH in flow-through aquatic toxicity
experiments. Environ. Technol. 18: 763-768.
Santore, R., A.C. Ryan, F. Kroglund, P.H. Rodriguez, W.A. Stubblefield, A.S. Cardwell, W.J.
Adams, E. Nordheim. 2018. Development and application of a biotic ligand model for predicting
the chronic toxicity of dissolved and precipitated aluminum to aquatic organisms. Environ.
Toxicol. Chem. 37:70-79.
USEPA. 2002. Short-term methods for estimating the chronic toxicity of effluents
and receiving waters to freshwater organisms. Fourth edition. Office of Water, U.S.
Environmental Protection Agency, Washington, DC 20460. EPA-821-R-02-013.
USEPA. 2017. Fact Sheet: Draft Aquatic Life Ambient Water Quality Criteria for Aluminum
in Freshwaters. Office of Water, U.S. Environmental Protection Agency.
EPA 820-F-17-002. July 2017.

-------