FINAL REPORT
HISTORICAL DATA QUALITY REVIEW
FOR THE U.S. EPA NATIONAL ESTUARY PROGRAM
to
Office of Marine and Estuarine Protection
U.S. Environmental Protection Agency
Washington, DC
Contract No. 68-03-3319
Work Assignment No. 20
Work Assignment Managers: Joe Hall, Ray Baum
Prepared by
Tetra Tech, Inc.
for
Battelle Ocean Sciences
397 Washington Street
Duxbury, MA 02332
July 1987
-------
CONTENTS
Page
LIST OF FIGURES iii
LIST OF TABLES iv
INTRODUCTION 1
BACKGROUND 1
OBJECTIVES 3
AVAILABLE HISTORICAL DATA 3
OVERVIEW OF DATA USES AND REQUIREMENTS 7
DATA USES 7
DATA REQUIREMENTS 8
QUALITY REVIEW OPTIONS 10
LEVELS OF QUALITY REVIEW 10
TECHNICAL OVERSIGHT OF DATA ENTRY 11
COMPUTERIZED CHECKS 12
TECHNICAL EVALUATION OF ENTERED DATA 14
RECOMMENDATIONS 17
OVERVIEW 17
STANDARD FORMATS AND CODES 19
ESTUARINE VARIABLES 28
CRITICAL DATA REQUIREMENTS 28
RANGE LIMITS 33
NATIONAL QUALITY REVIEW 50
REGIONAL QUALITY REVIEW 50
-------
FIGURES
Number Page
1 Example of a form used to identify priority data sets for
use in estuary characterization 6
2 Overview of the proposed quality review process 18
3 Schematic of the recommended five-level hierarchy for SAS
libraries 23
-------
TABLES
Number Page
1 Variables commonly encountered in historical estuarine
data sets 5
2 List of estuarine variables 29
3 Critical data requirements for estuarine variables 30
4 Range limits for estuarine variables 34
5 Upper range limits for chemical contaminants in the water
column and bottom sediments 38
6 Upper range limits for chemical contaminants in
muscle and liver tissue 44
-------
INTRODUCTION
BACKGROUND
The National Estuary Program is administered by the Office of Marine
and Estuarine Protection (OMEP) of the U.S. Environmental Protection Agency
(EPA). The Program is implemented through U.S. EPA regional offices under
the guidance of OMEP. The National Estuary Program has two major compo-
nents. The first is oversight and implementation of existing estuarine
management programs such as the Chesapeake Bay and Great Lakes Programs.
The second major component is initiation of new programs. At present, new
programs are being developed for Puget Sound (WA), San Francisco Bay (CA),
Long Island Sound (NY), Buzzards Bay (MA), Narragansett Bay (RI), and
Albemarle-Pamlico Sounds (NC).
For each estuary within the National Estuary Program, a five-year
program is developed for addressing environmental problems. In the first
year, a planning initiative is prepared. This initiative defines the
organization of the estuary program and identifies key participants. In the
second, third, and fourth years, environmental problems within the estuary
are identified and evaluated from both a scientific and programmatic
perspective. In the fourth and fifth years, a comprehensive conservation
and management plan is developed. This plan presents the details of how
environmental problems will be corrected, including who will conduct various
activities and when those activities will be conducted.
A key process in addressing the environmental problems of an estuary is
defining those problems and conveying the relevant information to the
public. This process is termed characterization, and occurs in the
following major steps:
• The historical (i.e., existing) data sets needed to define
  environmental problems are identified, collected, and
  screened.
• New data are generated to fill important gaps in the
  historical database.
• Data are analyzed to define the present status of the estuary,
  historical trends, and likely future trends if current
  practices are not modified.
• Results of the data analyses are conveyed to the public in a
  form that can be understood and supported.
Most of the individual estuary programs rely primarily upon historical
data to characterize the status and trends of estuarine conditions. Given
the value of historical data to the development of estuary programs, it is
essential that these data be treated in a manner that maximizes their
usefulness to the individual estuary programs. This treatment includes
identification of priority data sets, transfer of data to computer files,
and verification of data quality.
The National Estuary Program, in conjunction with U.S. EPA regional
offices, has identified a number of historical data sets as useful for
characterizing estuarine conditions. These data sets have been or will be
transferred to SAS computer files on the U.S. EPA National Computer Center
(NCC) mainframe computer. However, before these data can be used to
characterize the status and trends of estuarine conditions, they will be
subjected to a quality review process to ensure they are appropriate for
those evaluations.
Although the quality requirements for new data collected by individual
estuary programs generally are known and specified, the requirements for
historical data are not well defined. Specification of quality requirements
for historical data is difficult, because these data often were collected
for a variety of reasons using different methods. In addition, much of the
information required to conduct a detailed quality review of historical data
is not available. Quality review of historical data must therefore strike a
balance between the ideal of a rigorous scrutiny of all data and the reality
of the limitations of this kind of data.
OBJECTIVES
The primary objective of this document is to develop an approach for
conducting quality reviews of historical data used by the National Estuary
Program. The goal is to ensure that all data used to characterize estuarine
conditions pass a minimum level of quality review. Data users can therefore
be assured that these data are of known quality.
The proposed quality review approach is described from a national
perspective, to ensure consistency among individual estuary programs.
However, the approach has the flexibility to be modified as necessary to
meet the specific needs of individual programs. For example, additional
variables can be added or more stringent quality review criteria can be
specified for individual programs. To be cost-effective, the proposed
quality review approach is based primarily on computerized checks, rather
than evaluations by technical experts. However, an overview is presented of
the general kinds of technical review that may be conducted by the individual
estuary programs.
The remainder of this section describes the data sets currently selected
for use by the National Estuary Program. The following sections provide
overviews of how data are used by the program and what options are available
for conducting quality review evaluations of those data. The last section
of the document presents the quality review approach recommended for
historical data used by the National Estuary Program.
AVAILABLE HISTORICAL DATA
Historical estuarine data generally are found in two major forms:
measurements and attributes. Measurements are data to which numerical
values can be assigned (e.g., concentrations of dissolved oxygen), whereas
attributes are data that cannot be measured or ordered, but must be expressed
qualitatively (e.g., male or female, juvenile or adult). Both kinds of data
are valuable for characterizing the status and trends of estuaries.
A wide variety of variables is encountered in historical estuarine data
sets (Table 1). Most variables pertain to the characteristics of stations,
the water column, sediments, or organisms. The contents of individual data
sets range from several variables (e.g., abundance of fish at a transect) to
a very large number of variables (e.g., a large-scale survey of chemical
contamination and biological effects).
The specific data sets used for characterization by individual estuary
programs generally are a subset of the total number of data sets available
for each estuary. These priority data sets are selected on the basis of the
following criteria:
• Relevance of the data to the objectives of characterization.
• Identity of the key variables included in the data set.
• Preliminary quantitative or qualitative assessment of the
  quality of the data.
• Accessibility of the data set.
To assist in the identification of priority data sets, forms (Figure 1) are
frequently sent to investigators to obtain detailed information on candidate
data sets.
At present, over 40 priority data sets from four estuaries have been
identified and entered into SAS files on the U.S. EPA NCC computer. Within
the next year, up to 60 additional data sets may be added to this estuarine
database.
-------
TABLE 1. ENVIRONMENTAL VARIABLES COMMONLY ENCOUNTERED
IN HISTORICAL ESTUARINE DATA SETS

Kind of Data              Variable (a)

Station description       Position - latitude and longitude or
                            other kinds of coordinates
                          Depth
                          Sampling time - date, hour
                          Ambient conditions - tidal stage and
                            height, current speed and
                            direction, wave height, wind speed
                            and direction

Water column variables    Nutrients - various forms of
                            nitrogen and phosphorus
                          Organic carbon
                          Alkalinity
                          Temperature
                          pH
                          Salinity
                          Specific conductivity
                          Dissolved oxygen
                          Transparency
                          Turbidity
                          Total suspended solids
                          Chloride

Sediment variables        Grain size
                          Total solids
                          Total volatile solids
                          Total organic carbon
                          Oil and grease
                          Chemical contaminants (b)

Biological variables      Bacterial indicators - abundance in
                            water and tissue
                          Plankton - species composition and
                            abundance
                          Benthic macroinvertebrates - species
                            composition and abundance
                          Fishes and megainvertebrates (c) -
                            species composition and abundance,
                            tissue concentrations of chemical
                            contaminants (b), histopathology

(a) Variables were selected from the historical data sets already
    submitted to the National Estuary Program.
(b) U.S. EPA priority pollutants and other chemicals.
(c) Large invertebrates captured in trawls, dredges, and traps.
    Distinguished from smaller benthic macroinvertebrates that are sampled
    using grabs or box corers.
-------
LONG ISLAND SOUND DATA CHARACTERIZATION
OXYGEN DEPLETION IN WESTERN LONG ISLAND SOUND
1. LIS Document Reference Number:
2. Organization Contacted:
3. Principal Investigator:
4. Contact:
5. Telephone Number:
6. Address of Contact:
7. Citation:
   a) Author(s)
   b) Year
   c) Title
   d) Journal/Rept.
   e) Volume: Number
   f) Pages
8. Sample, Survey Type:                        Y/N   Frequency/Resolution
   a) Station(s)
   b) Synoptic Survey
   c) Vertical Resolution
9. Measurements:                               Y/N   Units
   a) Dissolved Oxygen
   b) % Oxygen Saturation
   c) Temperature
   d) Salinity
   e) Phytoplankton Pigments
   f) Phytoplankton Counts
   g) Inorganic Nutrients (Ammonium,
      Nitrite, Phosphate, Silicate)
   h) Organic Nutrients (DOC, TOC,
      DON, TON, DOP, TOP)
   i) BOD, COD
   j) Biological Rates (Primary Productivity,
      Water Respiration, Sediment
      Respiration, etc.)
10. Data, Study Area:
11. Time Span of Data:   From            To
12. Status of Data:                            Y/N   Availability   Cost
   a) Raw
   b) Reprint
   c) Computerized
   d) Database Name
   e) Data Products
13. Comments:
Figure 1. Example of a form used to identify priority data sets for use in
estuary characterization.
-------
OVERVIEW OF DATA USES AND REQUIREMENTS
This section provides a general description of how historical data are
used by the National Estuary Program, and the requirements the data must
meet to be acceptable for the desired uses. This information is needed to
evaluate the quality review options and recommendations that are described
in subsequent sections of this document.
DATA USES
The primary use of historical estuarine data by the National Estuary
Program is for characterizing the status and trends of conditions within
specific estuaries. In general, characterization has four major components:
• Identification of important variables.
• Spatial patterns of variables.
• Temporal trends of variables.
• Relationships among variables.
Descriptions of variables include evaluations of the chemical, physical,
and biological characteristics of all or part of each estuary. These
descriptions are useful as a broad overview of the conditions encountered in
each estuary. They may include lists of the species and chemicals that are
commonly encountered within an estuary. Descriptions may also include the
mean values and ranges of conditions (e.g., water temperature, salinity,
depth) within the estuary.
Evaluations of the spatial patterns of variables within an estuary are
useful for identifying such locations as critical habitats, resource
harvesting areas, pollutant sources, and areas exhibiting environmental
impacts. Spatial patterns usually are displayed by mapping or contouring
the values of a variable. These kinds of maps can be used by managers and
the public to visualize the magnitude and extent of environmental problems.
Evaluations of the temporal trends of variables within an estuary are
useful for determining how variables change over time. This information can
be used to assess how conditions have varied in the past and how they might
change in the future. Temporal trends usually are displayed by plotting the
values of a variable observed at different times. This kind of information
is important for determining if environmental conditions are improving or
deteriorating over time.
Evaluations of the relationships among variables within an estuary are
useful for determining potential cause and effect relationships. For
example, by evaluating similarities in the spatial patterns (e.g., pollutant
sources and impacted areas) or temporal trends (e.g., increasing turbidity
and decreasing density of aquatic vegetation), potential cause and effect
relationships can be identified. Relationships among variables can be
evaluated by simply plotting values and looking for similar trends.
Alternatively, relationships among variables can be evaluated more rigorously
using statistical techniques (e.g., correlation, regression, analysis of
variance). Understanding the relationships among variables is an important
step in the process of recommending corrective action.
DATA REQUIREMENTS
The requirements necessary for interpreting estuarine data can be
subdivided into those that are universal (i.e., they apply to all variables)
and those that are specific to each variable. Universal data requirements
are the location and time of data collection, the methods used to measure
the variable, and the measurement units in which the data are expressed.
Variable-specific requirements depend upon the intended use of the data.
Location of data collection for all kinds of estuarine data refers to
the geographic position of the sampling site within an estuary. It generally
is expressed as latitude and longitude, or as coordinates of alternate
systems (e.g., Loran and Raydist navigation systems, state plane grids) that
can be converted to latitude and longitude. For some purposes, location can
be expressed less precisely as the waterway or estuary segment within which
data were collected.
In addition to geographic position, location for some variables also
refers to vertical position. Examples of vertical position are depth in the
water column, depth below the sediment surface, and elevation above sea
level (i.e., altitude). For interpretation of some variables, knowledge of
vertical position can be as critical as knowledge of geographic location.
Time of data collection refers to the hour, day, month, or year in
which sampling occurred. Depending on the kind of data and the intended
use, the precision with which time is expressed can vary widely. For
example, evaluations of diel movements of fish might require that sampling
time be reported to the nearest hour, whereas stock assessments of fish
might only require that data be reported to the nearest month.
The measurement units of each variable must be known to interpret the
absolute magnitude of each data value. In most cases, data must be converted
to common units before being compared. The kinds of units reported for each
data value therefore are not important, as long as they can be converted to
the units commonly used for the variable. In some relatively rare cases,
estuarine data are unitless by definition. Unitless data frequently are
encountered when indices are used. In such cases, to interpret the absolute
magnitude of each data value, it is critical to know how the unitless data
were derived.
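The conversion to common units described above can be sketched in a few lines of Python. This is an illustration only: the function name `to_mg_per_kg`, the table `FACTORS_TO_MG_PER_KG`, the unit labels, and the choice of mg/kg as the common unit are assumptions made for this example and are not specified by the program.

```python
# Illustrative only: unit labels and the target unit (mg/kg) are assumptions.
FACTORS_TO_MG_PER_KG = {
    "mg/kg": 1.0,
    "ug/g": 1.0,     # 1 ug/g equals 1 mg/kg
    "ug/kg": 0.001,  # 1000 ug/kg equal 1 mg/kg
    "ng/g": 0.001,   # 1 ng/g equals 1 ug/kg
}


def to_mg_per_kg(value, unit):
    """Convert a mass-per-mass concentration to mg/kg.

    Raises ValueError for an unknown unit, because a value whose units
    cannot be converted cannot be compared with other values.
    """
    try:
        return value * FACTORS_TO_MG_PER_KG[unit]
    except KeyError:
        raise ValueError(f"cannot convert unit {unit!r} to mg/kg") from None
```

Rejecting unknown units, rather than passing the value through unchanged, reflects the point made above: the reported units matter only insofar as they can be converted.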
Variable-specific data requirements are dependent upon the intended
uses of the data. In general, the critical data requirements for a variable
should include the universal data requirements discussed previously, as well
as any other information that is essential for interpretation of a data
value. Additional data requirements (i.e., beyond those considered critical)
may be necessary for specialized uses of the data (e.g., detailed statistical
analyses).
-------
QUALITY REVIEW OPTIONS
This section describes the general kinds of quality review that can be
applied to estuarine data. This information provides the basis for the
detailed recommendations made in the final section of this document.
LEVELS OF QUALITY REVIEW
The level of quality review applied to a data set can vary from no
review to detailed scrutiny of every data value. From a cost-benefit
standpoint, neither extreme may be desirable. In the former case, failure
to identify and correct substantial errors could lead to costly and
ineffective management decisions based on those data. In the latter case,
excessive quality review may be costly and yield little additional benefit
in terms of enhanced data quality, compared to a more modest review.
The optimal level of quality review generally lies between the extremes
of no review and detailed review of all data. This review may consist of a
combination of computerized checks and evaluations by technical experts. In
general, computerized checks can be conducted inexpensively on a complete
data set. By contrast, technical evaluation generally is more expensive, and
therefore usually cannot be applied to all values in a data set. However, a
technical evaluation can produce an assessment of many aspects of a data set
that computerized checks cannot. The optimal quality review approach
combines the strengths of both kinds of evaluation to effectively review a
data set at a reasonable cost.
The remainder of this section describes the kinds of computerized
checks and technical review that can be applied to estuarine data. In the
following section, recommendations are made for combining these two kinds of
review to evaluate the historical data used by the National Estuary Program.
-------
TECHNICAL OVERSIGHT OF DATA ENTRY
When data in hard-copy form are entered into a machine-readable format,
it is desirable that a technical expert oversee the entry process. The two
major kinds of technical oversight are 1) assurance that data from the hard
copy are interpreted accurately and 2) assurance that data are transferred
accurately to the machine-readable format.
Historical data in hard-copy form frequently are found in a variety of
locations (e.g., text, tables, appendices) and formats (e.g., different
units, significant figures). Because data entry personnel may not have the
training and experience required to understand the details of technical
information, it may be necessary for a technical expert to ensure that data
are interpreted accurately prior to entry. Data interpretation may include
transformation to different units, rounding off to fewer significant figures,
or calculations (e.g., from wet weight to dry weight). It may also include
review of the data source to ensure that all pertinent supporting information
is collected with the data values. Such information might include detection
limits for chemical analyses, mesh size for benthic infaunal analyses, or
depth for water column variables. Technical oversight at this stage is
critical because subsequent data users may not have access to the original
hard copies and therefore cannot check for accurate interpretation of the
data.
Whenever data are transferred from hard-copy form to a machine-readable
format, it is advisable to check at least 10-15 percent of the data for
accurate transferral. Accurate transferral refers to use of proper codes
and formats, as well as accurate entry of values. Given the complexity of
many historical environmental data sets, it is preferable that a technical
expert oversee the transferral checks. It is recommended that these checks
focus primarily upon the most complex components of each data set (i.e.,
those components having the highest potential for data transferral errors).
In many data sets, these components are related to taxonomic names and names
of complex organic compounds.
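One way to draw the records for a transferral check is sketched below in Python. The function `select_verification_sample` and its parameters are hypothetical names introduced for this example. The sketch draws a simple random sample; it does not weight the selection toward the most complex components of a data set, which would require additional, data-set-specific rules.

```python
import math
import random


def select_verification_sample(record_ids, fraction=0.10, seed=None):
    """Return a random subset of record identifiers to compare with the hard copy.

    fraction -- share of records to check (0.10 to 0.15 matches the text above)
    seed     -- optional seed so the same sample can be drawn again
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    ids = list(record_ids)
    if not ids:
        return []
    # Round up so that small data sets still receive at least one check.
    n = max(1, math.ceil(fraction * len(ids)))
    rng = random.Random(seed)
    return sorted(rng.sample(ids, n))
```

Recording the seed with the data set would allow a later reviewer to reproduce which records were checked.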
-------
COMPUTERIZED CHECKS
The speed and reliability of computers can be used to conduct a
variety of cost-effective quality review checks. Four major kinds of
computerized checks include the following:
• Format checks to ensure data are entered in the proper format.
• Coding checks to ensure that all codes are valid.
• Range checks to ensure that all numerical values fall within
  specified ranges.
• Checks for critical data requirements to ensure that all
  essential ancillary information (e.g., station location,
  sampling time) is available.
To conduct computerized format and coding checks efficiently, it is
essential that all machine-readable data sets have a uniform file structure
and coding system. Data sets in hard-copy form can be recoded before entry,
and then entered directly according to the standard format. By contrast,
data sets existing in machine-readable form must be recoded and reformatted
automatically. As mentioned in the previous section, reformatting and
recoding of data generally require technical oversight to ensure that the
diverse kinds of data encountered in unrelated original data sources are
translated properly into the desired uniform system.
Format checks ensure that no data field contains inappropriate
characters. For example, fields with numeric data should not contain
alphabetic characters, and alphabetic fields should not contain numeric
characters. Format checks will not ensure that numeric and alphabetic
characters were entered accurately.
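A format check of this kind might look like the following Python sketch. The field names, the two field types, and the regular expressions are assumptions made for illustration (for example, the numeric pattern does not accept exponent notation, and the alphabetic pattern permits spaces).

```python
import re

# Assumed patterns: plain decimal numbers, and letters with optional spaces.
NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")
ALPHABETIC = re.compile(r"[A-Za-z ]+")
PATTERNS = {"numeric": NUMERIC, "alphabetic": ALPHABETIC}


def format_errors(record, field_types):
    """Return the names of fields that contain inappropriate characters.

    record      -- mapping of field name to the text that was entered
    field_types -- mapping of field name to "numeric" or "alphabetic"
    """
    bad = []
    for name, kind in field_types.items():
        text = record.get(name, "").strip()
        # Blank fields are skipped here; missing information is a separate check.
        if text and not PATTERNS[kind].fullmatch(text):
            bad.append(name)
    return bad
```

A temperature entered as `1O.2` (with the letter O) would be reported, whereas a value of 0.4 entered mistakenly as `4.0` would pass, which illustrates the limitation noted above.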
Coding checks ensure that all coded entries have valid codes. For
example, if taxonomic codes are used instead of species names, the coding
checks will determine whether or not each taxonomic code is valid (i.e., it
is in the code dictionary). These checks will not determine whether each
valid code is properly matched with each species name.
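The coding check reduces to a membership test against the code dictionary, as in this Python sketch. The function name `invalid_codes` and the record layout are hypothetical, and any codes used with it are placeholders rather than entries from an actual taxonomic code system.

```python
def invalid_codes(records, code_field, code_dictionary):
    """Return (row_index, code) pairs whose code is not in the dictionary.

    records         -- list of mappings, one per data record
    code_field      -- name of the field that holds the code
    code_dictionary -- collection of valid codes (e.g., code -> species name)
    """
    valid = set(code_dictionary)
    return [
        (i, rec.get(code_field))
        for i, rec in enumerate(records)
        if rec.get(code_field) not in valid
    ]
```

The function confirms only that a code exists in the dictionary; as stated above, it cannot tell whether a valid code was assigned to the right species.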
To conduct range checks, a list of variable-specific ranges must be
developed. Each range establishes the numerical limits within which the
value of a variable is expected to occur. The automated checks identify
data that lie outside the specified ranges. For example, the range limits
for the sediment concentrations of naphthalene might be 0 and 10 mg/kg (dry
weight). A value of 20 mg/kg would therefore be identified as being outside
the specified range.
Range limits can be established to identify at least two kinds of
extreme values. For example, an initial upper range limit of 10 mg/kg might
be used to identify naphthalene concentrations that are unusually high, but
sometimes found. This initial range limit would identify potential errors.
A second upper range limit of 50 mg/kg might also be used to identify concen-
trations that are unusually high, and unlikely to be found. This second
range limit would identify probable errors.
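The two-tier range check can be sketched in Python as follows, using the naphthalene limits from the example above (0, 10, and 50 mg/kg dry weight). The function name, the parameter names, and the wording of the returned labels are assumptions made for illustration.

```python
def classify_value(value, lower, upper_potential, upper_probable):
    """Classify a value against a lower limit and two upper range limits.

    upper_potential -- values above this are unusually high but sometimes found
    upper_probable  -- values above this are unlikely to be found
    """
    if value < lower:
        return "below lower limit"
    if value > upper_probable:
        return "probable error"
    if value > upper_potential:
        return "potential error"
    return "within range"


# Naphthalene in sediment, mg/kg dry weight, limits taken from the text.
label = classify_value(20.0, lower=0.0, upper_potential=10.0, upper_probable=50.0)
```

With these limits, the 20 mg/kg value discussed above is labeled a potential error, and only a value above 50 mg/kg would be labeled a probable error.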
An important consideration when using range checks is that the results
only indicate which values are inside or outside specified ranges. They do
not indicate that values within the ranges are correct. For example, a
naphthalene concentration of 0.4 mg/kg that was entered mistakenly as
4.0 mg/kg would pass the range-checking procedure, but be incorrect.
To conduct computerized checks for critical data requirements, a list of
variable-specific essential ancillary information must be developed. This
information represents the supporting information that is essential for
interpreting a particular data value. For example, knowledge of sieve mesh
size might be a critical data requirement for interpreting the total
abundance of benthic invertebrates at a station. An abundance of 10,000
individuals/m2 might be interpreted as high if a 1.0-mm mesh was used,
whereas the same value might be considered low if a 0.5-mm mesh was used.
Once the list of critical data requirements is developed, data sets can be
searched automatically and missing critical data requirements can be
identified as such. The identification of a missing critical data requirement
does not imply that the respective data value is incorrect; it only
suggests that the data value will be difficult to interpret.
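A check for critical data requirements could be sketched in Python as shown below. The variable and field names (`BENTHIC_ABUNDANCE`, `MESH_MM`, and so on) are hypothetical, and the requirement list passed to the function stands in for the variable-specific list that would have to be developed first.

```python
def missing_requirements(record, variable, requirements):
    """Return the critical ancillary fields that are absent or blank.

    record       -- mapping of field name to value for one data record
    variable     -- name of the variable being reviewed
    requirements -- mapping of variable name to its list of critical fields
    """
    return [
        name
        for name in requirements.get(variable, [])
        if record.get(name) is None or str(record[name]).strip() == ""
    ]
```

An empty result means that all listed ancillary information is present. A non-empty result identifies what is missing without altering or rejecting the data value itself.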
TECHNICAL EVALUATION OF ENTERED DATA
A second level of quality review that can be conducted in conjunction
with computerized checks is evaluation of entered data by technical experts.
This kind of evaluation is most valuable if the experts have access to the
documents that describe the field and laboratory techniques used to generate
the data. However, technical evaluation is valuable whether or not the
original documentation is available.
In many cases, automated quality review checks simply identify aberrant
data. The data user must decide how the aberrant data will influence the
intended use of a data set. If the data user does not have the technical
training to understand the implications of the aberrant data, the data set
may either be used inappropriately or rejected unnecessarily. Thus, to
ensure that data sets containing aberrant data are used properly, a technical
evaluation may be desirable.
The most detailed kind of technical evaluation involves assessments of
the study design, sampling procedures, and analytical methods used to
generate the data set of interest. This kind of evaluation usually requires
review of the original documents from which the data were taken.
Evaluation of the study design might focus on how the study objectives
influence subsequent uses of the data. For example, if the objectives were
to characterize conditions near sources of contamination, most stations
within a particular water body may be located as close to sources as
possible. Use of such a highly biased data set to characterize conditions
throughout the water body could produce misleading results.
Evaluation of sampling protocols can determine how they influenced the
accuracy of the resulting data. For example, checks can be made to ensure
that collection equipment was operated properly (e.g., an otter trawl was
fishing on the bottom), that samples were handled appropriately following
collection (e.g., preserved as specified), and that the entire sampling
effort was documented adequately (e.g., adequate logkeeping and chain-of-
custody). The knowledge that these procedures were executed properly greatly
increases confidence in the resulting data.
As with sampling protocols, evaluation of analytical methods can
determine how they influenced the accuracy of the resulting data. For
example, checks can be made to ensure that acceptable methods were followed
(e.g., that departures from standard protocols were justified) and that
application of the methods was adequate (e.g., that analyses of standards or
spiked samples were acceptable).
A less detailed kind of technical review would place less emphasis on
examining original documents, and focus primarily on the information
available in machine-readable form. Automated quality review checks would
greatly facilitate this kind of review by identifying data that do not
conform to established criteria. As mentioned in the previous section,
these automated checks can include checks for proper formats, valid codes,
range limits, and critical data requirements.
Data identified as having improper format or invalid codes can be
evaluated to determine the implications of their exclusion from subsequent
uses of the data set. If data are not considered critical for certain kinds
of analyses, they can be deleted from those analyses. However, if data are
considered essential for an intended use, the technical expert may be
required to examine the previous machine-readable or hard-copy forms of the
data set to rectify the problem.
Data identified as lying outside of range limits can be evaluated to
determine whether the values may be accurate or whether they appear to be
erroneous. A technical expert familiar with the conditions encountered in a
particular estuary often can review supporting information such as station
location, season, depth, and habitat characteristics, and judge whether an
unusual value was possible under the specific set of existing conditions.
In some cases, review of original documentation may be required to evaluate
an unusual value.
-------
When critical data are missing, a technical expert may be required to
determine the implications of the missing information with respect to
subsequent uses of the data set. For example, if information on sieve mesh
size is missing for a data set composed of abundances of benthic inverte-
brates, meaningful comparisons with other data sets based on known mesh
sizes would not be possible. Because abundance is partly a function of
sieve mesh size, interpretation of differences in abundances between data
sets would be difficult. The differences could be due primarily to mesh
size differences rather than to differences in the variable under study
(e.g., concentration of a chemical contaminant).
-------
RECOMMENDATIONS
OVERVIEW
This section presents recommendations for conducting quality reviews of
historical data used by the National Estuary Program. The background for
these recommendations is presented in previous sections. The recommended
quality review process (Figure 2) relies primarily upon computerized checks
of entered data. However, the potential roles for technical oversight and
review are also described. Key assumptions used to derive these recommenda-
tions include the following:
• All estuarine data must pass some level of quality review.
• Data not passing quality review criteria will be flagged, but
  otherwise left intact in the database.
• Individual estuary programs will be responsible for determining
  their own quality review criteria.
• Funding for quality review will be limited, requiring that
  emphasis be placed on cost-effective computerized checks.
The initial step of the quality review process involves translating
diverse historical data into a set of standard codes and a standard format.
For data already in machine-readable form, translation involves computerized
recoding and reformatting. For data in hard-copy form, translation entails
manual recoding and reformatting as data are entered into computer files.
It is recommended that the manual recoding and reformatting be conducted
with technical oversight, to ensure that data are translated and entered
accurately.
-------
[Flow diagram. Historical data in machine-readable form undergo data
reformatting and recoding; historical data in hard-copy form undergo data
entry with technical oversight. Both paths lead to NCC SAS files in standard
format with standard codes. Computerized checks (format, codes, critical data
requirements, range limits) draw on the quality review dictionary, which
holds national and/or regional criteria files for critical data requirements
and range limits. The checks produce SAS files with data qualifiers, which
may receive optional technical evaluations by the regional estuary program.]
Figure 2. Overview of the recommended quality review process.
-------
The next step in the quality review process entails computerized checks
of formats, codes, critical data requirements, and range limits for a group
of estuarine variables. To accomplish this, a series of computer programs
will be developed, with each program specific to a particular type of data
(e.g., sediment chemistry, water quality). These programs will read in the
SAS data files and compare them with a quality review dictionary. The
critical data requirements and range limits will be specified in these
dictionary files. Because the specifications in the dictionary files can be
modified independently by each estuary program, quality review checks can be
tailored to the specific needs of each estuary. After scanning the SAS data
files and the appropriate quality review dictionary, the computer programs
will produce new SAS data files containing qualifiers for all those data
that failed to meet the specifications in the quality review dictionary.
Aside from being flagged, these data will be left intact in the database.
The initial variables and range limits to be included in the quality review
dictionary are discussed in the following sections.
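The dictionary-driven check described above can be sketched as follows. This is a minimal illustration in Python, not the SAS programs the report proposes: the function names, the variable codes, and the qualifier codes "R" and "N" are assumptions made for the example. The national limits shown are the "unlikely" limits for pH and salinity from Table 4, and the regional salinity limit is invented to show a program-specific override.

```python
# Hypothetical sketch of a dictionary-driven check. All names and qualifier
# codes are illustrative assumptions, not taken from the report.

# National criteria: variable -> (lower, upper), using the "unlikely" (B)
# limits listed in Table 4 for pH and salinity.
NATIONAL_LIMITS = {
    "PH": (5, 11),
    "SALINITY": (0, 35),
}


def build_dictionary(national, regional_overrides=None):
    """Return a quality review dictionary; regional entries replace national ones."""
    merged = dict(national)
    merged.update(regional_overrides or {})
    return merged


def apply_checks(records, dictionary):
    """Return copies of the records with a qualifier added; values are never altered."""
    checked = []
    for rec in records:
        out = dict(rec)  # copy, so the input record is left intact
        limits = dictionary.get(rec["VARIABLE"])
        if limits is None:
            out["F_AMOUNT"] = "N"  # assumed code: no criteria available
        elif limits[0] <= rec["STD_AMOUNT"] <= limits[1]:
            out["F_AMOUNT"] = ""   # passed the range check
        else:
            out["F_AMOUNT"] = "R"  # assumed code: failed the range check
        checked.append(out)
    return checked


# Made-up measurements for illustration only.
records = [
    {"VARIABLE": "PH", "STD_AMOUNT": 7.9},
    {"VARIABLE": "PH", "STD_AMOUNT": 12.4},
    {"VARIABLE": "SALINITY", "STD_AMOUNT": 33.0},
]

# A regional program narrows the salinity limit (invented value).
regional = build_dictionary(NATIONAL_LIMITS, {"SALINITY": (0, 30)})
flagged = apply_checks(records, regional)
```

Keeping the criteria in a separate dictionary, as the report recommends, means the checking routine itself does not change when a regional program adjusts its limits.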
The final step in the proposed quality review process is optional, and
will be conducted by the regional estuary programs. This step involves
evaluations of the machine-checked data by technical experts. In some cases,
these evaluations may require review of the original hard-copy documentation
of the data.
The remainder of this section describes the proposed coding and
formatting systems and the computerized quality review criteria that will be
applied to the standard variables in each historical estuarine data set. In
addition, general guidance is provided for the kinds of criteria modification
and technical review that can be conducted by the regional estuary programs.
STANDARD FORMATS AND CODES
A key element in the recommended quality review procedures shown in
Figure 2 is the use of standard data formats and codes. By standardizing
these data elements, computer programs will need to be developed only once.
Costs for quality review will be minimized, because these programs will not
require extensive modifications for each data set that is scanned. An
additional benefit of standardization is increased user familiarity with
data files and variables.
Formats
This section describes the recommended system for formatting historical
estuarine data. The standardized, modular structure of the system is
designed for the following purposes:
•  Reduce quality review and maintenance costs over a multiyear
   operational period.
•  Ensure consistency in naming conventions and file structures.
•  Facilitate system updates and modifications.
•  Minimize use of on-line disk space at NCC.
•  Facilitate use of data by program participants.
•  Reduce training time and associated costs.
•  Minimize data retrieval time.
•  Facilitate the addition of specialized data from individual
   estuary programs.
To achieve the above objectives, it is recommended that top-down
standards be established for the following system levels:
•  Names and organization of SAS libraries as catalogued in the
   NCC environment.
•  Names and organization of members within SAS libraries.
•  Names and organization of variables within the SAS members.
Details on standards for these levels are provided in the following
sections.
Naming Conventions and Organization of SAS Libraries
At the highest level, SAS library names should be developed in a
consistent fashion to allow users to quickly identify and retrieve data of
interest. It is recommended that the following standard three-level naming
convention be used:
PREFIX.ESTUARY.DATA_TYPE, where:
PREFIX = the catalog prefix assigned by NCC (e.g., XXXODES)
ESTUARY = a two-character code unique to each estuary study area
(e.g., "NB" for Narragansett Bay)
DATA_TYPE = a three-character code for standard types of data (e.g.,
"WAC" for Water Column Data).
For example, all water column data for Narragansett Bay would be stored in
an SAS library named "XXXODES.NB.WAC."
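As a small illustration of the three-level convention, the sketch below assembles and validates a library name. It is written in Python for brevity; the function name and the validation rules (exactly two characters for the estuary code, three for the data type) are assumptions drawn from the description above, not an existing NCC or SAS utility.

```python
def library_name(prefix, estuary, data_type):
    """Build PREFIX.ESTUARY.DATA_TYPE from its three parts (hypothetical helper)."""
    if len(estuary) != 2:
        raise ValueError("estuary code must be two characters, e.g. 'NB'")
    if len(data_type) != 3:
        raise ValueError("data type code must be three characters, e.g. 'WAC'")
    return ".".join((prefix, estuary.upper(), data_type.upper()))


# The report's own example: water column data for Narragansett Bay.
name = library_name("XXXODES", "NB", "WAC")
```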
Naming Conventions and Organization of SAS Members
Within the SAS libraries, members should be organized in a five-level
hierarchy based on the range of information they contain. Member names and
hierarchy levels should be the same for all data types. Information common
to more than one level should be retained only at the highest level for
which it is relevant. All levels should contain one or more primary sort
keys that would enable users to move from level to level by using SAS "MERGE"
commands.
The standardized five-level hierarchical organization minimizes data
retrieval time, user-training time, and system resource demands. Because it
is modular, it gives program participants a great deal of flexibility in
their use of data, and it simplifies modification and maintenance of the
entire system. It is recommended that the following hierarchy of members
and member names be used for all data types:
•  DATA_SET - contains basic information about the data
   collected; provides a descriptive index to the data set.
•  VARIABLE - contains a list of all variables in the data set
   and their general quality review status (data dictionary).
•  STATION - contains station-specific information and flags.
•  SAMPLE - contains sample-specific information and flags.
•  SOURCE - contains variable-specific information and flags;
   may also contain additional regional variables.
Figure 3 provides a diagram of the hierarchical relationship among these
five SAS members. For example, under this organizational scheme, all
station-specific data values for Narragansett Bay water column data would be
contained in SAS library XXXODES.NB.WAC, member STATION. These values would
not be repeated in member SAMPLE. To obtain station-specific values for use
with SAMPLE data, users would simply sort and then merge STATION and SAMPLE
by their common primary sort key, station code (STN_CD).
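The sort-and-merge step can be illustrated outside SAS. The sketch below is a Python stand-in for a SAS sort followed by MERGE, assuming STN_CD is the key shared by STATION and SAMPLE (as listed for those members in Figure 3). The records and the function name are invented for the example.

```python
# Invented example records; field names follow the report's standard variables.
stations = [
    {"DS_ID": "NB001", "STN_CD": "ST01", "SDEPTH": 12.5},
    {"DS_ID": "NB001", "STN_CD": "ST02", "SDEPTH": 4.0},
]
samples = [
    {"STN_CD": "ST02", "SAMP_ID": "0001", "DATE": 860714},
    {"STN_CD": "ST01", "SAMP_ID": "0002", "DATE": 860715},
    {"STN_CD": "ST01", "SAMP_ID": "0001", "DATE": 860714},
]


def merge_on_station(stations, samples):
    """Attach station-level values to each sample by matching on STN_CD."""
    by_station = {s["STN_CD"]: s for s in stations}
    merged = []
    # Sort samples by their keys first, mirroring the sort-then-merge pattern.
    for sample in sorted(samples, key=lambda r: (r["STN_CD"], r["SAMP_ID"])):
        station = by_station.get(sample["STN_CD"], {})
        merged.append({**station, **sample})
    return merged


merged = merge_on_station(stations, samples)
```

Because station depth is stored once in STATION rather than on every sample, the merge is the only place where the two levels are combined.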
Naming Conventions and Organization of SAS Variables
For each SAS member, there will be a series of standard SAS variables.
Variables will remain as uniform as possible across all data types, recog-
nizing the obvious differences in file structures for different data types.
For example, members SAMPLE and SOURCE will contain additional depth
variables for water column data, and member SOURCE may contain additional
regional variables. Currently, DS_ID, STN_CD, and SAMP_ID are designed to
be used as primary sort keys on which data from different members may be
matched. However, all variables have the potential to be used as keys which
may be matched to information in other files, tables, and data dictionaries.
[Figure 3 diagram: DATA_SET (keys: DS_ID); STATION (keys: DS_ID, STN_CD);
VARIABLE (keys: DS_ID, STN_CD); SAMPLE (keys: STN_CD, SAMP_ID); SOURCE
(keys: STN_CD, SAMP_ID).]
Figure 3. Schematic of the recommended five-level hierarchy for SAS
libraries.
Use of these standard variables will enable users to access special
tables and data dictionaries in a logical and efficient manner; obtain
uniform definitions, ranges for variables, and units of measurement for data
comparisons; and document and disseminate information according to a
standardized format. For example, program participants may choose to
develop specialized tables of values to perform additional edits on their
data. This standardized system would allow those program participants to
use the same tables to selectively process all data sets in the system. By
contrast, an unstandardized method of naming and organization would require
the use of multiple tables. Use of these standard variables should also
simplify system modification, maintenance, and documentation. In accordance
with these standards, it is recommended that standard variables be used and
organized as follows [note that field type (A=alphanumeric, N=numeric,
I=integer) and length are listed for each variable]:
•  DATA_SET (Standard format for all data types, K = Key Variable)
   K  -- DS_ID - data set identification code (A10)
      -- ESTUARY - name of the estuary from which the data were
         obtained (A20)
      -- DATASET - name of the data set (A40)
      -- SUBMITR - name of the individual or organization responsible
         for submittal of the data (A15)
      -- SUB_ADDR - address of the data submitter (A40)
      -- SUB_PHON - phone number of the data submitter (A12)
      -- SD_ED - starting and ending dates for the sampling period
         (N12 or SAS date YYMMDD)
      -- STACOUNT - number of stations included in the data set (I5)
      -- DOC - field indicating whether documentation for the data
         set is present (A3)
      -- PURPOSE - field describing the purpose of the data (A40)
      -- QC_LEVEL - field expressing the submitter's subjective
         review of the overall quality of the data (A5)
      -- AUTHOR - if Doc flag is set, author name (A40)
      -- YEAR - if Doc flag is set, year of document (I4)
      -- TITLE - if Doc flag is set, first 80 characters of title
         (A80)
      -- JOURNAL - if Doc flag is set, journal name (A40)
      -- VOL_PAGE - if Doc flag is set, volume and page numbers (A20).
•  VARIABLE (Standard format for all data types, K = Key Variable)
   K  -- DS_ID - data set identification code (A10)
      -- VARIABLE - name of variable in data set (A12)
      -- QA_RV - flag indicating whether quality review was performed
         for the above variable (A1)
      -- UNITS - units of each variable (A15)
      -- VARCOM - comment field for each variable (A60)
      -- METHOD - method code (A12).
•  STATION (Standard format for all variable types, K = Key Variable)
   K  -- DS_ID - data set identification code (A10)
   K  -- STN_CD - code identifying the station at which sampling was
         performed (A7)
      -- F_STN_CD - flag providing information about the quality of
         the value for STN_CD (A1)
      -- LAT - latitude (degrees, minutes, and seconds to nearest
         tenth) at which the station is located (N7)
      -- F_LAT - flag providing information about the quality of the
         value for LAT (A1)
      -- LONG - longitude (degrees, minutes, and seconds to nearest
         tenth) at which the station is located (N8)
      -- F_LONG - flag providing information about the quality of the
         value for LONG (A1)
      -- SDEPTH - station depth (meters to nearest tenth) (N5)
      -- F_SDEPTH - flag providing information about the quality of
         the value for SDEPTH (A1).
•  SAMPLE (Additional water column variables are preceded by
   '*', K = Key Variable)
   K  -- STN_CD - code identifying the station at which sampling was
         performed (A7)
   K  -- SAMP_ID - sample identification code (A4)
      -- DATE - code indicating the date (year, month, day) on which
         the sample was taken (N6 or SAS date YYMMDD)
      -- F_DATE - flag providing information about the quality of the
         value for DATE (A1)
      -- TIME - code indicating the time (hours, minutes) at which
         the sample was taken (N4 or SAS format HHMM)
      -- F_TIME - flag providing information about the quality of the
         value for TIME (A1)
      -- TIDE_HT - tidal height (meters to nearest tenth) (N3)
      -- F_TIDE - flag providing information about the quality of the
         value for TIDE_HT (A1)
      -- WAVE_HT - wave height (meters to nearest tenth) (A1)
      -- F_WAVE - flag providing information about the quality of the
         value for WAVE_HT (A1)
      -- CURR_SP - current speed to nearest tenth (N3)
      -- F_CURR - flag providing information about the quality of the
         value for CURR_SP (A1)
      -- WIND_SP - wind speed to nearest tenth (N2)
      -- F_WIND - flag providing information about the quality of the
         value for WIND_SP (A1)
   *  -- DEPTH - depth at which sample was taken (meters to nearest
         hundredth) (N6)
   *  -- F_DEPTH - flag providing information about the quality of
         the value for DEPTH (A1).
•  SOURCE (Additional water column variable is preceded by
   '*', K = Key Variable)
   K  -- STN_CD - code identifying the station at which sampling was
         performed (A7)
   K  -- SAMP_ID - sample identification code (A4)
      -- DATE - date (year, month, day) on which sample was taken (N6
         or SAS format YYMMDD)
      -- TIME - time (hours, minutes) at which sample was taken (N4
         or SAS format HHMM)
   *  -- DEPTH - depth (meters) at which sample was taken (to nearest
         hundredth) (N5)
      -- VARIABLE - name of variable in data set (A12)
      -- F_VAR - flag providing information about the quality of
         VARIABLE (A1)
      -- ORIG_AMOUNT - value of the original variable as reported by
         the investigator (N8)
      -- STD_AMOUNT - value of the variable in National Estuary
         Program units (N8)
      -- F_AMOUNT - flag providing information about the quality of
         the value for AMOUNT (A1)
      -- ORIG_UNIT - unit of measurement used to express variable
         value as reported by the investigator (A3)
      -- STD_UNIT - National Estuary Program standard units (A3)
      -- F_UNIT - flag providing information about the quality of the
         value (A1).
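The field type and length notation used in these lists (for example A10, N7, I5) lends itself to a simple format check. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that the stated length is a maximum character count that includes any sign or decimal point, which the report does not specify.

```python
import re

# Field specification: type letter (A, N, or I) followed by a length.
SPEC = re.compile(r"^([ANI])(\d+)$")
INTEGER = re.compile(r"-?\d+")
NUMBER = re.compile(r"-?(\d+\.?\d*|\.\d+)")


def parse_spec(spec):
    """Split a specification such as 'A10' into ('A', 10)."""
    match = SPEC.match(spec)
    if match is None:
        raise ValueError("unrecognized field specification: " + spec)
    return match.group(1), int(match.group(2))


def fits(value, spec):
    """Return True if the value is compatible with the field type and length."""
    kind, length = parse_spec(spec)
    text = str(value)
    if len(text) > length:  # assumption: length counts every character
        return False
    if kind == "A":
        return True
    if kind == "I":
        return INTEGER.fullmatch(text) is not None
    return NUMBER.fullmatch(text) is not None  # kind == "N"
```

For instance, `fits("ST01", "A7")` accepts a station code, while `fits("123456", "I5")` rejects a station count that is too long for the STACOUNT field.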
Codes
Taxonomic, variable, and method codes as specified in the Ocean Data
Evaluation System (ODES) Data Submissions Manual are recommended for use with
National Estuary Program data. Key features of these codes are the use of
National Oceanographic Data Center (NODC) codes for species identifications as well
as mnemonic codes for chemical variables (e.g., the code for copper is
"copper").
ESTUARINE VARIABLES
The variables encountered most frequently in the historical estuarine
data sets submitted to the National Estuary Program are listed in Table 2.
These variables are the ones for which computerized quality review criteria
were developed. Other variables that have been measured in estuarine studies
are encountered less frequently than those in Table 2 and were not considered
for quality review.
The estuarine variables can be grouped into the following four
categories:
•  Station information - geographic location and depth of the
   station, time and location (i.e., depth) of sample collection,
   and characteristics of gross environmental variables (i.e.,
   tides, currents, wind) at the time of sampling.
•  Water column variables - physical and chemical characteristics
   of the water column.
•  Sediment variables - physical and chemical characteristics of
   bottom sediments.
•  Biological variables - abundances, tissue chemical
   concentrations, and other characteristics of aquatic organisms.
CRITICAL DATA REQUIREMENTS
The critical data requirements for estuarine variables are listed in
Table 3. They include sampling location, sampling time, analytical method,
and measurement units for all variables, as well as a range of additional
variable-specific requirements. Each critical data requirement should be
included on the computerized record for each respective value. Missing
critical data will be identified as such during the automated quality reviews
of historical estuarine data.
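A check for missing critical data might look like the following Python sketch. The universal requirements and the variable-specific entries shown are taken from Table 3; the field names, the dictionary layout, and the function name are assumptions made for illustration.

```python
# Universal requirements apply to every variable (Table 3, footnote a).
UNIVERSAL = ("location", "time", "method", "units")

# A few variable-specific requirements from Table 3 (not the full table).
ADDITIONAL = {
    "water temperature": (),
    "dissolved oxygen": ("time of day",),
    "total suspended solids": ("kind of filter", "filter pore size"),
    "total volatile solids": ("combustion temperature",),
}


def missing_requirements(variable, record):
    """List the critical data items that are absent or blank in a record."""
    required = UNIVERSAL + ADDITIONAL.get(variable, ())
    return [name for name in required if record.get(name) in (None, "")]


# Invented record: a suspended solids value with no filter pore size reported.
record = {
    "location": "ST01",
    "time": "1030",
    "method": "gravimetric",
    "units": "mg/L",
    "kind of filter": "glass fiber",
}
missing = missing_requirements("total suspended solids", record)
```

A non-empty result would lead to the value being flagged as lacking critical data rather than being removed from the database.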
TABLE 2. LIST OF ESTUARINE VARIABLES

Station           Water Column               Sediment                   Biological
Information

Latitude          Water temperature          Grain size:                Benthic invertebrates:
Longitude         pH                          -gravel                    -area of sampler
Station depth     Dissolved oxygen            -sand                      -sieve mesh size
Sampling date     Salinity                    -silt                      -species abundance
Sampling time     Turbidity                   -clay                     Megainvertebrates:
Sample depth      Transparency               Total solids                -species abundance
Tidal height      Total suspended solids     Total volatile solids       -tissue chemical contaminants(a)
Wave height       Specific conductivity      Total organic carbon        -tissue lipids
Current speed     Chloride                   Oil and grease             Demersal fishes:
Wind speed        Nitrogen                   Chemical contaminants(a)    -fishing duration
                  Phosphorus                                             -distance fished
                  Carbon                                                 -species abundance
                  Total alkalinity                                       -fish length
                  Silica                                                 -fish weight
                  Chemical contaminants(a)                               -tissue chemical contaminants(a)
                                                                         -tissue lipids
                                                                        Phytoplankton:
                                                                         -species abundance
                                                                         -chlorophyll a
                                                                        Bacteria:
                                                                         -total coliforms
                                                                         -fecal coliforms

a U.S. EPA priority pollutants and other chemicals.
TABLE 3. CRITICAL DATA REQUIREMENTS FOR ESTUARINE VARIABLES(a)

Variable                          Additional Critical Data Requirements(b)

WATER COLUMN
Water temperature                 None
pH                                None
Total alkalinity                  pH for manual titrimetric method
                                  (should = 4.5)
Dissolved oxygen                  Time of day
Salinity                          None
Specific conductivity             Water temperature (should = 25° C)
Turbidity                         None
Transparency                      Time of day
Total suspended solids            Kind of filter, filter pore size
Chloride                          None
Nitrogen (all kinds)
  Whole                           None
  Filtered                        Kind of filter, filter pore size
  Particulate                     Kind of filter, filter pore size
Phosphorus (all kinds)
  Whole                           None
  Filtered                        Kind of filter, filter pore size
  Particulate                     Kind of filter, filter pore size
Carbon (all kinds)
  Whole                           None
  Filtered                        Kind of filter, filter pore size
  Particulate                     Kind of filter, filter pore size
Total silica
  Filtered                        Kind of filter, filter pore size
Chemical contaminants             Detection limits, holding times

SEDIMENT
Grain size (all fractions)        Presence/absence of oxidation step
Total solids                      None
Total volatile solids             Combustion temperature
Total organic carbon              None
Oil and grease                    None
Chemical contaminants             Detection limits, holding times

BIOLOGICAL
Benthic invertebrates
  Species abundance               Kind of sampler, area of sampler,
                                  sieve mesh size
Megainvertebrates
  Species abundance               Kind of sampler, mesh size (if
                                  applicable), area or time fished (if
                                  applicable)
  Tissue levels of chemical       Detection limits, holding times
  contaminants
Demersal fishes
  Species abundance               Kind of sampler, mesh size (if
                                  applicable), area or time fished (if
                                  applicable)
  Tissue levels of chemical       Detection limits, holding times
  contaminants
Phytoplankton
  Species abundances              Kind of sampler, enumeration method
  Chlorophyll a                   None
Bacteria
  Total or fecal coliform         None
  abundance

a Universal requirements for all variables are location, time of measurement,
  analytical method, and measurement units.
b Other than the universal requirements.
In addition to the critical data requirements, various other kinds of
information are desirable for interpreting and evaluating most kinds of
estuarine data. These additional kinds of information are discussed below
(see Technical Evaluations).
Approximately 60 percent of the estuarine variables have some kind of
critical data requirement (Table 3) other than sampling location, sampling
time, analytical method, and measurement units. The most common kind of
variable-specific requirement is related to the collection and partitioning
of samples prior to laboratory analysis (e.g., kind of filter, filter pore
size, kind of biological sampling equipment, mesh sizes of biological
samplers). A second common requirement is related to the conditions under
which laboratory measurements were made (e.g., titration endpoints, water
temperature, incubation temperature, combustion temperature, presence/absence
of oxidation step). Because all of the above factors can bias analytical
results, they must be known so that data can be interpreted accurately.
RANGE LIMITS
The range limits for estuarine variables are presented in Tables 4-6.
Two kinds of range limits are used to identify unusual and unlikely values.
Unusual values are ones that are extreme but are sometimes encountered.
Unlikely values are also extreme, but are almost never encountered or are
not possible. Values exceeding the specified range limits will be identified
as such during the automated quality reviews of historical estuarine data.
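The two-tier scheme can be expressed as a small classification routine. The Python sketch below uses the pH limits from Table 4 (lower A = 6, lower B = 5, upper A = 9, upper B = 11). The function name and the returned labels are assumptions, as is the treatment of a value exactly equal to a limit as still within that limit.

```python
def classify(value, lower_a, lower_b, upper_a, upper_b):
    """Classify a value against unusual (A) and unlikely (B) range limits."""
    if value < lower_b or value > upper_b:
        return "unlikely"
    if value < lower_a or value > upper_a:
        return "unusual"
    return "within range"


# pH limits as listed in Table 4.
PH_LIMITS = {"lower_a": 6, "lower_b": 5, "upper_a": 9, "upper_b": 11}

labels = [classify(v, **PH_LIMITS) for v in (7.5, 9.5, 5.5, 4.2)]
```

Checking the B limits first ensures that a value outside both tiers receives the more severe label.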
The range limits presented in Tables 4-6 were developed from a national
perspective. That is, they correspond to the ranges of values encountered
over all estuaries of the National Estuary Program. The ranges commonly
found in individual estuaries may be narrower than those presented here.
For many kinds of chemical variables (e.g., nutrients, chemical contami-
nants), ranges were specified for individual chemicals or groups of
chemicals. This is appropriate because most of these chemicals could
possibly occur in all of the estuaries within the National Estuary Program.
By contrast with chemical variables, the primary biological variable (i.e.,
33
-------
TABLE 4. RANGE LIMITS FOR ESTUARINE VARIABLES

                                                    Range Limits(a)
                                                Lower            Upper
Variable             Units                    A       B        A       B

STATION INFORMATION
Latitude             Degrees                 34      34       49      49
                     Minutes                  0       0       59      59
                     Seconds                  0       0       59      59
Longitude            Degrees                 70      70      125     125
                     Minutes                  0       0       59      59
                     Seconds                  0       0       59      59
Station depth        m                        0       0      200     245
Sampling date        Month                    1       1       12      12
                     Day                      1       1       31      31
                     Year                  1900    1940     1987    1987
Sampling time        h                     0000    0000     2400    2400
Sample depth         m                        0       0      200     245
Tidal height         m                     -1.2    -1.5      4.0     4.5
Wave height          m                        0       0      2.0     3.0
Current speed        m/sec                    0       0      4.0     5.0
Wind speed           m/sec                    0       0     13.0    18.0

WATER COLUMN
Water temperature    °C                       0       0       30      35
pH                   Standard units           6       5        9      11
Dissolved oxygen     mg/L                     0       0       14      17
Salinity             ppt                      0       0       32      35
TABLE 4. (Continued)
Range Limits
Lower Upper
Variable
Turbidity
Transparency (Secchi depth)
Total suspended solids
Specific conductivity
Chloride
Total dissolved nitrogen
-filtered
Total Kjeldahl nitrogen
-filtered
-whole
Particulate organic
nitrogen
Nitrite
-filtered
-whole
Nitrate
-filtered
-whole
Nitrite and nitrate
-filtered
-whole
Ammonia
-filtered
-whole
Units
NTU
m
mg/L
umhos/cm
mg/L
mg/L
mg/L
mg/L
mg/L
mg/L
mg/L
mg/L
mg/L
mg/L
mg/L
mg/L
mg/L
as
as
as
as
as
as
as
as
as
as
as
as
N
N
N
N
N
N
N
N
N
N
N
N
A
0
0
0
-1
0
0
0
0
0
0
0
0
0
0
0
0
0
.5
.1
.5
.02
.1
.1
.00005
.0004
.0004
.001
.001
.001
.001
.001
.001
B
0
0.
0.
-1
0
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
1
5
60
19
02
1
1
A
300 1
10.0
250
,000 100
,000 25
2
2.1
2.1
00005 3
0004 0.2
0004 0.2
001
001
001
001
001
001
2
2
2
2
1
1
B
,000
10.
500
,000
,000
4
3
4
6
0.
0.
4
4
4
4
2
2
0
4
4
Total inorganic nitrogen
mg/L as N
0.001 0.001
35
-------
TABLE 4. (Continued)
Variable
Total phosphorus
-filtered
-particulate
-whole
Orthophosphate
-filtered
Inorganic phosphorus
-whole
Organic phosphorus
-filtered
Organic carbon
-filtered
-whole
Total carbon
Total alkalinity
Total silica
filtered
Chemical contaminants
SEDIMENT
Grain size
-gravel
-sand
-silt
-clay
Total solids
Total volatile solids
Total organic carbon
Oil and grease
Units
mg/L as P
mg/L as P
mg/L as P
mg/L as P
mg/L as P
mg/L as P
mg/L
mg/L
mg/L
mg/L as CaCO3
mg/L as Si
mg/L
% dry weight
% dry weight
% dry weight
% dry weight
% wet weight
% dry weight
% dry weight
mg/kg dry
weight
Lower
A
0.003
0.001
0.005
0.001
0.001
0.001
0.4
0.5
0.4
1
0.01
(see
0
1
1
1
5
0.1
0.1
5
Range
B
0.003
0.001
0.002
0.001
0.001
0.001
0.4
0.5
0.4
1
0.01
Table
0
0
0
0
0
0
0
0 2,
Limits
Upper
A
0.
0.
1
0.
0.
0.
10
20
30
125
3
5)
98
98
98
98
90
50
20
000
B
5 1
3 0.6
2
2 0.4
2 0.4
2 0.4
20
40
60
250
6
100
100
100
100
100
100
75
20,000
Chemical contaminants
mg/kg dry weight (see Table 5)
36
-------
TABLE 4. (Continued)
Variable
Units
Range Limits
Lower Upper
B
BIOLOGICAL
Benthic Invertebrates
Area of sampler
Sieve mesh size
Species abundance
Megainvertebrates
Species abundance
Tissue chemical
contaminants
Tissue total extractable
lipids
Demersal Fishes
Net widthb
Net mesh size**
Fishing duration**
Distance fished**
Species abundance**
Individual length
Individual weight
Tissue chemical
contaminants
Tissue total extractable
lipids
Phytoplankton
Species abundance
Chlorophyll a (corrected)
mm
#/m2
0.05 0.01 0.1 0.25
0.5 0.5 1.0 1.0
0 0 10,000 20,000
0 10 100
;see Table 6)
0.1 20 100
#/m2 0
mg/kg wet weight
% wet weight 0
m 3
mm 6
min 5
m 50
#/haul 0
mm (TL or SL) 2
g wet weight 1
mg/kg wet weight (see Table 6)
% wet weight 0.1 0 20 100
1
0
0
10
0
0
0
9
50
30
2,000
100
600
5,000
15
100
60
5,000
500
1,000
10,000
#/mL
ug/L
0 0 5,000 10,000
0.01 0.01 200 400
Bacteria
  Total coliforms
    -water           MPN/100 mL      0      0     10,000    100,000
    -tissue          MPN/100 g       0      0      1,000     10,000
  Fecal coliforms
    -water           MPN/100 mL      0      0     10,000    100,000
    -tissue          MPN/100 g       0      0      1,000     10,000
a A = Range limit for unusual values.
B = Range limit for unlikely values.
b For collections made with otter trawls.
37
-------
TABLE 5. UPPER RANGE LIMITS FOR CHEMICAL CONTAMINANTS
IN THE WATER COLUMN AND BOTTOM SEDIMENTS
Variable3
Water Column** Sediment^
(mg/L) (mg/kg dry weight)
A
B
B
Phenols
*phenol
2-methylphenol
4-methylphenol
*2,4-dimethylphenol
*2-chlorophenol
*2,4-dichlorophenol
*4-chloro-3-methylphenol
*2,4,6-trichlorophenol
2,4,5-trichlorophenol
*pentachlorophenol
*2-nitrophenol
*4-nitrophenol
*2,4-dinitrophenol
*4,6-dinitro-o-cresol
Low Molecular Weight
Aromatic Hydrocarbons
*naphthalene
*acenaphthylene
*acenaphthene
*fluorene
*phenanthrene
*anthracene
ACID-EXTRACTABLE COMPOUNDS
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.5
10
10
10
10
10
10
10
10
10
10
10
10
10
10
100
100
100
100
100
100
100
100
100
100
100
100
100
100
BASE-NEUTRAL EXTRACTABLE COMPOUNDS
0.05
0.05
0.05
0.05
0.05
0.05
0.5
0.5
0.5
0.5
0.5
0.5
10
10
10
10
10
10
100
100
100
100
100
100
High Molecular Weight
Aromatic Hydrocarbons
*fluoranthene
*pyrene
*benzo(a)anthracene
*chrysene
*benzo(b)fluoranthene
*benzo(k)fluoranthene
*benzo(a)pyrene
*indeno(1,2,3-c,d)pyrene
*dibenzo(a,h)anthracene
*benzo(g,h,i)perylene
.05
.05
0.05
0.05
.05
.05
.05
.05
0.05
0.05
0.
0.
0.
0.
0.
0.
0.5
0.5
0.5
0.5
0.5
0
0
0
0
0.5
10
10
10
10
10
10
10
10
10
10
100
100
100
100
100
100
100
100
100
100
38
-------
TABLE 5. (Continued)
Variable3
Water Column**
(mg/L)
A B
Sediment**
(mg/kg dry weight)
A B
Chlorinated Aromatic
Hydrocarbons
*1,3-dichlorobenzene
*1,4-dichlorobenzene
*1,2-dichlorobenzene
*1,2,4-trichlorobenzene
*2-chloronaphthalene
*hexachlorobenzene (HCB)
0.05
0.05
0.05
0.05
0.05
0.05
0.
0.
0.
0.5
0.5
0.5
10
0.5
3
0.5
0.1
1
100
5
30
5
1
10
Chlorinated Aliphatic
Hydrocarbons
*hexachloroethane                 0.05    0.5     1      10
*hexachlorobutadiene              0.05    0.5     1      10
*hexachlorocyclopentadiene        0.05    0.5     0.1    1
Halogenated Ethers
*bis(2-chloroethyl) ether         0.05    0.5     0.1    1
*bis(2-chloroisopropyl) ether     0.05    0.5     0.5    5
*bis(2-chloroethoxy)methane       0.05    0.5     0.1    1
*4-chlorophenyl phenyl ether      0.05    0.5     0.1    1
*4-bromophenyl phenyl ether       0.05    0.5     0.1    1
Phthalates
*dimethyl phthalate 0.05 0.5
*diethyl phthalate 0.05 0.5
*di-n-butyl phthalate 0.05 0.5
*benzyl butyl phthalate 0.05 0.5
*bis(2-ethylhexyl)phthalate 0.05 0.5
*di-n-octyl phthalate 0.05 0.5
0.
0.
2
1
2
5
5
5
50
20
50
100
Miscellaneous Oxygenated
Compounds
*isophorone
benzyl alcohol
benzoic acid
*2,3,7,8-tetrachlorodi-
benzo-p-dioxin
dibenzofuran
0.05
0.05
0.05
0.001
0.05
0.5
0.5
0.5
0.01
0.5
1
0.5
1
0.5
2
10
5
10
5
20
39
-------
TABLE 5. (Continued)
Water Column^
(mg/L)
Sediment'*
(mg/kg dry weight)
Variable3
Organonitrogen Compounds
aniline
*nitrobenzene
*N-nitroso-di-n-propylamine
4-chloroaniline
2-nitroaniline
3-nitroaniline
4-nitroaniline
*2,6-dinitrotoluene
*2,4-dinitrotoluene
*N-nitrosodiphenylamine
*N-nitrosodimethylamine
*1,2-diphenylhydrazine
*benzidine (4,4'-diamino-
biphenyl)
*3,3'-dichlorobenzidine
A
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
B
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
A
1
0.1
0.1
0.5
0.5
0.5
0.5
0.1
0.1
1
?
6.1
0.1
0.1
B
10
1
1
5
5
5
5
1
1
10
7
i
i
i
PESTICIDES AND PCBs
Pesticides
*p,p'-DDE
*p,p'-DDD
*p,p'-DDT
*aldrin
*dieldrin
*chlordane
*alpha-endosulfan
*beta-endosulfan
*endosulfan sulfate
*endrin
*endrin aldehyde
*heptachlor
*heptachlor epoxide
*alpha-HCH
*beta-HCH
*delta-HCH
*gamma-HCH (lindane)
*toxaphene
0001
0001
0001
0001
0.0001
0.0001
0.0001
0.0001
0.0001
0.0001
0001
0001
0001
0.005
0.001
0.001
0.001
0.0001
0.
0.
0.
0.001
0.001
0.001
0.001
0.001
0.001
0.001
0.001
0.001
0.001
0.001
0.001
0.001
0.05
0.01
0.01
0.01
0.001
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
40
-------
TABLE 5. (Continued)

                      Water Column(b)            Sediment(b)
                      (mg/L)                     (mg/kg dry weight)
Variable(a)           A            B             A        B

PCBs
*Aroclor 1016         0.000005     0.00005       1        10
*Aroclor 1221         0.000005     0.00005       1        10
*Aroclor 1232         0.000005     0.00005       1        10
*Aroclor 1242         0.000005     0.00005       1        10
*Aroclor 1248         0.000005     0.00005       1        10
*Aroclor 1254         0.000005     0.00005       5        50
*Aroclor 1260         0.000005     0.00005       4        40
Total PCBs            0.000005     0.00005       10       100
VOLATILE ORGANIC COMPOUNDS
Volatile Halogenated Alkanes
dichlorodifluoromethane
*chloromethane
*bromomethane
*chloroethane
*methylene chloride
(dichloromethane)
fluorotrichloromethane
*1,1'-dichloroethane
*chloroform
*1,2-dichloroethane
*1,1,1-trichloroethane
*carbon tetrachloride
*bromodichloromethane
*1,2-dichloropropane
*chlorodibromomethane
*1,1,2-trichloroethane
*bromoform
*1,1,2,2-tetrachloroethane
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.05
0.5
0.5
0.5
0.5
0.
0.
0.
0.
0.
0.
0,
0.
0.
0.5
0.5
0.5
0.5
0.1
0.1
0.1
0.1
10
0.1
0.1
1.0
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
1,000
1
1
10
1
1
1
1
1
1
1
1
1
Volatile Halogenated Alkenes
*vinyl chloride
*1,1'-dichloroethene
*trans-1,2-dichloroethene
cis and trans-1,3-dichloropropene
*trichloroethene
*tetrachloroethene
0.05
0.05
0.05
0.05
0.05
0.05
0.5
0.5
0.5
0.5
0.5
0.5
0.1
0.1
0.1
0.1
0.1
1
1
1
1
1
1
10
41
-------
TABLE 5. (Continued)
Variable3
Water Column**
(mg/L)
A B
Sediment^
(mg/kg dry weight)
A B
Volatile Aromatic Hydrocarbons
*benzene
*toluene
*ethylbenzene
styrene (ethenylbenzene)
total xylenes
*chlorobenzene
0,
0.
0.
05
05
05
0.05
0.05
0.05
0.
0.
0.
0.
0.5
0.5
0.1
0.1
0.5
0.5
1
0.1
1
1
5
5
10
1
Volatile Unsaturated Carbonyl
Compounds
*acrolein
*acrylonitrile
0.05
0.05
0.5
0.5
0.1
0.1
Volatile Ethers
bis(chloromethyl)ether
*2-chloroethylvinyl ether
0.05
0.05
0.5
0.5
0.1
0.1
Volatile Ketones
acetone
2-butanone
2-hexanone
4-methyl-2-pentanone
0.05
0.05
0.05
0.05
0.5
0.5
0.5
0.5
0.1
0.1
0.1
0.1
Miscellaneous Volatile
Compounds
carbon disulfide
vinyl acetate
aluminum
*antimony
*arsenic
*beryllium
*cadmium
*chromium
0.05
0.05
0.05
0.05
0.05
0.1
0.1
0.5
0.5
METALS
100
0.5
0.5
0.5
1
1
0.
0.
,000
20
200
1
50
500
1 1
1 1
500,000
5,000
100,000
100
1,000
50,000
42
-------
TABLE 5. (Continued)
Variable3
Water Columnb
(mg/L)
A B
Sediment**
(mg/kg dry weight)
A B
*copper
*lead
*mercury
*nickel
*selenium
*silver
*thallium
*zinc
*cyanide
iron
0.1
0.6
0.001
0.6
0.05
0.1
0.05
0.1
10
1
2
0.01
2
0.5
1
0.5
1
100
500
1,000
1
100
5
5
1
1,000
0.
100,000
500,000
100,000
500
5,000
500
500
100
100,000
5 100
500,000
a Each U.S. EPA priority pollutant is preceded by an asterisk.
b A = Range limit for unusual values.
B = Range limit for unlikely values.
43
-------
TABLE 6. UPPER RANGE LIMITS FOR CHEMICAL CONTAMINANTS
IN MUSCLE AND LIVER TISSUE
Variable3
Muscle Tissue**
(mg/kg wet weight)
A B
Liver Tissueb
(mg/kg wet weight)
A B
Phenols
*phenol
2-methylphenol
4-methylphenol
*2,4-dimethylphenol
*2-chlorophenol
*2,4-dichlorophenol
*4-chloro-3-methylphenol
*2,4,6-trichlorophenol
2,4,5-trichlorophenol
*pentachlorophenol
*2-nitrophenol
*4-nitrophenol
*2,4-dinitrophenol
*4,6-dinitro-o-cresol
Low Molecular Weight
Aromatic Hydrocarbons
*naphthalene
*acenaphthylene
*acenaphthene
*fluorene
*phenanthrene
*anthracene
ACID-EXTRACTABLE COMPOUNDS
0.005
0.005
0.005
0.005
0.005
0.005
0.005
0.005
0.005
0.01
0.005
0.005
0.005
0.005
0.
0.
0.05
0.05
0.05
0.05
.05
.05
0.05
0.05
0.05
0.1
0.05
0.05
0.05
0.05
0.
0.
0.
0.
0.
0.
0.
0.
0.1
0.
0.
0.
0.
0.1
BASE-NEUTRAL EXTRACTABLE COMPOUNDS
0.1
0.01
0.01
0.01
0.1
0.01
2
1
1
1
1
1
0.
0.
0.
0.
0.
0.2
2
2
2
2
2
2
High Molecular Weight
Aromatic Hydrocarbons
*fluoranthene
*pyrene
*benzo(a)anthracene
*chrysene
*benzo(b)fluoranthene
*benzo(k)fluoranthene
*benzo(a)pyrene
*indeno(1,2,3-c,d)pyrene
0.
0.
0.
0.
0.
0.
0.1
0.1
0.
0.
0.
0.
0.
0.
0.
0.1
44
-------
TABLE 6. (Continued)

                               Muscle Tissue(b)        Liver Tissue(b)
                               (mg/kg wet weight)      (mg/kg wet weight)
Variable(a)                    A        B              A        B

*dibenzo(a,h)anthracene        0.1      1              0.1      1
*benzo(g,h,i)perylene          0.1      1              0.1      1
Chlorinated Aromatic
Hydrocarbons
*1,3-dichlorobenzene 0.02 0.2 0.1 1
*1,4-dichlorobenzene 0.02 0.2 0.1 1
*1,2-dichlorobenzene 0.02 0.2 0.1 1
*1,2,4-trichlorobenzene 0.02 0.2 0.1 1
*2-chloronaphthalene 0.02 0.2 0.1 1
*hexachlorobenzene (HCB) 0.02 0.5 1.0 10
Chlorinated Aliphatic
Hydrocarbons
*hexachloroethane 0.05 0.5 0.2 2
*hexachlorobutadiene 0.1 1 1 10
*hexachlorocyclopentadiene 0.05 0.5 0.2 2
Halogenated Ethers
*bis(2-chloroethyl) ether 0.01 0.1 0.5 5
*bis(2-chloroisopropyl) ether 0.01 0.1 0.5 5
*bis(2-chloroethoxy)methane 0.01 0.1 0.5 5
*4-chlorophenyl phenyl ether 0.01 0.1 0.5 5
*4-bromophenyl phenyl ether 0.01 0.1 0.5 5
Phthalates
*dimethyl phthalate 0.01 0.1 0.05 0.5
*diethyl phthalate 0.01 0.1 0.05 0.5
*di-n-butyl phthalate 0.01 0.1 0.5 5
*benzyl butyl phthalate 0.01 0.1 0.05 0.5
*bis(2-ethylhexyl)phthalate 1 10 0.05 0.5
*di-n-octyl phthalate 1 10 0.05 0.5
45
-------
TABLE 6. (Continued)
Variable3
Muscle Tissue**
(mg/kg wet weight)
A B
Liver Tissue'*
(mg/kg wet weight)
A B
Miscellaneous Oxygenated
Compounds
*isophorone
benzyl alcohol
benzoic acid
*2,3,7,8-tetrachlorodi-
benzo-p-dioxin
dibenzofuran
Organonitrogen Compounds
aniline
*nitrobenzene
*N-nitroso-di-n-propylamine
4-chloroaniline
2-nitroaniline
3-nitroaniline
4-nitroaniline
*2,6-dinitrotoluene
*2,4-dinitrotoluene
*N-nitrosodiphenylamine
*N-nitrosodimethylamine
*1,2-diphenylhydrazine
*benzidine (4,4'-diamino-
biphenyl)
*3,3'-dichlorobenzidine
0.01
0.01
0.01
0.001
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.1
0.1
0.1
0.01
0.1
0.1
0.1
01
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.1
0.05
0.05
0.05
0.005
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.05
0.5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
PESTICIDES AND PCBs
Pesticides
*p,p'-DDE
*p,p'-DDDO
*p,p'-DDTO
*aldrin
*dieldrin
*chlordane
*alpha-endosulfan
*beta-endosulfan
*endosulfan sulfate
*endrin
*endrin aldehyde
*heptachlor
*heptachlor epoxide
.5
.5
0.005
0.01
0.1
0.01
0.01
0.01
0.01
0.01
0.01
0.01
20
10
10
0.05
0.1
1
0.1
0.1
0.
0.
0.
0.
50
1
50
0.
0.
0.
0.
1
1
1
01
0.1
0.01
0.01
0.01
0.01
0.01
0.01
500
10
500
1
1
1
0.
0.
0.
0.
0.
0.
0.1
46
-------
TABLE 6. (Continued)
                                    Muscle Tissue(b)      Liver Tissue(b)
                                    (mg/kg wet weight)    (mg/kg wet weight)
Variable(a)                         A         B           A         B
*alpha-HCH                          0.01      0.1         0.01      0.1
*beta-HCH                           0.01      0.1         0.01      0.1
*delta-HCH                          0.01      0.1         0.01      0.1
*gamma-HCH (lindane)                0.01      0.1         0.01      0.1
*toxaphene                          0.01      0.1         0.01      0.1
PCBs
*Aroclor 1016                       0.        5           5         50
*Aroclor 1221                       0.        5           5         50
*Aroclor 1232                       0.        5           5         50
*Aroclor 1242                       0.        5           5         50
*Aroclor 1248                       0.        5           5         50
*Aroclor 1254                       1         10          10        100
*Aroclor 1260                       1         10          10        100
Total PCBs                          2         20          20        200
VOLATILE ORGANIC COMPOUNDS
Volatile Halogenated Alkanes
dichlorodifluoromethane
*chloromethane
*bromomethane
*chloroethane
*methylene chloride
(dichloromethane)
fluorotrichloromethane
*1,1'-dichloroethane
*chloroform
*1,2-dichloroethane
*1,1,1-trichloroethane
*carbon tetrachloride
*bromodichloromethane
*1,2-dichloropropane
*chlorodibromomethane
*1,1,2-trichloroethane
*bromoform
*1,1,2,2-tetrachloroethane
0.005
0.005
0.005
0.005
0.
0.
0.005
0.005
.005
.005
0.005
0.005
0.005
0.005
0.005
0.005
0.005
0.005
0.005
0.05
0.05
0.05
0.05
0.05
0.05
.05
.05
.05
.05
0.05
0.05
.05
.05
0.05
0.05
0.05
0.
0.
0.
0.
0.
0.
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0
0
0
0
0
0
0
0
0.5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
Volatile Halogenated Alkenes
*vinyl chloride 0.005 0.05 0.5 5
*1,1'-dichloroethene 0.005 0.05 0.5 5
*trans-1,2-dichloroethene 0.005 0.05 0.5 5
47
-------
TABLE 6. (Continued)
                                    Muscle Tissue(b)      Liver Tissue(b)
                                    (mg/kg wet weight)    (mg/kg wet weight)
Variable(a)                         A         B           A         B
*cis and trans-1,3-dichloropropene  0.005     0.05        0.5       5
*trichloroethene                    0.005     0.05        0.5       5
*tetrachloroethene                  0.01      0.1         0.5       5
Volatile Aromatic Hydrocarbons
*benzene                            0.01      0.1         0.5       5
*toluene                            0.01      0.1         0.5       5
*ethylbenzene                       0.01      0.1         0.5       5
styrene (ethenylbenzene)            0.01      0.1         0.5       5
total xylenes                       0.01      0.1         0.5       5
*chlorobenzene                      0.01      0.1         0.5       5
Volatile Unsaturated Carbonyl Compounds
*acrolein                           0.05      0.5         0.5       5
*acrylonitrile                      0.05      0.5         0.5       5
Volatile Ethers
bis(chloromethyl)ether              0.005     0.05        0.5       5
*2-chloroethylvinyl ether           0.005     0.05        0.5       5
Volatile Ketones
acetone                             0.005     0.05        0.5       5
2-butanone                          0.005     0.05        0.5       5
2-hexanone                          0.005     0.05        0.5       5
4-methyl-2-pentanone                0.005     0.05        0.5       5
Miscellaneous Volatile Compounds
carbon disulfide                    0.005     0.05        0.5       5
vinyl acetate                       0.005     0.05        0.5       5
48
-------
TABLE 6. (Continued)
                                    Muscle Tissue(b)      Liver Tissue(b)
                                    (mg/kg wet weight)    (mg/kg wet weight)
Variable(a)                         A         B           A         B
METALS
aluminum
*antimony
*arsenic
*beryllium
*cadmium
*chromium
*copper
*lead
*mercury
*nickel
*selenium
*silver
*thallium
*zinc
1
5
1
5
5
20
5
0.5
1
1
1
1
50
10
50
10
50
50
200
50
5
10
10
10
10
500
5
10
5
5
5
50
5
20
5
5
5
5
100
50
100
50
50
50
500
50
20
50
50
50
50
1,000
a Each U.S. EPA priority pollutant is preceded by an asterisk.
b A = Range limit for unusual values.
B = Range limit for unlikely values.
49
-------
species abundance) was considered at a general level for all groups except
bacteria. Species-specific range limits could not be developed from a
national perspective because species composition differs among estuaries.
Differences in species composition are most dramatic between east and west
coast estuaries.
NATIONAL QUALITY REVIEW
Automated quality review checks will be made of all historical data
sets included in the database of the National Estuary Program. Checks will
be made for proper formats and codes, critical data requirements, and range
limits. These checks will be made from a national perspective, using the
specifications presented in this document. Each data value identified by
the automated checks will have a qualifier permanently attached to it but
will otherwise remain intact in the database.
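The range-limit portion of this automated check can be sketched as follows. This is a minimal illustration, not the Program's actual software: the two muscle-tissue limits are taken from Table 6, but the qualifier codes ("A" for unusual, "B" for unlikely), the record layout, the function name, and the rule that a value is qualified only when it is strictly greater than a limit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Upper range limits from Table 6 (muscle tissue, mg/kg wet weight).
# First entry = limit A (unusual values), second = limit B (unlikely values).
MUSCLE_LIMITS: Dict[str, Tuple[float, float]] = {
    "benzene": (0.01, 0.1),
    "hexachlorobutadiene": (0.1, 1.0),
}


@dataclass
class QualifiedValue:
    variable: str
    value: float                      # original value, never altered
    qualifier: Optional[str] = None   # hypothetical codes: "A", "B", or None


def qualify(variable: str, value: float,
            limits: Dict[str, Tuple[float, float]] = MUSCLE_LIMITS) -> QualifiedValue:
    """Attach a qualifier when a value exceeds a range limit; keep the value intact."""
    limit_a, limit_b = limits[variable]
    if value > limit_b:        # assumption: strictly greater than the limit
        qualifier = "B"        # unlikely value
    elif value > limit_a:
        qualifier = "A"        # unusual value
    else:
        qualifier = None       # within the expected range
    return QualifiedValue(variable, value, qualifier)
```

The value itself is stored unchanged alongside the qualifier, which mirrors the requirement that qualified data remain intact so that the regional offices can decide how to treat them.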
After being subjected to the automated quality review, data sets will be
made available to the regional offices for use in characterizing their
respective estuaries. The regional offices will decide how to treat
qualified data values and will have the option of conducting a more rigorous
evaluation of the data.
REGIONAL QUALITY REVIEW
After estuarine data sets have been reviewed and qualified at the
national level, the regional offices may conduct additional evaluations
before the data are used. This section presents general guidance for
conducting these additional evaluations using both automated checks and
technical review.
Automated QA/QC Checks
Because all data sets of the National Estuary Program should have a
standard format and coding system when they are made available to the
regional offices, use of automated checks to conduct additional evaluations
will be facilitated. The most effective method of conducting these
50
-------
evaluations might be to modify the quality review dictionary that has been
developed for review at the national level.
By "fine tuning" the existing quality review dictionary to represent
the characteristics of individual estuaries, the regional offices can greatly
enhance the effectiveness of the automated quality review checks. Examples
of modifications include the following:
•  Addition of new variables to the list of estuarine variables.
•  Specification of additional critical data requirements for
   individual variables.
•  Adjustment of the range limits for each variable to represent
   more precisely the conditions encountered in individual
   estuaries.
Because the list of estuarine variables was limited to those variables
commonly measured in most estuaries, variables measured primarily in a
single estuary are not included. However, these estuary-specific variables
may be important for characterizing conditions in a particular estuary. For
example, hepatic lesions in demersal fish have been used routinely as
indicators of biological effects in Puget Sound. Their use in other
estuaries is much rarer. Thus, liver pathology should probably be added to
the list of standard estuarine variables when evaluating historical
information for Puget Sound.
By narrowing the range limits for each variable, the precision of
quality review checks would be enhanced. For example, the upper range limit
for depth from a national perspective is 200 m, because depths in Puget
Sound sometimes exceed that value. However, the maximum depth in Chesapeake
Bay is less than 70 m. Thus, although depths of 70-200 m cannot occur in
Chesapeake Bay, they would not be flagged as erroneous during the initial
quality review.
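The depth example can be illustrated with a short sketch of how a regional office might derive an estuary-specific dictionary from the national one. The 200 m and 70 m upper limits come from the text above; the dictionary layout, the variable key "depth_m", the lower limit of zero, and the function names are assumptions made for the example.

```python
# National quality review dictionary: variable -> (lower, upper) range limit.
# Only the 200 m upper depth limit is from the text; the rest is assumed.
NATIONAL_DICTIONARY = {"depth_m": (0.0, 200.0)}


def regional_dictionary(national, overrides):
    """Return a copy of the national dictionary with estuary-specific limits applied."""
    regional = dict(national)      # leave the national dictionary unchanged
    regional.update(overrides)
    return regional


def out_of_range(dictionary, variable, value):
    """True when the value falls outside the range limits for the variable."""
    lower, upper = dictionary[variable]
    return not (lower <= value <= upper)


# Chesapeake Bay: maximum depth is less than 70 m, so narrow the upper limit.
CHESAPEAKE = regional_dictionary(NATIONAL_DICTIONARY, {"depth_m": (0.0, 70.0)})
```

With these definitions, a reported depth of 120 m passes the national check but is flagged by the Chesapeake Bay dictionary.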
51
-------
The greatest benefit from developing estuary-specific quality review
dictionaries might be the ability to set species-specific criteria for all
groups of organisms (e.g., phytoplankton, benthic invertebrates, megainverte-
brates, fishes). As noted earlier, these kinds of criteria generally cannot
be developed from a national perspective. A species list for an estuarine
study could be examined to detect species known not to occur in that
estuary. In addition, different range limits could be set for species that
are always rare and species that are sometimes or always abundant in a
particular estuary.
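A species-specific check of this kind might look like the sketch below. The species codes and abundance limits are placeholders rather than real taxa or recommended values, and the rule that any species missing from the estuary dictionary is treated as "not known to occur" is an assumption made for the example.

```python
# Hypothetical estuary-specific entries: species code -> upper abundance limit.
# Species absent from this dictionary are treated as not known to occur
# in the estuary (an assumption of this sketch).
ESTUARY_SPECIES_LIMITS = {
    "SPECIES_RARE": 5,          # placeholder for a species that is always rare
    "SPECIES_ABUNDANT": 5000,   # placeholder for a species that can be abundant
}


def review_species_records(records, limits=ESTUARY_SPECIES_LIMITS):
    """Return (species, count, reason) for each record that should be qualified."""
    flagged = []
    for species, count in records:
        if species not in limits:
            flagged.append((species, count, "not known to occur in estuary"))
        elif count > limits[species]:
            flagged.append((species, count, "abundance exceeds range limit"))
    return flagged
```

The same count can therefore be acceptable for an abundant species and qualified for a rare one, which is the distinction a single national limit cannot make.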
Technical Evaluations
In addition to conducting automated quality review checks, the regional
offices may elect to have technical experts examine historical estuarine
data sets. In some cases, these technical evaluations may require examina-
tion of original documents (e.g., reports, laboratory notebooks, data
sheets). For many historical data sets, the amount of information available
for a detailed technical review will be limited. A general discussion of
the kinds of information that may be required for a technical evaluation is
presented below.
Field Collection
Because field collection techniques can substantially influence the
results obtained in subsequent data analyses, it is recommended that those
techniques be evaluated as closely as possible. The evaluation should
attempt to verify the following items (if applicable) for each data set:
•  Navigation was sufficiently accurate to ensure that the
   sample was collected at the appropriate location.
•  Collection containers and devices were cleaned properly
   before sample collection.
•  Collection devices were operated properly.
52
-------
•  Samples were collected in a representative manner.
•  Samples were preserved, stored, and transported properly, so
   that sample integrity was maintained.
The information needed to verify the above items generally can be found in
final reports, cruise reports, field logbooks, and chain-of-custody docu-
ments.
Biological Laboratory Analyses
The primary biological measurement for most groups of organisms is
number of individuals. Additional measurements often include biomass and
size of organisms. A key concern for all of these measurements is accurate
identification of organisms. Technical evaluation of biological laboratory
analyses might focus on the following considerations:
•  Benthic sorting efficiency.
•  Subsampling representativeness.
•  Taxonomic accuracy.
•  Taxonomic representativeness.
•  Interlaboratory comparisons.
Physical and Chemical Laboratory Analyses
The level of technical review appropriate for physical and chemical
variables can differ, depending on the variable under consideration. For
example, review of temperature measurements made with a thermometer may
require only that the instrument be calibrated with a standard thermometer.
By contrast, evaluation of measurements of U.S. EPA priority pollutant
organic compounds may require measurements of extraction efficiency, recovery
of spiked compounds, blanks, and replicate samples. Technical evaluation of
53
-------
physical and chemical laboratory analyses might focus on the following
considerations:
•  Holding times.
•  Analytical methods.
•  Methods modifications.
•  Analyses of replicates.
•  Analyses of blanks.
•  Analyses of spikes.
•  Analyses of standard reference materials.
•  Instrument calibrations.
•  Laboratory audits.
•  Interlaboratory comparisons.
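Two of the considerations above, analyses of spikes and analyses of replicates, are commonly summarized as percent recovery and relative percent difference. The sketch below uses the conventional forms of those two statistics; this report does not define them, so the formulas, function names, and the handling of a zero mean are assumptions, and no acceptance criteria are implied.

```python
def percent_recovery(spiked_result, unspiked_result, amount_added):
    """Percent of a known spike recovered by the analysis."""
    if amount_added <= 0:
        raise ValueError("amount_added must be positive")
    return 100.0 * (spiked_result - unspiked_result) / amount_added


def relative_percent_difference(first, second):
    """Absolute difference between two replicates as a percent of their mean."""
    mean = (first + second) / 2.0
    if mean == 0:
        return 0.0   # both replicates are zero: no difference to report
    return 100.0 * abs(first - second) / mean
```

A reviewer working from original laboratory records could compute these values for each batch and compare them with whatever acceptance limits the regional office adopts.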
54
-------