GUIDELINE SERIES
OAQPS NO. 1.2-ooe
GUIDELINES FOR EVALUATION OF
SUSPECT AIR QUALITY DATA
U.S. ENVIRONMENTAL PROTECTION AGENCY
Office of Air Quality Planning and Standards
Research Triangle Park, North Carolina
-------
Attachment 3
GUIDELINES FOR EVALUATION OF SUSPECT AIR QUALITY DATA
Purpose
The purpose of this guideline is to provide the Regional Office
with suggested procedures for verifying and evaluating both specific
air quality data values (Section A) and annual averages with respect
to ambient air quality standards (Section B).
Data Flow
Data generated by the State and local agencies is coded
in SAROAD (Storage and Retrieval of Aerometric Data) format
and submitted through the appropriate Regional Office to the
OAQPS (Office of Air Quality Planning and Standards) for
submittal to the NADB (National Aerometric Data Bank). From
the bank the National Air Data Branch provides this
aerometric data upon request for Federal, State and local
needs.
Flagging of Potentially Anomalous Data Values
Currently, the procedure used by the National Air Data
Branch in the identification of potentially anomalous data
values depends to a large extent on chance discovery by
someone scanning a print-out of either raw data or summary
statistics. If the values are found to be spurious as a con-
sequence of internal processing errors, the discovery contributes
to the process of de-bugging the data system. If the questioned
values appear in the data as received, a query is forwarded
through the appropriate Regional Office to the originating agency,
asking for verification.
This process of detecting questionable data
values will be supplanted when the data system is transferred
to the Univac computer in March of 1974. Potentially
anomalous values will be objectively identified as a step in
the addition of all new data to the file. Tests, preferably
nonparametric, will be applied to the incoming data and a
listing printed of all values that trigger one or another
of the test criteria. Examples of such criteria are 1)
values that are some factor, say, 1 1/2 times larger than
some expected value such as the 99th percentile, 2) hourly
values that differ from adjacent values by more than some
ratio, suggesting an abrupt change in baseline or a transient
interference.
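The two example criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the expected 99th percentile supplied by the caller, and the adjacent-value ratio limit of 3 are hypothetical, since the guideline gives a figure only for the first criterion ("say, 1 1/2").

```python
def flag_values(hourly, expected_p99, factor=1.5, max_ratio=3.0):
    """Return sorted indices of hourly values that trip either screening test.

    hourly       -- list of hourly concentrations, all in the same units
    expected_p99 -- expected 99th percentile (assumed to come from prior data)
    factor       -- criterion 1 multiplier; the text suggests "say, 1 1/2"
    max_ratio    -- criterion 2 limit; hypothetical, the text gives no figure
    """
    flagged = set()
    for i, value in enumerate(hourly):
        # Criterion 1: value exceeds factor times the expected 99th percentile.
        if value > factor * expected_p99:
            flagged.add(i)
        # Criterion 2: value differs from an adjacent value by more than
        # max_ratio. Pairs with a zero or negative lower value are skipped
        # so that no division by zero can occur.
        for j in (i - 1, i + 1):
            if 0 <= j < len(hourly):
                low, high = sorted((value, hourly[j]))
                if low > 0 and high / low > max_ratio:
                    flagged.add(i)
    return sorted(flagged)
```

Note that the ratio test flags both sides of an abrupt change, so the listing printed for review would show the suspect value together with its neighbors.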
This formal procedure for identifying values that call
for deliberate verification, performed as an integral part
of incorporating received data, will speed the process of
minimizing errors in the data bank. As the overall system
is refined, internally incurred errors can be expected to
continue to diminish. There then remains the following
process for checking the identified values as received to
see if there was some error in recording or transmittal, or
if they in fact reflect some peculiar or infrequent phenomenon
in the ambient air. A brief file of these latter instances
of unusual but explainable data records could provide some
profitable insights into the task of urban resource management.
Data Listing
The NADB provided the pollutant averages for Priority I and
II regions which were above their corresponding primary standards.
These averages have been compiled and referred to the appropriate
Regional Office for their verification.
Regional Office Responsibility
The Regional Office has been given the responsibility of
being the prime contact with the State and local agencies and re-
porting to the National Air Data Branch either to accept, reject
or modify the data value or average in question. The Regional Office
has the option to ask the originating agency to determine the
validity of the data or to provide the Regional Office with certain
information and documentation so that it may make the final judgment.
The procedure used to check out any specific data value could
depend on: the Regional Office's assessment of the originating
agency, its capability, quality control program, and previous
performance, or the Regional Office's own personnel and work load.
In any case, the following sequence of procedures is suggested.
Internal Check
Any Agency which alters, manipulates or transcribes a
data value in any way has a potential for error. When a
data value is flagged the Agency should determine that the
data value has maintained its integrity from the initial
contact through the final processing by that Agency.
The data should be traced through the SAROAD system, the Regional
Offices, State Agency and/or local Agency to its original recording,
whether it be a value from a computer readout, paper tape
printer, strip chart, or a report from the chemist in the laboratory.
The types of errors usually found in the internal check are: typing,
key punching, tabulating and transposition, and mathematical (such as
addition, multiplication and transcribing). Further discussion of
these errors and methods to reduce their frequency may be found in
already published guideline documents.1,2,3
If no errors have been identified in the internal check, at all
agency levels, the verification and evaluation process continues down
two similar but separate paths. Which path is chosen depends on
whether the data in question is a single value (Section A) or a
composite average (Section B).
A. Verifying and Evaluating Specific Air Quality Data Points
Instrument Calibration, Specifications and Operations
The operation and calibration of continuous instruments
is of the utmost importance in the production of valid air
quality data. The instrument calibration should be reviewed
for the time in question, both before and after the suspected
data point. It should be determined if the instrument was
operating within pre-determined performance specifications
such as drift, operating temperature fluctuations, unattended
operational periods, etc. These performance specifications
for automatic monitors are defined and published in the
Federal Register4 and summarized in various guideline documents.1,2
Guidelines on air quality control practices and error tracing
techniques are available also.3
Before and After Readings
If the instrument generating the data was found to be
'in control', the values immediately before and after should be
determined. Comparisons between the percent and/or gross
deviations could be made. Ideally, this difference in concentration
should be determined through a statistical analysis
of historical data. For example, it may be determined that a
difference of 0.05 ppm in SO2 concentration for successive
hourly averages occurs very rarely (less than one percent of
the time). The criteria for what constitutes an excessive
change may also be linked to the time of day. For example,
an hourly change of CO of 10 ppm between 6 AM and 7 AM may
be common but would be suspect if it occurred between 2 AM
and 3 AM.1,3
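One way such a criterion might be derived from historical data is sketched below in Python. The function names, the use of a nearest-rank percentile, and the input layout (hour of day paired with concentration, in time order at hourly spacing) are assumptions made for illustration; the guideline does not prescribe a particular statistical method.

```python
import math
from collections import defaultdict

def change_limits(history, percentile=99):
    """Derive an hour-by-hour limit on the change between successive hours.

    history -- list of (hour_of_day, concentration) in time order, assumed
               to be consecutive hourly averages with no gaps
    Returns {hour_of_day: limit}, where limit is the nearest-rank percentile
    of the absolute change from the previous hour into that hour.
    """
    changes = defaultdict(list)
    for (_, previous), (hour, current) in zip(history, history[1:]):
        changes[hour].append(abs(current - previous))
    limits = {}
    for hour, diffs in changes.items():
        diffs.sort()
        rank = max(1, math.ceil(percentile / 100 * len(diffs)))
        limits[hour] = diffs[rank - 1]
    return limits

def is_excessive(previous, current, hour, limits):
    """True if the change into `hour` exceeds the historical limit."""
    return abs(current - previous) > limits[hour]
```

Because the limits are kept separately for each hour of the day, a 10 ppm rise in CO into the 7 AM hour can pass while the same rise into the 3 AM hour is reported as excessive.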
Other Instruments at the Same Location
Observing the behavior of other instruments at the same
location, if any, would give the evaluator a qualitative
insight into the possible reasons for the anomalous reading.
If all of the instruments showed a general increase, meteoro-
logical factors might be considered, while a dramatic deviation
over the same short period of time may indicate an electrical
problem or an air conditioning malfunction. On the other hand,
if the other instruments behaved normally, a temporary influence
of a single pollutant or single pollutant source may be
suspected.
Similar Instruments at Adjacent Locations
Comparing the behavior of other instruments in the
vicinity which monitor the same pollutant could further
elucidate the situation. For example, if the adjacent
instruments (upwind and downwind) exhibited the same general
trend, an area problem in which the maximum effect was over the
station of interest, would be indicated. However, if the
adjacent stations seemed to peak either before or after the
time the suspect value was recorded, the station may have
been under the influence of plume fumigation which wandered
according to wind direction influences. Micrometeorological
influences should not be overlooked either. The station may
be under the influence of subsidence effects from the urban
heat island or upslope-downslope influences.
Meteorological Conditions
No attempt to explain an anomalous air quality data point
would be complete without a consideration of the meteorological
conditions present at the time of the reading. A passing
front and strong inversion, extended calms or strong winds
are conditions which have a large impact on air quality.
Influences of precipitation, temperature and season could be
included to interpret the reasonableness of the data as well.
Time-Series Check
Checking a time plot of the data might reveal a repetitious
pattern during similar time periods. An extreme excursion might thus
be explained. For instance, the instrument
may be extremely temperature sensitive and may be under the
influence of the sun shining between buildings from two to
four each afternoon. Similarly, every Thursday may be
delivery day for an adjacent supermarket and trucks tend to
spend the bulk of the day idling in the vicinity of the probe.
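Where the data are available in machine-readable form, a repetitious pattern of this kind might be looked for as in the following Python sketch. The record layout (weekday, hour, value), the function names, and the factor of 2 applied to the overall median are hypothetical choices made for illustration only.

```python
from collections import defaultdict
from statistics import median

def slot_medians(records):
    """Median value for each (weekday, hour) slot.

    records -- list of (weekday, hour, value); weekday 0 = Monday (assumed)
    """
    slots = defaultdict(list)
    for weekday, hour, value in records:
        slots[(weekday, hour)].append(value)
    return {slot: median(values) for slot, values in slots.items()}

def recurring_slots(records, ratio=2.0):
    """Slots whose median exceeds `ratio` times the overall median.

    A slot that is high week after week points to a recurring local
    influence rather than a one-time error. The ratio is hypothetical.
    """
    overall = median([value for _, _, value in records])
    medians = slot_medians(records)
    return sorted(slot for slot, m in medians.items() if m > ratio * overall)
```

If the suspect value falls in one of the returned slots, the excursion recurs at that time of the week and may be explainable by a local cycle such as the delivery-day example.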
Physical Site Location
From time to time local air quality influences may change
and adversely affect a given air monitoring station's
representativeness. Examples of this might be an adjacent
apartment house or supermarket changing from garbage haul-away
to an incinerator without informing the local agency. Urban
renewal may render the location temporarily unrepresentative for TSP
also. The site may fall prey to vandalism or even premeditated
and systematic tampering designed to draw attention to an
underprivileged area.
The site location, sampling probe material and configu-
ration should be within the bounds of published guidelines
also.1,2
Data Verification Flow Chart
In summary, the following is presented as a stepwise
guide to the verification of specific data values. It pictorially
presents the previous discussion and hopefully will give the
reader the overall view of data verification.
-------
[Flow chart: Data Verification. Box labels and branch labels recoverable from the figure, in order:]
1. DATA FLAGGED
2. INTERNAL CHECK (ERROR FOUND / ERROR NOT FOUND)
3. CONTACT REGIONAL OFFICE; INTERNAL CHECK (ERROR FOUND / ERROR NOT FOUND)
4. CONTACT STATE AND/OR LOCAL AGENCY; INTERNAL CHECK (ERROR FOUND / ERROR NOT FOUND)
5. INSTRUMENT CALIBRATION, OPERATION, SPECIFICATIONS (ERROR FOUND / ERROR NOT FOUND)
6. BEFORE AND AFTER INSTRUMENT READINGS (GREATER THAN CRITERIA / LESS THAN CRITERIA)
7. OTHER INSTRUMENTS, SAME LOCATION (REVERSE TREND INDICATED / NO TREND INDICATED / SUBSTANTIATING TREND INDICATED)
8. SIMILAR INSTRUMENTS, ADJACENT (REVERSE TREND INDICATED / NO TREND INDICATED / SUBSTANTIATING TREND INDICATED)
9. METEOROLOGICAL CONDITIONS (UNFAVORABLE TOWARD OCCURRENCE / NEUTRAL TOWARD OCCURRENCE / FAVORABLE TOWARD OCCURRENCE)
10. TIME-SERIES CYCLE (REVERSE CYCLE / NO CYCLE / POSITIVE CYCLE)
11. PHYSICAL SITE, PROBE, VANDALISM (SITE DEVIATES FROM GUIDELINES / SITE IS OK)
Outcome labels: ERROR CORRECTED, REJECT DATA, NO DECISION.
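The stepwise character of the verification can be illustrated with a short Python sketch. It is a deliberate simplification: it runs a list of checks in order and stops at the first one reporting an error, whereas the flow chart also carries "no decision" and trend outcomes forward. The function name and the stand-in check functions are hypothetical.

```python
def verify(value, checks):
    """Run checks in order and stop at the first one that reports an error.

    checks -- ordered list of (name, function) pairs; each function takes
              the flagged value and returns True if that step found an
              error. The functions are hypothetical stand-ins for the
              manual steps described in Section A.
    Returns (names of the steps run, name of the step that found an error
    or None if every step passed).
    """
    steps_run = []
    for name, found_error in checks:
        steps_run.append(name)
        if found_error(value):
            return steps_run, name
    return steps_run, None
```

Keeping the list of steps run alongside the result gives a simple record of how far the verification proceeded before a value was corrected, rejected or accepted.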
B. Verifying and Evaluating Annual Air Quality Averages
Summary Statistics
If no calculation or recording errors have been found, the
summary statistics describing the average should be checked. These
may include both geometric and arithmetic means and standard deviation,
and the frequency distribution in percentiles. Both the standard de-
viations and the magnitude of the difference between the geometric and
arithmetic mean are more sensitive to a few extremely high values than
to many moderately high levels. Inspection of the values cor-
responding to the higher percentiles would also show the influence
of abnormally high values on the average. Standard deviations do
not generally change much from year-to-year.
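The sensitivity described above can be demonstrated with a short Python sketch. The function name and the two data sets are invented for illustration; they are not measured values.

```python
import math
from statistics import mean, stdev

def summary(values):
    """Arithmetic mean, geometric mean, their difference, and the sample
    standard deviation. All values must be positive for the geometric mean."""
    arithmetic = mean(values)
    geometric = math.exp(mean([math.log(v) for v in values]))
    return {
        "arithmetic_mean": arithmetic,
        "geometric_mean": geometric,
        "mean_difference": arithmetic - geometric,
        "standard_deviation": stdev(values),
    }

# Two invented data sets with nearly the same arithmetic mean (about 80):
few_extreme = [60] * 50 + [600] * 2      # a few extremely high values
many_moderate = [70] * 26 + [90] * 26    # many moderately high values
```

For these invented sets, the difference between the means is roughly 15 for `few_extreme` against less than 1 for `many_moderate`, and the standard deviation is roughly ten times larger, which is the pattern the summary statistics are meant to reveal.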
List Individual Values
If the summary statistics indicate that the mean was heavily
influenced by a few high values, or in the absence of summary statistics,
the individual bits of data which comprised the average
should be listed. From inspection of this list, it can be de-
termined if the average was influenced by a relatively few large
values or whether the bulk of the data appears to be consistently
high. If the former appears to be the situation, treat each individual
point according to the guidelines for specific air quality
data points presented in Section A. In the latter case, proceed to
the next step in the verification of annual averages.
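The distinction between the two cases can be sketched in Python as follows. The approach shown (recomputing the mean with the k largest values set aside) and the default k = 2 are assumptions for illustration; the guideline says only "a relatively few large values" and relies on inspection of the list.

```python
def mean_without_largest(values, k):
    """Arithmetic mean after setting aside the k largest values (k < len)."""
    kept = sorted(values)[:len(values) - k]
    return sum(kept) / len(kept)

def high_average_cause(values, standard, k=2):
    """Classify an average that exceeds `standard`.

    Returns "few large values" if setting aside the k largest values brings
    the mean to or below the standard, otherwise "bulk of data high".
    k is a hypothetical choice, not a figure from the guideline.
    """
    if mean_without_largest(values, k) <= standard:
        return "few large values"
    return "bulk of data high"
```

The values set aside are not discarded; in the first case each of them would be examined individually under Section A.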
Physical Site Inspection
The physical site location should be evaluated in terms of
its representativeness of the pollutant of interest, the averaging
time of interest and the pollutant receptor. The operation of
the site should be evaluated in terms of sampling methodology,
maintenance procedures, calibration procedures and quality con-
trol practices. The actual sampling probe and manifold material,
configuration and placement should be evaluated also. Guidelines
describing in detail these aspects of air quality monitoring have
been published.1,2,3 The evaluator should familiarize himself
with these manuals before attempting to determine the acceptability
of an air monitoring site and operation.
Plot Data
Comparing a visual plot of the current data to that of prior
years or to a typical annual pattern could further pinpoint reasons
to accept or reject the annual average in question. Keep in mind,
however, that some year-to-year variation is expected. Figure 1
shows a typical SO2 annual pattern based on expected monthly
averages. Figure 2 shows this same pattern with a constantly
increasing baseline drift. A pattern of this type suggests a continuing
long-term failure (change) in a component of the instrument,
a deterioration in the supplies being used or a subtle change in
the environment. Figure 3 shows the typical pattern with an abrupt
dislocation of the base line. This may be indicative of a change
in instruments, method of analysis, procedures used or personnel.
It should not be arbitrarily assumed that any such shift is wrong.
For instance, the analytical method may have been changed to the
standard reference method, sources of interferences may have been
eliminated, or the operators may be following the procedure cor-
rectly for the first time. Figure 4 shows a seasonal abnormality
in the expected pattern. It should be kept in mind that a deviation
from the expected pattern can be negative as well as
positive. Figure 5 demonstrates how the expected pattern can be
smoothed (masked) by a nearby source whose emissions are fairly
constant throughout the year. The pattern may also show part of
the year "normal" and part of the year "masked" if there are
pronounced seasonal wind direction changes.
[Figure: sulfur dioxide annual pattern by month, JAN through DEC]
For those pollutants such as oxidants whose peak values
occur during a single 'season', a plot of weekly or bi-weekly
averages through the period of interest would provide more
information on the cyclical patterns than monthly averages.
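The visual comparison of observed and expected patterns can be supplemented with two simple numbers, sketched below in Python: the least-squares slope of the month-by-month residuals (a steady baseline drift) and the largest month-to-month jump in the residuals (an abrupt dislocation). The function names and the expected monthly pattern used in the accompanying checks are hypothetical.

```python
def residuals(observed, expected):
    """Observed minus expected monthly averages, month by month."""
    return [o - e for o, e in zip(observed, expected)]

def drift_per_month(res):
    """Least-squares slope of the residuals against month index."""
    n = len(res)
    x_mean = (n - 1) / 2
    y_mean = sum(res) / n
    numerator = sum((i - x_mean) * (r - y_mean) for i, r in enumerate(res))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator

def largest_jump(res):
    """Signed month-to-month change in the residuals of greatest magnitude."""
    jumps = [b - a for a, b in zip(res, res[1:])]
    return max(jumps, key=abs)
```

A steady slope with small jumps resembles the drifting pattern, while one large jump with little change on either side of it resembles the abrupt dislocation. Neither number by itself shows that the data are wrong.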
Check Prior Data for Trend
Plotting at least four previous annual averages along
with the current year and visually inspecting the graph
could give the evaluator a qualitative insight into whether
the current annual average is a significant deviation from or
merely an extension of the projected trend.
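A straight-line projection is one simple way to put a number beside the visual inspection. The Python sketch below fits a least-squares line to the prior annual averages and extends it one year; the function name and the example figures are invented, and the guideline itself asks only for a plot and a qualitative judgment.

```python
def projected_average(prior_averages):
    """Extend a least-squares line through prior annual averages by one year.

    prior_averages -- annual averages for consecutive prior years, oldest
                      first (the text suggests at least four)
    """
    n = len(prior_averages)
    x_mean = (n - 1) / 2
    y_mean = sum(prior_averages) / n
    slope = (
        sum((i - x_mean) * (y - y_mean) for i, y in enumerate(prior_averages))
        / sum((i - x_mean) ** 2 for i in range(n))
    )
    intercept = y_mean - slope * x_mean
    return intercept + slope * n

def deviation_from_trend(prior_averages, current):
    """Current annual average minus the projected value."""
    return current - projected_average(prior_averages)
```

For invented prior averages of 80, 76, 73 and 69, the projection is 65.5. A current average of 66 would read as an extension of the trend, while 95 would stand about 30 units above it and call for further checking.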
Compare With Surrounding Stations
If there are enough surrounding sites to develop air
quality isopleths of the area, the evaluator could see how the
annual average in question fits in with the overall picture.
For instance, if the point in question was midway between
the isopleth lines representing 80 and 60, but the recorded
value was 50% greater than expected, i.e., 105, an abnormality
may be expected.
This comparative technique may also be used in areas where
there are not enough sites to directly plot air quality
isopleths but where a predictive air quality model has been
developed and verified with a limited number of actual data
points. In these cases, deviations of ±100% could be
suspect, for instance.
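The arithmetic of the isopleth example can be written out as a short Python sketch. The function names are hypothetical, and the simple midpoint interpolation applies only to a point lying midway between two isopleth lines, as in the example.

```python
def expected_midway(isopleth_a, isopleth_b):
    """Expected value at a point midway between two isopleth lines."""
    return (isopleth_a + isopleth_b) / 2

def percent_deviation(observed, expected):
    """Deviation of the observed value from the expected value, in percent."""
    return (observed - expected) / expected * 100
```

With isopleths of 80 and 60, the expected value is 70, and a recorded value of 105 gives a deviation of 50 percent, matching the example in the text. The same calculation applies when the expected value comes from a verified predictive model instead of isopleths.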
Meteorology
Finally, the annual average should be interpreted in
conjunction with meteorological conditions for that year.
For example, if the winter of the year in question were the
coldest in 50 years or the overall degree days were 50%
above the 20-year norm, an increased SO2 average would be
expected. Suspended particulate values can be greatly
affected by wind direction and a disproportionate wind rose
(atypical for the area) could help explain unusual values.
Comparing the appropriate meteorological parameters such as
rainfall, wind speed, number and length of inversion,
temperature and degree days to their long-term averages, i.e.,
20- or 50-year norms, before attempting to change implementation
plans is prudent.
In summary, when an annual average indicates that a change
in an implementation plan may be warranted, it is necessary to
verify and evaluate that air quality measurement according
to the following general steps:
1. make internal check for manipulative errors.
2. look at summary statistics.
3. look at individual values.
4. inspect the physical site.
5. plot data and compare pattern to normal.
6. check method, instrument, procedures, personnel
for changes.
7. check calibration practices, quality control procedures.
8. check prior data for trend.
9. compare with surrounding stations (isopleths).
10. review meteorological conditions.
-------
BIBLIOGRAPHY
1. "Field Operations Guide for Automatic Air Monitoring
Equipment," Office of Air Programs, Publication No.
APTD 0736, EPA, Research Triangle Park, N.C., November
1971.
2. "Guidelines for Technical Services of a State Air
Pollution Control Agency," Office of Air Programs, Publication
No. APTD 1347, EPA, Research Triangle Park, N.C.,
November 1972.
3. "Quality Control Practices in Processing Air Pollution
Samples," Office of Air Programs, Publication No. APTD
1132, EPA, Research Triangle Park, N.C., September 1972.
4. Federal Register, Vol. 36, No. 228, November 25, 1971,
page 22404.
5. Lowry, W. P. and R. W. Boubel, "Meteorological Concepts
in Air Sanitation," Type-Ink., Corvallis, Oregon, 1967.
6. Symposium: Air Over Cities, Public Health Service, SEC
Technical Report A-62-5, Cincinnati, Ohio, November 1961.