GUIDELINE SERIES
OAQPS NO. 1.2-013
PROCEDURES FOR FLOW AND AUDITING
OF AIR QUALITY DATA
U.S. ENVIRONMENTAL PROTECTION AGENCY
Office of Air Quality Planning and Standards
Research Triangle Park, North Carolina
-------
TABLE OF CONTENTS
PREFACE i
1. Introduction 1
2. Data Flow Procedures 3
2.1 Current Data Flow System 3
2.2 Current Data Editing 7
2.3 Current Data Validation and Certification 8
2.4 Current Data Verification 12
2.5 Future Data Flow System 12
2.6 Future Data Editing 14
2.7 Future Data Validation 14
3. Regional Office Air Quality Data Responsibilities 15
3.1 Current Areas of Responsibility 16
3.2 Future Areas of Responsibility 30
4. Current Techniques for SIP Progress Evaluation 32
-------
LIST OF FIGURES
FIGURE PAGE
1. Current Air Quality Data Flow System 4
2. Future Data Flow 13
3. Data Anomaly Processing Flow 24
4. Typical SO2 Annual Pattern 28
5. Typical SO2 Annual Pattern With Constant
Baseline Drift 28
6. Typical SO2 Annual Pattern With Abrupt Baseline
Change 28
7. Typical SO2 Annual Pattern With Seasonal Abnormality 28
8. Influence of Nearby Source on SO2 Annual Pattern
9. Plan Revision Management System 33
-------
i
PREFACE
The Monitoring and Data Analysis Division of the Office of Air
Quality Planning and Standards has prepared this report entitled
"Procedures for Flow and Auditing of Air Quality Data" for use by the Regional
Offices of the Environmental Protection Agency. The purpose of the
report is to provide guidance information on current data auditing
techniques that should be followed as part of the procedure for in-
putting air quality data into the National Aerometric Data Bank. The
primary audience for this report is the administrative and management
personnel in the Regional Office whose need is limited to a general
overview of the system rather than detailed information concerning
specific elements. The AEROS (Aerometric and Emissions Reporting
System) contact personnel will continue to receive specific detailed
information directly from the National Air Data Branch, MDAD. Adherence
to the guidance presented in the report will, hopefully, ensure mutually
compatible ambient air quality data for all States and Regions and should
also facilitate data evaluation and interpretation. Further, any risks
involved in policy decisions concerning National Ambient Air Quality
Standards should be minimized. This report is intended to update and
expand upon the previously issued Interim Guidance Report on "Evaluation
of Suspect Air Quality Data."
-------
-1-
1. INTRODUCTION
The purpose of this Guideline, the fifth* in a series to be issued
by the Monitoring and Data Analysis Division (MDAD) of the Office of
Air Quality Planning and Standards, is to provide the Regional Offices
of EPA with guidance on data auditing techniques that should be
followed as part of the procedure for inputting air quality data into
the National Aerometric Data Bank. Information and suggestions are
presented for both the current and planned computer systems concerning:
' Data Flow
' Data Editing
' Data Validation
' Data Correction Procedures and Certification
* Data Verification
' Statistical Flagging Techniques
In conjunction with this Guideline, the MDAD is also developing sophisti-
cated data edit, validation and quality control programs which should help
smooth the transition between current and planned Regional Office air
quality data responsibilities.
This report will serve on an interim basis until more explicit and
detailed guidance is developed by the Monitoring and Data Analysis Division
as a result of the expected interaction with the Regional Offices on air
quality data handling techniques and procedures.
* This document supersedes a previously issued interim report entitled
"Evaluation of Suspect Air Quality Data," OAQPS No. 1.2-006, issued in
August 1973.
** Information presented in this report is also intended to alert the
Regional Offices to their increasing responsibilities with respect to
air quality data as a result of the planned upgrading of the EPA/RTP
computer system.
-------
-2-
For purposes of definition the following terms are listed as they are
used in this report:
Data Check (Data Screen, Screening)
The comparing of a piece of data to a specified entity.
The comparison may be manual (visual), or automatic (com-
puterized). The entity may be a code or location (edit)
or a value (validation).
Data Auditing
The systematic checking of identifying information and data
before or after it resides on the Aerometric and Emissions
Reporting System. Includes EDIT, VALIDATION, VERIFICATION,
ANOMALY INVESTIGATION, and CERTIFICATION.
Data Edit (Edit Check)
The comparing of data and its unique identification to a set
of specifications concerning format, alphabetic and numeric
requirements and coding requirements, etc., either manually or
automatically.
Data Validation (Validation Screen)
The comparing of data values to a set of predetermined criteria
concerning minimum and maximum limits, deviation from average
values, percent change over time, etc., either manually or
automatically.
Data Anomaly (Anomalous Data)
Any data or data summary about which some problem exists or
about which there arises a question as to its integrity of
information. Anomalous data may be identified (flagged
by a report) either manually or automatically by edit checks,
validation or any other flagging technique.
Data Flag (Flagging)
Calling attention to and uniquely identifying data for
further action. The flagging may be done manually or automatically.
-------
-3-
Data Verification
The total process involved in determining the existence of
data which, while not on NADB, has been indicated as existing
by knowledgeable sources.
Data Certification
The process by which data currently residing on NADB is deter-
mined to be correct and complete, or is recoded, by individuals
sufficiently knowledgeable to have the background, authority and
data to represent the source.
2. DATA FLOW PROCEDURES
This Section presents the current procedures for processing air
quality data. These procedures include, as required, data editing,
validation, verification, certification and flagging techniques for SIP
progress evaluation.
2.1 Current Data Flow System
The general flow of air quality data from the States
through the Regional Offices to the National Aerometric Data
Bank is presented in Figure 1. The steps in the system are
as follows:
a. The State agency submits air quality data to the
appropriate EPA Regional Office as part of the State Imple-
mentation Plan reporting procedures. These reports, which
are forwarded on a quarterly basis, contain the air quality
-------
-4-
[Figure 1. Current Air Quality Data Flow System]
-------
-5-
data and new site descriptions for the State's air monitoring
stations. The data may be sent in more frequently than
quarterly if desired, but must be submitted to the Regional
Office in SAROAD format on either coding forms, punched cards,
or magnetic tape. Data for all operational stations as
described in the SIP's, beginning with that used in plan
preparation, must be submitted. It is strongly encouraged
that all reliable data obtained by the State which satisfy
the criteria established for monitoring network adequacy be
submitted.
b. The NEDS/SAROAD contact in the Regional Office arranges
for keypunching of forms if necessary and then mails the data to
the MDAD's National Air Data Branch in card or tape form.
c. Air Quality data submitted to the National Air Data
Branch should have the following characteristics:
i. Data must be coded in SAROAD format.
ii. Data values less than the monitoring minimum de-
tectable sensitivity should be reported as a "zero"
value. A value equal to half the minimum detectable
sensitivity will be substituted when calculating
summary statistics for continuous data.
iii. It is desirable that the data be representative of
a consecutive three-month period for which at least
75 percent of the data values are valid. A non-
detectable measurement, i.e., a value below the
minimum detectable sensitivity (Limits of Detection),
-------
-6-
is considered valid. Summary statistics are not
automatically machine computed if greater than
50 percent of the valid measurements are below the
minimum detectable concentration. However, if the
criteria are not met, the data should still be sub-
mitted particularly for evaluation of maximum value
standards. For noncontiguous 24-hour data there
should be at least five data points in the quarter,
with at least two months being reported and a mini-
mum of two data values in the month with the least
number of data values reported.
iv. Data must represent an interval of one-hour or
greater — shorter interval data must be averaged
over an hour.
v. Data must be representative of the conditions of
the site for the period of time specified; modifi-
cation of the environment in which the site is
located must be reported to the MDAD by the State
and/or the Regional Office.
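The completeness criteria in items ii and iii above can be sketched as simple checks. The following Python sketch is illustrative only and is not part of the SAROAD system; the function names and the `sample_months` representation are assumptions made for this example.

```python
from collections import Counter

def continuous_quarter_ok(n_valid, n_possible):
    """Criterion iii: at least 75 percent of the possible values in the
    consecutive three-month period must be valid (a below-detection
    measurement counts as valid)."""
    return n_valid / n_possible >= 0.75

def substitute_half_mdl(values, mdl):
    """Criterion ii: values reported as zero (below the minimum
    detectable sensitivity) are replaced by half that sensitivity
    before summary statistics are computed for continuous data."""
    return [mdl / 2 if v == 0 else v for v in values]

def noncontiguous_24h_quarter_ok(sample_months):
    """Criterion iii for noncontiguous 24-hour data: at least five data
    points in the quarter, at least two months reported, and at least
    two values in the month with the fewest reports.
    `sample_months` lists the month number (1-12) of each sample."""
    counts = Counter(sample_months)
    return (sum(counts.values()) >= 5
            and len(counts) >= 2
            and min(counts.values()) >= 2)
```

For example, a quarter of hourly data with 1,620 of 2,160 possible values valid just meets the 75 percent criterion.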
d. Data are processed using the SAROAD edit program and
the error messages generated are provided to the AEROS contact.
e. Investigation and correction of potential errors is
accomplished by the Regional Office in conjunction with the
States using procedures described later in this document.
Corrected data are submitted to the National Air Data Bank for
file updating.
-------
-7-
2.2 Current Data Editing
The incoming air quality data, in SAROAD format, is
subjected to various checks by the National Aerometric Data
Bank's computer programs. The data will fail to pass the
edit programs for the following reasons:
a. No existing site description. Before any data are
accepted, the site file must contain the information from
the site identification form. The program checks the 12-digit
site code on the data and if no corresponding record is availa-
ble in the site file, the data are rejected. Therefore, the
site identification must be entered before data from a new site
can be accepted.
b. No existing description of sampling or analytical
method. The program automatically rejects data if a record
of the method used to generate the data is not available.
c. No match on the pollutant-method-interval-unit
combination. Only valid combinations of these codes are
accepted; anything else is rejected. For example, there is
no monthly-interval suspended particulate data using a hi-vol
sampler and gravimetric analysis.
d. Any data field other than "Agency" or "Interval"
which has been coded in alphabetic rather than numeric
characters.
e. Data on the wrong form, such as trying to send 24-
hour data on the hourly data form.
f. Incorrect start hour. For hourly data the start hour
must be 00 or 12. For two-hour data through twelve-hour data
-------
-8-
legitimate values are given on page 36 of the SAROAD Users
Manual. For twenty-four hour or greater data, legitimate
values are from 00 to 23. Anything else is automatically
rejected.
g. Incorrect date. Data are checked for meaningful
dates. Examples of meaningless dates are February 30 or April 31.
Some data had to be rejected because the year was designated as
1977. Eventually, the capability to flag data which have a date
other than the current quarter will be added. However, this
capability will be delayed until all back data are incorporated
in the system.
h. Imbedded non-numeric characters in values. There is
a four digit field for the value. For example, values which
have blanks between digits, such as two zeros, a blank, and an
eight instead of three zeros and an eight would be rejected.
i. Decimal place indicator not between 0 and 5. The data
which are currently being generated all have fewer than five
decimal places.
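Several of the edit rejections above are mechanical and can be illustrated in code. The following Python sketch is illustrative only; it is not the SAROAD edit program, the function names are invented for this example, and the legal start hours for two- through twelve-hour data (given in the SAROAD Users Manual) are not reproduced.

```python
import calendar

def valid_date(year, month, day):
    """Edit check g: reject meaningless dates such as February 30
    or April 31."""
    if not 1 <= month <= 12:
        return False
    return 1 <= day <= calendar.monthrange(year, month)[1]

def valid_start_hour(interval_hours, start_hour):
    """Edit check f: hourly data must start at 00 or 12; data of
    twenty-four hours or greater may start at any hour from 00 to 23.
    (Two- through twelve-hour intervals have their own legal start
    hours, listed in the SAROAD Users Manual.)"""
    if interval_hours == 1:
        return start_hour in (0, 12)
    if interval_hours >= 24:
        return 0 <= start_hour <= 23
    raise NotImplementedError("see SAROAD Users Manual, p. 36")

def valid_value_field(field):
    """Edit check h: the four-digit value field may not contain
    imbedded non-numeric characters (e.g. "00 8" is rejected)."""
    return len(field) == 4 and field.isdigit()

def valid_decimal_indicator(d):
    """Edit check i: the decimal place indicator must be 0 to 5."""
    return 0 <= d <= 5
```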
2.3 Current Data Validation and Certification
Currently, the manual procedure used by the MDAD in the
identification of potentially anomalous data values depends,
to a large extent, on chance discovery by someone scanning a
computer printout of either raw data or summary statistics.
Automatic procedures have not yet been developed for computer
applications.
-------
-9-
This process of detecting questionable data values will be
supplanted when the data system is transferred to the Univac
computer in August, 1974. Potentially anomalous values will
be objectively identified as a step in the addition of all
new data to the file. Both parametric and non-parametric tests
could be applied to the incoming data and a listing printed of
all values that meet one or another of the test criteria for
flagging. Examples of such tests are given below.
Non-parametric tests
' Values that are larger than the arithmetic mean of the
data by some preassigned factor (such as 2).
' Values that are some factor, say 1.5, times larger than
the estimated 99th percentile of the data.
' Hourly values that differ from adjacent values by more
than some preassigned ratio, suggesting some abrupt
change in baseline or a transient interference.
' Chebyshev-type tests, wherein values that are more than
four standard deviations away from the mean are to be
considered suspect.
Parametric tests
Efficient use of these tests depends on knowledge of the
frequency distribution of the quantity being measured. Examples
of such tests are presented below. (The sensitivity of these
tests can be determined analytically from the frequency distri-
bution.)
-------
-10-
• Detection of any values that are larger by some factor
(e.g., 1.5) than the expected value of the assigned 99th
percentile of the distribution under question.
' The finding that the average of K ≥ 5 successive values
falls outside the (μ ± 3σ/√K) limit, where μ and σ² are the
mean and the variance of the distribution under question.
Note: The difference between the non-parametric test and the para-
metric test is that in the former, the assigned percentile is esti-
mated from the data, whereas in the latter it is theoretically
obtained.
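The flagging screens above can be sketched as follows. This Python sketch is illustrative, not an NADB program; the default factors are the examples quoted in the text, and the function names are invented.

```python
import statistics

def flag_suspect(values, mean_factor=2.0, pctl_factor=1.5, n_sd=4.0):
    """Non-parametric screens: flag values larger than `mean_factor`
    times the arithmetic mean, larger than `pctl_factor` times the
    empirical 99th percentile, or (Chebyshev-type) more than `n_sd`
    standard deviations from the mean. Returns the flagged indices."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    srt = sorted(values)
    p99 = srt[min(len(srt) - 1, int(0.99 * len(srt)))]
    flagged = []
    for i, v in enumerate(values):
        if (v > mean_factor * mean
                or v > pctl_factor * p99
                or abs(v - mean) > n_sd * sd):
            flagged.append(i)
    return flagged

def run_mean_out_of_control(run, mu, sigma, k_min=5):
    """Parametric screen: the average of K >= k_min successive values
    falls outside mu +/- 3*sigma/sqrt(K), where mu and sigma**2 are
    the mean and variance of the assumed distribution."""
    k = len(run)
    if k < k_min:
        return False
    return abs(statistics.mean(run) - mu) > 3 * sigma / k ** 0.5
```

As the Note states, the non-parametric screen estimates its percentile from the data itself, while the parametric screen takes μ and σ from the assumed distribution.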
Validation of the pollutant measurements involves technical
judgment about what constitutes questionable data, and is expected
to be applied systematically in the form of a set of criteria
defining, for each pollutant, what constitutes an unusual or anomalous
value or an abnormal fluctuation. Excursions outside of expected
bounds should be flagged or tabulated but cannot be automatically
rejected or deleted. They must be brought to the attention of the
contributing agency for correction.
Definitions of what constitute unusual values or abnormal
fluctuations are required for each pollutant. These criteria
should be defined by people familiar with the characteristic behavior
of the pollutants and the instruments used to measure them. Realis-
tically, these criteria for identifying questionable values should
be open to revision. Once developed, these criteria can be readily
incorporated as a standard element in the data bank's editing and/or
validation procedures.
-------
-11-
Certification by States is accomplished by using available SAROAD
output to determine the accuracy and completeness of all submitted
data. Particular emphasis should be placed on the following:
a. Site identification information
b. Methods of collection and analysis
c. Integrity of the actual data
All three items must be coded and represented on the data bank as
accurately as possible to insure the proper interpretation and
evaluation of the data.
Certification may be triggered by either of two mechanisms:
First, any time there are EDIT or VALIDATION reports flagging either
incorrect data or data of a questionable nature, implicit certifica-
tion is required. This means that the data must be corrected and
resubmitted, if necessary; otherwise, for data which has been
flagged as being possibly invalid, no action is necessary if the
data is correct as it was submitted.
The second trigger for certification may be dependent upon
time or the number of anomalies being reported for a specific
subset of data. It may be determined that an agency should inspect
a set of data to certify it as being correct and complete. In this
situation (which will always be identified as such), the appropriate
agency must make any corrections necessary to the data and must
always respond in writing that the data are correct as they stand
or that the corrections which have been attached will solve the
problem.
-------
-12-
2.4 Current Data Verification
Currently the entire procedure of data verification is
being handled through contractual resources. This involves
the use of reference publications to determine the probable
existence of additional air quality data. Once NADB is aware
of such data, the necessary steps are taken with the appropriate
agency to coordinate the submission of the data to the National
Aerometric Data Bank.
2.5 Future Data Flow System
As previously mentioned, it is expected that the Regional
Offices will assume more responsibility with respect to the
validation of air quality data. This will be accomplished by
their taking a central role in the screening of air quality
data before it is inputted into the National Aerometric Data
Bank. The screening will involve not only editing the coding
format but also the validation of the measurements.
During the transition period of shifting more responsibility
to the Regional Offices, it is anticipated, at least initially,
that the MDAD will do minimal revalidation of the data. Also,
the flagging technique for measuring SIP progress will still be
employed and the National Air Data Branch will assume the ulti-
mate responsibility of entering the "correct" SAROAD data into
the National Aerometric Data Bank (Figure 2).
-------
-13-
[Figure 2. Future Data Flow: State/local agencies submit air quality
and emissions data to the Regional Office NEDS/SAROAD contacts
(including edit and validation), then to the National Air Data Branch
Data Processing Section, with interactive terminal display.]
-------
-14-
2.6 Future Data Editing
One of the highest priorities within MDAD concerns making
available all Edit and Validation programs to each Regional
Office. It has been determined that this can best be accom-
plished by providing terminal edit capability on the RTCC-
UNIVAC 1110.
The procedure to be followed would involve either trans-
mitting or mailing the AQ report in a computer readable medium
(cards or tape) to RTCC. Once the data has arrived, the edit/
validation programs could be executed via the Regional Office
terminal with the error diagnostics being returned via the
medium speed remote terminal. This output could then be returned
to the appropriate agency as required.
After a successful edit of the data has been completed, the
culled data would be identified to NADB, which would concatenate
several Regional Office data sets into a single update. Any
additional errors generated by the actual update (i.e., duplicate
data) would be routed directly to the appropriate Regional Office.
2.7 Future Data Validation
As data are audited by the Terminal Edit/Validation program
it is planned that, in addition to the edit rejection listing
being produced, a special report will be generated which auto-
matically will identify data which seem for one reason or
another to be invalid. These data, although identified in the
validation report, will nevertheless be updated onto the SAROAD
files.
-------
-15-
Due to storage constraints there are no plans for these
data to be further "flagged" while stored. It is imperative
that the submitting agency check the data immediately to
determine their validity. If the data are confirmed to be
correct no further action is necessary. If, however, the data
are incorrect then the agency must immediately code the neces-
sary changes and/or deletions and submit these to the appropriate
Regional Office.
In addition to the types of validation tests already
discussed the following list illustrates the computerized
hourly validation checks under consideration:
CO                              100 ppm
SO2                             2 ppm
Ozone (Total Oxidant)           0.7 ppm
Total Hydrocarbons              10 ppm
Non-methane Hydrocarbons        5 ppm
NO2                             2 ppm
NO                              3 ppm
NOx                             5 ppm
Total Suspended Particulate     2000 µg/m³
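A sketch of how such upper-limit screens might be applied is shown below. The limits transcribe the list above (units in ppm, except total suspended particulate in µg/m³); the dictionary keys are abbreviations invented for this example, and the sketch is not the actual Terminal Edit/Validation program.

```python
# Hourly validation limits under consideration, transcribed from the
# list above; units are ppm except TSP (µg/m³). Keys are invented
# abbreviations for this sketch.
HOURLY_LIMITS = {
    "CO": 100.0,
    "SO2": 2.0,
    "OZONE": 0.7,   # total oxidant
    "THC": 10.0,    # total hydrocarbons
    "NMHC": 5.0,    # non-methane hydrocarbons
    "NO2": 2.0,
    "NO": 3.0,
    "NOX": 5.0,     # oxides of nitrogen
    "TSP": 2000.0,  # µg/m³
}

def validate_hourly(pollutant, value):
    """Return True if the hourly value passes the upper-limit screen.
    Values that fail are flagged for review in the validation report,
    not rejected; they are still updated onto the files."""
    return value <= HOURLY_LIMITS[pollutant]
```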
3. REGIONAL OFFICE AIR QUALITY DATA RESPONSIBILITIES
This Section presents recommendations and suggestions as to those
methods and techniques which the Regional Offices can employ to validate
air quality data. The Monitoring and Data Analysis Division recognizes
that some of the areas of responsibility are beyond the capability of some
of the Regional Offices at this time. In these cases, the MDAD will
-------
-16-
provide technical and other assistance on an as-needed basis in order
that the current and planned data flow system operate in the most
efficient and effective manner possible.
3.1 Current Areas of Responsibility
At this time, there are various tasks which the Regional
Offices perform in the validation of air quality data. These
include the following:
a. Preliminary Data Inspection
The Regional Office can make a visual screening of the
SAROAD sheets before forwarding the data to MDAD. Ensuring
that the site identification and descriptions, pollutant,
sampling and analytical method, interval, units and decimal
point locations are properly filled in on both the 24-hour
and hourly SAROAD coding forms will greatly reduce the edit
rejections and resulting correspondence between MDAD and the Regional
Offices. If a particular agency shows a history of care-
lessness in correctly filling out their SAROAD sheets, the
Regional Office may want to check these sheets for their
"correctness" as discussed in Section 2 rather than just
for their completeness.
If the data submitted to the Regional Office from the
States are in the form of punched cards, the Regional Office
can visually inspect the batch to make sure that pertinent
columns are punched and aligned correctly. The Regional
Office may find it desirable to actually print out or list
the data from selected agencies before forwarding the cards
-------
-17-
to MDAD. If the data are sent on magnetic tape, there
is little the Regional Office can do, at present, but
forward it on.
b. Interrogate Data Bank, Data Requests and Manual
Examination
Some existing SAROAD outputs are available which the
Regional Office may find helpful in evaluating their air
quality data. The Regional Office can request output from
the data bank and get quarterly and yearly frequency dis-
tribution lists for each sampling station. The output
includes the site description at the top of each page and
a frequency distribution for each pollutant, year or
quarter-year. The number of observations, minimum, maximum,
and the percentile values are listed for each pollutant-
quarter-year. The arithmetic mean, geometric mean, and
geometric standard deviation are given only for those
pollutant-quarter-years which meet National Aerometric
Data Bank criteria.
The frequency distributions are available on a national,
EPA regional and State basis. Other options include the
ability to request the distribution for limited numbers of
pollutants, years or quarters.
These and other outputs and remote batch and inter-
active access methods are more fully defined and discussed
in the SAROAD Terminal Users Manual,2 and the Regional
Office NEDS/SAROAD contact should be contacted for addi-
tional information.
-------
-18-
The Regional Office will, in the future, be able to
make comparisons between measured air quality data and
that which they, and/or the State and local agencies,
intuitively feel is reasonable for that geographical area,
station and pollutant.
c. Check Anomalous Data
Anomalous or questionable data values may arise from
the data flow system as a result of the following procedures:
edit checks, validation screen and the application of the
flagging technique. The Regional Office has the responsi-
bility of either accepting, rejecting or modifying the data
value or average in question. In this regard, the Regional
Office has the option of requesting that the originating
agency determine the validity of the data or provide certain
information and documentation so that they may make the
final determination.
The procedure used to check out any specific data
value prior to the initiation of an anomaly request to
NADB could depend on the Regional Office's assessment
of the originating agency in terms of its capability,
quality control program, and previous performance. MDAD
suggests that the following sequence of steps be followed
in order to check out anomalous data values or composite
averages. In all cases, it should be recognized that any
agency which alters, manipulates or transcribes a data
value in any way is potentially capable of introducing an
-------
-19-
error. When a data value is identified as being questionable,
the responsible agency must determine whether or not the data
value maintained its integrity throughout the agency's data
acquisition and processing system.
The data should be traced through the SAROAD system,
the Regional Offices, State agency and/or local agency to
its original recording, whether it be a value from a computer
readout, paper tape printer, strip chart, or a report from the
chemist in the laboratory. The types of errors usually found
in the internal check are typing, keypunching, tabulating,
transposition, and mathematical errors (such as in addition,
multiplication and transcribing). Further discussion of these errors
and methods to reduce their frequency may be found in already
published guideline documents.3,4,5
If no errors have been identified in the internal check,
at all agency levels, the verification and evaluation process
should continue down two similar but separate paths. Which
path is chosen depends on whether the data in question is a
single value or a composite average.
i. Evaluating Specific Air Quality Data Values
' Instrument Calibration, Specifications and Operations
The operation and calibration of continuous instru-
ments is of the utmost importance in the production
of valid air quality data. The instrument cali-
bration should be reviewed for the time period in
question, both before and after the suspect data
-------
-20-
point. It should be determined if the instru-
ment was operating within pre-determined
performance specifications such as drift,
operating temperature fluctuations, unattended
operational periods, etc. These performance
specifications for automatic monitors are defined
and published in the Federal Register and sum-
3 4
marized in various guideline documents. ' These
specifications are likely, however, to be super-
ceded by those published in the October 12, 1973,
issue of the Federal Register on proposed
Eauivalency Regulations. Guidelines on air guality
control practices and error tracing techniques are
also available.
' Before and After Readings
If the instrument generating the data was found to
be "in control," the values immediately before and
after should be determined. Comparisons between the
percent and/or gross deviations could be made. Ideally,
this difference in concentration should be determined
through a statistical analysis of historical data.
For example, it may be determined that a difference
of 0.05 ppm in SO2 concentration for successive hourly
averages occurs very rarely (less than one percent of
the time). The criteria for what constitutes an
excessive change may also be linked to the time of day.
-------
-21-
For example, an hourly change of CO of 10 ppm between
6 AM and 7 AM may be common but would be suspect if
it occurred between 2 AM and 3 AM.
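The time-of-day-dependent change criterion described above might be sketched as follows. All names and criteria values here are illustrative assumptions, not published limits.

```python
def change_suspect(prev_value, value, hour, criteria):
    """Flag an hour-to-hour change that exceeds the largest change
    considered normal for that time of day. `criteria` maps each hour
    (0-23) to an allowed change, ideally derived from a statistical
    analysis of historical data."""
    return abs(value - prev_value) > criteria[hour]

# Hypothetical CO criteria (ppm): generous limits during the morning
# traffic build-up, tight limits overnight, as in the text's example
# of a 10 ppm rise being common at 6-7 AM but suspect at 2-3 AM.
co_criteria = {h: (12.0 if 6 <= h <= 9 else 4.0) for h in range(24)}
```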
' Other Instruments at the Same Location
Observing the behavior of other instruments at
the same location would give the evaluator a quali-
tative insight into the possible reasons for the
anomalous reading. If all of the instruments showed
a general increase, meteorological factors might be
considered while a dramatic deviation over the same
short period of time may indicate an electrical
problem or an air conditioning malfunction. On the
other hand, if the other instruments behaved normally,
a temporary influence of a single pollutant or single
pollutant source may be suspected.
Similar Instruments at Adjacent Locations
Comparing the behavior of other instruments in the
vicinity which monitor the same pollutant could
further elucidate the situation. For example, if
the adjacent instruments (upwind and downwind)
exhibited the same general trend, an area problem in
which the maximum effect was over the station of
interest would be indicated. However, if the adjacent
stations seemed to peak either before or after the
time the suspect value was recorded, the station may
have been under the influence of plume fumigation
-------
-22-
which wandered according to wind direction influences.
Micrometeorological influences should not be over-
looked either. The station may be under the influence
of subsidence effects from the urban heat island or
upslope-downslope influences.7,8
' Meteorological Conditions
No attempt to explain an anomalous air quality data
point would be complete without a consideration of
the meteorological conditions present at the time of
the reading. A passing front and strong inversion,
extended calms or strong winds are conditions which
have a great impact on air quality.7,8 Influences of
precipitation, temperature and season could be included
to interpret the reasonableness of the data as well.
' Time-Series Check
Investigating a time series plot of the data might
reveal a repetitious pattern during similar time
periods. An extreme excursion might thus be explained.
For example, the instrument may be extremely tempera-
ture sensitive and may be under the influence of the
sun shining between buildings from 2 PM to 4 PM each
afternoon. Similarly, for example, every Thursday may
be delivery day for an adjacent supermarket where the
delivery trucks spend the bulk of the day idling in
the vicinity of the sampler probe.
-------
-23-
' Physical Site Location
From time to time local air quality influences may
change and adversely affect a given air monitoring
station's representativeness. Examples of this might
be an adjacent apartment house or supermarket changing
from garbage haul-away to an incinerator. Urban
renewal may also render the location temporarily un-
representative. It may be beneficial for each agency
or Regional Office to maintain a map and photograph
of each site showing influencing site characteristics.
These could be updated on a periodic basis. The site
location, sampling probe material and configuration
should also be within the bounds of those specified
in published guidelines.3 Figure 4 presents a step-
wise review and guide to the verification of specific
data values. It should provide the Regional Offices
with an overall picture of the suggested processing
of State and local air quality data.
ii. Evaluating Annual Air Quality Averages
' Summary Statistics
If no calculation or recording errors have been found,
those summary statistics which describe the average
should be checked. These may include both geometric
and arithmetic means, standard deviations, and the
frequency distribution in percentiles. Both the
-------
-24-
[Figure 4. DATA VERIFICATION FLOW CHART FOR SPECIFIC DATA VALUES:
anomalous data identified by the National Aerometric Data Bank trigger
an MDAD internal check; if no error is found, the Regional Office is
contacted for its internal check; if no error is found there, the
State and/or local agency performs its own internal check and reviews
instrument calibration, operation and specifications. At each stage,
any error found is corrected.]
-------
[Figure 4, continued: when the internal checks find no error, the
suspect value is evaluated in sequence against before-and-after
instrument readings, other instruments at the same location, similar
instruments at adjacent locations, meteorological conditions, a
time-series cycle check, and the physical site (probe placement,
vandalism). Each test may substantiate the value, indicate a reverse
trend or cycle, or yield no decision.]
-------
-25-
standard deviations and the magnitude of the dif-
ference between the geometric and the arithmetic
means are more sensitive to a few extremely high
values than to many moderately high levels. Inspec-
tion of the values corresponding to the higher per-
centiles would also show the influence of abnormally
high values. Standard deviations generally do not
change much from year to year.
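As an illustration of this sensitivity, the sketch below computes the summary statistics named above for a year of readings with and without a few extreme values. Python is used purely for illustration; the data are invented, and no such program is part of the SAROAD system.

```python
import math
import statistics

def summary_stats(values):
    """Summary statistics used when screening an annual average:
    arithmetic mean, geometric mean, standard deviation, and
    selected upper percentiles."""
    n = len(values)
    ordered = sorted(values)
    arith = sum(values) / n
    geo = math.exp(sum(math.log(v) for v in values) / n)  # requires v > 0
    sd = statistics.stdev(values)
    # value at or below which roughly p percent of observations fall
    pct = {p: ordered[min(n - 1, int(p / 100 * n))] for p in (50, 90, 99)}
    return arith, geo, sd, pct

# A year of unremarkable readings versus the same year with three spikes.
normal = [60.0 + (i % 20) for i in range(360)]
spiked = normal[:-3] + [400.0, 450.0, 500.0]

a1, g1, s1, p1 = summary_stats(normal)
a2, g2, s2, p2 = summary_stats(spiked)

# The standard deviation and the arithmetic-geometric gap react far
# more strongly to the three spikes than the mean itself does.
assert s2 > 2 * s1
assert (a2 - g2) > 2 * (a1 - g1)
```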
• List Individual Values
If the summary statistics indicate that the mean was
heavily influenced by a few high values, or in the
absence of summary statistics, the individual data
values which comprised the average should be listed.
From inspection of this list, it can be determined
if the average was influenced by a relatively few
large values or whether the bulk of the data appears
to be consistently high. If the former appears to be
the situation, each individual data value should be
treated according to the guidelines for specific air
quality data points presented above. In the latter
case, proceed to the next step in the verification of
annual averages.
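One simple way to make the few-high-values versus consistently-high distinction concrete is to compare the full mean with a mean computed after setting the largest values aside. The sketch below does this with invented data; the 5% and 1% thresholds are likewise invented for illustration.

```python
def high_value_influence(values, top_n=5):
    """Gauge whether an average is driven by a few large values:
    compare the mean of all values with the mean after the top_n
    values are set aside."""
    ordered = sorted(values, reverse=True)
    full_mean = sum(values) / len(values)
    trimmed = ordered[top_n:]
    trimmed_mean = sum(trimmed) / len(trimmed)
    # A large relative drop suggests a handful of values dominate the
    # average; a small drop suggests the bulk of the data is simply high.
    return (full_mean - trimmed_mean) / full_mean

few_spikes = [70.0] * 355 + [600.0] * 5   # a few large values
uniformly_high = [140.0] * 360            # bulk of data is high

assert high_value_influence(few_spikes) > 0.05
assert high_value_influence(uniformly_high) < 0.01
```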
• Physical Site Inspection
The physical site location should be evaluated in terms
of its representativeness of the pollutant of interest,
the averaging time of interest, and the pollutant
receptor. The operation of the site should be
evaluated in terms of sampling methodology, mainte-
nance procedures, calibration procedures and quality
control practices. The actual sampling probe and
manifold material, configuration and placement should
also be evaluated. Guidelines describing in detail
these aspects of air quality monitoring have been
published.3,4,5
• Plot Data
Comparing a visual plot of the current data to that
of prior years on a typical annual pattern could further
pinpoint reasons to accept or reject the annual average
in question. Note, however, that some year-to-year
variation is expected. Figure 4 presents a typical
SO2 annual pattern based on expected monthly averages
(exaggerated for purposes of illustration). Figure 5
shows this same pattern with a constantly in-
creasing baseline drift. A pattern of this type
suggests a continuing long-term failure (change) in
a component of the instrument, deterioration in the
supplies being used or a subtle change in the environ-
ment. Figure 6 presents the typical pattern with an
abrupt dislocation of the baseline. This may be
indicative of a change in instruments, methods of
analysis, procedures used or personnel. It should
not be arbitrarily assumed that any such shift
is wrong. For instance, the analytical method
may have been changed to the standard reference
method, sources of interferences may have been
eliminated or the operators may be following the
procedure correctly for the first time. Figure 7
presents a seasonal abnormality in the expected
pattern. It should be kept in mind that a devia-
tion from the expected pattern can be negative as
well as positive. Figure 8 demonstrates how the
expected pattern can be smoothed (masked) by a
nearby source whose emissions are fairly constant
throughout the year. The pattern may also show
part of the year as "normal" and part of the year
"masked" if there are pronounced seasonal wind
direction changes. For those pollutants, such as
oxidants, whose peak values occur during a single
season, a plot of weekly or bi-weekly averages through
the period of interest would provide more information
on the cyclical patterns than monthly averages.
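The pattern comparisons of Figures 4 through 7 can be reduced to a rough screening rule: examine the month-by-month differences between the current year and the typical pattern. The sketch below is one hypothetical way to do so; the classification thresholds and monthly data are invented.

```python
def pattern_differences(current, typical):
    """Month-by-month difference between the current year and the
    expected (typical) monthly pattern."""
    return [c - t for c, t in zip(current, typical)]

def classify(diffs, step_tol=1.0):
    """Crude classification of the difference pattern (tolerance invented)."""
    steps = [b - a for a, b in zip(diffs, diffs[1:])]
    if all(abs(s) < step_tol for s in steps) and abs(diffs[0]) < step_tol:
        return "consistent with typical pattern"
    if all(s > 0 for s in steps):
        return "constant baseline drift suspected (Figure 5)"
    if sum(1 for s in steps if abs(s) >= step_tol) == 1:
        return "abrupt baseline change suspected (Figure 6)"
    return "seasonal abnormality or other deviation (Figure 7)"

typical = [90, 85, 70, 55, 40, 30, 28, 32, 45, 60, 75, 88]  # winter-peaked SO2
drift   = [t + 2 * m for m, t in enumerate(typical)]         # growing offset
shifted = typical[:6] + [t + 15 for t in typical[6:]]        # mid-year step

assert classify(pattern_differences(drift, typical)).startswith("constant")
assert classify(pattern_differences(shifted, typical)).startswith("abrupt")
```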
• Check Prior Data for Trend
Plotting at least four previous annual averages
along with the current year and visually inspecting
the graph could give the evaluator a qualitative
insight into whether the current annual average is
a significant deviation from or an extension of the
projected trend.
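A least-squares line through the prior annual averages gives a simple numerical counterpart to this visual check. The sketch below projects the trend one year ahead; the averages are invented for illustration.

```python
def trend_projection(prior_averages):
    """Fit a least-squares straight line through prior annual averages
    and project it one year ahead."""
    n = len(prior_averages)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(prior_averages) / n
    slope = (sum((x - x_mean) * (y - y_mean)
                 for x, y in zip(xs, prior_averages))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return slope * n + intercept  # value expected for the current year

# Four previous annual averages trending downward by 5 units per year.
prior = [100.0, 95.0, 90.0, 85.0]
expected = trend_projection(prior)  # exactly 80.0 for this line

# A current average near the projection extends the trend; one far
# from it is a marked deviation worth investigating.
assert abs(expected - 80.0) < 1e-9
assert abs(81.0 - expected) < abs(95.0 - expected)
```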
[Figures 4 through 8 (plots not reproduced): typical SO2 annual
patterns plotted as monthly averages, January through December,
illustrating the expected pattern, constant baseline drift, an abrupt
baseline change, a seasonal abnormality, and the masking influence of
a nearby source.]
• Compare With Surrounding Stations
If there are enough surrounding sites to develop
air quality isopleths of the area, the evaluator
could see how the annual average in question fits
in with the overall picture. For instance, if the
point in question was midway between the isopleth
lines representing 80 and 60 (an expected value of 70)
but the recorded value was 50% greater than expected,
i.e., 105, an abnormality may be suspected.
This comparative technique may also be used in areas
where there are not enough sites to directly plot air
quality isopleths but where a predictive air quality
model has been developed and verified with a limited
number of actual data values. In these cases, for
example, deviations of ± 100% could be suspect.
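The deviation test described above amounts to comparing a measured value against the expected (isopleth- or model-derived) value with a relative tolerance, as in this sketch. The 25% tolerance is invented for illustration.

```python
def flag_against_expected(measured, expected, tolerance):
    """Flag a value whose relative deviation from the expected
    level exceeds the given tolerance."""
    return abs(measured - expected) / expected > tolerance

# Midway between the 80 and 60 isopleths the expected value is 70;
# a recorded 105 is 50% high and would be flagged at a 25% tolerance,
# while a recorded 75 would not.
expected = (80 + 60) / 2
assert flag_against_expected(105, expected, 0.25)
assert not flag_against_expected(75, expected, 0.25)
```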
• Meteorology
The annual average should be interpreted in conjunc-
tion with meteorological conditions for the year in
question. For example, if the winter of the year in
question were the coldest in 50 years or the overall
degree days were 50% above the 20-year norm, an
increased SO2 average would be expected. Suspended
particulate values can be greatly affected by wind
direction and a disproportionate wind rose (atypical
for the area) could help explain unusual values.
Comparing the appropriate meteorological parameters
such as rainfall, wind speed, number and length of
inversions, temperature and degree days to their long-
term averages, i.e., 20- or 50-year norms, before
attempting to change implementation plans is suggested.
d. Data Bank Add/Correct/Delete Procedures.
As Regional Office interaction with the SAROAD data
bank increases, there will be an increasing need to become
proficient with the procedures used to update the bank with
new data, correct existing data and delete data which are
incorporated in the data bank but have been found to be in
error. There are then three types of transactions which can
be processed by the SAROAD data bank: add, correct, and
delete. In each case data in SAROAD format must be submitted
on a separate tape or set of cards and must be identified both
on the tape and by an accompanying memorandum.
Documentation of each of the transaction types, describing
the processing which the data goes through and indicating the
limitations of each type of transaction has been provided to
the Regional Office by MDAD (Slaymaker's memorandum of June 6,
1973).
The Regional Office should use the previously discussed
procedures to determine if identified suspect data should be
updated, corrected or deleted by means of these transactions.
3.2 Future Areas of Responsibility
Future areas of Regional Office responsibility with
respect to air quality data include:
a. Quality Control
Quality control practices in the operation of air
monitoring instruments, laboratory analysis and data handling
procedures are of the utmost importance in producing valid
air quality data. The Regional Offices should therefore
encourage quality control programs at the State and local
level. To aid the Regional Offices in this effort, the
Quality Assurance and Environmental Monitoring Laboratory,
NERC/RTP, has developed and is developing various manuals
describing in detail the procedures to be followed during
sampling, analysis and data handling for various pollu-
tants.9a,b,c,d
The Control Programs Development Division has developed
a general guideline for State and local quality control pro-
grams entitled "Quality Control Practices in Processing Air
Pollution Samples." This guideline should help the Regional
Office establish a general quality control program at the
State and local level.
b. Edit and Validation Checks
When MDAD develops the data validation programs and turns
both the editor and data validation programs over to the
Regional Offices, it is expected that the Regional Offices
will assume the lead in initiating edit and validation checks
on the incoming data. High quality data should then be trans-
mitted to the National Aerometric Data Bank via upgraded
remote access computer terminals.
4. CURRENT TECHNIQUES FOR SIP PROGRESS EVALUATION
It is difficult to develop comprehensive guidance on exactly how
to determine whether a control strategy will need to be revised. While
there may be a few situations where it is obvious that a plan revision
is necessary, in general it will be a difficult task to determine that a
plan is inadequate to attain the standards prior to the established attain-
ment date. The problem is to determine whether AQCR's are progressing
satisfactorily in relation to the emission limitations contained within
the SIP. To this end, a Plan Revision Management System (PRMS) was
developed to track the progress being made by States in implementing
their SIP. PRMS provides a means for effectively combining information
contained in SAROAD (air quality), NEDS (source emissions), and CDS (enforce-
ment and compliance information) to compare measured progress against
expected progress.
This system is designed to monitor the progress of actual air quality
levels, obtained from the quarterly reports, in relation to the anticipated
air quality reductions which should occur as a result of compliance with
approved emission limitations. If the difference between the observed and
projected air quality levels exceeds certain specified limits, then the
site is "flagged" as a "potential problem." A number of flagging levels
or tolerance limits are incorporated in the system to indicate that the
site either is making acceptable progress or is having a minor, moderate, or major
problem toward attainment of the NAAQS. The tolerance limits were
developed through the application of statistical quality control techniques
which allow for the many variables associated with measured air quality
concentrations. (See Figure 9.)
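The flagging logic can be sketched as a comparison of relative excess against graded tolerance limits. The limit values below are invented; the actual PRMS limits were derived from statistical quality control analysis of measurement variability.

```python
def prms_flag(measured, projected, limits=(0.10, 0.25, 0.50)):
    """Assign a PRMS-style flag from the relative excess of measured
    over projected air quality (limit values invented)."""
    excess = (measured - projected) / projected
    minor, moderate, major = limits
    if excess <= minor:
        return "acceptable progress"
    if excess <= moderate:
        return "minor problem"
    if excess <= major:
        return "moderate problem"
    return "major problem"

assert prms_flag(100, 100) == "acceptable progress"
assert prms_flag(120, 100) == "minor problem"
assert prms_flag(180, 100) == "major problem"
```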
Figure 9
PLAN REVISION MANAGEMENT SYSTEM

[Two panels, not reproduced: particulate matter emissions (1000
tons/year) and air quality, each plotted against calendar years 1970
through 1977, showing the projected air quality curve, tolerance
limits, and measured air quality.]

Steps:
#1 Calculation of emission reduction (NEDS, Emission Regulations)
#2 Review of compliance dates (SIP, CDS, Emission Regulations)
#3 Projection of air quality
#4 Establishment of tolerance limits or boundaries
#5 Measured air quality trend (SAROAD)
Once a "potential problem region" is identified, OAQPS will notify
the appropriate Regional Office. This will be done on a semiannual basis.
The Regional Office will be responsible for investigation and further
assessment of the problem. The Regional Office should also report their
findings to OAQPS indicating the action they have taken or plan to take.
While the PRMS will provide a mechanism to identify "potential problem
regions" from an analytical point of view, the Regional Offices should be
more intimately aware of the status of Regions within their States. Thus,
the Regional Offices may be aware of other AQCR's not currently being
analyzed by the PRMS which should be reviewed to determine if the plan is
adequate to attain the NAAQS by the specified date for attainment.
Initially, there are 17 AQCR's contained in the PRMS. An additional
50 Regions were included in the system in January 1974; these
Regions were selected based on recommendations of the Regional
Offices as to those AQCR's which should be reviewed to insure
that adequate progress is being made toward attainment of the standards.
By mid-1974, 50 more AQCR's are scheduled to be included in the PRMS. Thus,
by July 1974, 117 Regions will be analyzed. The Regional Offices should
indicate to OAQPS those AQCR's that they believe should be reviewed to
determine the possible need for plan revisions.
It is understood that air quality levels throughout an AQCR are
highly variable and that each monitoring site within the region must
have levels at or below the national standards by the specified date
for attainment to be in compliance with the Act. The PRMS analyzes all
monitoring sites within SAROAD for the particular AQCR in question to
determine if adequate progress is being made. Thus, the system is capable
of defining the problem on a much smaller scale than the entire AQCR.
While most of the region may be showing adequate progress, a few sites,
located in areas of maximum concentration, may be deviating from the
desired air quality levels. Review of these sites will allow the Agency
to take a much closer look at the real problem areas. Because the R.O.
may only be required to review a very few problem sites, more effort can
be placed upon those areas within an AQCR which appear to be having the
most difficulty in attaining the standards. It is believed at this time
that it will not be necessary in most cases to require a major plan
revision for an entire AQCR. The revision or additional action can be
tailored to a minimum number of sources to give the maximum amount of
benefit toward attainment of the standards. Thus, a review to determine
the adequacy of the progress for a region should be done on a site by site
basis. The following two pages present the PRMS responsibilities and the
associated action procedures.
ACTION PROCEDURES
A. Data Review Actions
1. The air quality data should be reviewed and work should pro-
ceed to certify the data if possible.
2. The monitoring site should be visited to determine if the
monitor is properly located.
3. The meteorological conditions associated with the sampling
period in question should be reviewed to determine if any
abnormal conditions could have affected the air quality
levels.
4. If the site location is source oriented, a unique projected
curve for that site should be developed to better analyze
the data.
5. A more detailed projected curve should be developed for the
entire air quality control region.
B. Program Actions
1. A review of the compliance schedules for the AQCR should be
conducted to determine if any sources have failed to meet any scheduled
milestones or final compliance dates.
2. The State should be notified that a more effective implementation of
the new source review procedures is needed to restrict growth in
certain areas.
3. A special study should be initiated to determine the cause of the
present air quality problem and the results are expected by .
C. Legal Actions
1. EPA/State enforcement action is necessary.
2. Plan revision is determined to be necessary and the State has
been notified of the need for the revision.
3. The State's plan revision has not been submitted or approved
and work has been initiated by EPA to develop the necessary
revision.
PRMS Responsibilities
OAQPS Responsibilities
o Calculate initial emission/time curve
o Develop initial projected air quality curve (Proportional model)
o Perform the computer analysis of measured vs projected air quality
o Notify each Regional Office of possible deficiencies
o Prepare a summary of the PRMS analysis for the Administrator's
Progress Report
o Offer technical assistance to the Regional Office in investigating
identified deficiencies
o If requested, rerun computer analysis with additional data provided
by the Regional Office
Regional Office Responsibilities
o Investigate areas with possible deficiencies
o Inform OAQPS of the results of the investigation
o If a new projected air quality curve is determined to be necessary,
it should be developed by the R.O.'s and submitted to OAQPS for
a rerun of the PRMS analysis.
o If a plan revision is determined to be necessary by the R.O., inform
the State of the type of revision necessary to correct the plan
deficiency.
REFERENCES
1. SAROAD Users Manual, Office of Air Programs Publication No. APTD 0663,
EPA, Research Triangle Park, N.C., July 1971.
2. SAROAD Terminal User's Manual, Office of Air Programs, Publication
No. EPA-450/2-73-004, EPA, Research Triangle Park, N.C., October 1973.
3. "Field Operations Guide for Automatic Air Monitoring Equipment,"
Office of Air Programs, Publication No. APTD 0736, EPA, Research
Triangle Park, N.C., November 1971.
4. "Guidelines for Technical Services of a State Air Pollution
Control Agency," Office of Air Programs, Publication No. APTD 1347,
EPA, Research Triangle Park, N.C., November 1972.
5. "Quality Control Practices in Processing Air Pollution Samples,"
Office of Air Programs, Publication No. APTD 1132, EPA, Research
Triangle Park, N.C., March 1973.
6. Federal Register, Vol. 36, No. 228, November 25, 1971, page 22404.
7. Lowry, W.P. and R.W. Boubel, "Meteorological Concepts in Air
Sanitation," Type-Ink., Corvallis, Oregon, 1967.
8. Symposium; Air Over Cities, Public Health Service, SEC Technical
Report A-62-5, Cincinnati, Ohio, November 1961.
9. Guidelines for Development of a Quality Assurance Program, Office of
Research and Monitoring, Quality Assurance and Environmental Monitoring
Laboratory, Publication No. EPA-R4-73-028, EPA, Research Triangle
Park, N.C., June 1973.
a. Reference Method for the Continuous Monitoring of Carbon
Monoxide in the Atmosphere.
b. Reference Method for the Determination of Suspended Particulates
in the Atmosphere (High Volume Method).
c. Reference Method for Measurement of Photochemical Oxidants.
d. Reference Method for the Determination of Sulfur Dioxide in
the Atmosphere.
10. OAQPS No. 1.2-011, Guidelines for Determining the Need for Plan Revisions
to the Control Strategy Portion of the Approved SIP.
11. Plan Revision Management System, System Summary, May 1974, USEPA, OAQPS,
CPDD, Research Triangle Park, N.C.