FINAL REPORT
U.S. ENVIRONMENTAL PROTECTION AGENCY
Office of Air and Waste Management
Office of Air Quality Planning and Standards
Research Triangle Park, North Carolina 27711
-------
EPA-450/3-75-070
SOTDAT
FINAL REPORT
by
TRW
Transportation and Environmental Engineering Operations
800 Follin Lane, SE
Vienna, Virginia 22180
Contract No. 68-02-1007, Task 3
EPA Project Officer: Gregory Bujewski
Prepared for
ENVIRONMENTAL PROTECTION AGENCY
Office of Air and Waste Management
Office of Air Quality Planning and Standards
Research Triangle Park, North Carolina 27711
July 1975
-------
This report is issued by the Environmental Protection Agency to report
technical data of interest to a limited number of readers. Copies are
available free of charge to Federal employees, current contractors and
grantees, and nonprofit organizations - as supplies permit - from the
Air Pollution Technical Information Center, Environmental Protection
Agency, Research Triangle Park, North Carolina 27711; or, for a fee,
from the National Technical Information Service, 5285 Port Royal Road,
Springfield, Virginia 22161.
This report was furnished to the Environmental Protection Agency by
TRW, Transportation and Environmental Engineering Operations,
Vienna, Virginia 22180, in fulfillment of Contract No. 68-02-1007. The
contents of this report are reproduced herein as received from TRW,
Transportation and Environmental Engineering Operations. The opinions,
findings, and conclusions expressed are those of the author and not
necessarily those of the Environmental Protection Agency. Mention of
company or product names is not to be considered as an endorsement
by the Environmental Protection Agency.
Publication No. EPA-450/3-75-070
-------
1.0 INTRODUCTION
1.1 GENERAL DESCRIPTION OF THE SOURCE TEST DATA (SOTDAT) SYSTEM
Throughout the country, there is a vast amount of source test data
which has been compiled in recent years. These data are on file in EPA
offices, both in Durham and in the regions, in state and local control
agency offices, with private consultants who have conducted stack tests,
industrial plants where tests have been run, control equipment manufactur-
ers, and others. Up until now, these data have been of little use to any-
one needing a large amount of data, because they are stored in so many
different places and formats.
The Source Test Data (SOTDAT) System is a useful solution to that
problem. The SOTDAT System permits the gathering of source test data from
many places and their storage in a computer-accessible data bank in a com-
mon format. SOTDAT is designed so that each record describes, in detail,
one run of a stack test. Variables included are most of those which enter
into the normal stack test calculations, as well as some which will be
necessary to future users of SOTDAT. Information stored in SOTDAT contains
an adequate number of source parameters (e.g. plant name, location, stack
height, etc.) and concentrates heavily on data describing a specific test
run. Since each SOTDAT record is keyed to a record in the National Emis-
sions Data System (NEDS), any required source parameters are readily avail-
able from a NEDS listing. An exception to this will exist in the case where
test data are coded anonymously in order to protect the confidentiality of
the data. For a complete list and description of the SOTDAT variables, see
the August 1973 National Air Data Branch publication "Source Test Data Sys-
tem (SOTDAT)" which describes in detail each data element.
1.2 VALUE OF SOTDAT INFORMATION
The data contained in the SOTDAT System will be useful for many
purposes. The single fact which makes these data so useful is that, in-
stead of being a mixture of measured, calculated, and estimated data as
NEDS is, SOTDAT is composed entirely of measured data. This greatly in-
creases the reliability of any deductions based on data from SOTDAT.
-------
The most immediate use to which SOTDAT will be put is to validate
and/or correct existing emission factors, and to create new ones in areas
where factors have not yet been compiled. In conjunction with this use,
SOTDAT could probably be used as a validity check on the NEDS system.
Estimated emissions in NEDS which are grossly inconsistent with SOTDAT-
generated factors could be flagged for further investigation.
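The validity check described above can be sketched in a few lines of Python. This is an illustration only, not part of the SOTDAT or NEDS software: the function name, the record layout, and the factor-of-ten threshold used to mean "grossly inconsistent" are all assumptions made for the example.

```python
def flag_inconsistent(neds_estimates, sotdat_factors, max_ratio=10.0):
    """Return NEDS point IDs whose estimated emissions disagree with the
    SOTDAT-based expectation by more than max_ratio in either direction.

    neds_estimates: list of (point_id, source_category, process_rate,
                    estimated_emissions) tuples (hypothetical layout)
    sotdat_factors: dict mapping source_category to an emission factor
                    (emissions per unit of process rate)
    """
    flagged = []
    for point_id, category, process_rate, estimated in neds_estimates:
        factor = sotdat_factors.get(category)
        # Skip records that cannot be compared meaningfully.
        if factor is None or factor <= 0 or process_rate <= 0 or estimated <= 0:
            continue
        expected = factor * process_rate
        ratio = estimated / expected
        if ratio > max_ratio or ratio < 1.0 / max_ratio:
            flagged.append(point_id)
    return flagged
```

Records with no matching SOTDAT-generated factor are skipped rather than flagged, since the check is only meaningful where measured data exist.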
Another use for data in SOTDAT is the development of accurate methods
for calculating control device efficiencies based on specific operating
parameters. These parameters are part of the SOTDAT data base.
A system which contains the type of basic, fundamental data that
SOTDAT does is sure to become extremely valuable in the future. Data which
deal with actual (not estimated or calculated) emissions from specific
pollution sources are certainly more valuable than what has been available
thus far. Hopefully, the individuals charged with maintaining the SOTDAT
system will be sensitive to the needs for these data, and will remain flexible
enough to implement changes as they are needed.
-------
2.0 DATA ACCUMULATION
2.1 DATA ON FILE IN THE EMISSIONS MEASUREMENT BRANCH OF EPA
Many source tests have been performed by personnel of EPA's Emissions
Measurement Branch (EMB) or by EMB-obtained private contractors. Results
from many of these tests are also on file in the National Air Data Branch
(NADB), and were therefore available for removal from Durham. The data
from these 155 reports were coded onto SOTDAT coding forms in TRW's McLean,
Va. office. This effort produced 1292 completed coding forms.
Following the data validation process described in Section 3.0, the
data from another 9 test reports (submitted to NADB after the original
coding effort) were entered on 109 SOTDAT coding forms.
Another 26 test reports were on file in the EMB office but not in the
NADB. Since these reports could not be removed, the data they contained were
coded in Durham. This additional data generated 209 completed forms.
2.2 DATA ON FILE IN THE EMISSION STANDARDS AND ENGINEERING DIVISION OF EPA
The Emission Standards and Engineering Division of EPA has a file of
incinerator test results located in the IRL Building in Durham. These are
test reports which have been submitted to EPA in an attempt to obtain EPA
certification for a specific model of an incinerator. The file contains
reports on incinerators which have received certification as well as those
which have been unable to meet the certification standards. Data from 68
reports were coded onto 173 coding forms during the data accumulation effort
expended in this location.
2.3 SUMMARY OF RESULTS
All in all, during the project, 190 source test reports were read;
data were extracted from them and coded onto 1607 SOTDAT coding forms.
The data now present in the SOTDAT system comprise a relatively good
cross section of most types of industries. However, since the majority of
the tests were performed to accumulate data to be used in the establishment
of New Source Performance Standards, the SOTDAT data base may presently be
biased toward the better controlled, or more efficient sources.
-------
3.0 DATA VALIDATION
3.1 NEED FOR VALIDATION
After the coding effort was completed, the data were keypunched and
loaded into the computer. The resulting output from the system revealed
that several problems existed either in the input data, or in the computer
program. It was decided that for the SOTDAT System to be a truly useful
tool, it would be necessary to rectify as many of the existing problems
as possible.
3.2 DATA VALIDATION EFFORT
The apparent errors noted during the initial brief examination of the
computer output included missing data, erroneous values, and unexplainable
printed symbols in the place of data. The approach employed to identify
and correct the errors involved reviewing each output record (one per input
coding form), and checking for noticeable errors of the type listed above.
Whenever suspected errors were discovered, it was necessary to deter-
mine whether an error actually existed, and if so, note it appropriately
for later correction. This was accomplished by checking each entry which
appeared to be wrong against the original coding form, and then, if neces-
sary, against the data in the stack test report. True errors were noted
directly on the output.
At the time this effort was taking place, it was impossible to ascer-
tain definitive procedures for updating the data stored in the computer, so
the changes were made directly on the original coding forms. This retained
the maximum flexibility since either the entire form could be repunched, or
just the card or cards which required correction. In all, approximately
300 coding forms were found to contain errors.
Examination of the apparent errors demonstrated many general problems
with the computer program, and generated several suggestions for improving
the system. They were noted during the validation process and are discussed
in Appendix B.
-------
4.0 PROBLEM AREAS
4.1 QUESTIONS ARISING DURING THE PROJECT
The instruction manual supplied by the EPA Project Officer was very
complete and made a very successful attempt to deal with all problems
which might occur. However, a few questions required answers which were
not available from the manual. These questions, along with their answers,
were documented as they arose, and a copy was given to the Project Officer
at the completion of the task (see Appendix A). The problems raised by
the questions should be considered prior to any future revision to the
coding procedure manual.
Probably the most serious problem deals with the case where an ex-
haust gas stream from a single pollution-producing piece of equipment is
split into two or more streams, not all of which are sampled. In this situ-
ation, there is no correlation between the process rate for the piece of
equipment and the emissions as determined by the test. Either the process
rate must be reduced a proportionate amount, or the emissions increased. The
problem is what (if any) apportioning factor to use.
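The two alternatives named above (reduce the process rate, or increase the emissions) can be illustrated with a short Python sketch. The report leaves the choice of apportioning factor open; the sketch assumes, purely for illustration, that the factor is the fraction of total volumetric flow carried by the sampled streams, which is reasonable only if pollutant concentrations are similar in all streams. The function names are hypothetical.

```python
def sampled_flow_fraction(sampled_flows, unsampled_flows):
    """Fraction of the total exhaust flow that passed through sampled streams."""
    total = sum(sampled_flows) + sum(unsampled_flows)
    if total <= 0:
        raise ValueError("total flow must be positive")
    return sum(sampled_flows) / total


def apportion(process_rate, measured_emissions, fraction):
    """Return (reduced process rate, scaled-up emissions) for a given fraction.

    Either value can be paired with its unadjusted counterpart; both
    choices lead to the same emissions-to-process-rate ratio.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return process_rate * fraction, measured_emissions / fraction
```

For example, if one sampled stream carries 3000 flow units and one unsampled stream carries 1000, the fraction is 0.75; a process rate of 20 would be reduced to 15, or measured emissions of 6 would be increased to 8.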
Another problem applicable almost exclusively to the EMB data was the
lack of process and control equipment efficiency data. Without these two
data elements, emission factors cannot be calculated. Some effort should be
made to insure that these data are taken during a test, and, equally impor-
tant, that they are included in the report.
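To show why both data elements are needed, the following Python sketch computes emission factors from a single test run using the conventional definitions (emissions per unit of process rate, with the uncontrolled factor obtained by dividing out the control efficiency). The function name and the units are assumptions for illustration; they are not taken from the SOTDAT coding manual.

```python
def emission_factors(emission_rate, process_rate, control_efficiency):
    """Return (controlled, uncontrolled) emission factors for one test run.

    emission_rate:      measured emissions, e.g. lb/hr (assumed units)
    process_rate:       process activity during the run, e.g. tons/hr
    control_efficiency: fractional efficiency of the control equipment
                        upstream of the sampling point (0 <= value < 1)
    """
    if process_rate <= 0:
        raise ValueError("process rate is required and must be positive")
    if not 0 <= control_efficiency < 1:
        raise ValueError("control efficiency must be in [0, 1)")
    controlled = emission_rate / process_rate
    uncontrolled = controlled / (1.0 - control_efficiency)
    return controlled, uncontrolled
```

Without the process rate the first division cannot be made, and without the control efficiency the measured result cannot be related back to uncontrolled emissions.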
-------
5.0 SUGGESTED FUTURE EFFORT
The two most promising areas for obtaining additional data are proba-
bly the individual state control agency offices, and the control equipment
manufacturers.
o State Control Agency Offices - Although there are probably fewer
data available from the state offices, they will certainly be
easier to obtain. Some states have already expressed an interest
in having their data coded into SOTDAT, and it seems unlikely
that other states would refuse to make their data available. All
states will, however, resent having to supply the manpower neces-
sary for the coding effort.
o Control Equipment Manufacturers - Control equipment manufacturers
usually conduct an inlet and outlet stack test whenever a new piece
of equipment is delivered, to insure that the guaranteed efficiency
is being met. Therefore there is a large amount of test data in
existence, but the manufacturers are extremely reluctant to release
the data without first making them anonymous. They are afraid of
releasing any proprietary information about their customers. How-
ever, the great amount of data available, and the usefulness in
evaluating control devices may justify the additional time and
expense required to obtain it.
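The inlet and outlet tests mentioned above yield a collection efficiency directly. A minimal Python sketch follows, assuming both tests report the pollutant mass rate in the same units; the function names are hypothetical and the guarantee is expressed as a fraction.

```python
def collection_efficiency(inlet_rate, outlet_rate):
    """Fractional efficiency from simultaneous inlet and outlet tests."""
    if inlet_rate <= 0:
        raise ValueError("inlet rate must be positive")
    return (inlet_rate - outlet_rate) / inlet_rate


def meets_guarantee(inlet_rate, outlet_rate, guaranteed_efficiency):
    """True if the measured efficiency is at least the guaranteed value."""
    return collection_efficiency(inlet_rate, outlet_rate) >= guaranteed_efficiency
```

An inlet loading of 200 and an outlet loading of 10 gives an efficiency of 0.95, which would satisfy a 0.90 guarantee.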
-------
6.0 SUMMARY
The effort expended on this project has produced a sizable data bank
of SOTDAT data, and the data obtained are a good representation of most
types of pollution sources. However, this effort only scratches the surface
of what is available. Many other sources of data are available in addition
to those discussed in the preceding section. Some of these are private
consulting firms which have conducted tests, industrial trade associations,
plants which have either done their own testing or contracted for required
tests, and other government agencies which have conducted tests in connec-
tion with research and development projects or the preparation of environ-
mental impact statements and/or permit applications. Since the SOTDAT sys-
tem has the potential to accept a very large amount of additional data, and
since there exists a virtually unending supply of data, the data accumula-
tion can continue far into the future, constantly improving and increasing
the capabilities and value of the system.
-------
APPENDIX A
QUESTIONS AND ANSWERS CONCERNING SOTDAT CODING
During the initial SOTDAT coding effort, a list of questions was
compiled, the answers to which could not be determined from the manual
of coding procedures. Those questions are presented here along with the
answers supplied by the Project Officer. It is hoped that this will fa-
cilitate future coding of source test data by persons unfamiliar with the
system.
Q. Are Orsat analyses considered as test results for coding on
C cards?
A. No. These data are to be entered in field B 10.
Q. Should control devices listed on D cards be all devices on the
piece of equipment or just those indicated in field C 05?
A. Only those in C 05.
Q. If a device control efficiency is unknown, what code should
be used?
A. Use the code for a medium efficiency device.
Q. What pollutant code should be used for total gaseous hydro-
carbons, since "total" is listed under aliphatic compounds
and "gross" is listed under aromatic compounds?
A. Use code 3101.
Q. Are gaseous samples which are taken non-simultaneously with a
particulate sample considered part of the same run?
A. If at least one-half of the gaseous sample was taken during the
particulate sample, they are considered to be part of the same
run. If not, code it on a separate form even though most process
stream parameters will be unavailable.
Q. If a traverse point is sampled more than once during a particu-
late run, how many times is it counted for coding in field B 06?
A. Only once.
Q. If a test is actually performed, and the result is nil or zero
(below the detection limit for the method used) should the test
be recorded?
A. Yes. Enter the result as zero.
-------
Q. Is the code for particulate caught by a control device to be
"total particulate", "filterable particulate", or "condensable
particulate"?
A. Use code "A 1101" (total particulate), because most particulate
devices are designed for controlling both filterable and con-
densable fractions, and design efficiencies (field D 02) are
usually given in terms of total particulate. However, if de-
sign efficiencies are given for the other particulate fractions,
they should also be entered alongside their respective pollu-
tant codes ("B 1101" for "filterable particulate", and "C 1101"
for "condensable particulate").
Q. What should be done with data that are either too large or small
to "fit" in the field(s) allotted for them on the coding form?
A. Enter in "Comments" (Section E). Fill the appropriate field(s)
with nines.
Q. How does one enter a negative pollutant temperature?
A. Leave field C 07 blank, and write the true temperature in "Comments".
Q. If effective duct cross-sectional area is different from the de-
sign area (due to negative flow or sediment build up) which should
be entered in field B 03?
A. Enter effective area in field B 03 and write the actual area in
"Comments".
Q. If a single stack, fed by several gas streams, each containing a
different number of control devices is sampled, how many devices
are considered to be upstream from the sampling point?
A. The number of devices found in the stream containing the largest
number of devices is used. Indicate in comments.
Q. Are operating parameters (field D 05) "operating" or "design"
values?
A. Operating. No design data are to be entered in this field.
Q. If the exhaust from a single piece of equipment breaks into two
or more separate gas streams, and both streams are tested, what
values are entered in fields A 11 and A 12 (activity levels)?
A. None. Leave those fields blank and enter activity levels in "Com-
ments" along with a statement such as: "This form contains test
data from one of three stacks. See form numbers ______ and ______
for data on the other stacks".
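Two of the coding rules above lend themselves to a short illustration: the one-half overlap rule for deciding whether a gaseous sample belongs to the same run as a particulate sample, and the rule that a value too large for its field is replaced by nines with the true value written in "Comments". The Python sketch below is not part of any SOTDAT program; the function names, the use of minutes as the time unit, and the right-justification of field contents are assumptions.

```python
def same_run(gas_start, gas_end, part_start, part_end):
    """True if at least one-half of the gaseous sampling time fell within
    the particulate sampling interval (times in minutes, assumed)."""
    gas_duration = gas_end - gas_start
    if gas_duration <= 0:
        raise ValueError("gaseous sample must have positive duration")
    overlap = max(0, min(gas_end, part_end) - max(gas_start, part_start))
    return overlap >= gas_duration / 2


def code_field(value, width):
    """Return (field_text, comment) for a value coded in a fixed-width field.

    A value that fits is right-justified (an assumed convention). A value
    that does not fit is coded as all nines, and the actual value is
    returned as text for the "Comments" section.
    """
    text = str(value)
    if len(text) <= width:
        return text.rjust(width), None
    return "9" * width, f"actual value: {value}"
```

A gaseous sample taken from minute 30 to minute 90 against a particulate sample from minute 0 to minute 60 overlaps for exactly half its duration and so counts as the same run.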
-------
APPENDIX B
OBSERVATIONS ON THE SOTDAT SYSTEM
During the data validation process, several items (some essential and
some not) came to mind concerning ways to improve the SOTDAT System.
These were noted at the time, and are discussed in this appendix.
1. The original EPA Project Officer directed that instead of using a
great amount of time writing the plant name and address on each form,
the name and address be written only on the first form of a series of
tests at a plant, and the form number of that first form be written
in place of the name and address on subsequent forms. It was sup-
posed to be included in the keypunching instructions that the name
and address from the first form be duplicated on the subsequent
forms, but the instruction was apparently either not given or mis-
understood. Therefore on forms with a form number greater than
A 00390, the form number of the first form for a series of tests
appears in the output as the name and address on subsequent forms.
2. Test results are coded three per C card. If the computer finds data
in the first test results fields it expects to find data in the re-
maining fields. Therefore fields which are specified as requiring
numeric data and are left blank are interpreted as containing illegal
characters, and are printed out as ampersands.
3. Related to the previous problem is the problem of how to treat un-
known data. All data in the system now were coded assuming (as is the
case with NEDS) that unknown data should be left blank, while data
with a numeric value of zero should be coded as a zero. Both types
of entries are printed out as zero in cases where blanks are legal
characters for the field, and ampersands where they are illegal.
Some of the fields where they are illegal are: "Control Device Year
Installed", "Sampling Location", "Flow Rate", "Flow Rate Units", and
"Test Method".
4. On almost all records, the NEDS ID data (State, County, AQCR, Plant,
and Point) are incomplete. These should be as complete as possible
due to the fact that these items are used by the computer, along with
run number, to sort and group the stored data. It was decided during
the original coding effort that contractor time could be better spent
coding data, leaving the NEDS/SOTDAT correlation for NADB personnel.
Based on instructions from the project officer, this is the approach
which was taken. Determination of the NEDS ID data is a matter of
taking the plant's city and state from the SOTDAT form, going to an
atlas and looking up that state and city in the index. From the index,
the county can be determined. Then the AQCR can be found in AP-102.
After the state, county and AQCR NEDS codes have been found, then the
NEDS Plant ID can be determined by checking for that plant's name in
a listing of NEDS sources. This of course will be successful only
if the plant in question has been input into NEDS (not the case for
test data from foreign plants, or for data which are to be input into
SOTDAT anonymously).
5. Frequently, extra zeroes randomly appear at the beginning of some of
the output data fields. Some of these are: "Process Rate" (both capa-
city and "This Run"), "Test Result," "Cross Section Area," and "Flow
Rate".
-------
6. The output would be cleaner and easier to read if some or all of the
unused fields (all printed as zero) were suppressed.
7. One digit is often not sufficient to code "Sampling Location". In
those cases, to prevent the ampersand when left blank, a nine was
coded in that field, and the actual value was written in "Comments".
For future efforts, however, it is probably unnecessary to enter
the actual value in "Comments" since only the fact that the value
is greater than seven is of any significance.
8. Where there is more than one control device entered on a form, the
computer drops the pollutant codes for all devices past the first one.
9. During the original coding effort, two forms were inadvertently
coded with form number A 00098. It seems unlikely that either one
of them is currently stored in the computer. Additionally, the A 00098
form for the Wood River Power Plant should have 77.32 instead of 30.44
coded in the "Gas Pressure" field. No attempt was made to correct
this problem since the proper procedure for correction was unknown.
10. When trace metal sampling results are to be coded in field C 06, the
results will often be too small to enter in the field. It is suggested
that another Units code be adopted for field C 03 to represent milli-
micrograms per cubic meter.
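Items 2 and 3 above turn on the difference between a blank field (unknown) and a coded zero (a measured value of zero). The following Python sketch shows one way a program could keep the two distinct on input and output. It is an illustration under assumed conventions (fixed-width text fields, right-justified output), not a description of the existing SOTDAT program.

```python
def parse_numeric_field(raw):
    """Blank field -> None (unknown); otherwise the numeric value.

    A field coded as zero is returned as 0.0, so it stays
    distinguishable from a field that was left blank.
    """
    text = raw.strip()
    if text == "":
        return None
    return float(text)


def format_numeric_field(value, width):
    """Print unknown values as blanks rather than as zero or a symbol."""
    if value is None:
        return " " * width
    return f"{value:g}".rjust(width)
```

With this treatment, unknown data print as blanks, a true zero prints as "0", and leading zeroes in the stored field do not carry through to the output.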
-------
TECHNICAL REPORT DATA
(Please read Instructions on the reverse before completing)
1. REPORT NO.: EPA-450/3-75-070
2.
3. RECIPIENT'S ACCESSION NO.:
4. TITLE AND SUBTITLE: SOTDAT Final Report
5. REPORT DATE: July, 1975
6. PERFORMING ORGANIZATION CODE:
7. AUTHOR(S):
8. PERFORMING ORGANIZATION REPORT NO.: 96005.003
9. PERFORMING ORGANIZATION NAME AND ADDRESS:
TRW
Transportation and Environmental Engineering Operations
800 Follin Lane, SE
Vienna, Virginia 22180
10. PROGRAM ELEMENT NO.:
11. CONTRACT/GRANT NO.: 68-02-1007
12. SPONSORING AGENCY NAME AND ADDRESS:
U.S. Environmental Protection Agency
Office of Air Quality Planning & Standards
Research Triangle Park, N. C. 27711
13. TYPE OF REPORT AND PERIOD COVERED: Final Report
14. SPONSORING AGENCY CODE:
15. SUPPLEMENTARY NOTES:
16. ABSTRACT
Throughout the country, there is a vast amount of source test data which has
been compiled in recent years. Up until now, these data have been of little use
to anyone needing a large amount of data, because they are stored in so many
different places and formats.
The Source Test Data (SOTDAT) System is a useful solution to that problem. The
SOTDAT System permits the gathering of source test data from many places and their
storage in a computer-accessible data bank in a common format. SOTDAT is designed so
that each record describes, in detail, one run of a stack test. Variables included
are most of those which enter into the normal stack test calculations, as well as
some which will be necessary to future users of SOTDAT. Information stored in SOTDAT
contains an adequate number of source parameters and concentrates heavily on data
describing a specific test run. Since each SOTDAT record is keyed to a record in the
National Emissions Data System (NEDS), any required source parameters are readily
available from a NEDS listing. An exception to this will exist in the case where
test data are coded anonymously in order to protect the confidentiality of the data.
For a complete list and description of the SOTDAT variables, see the August 1973
National Air Data Branch publication "Source Test Data System (SOTDAT)" which
describes in detail each data element.
17. KEY WORDS AND DOCUMENT ANALYSIS
a. DESCRIPTORS
b. IDENTIFIERS/OPEN ENDED TERMS
c. COSATI Field/Group
SOTDAT
NEDS
Emission Factors
18. DISTRIBUTION STATEMENT: Release Unlimited
19. SECURITY CLASS (This Report): Unclassified
20. SECURITY CLASS (This page): Unclassified
21. NO. OF PAGES: 15
EPA Form 2220-1 (9-73)
-------