One Point Quality Control Single Point Precision and
Bias Graphics
For Criteria Pollutants
Companion Document
Prepared by the United States Environmental Protection Agency
Office of Air and Radiation
Office of Air Quality Planning and Standards
Air Quality Assessment Division
Ambient Air Monitoring Group
Research Triangle Park, North Carolina

-------
Table of Contents
1	BACKGROUND
2	REPORT DESCRIPTION
2.1	COMPONENTS OF THE GRAPHS
2.1.1	Overview
2.1.2	Data Grouping
2.1.3	Supplemental Statistics
2.1.4	Box and Whisker Plots
2.1.5	95% Probability Limits
2.2	DEFINITIONS FOR EACH OF THE DISPLAYED ITEMS
2.2.1	Region
2.2.2	State
2.2.3	Primary Quality Assurance Organization (PQAO)
2.2.4	Parameter
2.2.5	Monitor Type Classification
2.2.6	Year
2.2.7	AQS Site ID
2.2.8	Pollutant Occurrence Code
2.2.9	CV Upper Bound
2.2.10	Bias
2.2.11	# Obs
2.2.12	Method
2.2.13	Quartiles (Q1, Q2, Q3)
2.2.14	Mean
2.2.15	Whiskermin and Whiskermax
2.2.16	Outliers
2.2.17	95% CFR Upper/Lower Probability Limits
2.2.18	Percent Difference

-------
One Point Quality Control Precision and Bias Graphics For Criteria Pollutants
Date: 9/9/2016
1 Background
In order to provide decision makers with data of adequate quality, OAQPS uses the Data
Quality Objective (DQO) process to determine the data quality needs for the ambient air criteria
pollutants. Some data quality indicators, such as precision, bias, and completeness,
directly affect the attainment of the DQOs. These indicators need to fall within acceptable
ranges (called measurement quality objectives) in order to make decisions (such as
comparison with the NAAQS) with specified levels of confidence. 40 CFR Part 58 Appendix A
provides the minimum requirements for the collection and reporting of data to assess the data
quality indicators of precision, bias, and completeness. On an annual basis, the Ambient Air
Monitoring Group (AAMG) develops summary reports on these data quality indicators.
In 2006 OAQPS revised 40 CFR Part 58 Appendix A in order to base the precision and bias
measurement quality objectives on confidence intervals at the site level of data aggregation.
Because the criteria pollutant data are used for very important decisions (comparison to the
NAAQS), providing precision and bias estimates at upper confidence limits gives
a higher probability of making appropriate decisions. These statistics provide a conservative
approach to measuring precision and bias. A document describing these statistics is available on
AMTIC (http://www.epa.gov/ttn/amtic/parslist.html).
Estimates of both bias and precision for the four automated gaseous methods (CO, NO2, O3, and
SO2) are derived from the bi-weekly one-point QC checks. Since every site should perform the
QC checks at an acceptable frequency, there is enough information to assess and control data
quality at the site level.
In 2005, OAQPS developed a new report in AQS (AMP255 - P/A Quality Indicator Summary
Report) that summarized precision, bias, and completeness of the required QC data for each
criteria pollutant. The data tables may be generated at any time within the AQS application using
the standard report. Earlier reports also presented these data graphically, which monitoring
organizations found very useful. Since AQS presently does not have this capability, AAMG
provides these graphs annually as an addendum to the summary tables. The AMP255 was later
updated and replaced by the AMP256 report.
This document defines the elements displayed in the corresponding box and whisker graphics as
well as base assumptions of the data contained within the AMP256 report.
1.1 Generating the Report
Go to https://www.epa.gov/air-data/single-point-precision-and-bias-report, which opens
the Single Point Precision and Bias Report page (Figure 1). From this page you can select
an individual gaseous criteria pollutant or all four. You can then select a year and one of three domains: 1) an
EPA Region, 2) a State, or 3) a PQAO. The use of the "Bounds for graph" option is discussed in section
2.2.19. Once you have made your selections, click the "Plot Data" button.
[Screenshot of the Single Point Precision and Bias Report page on AirData. The page states that the report provides monitor-level precision and bias summaries for the specified parameter and year, and offers four selections: 1. Pollutant (default "All four pollutants"), 2. Year, 3. Domain (an EPA Region, a State, or a PQAO), and 4. Bounds for graph (lower and upper; leave the defaults to plot the full range of data), followed by the "Plot Data" button.]
Fig 1. Selection Criteria for Box and Whisker Plots
2 Report Description
2.1 Components of the Graphs
2.1.1 Overview
Each graph presented in the report comprises four parts, each of which is discussed in the
following sections. The four parts of each graph are as follows:
•	Data Grouping
•	Supplemental Statistics
•	Box and Whisker Plots
•	95% CFR Probability Limits
Figure 2 illustrates how these different components appear within each graph.
[Example report page annotated with callouts for Selection Criteria, Data Grouping, Supplemental Statistics, Box & Whisker, and 95% Probability Limits.]
Figure 2. Components of Box and Whisker Plots
A given page will display up to 12 box plots.
2.1.2	Data Grouping (upper right hand corner)
Each page of the report displays the results for a particular data grouping. A "data grouping" is
defined by a unique combination of Domain (Region, State, or PQAO) and Monitor Type
Classification. However, once the report is generated, the data are output by PQAO and monitor
type classification. For example, if one selected a state that had four PQAOs, all with the SLAMS
monitor type classification, and two of those PQAOs also had the "SPM" monitor type
classification, the report would display a total of six groupings. Each report identifies the
number of monitors in that group as well as the number of pages in the group. If a PQAO has
more than 12 monitors measuring the same pollutant for the same monitor type classification, the
graphs will appear on multiple pages.
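The grouping and pagination rules above can be sketched in Python. This is only an illustration under stated assumptions, not the report's actual implementation: the PQAO names, site labels, and the helper names `count_groupings` and `pages_needed` are hypothetical, and a single pollutant is assumed.

```python
import math

MAX_PLOTS_PER_PAGE = 12  # the text states that a page displays up to 12 box plots


def count_groupings(monitors):
    """Count unique (PQAO, monitor type classification) pairs."""
    return len({(pqao, mtype) for pqao, mtype, _site in monitors})


def pages_needed(n_monitors):
    """Pages required to show n_monitors box plots at 12 plots per page."""
    return math.ceil(n_monitors / MAX_PLOTS_PER_PAGE)


# Hypothetical state: four PQAOs with SLAMS monitors, two of which also have SPM monitors.
monitors = [
    ("PQAO-A", "SLAMS", "site-1"),
    ("PQAO-B", "SLAMS", "site-2"),
    ("PQAO-C", "SLAMS", "site-3"),
    ("PQAO-D", "SLAMS", "site-4"),
    ("PQAO-A", "SPM", "site-5"),
    ("PQAO-B", "SPM", "site-6"),
]

print(count_groupings(monitors))  # 6, matching the example in the text
print(pages_needed(12))           # 1
print(pages_needed(13))           # 2
```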
2.1.3	Supplemental Statistics
In addition to the statistics represented in the box and whisker, the following information and
statistics are displayed for each monitor within each data grouping:
•	AQS ID - The plots are sorted by the AQS ID in ascending order.
•	CV Upper Bound
•	Bias Upper Bound
•	# Obs - Number of samples contained within the set
•	Method Designation
The information displayed in this area of the plots would also be found in the AMP256 Report.
2.1.4 Box and Whisker Plots
A "Box and Whisker Plot" is created for each monitor within a reporting organization measuring
a gaseous criteria pollutant (carbon monoxide, nitrogen dioxide, ozone, or sulfur dioxide). A
single box plot is based on the percent relative error statistics from the one-point QC
checks conducted at a single monitoring site for one pollutant within the effective time
period of the selected year. Multiple box plots are displayed within a data grouping. A box plot displays the
following statistics:
•	Q3 (75th Percentile)
•	Q2 (50th Percentile) - Median
•	Q1 (25th Percentile)
•	Arithmetic Mean
•	Whiskermin & Whiskermax: The lowest and highest values, respectively, that are found
within the lower and upper fences. The lower and upper fences are defined as
Q1 - (1.5*IQR) and Q3 + (1.5*IQR), where "IQR" is the difference between
Q3 and Q1.
•	Outliers: All values that fall outside (above or below) the upper and lower fences.
The statistics are represented according to Figure 3.
[Schematic box and whisker plot labeled with: outliers beyond the upper and lower fences, Whiskermax, Q3 (75th percentile), mean, median (50th percentile), Q1 (25th percentile), Whiskermin, the zero line (no bias), and the 95% upper and lower probability limits by PQAO.]
Figure 3 - Components of a Schematic Box and Whisker Plot
2.1.5 95% CFR Limits
The following statistics are calculated across all monitors within the PQAO that share the same
pollutant and monitor type classification:
•	95% CFR Upper Probability Limit
•	95% CFR Lower Probability Limit
The 95% Probability Limits are displayed as dashed lines within the box and whisker
plots.
2.2 Definitions for Each of the Displayed Items
The data grouping is a unique combination of a Region, State, Agency, Pollutant, and Monitor
Type Classification. The data grouping for each box plot appears at the top of each page.
2.2.1	Region
The USEPA Region code associated with a given monitoring site.
2.2.2	State
A 2-character postal abbreviation is used to identify a state in which the monitoring site is
located.
2.2.3	Primary Quality Assurance Organization (PQAO)
This is the description for the PQAO along with its corresponding PQAO code in parentheses.
The PQAO is the organization responsible for the quality assurance information submitted to
AQS.
2.2.4	Parameter
This is the gaseous pollutant under consideration.
2.2.5	Monitor Type Classifications
Each monitor type will be grouped separately. Monitor types include EPA, INDUSTRIAL,
NON-EPA FEDERAL, OTHER, SLAMS, SPM, TRIBAL.
2.2.6	Year
The calendar year in which the data were collected.
2.2.7	AQS Site ID
The AQS Site ID (State code, County code, and Site ID) combined with the Pollutant
Occurrence Code (POC) is used to uniquely identify a monitor within the AQS database. The AQS
ID appears in the top block of statistics as well as along the x-axis of the graph.
2.2.8	Pollutant Occurrence Code (POC)
The identifier used to distinguish between multiple monitors at the same site that are measuring the
same parameter.
2.2.9 CV Upper Bound
Equations in this section come from 40 CFR Part 58, Appendix A, Section 4, "Calculations for Data
Quality Assessment". For each single point check, calculate the percent difference, $d_i$, as follows:
Equation 1
$$ d_i = \frac{\text{meas} - \text{audit}}{\text{audit}} \times 100 $$
where meas is the concentration indicated by the monitoring organization's instrument and audit
is the audit concentration of the standard used in the QC check being measured.
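Equation 1 can be written as a small Python function. This is a minimal sketch; the function name `percent_difference` and the example concentrations are hypothetical.

```python
def percent_difference(meas, audit):
    """Equation 1: percent difference between the measured and audit concentrations."""
    if audit == 0:
        raise ValueError("audit concentration must be non-zero")
    return (meas - audit) / audit * 100.0


# Hypothetical one-point QC check: the analyzer reads 0.072 ppm against a 0.070 ppm standard.
d = percent_difference(0.072, 0.070)
print(round(d, 2))  # 2.86
```

A reading of zero against any non-zero audit concentration gives -100%, the kind of value discussed in section 2.2.19.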
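The precision estimate is used to assess the one-point QC checks for gaseous pollutants
described in section 3.2.1 of 40 CFR Part 58, Appendix A. The precision estimator is the coefficient
of variation upper bound and is calculated using Equation 2 as follows:
Equation 2
$$ CV = \sqrt{\frac{n \sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}{n(n-1)}} \cdot \sqrt{\frac{n-1}{\chi^2_{0.1,\,n-1}}} $$
where n is the number of single point checks being aggregated and $\chi^2_{0.1,\,n-1}$ is the 10th
percentile of a chi-squared distribution with n-1 degrees of freedom.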
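As an illustration, Equation 2 can be sketched in Python. The function name `cv_upper_bound` and the sample data are hypothetical. Because the Python standard library has no chi-squared quantile function, this sketch assumes the caller supplies the 10th percentile from a table or a statistics package.

```python
import math


def cv_upper_bound(d, chi2_10th):
    """Equation 2: upper bound on the coefficient of variation.

    d         -- percent differences from Equation 1
    chi2_10th -- 10th percentile of chi-squared with len(d) - 1 degrees of freedom
                 (supplied by the caller; an assumption of this sketch)
    """
    n = len(d)
    if n < 2:
        raise ValueError("at least two QC checks are needed")
    sum_d = sum(d)
    sum_d2 = sum(x * x for x in d)
    # max() guards against a tiny negative value caused by floating-point rounding
    variance = max(0.0, n * sum_d2 - sum_d ** 2) / (n * (n - 1))
    return math.sqrt(variance) * math.sqrt((n - 1) / chi2_10th)


# Hypothetical percent differences from ten QC checks.
d = [1.2, -0.8, 2.5, 0.3, -1.1, 1.9, 0.7, -0.4, 1.5, 0.9]
# 4.168 is the tabulated 10th percentile of chi-squared with 9 degrees of freedom.
print(round(cv_upper_bound(d, 4.168), 2))  # about 1.73
```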
2.2.10 Bias
The bias estimate is calculated using the one-point QC checks for SO2, NO2, O3, or CO described
in section 3.2.1 of 40 CFR Part 58, Appendix A. The bias estimator is an upper bound on the mean
absolute value of the percent differences (see Equation 1), as described in Equation 3:
Equation 3
$$ |bias| = AB + t_{0.95,\,n-1} \cdot \frac{AS}{\sqrt{n}} $$
where n is the number of single point checks being aggregated; $t_{0.95,\,n-1}$ is the 95th quantile of a
t-distribution with n-1 degrees of freedom; and the quantity AB is the mean of the absolute values of
the $d_i$'s (calculated by Equation 1), expressed as Equation 4:
Equation 4
$$ AB = \frac{1}{n} \sum_{i=1}^{n} |d_i| $$
The quantity AS is the standard deviation of the absolute values of the $d_i$'s and is calculated
using Equation 5 as follows:
Equation 5
$$ AS = \sqrt{\frac{n \sum_{i=1}^{n} |d_i|^2 - \left(\sum_{i=1}^{n} |d_i|\right)^2}{n(n-1)}} $$
2.2.10.1 Sign Association for Absolute Bias
Since the bias statistic as calculated in Equation 3 of this document uses absolute values, it does
not have a tendency (negative or positive bias) associated with it. A sign is designated by
rank ordering the percent differences ($d_i$'s) of the QC check samples from a given site for a
particular assessment interval and identifying the 25th and 75th percentiles of the percent
differences. The absolute bias upper bound is flagged as positive if both percentiles are
positive and as negative if both percentiles are negative. The absolute bias upper bound is not
flagged if the 25th and 75th percentiles are of different signs (i.e., straddling zero).
2.2.11	# Obs
The number of samples is the number of valid pairs of one-point QC check data used in the
calculation. The number is represented on the graphic with an "n".
2.2.12	Method
The three-digit method designation of the monitor for the particular Site ID and POC. It is
related to the Federal Reference Method (FRM) or Federal Equivalent Method (FEM). These
codes (last three digits) with additional descriptions can be found on AMTIC (see footnote 1).
2.2.13	Quartiles (Q1, Q2, Q3)
The quartiles are the 25th, 50th, and 75th percentiles, respectively. Each of these values is
represented by one of the lines of the box on the box plot. The Q1 value is the bottom line of the
box, the Q2 value is the line within the box, and the Q3 value is the top line of the box.
2.2.14	Mean
The mean is the average value of the percent differences within the dataset:
$$ \text{Mean} = \frac{1}{n} \sum_{i=1}^{n} d_i $$
where $d_i$ is the percent difference for a given observation and n is the total number of
observations within the dataset. The value of the mean is represented by a plus sign (+) on the
box plot.
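The bias upper bound of section 2.2.10, its sign flag from section 2.2.10.1, and the mean of section 2.2.14 can be sketched together in Python. The names `bias_upper_bound` and `bias_sign` and the sample data are hypothetical. The t quantile is supplied by the caller because the standard library has no t-distribution quantile function. The percentile interpolation method is not specified in this document, so the "inclusive" method of `statistics.quantiles` is an assumption.

```python
import math
import statistics


def bias_upper_bound(d, t_95):
    """Equations 3 to 5: upper bound on the mean absolute percent difference.

    t_95 -- 95th quantile of a t-distribution with len(d) - 1 degrees of freedom
            (supplied by the caller; an assumption of this sketch)
    """
    n = len(d)
    abs_d = [abs(x) for x in d]
    ab = sum(abs_d) / n                                    # Equation 4
    num = n * sum(x * x for x in abs_d) - sum(abs_d) ** 2
    as_ = math.sqrt(max(0.0, num) / (n * (n - 1)))         # Equation 5
    return ab + t_95 * as_ / math.sqrt(n)                  # Equation 3


def bias_sign(d):
    """Section 2.2.10.1: '+' or '-' if Q1 and Q3 share a sign, otherwise ''."""
    q1, _median, q3 = statistics.quantiles(d, n=4, method="inclusive")
    if q1 > 0 and q3 > 0:
        return "+"
    if q1 < 0 and q3 < 0:
        return "-"
    return ""


# Hypothetical percent differences from ten QC checks.
d = [1.2, -0.8, 2.5, 0.3, -1.1, 1.9, 0.7, -0.4, 1.5, 0.9]
# 1.833 is the tabulated 95th quantile of a t-distribution with 9 degrees of freedom.
print(round(bias_upper_bound(d, 1.833), 1))  # about 1.5
print(repr(bias_sign(d)))                    # '' because Q1 and Q3 straddle zero
print(round(statistics.fmean(d), 2))         # mean of the percent differences: 0.67
```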
1 List of Designated Reference and Equivalent Methods at http://www.epa.gov/ttn/amtic/criteria.html
2.2.15	Whiskermin and Whiskermax
The Inter-Quartile Range (IQR) is defined as the difference between Q3 and Q1. The Whiskermin
is defined as the smallest value that is greater than or equal to Q1 - (1.5*IQR). The Whiskermax
is defined as the largest value that is less than or equal to Q3 + (1.5*IQR). The Whiskermin and
Whiskermax values define the lengths of the "whiskers" that extend above and below the box.
The ends of the whiskers are terminated with a horizontal line.
2.2.16	Outliers
Any value that is less than Q1 - (1.5*IQR) or greater than Q3 + (1.5*IQR) is
defined as an outlier. All outliers are plotted on the graph. Outliers are represented by an "O".
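The definitions in sections 2.2.13 through 2.2.16 can be sketched in Python as follows. The function name `box_plot_stats` and the sample data are hypothetical. The quartile interpolation method used by the report is not stated in this document, so the "inclusive" method of `statistics.quantiles` is an assumption.

```python
import statistics


def box_plot_stats(d):
    """Quartiles, mean, whisker ends, and outliers for one monitor's percent differences."""
    data = sorted(d)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "mean": statistics.fmean(data),
        "whisker_min": min(inside),  # smallest value >= Q1 - 1.5*IQR
        "whisker_max": max(inside),  # largest value <= Q3 + 1.5*IQR
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }


# Hypothetical percent differences, including one -100% check.
d = [1.2, -0.8, 2.5, 0.3, -1.1, 1.9, 0.7, -0.4, 1.5, 0.9, -100.0]
stats = box_plot_stats(d)
print(stats["whisker_min"], stats["whisker_max"])  # -1.1 2.5
print(stats["outliers"])                           # [-100.0]
```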
2.2.17	95 % CFR Upper/Lower Probability Limits
The CFR Upper / Lower Probability Limits give the 95% probability limits of percent difference
values for the time period for a given data grouping. The probability limits are calculated as
follows:
CFR Upper Probability Limit = D + (1.96 * s)
CFR Lower Probability Limit = D - (1.96 * s)
where D and s are the respective mean and standard deviation of percent differences for the data
grouping.
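A minimal Python sketch of the probability limits follows. The function name `probability_limits` and the sample data are hypothetical. The sketch assumes that s is the sample (n-1) standard deviation and that the percent differences from all monitors in the data grouping are pooled into one list.

```python
import statistics


def probability_limits(d):
    """Return (lower, upper) 95% probability limits: mean -/+ 1.96 * standard deviation."""
    mean = statistics.fmean(d)
    sd = statistics.stdev(d)  # sample (n - 1) standard deviation; an assumption
    return mean - 1.96 * sd, mean + 1.96 * sd


# Hypothetical pooled percent differences for one data grouping.
lower, upper = probability_limits([-2.0, -1.0, 0.0, 1.0, 2.0])
print(round(lower, 2), round(upper, 2))  # about -3.1 and 3.1
```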
2.2.18	Percent Difference
The statistics are based on the percent difference values calculated from single point QC checks.
Percent difference expresses the difference between the pollutant concentration indicated by
monitoring equipment and the known concentration of the sample used in the check. The
percent difference is calculated as:
$$ d = 100 \times \frac{\text{indicated} - \text{actual}}{\text{actual}} $$
where d is percent difference, indicated is the value obtained from the monitor, and actual is the
known concentration level.
2.2.19 Use of "Bounds for Graph" Selection
Because outliers are displayed, they dictate how the box and whisker plots are scaled.
A single large outlier can compress the plots, making them virtually unreadable. However,
outliers can help to identify possible errors in data entry or data that should have been invalidated.
An example follows. Figure 4 represents a plot with the bounds for graph at default (all
outliers shown). The -100% difference for one QC check dictates the size of the box and whisker plots
for the group. It is suggested that the plots initially be reviewed at the default bounds to identify outliers for
potential corrective action. Figure 5 shows the same set of data with the bounds set to +10% and -10%.
[Figure 4 image: box and whisker plots for the example group with the bounds for graph at default. The supplemental statistics rows are CV, Bias, # Obs, and Method.]
[Figure 5 image: the same data with the bounds set to +10% and -10%.]
-------