Uncertainty in North American Wet Deposition Isopleth Maps: Effect of Site Selection and Valid Sample Criteria

United States Environmental Protection Agency
Research and Development
Atmospheric Research and Exposure Assessment Laboratory
Research Triangle Park NC 27711
EPA/600/4-90/005
August 1990

J. C. Simpson
A. R. Olsen
Prepared for
the U.S. Environmental Protection Agency
under a Related Services Agreement with
the U.S. Department of Energy
Contract DE-AC06-76RLO 1830
Pacific Northwest Laboratory
Richland, Washington  99352

-------
                      NOTICE

This document has been reviewed in accordance with
U.S. Environmental Protection Agency policy and
approved for publication.  Mention of trade names
or commercial products does not constitute
endorsement or recommendation for use.

-------
                                ABSTRACT

      This report considers several issues related to the preparation of
isopleth maps for the display of spatial patterns of wet deposition.
The valid sample criteria and data completeness rating used in the data
summarization process are described.  The data interpolation technique,
kriging, is presented and its derivation in terms of generalized least
squares regression is given.  Four different annual summaries for pH,
sulfate concentration, and sulfate deposition in 1986 are prepared using
either the Unified Deposition Database Committee (UDDC) definition of
valid sample criteria or relaxed valid sample criteria, and either the
UDDC data completeness rating or a relaxed data completeness rating.
The kriged estimates for the different annual summaries and the
differences between these estimates are contoured.  The effects of
relaxing the valid sample criteria and data completeness rating are
discussed.  Conclusions are drawn about network operation, network
design, and the uncertainty of contour maps.  It is recommended that,
where the objective is contour maps that show regional patterns, the
emphasis in most regions be on the number of valid samples per site and
the regional representativeness of the sites.

-------
                                CONTENTS

ABSTRACT
1.0  INTRODUCTION
2.0  DESCRIPTION OF WET DEPOSITION DATA SETS
      2.1   WET DEPOSITION DATA SOURCES
      2.2   DATA SUMMARIZATION PROCESS
      2.3   SITE SELECTION PROCESS
      2.4   DEFINITION OF ALTERNATIVE DATA SETS USED IN STUDY
3.0  KRIGING
      3.1   KRIGING ASSUMPTIONS
      3.2   SEMI-VARIOGRAM ESTIMATION
      3.3   KRIGING ESTIMATOR
      3.4   GENERALIZED LEAST SQUARES (GLS) REGRESSION
      3.5   DERIVATION OF THE GLS ESTIMATOR
      3.6   WITHIN-SITE VARIATION
      3.7   ADVANTAGES OF GLS APPROACH
4.0  VARIOGRAM ESTIMATES
5.0  RESULTS
      5.1   CONTOUR MAPS OF THE ESTIMATES
      5.2   CONTOUR MAPS OF THE DIFFERENCES
6.0  CONCLUSIONS
7.0  REFERENCES
APPENDIX A - COMPARISON OF 1985 PH CONTOUR MAPS

-------
                                 TABLES
2.1   Definition of Data Completeness Measures

2.2   Data Completeness Level Criteria for Annual Summaries

2.3   The Number of Sites for Each of the Four Sets of pH and
      Sulfate Summaries

2.4   The Number of Sites Where pH Values Changed by the UDDC
      Versus Relaxed Valid Sample Criteria and the Magnitudes
      of Those Changes

2.5   The Number of Sites Where Sulfate Values Changed by the
      UDDC Versus Relaxed Valid Sample Criteria and the
      Magnitudes of Those Changes

2.6   The Sixteen Sites with the Largest Changes in Their pH
      and/or Sulfate Concentrations as a Result of Using the
      Relaxed Versus UDDC Valid Sample Criteria

2.7   The Samples Within the 16 Sites (Shown in Table 2.6) Which
      Did Not Meet the UDDC Valid Sample Criteria


                                 FIGURES

2.1   Location of Sites in the Relaxed Valid Sample Criteria
      and Completeness Rating Subset

3.1   Semi-Variogram Models Where the Sill and Range are One for
      All the Models Except the Power Models Where "b" is Set to
      One and the Semi-Variogram is Truncated at One at Distance
      of One

4.1   Semi-Variogram for each of the Subsets of pH, Sulfate
      Concentration and Sulfate Distributions

5.1   Contours of pH Estimates Using the UDDC Valid Sample
      Criteria/UDDC Data Completeness Rating Subset

5.2   Contours of Sulfate Concentrations (mg/l) Estimates Using
      the UDDC Valid Sample Criteria/UDDC Data Completeness Rating
      Subset

5.3   Contours of Sulfate Depositions (g/sq m) Estimates Using
      the UDDC Valid Sample Criteria/UDDC Data Completeness Rating
      Subset

5.4   Contours of pH Estimates of the Four Subsets Using the
      UDDC or Relaxed Valid Sample Criteria (UVSC or RVSC)
      and the UDDC or Relaxed Data Completeness Rating
      (UDCR or RDCR)

5.5   Contours of Sulfate Concentration (mg/l) Estimates
      of the Four Subsets Using the UDDC or Relaxed Valid
      Sample Criteria (UVSC or RVSC) and the UDDC or
      Relaxed Data Completeness Rating (UDCR or RDCR)

5.6   Contours of Sulfate Deposition (g/sq m) Estimates
      of the Four Subsets Using the UDDC or Relaxed Valid
      Sample Criteria (UVSC or RVSC) and the UDDC or Relaxed
      Data Completeness Rating (UDCR or RDCR)

5.7   Comparison of Sulfate Concentration and Deposition
      Estimates With and Without Parson, West Virginia
      Site Present

5.8   Differences in pH Estimates for Selected Pairs of Subsets

5.9   Differences in Sulfate Concentration Estimates for
      Selected Pairs of Subsets

5.10  Differences in Sulfate Deposition for Selected Pairs of Subsets

5.11  Extent and Magnitude of Effect the Parson, West Virginia
      Site has on the Sulfate Concentration and Deposition

-------
                           1.0   INTRODUCTION

      Junge (1963) operated the first precipitation chemistry network in
the United States and reported the data by using isopleth maps of ion
concentrations.  Since then a number of regional and national isopleth
maps of precipitation chemistry data have been published: Semonin
(1981), Cowling (1982), Calvert et al. (1983), Munger and Eisenreich
(1983), Barrie and Hales (1984), Ellis et al. (1983), and Husar (1988).
The researchers determined the location of the contours on these maps
either subjectively, by using expert opinion to draw them by hand, or
by using simple weighting schemes such as inverse distance squared.
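
      As an illustration of the simple weighting schemes mentioned
above, the following minimal Python sketch interpolates a value at an
unmonitored location by inverse distance squared weighting.  The
function name, the site tuples, and the pH values are hypothetical and
are not taken from any of the cited maps; planar coordinates are
assumed.

```python
def inverse_distance_squared(sites, x, y):
    """Estimate a value at (x, y) from (site_x, site_y, value) tuples.

    Each site is weighted by 1 / distance**2, so nearby sites dominate.
    Coordinates are assumed to be planar (for example, km on a map grid).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for site_x, site_y, value in sites:
        dist_sq = (x - site_x) ** 2 + (y - site_y) ** 2
        if dist_sq == 0.0:
            return value  # the target coincides with a monitored site
        weight = 1.0 / dist_sq
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical annual pH summaries at three sites: (x km, y km, pH).
example_sites = [(0.0, 0.0, 4.2), (200.0, 0.0, 4.6), (0.0, 200.0, 4.4)]
midpoint_estimate = inverse_distance_squared(example_sites, 100.0, 0.0)
```

      Unlike kriging, a scheme of this kind uses only distances to the
target location and provides no estimate of the interpolation error.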

      More recently several researchers have used the geostatistical
technique termed kriging to estimate the spatial surface and then
display the surface using isopleth maps.  Eynon and Switzer (1983) and
Finkelstein (1984) published isopleth maps based on two alternative
kriging approaches.  Since then a number of authors have used kriging in
the production of maps: Seilkop and Finkelstein (1987), Barchet (1987),
Wampler and Olsen (1987), Guertin et al. (1988), Haas et al. (1988), and
Venkatram (1988).  Bilonick (1985) and Le and Petkau (1988) applied
spatial time series models to precipitation chemistry data in North
America.

      The National Atmospheric Deposition Program (NADP) and the Acid
Deposition System (ADS) regularly publish United States and North
American isopleth maps of annual precipitation chemistry.  NADP
coordinates the NADP/NTN network and uses the network data to produce
annual maps starting with 1983 data.  NADP (1987) is an example of their
publications.  The ADS combines the NADP/NTN network data with data from
several other North American precipitation chemistry monitoring networks
to produce annual isopleth maps for North America.  See Olsen and Watson
(1984), Olsen and Slavich (1985), Olsen and Slavich (1986), Sweeney and
Olsen (1987), Sweeney and Olsen (1988), and Olsen (1988).  NADP uses a
constrained distance-squared weighting function to estimate the surface
while ADS uses kriging to estimate the surface.  A more detailed
explanation comparing the approaches is given in an informal report by
Olsen (1988), which is reproduced as Appendix A.

      In 1988, researchers concerned with wet deposition data organized
two spatial analysis workshops.  The first workshop, on regional analysis
of wet deposition for effects research, was held June 7-8, 1988 in
Corvallis, Oregon.  Vong et al. (1989) summarize the workshop
discussion.  The workshop focused on issues connected with estimating
wet deposition at non-monitored sites and did not explicitly consider
presentation of regional wet deposition surfaces as isopleth maps.  One
conclusion of the workshop is that kriging is currently a preferred
technique for interpolation to non-monitored sites.  The second workshop
focused explicitly on presentation as isopleth maps and was held
October xx, 1988 in Champaign, Illinois.  The workshop organizers
prepared standard data sets for North America and invited six different
organizations to use the standard data sets to produce isopleth maps.
Methods used by the organizations included hand drawing by an expert,
inverse distance squared, Cressman's objective weighting, other
objective analysis schemes, and several kriging alternatives.  No
workshop report is available at this time.  Our conclusion from the
workshop is that, regardless of the interpolation technique, the broad
overall regional features of the surface are very similar across
techniques but local features do differ when different techniques are
used.

      The production of an isopleth map involves:  1) selection of sites
to be used in surface estimation, 2) calculation of an annual summary
value at each monitored site, 3) selection of a surface estimation
technique, 4) selection of how to display the estimated surface, and 5)
production of a final document-quality display.  Decisions made for each
of these can affect the final display and how it is perceived.  When the
surface is displayed using isopleth maps, alternative choices can result
in different maps.  Even if agreement on these issues could be reached,
the wet deposition data have measurement error associated with the
sampling and laboratory analysis process.  Hence an isopleth map has
uncertainty associated with the location of its isopleth lines due to
measurement error and to the uncertainty associated with possible
alternative production decisions.

      Isopleth maps are a common method of summarizing and displaying
the spatial pattern for wet deposition over North America.  The process
includes determining annual summaries at wet deposition monitoring
sites, selecting sites with representative data, estimating a spatial
surface by interpolating the selected sites to a regular grid, and then
displaying the surface using an isopleth map.  It is generally
recognized that the location of the isopleth lines can be affected by
the procedures used at each stage of the process.  In fact, different
organizations have constructed isopleth maps for the same annual time
period and ion species, for example, the pH map for 1985 given in the
NAPAP Interim Assessment report (Barchet 1987) and the pH map for 1985
in the NADP/NTN Annual Data Summary report (NADP 1987).  The maps do
differ and the question arises as to why (Olsen 1988).  A natural
follow-up question is whether agreement can be reached on what the
"correct" process is to go from sample data collected at sites to the
"correct" isopleth map.

       An isopleth map is a display of an estimated surface.  Data at
 monitored sites used  in the surface estimation process are only
 estimates of wet deposition at the monitored site.  Hence, the location
 of the isopleth lines would still have uncertainty associated with  them
 even if organizations would agree to use the same data, same sites  and
 same surface estimation methodology.  The uncertainty of the data at the
 site and the uncertainty associated with estimating wet deposition  at
 non-monitoring locations as part of the surface estimation methodology
 both contribute to  the uncertainty in the location of the  isopleth
 lines.  A natural question is how to determine (estimate) the
 uncertainty in the location of the isopleth lines.

       This  report investigates the impact on estimated surfaces due to
 the use of  alternative procedures for estimating wet deposition at
 individual  monitoring sites and for  selecting representative monitoring
 sites.  The impact  of alternative surface estimation methodologies is
 not explicitly  investigated.
-------
              2.0  DESCRIPTION OF WET DEPOSITION DATA SETS

      The purposes of this section are:  1) to describe wet deposition
monitoring data sources, data summarization processes, and site
selection processes that have been used by organizations that produce
isopleth maps for wet deposition ion species and 2) to define the
alternative data sets used in this report.  No attempt is made to
include all procedures that have been used.  The discussion is focused
on the processes used to define the alternative data sets used later in
the report.

2.1   WET DEPOSITION DATA SOURCES

      Various federal, state, and local governmental agencies and
private industry organizations support networks of sites for the
collection and chemical analysis of precipitation samples, i.e., for wet
deposition monitoring.  Determining which networks of sites to include
is a primary decision in a study of wet deposition spatial patterns.
Issues typically considered in the selection process are:  1) objectives
and spatial coverage of the network, 2) quality assurance associated
with the data, 3) compatibility of network monitoring, laboratory, and
data validation protocols, 4) availability of data, and 5) special
requirements/restrictions imposed by the organization conducting the
spatial pattern study.  The first three issues are associated with the
representativeness and compatibility of the data.  The latter two issues
are typically operational issues rather than representativeness issues.
The selection of different networks to be included in spatial pattern
studies for the same year can potentially be a major source of
differences between isopleth maps prepared by two organizations.
Interpretation or comparison of isopleth maps requires an understanding
and assessment of the network selection criteria used.

      The wet deposition data used for this study are from six regional
or national networks that contribute data to the Acid Deposition System
(ADS) (Watson and Olsen 1984).  The networks are the Multi-State
Atmospheric Pollution and Power Production Study initiated precipitation
chemistry network (MAP3S/PCN), the National Atmospheric Deposition
Program/National Trends Network (NADP/NTN), the Utility Acid
Precipitation Study Program (UAPSP), the Canadian Acid Precipitation
Monitoring Network (CAPMoN), and the Acidic Precipitation in Ontario
Study daily (APIOS-D) and cumulative (APIOS-C) networks.  Sweeney and
Olsen (1988) give additional descriptions of the networks.  Criteria used
in selecting these networks are:  1) the network provides regional or
national coverage at regionally representative sites, 2) each network
has an implemented quality assurance program, 3) the network chemical
analysis laboratories participate in an inter-laboratory comparison
program, and 4) the data are readily available in a common format from
ADS.

      The decision to use these networks does impact the spatial pattern
displayed by an isopleth map.  Site selection protocols for the networks
restrict the location of sites within urban areas and minimize the
impact of local point source emissions.  Although it is not an explicit
site selection protocol, almost all sites are located away from
mountainous regions.  Consequently, isopleth maps based on the networks
may not reflect variations in the spatial surface related to local point
source effects, urban influences, or elevation effects.  Any spatial
pattern derived will at best reflect the broad regional or national
structure of wet deposition.  The impact of this decision on isopleth
maps will not be explicitly studied in the report.

      Each network used in the study has a chemical analysis laboratory
that performs sample analyses and a data management function that checks
the reasonableness of the sample analysis results using information
available from the analysis and from the sample field notes.  That is,
each network receives laboratory data that have been subjected to
internal laboratory checks and to sampling protocol checks.  The checks
result in supporting comments and data flags being attached to the
samples.  The ADS data base incorporates all of the comments, codes and
flags.  A quality assurance program is used by each of the networks to
ensure that their protocols are implemented and their laboratories are
in control.  In addition, all of the laboratories participate in
interlaboratory comparison studies.

-------
2.2   DATA SUMMARIZATION PROCESS

      Calculation of an annual summary from sample data collected at a
site requires criteria for assessing whether a sample data value is
valid or invalid for the purposes of the study.  These criteria are
called valid sample criteria.  Even with careful attention, some samples
collected must be declared invalid because of the violation of some
aspect of a network's protocol that affects how well the sample
represents the precipitation chemistry.  In some cases, such as severe
contamination by debris, the sample is clearly not representative of the
precipitation chemistry.  In other cases, such as bulk sampling or a
non-protocol sample period, whether the sample is representative of the
precipitation chemistry during the period is not as clear.  Currently,
neither a definition of sample representativeness nor criteria for a
valid sample have been generally developed and accepted.  Valid sample
criteria that do exist are based on the best professional judgments of
wet deposition monitoring researchers.

      To study the impact of different valid sample criteria, and hence
indirectly of sample representativeness, two alternative sets of valid
sample criteria have been defined for the study.  One alternative uses
the valid sample criteria developed by the Unified Deposition Database
Committee (UDDC) (Olsen et al. 1987).  The other alternative relaxes the
UDDC valid sample criteria by removing two of them.  This alternative is
termed the relaxed valid sample criteria.

      The UDDC valid sample criteria have been designed to incorporate
each network's comments, codes and flags into the decision process of
determining whether an individual wet deposition sample result is to be
included in or excluded from a summary.  The discussion on screening for
valid samples is stated in terms of the ADS data base common record
format (Watson and Olsen 1984) with some reference to network-specific
codes as necessary for clarification.

      All networks include note codes which are informational in nature.
Some codes denote reasons why sample results are not available or
reported.  Other codes describe conditions present in the field and
during sample transit and sample receipt.  Unless explicitly stated
elsewhere, these note codes are not used in determining whether a sample
is valid.  The basic premise is that each network has screened
individual sample results for possible contamination.  If a sample
result passes the network's screening, it is assumed that possible
sample contamination indicated by field or lab comments did not
materially affect the sample ion species concentrations.

      A set of valid sample criteria has been designed for each network.
Each sample associated with a sampling period is screened to determine
whether the sample meets specific criteria.  The screening criteria use
the informational comments and codes provided by each network.  The
criteria are:

     •   All sampling periods for which it is known that no
         precipitation occurred are considered valid sample
         periods.  This applies mainly to weekly, monthly and
         28-day sampling protocols.  For event and daily sampling
         protocols the absence of a sample record for a day
         implies that no precipitation occurred.

     •   The wet deposition sample must be a wet-only sample.  All
         samples identified as bulk, partially bulk or undefined
         are invalid.

     •   Wet deposition samples that have insufficient
         precipitation to complete a chemical analysis for a
         specific ion species are invalid for that specific ion
         species.  Event/daily samples are most likely to have
         this occur.

     •   An individual ion species concentration accompanied by a
         comment code designating the measurement to be "suspect"
         or "invalid" is declared an invalid sample.  Deletion of
         the ion species concentration by the network for the same
         reason has the same result.

     •   The actual sampling period for a wet deposition sample
         must be close to the network's protocol sampling period.
         Specifically, the following conditions lead to an invalid
         sample:

            -  For NADP/NTN, actual sample period less than
               6 days or greater than 8 days.  This includes
               all NADP/NTN samples coded "LD" with measured
               precipitation.

            -  For APIOS-C, actual sample period less than
               21 days or greater than 35 days.

            -  For UAPSP, actual sample period greater than 1
               day.
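
      The screening criteria listed above can be sketched as a single
function.  This is a minimal Python illustration under stated
assumptions, not the UDDC or ADS implementation: the record layout, the
field names, and the function name are hypothetical, each record is
taken to hold one ion species result, and networks with no period rule
listed above are assumed to pass the period test.

```python
from dataclasses import dataclass

# Accepted actual sampling-period lengths in days (inclusive), as listed above.
# Only an upper limit is stated for UAPSP, so its lower bound of 0 is a placeholder.
PERIOD_LIMITS_DAYS = {
    "NADP/NTN": (6.0, 8.0),
    "APIOS-C": (21.0, 35.0),
    "UAPSP": (0.0, 1.0),
}


@dataclass
class IonSample:
    """One ion species result for one sampling period (hypothetical layout)."""
    network: str
    period_days: float
    precipitation_occurred: bool
    sampler_type: str = "wet-only"    # or "bulk", "partially bulk", "undefined"
    sufficient_volume: bool = True    # enough sample to analyze this ion
    flagged_or_deleted: bool = False  # "suspect"/"invalid" code, or value deleted


def is_valid_sample(sample: IonSample) -> bool:
    """Apply the five screening criteria in the order they are listed."""
    if not sample.precipitation_occurred:
        return True                   # known dry periods count as valid
    if sample.sampler_type != "wet-only":
        return False
    if not sample.sufficient_volume:
        return False
    if sample.flagged_or_deleted:
        return False
    limits = PERIOD_LIMITS_DAYS.get(sample.network)
    if limits is None:
        return True                   # assumption: no period rule for this network
    shortest, longest = limits
    return shortest <= sample.period_days <= longest


# Hypothetical records exercising individual criteria.
weekly_sample = IonSample("NADP/NTN", 7, True)
late_pickup = IonSample("NADP/NTN", 9, True)
bulk_sample = IonSample("NADP/NTN", 7, True, sampler_type="bulk")
dry_period = IonSample("APIOS-C", 28, False)
```

      Keeping each criterion as a separate test makes it straightforward
to switch individual criteria off when a relaxed set of valid sample
criteria is wanted.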

      The restriction to wet-only samples used for the UDDC valid sample
criteria may be too stringent, and the exclusion of bulk samples may
result in a bias.  For the networks used in the study, the sample
chemistry is checked for internal consistency and for consistency with
historical data from the site.  Hence severe contamination due to an
exposed bucket will cause other data flags to be set.  Large differences
between wet-only and bulk sample chemistry may not be common in networks
designed with wet-only sampling protocols.  Typically, the sampler fails
at some time during the sampling period, violating the protocol.
Exclusion of bulk samples can cause a bias in annual precipitation
weighted mean concentrations, especially if an ion species has a
seasonal pattern at a site.  It may be more appropriate to include a
bulk sample as the best estimate of concentration rather than exclude
the sample and in effect use the annual precipitation weighted mean
concentration as the estimate for the sample.  No detailed study has
been completed to address this question.
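
      The remark that excluding a sample in effect assigns it the
annual precipitation weighted mean can be checked numerically.  The
following Python sketch uses made-up concentrations and precipitation
depths, and the function name is hypothetical; it shows only the
arithmetic, not an analysis of real network data.

```python
def precip_weighted_mean(concentrations, precip_depths):
    """Precipitation weighted mean concentration over a set of samples."""
    total_depth = sum(precip_depths)
    weighted = sum(c * p for c, p in zip(concentrations, precip_depths))
    return weighted / total_depth


# Hypothetical sulfate concentrations (mg/l) and precipitation depths (mm).
conc = [2.0, 4.0, 1.0]       # the third sample is treated as the bulk sample
depth = [10.0, 5.0, 20.0]

mean_excluding_bulk = precip_weighted_mean(conc[:2], depth[:2])
mean_including_bulk = precip_weighted_mean(conc, depth)

# Excluding the bulk sample gives the same annual mean as keeping its
# precipitation depth but replacing its concentration with the mean of
# the remaining samples.
mean_with_imputed = precip_weighted_mean(conc[:2] + [mean_excluding_bulk], depth)
```

      In this made-up case the excluded sample is a dilute, high-volume
sample, so dropping it raises the weighted mean; a seasonal pattern in
concentration would produce the same kind of shift.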

      The UDDC valid sample criteria also restrict the actual sample
period length of a sample to be within a few days of the network's
protocol sampling period.  For example, NADP/NTN samples must be 6, 7,
or 8 days in length.  The reason for the restriction is to maintain
consistency within the data.  Again, however, no objective study has
been completed to show the effect of departure from a daily, weekly, or
four-weekly sampling period.  The criteria may be overly restrictive and
introduce a bias as stated above for bulk samples.

2.3   SITE SELECTION PROCESS

      Calculation of an annual summary based on valid data at a site
does not imply that the annual summary represents the chemistry of
precipitation that occurred at the site during the year.  This may be
due to the sampling not being completed continuously at the site, to
incomplete collection of individual precipitation events (low collection
efficiency), to chemical changes occurring in the sample during or after
sample collection, or to precipitation associated with invalid samples
having concentrations different from those associated with valid samples
(e.g., seasonal effects).

      The study defines and uses two alternative site selection criteria
for assessing the completeness, i.e., representativeness, of the annual
summary.  One is the UDDC data completeness criteria, which incorporate
both annual and quarterly criteria.  The other, called the relaxed
data completeness criteria, is a relaxation of the UDDC criteria that
eliminates the UDDC quarterly criteria and weakens the UDDC annual
criteria.

      The Unified Deposition Database Committee defined five
quantitative data completeness measures and assigned annual and
quarterly thresholds to each of the five measures in constructing their
UDDC data completeness criteria.  The following questions concerning
data completeness and temporal representativeness motivated the
measures:  for what portion of the summary period is the occurrence and
amount of precipitation known; what portion of the precipitation volume
collected is associated with valid deposition samples; what percent of
the time and what percent of the samples collected are associated with
valid samples; and what is the ratio of the wet deposition sample volume
to the precipitation measured by a standard gauge?

      Data completeness measures are based on the assumption that the
entire season or year consists of sample periods that account for every
day of the summary period.  It is normal for a site to have incomplete
information for some precipitation events during a summary period, to
deviate from established collection protocols due to circumstances
outside the operator's control, or to collect samples that are
subsequently eliminated during the network's data screening process.
Therefore, it is necessary to establish criteria for determining when
sufficient valid wet deposition data are present to calculate a
meaningful seasonal or annual summary for a site.  Data completeness
measures are designed to quantify the amount of information upon which a
data summary is based and enable criteria to be established that
indicate the quality of the summary.  Five data completeness measures
are proposed:  percent precipitation coverage length, percent total
precipitation, percent valid sample length, percent of samples with
measured precipitation that are valid, and percent collection
efficiency.  A sixth measure, percent sea salt correction, is applied to
sulfate summaries for sites within 100 km of a coast.  Definitions for
the data completeness measures are given in Table 2.1.

      The data completeness measures are the basis for assigning a data
completeness level (1 to 4) to each seasonal and annual summary.  The
criteria for the data completeness levels are given in Table 2.2.  A
summary with data completeness level 1 has the best information, or the
highest level of data completeness.  The least confidence is given to a
summary with data completeness level 3.  Level 4 summaries fail the
level 3 criteria and are viewed as not providing a representative
summary for the period.  In order for a data summary to be assigned a
specific level, all criteria listed for the level must be met.  The most
favorable level attained is assigned to the summary.  A summary that
does not meet one or more of the criteria for level 3 is assigned
level 4.

      The collection efficiency data completeness level criterion for a
seasonal summary is relaxed somewhat for Canadian winter summaries
compared to other seasons due to the generally poorer collector
performance for snow sampling.  If the criterion for other seasons is
applied to winter months, when a large percentage of the precipitation
in Canada is in the form of snow, then only a few locations meet even
the level 3 criterion.  It is believed that a lower percentage could be
accepted for winter because the problems are primarily due to undercatch
of snow.  An under-collected snow sample may reasonably represent the
concentration but not the deposition.
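
      The five data completeness measures and the level assignment rule
described in this section can be sketched together as follows.  This is
a hedged Python illustration that follows the wording of the Table 2.1
definitions; the record layout, function names, and example data are
hypothetical, and the thresholds shown are placeholders chosen for the
example, not the Table 2.2 values.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SamplePeriod:
    """One sampling period at a site (hypothetical record layout)."""
    days: int
    precip_known: bool                    # occurrence and amount both known
    gauge_depth: Optional[float] = None   # collocated gauge depth (mm)
    sample_depth: Optional[float] = None  # sample volume converted to depth (mm)
    valid: bool = False                   # known dry periods are entered as valid


def _percent(part: float, whole: float) -> float:
    return 100.0 * part / whole if whole else 0.0


def completeness_measures(periods: List[SamplePeriod],
                          summary_days: int) -> Dict[str, float]:
    """Compute %PCL, %TP, %VSL, %VSMP and %COLEFF for one summary period."""
    known = [p for p in periods if p.precip_known]
    wet = [p for p in known if p.gauge_depth is not None and p.gauge_depth > 0.0]
    valid_wet = [p for p in wet if p.valid]
    paired = [p for p in valid_wet if p.sample_depth is not None]
    return {
        "%PCL": _percent(sum(p.days for p in known), summary_days),
        "%TP": _percent(sum(p.gauge_depth for p in valid_wet),
                        sum(p.gauge_depth for p in wet)),
        "%VSL": _percent(sum(p.days for p in periods if p.valid), summary_days),
        "%VSMP": _percent(len(valid_wet), len(wet)),
        "%COLEFF": _percent(sum(p.sample_depth for p in paired),
                            sum(p.gauge_depth for p in paired)),
    }


def assign_level(measures: Dict[str, float],
                 level_minimums: Dict[int, Dict[str, float]]) -> int:
    """Return the most favorable level whose criteria are all met, else 4."""
    for level in (1, 2, 3):
        minimums = level_minimums[level]
        if all(measures[name] >= minimum for name, minimum in minimums.items()):
            return level
    return 4


# Placeholder thresholds for illustration only; NOT the Table 2.2 values.
PLACEHOLDER_MINIMUMS = {
    1: {"%PCL": 95.0, "%TP": 80.0},
    2: {"%PCL": 90.0, "%TP": 70.0},
    3: {"%PCL": 70.0, "%TP": 20.0},
}

# Four hypothetical weekly periods in a 28-day summary period.
example_periods = [
    SamplePeriod(days=7, precip_known=True, gauge_depth=10.0,
                 sample_depth=9.0, valid=True),
    SamplePeriod(days=7, precip_known=True, gauge_depth=0.0, valid=True),
    SamplePeriod(days=7, precip_known=True, gauge_depth=30.0, sample_depth=24.0),
    SamplePeriod(days=7, precip_known=False),
]
example_measures = completeness_measures(example_periods, summary_days=28)
example_level = assign_level(example_measures, PLACEHOLDER_MINIMUMS)
```

      Testing the levels from 1 downward returns the most favorable
level attained, and a summary that fails any level 3 criterion falls
through to level 4, as the text requires.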

      The data completeness level for an annual summary is based on
annual criteria as well as criteria for the four quarters. January-
March. April-June. July-September, and October-December, which comprise
the year.  The addition of quarterly criteria to the annual criteria is
to insure that adequate data from each quarter is present in the annual
summary.  Because the emphasis is on insuring adequate data for an
annual- summary, some quarterly criteria are relaxed from the seasonal
criteria (see Table 2.2).
                                                                !
      The relaxed data completeness criteria apply only the annual
%PCL (≥90%) and %TP (≥60%) data completeness measures.  All quarterly
criteria are dropped.  Sites that meet the relaxed data completeness
criteria but not the UDDC criteria almost exclusively fail the
quarterly criteria.  The quarterly criteria are intended to ensure that
all quarters of the year are represented in an annual summary.  However,
the criteria may be overly restrictive.  The relaxed criteria result in
almost all sites that monitored continuously during a year being
included in the study.
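The difference between the relaxed rating and a rating that also applies quarterly criteria can be illustrated with a short Python sketch.  The annual thresholds (90% for %PCL, 60% for %TP) are taken from the paragraph above; the function names, the quarterly minimum of 50%, and the example site are assumptions made only for this illustration.

```python
def meets_relaxed_criteria(pcl_annual, tp_annual):
    """Relaxed rating: only the annual %PCL and %TP measures are checked."""
    return pcl_annual >= 90.0 and tp_annual >= 60.0


def meets_quarterly_minimum(quarterly_values, minimum):
    """True when every quarter reaches the given minimum (hypothetical threshold)."""
    return all(value >= minimum for value in quarterly_values)


# Hypothetical site: good annual coverage, but a poor fourth quarter.
annual_pcl, annual_tp = 96.0, 72.0
quarterly_tp = [85.0, 80.0, 78.0, 35.0]

passes_relaxed = meets_relaxed_criteria(annual_pcl, annual_tp)
passes_quarterly = meets_quarterly_minimum(quarterly_tp, minimum=50.0)
```

A site like this one is retained under the relaxed rating but rejected once a quarterly criterion is applied, which is the pattern described for most of the sites that differ between the two ratings.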

-------
          TABLE 2.1.  Definition of Data Completeness Measures

Data
Completeness
Measure       Definition

%PCL          Percent precipitation coverage length is the percent of the
              summary period for which information on whether or not
              precipitation occurred is available.  If precipitation is
              known to have occurred during a particular sampling period
              but no measurement of the amount is available, then no
              knowledge of precipitation is assumed.  This measure can be
              less than 100% because the site started (stopped) operation
              after (before) the beginning (end) of the summary period or
              because equipment or operator problems caused the site to be
              shut down for a portion of the summary period.

%TP           Percent total precipitation is the percent of the total
              precipitation depth measured that is associated with valid
              samples collected during the summary period.

%VSL          Percent valid sample length is the percent of the days during
              the summary period for which valid samples are obtained.
              Note that sample periods with no precipitation are considered
              valid samples.

%VSMP         Percent valid samples with measured precipitation is the
              percent of all wet deposition samples during the summary
              period that are valid samples.

%COLEFF       Percent collection efficiency is the ratio of the wet
              deposition sample volume (converted to a depth) to the total
              precipitation depth as measured by a collocated rain gauge.
              Only valid samples with both a collocated standard rain gauge
              and sample volume measurement available are used.

%SEASALT      Percent sea salt correction is the percent of the average
              sulfate concentration that is estimated to be due to sea
              salt, using sodium or magnesium as tracers of sea salt.
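Three of the measures in Table 2.1 can be sketched in Python from per-sample records.  The `Sample` record, its field names, and the example values are hypothetical, and aggregating %COLEFF as a ratio of summed depths is an assumption; the table states only which samples are used, not how they are pooled.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    gauge_depth_mm: float   # precipitation depth from the collocated rain gauge
    sample_depth_mm: float  # wet deposition sample volume converted to a depth
    valid: bool             # True if the sample meets the valid sample criteria


def percent_total_precip(samples):
    """%TP: share of the measured precipitation depth tied to valid samples."""
    total = sum(s.gauge_depth_mm for s in samples)
    valid = sum(s.gauge_depth_mm for s in samples if s.valid)
    return 100.0 * valid / total


def percent_valid_samples(samples):
    """%VSMP: share of samples with measured precipitation that are valid."""
    wet = [s for s in samples if s.gauge_depth_mm > 0]
    return 100.0 * sum(1 for s in wet if s.valid) / len(wet)


def percent_collection_efficiency(samples):
    """%COLEFF: sample depth relative to gauge depth, valid wet samples only."""
    used = [s for s in samples if s.valid and s.gauge_depth_mm > 0]
    gauge = sum(s.gauge_depth_mm for s in used)
    return 100.0 * sum(s.sample_depth_mm for s in used) / gauge


# Hypothetical summary period with one invalid sample and one dry period.
samples = [
    Sample(10.0, 9.0, True),
    Sample(20.0, 18.0, True),
    Sample(10.0, 2.0, False),
    Sample(0.0, 0.0, True),
]
tp = percent_total_precip(samples)
vsmp = percent_valid_samples(samples)
coleff = percent_collection_efficiency(samples)
```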

-------
   TABLE 2.2.   Data Completeness Level Criteria for Annual Summaries

                                  Annual Data Completeness Level
Data Completeness Measure           1             2             3

%PCL
   Annual                         ≥95%          ≥90%          ≥90%
   Each quarter                   ≥75%          ≥60%          ≥50%
%TP, %VSL, %VSMP
   Annual                         ≥80%          ≥70%          ≥60%
   Each quarter                   ≥70%          ≥60%          ≥50%
%COLEFF
   Annual                         ≥80%(70%)*    ≥60%(40%)*    ≥50%(30%)*
   Winter                         ≥80%(50%)*    ≥60%(40%)*    ≥50%(30%)*
   Spring, summer, autumn         ≥80%          ≥60%          ≥50%
%SEASALT

* The bracketed value applies to Canadian sites.

2.4   DEFINITION OF ALTERNATIVE DATA SETS USED IN STUDY
      The UDDC valid sample criteria (UVSC) versus relaxed valid sample
criteria (RVSC) and the UDDC data completeness rating (UDCR) versus
relaxed data completeness rating (RDCR) create four subsets of sites
(UVSC/UDCR, UVSC/RDCR, RVSC/UDCR and RVSC/RDCR) for both the pH and
sulfate observations.  The locations of the sites in the RVSC/RDCR
subset are shown in Figure 2.1.  The estimation technique used in this
study (kriging) requires that the phenomenon be stationary in the region
being investigated.  For pH and wet deposition of sulfate, this
stationarity requirement is not met when considering the entire United
States and southern Canada as one region (Vong et al. 1989).  Therefore
this study is restricted to southern Canada and the eastern United States
(less the southernmost states).  As seen in Figure 2.1, the pH
observations are at the same sites as the sulfate observations with the
exception of the six sites without pH observations.  The number of sites
in each of the subsets is shown in Table 2.3.  Of the 194 sites in the
region shown in Figure 2.1 that have pH and sulfate summaries in 1986,
185 sites (95.4%) have annual pH summaries and 191 sites (98.5%) have
annual sulfate summaries which meet the minimal requirements
(RVSC/RDCR).  When the most stringent requirements (UVSC/UDCR) are used,
only 113 sites (58.2%) have annual pH summaries and 125 sites (64.4%)

-------
FIGURE 2.1.  Location of Sites in the Relaxed Valid Sample Criteria
             and Relaxed Data Completeness Rating Subset
             [Map.  Legend: collocated sites; sites with no pH; region
             used for semi-variogram estimation; region containing grid
             nodes.  Scale bar: 0 to 300 kilometers.]

-------
 have annual sulfate summaries which meet these requirements.  The
 difference between the UDCR and RDCR is solely in the number of sites in
 the subsets.  The primary difference between the UVSC and RVSC subsets
 is that the summaries change in value at approximately one-third of the
 sites.  Additionally, the RVSC subsets have a few more sites, since more
 samples are included in the annual summary, causing the data completeness
 rating to improve.

        The number of sites whose observations changed and the magnitude
 of those changes are shown in Tables 2.4 and 2.5 for pH and sulfate,
 respectively.  As seen in Tables 2.4 and 2.5, the majority of the
 changes in the observations are relatively small.  The 16 sites within
 the region where estimates are calculated (see Figure 2.1) with the
 largest changes in their observations are shown in Table 2.6.  As seen
 in this table, most of the large changes in pH and sulfate are at the
 same sites.

        Table 2.7 shows the samples from the 16 sites shown in Table 2.6
 which did not meet the UDDC valid sample criteria.  There are two causes
 for these samples to be rejected: either the collector is open during
 the collection period when it is not raining (bulk), or the sample
 collection period violates the network's protocol by being either too
 long or too short.  The precipitation weighted averages of pH and sulfate
 concentration are given for most of the samples, as data validation
 procedures of the networks did not indicate anything unusual about the
 chemistry of these samples.  As seen in Table 2.7, three of the sites
 each have samples with extremely high sulfate concentration (greater
 than 50 mg/L).  Although there is little precipitation associated with
 these samples, even when precipitation weighted averages are calculated,
 these samples have a significant effect on the average.  These unusual
 samples also have extremely small collection efficiencies (percent of
 predicted sample volume, from the rain gauge, that is contained in the
 actual sample).  For a number of other sites, the samples that did not
 meet the UDDC valid sample criteria all occur contiguously in time (the
 NADP samples represent one week while for the APIOS-G network one sample

-------
represents a four-week period).  For sites that have a seasonal trend,
the loss of a month or more of data can significantly raise or lower the
annual average.
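The effect of a single low-volume, high-concentration sample on a precipitation weighted average can be illustrated with a small Python sketch.  The function name and all sample values are hypothetical; they are chosen only to mimic the pattern described above (one sample above 50 mg/L with very little precipitation).

```python
def precip_weighted_mean(concentrations, precip_depths):
    """Average concentration weighted by the precipitation depth of each sample."""
    total_depth = sum(precip_depths)
    weighted = sum(c * p for c, p in zip(concentrations, precip_depths))
    return weighted / total_depth


# Hypothetical sulfate concentrations (mg/L) and precipitation depths.
# The last sample carries 1% of the precipitation but a very high concentration.
conc = [2.0, 2.0, 2.0, 70.0]
precip = [30.0, 30.0, 39.0, 1.0]

with_outlier = precip_weighted_mean(conc, precip)
without_outlier = precip_weighted_mean(conc[:3], precip[:3])
```

In this made-up case the sample holding 1% of the precipitation raises the weighted average from 2.0 to about 2.68 mg/L, an increase of roughly one-third.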

      A fifth subset of the sulfate observations is also
investigated.  This set consists of the UVSC/UDCR subset with the
extreme value at Parsons, West Virginia (ADS ID 075a) removed.  This
subset is used to demonstrate the range and magnitude of the effect that
one unusual site can have.
     TABLE 2.3.  The Number of Sites for Each of the Four Sets of pH
                 and Sulfate Summaries

                          pH                    SO4
                    RDCR      UDCR        RDCR      UDCR

          UVSC       184       113         190       125
          RVSC       185       122         191       133
        TABLE 2.4.  The Number of Sites Where pH Values Changed by
                    the UDDC Versus Relaxed Valid Sample Criteria
                    and the Magnitudes of Those Changes

                  (RVSC - UVSC)         UDCR      RDCR

                  (-0.127,-0.100)          1         2
                  (-0.100,-0.050)          3         3
                  (-0.050,-0.025)          1         3
                  (-0.025,-0.005)         11        13
                  (-0.005, 0.000)         10        13
                    no change             80       114
                  ( 0.000, 0.005)          9        14
                  ( 0.005, 0.025)          5         7
                  ( 0.025, 0.050)          2         5
                  ( 0.050, 0.096)          0         2

-------
TABLE 2.5.  The Number of Sites Where Sulfate Values
             Changed by the UDDC Versus Relaxed Valid
             Sample Criteria and the Magnitudes of Those
             Changes

SO4 Concentration (mg/L)
(RVSC - UVSC)             UDCR      RDCR

(-0.27, -0.25)               0         1
(-0.25, -0.15)               0         2
(-0.15, -0.05)               2         3
(-0.05,  0.00)              14        19
no change                   89       120
( 0.00,  0.05)              11        16
( 0.05,  0.15)               9        11
( 0.15,  0.25)               4         6
( 0.25,  0.50)               3         4
( 0.50,  0.52)               1         1

SO4 Deposition (g/m2)
(RVSC - UVSC)             UDCR      RDCR

(-0.33, -0.25)               0         1
(-0.25, -0.15)               0         2
(-0.15, -0.05)               2         2
(-0.05,  0.00)              14        20
no change                   89       120
( 0.00,  0.05)              11        16
( 0.05,  0.15)               8        10
( 0.15,  0.25)               3         4
( 0.25,  0.50)               5         7
( 0.50,  0.53)               1         1

-------
TABLE 2.6.  [Table of the 16 sites, within the region where estimates
            are calculated, with the largest changes in their pH and
            sulfate observations.  The table was printed sideways over
            two pages and its contents are not recoverable.]

-------
TABLE 2.7.  The Samples Within the 16 Sites (Shown in Table 2.6) Which
            Did Not Meet the UDDC Valid Sample Criteria

ADS                        % TOTAL                       SULFATE
SITE ID    N    CAUSE      PRECIP     PRECIP     pH      CONC (mg/L)

032a       1    period       25.6       20.6     4.600      1.450
           6    period       11.2        9.0     4.334      3.040
043b       1    bulk          3.5        3.0     4.190      7.790
           3    bulk          4.3        3.6     4.506      2.152
047a       1    bulk          4.4        5.3     3.770      9.560
073a       2    period       12.0       12.4     4.662      1.102
077a       1    bulk          0.8        1.0     6.44       1.250
           1    bulk          0.3        0.4     3.75      74.050
163a       2    bulk          0.9        1.0     4.726     51.995
168a       4    bulk          5.5        5.2     4.158      2.441
           2    bulk          3.0        2.8     4.182      4.734
           1    bulk          0.1        0.1     3.860
187a       2    period        7.5        8.0     4.666      1.478
188a       2    bulk         18.7       18.9     4.751      2.826
192a       1    bulk         13.5       13.9     4.120      4.100
208a       1    period        2.7        4.7     5.020      0.800
           1    period        9.9       16.9     4.570      1.700
241a       2    period        6.1        7.3     4.053      4.744
250a       2    bulk          5.8        5.6     4.542      1.540
           2    bulk          2.2        2.1     4.044     11.321
276a       1    bulk          2.5        2.2     6.96       5.824
           1    bulk          8.2        7.2     4.23       2.750
420a       6    bulk         12.4       10.8     4.556      3.906
           1    bulk          0.5        0.4     2.680     69.320
495a       1    period        2.2        3.6     4.640
           1    period        2.2        3.5     7.120      1.100
           1    period        7.8       12.7     6.210      1.300

COMMENTS (in the order printed; row alignment is not preserved):
no sample volume; July; July; 5.8% collection efficiency; 11.2% and
24.8% collection efficiency; February and March; 35.1% collection
efficiency; March and April; March and April; February and March;
2.9% collection efficiency; November; December, 2.0% collection
efficiency.

-------

                              3.0   KRIGING
      To investigate the effects of using the four different subsets of
the observations, the observations are interpolated onto a regular grid
using a variation of kriging.  Kriging has often been the tool used to
make predictions of a spatial phenomenon (e.g. sulfate deposition) at
unobserved sites (e.g. a grid node).  Kriging's popularity is based on
the fact that the estimates it produces are 'sensible' and that it also
produces a variance that is often used in setting confidence intervals
about the estimates.

3.1   KRIGING ASSUMPTIONS

      Let $Z(\mathbf{x})$ be a realization of a spatial phenomenon.  For
example, $Z(\mathbf{x})$ can be the sulfate deposition at a site, where
$\mathbf{x}$ is the location of the site in two-dimensional space
($\mathbf{x} = (x, y)$).  'Simple' kriging assumes that the increments
(differences between the sulfate deposition at two sites),
$[Z(\mathbf{x}') - Z(\mathbf{x})]$, are stationary in the weak sense.
That is,

      $$E[Z(\mathbf{x}') - Z(\mathbf{x})] = 0 \tag{1}$$

 and

      $$\mathrm{VAR}[Z(\mathbf{x}') - Z(\mathbf{x})] = 2\gamma(h) \tag{2}$$

 where

      $h$ = distance between $\mathbf{x}$ and $\mathbf{x}'$

      $\gamma(h)$ = semi-variogram.
       Equation (1) states that the expected difference between the two
 sites is zero or that the expected sulfate depositions are constant over
 the region of interest.  Equation (2) states  that the increment has a
 variance and this variance is a function (called the semi-variogram)
 only of the distance between the two sites.

       When there is a systematic change (drift or trend) in the spatial
 phenomenon, then

-------
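      $$E[Z(\mathbf{x})] = m(\mathbf{x}) \tag{3}$$

where $m(\mathbf{x})$ is usually modeled as a low-order polynomial.  Now, using
equation (3),

      $$E[Z(\mathbf{x}') - Z(\mathbf{x})] = m(\mathbf{x}') - m(\mathbf{x}) \tag{4}$$

and

      $$\mathrm{VAR}[Z(\mathbf{x}') - Z(\mathbf{x})] = 2\gamma(h) - [m(\mathbf{x}') - m(\mathbf{x})]^2 \tag{5}$$
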
      Thus, the increments no longer have a constant expected value and
the variance of the increments is a function of both the semi-variogram
and the drift.  Now the variance of the increments cannot be modeled by
the semi-variogram alone.  The problem at this point is that to be able
to estimate the drift, the semi-variogram needs to be known, and to be
able to estimate the semi-variogram, the drift needs to be known.
Unfortunately, neither is known.

      In practice, the drift is often ignored.  There are two common
justifications given for ignoring the drift.  First, the actual
estimates derived from kriging use only a subset of the sites near the
point being estimated; thus if the drift is 'small' in the
'neighborhood' of the point being estimated, then the drift's effect on
the estimates will be negligible.  The second justification is that
'what's drift to one person is correlation to the next.'

3.2   SEMI-VARIOGRAM ESTIMATION

      As seen in equation (2), under the assumption that the increments
are stationary in the weak sense, the variance of the increments is
modeled by the semi-variogram.  The semi-variogram model must be of a
form such that the variance is non-negative.

      Five of the more common semi-variogram models which give
non-negative values are:

-------
  1.  Power Model

      $$\gamma(h) = b|h|^p \quad \text{for } 0 < p < 2$$

      (when $p = 1$, the semi-variogram is simply a linear model)

  2.  Spherical Model

      $$\gamma(h) = \begin{cases} C\left[\dfrac{3}{2}\dfrac{|h|}{r} - \dfrac{1}{2}\dfrac{|h|^3}{r^3}\right] & \text{for } |h| \le r \\ C & \text{for } |h| > r \end{cases}$$

  3.  Cubic Model

      $$\gamma(h) = \begin{cases} C\left[7\dfrac{|h|^2}{r^2} - \dfrac{35}{4}\dfrac{|h|^3}{r^3} + \dfrac{7}{2}\dfrac{|h|^5}{r^5} - \dfrac{3}{4}\dfrac{|h|^7}{r^7}\right] & \text{for } |h| \le r \\ C & \text{for } |h| > r \end{cases}$$

  4.  Exponential Model

  5.  Gaussian Model

      In the above equations, $r$ equals the range of the semi-variogram
and $C$ equals the sill.  The range can be thought of as the 'zone of
influence'.  If the distance between two sites is less than the range,
then the value at one site influences the value at the other site.  If
the distance between two sites is greater than the range, then the sites
are independent.  The sill is the bound on the semi-variogram and
provides an estimate of the overall variability.  The power model does
not have a range or a sill.  The exponential and Gaussian models never
reach their sill.  Figure 3.1 gives a comparison of these semi-variogram
models where the sill and range are one for all the models except the

-------
power models where b is set to one and the semi-variogram is truncated
at one at distance one.  As seen in Figure 3.1, a wide range of
semi-variograms can be modeled using these five models.
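The five model families can be sketched as plain Python functions.  The power, spherical, and cubic forms are the standard ones with sill `sill` and range `rng`.  The exponential and Gaussian forms used here, `sill * (1 - exp(-|h|/rng))` and `sill * (1 - exp(-(h/rng)**2))`, are one common parameterization and are an assumption of this sketch; other texts scale the exponent differently.  All function and parameter names are illustrative.

```python
import math


def power_model(h, b=1.0, p=1.0):
    """gamma(h) = b * |h|**p, valid for 0 < p < 2 (p = 1 is the linear model)."""
    if not 0 < p < 2:
        raise ValueError("p must lie strictly between 0 and 2")
    return b * abs(h) ** p


def spherical_model(h, sill=1.0, rng=1.0):
    u = abs(h) / rng
    return sill if u > 1 else sill * (1.5 * u - 0.5 * u ** 3)


def cubic_model(h, sill=1.0, rng=1.0):
    u = abs(h) / rng
    if u > 1:
        return sill
    return sill * (7 * u ** 2 - 8.75 * u ** 3 + 3.5 * u ** 5 - 0.75 * u ** 7)


def exponential_model(h, sill=1.0, rng=1.0):
    # Assumed parameterization; approaches the sill but never reaches it.
    return sill * (1.0 - math.exp(-abs(h) / rng))


def gaussian_model(h, sill=1.0, rng=1.0):
    # Assumed parameterization; approaches the sill but never reaches it.
    return sill * (1.0 - math.exp(-((h / rng) ** 2)))


models = [power_model, spherical_model, cubic_model,
          exponential_model, gaussian_model]
values_at_zero = [m(0.0) for m in models]
values_at_range = [spherical_model(1.0), cubic_model(1.0)]
```

With the sill and range set to one, the spherical and cubic models reach the sill exactly at the range, while the exponential and Gaussian models stay below it at every finite distance.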
FIGURE 3.1.  Semi-Variogram Models Where the Sill and Range
             are One For All the Models Except the Power Models
             Where "b" is Set to One and the Semi-Variogram is
             Truncated at One at Distance of One
             [Plot of the semi-variogram against distance (h) from
             0.0 to 1.5 for the cubic, exponential, spherical, Gaussian,
             and power (p=1/2 and p=3/2) models.]
      When $h$ is zero, $\gamma(0)$ must also be equal to zero.  However, if the
semi-variogram does not tend to zero for measurements taken at
arbitrarily close points, then there is a discontinuity of the
semi-variogram at the origin.  This discontinuity is called the nugget
effect.  If there is a nugget effect, the variogram model is adjusted to
take it into account.  For example, if the model is linear with a nugget
of size $C_0$ (the intercept) then

      $$\gamma(h) = \begin{cases} b|h| + C_0 & \text{for } |h| > 0 \\ 0 & \text{for } |h| = 0 \end{cases}$$
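The linear model with a nugget translates directly into code.  In this minimal Python sketch the function name `linear_with_nugget` and the numerical values are illustrative; `slope` plays the role of b and `nugget` the role of the intercept C0.

```python
def linear_with_nugget(h, slope, nugget):
    """Linear semi-variogram with a jump of size `nugget` at the origin."""
    if h == 0:
        return 0.0
    return slope * abs(h) + nugget


at_origin = linear_with_nugget(0.0, slope=2.0, nugget=0.5)
just_off_origin = linear_with_nugget(1e-9, slope=2.0, nugget=0.5)
at_three = linear_with_nugget(3.0, slope=2.0, nugget=0.5)
```

The value is exactly zero at the origin but jumps to approximately the nugget for any positive distance, however small.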

-------
      The semi-variogram is estimated by

      $$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [Z(\mathbf{x}_i + h) - Z(\mathbf{x}_i)]^2 \tag{6}$$

where

      $Z(\mathbf{x}_i + h) - Z(\mathbf{x}_i)$ = difference between a pair of
                             observations which are a distance $h$ apart

      $N(h)$ = number of pairs of points actually taken into the sum.

 In practice $h$ is a range of distances.
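Equation (6) with h treated as a range of distances can be sketched in Python as follows.  The function name `empirical_semivariogram`, the half-open bin convention, and the four-site example are assumptions of this sketch rather than details taken from the report.

```python
import math
from itertools import combinations


def empirical_semivariogram(sites, values, bin_edges):
    """Return one (gamma_hat, n_pairs) tuple per distance bin.

    gamma_hat is None for a bin that contains no pairs.
    """
    n_bins = len(bin_edges) - 1
    sq_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(sites)), 2):
        h = math.dist(sites[i], sites[j])
        for k in range(n_bins):
            # Each pair contributes to the bin [lower, upper) holding its distance.
            if bin_edges[k] <= h < bin_edges[k + 1]:
                sq_sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
                break
    return [(s / (2 * n) if n else None, n) for s, n in zip(sq_sums, counts)]


# Hypothetical transect of four sites, one distance unit apart.
sites = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
result = empirical_semivariogram(sites, values, bin_edges=[0.5, 1.5, 2.5, 3.5])
```

For this transect the three bins hold 3, 2, and 1 pairs, so the estimate at the largest distance rests on a single pair, which is why the number of pairs per bin is reported alongside each estimate.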

 3.3   KRIGING ESTIMATOR

       The kriging estimator is

      $$\hat{Y}_0 = \sum_{i=1}^{n} \lambda_i Z(\mathbf{x}_i)$$

 where

      $\hat{Y}_0$ = the kriging estimator at grid node $\mathbf{x}_0$

      $\lambda_i$ = kriging weights

      $Z(\mathbf{x}_i)$ = the observed sulfate deposition at site $\mathbf{x}_i$

      $n$ = the number of sites used in the estimator.

       The number of sites used in the estimator, $n$ (generally between
 8 and 16), is only a small fraction of the total number of sites.
 Only 'close' observations are used, to reduce the size of the matrices
 that need to be manipulated.  As long as the semi-variogram model has a
 small nugget as compared to the sill, the weights decrease rapidly
 with distance from $\mathbf{x}_0$ and thus observations that are 'far'
 from $\mathbf{x}_0$ have negligible weight.
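A minimal Python sketch of a kriging estimate from the nearest sites is given here.  It assumes the ordinary-kriging form of the system (weights constrained to sum to one through a Lagrange multiplier), which the text does not print; the function names `solve` and `kriging_estimate`, the linear semi-variogram, and the four-site example are all illustrative.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def kriging_estimate(sites, values, target, gamma, n_neighbors=8):
    """Estimate at `target` from the n_neighbors closest sites."""
    order = sorted(range(len(sites)),
                   key=lambda i: math.dist(sites[i], target))[:n_neighbors]
    n = len(order)
    # Semi-variogram matrix bordered by the sum-to-one constraint.
    a = [[gamma(math.dist(sites[i], sites[j])) for j in order] + [1.0]
         for i in order]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(sites[i], target)) for i in order] + [1.0]
    solution = solve(a, b)
    weights, multiplier = solution[:n], solution[n]
    estimate = sum(w * values[i] for w, i in zip(weights, order))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + multiplier
    return estimate, variance, weights


# Hypothetical sites on the corners of a square; target at the center.
sites = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
values = [1.0, 2.0, 3.0, 6.0]
estimate, variance, weights = kriging_estimate(
    sites, values, (1.0, 1.0), gamma=lambda h: h, n_neighbors=4)
```

Because the four sites are equidistant from the target, each receives the same weight and the estimate is the simple average of the four observations.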

-------
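       The 'simple' kriging variance is

      $$\sigma_K^2 = 2\sum_{i=1}^{n} \lambda_i \gamma(\mathbf{x}_i, \mathbf{x}_0) - \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j \gamma(\mathbf{x}_i, \mathbf{x}_j)$$

       where

      $\gamma(\mathbf{x}_i, \mathbf{x}_j)$ = semi-variogram, $\gamma(h)$, where $h$ is the
                          distance between $\mathbf{x}_i$ and $\mathbf{x}_j$.

       If the semi-variogram has a sill and there is no drift, the
 kriging variance may be stated in terms of variances and covariances:

      $$\sigma_K^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j \sigma(\mathbf{x}_i, \mathbf{x}_j) - 2\sum_{i=1}^{n} \lambda_i \sigma(\mathbf{x}_i, \mathbf{x}_0) + \sigma^2 \tag{7}$$

      where

      $\sigma(\mathbf{x}_i, \mathbf{x}_j)$ = covariance between $Z(\mathbf{x}_i)$ and $Z(\mathbf{x}_j)$

      $\sigma^2$ = variance of $Y_0$ (sill).

3.4   GENERALIZED LEAST SQUARES (GLS) REGRESSION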
      Least squares regression assumes that the errors in the
observations are independent.  That is, the deviations between a trend
surface fit using least squares regression and the actual observations
are independent.  If it is assumed that these errors are not independent,
then generalized least squares regression can be used to estimate the
trend surface.  For example, in kriging the dependence in the errors is
assumed to be a function only of the distance between the observations.
Because these errors are related, they also form a 'surface' that fits
on top of the trend surface.  Therefore, it makes sense to also estimate
the error 'surface' and then add it to the trend surface.

-------
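3.5   DERIVATION OF THE GLS ESTIMATOR

      The GLS model is

      $$Y = X\beta + \varepsilon$$

      where

      $$Y' = (Y_1, Y_2, \ldots, Y_n)$$

      $$X = \begin{bmatrix} 1 & x_1 & y_1 & \cdots & f_m(\mathbf{x}_1) \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & y_n & \cdots & f_m(\mathbf{x}_n) \end{bmatrix}$$

      $f_m(\mathbf{x}_i)$ = a low-order polynomial of $\mathbf{x}_i$

      $\varepsilon \sim (0, \sigma^2 V)$ (i.e. distributed with mean zero and
                               covariance matrix $\sigma^2 V$; the form of the
                               distribution (e.g. Gaussian) need not be
                               specified until required for computing
                               confidence intervals, etc.)

      $\sigma^2$ = the sill of the semi-variogram.

        Then

      $$\hat{\beta} = (X'V^{-1}X)^{-1}X'V^{-1}Y$$

 and

      $$\mathrm{VAR}[\hat{\beta}] = \sigma^2 (X'V^{-1}X)^{-1}.$$

 Now, the value of the realization at the unobserved location $\mathbf{x}_0$ is

      $$Y_0 = X_0'\beta + \varepsilon_0$$

      where

      $$X_0' = (1, x_0, y_0, \ldots, f_m(\mathbf{x}_0)), \qquad \mathrm{VAR}[\varepsilon_0] = \sigma^2, \qquad \mathrm{COV}[\varepsilon, \varepsilon_0] = \sigma^2 V_0 .$$

Then, the estimator of $Y_0$ is

      $$\hat{Y}_0 = X_0'\hat{\beta} + V_0'V^{-1}(Y - X\hat{\beta}) = \lambda' Y$$

      where $\lambda$ is the vector of kriging weights

      $$\lambda' = V_0'V^{-1} + (X_0' - V_0'V^{-1}X)(X'V^{-1}X)^{-1}X'V^{-1}.$$

Now,

      $$\mathrm{VAR}[\hat{Y}_0] = \mathrm{VAR}[\lambda' Y] = \lambda'\,\mathrm{VAR}[Y]\,\lambda = \sigma^2 V_0'V^{-1}V_0 + 2\sigma^2 V_0'V^{-1}X(X'V^{-1}X)^{-1}(X_0 - X'V^{-1}V_0) + \sigma^2 (X_0' - V_0'V^{-1}X)(X'V^{-1}X)^{-1}(X_0 - X'V^{-1}V_0),$$

      $$\mathrm{VAR}[Y_0] = \sigma^2$$

 and

      $$\mathrm{COV}[\hat{Y}_0, Y_0] = \mathrm{COV}[\lambda' Y, Y_0] = \lambda'\,\mathrm{COV}[Y, Y_0] = \sigma^2 V_0'V^{-1}V_0 + \sigma^2 (X_0' - V_0'V^{-1}X)(X'V^{-1}X)^{-1}X'V^{-1}V_0 .$$

 So,

      $$\mathrm{VAR}[\hat{Y}_0 - Y_0] = \mathrm{VAR}[\hat{Y}_0] + \mathrm{VAR}[Y_0] - 2\,\mathrm{COV}[\hat{Y}_0, Y_0]$$

thus,

      $$\mathrm{VAR}[\hat{Y}_0 - Y_0] = \sigma^2 \left[ 1 - V_0'V^{-1}V_0 + (X_0' - V_0'V^{-1}X)(X'V^{-1}X)^{-1}(X_0 - X'V^{-1}V_0) \right].$$

This is the kriging variance.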
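The GLS prediction and its variance can be checked numerically with a small Python sketch.  To keep it short, the sketch assumes a constant trend (X is a single column of ones, so X0 = 1) and an exponential correlation function for V; both are simplifying assumptions, as are the function names, the sill and range values, and the four hypothetical sites.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gls_predict(sites, y, target, sill, rng):
    """GLS prediction and variance with a constant trend (assumed)."""
    n = len(sites)
    corr = lambda h: math.exp(-h / rng)          # assumed correlation model
    v = [[corr(math.dist(sites[i], sites[j])) for j in range(n)]
         for i in range(n)]
    v0 = [corr(math.dist(s, target)) for s in sites]
    vinv_y = solve(v, y)                         # V^-1 Y
    vinv_1 = solve(v, [1.0] * n)                 # V^-1 X, X a column of ones
    vinv_v0 = solve(v, v0)                       # V^-1 V0
    xtvx = sum(vinv_1)                           # X'V^-1 X (a scalar here)
    beta = sum(vinv_y) / xtvx                    # GLS trend estimate
    estimate = beta + sum(c * (yi - beta) for c, yi in zip(vinv_v0, y))
    d = 1.0 - sum(vinv_v0)                       # X0 - X'V^-1 V0
    quad = sum(a * b for a, b in zip(v0, vinv_v0))   # V0'V^-1 V0
    variance = sill * (1.0 - quad + d * d / xtvx)
    return estimate, variance


sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 2.0)]
y = [2.0, 2.5, 1.5, 4.0]
at_site, var_at_site = gls_predict(sites, y, sites[0], sill=0.44, rng=2.0)
far, var_far = gls_predict(sites, y, (100.0, 100.0), sill=0.44, rng=2.0)
```

At an observed site (and with no within-site variance) the prediction reproduces the observation with zero variance.  Far from every site the correlation with the data vanishes, so the variance exceeds the sill by the uncertainty in the estimated trend.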

3.6   WITHIN-SITE VARIATION

      The nugget effect in kriging is often attributed to the
within-site variation.  The practical result of using the nugget effect in
kriging is to force the surface through the observations.  That is, the
kriging estimate at a site which has an observation is that observation.
Thus the surface has 'spikes' wherever there are observations.  These
'spikes' have no surface area associated with them; they are a jump
discontinuity at the site of the observation.  Additionally, the sites
with duplicate observations must be preprocessed (usually the mean
observation is calculated), since only one observation per site can be
used.

      In the GLS estimation procedure, the within-site variation is
accounted for by an additional parameter in the model,

      $$Y = X\beta + \eta + \varepsilon$$

      where

            $\eta$ and $\varepsilon$ are uncorrelated.

These two sources of error can be combined so that

      $$Y = X\beta + \nu$$

      where

      $$\nu \sim (0, \sigma^2 V + \sigma_w^2 I)$$

and $\sigma_w^2$ denotes the within-site variance.

      Then in the derivations shown in Section 3.5, $\sigma^2 V$ is replaced by
$\sigma^2 V + \sigma_w^2 I$.  Note that $V_0$ is not changed.

      This formulation assumes that the within-site variation is the
same for all sites.  However, if the information is available, each site
could have its own within-site variation and $\sigma_w^2 I$ is simply replaced
with a diagonal matrix, with the diagonal elements being the within-site
variation at each site.
      By using this within-site variation formulation, the only time the
estimate will differ from the kriging estimate using the nugget is at
the observation.  Now the estimate is no longer the same as the
observation.  Additionally, sites with duplicate observations do not
need to be preprocessed.  The algorithm in essence uses their mean.

      If there is no within-site variation, then the estimate at a site
where there is an observation will be that observation.  That is, the
model assumes that there is no variation (or error) in the data.  In the
original application of kriging to ore reserve estimation, this
assumption may be valid.  However, in applying kriging to wet deposition
summaries, the assumption that the data at a site has no inherent
variability or error does not appear to be valid.
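The effect of the within-site variance on the estimate at an observed site can be demonstrated with a short Python sketch.  It assumes a constant trend and an exponential covariance; the within-site variance is added only to the diagonal of the covariance matrix of the observations, while the covariances with the prediction location are left unchanged.  All names and numerical values are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def predict(sites, y, target, sill, rng, within_site_var):
    """GLS prediction with a constant trend and exponential covariance (assumed)."""
    n = len(sites)
    cov = [[sill * math.exp(-math.dist(sites[i], sites[j]) / rng)
            + (within_site_var if i == j else 0.0)   # added to the diagonal only
            for j in range(n)] for i in range(n)]
    # Covariance with the prediction location is not changed.
    c0 = [sill * math.exp(-math.dist(s, target) / rng) for s in sites]
    beta = sum(solve(cov, y)) / sum(solve(cov, [1.0] * n))
    gain = solve(cov, c0)
    return beta + sum(g * (yi - beta) for g, yi in zip(gain, y))


# Hypothetical sites on a square; the first site has an unusually high value.
sites = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (3.0, 3.0)]
y = [5.0, 1.0, 1.0, 1.0]
exact = predict(sites, y, sites[0], sill=0.44, rng=2.0, within_site_var=0.0)
smoothed = predict(sites, y, sites[0], sill=0.44, rng=2.0, within_site_var=0.1)
```

With no within-site variance the estimate at the first site equals its observation.  With a positive within-site variance the estimate is pulled away from the unusual observation toward the surrounding values.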

3.7   ADVANTAGES OF GLS APPROACH

      When the variance-covariance matrix is specified using a function
that is dependent on distance, the GLS approach is kriging.  However,
the GLS approach has the flexibility both to investigate the phenomenon
of interest and to add additional information to the model.

      Often one semi-variogram does not fit the entire region of
interest.  Currently, to get around this problem, the region is divided
into several smaller regions.  The boundaries of the sub-regions are
usually artificial and ad hoc procedures must be used to blend the
results for the separate sub-regions together.  By doing a 'moving' GLS,
where only a relatively small number of observations near the point of
interest are used in the estimation process, the changes in the 'sill'

-------
of the semi-variogram can be investigated over the region of interest.
Additionally, by using a 'moving' approach, the problem with the
boundaries of the arbitrarily chosen regions no longer exists.

      The GLS approach allows other information to be added to the
model.  For example, there are some pollutants that are highly
correlated with population density (e.g. automobile exhausts).
Additionally, information about population density across the United
States can be obtained or well estimated with census information.  By
adding this population information to the model, observations in urban
areas will not cause overestimation in nearby rural areas and
observations in rural areas will not cause underestimation in nearby
urban areas.

-------
                        4.0  VARIOGRAM ESTIMATES
      From previous studies it is known that one semi-variogram does not
fit the entire region shown in Figure 2.1.  The shapes of the
semi-variograms for pH and sulfate precipitation concentration and
deposition are primarily due to the 'depression', for pH, and 'hump', for
sulfate, in the surface that is centered around northeastern West Virginia.
These surfaces maintain about the same slopes within the smaller region
shown in Figure 2.1.  Outside this region the slopes change markedly
(increasing for pH and decreasing for sulfate) and thus the
semi-variogram model also changes.

      Because our concern in this paper is about the effects of changing
criteria for data, we will only look at the region where one variogram
reasonably works.  Because the exact boundary for this region is not
known, the semi-variograms are calculated using only the sites within
the smallest region.  However, the grid is expanded to the larger region
and any site shown in Figure 2.1 is potentially used in the estimate.

      Figure 4.1 shows the raw semi-variograms and the estimated models
for pH, sulfate concentration and sulfate deposition.  As seen in this
figure the semi-variograms do not change very much between the four
different subsets.  Therefore, one semi-variogram model can be used for
all four subsets.  The 'nugget' that is observed at the origin of these
semi-variograms is used as the estimate of the within-site variation.
Additionally, since we use the GLS method the semi-variogram model is
converted to its covariance equivalent (see equation 7).  The pH has a
linear model with a sill of 0.03 (pH units)^2, a range of 1280 kilometers
and a within-site variance of 0.002 (pH units)^2.  The range of the
linear model is artificially set beyond the actual range used since the
covariance model needs a range.  The sulfate concentration has an
exponential model with a sill of 0.44 (mg/l)^2, a range of 1150
kilometers and a within-site variance of 0.02 (mg/l)^2.  The sulfate
deposition has an exponential model with a sill of 0.70 (g/m^2)^2, a range
of 960 kilometers and a within-site variance of 0.03 (g/m^2)^2.  The
exponential model never actually reaches the sill; the range given above
is the distance where the model is 95 percent of the sill.
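
      As a rough illustration of the conversion described above, the
following Python sketch turns the sulfate concentration semi-variogram into
a covariance function.  The functional forms are assumptions: the
exponential model is written so that it reaches 95 percent of the sill at
the stated range (exp(-3) is about 0.05), the sill is taken to exclude the
nugget, and the function names are hypothetical.

```python
import math

# Parameters quoted in the text for sulfate concentration.
SILL = 0.44       # (mg/l)^2, assumed here to exclude the nugget
RANGE_KM = 1150   # distance at which the model reaches 95% of the sill
NUGGET = 0.02     # (mg/l)^2, within-site variance

def semivariogram(h_km):
    """Exponential semi-variogram with a nugget (assumed form)."""
    if h_km == 0:
        return 0.0
    return NUGGET + SILL * (1.0 - math.exp(-3.0 * h_km / RANGE_KM))

def covariance(h_km):
    """Covariance equivalent: C(h) = C(0) - gamma(h), with C(0) = nugget + sill."""
    return (NUGGET + SILL) - semivariogram(h_km)

# Fraction of the sill reached at the stated range (1 - exp(-3)).
fraction_at_range = (semivariogram(RANGE_KM) - NUGGET) / SILL
```

Under these assumptions the covariance at zero distance is the total
variance (nugget plus sill), and it decays toward zero as the separation
between sites grows.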

-------
FIGURE 4.1.  Semi-variogram for each of the Subsets of pH, Sulfate
             Concentration and Sulfate Distributions.  [Three panels: pH,
             Sulfate Concentration and Sulfate Deposition; horizontal axis
             in kilometers (0 to 1282); legend: RVSC/RDCR, RVSC/UDCR,
             UVSC/RDCR, UVSC/UDCR and Semi-variogram Model.]
-------
                              5.0  RESULTS
      The GLS variation of kriging,  described in section 3,  is used to
estimate the pH,  sulfate concentration,  and sulfate deposition for each
of the subsets at grid nodes of a regular square grid.   This procedure
uses the eight closest sites to the grid node in the estimate.  The grid
nodes are 32 kilometers apart and the area of the grid,  shown in
Figure 2.1, consists of 3669 nodes.  This grid is then contoured using
bilinear interpolation in SAS (procedure gcontour).
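
      The estimation step described above can be sketched in Python as an
ordinary-kriging estimate built from the eight closest sites.  This is a
minimal sketch, not the procedure actually used for the report: the site
coordinates and values are synthetic, the exponential covariance form is
assumed, and the names exp_cov and krige_node are hypothetical.

```python
import numpy as np

def exp_cov(h, sill, range_km):
    """Assumed exponential covariance between site values at distance h (km)."""
    return sill * np.exp(-3.0 * h / range_km)

def krige_node(node_xy, site_xy, site_val, sill, range_km, nugget, k=8):
    """Estimate the value at one grid node from its k closest sites."""
    d_node = np.hypot(*(site_xy - node_xy).T)
    idx = np.argsort(d_node)[:k]
    xy, z = site_xy[idx], site_val[idx]
    n = len(idx)
    # Covariance among the selected sites; the within-site variance
    # (nugget) is added on the diagonal.
    d = np.hypot(xy[:, None, 0] - xy[None, :, 0],
                 xy[:, None, 1] - xy[None, :, 1])
    C = exp_cov(d, sill, range_km) + nugget * np.eye(n)
    c0 = exp_cov(d_node[idx], sill, range_km)
    # Augment with a Lagrange multiplier so the weights sum to one.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = C
    A[n, n] = 0.0
    w = np.linalg.solve(A, np.append(c0, 1.0))[:n]
    return float(w @ z)

# Synthetic check: 20 made-up sites with a constant value of 2.5 mg/l,
# estimated on a square grid with nodes 32 kilometers apart.
rng = np.random.default_rng(0)
sites = rng.uniform(0.0, 640.0, size=(20, 2))
values = np.full(20, 2.5)
grid = [(x, y) for x in range(0, 641, 32) for y in range(0, 641, 32)]
est = [krige_node(np.array(g, dtype=float), sites, values, 0.44, 1150.0, 0.02)
       for g in grid]
```

Since the weights are constrained to sum to one, a constant field is
reproduced at every node, which gives a simple check on the sketch.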

      The contour maps of the pH and sulfate concentration and
deposition show,  in broad terms, the effects of using different subsets
of the observations.  Additionally,  contour maps of the differences
between several of the different subsets at each grid node are prepared.
These maps show the 'local' effects, both extent and magnitude, of using
different subsets of observations.

5.1  CONTOUR MAPS OF THE ESTIMATES

      Figures 5.1, 5.2, and 5.3 show the contoured estimates of pH,
sulfate concentration and deposition, respectively, using the UVSC/UDCR
subsets of observations.  These maps are shown for two reasons.  First,
they are the maps that use the observations that meet the current, and
most stringent, sample validity and data completeness criteria.  Thus,
the effects of using different subsets of observations are judged
relative to these maps.  Second, these maps are large enough to
include details, such as the contour levels, that become lost as the
maps are reduced in size for comparisons.  As seen in Figures 5.1 and
5.2, the pH and sulfate concentrations have relatively smooth contour
maps.  However, as seen in Figure 5.3, the sulfate deposition contour map
has several mounds and depressions.  The sulfate deposition is the
product of the sulfate concentration and total precipitation.  The
mounds and depressions in the sulfate deposition contour map are due to
precipitation gradients that are not parallel to the concentration
gradients or to local precipitation that is unusually high or low
compared to neighboring sites.
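
      The relation between concentration, precipitation, and deposition
can be illustrated with a short Python sketch.  The unit conversion follows
from one millimeter of precipitation over one square meter being one liter
of water; the site values and the function name are hypothetical.

```python
def sulfate_deposition_g_m2(conc_mg_per_l, precip_mm):
    """Deposition (g/m^2) = concentration (mg/l) x precipitation (mm) / 1000.

    One millimeter of precipitation over one square meter is one liter,
    so mg/l times mm gives mg/m^2; dividing by 1000 converts to g/m^2.
    """
    return conc_mg_per_l * precip_mm / 1000.0

# Two hypothetical sites with the same concentration but different
# annual precipitation amounts.
dry_site = sulfate_deposition_g_m2(3.0, 800.0)
wet_site = sulfate_deposition_g_m2(3.0, 1400.0)
```

Even with identical concentrations, the wetter hypothetical site receives
considerably more deposition, which is how local precipitation can create a
mound in the deposition map that is absent from the concentration map.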


-------
FIGURE 5.1.  Contours of pH Estimates Using the UDDC Valid Sample Criteria/
             UDDC Data Completeness Rating Subset.

-------
FIGURE 5.2.  Contours of Sulfate Concentration (mg/l) Estimates Using the
             UDDC Valid Sample Criteria/UDDC Data Completeness Rating
             Subset.

-------
FIGURE 5.3.  Contours of Sulfate Deposition (g/sq m) Estimates Using the
             UDDC Valid Sample Criteria/UDDC Data Completeness Rating
             Subset.

-------
      Figures 5.4, 5.5, and 5.6 show the contoured estimates of pH,
sulfate concentration and deposition, respectively, using all four
subsets of observations.  In these figures, the number of sites used
increases from left to right (UDCR versus RDCR) and the values of the
observations change from top to bottom (UVSC versus RVSC).

      The pH estimates (see Figure 5.4) increase in Ontario and Quebec
when the number of sites used increases, from UDCR to RDCR.  This region
has few sites, so the addition of a few more sites has a profound effect
on the contours.  The additional sites also produced increases in pH in
northeastern New York, northern Virginia, southern West Virginia,
eastern Tennessee, northern Alabama, northern Mississippi and Arkansas,
and decreases in North Carolina, eastern Wisconsin, and southwestern
Indiana.  Along the border of Pennsylvania and New York the value of the
observation at one site (ADS ID 047a: NADP; Jasper, New York) has a
profound effect when the UVSC is used versus the RVSC.  This site is not
in the UVSC/UDCR subset.

      The sulfate concentration estimates (see Figure 5.5) change in
Ontario and Quebec when the number of sites used increases, from UDCR to
RDCR.  As with pH, this region has few sites, so the addition of a few
more sites has a profound effect on the contours.  However, for sulfate
concentration there is a decrease to the north and an increase in the
south.  The increase in sites has a profound effect on the largest
contour, 3.5 mg/l.  When the UDCR is used, this contour is confined to
northern West Virginia, southeastern Ohio and southwestern Pennsylvania.
However  with the additional sites 
-------
FIGURE 5.4.  Contours of pH Estimates of the Four Subsets Using the UDDC or
             Relaxed Valid Sample Criteria (UVSC or RVSC) and the UDDC or
             Relaxed Data Completeness Rating (UDCR or RDCR).  The Values of
             the Contour Lines are Given in Figure 5.1.
-------
FIGURE 5.5.  Contours of Sulfate Concentration (mg/l) Estimates of the Four
             Subsets Using the UDDC or Relaxed Valid Sample Criteria (UVSC
             or RVSC) and the UDDC or Relaxed Data Completeness Rating
             (UDCR or RDCR).  The Values of the Contour Lines are Given in
             Figure 5.2.
-------
FIGURE 5.6.  Contours of Sulfate Deposition (g/sq m) Estimates of the Four
             Subsets Using the UDDC or Relaxed Valid Sample Criteria (UVSC
             or RVSC) and the UDDC or Relaxed Data Completeness Rating
             (UDCR or RDCR).  The Values of the Contour Lines are Given in
             Figure 5.3.
-------
      The sulfate deposition estimates (see Figure 5.6) increase in
southern Ontario, southern Wisconsin and northern Ohio when the number
of sites used increases, from UDCR to RDCR.  The 3.0 g/m^2 contour moved
approximately 300 kilometers west.  The presence of two sites on the
southwestern border of Indiana bordering on Illinois 
-------
FIGURE 5.7.  Comparison of Sulfate Concentration and Deposition Estimates
             With and Without Parsons, West Virginia Site Present.  The Rest
             of the Sites Belong to UVSC/UDCR Subset.  See Figures 5.2 and 5.3
             for the Values of the Contour Lines.
-------
FIGURE 5.8.  Differences in pH Estimates for Selected Pairs of Subsets.  The
             Dark Lines Indicate an Increase in the Estimate and the Light
             Lines Indicate a Decrease in the Estimate.  The Outer Contour is
             0.025, the Middle Contour is 0.05 and the Interior Contour is
             0.10.  [Panels: RVSC/RDCR minus UVSC/UDCR; UVSC/RDCR minus
             UVSC/UDCR; RVSC/UDCR minus UVSC/UDCR; RVSC/RDCR minus UVSC/RDCR.]
-------
FIGURE 5.9.  Differences in Sulfate Concentration Estimates for Selected Pairs
             of Subsets.  The Dark Lines Indicate an Increase in the Estimate
             and the Light Lines Indicate a Decrease in the Estimate.  The
             Outer Contour is 0.25 mg/l and the Interior Contour is 0.5 mg/l.
             [Panels: RVSC/RDCR minus UVSC/UDCR; UVSC/RDCR minus UVSC/UDCR;
             RVSC/UDCR minus UVSC/UDCR; RVSC/RDCR minus UVSC/RDCR.]
-------
FIGURE 5.10. Differences in Sulfate Deposition for Selected Pairs of Subsets.
             The Dark Lines Indicate an Increase in the Estimate and the
             Light Lines Indicate a Decrease in the Estimate.  The Outer
             Contour is 0.25 g/sq m and the Interior Contour is 0.5 g/sq m.
             [Panels: RVSC/RDCR minus UVSC/UDCR; UVSC/RDCR minus UVSC/UDCR;
             RVSC/UDCR minus UVSC/UDCR; RVSC/RDCR minus UVSC/RDCR.]
-------
subsets.  The primary difference in these groups is the change in the
site values; however, there are a few additional sites.  The lower right
map is the difference between the UVSC/RDCR and the RVSC/RDCR subsets.
The differences in these groups are the changes in the values of the
observations with only one additional site on the Vermont-New York
border.  For each of these maps, the dark contours indicate an increase
in estimates from the UDDC to the relaxed definition(s) while the
lighter contours indicate a decrease in the estimates.  For pH the outer
contour indicates a difference of 0.025, the second contour is for a
difference of 0.05 and the third contour is for a difference of 0.1.
For sulfate only two contour levels are used.  The outer contour is 0.25
mg/l for the concentration and 0.25 g/m^2 for the deposition, while the
inner contour is 0.50 mg/l for the concentration and 0.50 g/m^2 for the
deposition.  The additional sites that are used in each comparison are
indicated with dark squares while the site locations that are the same
(although the value of the observation may have changed) are indicated
with white squares.  In the maps showing the differences between
strict-all and relaxed-all, only one additional site is used and it is
indicated with an X.
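
      The way a difference at a grid node maps onto the contour
conventions above can be sketched as follows.  This is a minimal
illustration: the pH estimates passed in are made up, and the function name
classify_difference is hypothetical.

```python
import bisect

# Contour levels quoted in the text for the pH difference maps.
PH_LEVELS = [0.025, 0.05, 0.10]

def classify_difference(relaxed, strict, levels=PH_LEVELS):
    """Return (direction, band) for one grid node.

    direction is 'increase' (dark lines), 'decrease' (light lines) or
    'none'; band counts how many contour levels the absolute difference
    reaches (0 means the node lies outside the outer contour).
    """
    diff = relaxed - strict
    band = bisect.bisect_right(levels, abs(diff))
    if band == 0:
        return "none", 0
    return ("increase" if diff > 0 else "decrease"), band

# Hypothetical node estimates under the relaxed and UDDC definitions.
inside_third = classify_difference(4.42, 4.30)
inside_second = classify_difference(4.30, 4.36)
outside = classify_difference(4.30, 4.31)
```

For sulfate the same function could be called with the two levels quoted
above, for example levels=[0.25, 0.50].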

      The effects of changing the subsets for pH, shown in Figure 5.8,
extend over a large area in many different regions.  As seen in this
figure, the additional sites are the primary cause of these differences
(the maps of RVSC/RDCR minus UVSC/UDCR and UVSC/RDCR minus UVSC/UDCR are
quite similar).  The few additional sites in Ontario and Quebec cause an
area whose length is greater than 1000 kilometers to be increased by
more than 0.1 pH units.  The additional sites in Arkansas, northern
Mississippi and northern Alabama also increase the pH estimates
throughout this region by as much as 0.1 units.  The additional sites in
southern Ontario reduced the pH in this region, although much of this
region is over Lake Huron.  Several additional sites affected 'local'
regions of over one hundred kilometers in diameter, with several of the
regions having considerable area with estimates that changed by more than
0.05 units.  Only the five sites with the largest changes in their pH
observations (see Table 2.6, sites 047a (Jasper, New York), 208a (Lac Le
Croix, Ontario), 241a (Gaylord, Michigan), 420a (Vincennes, Indiana),
and 495a (Mooseonee, Ontario)) have a noticeable effect (see RVSC/RDCR
minus UVSC/RDCR).  Of these sites, 495a (Mooseonee, Ontario), which is
the northernmost site shown in Ontario, has the largest effect.  The
size of this effect is due to the sparsity of data in the region.  As
seen in Table 2.7, the changes in the values for sites 420a (Vincennes,
Indiana) and 495a (Mooseonee, Ontario) are each primarily due to one
sample with an extremely unusual pH.  These samples have extremely poor
collection efficiencies (2.9% and 2.0%, respectively, of the sample
volume predicted from the rain gauge is actually present in the
collector).  Site 047a (Jasper, New York) is very unusual: on one map it
causes an increase, on another map it causes a decrease and on the other
two maps it has no effect.  This site has only one sample that is added
(see Table 2.7); however, it is also the only sample in June that is
analyzed, thus when the sample is removed, all of June is essentially
removed.  Without this sample, the pH value is large compared to its
neighboring sites.  When the sample is added, the pH decreases and the
site is no longer large compared to its neighboring sites.
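
      The collection efficiency quoted above, the share of the
gauge-predicted precipitation actually present in the collector, can be
sketched in Python.  The sample values, the 10 percent threshold, and the
function names are hypothetical and are chosen only to reproduce
efficiencies of 2.9% and 2.0%.

```python
def collection_efficiency(sample_depth_mm, gauge_depth_mm):
    """Percent of the gauge-predicted precipitation caught by the collector."""
    if gauge_depth_mm <= 0:
        raise ValueError("rain gauge depth must be positive")
    return 100.0 * sample_depth_mm / gauge_depth_mm

def flag_low_efficiency(samples, threshold_pct=10.0):
    """Return the IDs of samples below a hypothetical efficiency threshold."""
    return [sid for sid, caught, gauge in samples
            if collection_efficiency(caught, gauge) < threshold_pct]

# Made-up samples: (ID, depth caught by collector in mm, gauge depth in mm).
samples = [("A", 0.58, 20.0), ("B", 18.5, 20.0), ("C", 0.40, 20.0)]
flagged = flag_low_efficiency(samples)
```

A per-sample screen of this kind is one way to single out samples whose
chemistry may not represent the precipitation event.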

      The effects of changing the subsets for sulfate concentration,
shown in Figure 5.9, are greatest in southern Ontario.  As seen in this
figure, the additional sites are the primary cause of these differences.
Most of the new sites are in southern Ontario and have higher
concentrations than their neighboring sites.  These sites have poor UDDC
data completeness ratings and are only included when the data
completeness rating is relaxed.  Their poor ratings are primarily
because of their low collection efficiencies.  The three sites with the
largest changes in the sulfate concentrations (see Table 2.4, sites 047a
(Jasper, New York), 163a (Caribou, Maine) and 420a (Vincennes, Indiana))
have a noticeable effect.  The extent of the effects is related to the
density of neighboring sites, with the size increasing as the density
decreases.  The extent of the effect due to site 163a (Caribou, Maine)
is affected by site 436a (Presque Isle, Maine), which is just southwest
of it.  When the UDDC valid sample criteria is used to select the samples,
these two sites are similar (1.32 mg/l at 163a versus 1.55 mg/l at
436a), thus there is no effect on the UVSC/RDCR minus UVSC/UDCR map.
When the relaxed valid sample criteria is used to select the samples,
both sites' observations increase (1.84 mg/l at 163a versus 1.70 mg/l at
436a), thus the extent of the effect shown on the RVSC/RDCR minus
UVSC/RDCR map has a diameter of a little over 100 kilometers (the
averages of these two sites change from 1.44 mg/l to 1.77 mg/l).
However, on the RVSC/UDCR minus UVSC/UDCR map the diameter of the area
affected has increased to over 200 kilometers because the values change
from 1.32 mg/l (163a only) to 1.77 mg/l.  The increase in the sulfate
concentration at site 163a is due to two samples with unusually high
sulfate concentration (52 mg/l precipitation weighted average) with very
low collection efficiencies (11.2% and 24.8%).  Sites 420a along the
Indiana-Illinois border (Vincennes, Indiana) and 154a along the
Indiana-Kentucky border (Rockport, Indiana) also affect each other.  Under
the strict criteria for selecting samples their sulfate concentrations are
2.62 mg/l and 2.65 mg/l, respectively.  However, only site 154a has an
effect when UVSC/RDCR and UVSC/UDCR are compared because of the lower
observations to the south of 154a.  However, when the valid sample
criteria is relaxed, site 420a increases to 3.08 mg/l and causes an
effect with a diameter of close to 200 kilometers without the presence of
154a and about 100 kilometers when 154a is present.  The large increase
at site 420a is primarily due to one sample with a concentration of 69
mg/l and a collection efficiency of only 2.9%.  Finally, as with pH, the
effects of site 047a (Jasper, New York) change with the different maps
and the reasons are the same as before.
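
      The averages quoted for the two Maine sites can be checked with a
few lines of Python.  The values are those given in the text; the variable
names are ours and the comparison of the two changes is only illustrative.

```python
# Annual sulfate concentrations (mg/l) quoted in the text for the two sites.
strict = {"163a": 1.32, "436a": 1.55}    # UDDC valid sample criteria
relaxed = {"163a": 1.84, "436a": 1.70}   # relaxed valid sample criteria

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

pair_strict = mean(list(strict.values()))     # 1.435, reported as 1.44
pair_relaxed = mean(list(relaxed.values()))   # 1.77

# Change when both sites are present (RVSC/RDCR minus UVSC/RDCR).
change_both = pair_relaxed - pair_strict
# Change quoted for the map where only 163a is present under the UDDC
# criteria (RVSC/UDCR minus UVSC/UDCR).
change_163a_only = 1.77 - strict["163a"]
```

The larger change in the second case is consistent with the larger area
affected on the RVSC/UDCR minus UVSC/UDCR map.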

      The effects of changing the subsets for sulfate deposition, shown
in Figure 5.10, are also greatest in southern Ontario.  As seen in this
figure, the additional sites are again the primary cause of these
differences.  Unlike the sulfate concentration, not all the effects in
southern Ontario are an increase in the estimates.  While the sulfate
concentrations (see Figure 5.9) at some sites are high compared to the
neighboring sites, their sulfate deposition is low when compared to the
same neighboring sites.  This anomaly is a result of the very low
precipitation at these sites.  The effects of sites 154a and 420a in
southwestern Indiana are larger than those shown for the sulfate
concentration primarily because of the high precipitation at site 154a.
These sites together account for an increase of 0.5 g/m^2 in sulfate
deposition over an area with a diameter of over 200 kilometers.  As with
pH and sulfate concentration, site 047a remains a very unusual site.
Although the sulfate concentration increases to a level that is similar
to its neighboring sites when the valid sample criteria is relaxed,
this site has a low precipitation amount as compared to its neighbors.
Thus, although the sulfate deposition increases when the valid sample
criteria is relaxed, this new value is still low compared to its
neighboring sites.

      The extent and magnitude of effect that the Parsons, West Virginia
site has on the 'local' sulfate concentration and deposition estimates
are shown in Figure 5.11.  The extent of this effect is over a region
with a diameter of approximately 400 kilometers.  This region is limited
by the use of only the eight closest sites in the estimation process.
The magnitude of the effect increases to over 0.5 mg/l for the
concentration and 1.5 g/m^2 for the deposition.  The region of those
magnitudes of effect is approximately 100 kilometers.  The magnitude of
the effect on the sulfate deposition is much greater than that for the
sulfate concentration because of the high precipitation at this site.
-------
FIGURE 5.11. Extent and Magnitude of Effect the Parsons, West Virginia Site
             has on the Sulfate Concentration and Deposition.  The
             Concentration Contours are 0.5, 0.25, 0.10 and 0.001 mg/l and
             the Deposition Contours are 1.5, 1.0, 0.5, 0.25 and 0.001 g/sq m.
             [Panels: Concentration; Deposition.]
-------
                            6.0  CONCLUSIONS
      Application of UDDC valid sample criteria requires that the UDDC
rating be used to ensure representativeness throughout all  seasons.
This is a conservative position for inclusion of data for maps.  As seen
with site 047a and other sites shown in Table 2.7, when the UDDC valid
sample criteria is used, blocks of data over a period of greater than a
month may be lost.  When a site has seasonal trends, the loss of that
much data can adversely affect the annual estimates.

      Relaxed valid sample criteria may or may not lead to
representative annual summaries for the year.  Sites must be evaluated
with respect to other nearby sites or years.  Seasonal criteria may not
guarantee representativeness.  That is, using the UDDC data completeness
rating requirement does not protect you.  As seen in Table 2.7, at a
number of sites, the relaxed valid sample criteria allowed a number of
extremely unusual samples to be included in the annual estimates.
However, all these samples have very small collection efficiencies.
Possibly, the collection efficiency needs to be examined on a sample by
sample basis, instead of seasonally and annually.

      Representativeness of a site for its surrounding area is very
important with a sparse network.  Do the sulfate concentration and
deposition at the Parsons, West Virginia site represent the area within
200 kilometers of it?

      Although the relaxed data completeness rating allows the number of
sites used to increase by over 50%, except for a few sites, these
additional sites did not change the spatial patterns of wet deposition
on a regional scale.  Most of the changes due to the additional sites are
on a local scale where these differences are smaller than the scale used
in regional isopleth maps.  The few additional sites which do have a
profound effect are located in areas where there is a sparsity of sites
(e.g., northern Ontario) or are sites whose summaries are changed markedly
due to either the addition of an extremely unusual sample or the addition
of a large contiguous number of samples.  Thus, for contour maps whose
objective is to show regional patterns, the key issue is not the number
of sites but the location of the site and the validity of the samples
from the site.

      There are two sources of uncertainty in the contour maps that need
to be addressed.  The first is the within-site variation in the annual
summaries.  Because there are so few collocated sites, the within-site
variation is poorly estimated and thus any variance estimate from
kriging is also poorly estimated.  Second, the variation from year to
year in the annual summaries at a site has not been considered.  For
example, at Parsons, West Virginia the annual sulfate depositions (g/m^2)
are 4.4 (1979), 4.4 (1980), 4.6 (1981), 4.0 (1982), 3.1 (1983), 3.4
(1984), 4.8 (1985), 5.3 (1986) and 3.0 (1987).  It should be noted that
in 1985 and 1986, 19% and 9%, respectively, of the precipitation has no
chemistry results for sulfate.  This occurs primarily in months when the
sulfate concentration is low at this site (late fall).  Additionally, in
1987 the annual precipitation was only 73% of the average annual
precipitation for the previous eight years.
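
      The size of the year-to-year variation at Parsons can be summarized
directly from the values listed above.  The following Python sketch
computes the mean, the sample standard deviation across years, and the
spread between the largest and smallest annual values; the choice of these
summary statistics is ours.

```python
import statistics

# Annual sulfate deposition at Parsons, West Virginia (g/m^2), as listed
# in the text.
deposition = {1979: 4.4, 1980: 4.4, 1981: 4.6, 1982: 4.0, 1983: 3.1,
              1984: 3.4, 1985: 4.8, 1986: 5.3, 1987: 3.0}

values = list(deposition.values())
mean = statistics.mean(values)
sd = statistics.stdev(values)        # sample standard deviation across years
spread = max(values) - min(values)   # largest minus smallest annual value
```

Comparing the squared standard deviation with the within-site variance of
0.03 (g/m^2)^2 used for deposition in Section 4.0 gives a rough sense of
how the interannual variation compares with the within-site variation at
this site.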

      For network operations, the implication of this study is that
sites must give valid data for the entire year.  The loss of a large
number of contiguous samples renders the site useless for the purpose of
annual summaries.  The relationship between small sample collection
efficiency and the representativeness of the sample's chemistry needs to
be reevaluated.

      For network design, the implication of this study is that the
total number of sites is less important than the representativeness of
the site.  When considering a region the size of the eastern United
States, where the sheer magnitude of the region forces sites to be
hundreds of kilometers apart, the summaries from one site can profoundly
impact a large region.  If that site is not representative of the region
between sites, then it will bias the results.
-------
                            7.0  REFERENCES
Barrie, L. A., and J. M. Hales.  1984.  "The Spatial Distributions of
   Precipitation Acidity and Major Ion Wet Deposition in North America
   during 1980." Tellus 36B, pp. 333-355.

Bilonick, R. A.  1985.  "The Time-Space Distribution of Sulfate
   Deposition in the Northeastern United States." Atmos. Environ. 19,
   11, pp. 1829-45.

Calvert, J., J. N. Galloway, J. M. Hales, G. M. Hidy,
   A. Lazrus, J. Miller, V. Mohnen, and M. F. Uman.  1983.  Acid
   Deposition: Atmospheric Processes in Eastern North America.  National
   Academy Press, Washington, D.C.

Cowling, E. B.  1982.  "Acid Precipitation in Historical Perspective."
   Environ. Sci. Tech. 16, 2, pp. 110A-123A.

Eynon, B. P., and P. Switzer.  1983.  "The Variability of Rainfall
   Acidity." Canadian J. Statistics 11, 1, pp. 11-24.

Finkelstein, P.  1984.  "The Spatial Analysis of Acid Precipitation
   Data." J. Climate and Applied Met. 23, pp. 52-62.

Guertin, K., J.-P. Villeneuve, S. Deschenes, and G. Jaques.  1988.
   "The Choice of Working Variables in Geostatistical Estimation of Acid
   Precipitation." Atmos. Environ. 22, 12, pp. 2787-2801.

Haas, T. C., J. C. Moore, P. L. Chapman, and J. H. Gibson.  1988.  "Acid
   Deposition: The Structure Analysis and Kriging of a Process with
   First and Second Order Non-Stationarity," submitted for publication.
   Dept. of Statistics and Natural Res. Ecology Lab., Colo. State Univ.,
   Ft. Collins, Co.

Junge, C. E.  1963.  Air Chemistry and Radioactivity.  International
   Geophysics Series, Vol. 4.  Academic Press, New York, pp. 311-346.

Le, D. N., and J. Petkau.  1988.  "The Variability of Rainfall Acidity
   Revisited." Canadian J. Statistics 16, 1, pp. 151-38.

Munger, J. W., and S. J. Eisenreich.  1983.  "Continental-Scale Variations
   in Precipitation Chemistry." Environ. Sci. Tech. 17, 1, pp. 32A-42A.

NADP.  1984.  "NADP/NTN Annual Data Summary, Precipitation Chemistry in
   the United States." National Atmospheric Deposition Program
   Coordinator's Office, Natural Resource Ecology Lab, Colorado State
   Univ., Fort Collins, Co.  September 1987.

Olsen, A. R.  1988.  "1986 Wet Deposition Temporal and Spatial Patterns
   in North America." U.S. Environmental Protection Agency, Research
   Triangle Park, North Carolina.
-------
Olsen, A. R., and A. L. Slavich.  1986.  "Acid Precipitation in North
   America:  1984 Annual Data Summary from ADS Data Base."
   EPA-600/4-86-033.  USEPA, RTP, NC.

Olsen, A. R., and A. L. Slavich.  1985.  "Acid Precipitation in North
   America:  1983 Annual Data Summary from ADS Data Base."
   EPA-600/4-85-061.  USEPA, RTP, NC.

Olsen, A. R., and C. R. Watson.  1984.  "Acid Precipitation in North
   America:  1980, 1981, and 1982 Annual Data Summaries Based on ADS
   Data Base." EPA-600/7-84-097.  USEPA, RTP, NC.

Seilkop, S. K., and P. Finkelstein.  1987.  "Acid Precipitation Patterns
   and Trends in Eastern North America, 1980-84." J. Climate and Applied
   Met. 26, pp. 980-994.

Semonin, R. G.  1981.  "Seasonal Precipitation Concentrations and
   Depositions for North America from the CANSAP/NADP Network," in
   Nineteenth Progress Report to U.S. DOE, Pollutant Characterization
   and Safety Div., Contract DE-AC02-76EV01199.

Sweeney, J. K., and A. R. Olsen.  1987.  "Acid Precipitation in North
   America:  1986 Annual and Seasonal Data Summaries from Acid
   Deposition System Data Base." Report EPA/600/4-88.  EMSL ORD USEPA,
   Research Triangle Park, N.C., November 1987.

Venkatram, A.  1988.  "On the Use of Kriging in the Spatial Analysis of
   Acid Precipitation Data." Atmos. Environ. 22, 9, pp. 1963-76.

Vong, R., S. Cline, G. Reams, J. Bernert, D. Charles, J. Gibson, T.
   Haas, J. Moore, R. Husar, A. R. Olsen, J. C. Simpson, and S. Seilkop.
   1989.  "Regional Analysis of Wet Deposition for Effects Research."
   Report EPA/600/3-89/030.

Wampler, S. J., and A. R. Olsen.  1987.  "Spatial Estimation of Annual
   Wet Acid Deposition Using Supplemental Precipitation Data." Preprint:
   Tenth Conf. on Probability and Statistics (held in Edmonton, Canada).
   American Meteorological Society, Boston, Mass.
-------
                                APPENDIX A
                     COMPARISON OF 1985 pH CONTOUR MAPS
                              Anthony R. Olsen
                        Pacific Northwest Laboratory
                                June 22, 1988

     This paper gives a comparison of the 1985 pH maps that appear in the
NAPAP Interim Assessment (NAPAP 1987) and in the NADP 1985 Annual Data Summary
report (NADP 1987).
     Figure 1 reproduces the NADP report map and Figure 2 reproduces the NAPAP
report contour map, prepared by Pacific Northwest Laboratory (PNL).  The first
difference between the maps is their use of different contour levels, which
makes any comparison difficult.  Whether the maps are judged to agree depends
on who makes the judgement and on the intended purpose of the maps.  An
appropriate purpose for the maps appears to be a semi-quantitative display of
the spatial pattern of pH during 1985.  With this as the purpose, the two maps
appear to agree.
     The production of contour maps is a problem of surface estimation and
display.  Different solutions will not agree exactly in the location of contour
lines, but strong qualitative agreement should be expected.  I have taken a
closer look at the production process for the maps.  First, I reviewed the
production processes used by NADP and PNL to determine the steps used.  Second,
I requested that PNL and NADP prepare several alternative maps to investigate
possible reasons for quantitative differences in the maps.
PRODUCTION PROCESSES
     The production of a contour map includes the following:
             Calculation of an annual pH value at a site,
             Selection of sites to be used in surface estimation,
             Selection of a surface estimation technique,
             Display of the estimated surface,
             Production of a final document-quality contour map.
NADP and PNL use the same calculation procedure to obtain an annual pH value
for a site.  The pH values for sites that NADP and PNL both used in preparing
the maps agree in all cases.  Both use pH to the nearest hundredth in the
production process.  Note that pH values that appear on the maps are rounded
to the nearest tenth.  Because of the rounding, contour lines may include or
exclude sites that appear to be on the "wrong" side.
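
The rounding effect described above can be illustrated with a minimal sketch.
The two site values below are hypothetical and are not taken from either data
set; the helper names `map_label` and `side_of_contour` are likewise
illustrative only.

```python
def map_label(ph):
    """pH as printed on the map: rounded to the nearest tenth."""
    return f"{ph:.1f}"


def side_of_contour(ph, level):
    """Side of a contour on which a site falls, using the stored
    hundredths value (the precision used in the production process)."""
    return "below" if ph < level else "at or above"


# Hypothetical annual pH values stored to the nearest hundredth.
site_ph = {"site_a": 4.46, "site_b": 4.54}
contour_level = 4.5

for name, ph in site_ph.items():
    print(name, "printed as", map_label(ph),
          "but lies", side_of_contour(ph, contour_level),
          "the", contour_level, "contour")
```

Both sites are printed as 4.5, yet they fall on opposite sides of the 4.5
contour, so one of them can appear to be on the "wrong" side.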
     The NADP and PNL maps do use different sites for the surface estimation.
NADP uses only NADP/NTN network sites, while PNL uses sites from the NADP/NTN,
UAPSP, MAP3S, CAPMoN, and APIOS networks.  NADP used 123 NADP/NTN sites and PNL
used 183 sites from multiple networks.  Fewer NADP sites (20) are used by
PNL than by NADP.  This is due to using different criteria for inclusion of a
site.  The major difference is that PNL includes a quarterly criterion as well
as an annual criterion.  The underlying data completeness measures are
calculated the same way; the difference is in which measures are applied and
what cutoff criteria are used.  Based on our current knowledge of the
relationship between the criteria and the "representativeness" of the annual
summary, I do not believe that a case can be made for preferring either the
NADP or the PNL criteria.  Both are reasonable choices.
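
How adding a quarterly criterion to an annual one can change the set of
included sites may be sketched as follows.  This is an illustration under
stated assumptions, not either group's actual screening rule: the function
name, the single completeness fraction per period, and all cutoff values are
hypothetical.

```python
def site_included(annual_completeness, quarterly_completeness,
                  annual_cutoff, quarterly_cutoff=None):
    """Return True if a site passes the completeness screen.

    All completeness values are fractions between 0 and 1.  The cutoffs
    are hypothetical; the appendix does not state the values used.
    """
    if annual_completeness < annual_cutoff:
        return False
    if quarterly_cutoff is not None:
        # Quarterly screen: every quarter must meet the cutoff.
        return all(q >= quarterly_cutoff for q in quarterly_completeness)
    return True


# Hypothetical site: good annual coverage but one poorly sampled quarter.
annual = 0.80
quarters = [0.95, 0.90, 0.85, 0.50]

print(site_included(annual, quarters, annual_cutoff=0.75))
print(site_included(annual, quarters, annual_cutoff=0.75,
                    quarterly_cutoff=0.60))
```

With these made-up numbers the site passes an annual-only screen but fails
once a quarterly screen is added, which is one way two groups computing the
same completeness measures can end up with different site lists.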
     NADP and PNL use entirely different surface estimation algorithms.  The
initial step in both algorithms is to estimate the surface on a regular grid.
The regular grid is then used to determine the location of the contour lines.
NADP uses a grid on a transverse Mercator projection at an unknown but
reasonable grid density.  PNL uses an 80 km grid on a Lambert conic projection
that is near-distance preserving.  NADP uses the SURFACE II graphics system
(Sampson 1984).  The gridding routine uses a constrained distance-squared
weighting function applied to the eight nearest site locations (a maximum
search radius is imposed).  PNL uses the kriging algorithms in BLUEPACK
software (Delfiner 1979).  Kriging also is distance-weighted but the distance
weights are derived from the variation observed in the site data.  The pH
weights are from a spherical semi-variogram.  Eight sites are used with a
restriction that each octant from the grid node contributes a site, if
available within a maximum search radius.
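
The two gridding ideas described above can be sketched in a few lines.  This
is a simplified illustration, not the SURFACE II or BLUEPACK code: the
distance-squared weighting is written as plain 1/d**2 without the
"constrained" form, the semi-variogram parameters (nugget, sill, range) are
placeholders, and the kriging system that turns the semi-variogram into
weights is not shown.  All function names are hypothetical.

```python
import math


def idw_estimate(node, sites, n_nearest=8, max_radius=None, power=2.0):
    """Inverse-distance-power estimate at a grid node.

    sites is a list of (x, y, value).  Only the n_nearest sites within
    max_radius are used.  Returns None if no site is in range.
    """
    near = []
    for x, y, v in sites:
        d = math.hypot(x - node[0], y - node[1])
        if max_radius is None or d <= max_radius:
            near.append((d, v))
    near.sort(key=lambda dv: dv[0])
    near = near[:n_nearest]
    if not near:
        return None
    if near[0][0] == 0.0:          # node coincides with a site
        return near[0][1]
    weights = [1.0 / d ** power for d, _ in near]
    return sum(w * v for w, (_, v) in zip(weights, near)) / sum(weights)


def spherical_semivariogram(h, nugget, sill, range_):
    """Spherical model: rises from the nugget to the sill at range_."""
    if h <= 0.0:
        return 0.0
    if h >= range_:
        return sill
    r = h / range_
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def octant_search(node, sites, max_radius):
    """Nearest site in each 45-degree octant around the node, if any."""
    best = {}
    for x, y, v in sites:
        dx, dy = x - node[0], y - node[1]
        d = math.hypot(dx, dy)
        if d > max_radius:
            continue
        angle = math.atan2(dy, dx) % (2 * math.pi)
        octant = min(int(angle / (math.pi / 4)), 7)
        if octant not in best or d < best[octant][0]:
            best[octant] = (d, x, y, v)
    return [best[k][1:] for k in sorted(best)]
```

The nearest-eight rule can draw all of its sites from one side of a node when
sites are clustered, whereas the octant rule spreads the selected sites
around the node when data are available in each direction.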
-------
     Smooth contour lines are interpolated from the regular grid.  NADP uses
the "CONT" function in SURFACE II, which uses a piecewise Bessel interpolation
within a grid cell.  PNL uses the DISSPLA graphics contouring function with a
cubic spline interpolation, termed "spline under tension."  I review the
computer-generated maps for consistency with the monitoring site data.
This review typically includes removing contours in the West, deleting
extensions of contours over the ocean, and subjectively smoothing contours,
especially in data-sparse regions, to remove features not supported by data.
ALTERNATIVE MAP COMPARISON
     Since NADP and PNL used different sets of sites and different surface
estimation and contouring algorithms, four alternative maps were produced, one
for each of the four possible combinations.  As a point of departure, Figures 1
and 2 are the original maps as they appeared in the NADP 1985 Annual Data
Summary and the NAPAP Interim Assessment, respectively.  The sites included in
the NADP data set (123 sites) and the PNL data set (183 sites) are shown in
Figures 3 and 4, respectively.
     For the comparison, PNL produced two maps (Figures 5 and 6) using our
standard surface estimation and contouring procedure.  Contour levels were
changed to be the same as in the NADP original map.  The same kriging
semi-variogram is used for both maps (the same as used in the NAPAP map).  The
maps provide an assessment of differences that arise from using different
subsets of sites.  My assessment is that the maps agree very well east of the
Mississippi River within the United States.  The 4.7 and 4.9 contours have
bends in the south that are not supported by site data.  These would be
subjectively smoothed to remove "artificial" features.  The 5.1 contour extends
into the West where contouring is questionable.  The 4.5 contour differs in
Canada and Maine due to the PNL data set including Canadian sites.  The 4.3
contour differs in northern New York.  Other differences in the contours are
small.  In Figure 7, an expanded view of eastern North America for the original
PNL NAPAP map is given for comparison.
-------
     NADP used the NADP contouring algorithm to produce maps based on the NADP
data set sites (Figure 8) and the PNL data set sites (Figure 9).  The maps
agree very well east of the Mississippi within the United States.  The
differences between them are similar to those present between the two PNL
maps.  The areas where the density of sites differs between the two data sets,
west of the Mississippi and in Canada, show the greatest differences.
     The NADP map (Figure 8) and the PNL map (Figure 5) using the 123 sites
in the NADP data set are remarkably similar.  The north, northeast, and
southwest portions of the NADP 4.3 contour extend farther than the PNL contour.
The southern portion of the 4.7 NADP contour extends into Texas while the PNL
contour does not.  Larger differences occur for the 5.1 contour in the West,
reflecting sparse data support in this region.  The NADP map (Figure 9) and the
PNL map (Figure 6) using the 183 sites in the PNL data set show differences
remarkably similar to those in the previous comparison.
SUMMARY
     My assessment of the comparison of the different data sets and the
different "contouring" algorithms used by NADP and PNL is that they produce
remarkably similar maps.  Agreement is best where the density of sites is
greatest and poorest where the density is lowest.  Inclusion of Canadian sites
does aid in completing contours in the northeast.  This is to be expected.
When the algorithms are compared on the same data set, the differences are no
greater than the differences observed when using the same algorithm with
different data sets.  The computer-drawn NADP maps appear to be smoother than
the computer-drawn PNL maps.  This is related to the density of the grids used
and the selection of a smoothing parameter.
-------
REFERENCES

Delfiner, P., J.P. Delhomme, J.P. Chiles, D. Renard, and F. Irigoin.  1979.
  BLUEPACK 3-D.  Centre De Geostatistique et De Morphologies, Fontainebleau,
  France.

National Acid Deposition Program. 1987.  NADP/NTN Annual Data Summary.
  Precipitation Chemistry in the United States.  1985.  Natural Resource
  Ecology Laboratory, Colorado State University, Fort Collins, CO.

National Acid Precipitation Assessment Program.  1987.  Interim Assessment:
  The Causes and Effects of Acidic Deposition.  Vol. III  Atmospheric Process,
  U.S. Government Printing Office, Washington, DC.

Sampson, Robert J.  1984.  SURFACE II GRAPHICS SYSTEM.  Kansas Geological
  Survey, Lawrence, Kansas.
-------
-------
       FIGURE 1.  NADP Original 1985 Annual Data Summary pH Map
-------
1985 Annual Precipitation-Weighted pH
            FIGURE 2.   PNL Original NAPAP Interim Assessment
                       Document 1985 pH Map
-------
FIGURE 3.  Sites Used in NADP pH Data  Set
-------
1985 Annual pH (PNL June 1988)
                 FIGURE 4.  Sites Used  in  PNL pH  Data Set
-------
FIGURE 5.  PNL Kriging Map Using NADP Data Set Sites
-------
FIGURE 6.  PNL Kriging Map Using PNL Data Set Sites
-------
FIGURE 7.  PNL Computer Drawn Map for Original NAPAP
           Interim Assessment pH Map
-------
FIGURE 8.  NADP Map Using NADP Data Set Sites
-------
FIGURE 9.  NADP Map Using PNL Data Set Sites