EPA/600/J-94/167
            Hexagon Mosaic  Maps for Display  of Univariate
                         and Bivariate  Geographical  Data

                  Daniel B.  Carr, Anthony R. Olsen, and Denis White
         ABSTRACT. This paper presents concepts that motivate the use of hexagon mosaic maps and hexagon-based
         ray-glyph maps. The phrase "hexagon mosaic map" refers to maps that use hexagons to tessellate major areas
         of a map, such as land masses. Hexagon mosaic maps are similar to color-contour (isarithm) maps and show
         broad regional patterns. The ray glyph, an oriented line segment with a dot at the base, provides a convenient
         symbol for representing information within a hexagon cell. Ray angle encodes the local estimate for the hexagon.
         A simple extension  adds upper- and lower-confidence bounds as a shaded arc bounded by two rays. Another
         extension, the bivariate ray glyph, provides a continuous representation for showing the local correlation of two
         variables. The theme of integrating statistical analysis and cartographic methods appears throughout this paper.
         Example maps show statistical summaries of acidic deposition data for the eastern United States. These maps
         provide useful templates for a  wide range of statistical summarization and exploration tasks. Correspondingly,
         the concepts in this paper address the incorporation of statistical information, visual appeal,  representational
         accuracy, and map interpretation.

         KEYWORDS: hexagon mosaic maps, ray-glyph maps, comparison plots, bivariate maps, brushing.
                   Introduction

         Maps based on hexagon  tessellations are seldom
         used, but offer numerous opportunities for rep-
         resenting statistical summaries. Two fundamen-
tal maps, the hexagon mosaic map and hexagon-based ray-
glyph map, provide a foundation for  many  task-specific
variations.  For example, the hexagon mosaic map  shows
broad regional patterns for a single variable. Variations of
this map bring out distribution features of the variable rep-
resented. Important application variants include compari-
son of two variables through the direct display of differences
and through map overlays. The ray-glyph map provides a
foundation for adding more information to a map. One
map variation shows local estimates and their confidence
intervals. A second ray-glyph variation  shows the local as-
sociation of two or more variables. This variation provides
an alternative to two juxtaposed maps (Monmonier 1979)
and color-coded bivariate maps (Olson 1981; Eyton 1984).
A third variation highlights values in specific regions based
on dynamic graphic subset-selection techniques or com-
puted criteria. This paper discusses the relative merits of
these proposed maps and commonly used alternatives.
  An acidic deposition study of the United States provides
Daniel B. Carr is an associate professor in the Center for
Computational Statistics, George Mason University, Fairfax, VA 22030.
Anthony R. Olsen is an ecological statistics program leader at
the  U.S. EPA Environmental Research Laboratory and Denis
White  is a research geographer with ManTech Environmental
Technology Inc. at the U.S. EPA Environmental Research Lab-
oratory, Corvallis, OR 97333.
examples that illustrate the map variations. The study in-
tegrates measurements from  several monitoring networks
and addresses the substantial variation in data quality. The
variables in the study include measurements on 19 chemical
ion species. The current examples use quarterly summaries
for the years 1982 to 1987 and focus on sulfate and nitrate
deposition. Simpson and Olsen (1990a, 1990b) describe
these data in detail.

      A Context for Hexagon Mosaic  Maps
A tessellation is an aggregate of cells that  covers space
without overlapping. Only three regular polygons tessel-
late the plane: equilateral triangles, squares, and hexagons.
This paper focuses on hexagon tessellations. The phrase
"hexagon mosaic map" refers to a map that uses hexagons
to tessellate major areas, such as land masses.
  We do not know who created the first hexagon mosaic
map. Since hexagon tessellations occur in nature (e.g., in
bee honeycombs), the extension to game boards, image
processing, and maps seems straightforward. We conjec-
ture that the first use occurred in the context of games.  In
image processing, Pfaltz and Rosenfeld (1967) state, "Other
digitized image configurations  are possible,  for example,
using a hexagonal rather than a rectangular grid, which  in
fact seems preferable for some applications." The image-
processing history surely goes further back. While the im-
age-processing literature has  devoted significant attention
to hexagon tessellations (Serra 1982), technological conve-
nience for raster devices and computational convenience
have fostered the use of rectangular pixels. Most of today's
image-processing literature addresses square  pixels. Thus,
students new to image processing have ample opportunity
           Cartography and Geographic Information Systems, Vol. 19, No. 4, 1992, pp. 228-236, 271

to rediscover hexagon tessellations. This situation likely is
similar in other fields. The purpose here is not to claim
priority for hexagon mosaic maps, which may have been
discovered independently by many different people over
recent decades, but rather to further explore the use of this
structure in cartography.
  Hexagons have at least two advantages over squares: vis-
ual appeal and representational accuracy. Carr et al. (1987)
and Carr (1991) discuss the visual appeal. The construction
of maps based on either hexagon or square tessellations
creates visual lines. These visual  lines are artifacts of  the
construction process and compete  with data-generated pat-
terns. The basic claim is that  humans, with  their sense of
gravitational balance, have a strong response to horizontal
and vertical lines. Thus,  the  horizontal and vertical lines
generated by square tessellations  (in standard orientation)
are particularly distracting and should be avoided.
   The strong visual response to horizontal and vertical lines
at different scales might be questioned. At a fine-grain level,
such as using fill patterns based  on parallel line screens,
the visual lines should be oriented at 45 degrees from hor-
izontal to facilitate interpretation as value. Castner and Ro-
binson (1969) note that anyone can observe this phenomenon
by examining a half-tone  print in a newspaper from differ-
ent orientations. Might not  a strong response occur at  a
coarser level as well?
   Consider an example that addresses symbol congestion
in a scatterplot. Figure 1a shows a hexagon tessellation of
a scatterplot. The size of a hexagon symbol, as shown in
Figure 1b, represents the relative counts of observations
falling in grid cells or "bins." The largest symbol fills the
highest count cell. Figure 1b is a form of a bivariate density
plot designed to handle large sample sizes (Carr et al. 1987;
Scott 1992). Figure 1c is similar, but it uses a square
tessellation with cells having the same area as those in the
hexagon  tessellation. Comparison of binned plots to the
original scatterplot shows that the lattice of bin centers dis-
tracts from the pattern of the data.  Comparison of the binned
plots against each other  suggests that the nonorthogonal
visual lines of the hexagon tessellation are less distracting.
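The binning step behind Figure 1b can be sketched in a few lines. The sketch below is not the authors' code. It assumes that hexagon centers form a triangular lattice, treated as two interleaved rectangular lattices, and that each point goes to the nearest center. The function names, the cell keys, and the choice to make symbol area (rather than symbol width) proportional to the count are all illustrative assumptions.

```python
import math
from collections import Counter


def hexagon_bin_counts(points, spacing):
    """Count points per hexagon cell.

    Cells are keyed as (lattice, i, j). The hexagon centers form a triangular
    lattice, handled here as two interleaved rectangular lattices (0 and 1).
    """
    dx = spacing                    # horizontal distance between centers
    dy = spacing * math.sqrt(3.0)   # row distance within one rectangular lattice
    counts = Counter()
    for x, y in points:
        # Nearest center on lattice 0: (i * dx, j * dy).
        i0, j0 = round(x / dx), round(y / dy)
        d0 = (x - i0 * dx) ** 2 + (y - j0 * dy) ** 2
        # Nearest center on lattice 1: ((i + 0.5) * dx, (j + 0.5) * dy).
        i1, j1 = round(x / dx - 0.5), round(y / dy - 0.5)
        d1 = (x - (i1 + 0.5) * dx) ** 2 + (y - (j1 + 0.5) * dy) ** 2
        # The closer of the two candidates is the hexagon containing the point.
        key = (0, i0, j0) if d0 <= d1 else (1, i1, j1)
        counts[key] += 1
    return counts


def symbol_scale(count, max_count):
    """Linear scale factor in (0, 1], ASSUMING symbol area is proportional to count."""
    return math.sqrt(count / max_count)


# Hypothetical points, for illustration only.
example_points = [(0.1, 0.0), (-0.1, 0.05), (0.45, 0.9)]
example_counts = hexagon_bin_counts(example_points, spacing=1.0)
```

With this scaling the cell holding the largest count gets a scale factor of 1, so its symbol fills the tessellation cell, as in Figure 1b.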
   The shape (edges) of the symbols contributes to the
distraction in the binned plots. Plotting round symbols in the
square cells helps somewhat (e.g., the sunflower symbols
in Chambers et al. [1983]), but the visual equivalent of putting
round pegs in square holes wastes space and calls
 attention to the lattice lines. Another improvement shifts the
 symbol for each cell toward the center of mass of points  in
 the cell (Carr 1991). This reduces the visual emphasis on the
 lattice lines. The best one can do is ameliorate the visual
 artifacts;  the hexagon tessellation provides a good starting
 place.
   Representational accuracy also favors using hexagon cells
 over square cells. For bivariate densities with the necessary
 derivatives, Scott (1985)  has shown  that hexagon-based
 density estimates have a somewhat smaller integrated mean
 square error than square-based estimates. The situation  of
 approximating a bivariate function using a  cell-based  step
 function is similar. When the bivariate function is reason-
 ably smooth, the range of function values over a fixed area
 cell generally will be smaller if  the cell is  as  compact  as
Figure 1. (a, top) Bivariate points falling in hexagon tessellation
cells of a scatterplot. The bivariate points are sulfate deposition
trends for 1982-87 (x-axis) and nitrate deposition trends for 1982-
87 (y-axis) for sites in the eastern region of the United States.
Binning consists of counting the number  of points in each cell.
(b, middle)  Hexagon-bin bivariate  density plot. The  size of the
hexagon symbol represents the number of points in the tessellation
cell. The symbol is scaled so  that the  symbol representing the
largest number of points exactly fills the tessellation cell. (c,
bottom) Square-bin bivariate density plot. The square tessellation
cells of the scatterplot are not shown. As in Figure 1b, the size
of the square symbol represents the number of points in a square
tessellation cell. The horizontal and vertical lines compete for at-
tention with the trend in the data.
 possible. (One possible measure of a cell's compactness is
 the dimensionless second moment about the cell center, as
 defined by Conway and Sloane [1982]. This measure yields
 0.0833, 0.0802, and 0.0796 for squares, hexagons, and cir-
 cles, respectively.) Restricting the range of function values
 over the cell constrains the integrated mean square error.
 Thus,  hexagon partitions generally yield better approxi-
 mations than square partitions.
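The three compactness values quoted above can be checked numerically. The sketch below assumes the usual two-dimensional form of the dimensionless second moment, G = J / (2 A^2), where A is the cell area and J is the integral of squared distance from the cell center over the cell. The function names are illustrative, and the circle is approximated by a regular polygon with many sides.

```python
import math


def regular_polygon(n_sides, radius=1.0):
    """Vertices of a regular polygon centered at the origin, counterclockwise."""
    return [
        (radius * math.cos(2.0 * math.pi * k / n_sides),
         radius * math.sin(2.0 * math.pi * k / n_sides))
        for k in range(n_sides)
    ]


def dimensionless_second_moment(vertices):
    """G = J / (2 * A**2) for a polygon whose centroid is at the origin."""
    area = 0.0
    polar_moment = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        cross = x0 * y1 - x1 * y0
        # Each edge and the origin form a triangle; sum its area and its
        # integral of (x**2 + y**2).
        area += cross / 2.0
        polar_moment += cross * (
            x0 * x0 + x0 * x1 + x1 * x1 + y0 * y0 + y0 * y1 + y1 * y1
        ) / 12.0
    return polar_moment / (2.0 * area * area)


square = dimensionless_second_moment(regular_polygon(4))
hexagon = dimensionless_second_moment(regular_polygon(6))
near_circle = dimensionless_second_moment(regular_polygon(2000))
```

The closed forms are 1/12 for the square, 5/(36 sqrt(3)) for the hexagon, and 1/(4 pi) for the circle, so the hexagon sits much closer to the circle than to the square.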
  The function-approximation arguments in favor of hexagons
 are not as strong as the visual arguments. One
 can construct  functions for which hexagon-cell step-func-
 tion approximations are not better. When the hexagon-based
 approximations are better, Scott's (1985) result suggests that
 the improvement is usually slight. Nonetheless, consider-
 ations  of representational accuracy add further support to
 the use of hexagon  mosaic maps.
  One measure of success of a particular map variation is
 its acceptance for routine use by government  agencies. A
 triangular grid and accompanying hexagonal tessellation
 have been proposed for use in the U.S. Environmental
 Protection Agency's (EPA) Environmental Monitoring and
 Assessment Program (EMAP) (Messer et al. 1991), which
 emphasizes probability sampling using a regular geometric
 arrangement of samples to achieve  spatial coverage. The
 use of hexagonal  regions is  natural  both for spatial sam-
 pling and data display.
  EMAP's design objectives translate into a set of geometric
 and cartographic properties for  a sampling grid  (White et
 al. 1992):


  1. Equitable spatial coverage  of all environmental re-
     sources of interest
  2. Random positioning to yield a probability sample
  3. Equal-area sampling  units  to enhance  precision  of
     estimates
  4. Compact arrangement of sampling units
  5. Minimal correlation with any regularly spaced cultural
     features
  6. A hierarchical structure to facilitate increasing and de-
     creasing grid density
  7. A realization of the grid on a single planar surface for
     the entire domain of application


  The grid designed to satisfy these properties is based on
a triangular array  of points with a corresponding dual tes-
sellation of regular  hexagons established on the plane  of
the Lambert azimuthal equal-area map projection. For ap-
plication in EMAP, this sampling grid has been placed on
a hexagonal face of a truncated icosahedron fit to the globe
(Figure 10 in White et al. [1992]). The base density grid for
EMAP  consists of points placed about 27 km apart and
tessellation hexagons about 635 km² in area. This density
represents a compromise between desired resolution and
cost of sampling.  The tessellation of these hexagons over
the conterminous  United States is shown in Figure 2.
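The quoted point spacing and hexagon area are linked by simple geometry. For a triangular array of points with spacing d, each dual hexagon has area (sqrt(3)/2) d^2. The short sketch below checks that "about 27 km" and "about 635 km²" agree; the function names are illustrative.

```python
import math


def hexagon_area(spacing_km):
    """Area of the dual hexagon for a triangular point grid with the given spacing."""
    return math.sqrt(3.0) / 2.0 * spacing_km ** 2


def spacing_for_area(area_km2):
    """Point spacing that yields dual hexagons of the given area."""
    return math.sqrt(2.0 * area_km2 / math.sqrt(3.0))


area_at_27_km = hexagon_area(27.0)          # roughly 631 km^2
spacing_at_635 = spacing_for_area(635.0)    # roughly 27.1 km
```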
  For sampling purposes, the tessellation hexagons may be
considered as  strata within which point or area samples
may be taken. The equal-area property assures that all points
or arbitrary areas  within the domain of the grid have an
equal probability of being selected for sampling. For analy-
 sis and display purposes, the tessellation hexagons provide
 a set of equal-area units that minimizes analytical and vis-
 ual bias inherent in the use of arbitrary spatial units bounded
 by political or other features.


   The Hexagon Mosaic Map and  Variations
 An important task for understanding sulfate deposition over
 the United States is to show broad regional patterns. In this
 case, the broad regional patterns must be constructed from
 irregularly sampled point data. The standard approach for
 this  task uses a spatial estimation algorithm to obtain es-
 timates  on a regular grid. The graphical algorithms trans-
 form the regular grid  estimates into maps. Our approach
 used kriging to produce estimates for a  hexagonal  grid.
 (Cressie [1991] discusses spatial  smoothing methods, in-
 cluding several variants of kriging.) We then transform the
 estimates  into a hexagon grid-cell choropleth map or hex-
 agon mosaic map, as shown in  Figure 3 (see page  271).
 This map distinguishes different sulfate deposition levels
 using color.  The colors are ordered in terms of value and
 are light enough to allow the state boundaries generally to
 be visible.
  The use of an equal-area regular grid on  a map has inter-
 pretational advantages. In Figure 3, each hexagon covers
 approximately 2,670 square km. (Figure 3  has been repro-
 duced from early studies of acidic deposition and is not an
 equal-area map. The area distortion in this Lambert con-
 formal conic projection is modest, so the map has not been
 revised.) The total area affected by  sulfate  deposition, say
 in the range of 20 to 30 kg per hectare, can be determined
 simply by counting hexagons. We use this fact and deter-
 mine the class intervals for the United States from the per-
 centage of hexagons involved.
  The lowest deposition class in the eastern United States
 covers 10% of the area and has an upper bound of 11.4 kg/
 ha. The next class covers 15% of the area and  its upper
bound is 14.8 kg/ha. Thus, 25% of the hexagons have val-
ues at or below 14.8 kg/ha. The cumulative percents  used
 to obtain boundaries are 10, 25, 50, 75, 90, and 95. The
highest deposition class covers the last 5% of the area and
is shaded  with the darkest color. For this application, we
prefer defining classes by the percent area  involved. If sul-
fate values determine directly the class intervals, showing
the percent of hexagons involved still provides an imme-
diate area-based interpretation.
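Because every hexagon covers the same area, class boundaries defined by percent area reduce to quantiles of the hexagon values. The sketch below is one way to compute them, not the authors' procedure. The quantile rule (the value of the hexagon at rank ceil(p n / 100)) and the function names are assumptions, and the input values are placeholders rather than deposition data.

```python
import math

CUMULATIVE_PERCENTS = (10, 25, 50, 75, 90, 95)  # from the text
HEXAGON_AREA_KM2 = 2670.0                       # approximate cell area in Figure 3


def area_based_breaks(values, percents=CUMULATIVE_PERCENTS):
    """Upper class bounds such that about p percent of hexagons fall at or below each."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for p in percents:
        rank = max(1, math.ceil(p * n / 100))
        breaks.append(ordered[rank - 1])
    return breaks


def classify(value, breaks):
    """Class index: 0 for the lowest class, len(breaks) for the highest."""
    for index, upper_bound in enumerate(breaks):
        if value <= upper_bound:
            return index
    return len(breaks)


def class_area_km2(hexagon_count):
    """Area covered by a class, obtained by counting hexagons."""
    return hexagon_count * HEXAGON_AREA_KM2


# Placeholder values (not real data): 100 hexagons with values 1..100.
placeholder_values = [float(v) for v in range(1, 101)]
placeholder_breaks = area_based_breaks(placeholder_values)
```

With the six cumulative percents above, the seven classes hold 10, 15, 25, 25, 15, 5, and 5 percent of the hexagons.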
  Figure 3 itself is a variation on a basic mosaic map,  since
the map is presented in two parts. In this application, the
big difference in values between  the western and eastern
regions strongly motivates the split. Notice that at least 90%
of the values in the western region fall in  the lowest cate-
gory of the eastern region. Thus, splitting the map into two
regions provides better presentation of the distribution of
deposition occurring in the United States.
  Alternative  methods can show the information rep-
resented in Figure 3, with the leading alternatives being
contour (isarithm)  maps and  perspective  views. Colored
contour maps  provide the main alternative. Colored con-
tour maps are similar to mosaic maps in several ways.  Both
normally involve an estimation step that produces values
on a regular grid. While contouring algorithms typically
Figure 2. EMAP 635 km² nonrandomized sampling grid tessellation hexagons. The grid provides a basis for probabilistic environmental
sampling.
assume a rectangular grid, it is quite possible to base both
contour maps and hexagon mosaic maps on the same tri-
angular grid of estimates. The two maps are fundamentally
the same at this level of construction. Both methods rep-
resent  areas and hide details concerning the amount and
placement of the underlying point data.  The  differences
between the two similar methods appear at the interpre-
tation stage.
  The hexagon mosaic map can have interpretation advan-
tages over a contour map, because the regular  tessellation
suggests the use of an estimation process and facilitates
thought about confidence intervals. The hexagon edges at
class boundaries imply the estimation lattice that has been
used. In contrast, smooth contour lines give little clue to
this underlying estimation step. The value for  each hexa-
gon typically is presumed to be an estimate that represents
the whole hexagon region. The value for the hexagon does
not have to match the value at any particular sampling site
within the  hexagon.  In  contrast, many  people interpret
contour lines  as precise, and knowledgeable local experts
argue about the placement of a contour line relative to ob-
served values at sampling sites.
  Interpretation differences also appear in terms of confi-
dence intervals. Analysts can easily  think of the estimated
value for a hexagon  as having upper and lower  bounds
corresponding to  the upper- and lower-confidence  sur-
faces. The units for confidence bounds are data units. Prag-
matically, a contour  line  is a sequence of line segments
whose endpoints are  two-dimensional  (bivariate) spatial
coordinates. The notion of confidence intervals for contour
lines is not well defined. Some understanding  about the
accuracy of a contour line can be obtained by drawing the
corresponding contour lines from the upper- and lower-
confidence surfaces. However, one-to-one correspondence
between the points on  the different lines is not guaranteed,
so probabilistic considerations for contour lines do not lead
to the one-dimensional simplicity of traditional confidence
intervals. Consequently, the hexagon mosaic map, al-
though less aesthetically pleasing because  of its jagged
boundaries,  has an interpretational advantage.
  Analysts often use perspective views to obtain a Gestalt
impression of a surface. For rough surfaces, several per-
spective views may be  required because part  of the surface
can be hidden. (See MacEachren and DiBiase  [1991] for
representations  covering variation in continuity and ab-
ruptness.) Perspective  views typically are poor for identi-
fying the geographic location of peaks and valleys.
Consequently, contour plots and perspective views  often
are both shown (Grotch 1983; Tufte 1991) and are best regarded
as complementary, rather than competing, views.
  In general, hexagons help convey the spatial structure of
information. The idea  of neighboring regions is  clear. The
notion of averaging values from neighboring regions to ob-
tain a smoother surface estimate is straightforward and
provides  a  reasonable introduction to more complex
smoothing approaches such as kriging.
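The neighbor-averaging idea can be made concrete with a small sketch. It assumes hexagons are indexed by axial coordinates (q, r), a common convention that the paper does not prescribe, and it replaces each value by the unweighted mean of the cell and whichever of its six neighbors have data. The names are illustrative.

```python
# Offsets to the six neighbors of a hexagon in axial (q, r) coordinates.
NEIGHBOR_OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1))


def smooth_by_neighbors(values):
    """Average each hexagon with its available neighbors.

    `values` maps (q, r) to a number; cells without data are simply absent.
    """
    smoothed = {}
    for (q, r), value in values.items():
        group = [value]
        for dq, dr in NEIGHBOR_OFFSETS:
            neighbor = (q + dq, r + dr)
            if neighbor in values:
                group.append(values[neighbor])
        smoothed[(q, r)] = sum(group) / len(group)
    return smoothed


# Placeholder surface: a single spike surrounded by zeros.
spike = {(0, 0): 7.0}
spike.update({offset: 0.0 for offset in NEIGHBOR_OFFSETS})
smoothed_spike = smooth_by_neighbors(spike)
```

Every cell has exactly six neighbors at the same distance, which is what makes the equal-weight average a natural first smoother on a hexagon grid.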
                 Comparison Plots
A recurrent graphical task is the comparison of two surfaces
on a map (Lloyd and Steinke 1976, 1977; Steinke and Lloyd
1983; Monmonier 1990a). For example, researchers might
want to compare sulfur dioxide (SO2) emission data with
sulfate (SO4) deposition data in the attempt to better
understand the deposition process. However, comparing surfaces
is nontrivial. The complexity of surfaces and short-
term visual memory limitations complicate the comparison
of juxtaposed surface representations. Superimposed plots
reduce the demand on visual memory, but restrict the form
of the plot. Fishnet (perspective view) plots and translucent
surfaces  can be superimposed.  However,  this  leads to a
further difficulty.
  Consider  the simpler case of comparing two superim-
posed curves.  Cleveland and McGill (1984) show that hu-
mans are very poor at assessing the difference between two
superimposed  curves. Our visual system tells us the dis-
tance between closest points on the two curves, regardless
of direction, rather  than computing the vertical distance
between corresponding points. There is no reason to think
we will do any better at comparing superimposed surface
representations. Consequently, an advantageous approach
computes and represents differences  (and  relative differ-
ences) directly. Such representations are straightforward
for hexagon mosaic maps, since values  are available for
each hexagon.
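When both surfaces are stored as one value per hexagon, the direct comparison is a cell-by-cell subtraction. The sketch below assumes each surface is a mapping from a hexagon identifier to a value and defines the relative difference with respect to the second surface; that definition, like the names and numbers, is an illustrative assumption.

```python
def hexagon_differences(surface_a, surface_b):
    """Difference and relative difference for hexagons present in both surfaces.

    Returns {cell: (a - b, (a - b) / b)}; the relative difference is None
    where b is zero.
    """
    result = {}
    for cell in surface_a.keys() & surface_b.keys():
        a, b = surface_a[cell], surface_b[cell]
        difference = a - b
        relative = difference / b if b != 0 else None
        result[cell] = (difference, relative)
    return result


# Placeholder values keyed by hypothetical hexagon identifiers.
first = {"h1": 30.0, "h2": 12.0, "h3": 8.0}
second = {"h1": 20.0, "h2": 12.0, "h4": 5.0}
differences = hexagon_differences(first, second)
```

The resulting values can then be classed and colored exactly like a single-variable hexagon mosaic map.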
  A more limited way of making comparisons is to overlay
a pair of two-class distributions, one  from each surface,
creating a bivariate correlation  map. Olson (1981), Carsten-
sen (1982), Lavin and Archer (1984), and  Eyton  (1984) dis-
cuss this type of map. Figure 4a shows the two classes from
the sulfate-deposition plot. The class-interval break at 20
kg/ha was chosen because the Canadian  government is
concerned about values above 20 kg/ha.
  The next issue to consider is which class-interval break
to use from the smoothed emission data. (Smoothing fa-
cilitates area-based comparison because the emissions come
from point sources, such as coal-fired plants.) Several plau-
sible answers can be given. Figure 4b shows a class-interval
break that was selected to produce the same areas as the
classes in Figure 4a. Thus, attention can be focused on the
location of the class differences represented in the overlay
in Figure 4c. The visual impression is that the SO2 rises and
mixes with cleaner air as it moves to  the northeast and is
eventually deposited as SO4. The movement toward the
northeast is only a general visual impression and does not
necessarily match meteorological data. The visual impres-
sion of a much smoother  deposition surface is  very strong
and consistent with our understanding. Use of two-class
overlays provides much  less information than the  direct
plot of differences, but this procedure  can help  by focusing
attention on specific aspects of two surfaces.
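The equal-area choice of the second class break, and the overlay itself, can be sketched as follows. The code assumes both surfaces are given per hexagon on the same grid and picks the second break so that the same number of hexagons falls above it as falls above the reference break (ties aside). The function names and the numbers are illustrative placeholders, not study data.

```python
def equal_area_break(reference, reference_break, other):
    """Break for `other` leaving as many hexagons above it as `reference` has
    above `reference_break` (assuming no ties at the break)."""
    n_above = sum(1 for value in reference.values() if value > reference_break)
    ranked = sorted(other.values(), reverse=True)
    if n_above >= len(ranked):
        return float("-inf")  # every hexagon is in the upper class
    return ranked[n_above]


def two_class_overlay(first, first_break, second, second_break):
    """Map each shared hexagon to (first above break?, second above break?)."""
    return {
        cell: (first[cell] > first_break, second[cell] > second_break)
        for cell in first.keys() & second.keys()
    }


# Placeholder per-hexagon values.
deposition = {"a": 25.0, "b": 22.0, "c": 10.0, "d": 5.0}
emissions = {"a": 3.0, "b": 50.0, "c": 40.0, "d": 1.0}
emission_break = equal_area_break(deposition, 20.0, emissions)
overlay = two_class_overlay(deposition, 20.0, emissions, emission_break)
```

The four possible pairs of class memberships correspond to the four regions of an overlay such as Figure 4c.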

   Showing Trends and Confidence Intervals
             Using  Ray-Glyph Maps
One purpose of the acidic deposition study was to learn
about trends. For the trend study, seasonally (three-month)
aggregated data for the period 1982 to 1987 are available
for a number of monitoring sites. The aggregated data could
be used to yield 24 seasonal maps. Comparing the sequence
of 24 maps to assess trends would be a difficult visual task,
complicated by the presence of seasonal variation. A direct
approach shows estimated trends at each monitoring site.
Figure 4a. Two-class map of sulfate deposition (kg/ha). The class
boundary of 20 kg/ha was chosen based on policy considerations.
(Legend: min = 5.)
Figure 4b. Two-class map of sulfur dioxide emissions (kg/ha).
The emissions surface has been smoothed somewhat, but the class
boundary line is still rough. The specific class boundary defines
regions with the same areas as the classes in Figure 4a.
(Legend: min = 0.)
Figure 4c. Overlay of two two-class maps for eastern North America.
Two general impressions are that the SO4 boundary is much smoother
than the SO2 boundary, and that the SO4 area lies more toward the
northeast.


  Monitoring studies  often yield data of varying quality,
so we use nonparametric trend methods to mitigate problems
presented by poor data. In particular, we used Sen's
nonparametric slope estimate and associated confidence
intervals (Gilbert 1987). For each site, slope estimation
proceeds by fixing a season and computing the slopes between
all possible pairs of years, then pooling the slopes from all
four seasons, and finally selecting the median of the pooled list
as Sen's median slope estimate. Confidence intervals for
the estimate are based on order statistics; therefore, they
are typically asymmetrical about the estimate. Computing
the nonparametric slope estimates and confidence intervals
for individual sites is straightforward.
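The slope estimate described above can be sketched directly from the text. The data layout (season, then year, then value) and the function name are assumptions, the numbers are placeholders, and the confidence-interval step is not shown.

```python
from itertools import combinations
from statistics import median


def seasonal_sen_slope(series):
    """Median of all within-season pairwise slopes for one site.

    `series` maps season -> {year: value}.
    """
    slopes = []
    for by_year in series.values():
        # Fix a season, then take every pair of years within that season.
        for (year0, value0), (year1, value1) in combinations(sorted(by_year.items()), 2):
            slopes.append((value1 - value0) / (year1 - year0))
    return median(slopes)


# Placeholder site: a steady decline of 0.5 units per year in every season,
# plus one wild value to show the robustness of the median.
site = {
    season: {year: base - 0.5 * (year - 1982) for year in range(1982, 1988)}
    for season, base in (("winter", 20.0), ("spring", 24.0),
                         ("summer", 30.0), ("fall", 22.0))
}
site["winter"][1984] = 100.0
slope = seasonal_sen_slope(site)
```

Six years in each of four seasons give 15 pairs per season and 60 slopes in all. The single outlier affects only five of them, so the median stays at the underlying trend of -0.5.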
  The  next task is to  show  the estimates and confidence
intervals. If we want to portray only the estimates and their
broad  regional patterns,  we could  use  a hexagon mosaic
map. However, incorporating local confidence intervals on
the map  challenges us to use a different technique.  The
large number of sites  poses  a  problem  of graphic conges-
tion. A solution  is  to aggregate the sites into  hexagon re-
gions and show only a  summary slope and  confidence
interval for each region with data. While this introduces
statistical summarization issues that deserve attention, Fig-
ure 5 illustrates the representation  concept, and the octa-
gons in the plot locate the centers of hexagon regions (not
drawn) that contain site data.
  Figure 5 is  a ray-glyph map. The ray is composed of a
line segment  and a region-centered dot or polygon at the
base. Ray angle  encodes the slope estimate. In this case,
the scaling was chosen so that a horizontal ray represents
a zero-degree  slope (no  change).  A ray  straight up rep-
resents an increase of 6.6 kg/ha per year; a  ray straight
down represents a decrease of 6.6 kg/ha  per year. In Figure
[Figure 5 legend. Sulfate Deposition Trends; Period = 1982 to 1987; Units = kg/ha per year; ray scale: 6.6, 3.3, 0.0, -3.3, -6.6]

Figure 5. Ray-glyph and arc map showing sulfate deposition trends
and 90%  confidence intervals. The rays and arcs represent sum-
maries for local hexagon-shaped  regions. Rays  with a negative
slope represent a decrease in deposition. The shaded arcs covering
zero suggest weak evidence concerning nonzero trends.

5, the scale upper limit for the ray is determined  by the
90% confidence intervals shown as filled gray arcs; the scale
lower limit is determined  by symmetry.  This diminishes
the resolution for the estimate. (One observation  with a
large  confidence interval was omitted for  this reason.) On
the plot,  the majority of  rays point slightly down, sug-
gesting a small decrease in deposition over the six years.
However, the majority of confidence intervals include zero,
and a few might be expected  to exclude zero at random
since  this is a multiple  comparison situation. Thus, evi-
dence for change is weak. Note also that regions without
data and regions with highly variable estimates are quite
evident.  The  ray-glyph map is effective for representing
local area summaries.
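The ray encoding in Figure 5 can be sketched as a mapping from slope to angle. The text fixes three points of the scale: horizontal for zero, straight up for +6.6 kg/ha per year, and straight down for -6.6. The linear interpolation between them, the clipping at the limits, and the function names are assumptions of this sketch.

```python
import math

MAX_SLOPE = 6.6  # kg/ha per year, drawn as a vertical ray in Figure 5


def ray_angle_degrees(slope, max_slope=MAX_SLOPE):
    """Angle above horizontal, ASSUMING a linear scale clipped to +/- max_slope."""
    clipped = max(-max_slope, min(max_slope, slope))
    return 90.0 * (clipped / max_slope)


def ray_endpoint(x, y, slope, length=1.0, max_slope=MAX_SLOPE):
    """Tip of a ray of the given length whose base dot sits at (x, y)."""
    theta = math.radians(ray_angle_degrees(slope, max_slope))
    return (x + length * math.cos(theta), y + length * math.sin(theta))


def arc_angles(lower_bound, upper_bound, max_slope=MAX_SLOPE):
    """Start and end angles of the shaded confidence arc."""
    return (ray_angle_degrees(lower_bound, max_slope),
            ray_angle_degrees(upper_bound, max_slope))


# Placeholder estimate and bounds for one hexagon region.
estimate_angle = ray_angle_degrees(-0.8)
arc = arc_angles(-3.3, 1.65)
```

An arc whose two angles have opposite signs straddles the horizontal, which is the visual cue that the interval includes a zero trend.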
  Of other representations that might be considered, framed-
rectangle symbols, as shown in Figure 6, provide a partic-
ularly informative alternative. Cleveland and McGill (1984)
developed the  framed-rectangle symbol  based on their
studies of perceptual accuracy of extracting the encoded
information. Their studies showed that judging positions
against common nonaligned scales was superior to judging
circle  areas, angles, colors,  and  other representations. Con-
sequently, they added a frame with center ticks to provide
a scale for judging bar height.  The framed-rectangle sym-
bols achieve the goal of perceptual accuracy, but other de-
sign issues also are relevant. These include addressing the
Figure 6. Two framed-rectangle plotting symbols. The frame and
tics increase the perceptual accuracy for judging bar heights. The
left bar height can be assessed by comparison with the center tics.
The right bar height can be assessed by the white between the bar
and the top of the frame.

recurrent problem of symbol congestion and deciding on
the relative emphasis between local symbols and the other
information on the map. The horizontal and vertical lines
of the symbol frame draw visual attention and  may inter-
rupt the visual flow  from symbol to symbol.  Perceptual
accuracy of extraction provides just one design  criterion.
  The  ray glyph, with an open octagon as  the base, is a
line symbol and is better suited for overplotting than an
area symbol like a bar. The ray glyph can be made to blend
with or stand out from the rest of the map through control
of line thickness, octagon size, and ray length. The  octa-
gon, with lines connecting  opposite vertices,  provides  a
visual anchor that improves the perceptual accuracy of ex-
traction of the ray angle. When the glyph must  be small, a
polygon with few sides,  such  as a diamond, can be  used
as a visual anchor. In the context of hexagon tessellations,
neighboring octagons also provide local scale against which
angles  can be judged. Positive  and  negative ray slopes can
be assessed by looking at the  corresponding octagons on
the right. The ray glyph provides even greater accuracy on
a hexagon mosaic map when ray scale is nested within
classes that are distinguished by different colors. The ray
glyph is well suited for use with hexagon tessellations and
provides the flexibility to address several design  objectives.

           Bivariate Ray-Glyph Maps
            and  Graphical Interaction
The representation of bivariate information using maps is
a challenge. The task of interpreting side-by-side univariate
maps  has not proved easy, so various researchers  have
proposed different methods. Monmonier (1979) discussed
the cartographic cross-classification table, which is an
intermediate step between juxtaposed univariate maps and
bivariate maps. Carstensen (1982) and Lavin and Archer
(1984) experimented with and developed continuous
bivariate crosshatching maps. Wainer and Francolini (1980)
and Olson (1981) have investigated the utility of bivariate
color maps. Eyton (1984) has proposed additional
techniques. Investigations suggest that bivariate color maps can
be helpful when they have few categories and the colors
are carefully selected. Bivariate ray-glyph maps (Carr 1991)
provide an alternative that represents the variables with
much greater resolution.
  Figure 7a shows a bivariate ray-glyph map. Rays pointing
to the right represent sulfate deposition trends and rays
pointing to the left represent nitrate deposition trends.
Dropping the confidence intervals and omitting scale
considerations of zero slope and symmetry increase the
resolution of sulfate deposition rays, in comparison with
corresponding rays in Figure 5. The two rays of the
bivariate ray glyphs generally point down or up together, so the
attributes are positively correlated. With a few exceptions,
the spatial change in the two trends is reasonably smooth.
[Figure 7a map panel: "Eastern North America", legend title "Bivariate Trends"; graphic not reproduced.]
Figure 7a. Highlighted, bivariate ray-glyph map of sulfate and
nitrate deposition trends.  Sulfate rays point to the right, and
nitrate rays point to the left. Rays pointing straight up correspond
to maxima, and rays pointing straight down correspond to min-
ima. The scaling is less than optimal for assessing zero trends,
but makes assessing correlations easier. The highlighted points are
determined from Figure 7b.
234
       Cartography and Geographic Information Systems

-------
The bivariate ray map seems to be an effective technique
for showing bivariate associations, provided the individual
areas to be represented are not too small.
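
The ray encoding described for Figure 7a can be sketched in a few lines. The sketch below is illustrative only: it assumes a linear mapping from each variable's minimum and maximum onto the half circle from straight down to straight up, which the text does not state explicitly, and the names `ray_endpoint` and `bivariate_glyph` are hypothetical.

```python
import math


def ray_endpoint(value, vmin, vmax, side, length=1.0):
    """Return the (dx, dy) offset of a ray tip from the glyph's base dot.

    Assumption: values map linearly so that vmin points straight down and
    vmax points straight up. side is "right" or "left".
    """
    if vmax <= vmin:
        raise ValueError("vmax must exceed vmin")
    if side not in ("right", "left"):
        raise ValueError("side must be 'right' or 'left'")
    # Fraction of the way from minimum to maximum, clamped to [0, 1].
    frac = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    # Elevation runs from -90 degrees (down) to +90 degrees (up).
    elevation = math.radians(-90.0 + 180.0 * frac)
    dx = length * math.cos(elevation)
    dy = length * math.sin(elevation)
    # Left-pointing rays are the mirror image of right-pointing rays.
    return (dx if side == "right" else -dx, dy)


def bivariate_glyph(sulfate, nitrate, sulfate_range, nitrate_range, length=1.0):
    """Build both rays of one glyph: sulfate to the right, nitrate to the left."""
    s_min, s_max = sulfate_range
    n_min, n_max = nitrate_range
    return {
        "sulfate": ray_endpoint(sulfate, s_min, s_max, "right", length),
        "nitrate": ray_endpoint(nitrate, n_min, n_max, "left", length),
    }
```

Under this assumed scaling a horizontal ray marks the midpoint of the observed range rather than a zero trend, which is consistent with the caption's remark that the scaling is less than optimal for assessing zero trends.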
  Having several attribute variables available for mapping
introduces new possibilities. In particular, subsets can be
selected from one view of the data and highlighted in an-
other. This has been done in Figures 7a and 7b. Figure 7b
shows a scatterplot of bivariate slopes. To  select a set of
relatively unusual points  for highlighting,  we computed
the bivariate density for each point and chose the four low-
est bivariate density points. This selection approach can
easily be automated. The points have also been highlighted
on the map (Figure 7a), using larger symbols with thicker
lines. Alternatively, points can be selected through direct
graphical interaction.
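
One way to automate the low-density selection is sketched below. The text does not say which density estimator was used; this sketch assumes a Gaussian product-kernel estimate with normal-reference bandwidths, and the function name `lowest_density_points` is hypothetical.

```python
import math
import statistics


def lowest_density_points(points, k=4):
    """Return (indices, densities): the k points with the lowest estimated
    bivariate density, plus the density estimate at every point.

    Assumption: Gaussian product kernel with bandwidths
    h = (sample standard deviation) * n ** (-1/6), a normal-reference
    rule for two dimensions.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    hx = statistics.stdev(p[0] for p in points) * n ** (-1.0 / 6.0)
    hy = statistics.stdev(p[1] for p in points) * n ** (-1.0 / 6.0)
    if hx == 0.0 or hy == 0.0:
        raise ValueError("each variable must vary")

    norm = n * 2.0 * math.pi * hx * hy

    def density(p):
        # Sum the kernel contributions of every data point at location p.
        total = 0.0
        for q in points:
            zx = (p[0] - q[0]) / hx
            zy = (p[1] - q[1]) / hy
            total += math.exp(-0.5 * (zx * zx + zy * zy))
        return total / norm

    densities = [density(p) for p in points]
    order = sorted(range(n), key=lambda i: densities[i])
    return order[:k], densities
```

The returned indices identify the records to draw with larger symbols and thicker lines in both the scatterplot and the map.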
  The notion of dynamic simultaneous highlighting using
a mouse is often called "brushing" in the literature. Mc-
Donald (1982), Becker and Cleveland (1987), and Stuetzle
(1987) provided early papers on this technique, typically in
the context of attribute variable plots. Carr et al. (1987),
Monmonier (1989, 1990b), Dunn (1989), and Haslett et al.
(1991) have all addressed dynamic subset selection and
highlighting in the geographic context. Brushing is a pow-
erful discovery technique.
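
The core of brushing is a shared selection: records chosen in one view are flagged in every other view. The following minimal sketch assumes a rectangular brush and records stored in the same order for the scatterplot and the map; the names `Brush`, `brush_select`, and `linked_highlight` are hypothetical and not taken from any of the cited systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brush:
    """A rectangular brush in scatterplot coordinates (an assumed shape)."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float

    def contains(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


def brush_select(scatter_xy, brush):
    """Return the set of record indices whose scatterplot point is brushed."""
    return {i for i, (x, y) in enumerate(scatter_xy) if brush.contains(x, y)}


def linked_highlight(cell_ids, selected):
    """Pair each map cell with a flag saying whether to draw it highlighted.

    cell_ids[i] is the map cell for record i, so the same index set drives
    both the scatterplot and the map.
    """
    return [(cell, i in selected) for i, cell in enumerate(cell_ids)]
```

In an interactive setting the selection would be recomputed and both views redrawn each time the mouse moves the brush.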
  Highlighting  is important in ray-glyph plots, because
unusual rays may not be  immediately obvious. The ideal
highlight allows unusual rays to be found almost instantly.
Julesz and Bergen (1983) discuss the visual phenomenon
of immediate symbol location and make the distinction be-
tween rapid preattentive vision and slow attentive search.
They propose the theory of textons to characterize circum-
-------
 municate with the public. The computing world and the
 rise of scientific visualization provide us with
 unprecedented opportunities. The pressing needs of humankind
 on a finite planet provide us with unprecedented challenges.

 ACKNOWLEDGMENTS
 The authors would like to thank Jeanne Simpson, who played an
 active role in the  data preparation and summarization; Kevin Ad-
 ams, who helped develop software for the initial hexagon mosaic
 maps; and the reviewers for their helpful comments. The research
 described in this article has been funded by the National Science
 Foundation under grant no. DMS-9107188 and by EPA through
 contracts  68-C8-0006 to ManTech Environmental Technology Inc.
 and 68-C0-0021 to Technical Resources Inc. This paper went through
 the U.S. EPA's peer and administrative review and was approved
 for publication.

 REFERENCES
 Becker, R.A., and W.S. Cleveland. 1987. "Brushing Scatterplots."
  Technometrics, vol. 29, pp. 127-142.
 Carr, D.B. 1991. "Looking at Large Data Sets Using Binned Data
  Plots." Computing and Graphics in Statistics, A. Buja and P. Tukey
  (eds.), pp. 7-39. New York: Springer-Verlag.
 Carr, D.B., R.J. Littlefield, W.L. Nicholson, and J.S. Littlefield.
  1987. "Scatterplot Matrix Techniques for Large N." Journal of the
  American Statistical Association, vol. 82, no. 398, pp. 424-436.
 Carstensen, L.W., Jr. 1982. "A Continuous Shading Scheme for
  Two-Variable Mapping." Cartographica, vol. 19, pp. 53-70.
 Castner, H.W., and A.H. Robinson. 1969. Dot Area Symbols in Car-
  tography: The Influence of Pattern on Their Perception, Technical
  Monograph CA-4. Bethesda, Maryland: ACSM.
 Chambers, J.M., W.S. Cleveland, B. Kleiner, and P.A. Tukey. 1983.
  Graphical Methods for Data Analysis. Pacific  Grove, California:
  Wadsworth & Brooks/Cole.
 Cleveland, W.S., and R. McGill. 1984. "Graphical Perception:  The-
  ory, Experimentation, and Application to the Development of
  Graphical Methods." Journal of the American Statistical Association,
  vol. 79,  no. 387, pp.  531-554.
 Conway, J.H., and N.J.A. Sloane. 1982. "Voronoi Regions of Lattices,
  Second Moments of Polytopes and Quantization." IEEE
  Transactions on Information Theory, vol. 28, no. 2, pp. 211-226.
 Cressie, N.A.C. 1991. Statistics for Spatial Data. New York: John
  Wiley & Sons Inc.
 Dunn,  R.  1989. "A  Dynamic Approach to Two-Variable Color
  Mapping." American  Statistician, vol. 43, no. 4, pp. 245-251.
 Eyton,  J.R. 1984.  "Complementary-Color,  Two-Variable Maps."
  Annals of the Association of American  Geographers, vol. 74, pp.  477-
  490.
 Gilbert,  R.O. 1987. Statistical  Methods for  Environmental Pollution
  Monitoring.  New York: Van Nostrand Reinhold Company.
 Grotch, S.L. 1983. "Three-Dimensional and Stereoscopic Graphics
  for Scientific Data Display and Analysis." IEEE Computer Graphics
  and Applications, vol. 3, no. 8, pp. 31-43.
 Haslett, J., R. Bradley, P. Craig, A. Unwin, and G. Wills. 1991.
  "Dynamic Graphics for Exploring Spatial Data with Application
  to Locating Global and Local Anomalies." American Statistician,
  vol. 45, no. 3, pp. 234-242.
Julesz, B., and J.R. Bergen. 1983. "Textons, the Fundamental Ele-
  ments in Preattentive Vision and Perception  of Textures." The
  Bell System Technical Journal (Human Factors and Behavioral Science),
  vol. 62,  no. 6, pp. 1619-1645.
 Lavin, S., and J.C. Archer. 1984. "Computer-Produced Unclassed
  Bivariate Choropleth Maps." The American Cartographer, vol. 11,
  no. 1, pp. 49-57.
 Lloyd, R.E., and T.R. Steinke. 1976. "The Decision Making Process
   for Judging the Similarity of Choropleth Maps." The American
   Cartographer, vol. 3, no. 2, pp. 174-184.
 	1977. "Visual and Statistical Comparison of Choropleth
   Maps." Annals of the Association of American Geographers, vol. 67,
   no. 3, pp. 429-436.
 MacEachren, A.M., and D. DiBiase. 1991. "Animated Maps of Ag-
   gregate Data: Conceptual and Practical Problems."  Cartography
   and Geographic Information Systems, vol. 18, no. 4, pp. 221-229.
 McDonald, J.A. 1982. "Interactive Graphics for Data Analysis,"
   Technical Report Orion 11. Stanford, California: Department of
   Statistics, Stanford University.
 Messer, J.J.,  R.A. Linthurst, and W.S. Overton. 1991. "An EPA
   Program for Monitoring Ecological Status and Trends." Environ-
   mental Monitoring and Assessment, vol. 17, pp. 67-78.
 Monmonier, M. 1979. "An Alternative Isomorphism for the Map-
   ping of Correlation."  International Yearbook of Cartography, vol.
   16, pp. 77-89.
 	1989. "Geographic Brushing: Enhancing Exploratory Analysis
   of the Scatterplot Matrix." Geographical Analysis, vol. 21, pp. 81-84.
 	1990a. "Strategies for the Visualization of Geographic Time-
   Series Data." Cartographica, vol. 27, pp. 30-45.
 	1990b.  "Strategies for the Interactive Exploration of Geo-
   graphic Correlation." Proceedings of the Fourth International Sym-
   posium on Spatial Data Handling, Zurich, pp. 512-521.
 Olson, J.M. 1981. "Spectrally Encoded Two-Variable Maps." Annals
   of the Association of American Geographers, vol. 71, pp. 259-276.
 Pfaltz, J.L., and A. Rosenfeld. 1967. "Computer Representation of
   Planar Regions by Their Skeletons." Communications of the ACM,
   vol. 10, no. 2, pp. 119-123.
 Pregibon, D. 1989. "Discussion of Regression Diagnostics with Dy-
   namic Graphics." Technometrics, vol. 31, no. 3, pp. 297-301.
 Scott, D.W. 1985. "A Note on Choice of Bivariate Histogram Bin
   Shape," Technical Report 85-822-3. Houston, Texas: Department
   of Mathematical Sciences, Rice University.
 	. 1992. Multivariate Density Estimation: Theory, Practice and Vi-
   sualization. New York: John Wiley & Sons Inc.
 Scott, D.W., A.M. Gotto, J.S. Cole, and G.A. Gorry. 1978. "Plasma
   Lipids as Collateral Risk Factors in Coronary Artery Disease: A
   Study of 371 Males with Chest Pain." Journal of Chronic Diseases,
   vol. 31, pp. 337-345.
 Serra, J. 1982. Image Analysis and Mathematical Morphology. New
   York: Academic Press.
 Simpson, J.C., and A.R. Olsen. 1990a. "1987 Wet Deposition Tem-
   poral and Spatial Patterns in North America," Technical Report
   PNL-7208. Richland, Washington: Pacific Northwest Laboratory.
 	1990b. "Uncertainty in North America Wet Deposition Iso-
   pleth Maps: Effect of Site Selection and Valid Sample Criteria,"
  Technical Report PNL-7291. Richland, Washington: Pacific
   Northwest Laboratory.
 Steinke, T.R., and R.E. Lloyd. 1983. "Judging the Similarity of
   Choropleth Map Images." Cartographica, vol. 20, pp. 35-42.
 Stuetzle, W. 1987. "Plot Windows." Journal of the American
   Statistical Association, vol. 82, no. 398, pp. 446-475.
 	1991. "Odds Plots: A Graphical Aid for Finding Associations
   Between Views of a Data Set." Computing and Graphics in
   Statistics, A. Buja and P. Tukey (eds.), pp. 207-217. New York:
   Springer-Verlag.
 Tufte, E. 1991. Envisioning Information. Cheshire, Connecticut:
   Graphics Press.
Wainer, H., and C.M. Francolini. 1980. "An  Empirical Inquiry
  Concerning Human Understanding of Two-Variable Color Maps."
  American Statistician, vol. 34, no. 2, pp. 81-93.
White, D., A.J. Kimerling, and W.S. Overton. 1992. "Cartographic
  and Geometric Components of a Global Sampling  Design for
   Environmental  Monitoring." Cartography  and Geographic  Infor-
  mation Systems,  vol. 19, no. 1,  pp. 5-22.

-------
[Color plate: "Annual 1985-1987 Sulfate Deposition"; graphic not reproduced. Two legends, both in kg/ha.
 Left legend: Max = 17.8; 95%: 13.3; 90%: 11.2; 75%: 6.9; 4.0 (class label not recoverable); Min = 0.8.
 Right legend: Max = 42.7; 95%: 32.3; 90%: 28.6; 75%: 23.8; 17.3, 14.7, 11.4 (class labels not recoverable); Min = 5.5.]
Figure 3, Carr et al. Hexagon mosaic map of sulfate deposition (kg/ha). The map is split to provide greater resolution in both the east and west. Percent of area, found by counting
hexagons, determines the class interval boundaries. The hexagon edges at class boundaries indicate the hexagon cell size. The underlying estimation lattice consists of hexagon cell
centers. The map is similar to a color-contour map, but suggests involvement of an estimation process.

-------
[Color plate: Bivariate "Cross Map" with a scatterplot of Females Working; graphic not reproduced.
 Legend: BELOW the mean for BOTH variables; ABOVE mean ONLY for Female Officials; ABOVE mean ONLY for Females Working; ABOVE the mean for BOTH variables. Breaks at nationwide means.]
Figure 4, Monmonier. Juxtaposed cross map and scatterplot used in the correlation script's second act. Conventional key does not appear below the map
until the end of the scene, after spiral variation of the category breaks (described in the text).

-------
                              TECHNICAL REPORT DATA
           (Please read instructions on the reverse before completing)
 1. REPORT NO.
  EPA/600/J-94/167
 2.
 3. RECIPIENT'S ACCESSION NO.
  PB94-160538
 4. TITLE AND SUBTITLE
  Hexagon Mosaic Maps for Display of Univariate
  and Bivariate Geographical Data.
 5. REPORT DATE
 6. PERFORMING ORGANIZATION CODE
 7. AUTHOR(S)
  D.B. Carr (a), A.R. Olsen (b), D. White (c)
 8. PERFORMING ORGANIZATION REPORT NO.
 9. PERFORMING ORGANIZATION NAME AND ADDRESS
  (a) George Mason Univ.
  (b) USEPA, ERL-Corvallis, Corvallis, OR
  (c) METI, Corvallis, OR
 10. PROGRAM ELEMENT NO.
 11. CONTRACT/GRANT NO.
 12. SPONSORING AGENCY NAME AND ADDRESS
  US Environmental Protection Agency
  Environmental Research Laboratory
  200 SW 35th Street
  Corvallis, OR 97333
 13. TYPE OF REPORT AND PERIOD COVERED
  Journal Article
 14. SPONSORING AGENCY CODE
  EPA/600/02
 15. SUPPLEMENTARY NOTES
  1992. Cartography and Geographic Information Systems
  19(4):228-236, 271, 271
 16. ABSTRACT
 Hexagon mosaic  maps and  hexagon-based  ray glyph maps are presented.
 The  phrase  "hexagon mosaic map"  refers  to maps that  use hexagons to
 tessellate major  areas  of a map  such as land masses.   Hexagon mosaic
 maps are  similar to  color-contour   (isarithm)  maps  and  show  broad
 regional patterns.  The ray glyph,  an oriented line segment with a dot
 at the  base,  provides a convenient  symbol for representing information
 within  a hexagon  cell.   Ray angle  encodes the  local  estimate  for the
 hexagon.  A simple extension adds upper- and lower-confidence  bounds as
 a shaded arc  bounded by two rays.  Another extension, the  bivariate ray
 glyph,   provides  a  continuous  representation  for  showing  the  local
 correlation of  two variables.   The  theme of integrating statistical
 analysis  and cartographic methods  appears  throughout   this  paper.
 Example maps show statistical  summaries  of acidic  deposition data for
 the  Eastern United States.  These maps provide  useful templates for a
 wide range  of   statistical   summarization  and  exploration  tasks.
 Correspondingly,  the concepts  in this paper  address the  incorporation
 of statistical information, visual appeal, representational  accuracy,
 and  map interpretation.
 17. KEY WORDS AND DOCUMENT ANALYSIS
  a. DESCRIPTORS
   hexagon mosaic maps, ray-glyph maps,
   comparison plots, bivariate maps,
   brushing.
  b. IDENTIFIERS/OPEN ENDED TERMS
  c. COSATI Field/Group
 18. DISTRIBUTION STATEMENT
  Release to Public
 20. SECURITY CLASS (This page)
  Unclassified
 21. NO. OF PAGES
  8
 22. PRICE
-------