United States
Environmental
Protection Agency
Office of
Pollution Prevention
and Toxics
EPA 747-R96-002
May 1996
Distribution of Soil Lead in
the Nation's Housing Stock
-------
Distributions of Soil Lead in the
Nation's Housing Stock
This work was conducted under contract
number 68-D3-0011
Prepared for
Samuel Brown, Work Assignment Manager
Technical Programs Branch
Chemical Management Division
Office of Pollution Prevention and Toxics
U.S. Environmental Protection Agency
Washington, D.C. 20460
May, 1996
-------
The material in this document has been subject to Agency technical and policy review and approved for
publication as an EPA report. The views expressed by individual authors, however, are their own and do
not necessarily reflect those of the U.S. Environmental Protection Agency. Mention of trade names,
products, or services does not convey, and should not be interpreted as conveying, official EPA approval,
endorsement, or recommendation.
-------
Table of Contents
1. Introduction
1.1 Purpose of Report
1.2 Overview of the National Survey
2. Conclusions
3. Descriptive Soil Lead and Housing Unit Statistics
3.1 Soil Lead Data
3.2 Soil Lead Prevalence
3.3 Housing Unit Characteristics
3.4 Preliminary Analysis of Soil Lead Data and Housing Unit Characteristics
3.5 Suitability of Soil Lead Data
3.6 Implications of Missing Soil Lead Data
4. Statistical Approach
4.1 Private and Public Housing Model
4.2 Modeling and Testing Procedures
4.3 Confidence Intervals for Classification Percentages
5. Model Results
5.1 Private Housing Results
-------
Tables
Table 1. Descriptive statistics (weighted) for the lead measurements in soil samples
at each soil location in private housing units
Table 2. Descriptive statistics (weighted) for the lead measurements in soil samples
at each soil location in public housing units
Table 3. Detailed distribution of private housing units missing soil lead concentration
measurements and the total number of homes by age, region, and urbanization
Table 4. Detailed distribution of public housing units missing soil lead concentration
measurements and the total number of homes by age and region
Table 5. Correlations between log-transformed soil lead measurements in private housing
from different locations around the same housing unit
Table 6. Correlations between log-transformed soil lead measurements in public housing
from different locations around the same housing unit
Table 7. Estimated percent and number of U.S. homes built before 1980 exceeding various
soil lead concentrations
Table 8. Weighted geometric means for soil lead concentrations (ppm) by soil location and
housing unit characteristics for private housing units
Table 9. Weighted geometric means for soil lead concentrations (ppm) by soil location and
housing unit characteristics for public housing units
Table 10. Correlations between soil lead concentrations and housing unit characteristics
for private housing units
Table 11. Correlations between soil lead concentrations and housing unit characteristics
for public housing units
Table 12. Chi-square results for building age and Census region variables for
private housing units
Table 13. Chi-square results for building age and Census region variables for
public housing units
Table 14. Soil lead model regression statistics for private housing unit models
Table 15. Least-squares means and 95 percent confidence intervals for categorical
variables in the private housing unit models
Table 16. Least-squares means and 95 percent confidence intervals for building age
by region interactions in the private housing unit models
Table 17. Soil lead model regression statistics for public housing unit models
Table 18. Least-squares means and 95 percent confidence intervals for categorical
variables in the public housing unit models
-------
Figures
Figure 1. Least squares means and 95 percent confidence intervals for soil lead
concentrations in private housing for building age by soil location
Figure 2. Least squares means and 95 percent confidence intervals for soil lead
concentrations in private housing for Census region by soil location
Figure 3. Least squares means and 95 percent confidence intervals for soil lead
concentrations in private housing for degree of urbanization by soil location
Figure 4. Least squares means for soil lead concentrations in private housing for the
degree of urbanization and building age interaction by soil location
Figure 5. Least squares means and 95 percent confidence intervals for soil lead
concentrations in public housing for building age by soil location
Figure 6. Least squares means and 95 percent confidence intervals for soil
lead concentrations in public housing for Census region by soil location
-------
Acknowledgments
The Office of Pollution Prevention and Toxics of EPA would like to express its appreciation
for the many efforts and the contributions of Westat in the data analysis, interpretation, writing, and
preparation of this report. We would also like to thank Cindy Stroup, Samuel F. Brown, Brad Schultz,
Philip E. Robinson, and Sineta Wooten for their guidance and support throughout this research.
-------
Executive Summary
The primary objective of this study was to supplement the prior reports on the National Survey
of Lead-Based Paint in Housing through additional data analyses specifically focusing on the
relationship between lead in exterior soil (a potential source of lead hazard in homes) and housing unit
characteristics. The 1987 amendments to the Lead-Based Paint Poisoning Prevention Act required the
Secretary of Housing and Urban Development (HUD) to "estimate the amount, characteristics and
regional distribution of housing in the United States that contains lead-based paint hazards at differing
levels of contamination." In response to this act, HUD initiated and conducted the National Survey of
Lead-Based Paint in Housing, or the National Survey, which was completed in 1990. The survey results
were published in the Environmental Protection Agency's (EPA) Report on the National Survey of
Lead-Based Paint in Housing, which documents the National Survey and presents data on the extent and
characteristics of lead hazards in homes.
The National Survey inspected 381 housing units (284 privately-owned and 97 public) for lead in
paint on interior and exterior surfaces, lead in interior dust, and lead in exterior soil. The study
population was designed to be representative of nearly all housing in the United States constructed
before 1980. Newer houses were presumed to be lead-free because in 1978 the Consumer Product Safety
Commission banned the sale of lead-based paint to consumers and the use of such paint in residences.
The National Survey was conducted between December 1989 and March 1990 in 30 counties across the
48 contiguous states. These counties were selected to represent both the public and privately-owned
housing stock across the 48 contiguous states.
The purpose of this report is to supplement discussions on soil lead prevalence in the prior
reports on the National Survey by presenting findings on the prevalence and concentrations of lead in
soil around private and public housing units in the United States. These findings include estimates of
the number of housing units with different soil lead concentrations, nationally, by building age, by
Census region, and by degree of urbanization; and summaries of the statistical associations between soil
lead concentrations and soil location, building age, degree of urbanization, Census region, and the
presence and condition of interior and exterior lead-based paint.
The quality of the private and public housing data was statistically evaluated to determine the
suitability of the soil lead data for the analyses needed in this study. The privately-owned homes
sampled in the National Survey were judged to be representative of the private housing stock nationally.
Therefore, the descriptive statistics presented in the private housing data tables and the results from the
analyses on the private housing data can be viewed as applicable to private housing nationally and useful
in policy analysis and decision making. In contrast, the sampled public housing units were not
considered representative of the public housing stock nationally, and the impact of the large amount of
missing soil data (70%) on the tables and analysis results was expected to be significant. The public
housing data tables and results from the analyses on public housing should therefore be viewed only as
descriptive of those samples collected.
Under Section 403 of Title X, EPA has established health-based interim standards for soil lead
concentrations and action recommendations for each standard. The agency recommends that "interim
controls to change use patterns and establish barriers" should be implemented for areas that are expected
to be used by children where soil lead concentrations are between 400 and 5,000 parts per million (ppm).
Within this range, the degree of activity should be "commensurate with the expected risk posed by the
bare soil considering both the severity of [lead] exposure—and the likelihood of the children's exposure."
For areas where contact by children is less likely or less frequent, the "interim controls" should be
implemented when soil lead concentrations are between 2,000 and 5,000 ppm. Moreover, the agency
recommends the "abatement of soil" with lead concentrations above 5,000 ppm regardless of the
likelihood of children's exposure.
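
The interim guidance described above amounts to a small decision rule. The following Python sketch is illustrative only: the function name `recommended_action`, the category labels, and the handling of the exact boundary values (whether 400, 2,000, and 5,000 ppm themselves fall inside a range) are assumptions and are not taken from the guidance text.

```python
def recommended_action(soil_lead_ppm: float, child_contact_likely: bool) -> str:
    """Map a soil lead concentration (ppm) to the guidance category described above.

    Boundary handling is an assumption: the lower bounds are treated as
    inclusive, and only values strictly above 5,000 ppm count as abatement.
    """
    if soil_lead_ppm < 0:
        raise ValueError("soil lead concentration must be non-negative")

    # Above 5,000 ppm: abatement regardless of the likelihood of children's exposure.
    if soil_lead_ppm > 5000:
        return "abatement"

    # Areas expected to be used by children: interim controls from 400 ppm.
    # Areas where contact is less likely or less frequent: from 2,000 ppm.
    lower_bound = 400 if child_contact_likely else 2000
    if soil_lead_ppm >= lower_bound:
        return "interim controls"

    return "no action specified"


if __name__ == "__main__":
    for ppm, likely in [(1000, True), (1000, False), (3000, False), (6000, False)]:
        print(ppm, likely, recommended_action(ppm, likely))
```

The same concentration can fall into different categories depending on expected use of the area, which is why the likelihood of contact is a separate argument rather than being folded into the thresholds.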
Using the data from the National Survey, it is estimated that 23 percent, or 18 million, of the
privately-owned homes in the United States built before 1980 have soil lead levels that exceed the 400
ppm "interim control" guideline. An estimated 8 percent, or 6 million, of the privately-owned homes in
the United States built before 1980 have soil lead levels that exceed the 2,000 ppm "interim control"
guideline. Finally, an estimated 3 percent, or 2.5 million, of the privately-owned homes in the United
States built before 1980 have soil lead levels that exceed the 5,000 ppm soil abatement guideline. The
prevalence and distribution of soil lead concentrations in public housing was not estimated due to the
considerable number of public housing units in the National Survey for which no soil was available for
sampling.
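
National percentages and counts of this kind are typically formed as weighted estimates from the sampled homes. The sketch below shows one way such an estimate could be computed, assuming each sampled home carries a sample weight equal to the number of homes it represents. The data values, the function name `weighted_exceedance`, and the use of a single soil lead value per home are hypothetical and are not drawn from the National Survey.

```python
def weighted_exceedance(units, threshold_ppm):
    """Return (percent, weighted count) of homes whose soil lead exceeds a threshold.

    `units` is a list of (soil_lead_ppm, sample_weight) pairs.
    """
    total_weight = sum(weight for _, weight in units)
    if total_weight <= 0:
        raise ValueError("total sample weight must be positive")

    # Sum the weights of the homes above the threshold.
    exceed_weight = sum(weight for ppm, weight in units if ppm > threshold_ppm)
    return 100.0 * exceed_weight / total_weight, exceed_weight


# Hypothetical sample: (soil lead in ppm, number of homes represented).
sample = [(150.0, 300_000), (450.0, 250_000), (2500.0, 200_000), (6000.0, 250_000)]

if __name__ == "__main__":
    for threshold in (400, 2000, 5000):
        percent, count = weighted_exceedance(sample, threshold)
        print(f"> {threshold} ppm: {percent:.1f}% ({count:,} homes)")
```

Because each home contributes its weight rather than a count of one, the estimated number of homes is simply the weighted sum, and the percentage is that sum divided by the total weight.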
This study assessed the associations between the soil lead concentrations at different locations
and the presence and condition of interior and exterior lead-based paint to determine which
characteristics and factors specific to the housing unit are good predictors of soil lead. Additional
variables also considered to be related to soil lead included the average daily traffic flow in the
neighborhood of the housing unit (for private housing only) and the number of family units in the
development (for public housing only), both of which were used to estimate the impact of the housing
unit's environment on soil lead.
Private Housing
The strongest statistical predictor of soil lead was found to be the building age. Building age
measures the length of time since the construction of the building, which in many cases may have been
the last major disturbance of the soil. For private housing units, soil lead concentrations around homes
built before 1940 were significantly greater than those around homes built between 1960 and 1979.
Similarly, soil lead concentrations around public housing units built before 1950 were significantly
greater than those around homes built between 1960 and 1979.
The Census region (Northeast, Midwest, South, West) in which the housing unit was located was
also an important predictor of soil lead levels. The data analysis showed that after adjusting for the age
of the housing unit, soil around private housing units in the Northeast region has, on average, higher lead
concentrations than in any other region, and soil in the Midwest region has, on average, higher lead
concentrations than soil in either the West or South regions. One possible explanation is that the
Northeast and Midwest are the most industrialized (for example, have the highest level of industrial
productivity) of the four regions of the United States.
Another finding was that soil lead levels around homes in urban, suburban, and rural areas were
unexpectedly not significantly different after adjusting for building age and other factors. Possible
explanations of this result include one or more of the following: the privately-owned homes where soil
lead measurements were not taken may correspond to sites that were expected to have high soil lead
concentrations (33 of the 93 sampled private housing units in large metropolitan areas have at least one
missing soil lead measurement); correlations between the degree of urbanization and other factors,
such as traffic, might be reducing the effect of highly urbanized areas; and random variation in the
data associated with the selection of the homes.
-------
After adjusting for building age, Census region, and other factors, the presence of lead-based
paint was an important predictor of soil lead at all three locations. The condition of lead-based paint,
however, was not an important predictor of soil lead at any of the three soil locations.
Public Housing
Soil lead samples were available for only 30 percent (29 of 97) of the sampled public housing
units, and the distribution of public housing units with soil lead samples was not consistent with national
distributions. These problems prevented any reliable national estimates of soil lead prevalence in public
housing from being calculated.
Although no estimates for the effects of the degree of urbanization could be made with respect to
public housing developments, the relationship between soil lead and housing unit characteristics in
public housing was analyzed with respect to building age and the presence and condition of lead-based
paint. The findings showed that these relationships were similar to those in private housing data. The
building age was the most important predictor of soil lead concentrations. The Census region in which
the development was located was an important predictor of soil lead after adjusting for the age of the
development. Housing unit variables that were correlated with soil lead but were not significant
predictors of soil lead after adjusting for the age of the development and the Census region included the
number of family units in the public housing development (which was slightly correlated with the
development's building age) and the condition of lead-based paint in and around the housing unit.
-------
1. Introduction
The 1987 amendments to the Lead-Based Paint Poisoning Prevention Act required the Secretary
of Housing and Urban Development (HUD) to "estimate the amount, characteristics and regional
distribution of housing in the United States that contains lead-based paint hazards at differing levels of
contamination." In response to this act, HUD initiated the National Survey of Lead-Based Paint in
Housing, or the National Survey, which was completed in 1990. The National Survey produced a
detailed, statistically valid, national database on the extent of lead-based paint and lead in soil and dust.
These data have been and continue to be analyzed to support the development of Federal policy and
programs with respect to the lead hazard in homes.1
Issues currently before the U.S. Environmental Protection Agency involve the relationships
between housing unit characteristics and lead exposure levels. Soil lead is believed to be a significant
contributor to the lead hazard in homes since children often come in contact with lead through soil and
dust. In addition, lead-based paint, primarily exterior lead-based paint, is believed to be a significant
contributor to soil lead contamination. Although the National Survey did not collect data on direct
measures of lead exposure, such as children's blood lead levels, an analysis of the relationship between
soil lead and housing unit characteristics may aid in understanding the relationship between housing unit
characteristics and potential lead exposure.
EPA is developing health-based standards for dust, paint, and soil lead concentrations under
Section 403 of the Residential Lead-Based Paint Hazard Reduction Act of 1992 (Title X). These
standards are published as EPA's Guidance on Identification of Lead-Based Paint Hazards2 and referred
to as the 403 Interim Final Rule.
1.1 Purpose of Report
The purpose of this report is to supplement the prior reports on the National Survey by
addressing the following objectives:
• Present findings from the National Survey on the prevalence and concentrations of lead in
soil around private and public housing units in the United States, including estimates of the
number of housing units with different soil lead concentrations, nationally, by building age,
Census region, and degree of urbanization;
• Summarize the statistical associations between soil lead concentrations and soil location,
building age, degree of urbanization, Census region, and the presence and condition of
interior and exterior lead-based paint;
1 A complete discussion of the National Survey, including the design, sample collection protocol, and results from
the data analyses, can be found in EPA's Report on the National Survey of Lead-Based Paint in Housing.
2 Guidance on Identification of Lead-Based Paint Hazards, Federal Register, v. 60 (175): September 11, 1995.
-------
1.2 Overview of the National Survey
The National Survey was conducted by HUD. In that sample survey, 381 housing units, 284
private and 97 public, were inspected for lead in paint on interior and exterior surfaces, lead in interior
dust, and lead in exterior soil. The objective of the National Survey was to obtain data for estimating the
following:
• The number of housing units with lead-based paint;
• The surface area of lead-based paint in housing, used to develop an estimate of national
abatement costs;
• The condition of the paint;
• The prevalence of lead in house dust and in soil around the perimeter of residential
structures; and
• The characteristics associated with varying levels of potential lead hazards in housing in
order to examine possible priorities for abatement.
The study population consisted of nearly all housing in the United States constructed before
1980. Newer houses were presumed to be lead-free because in 1978 the Consumer Product Safety
Commission banned the sale of lead-based paint to consumers and the use of such paint in residences.
The survey was conducted between December 1989 and March 1990 in 30 counties across the 48
contiguous states.
The 30 counties were randomly selected from the approximately 3,000 counties in the United
States to represent the nation's private and public housing stock built before 1980. The counties were
stratified by Census region (Northeast, South, Midwest, and West) and climate (mild or severe weather)
and selected with probability proportional to size. The private housing units were selected as follows.
Within each sampled county, five census blocks were randomly selected and a list of every housing unit
within each census block was developed. An initial sample of the listed units was randomly selected for
in-person screening visits to establish eligibility. An average of 20 housing units per census block were
screened and an average of 11 were found to be eligible. From the eligible housing units, two (plus
backups) were randomly selected.
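
Selection with probability proportional to size can be carried out in several ways, and the report does not say which procedure or size measure was used. The sketch below shows one common approach, systematic PPS sampling, applied within a single stratum. The function name `pps_systematic_sample`, the size values, and the choice of systematic selection are all assumptions made for illustration.

```python
import random


def pps_systematic_sample(sizes, n, rng):
    """Select n indices with probability proportional to size (systematic PPS).

    A unit whose size exceeds the sampling interval can be selected more
    than once; survey designs usually treat such units separately.
    """
    total = sum(sizes)
    if n <= 0 or total <= 0:
        raise ValueError("n and the total size must be positive")

    step = total / n                 # sampling interval on the cumulative size scale
    start = rng.random() * step      # random start in [0, step)
    selected = []
    cumulative = 0.0                 # total size of the units before index idx
    idx = 0
    for k in range(n):
        target = start + k * step
        # Advance to the unit whose cumulative size range contains the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


if __name__ == "__main__":
    # Hypothetical size measures (for example, housing units per county).
    county_sizes = [10, 20, 30, 40, 50, 60, 70, 80]
    print(pps_systematic_sample(county_sizes, 3, random.Random(0)))
```

Larger units cover a longer stretch of the cumulative size scale, so the evenly spaced targets land on them more often, which is what gives each unit a selection probability proportional to its size.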
The public housing units were selected as follows. Within each sampled county, lists of the
Public Housing Authority (PHA) housing developments, including the numbers and types of units in the
development, were created from lists supplied by HUD. The lists for each of the 30 counties were
merged, sorted by the age of the development, and a stratified random sample of 110 developments was
drawn. Within each of the selected developments, one unit was randomly selected.
Within each sampled private and public housing unit, two rooms were randomly selected for
inspection — one with plumbing, a "wet room," and the other without plumbing, a "dry room." In each
room, field technicians inventoried painted surfaces, measured the surface area, and assessed the
condition of the paint. They also measured the lead loadings on randomly selected painted surfaces with
portable X-ray fluorescence (XRF) analyzers. Exterior painted surfaces of each dwelling unit were also
inventoried, and XRF measurements were made on one randomly selected side of the house to detect the
presence of lead in paint.
-------
Exterior soil samples and interior dust samples were also collected. Generally, three soil core
samples were taken from each dwelling unit: one outside the main entrance to the building, a second
along the drip line (soil next to the housing unit), and a third at a remote location away from the building
but still on the property. The drip line and the remote samples were usually collected on the same,
randomly selected side of the house as the exterior XRF paint lead measurement. Dust samples were
collected on floors, window wells, and window sills in the wet and dry rooms and from the floor
immediately inside the main entrance to the dwelling unit. Dust samples were also collected from
common areas inside private multifamily and public housing units. Since the sample size for the
common area dust samples was small, they are not discussed in this report. Both dust and soil samples
were sent to laboratories for lead analysis.
Midwest Research Institute (MRI) was the subcontracting laboratory responsible for the analysis
of both soil and dust samples. MRI and its subcontractor, Core Laboratories, with a site in Casper,
Wyoming and another in Aurora, Colorado, analyzed the samples for lead. The Casper facility analyzed
both soil and dust samples, while the Aurora facility analyzed only dust samples. A total of 3,231
samples, 1,053 soil samples and 2,178 dust samples, were analyzed. The dust samples were analyzed by
graphite furnace atomic absorption (GFAA) spectroscopy. The soil samples were analyzed by
inductively coupled plasma-atomic emission spectrometry (ICP-AES). Internal checks, including
duplicate injections to measure instrument precision, and external checks, including the analysis of split
samples to measure the variability from sample handling prior to analysis, were used to track
performance. In addition, performance check samples were analyzed to measure the accuracy of the
analytical procedures. The results on the internal, external, and performance checks were satisfactory,
meeting most of the data quality objectives. MRI's Analysis of Soil and Dust Samples for Lead (Pb),
Final Report3 details its methodology and data quality procedures.
3 Analysis of Soil and Dust Samples for Lead (Pb), Final Report, May 8, 1991. Prepared under contract to the U.S.
Environmental Protection Agency. EPA Contract No. 68-02-4252.
-------
2. Conclusions
This chapter presents the overall conclusions from the analyses of possible predictors of lead in
soil. The specific objectives and analytic requirements of many of these analyses were not foreseen
when the National Survey was designed and implemented. Therefore, the suitability of the data for
analysis, including a review of what the data actually represent, was evaluated. Each conclusion
about the suitability of the data and about the results of the analyses is presented below, followed by a
more detailed explanation.
1. The private housing data in the National Survey can be viewed as representative of the
nation's housing stock and suitable for the analysis.
For private housing units, the distribution of households in the National Survey was not
significantly different from the distribution of households in the American Housing Survey with respect
to building age. Differences with respect to the Census region, though, were only marginally significant
in that more dwelling units located in the South were sampled in the National Survey than expected
based on the American Housing Survey. Additionally, soil samples were taken at 94 percent of the
private housing units in the National Survey. Because the distributions of households in the National
Survey were not significantly different from those found in the American Housing Survey, only a small
percentage (six percent) of the sampled privately-owned homes had no soil lead measurements, and a
large amount of data was available (over 250 observations for each model), there are no apparent reasons
why inferences cannot be drawn from analyses for private homes.
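
The comparison between the sampled distribution and a reference distribution can be illustrated with a chi-square goodness-of-fit calculation. The sketch below is a simplified, unweighted version under stated assumptions: the category counts and reference proportions are hypothetical, the function name `chi_square_gof` is illustrative, and the report's own tests may have been computed differently (for example, with sample weights).

```python
def chi_square_gof(observed, expected_props):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    if len(observed) != len(expected_props):
        raise ValueError("observed and expected_props must have the same length")
    if abs(sum(expected_props) - 1.0) > 1e-9:
        raise ValueError("expected proportions must sum to 1")

    n = sum(observed)
    expected = [n * p for p in expected_props]
    # Sum of squared deviations scaled by the expected counts.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical counts of sampled homes in four categories, and hypothetical
# reference proportions for the same categories.
observed_counts = [50, 70, 100, 64]
reference_props = [0.20, 0.25, 0.33, 0.22]

# Upper 5 percent critical value of the chi-square distribution with 3 degrees of freedom.
CRITICAL_VALUE_DF3 = 7.815

if __name__ == "__main__":
    stat, df = chi_square_gof(observed_counts, reference_props)
    verdict = "significant" if stat > CRITICAL_VALUE_DF3 else "not significant"
    print(f"chi-square = {stat:.3f}, df = {df}: {verdict} at the 5 percent level")
```

A statistic below the critical value, as in this made-up example, indicates no significant difference between the sampled and reference distributions.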
2. The public housing data in the National Survey cannot be viewed as representative of the
nation's public housing stock and results about public housing should be viewed with
caution.
For public housing units, differences between the distribution of sampled public housing units
and the distribution of all public housing units, provided by HUD, are significant based on both Census
region and building age. Moreover, problems with the lack of soil lead measurements make analyses of
the data difficult to interpret. Soil samples were available at only 30 percent (29 of 97) of the sampled
public housing developments. Given both the distributional inequality and the relatively small number
of public housing units where soil samples were taken (n=29), all conclusions about public housing units
and results from analyses of the public housing data should be viewed with caution.
3. The strongest statistical predictor of soil lead in private and public housing for all sample
locations is the housing unit's date of construction.
The date of construction, or building age, measures the amount of time since the construction of
the building, which in many cases was the last major disturbance of the soil. Thus, the building age
likely measures the length of time that lead, from the housing unit and/or neighboring activity sources,
has been accumulating on the soil. For private housing units, soil lead concentrations around homes
built before 1940 were significantly greater than those around homes built between 1960 and 1979.
Similarly, soil lead concentrations around public housing units built before 1950 were significantly
greater than those around homes built between 1960 and 1979.
-------
4. Additional significant predictors of soil lead in private housing include the Census region,
the interaction between the building age and the Census region, the presence of lead-based
paint, and the average daily traffic flow.
After adjusting for the housing unit's age, soil around privately-owned homes in the Northeast
region was estimated to have, on average, higher lead concentrations than in any other region. In
addition, soil around privately-owned homes in the Midwest region was estimated to have, on average,
higher lead concentrations than in either the West or South regions. Soil at the
remote location around privately-owned homes in the Midwest region built between 1940 and 1949,
however, was estimated to have lower lead concentrations than in any other region. One possible
explanation for the higher average soil lead concentrations in the Northeast and Midwest regions is that
these regions are the most industrialized (for example, have the highest level of industrial productivity)
of the four regions of the United States.
The presence of lead-based paint was shown to have a significantly positive effect on soil lead
concentrations at all three locations, but to a larger extent at the drip line and entryway locations. In
addition, the traffic flow (a source of lead from automobile emissions) in the neighborhood around the
private housing unit was shown to have a significantly positive effect on soil lead concentrations at the
remote location. These results support the concerns in the 403 Interim Final Rule about lead in
residential soil from "lead-based paint and...as the result of point source emissions or leaded gasoline."
5. The degree of urbanization and condition of lead-based paint are not significant predictors
of lead in soil in private housing.
Soil lead levels around homes in urban, suburban, and rural areas were unexpectedly not
significantly different after adjusting for other factors such as building age, Census region, and the
presence of lead-based paint. Explanations of this result are likely to include one or more of the
following: the distribution of the missing soil lead measurements corresponds to sites which were
expected to have high soil lead concentrations (33 of the 93 sampled private housing units in large
metropolitan areas have at least one missing soil lead measurement); the correlations between the degree
of urbanization and other factors, such as building age or traffic, might be reducing the significance of
the effect of highly urbanized areas; and the random variation in the data associated with the selection of
the homes.
After adjusting for the housing unit's building age, Census region, and presence of lead-based
paint, the effect of the condition of lead-based paint on soil lead levels was also unexpectedly
insignificant. This result is likely due to the fact that the condition of lead-based paint is correlated with
the building age, the Census region, and the presence of lead-based paint and does not explain any
significant variation in the soil lead levels after adjusting for the building age, Census region, and
presence of lead-based paint.
6. The only other significant predictor of soil lead in public housing is the Census region.
After adjusting for the building age, soil around public housing developments in the Midwest
and West regions was estimated to have, on average, higher lead concentrations than in the South region.
No estimates of soil lead prevalence around public housing developments could be made for the
Northeast region because only one sampled public housing development had soil samples. In addition,
the effect of the degree of urbanization could not be analyzed because no such data were collected. The
condition of lead-based paint and the number of family units, both positively correlated with soil lead,
were not significant predictors of soil lead after adjusting for building age and Census region.
7. The results for the private housing data can be viewed as applicable to private housing
nationally and useful in policy analysis and decision making, but the results for the public
housing data should be viewed only as descriptive of those housing units sampled.
The quality of the private and public housing data was statistically evaluated to determine the
suitability of the soil lead data for the analyses needed in this study. The privately-owned homes
sampled in the National Survey were judged to be representative of the private housing stock nationally.
Therefore, the descriptive statistics presented in the tables for and the results from the analyses on the
private housing data can be viewed as applicable to private housing nationally and useful in policy
analysis and decision making. In contrast, the sampled public housing units were not considered
representative of the public housing stock nationally, and the impact of the large amount of missing soil
data (70%) on the tables and analysis results was expected to be significant. The tables and analysis
results for public housing should therefore be viewed only as descriptive of those samples collected.
-------
-------
3. Descriptive Soil Lead and Housing Unit Statistics
This chapter discusses the soil lead data; housing unit characteristics, including the
representativeness of the sampled housing units; and soil lead prevalence levels in private and public
housing units. It also presents summaries of the soil lead and housing unit characteristic data in tabular
form. Sample weights were used in the estimates displayed in most of the tables. This was done so that
inferences could be drawn from these estimates about the populations of private and public housing. The
estimates presented in these tables are, under certain circumstances that are discussed and evaluated in
this chapter, representative of private and public housing nationally. The information presented here is
used as background information for the data analyses presented in Chapters 4 and 5.
3.1 Soil Lead Data
The sampling protocols required that soil be collected from three locations around each sampled
dwelling unit. Soil samples were to be taken outside the main entrance to the building, at a selected
location along the drip line of an exterior wall, and at a remote location (away from the building, but still
on the property). The field and laboratory protocols for sampling and analysis are presented briefly in
Chapter 1, in Data Analysis of Lead in Soil and Dust,4 and in MRI's Analysis of Soil and Dust Samples
for Lead (Pb): Final Report.5
Basic weighted descriptive statistics for private and public housing units are presented in Tables
1 and 2. These statistics include the sample mean, standard deviation, coefficient of variation, selected
percentiles, geometric mean, and geometric standard deviation of the soil lead measurements for the
entrance, drip line, and remote soil lead measurements.6 The coefficient of variation is the ratio of the
standard deviation to the mean of the data and describes the spread of the measurements relative to the
average. It is useful for describing data such as soil lead concentration data that are always greater than
or equal to zero. The geometric mean and standard deviation are often used for right skewed data, because
they reduce the impact of extremely large measurements.
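The statistics described above can be sketched in a few lines of Python. This is an illustration only, not the estimation code used for the National Survey: the function name, the example concentrations, and the equal weights are hypothetical, and the variance is computed with a simple weighted (population-style) formula that ignores the survey design.

```python
import math

def weighted_summary(values, weights):
    """Weighted mean, SD, CV, geometric mean, and geometric SD for positive data.

    Assumption: a simple weighted (population-style) variance is used; the
    survey's own variance estimator is not described in this chapter.
    """
    total_w = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total_w
    var = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / total_w
    sd = math.sqrt(var)

    # Geometric statistics are the ordinary statistics of the logged data,
    # transformed back to the original scale.
    logs = [math.log(x) for x in values]
    log_mean = sum(w * lx for lx, w in zip(logs, weights)) / total_w
    log_var = sum(w * (lx - log_mean) ** 2 for lx, w in zip(logs, weights)) / total_w

    return {
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,  # spread relative to the average
        "geometric_mean": math.exp(log_mean),
        "geometric_sd": math.exp(math.sqrt(log_var)),
    }

# Hypothetical soil lead concentrations (ppm) with equal sample weights.
ppm = [20.0, 50.0, 80.0, 200.0, 5000.0]
wts = [1.0, 1.0, 1.0, 1.0, 1.0]
summary = weighted_summary(ppm, wts)
```

In this made-up example the single 5,000 ppm reading pulls the arithmetic mean far above the geometric mean, which is the behavior the text describes for right skewed data.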
Private Housing Data
In some cases, such as around urban private housing units with all areas around the housing unit
paved or with no soil on the property, soil samples were not taken. Of the 284 private housing units in
the National Survey sample, 18 housing units had no soil samples taken and another 26 housing units
were missing data from one or two of the three soil locations. Thus, a total of 44 housing units were
missing one or more soil samples. Of the 18 housing units without soil data, 14 were located in large
metropolitan urban areas, 15 were in the Northeast Census region, and 12 were built before 1940. Of the
44 housing units with some missing soil data, 33 were located in large metropolitan urban areas, 21 were
4 Data Analysis of Lead in Soil and Dust, September, 1993. EPA Report number 747-R-93-011.
5 Analysis of Soil and Dust Samples for Lead (Pb), Final Report, May S, 1991. Prepared under contract to the U.S.
Environmental Protection Agency. EPA Contract No. 68-02-4252.
6 Additional analyses of the soil lead data may be found in the following reports: HUD's Comprehensive and
Workable Plan for the Abatement of Lead-Based Paint in Privately Owned Housing: Report to Congress, and
EPA's Data Analysis of Lead in Soil and Dust and Report on the National Survey of Lead-Based Paint in Housing.
-------
Table 1. Descriptive statistics (weighted) for the lead measurements in soil samples at each soil
location in private housing units

Set of data                           Entrance    Drip line   Remote
                                      samples     samples     samples
Number of measurements                260         249         253
Arithmetic mean (ppm)                 327         448         205
Percentiles (ppm)
  maximum                             6,829       22,974      6,951
  upper 1%                            6,829       9,965       2,974
  upper 5%                            1,377       1,447       603
  upper decile                        775         860         278
  upper quartile                      225         234         120
  median                              64.8        56.2        46.7
  lower quartile                      28.9        21.6        18.5
  minimum                             2.84        1.16        1.45
Geometric mean (ppm)                  85          74          46
Geometric standard deviation (ppm)    2.11        1.80        1.81
-------
Table 2. Descriptive statistics (weighted) for the lead measurements in soil samples at each soil
location in public housing units

Set of data                           Entrance    Drip line   Remote
                                      samples     samples     samples
Number of measurements                26          28          29
Arithmetic mean (ppm)                 127         117         83
Percentiles (ppm)
  maximum                             527         871         614
  upper 1%                            527         871         614
  upper 5%                            483         871         243
  upper decile                        438         265         209
  upper quartile                      186         140         99.5
  median                              44.0        31.2        42.9
  lower quartile                      23.1        22.0        23.1
  minimum                             8.10        10.6        5.67
Geometric mean (ppm)                  55          55          44
Geometric standard deviation (ppm)    1.27        1.28        1.19
-------
in the Northeast Census region, and 20 were built before 1940. A more detailed distribution of the
missing data, including totals for private homes in the National Survey, can be found in Table 3.
Only 24 out of 762, or 3 percent, of the soil lead concentration measurements were reported
below the method detection limit, which ranged from 3 to 20 ppm.7 A common practice of replacing the
measurements below the detection limit with one-half of the detection limit was followed. The replaced
values were consistent with the distribution of all soil lead measurements. Accordingly, the handling of
the measurements below the detection limit is expected to have no significant effect on the statistical
analysis results.
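The substitution rule described above can be sketched as follows. The function name and the example readings are hypothetical; the only element taken from the text is the rule itself, namely that a measurement reported below its method detection limit is replaced by one-half of that limit.

```python
def substitute_nondetects(readings):
    """Replace each value below its detection limit with half of that limit.

    `readings` is a list of (value_ppm, detection_limit_ppm) pairs; the
    detection limit is carried per sample because it varied between samples.
    """
    cleaned = []
    for value, limit in readings:
        if value < limit:
            cleaned.append(limit / 2.0)  # non-detect: use one-half the limit
        else:
            cleaned.append(value)        # detected: keep the reported value
    return cleaned

# Hypothetical readings: two of the four fall below their detection limits.
readings = [(85.0, 10.0), (2.0, 3.0), (12.0, 20.0), (46.0, 3.0)]
cleaned = substitute_nondetects(readings)
```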
Public Housing Data
As with private housing, soil samples were not collected around all of the sampled public
housing units. Unlike the private housing data, where soil samples were taken at all but 6 percent of the
homes, more than 70 percent of the public housing units had no soil samples taken. This considerably
larger percentage of missing data has the potential to significantly bias the results of any analysis. Of the
97 public housing units in the National Survey sample, 68 had no soil samples, and an additional four
housing units were missing data from one or two of the three soil locations. A more detailed distribution
of the missing data for public housing units in the National Survey can be found in Table 4. No soil lead
concentrations for public housing units that were sampled were below the instrument detection limit.
3.2 Soil Lead Prevalence
The weighted sample geometric mean soil lead concentrations at the drip line, entryway, and
remote locations are 74, 85, and 46 ppm, respectively, for private homes and 55, 55, and 44 ppm, respectively,
for public housing units. Paired differences between the log-transformed measurements were used to
determine if the differences in weighted geometric means at different locations were statistically
significant. For private homes, the weighted geometric mean soil lead concentration at the remote
location was significantly lower than that at either the entrance or the drip line locations. The differences
between the entrance and drip line weighted geometric means are not statistically significant. The
weighted geometric mean soil lead concentrations at the drip line, entryway, and remote locations in and
around public housing units were also not significantly different. For both private housing and public
housing, soil lead concentrations at the three locations were all highly correlated, as shown in Tables 5
and 6 respectively.
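A paired comparison on the log scale can be sketched as below. This is a simplified, unweighted illustration: the survey analysis used sample weights, which are omitted here, and the function name and the five pairs of concentrations are hypothetical.

```python
import math

def paired_log_t(first, second):
    """Paired t statistic for the difference in log concentrations.

    Returns the ratio of geometric means, the t statistic, and its degrees
    of freedom. Assumption: an unweighted test on complete pairs.
    """
    diffs = [math.log(a) - math.log(b) for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_diff / math.sqrt(var / n)
    # The mean log difference back-transforms to a ratio of geometric means.
    return math.exp(mean_diff), t_stat, n - 1

# Hypothetical entrance and remote concentrations (ppm) at the same five homes.
entrance = [120.0, 60.0, 300.0, 45.0, 90.0]
remote = [60.0, 40.0, 100.0, 30.0, 80.0]
ratio, t_stat, df = paired_log_t(entrance, remote)
```

Working with differences of logarithms means the test compares geometric means rather than arithmetic means, which matches the way the location effects are reported in this section.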
Under Section 403 of Title X, EPA has established health-based interim standards for soil lead
concentrations and action recommendations for each standard. The agency recommends that "interim
controls to change use patterns and establish barriers" should be implemented for areas that are expected
to be used by children where soil lead concentrations are between 400 and 5,000 ppm. Within this range,
the degree of activity should be "commensurate with the expected risk posed by the bare soil considering
both the severity of [lead] exposure...and the likelihood of the children's exposure." For areas where
contact by children is less likely or less frequent, the "interim controls" should be implemented when soil
lead concentrations are between 2,000 and 5,000 ppm. Moreover, the agency recommends the
"abatement of soil" with lead concentrations above 5,000 ppm regardless of the likelihood of children's
exposure.
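The interim guidance summarized above amounts to a small decision rule, sketched here for illustration. The function name and return labels are hypothetical, and the treatment of a concentration exactly equal to a threshold is an assumption, since the text gives only the ranges.

```python
def recommended_action(ppm, child_contact_likely):
    """Map a soil lead concentration to the interim guidance category.

    Assumptions: a value exactly at 5,000 ppm is treated as within the
    interim-control range, and values below the lower bound return a
    placeholder label because the text specifies no action for them.
    """
    if ppm > 5000:
        return "abatement"  # recommended regardless of likelihood of exposure
    lower_bound = 400 if child_contact_likely else 2000
    if ppm >= lower_bound:
        return "interim controls"
    return "no action specified"

# Hypothetical readings at two kinds of locations.
play_area = recommended_action(450, child_contact_likely=True)
side_yard = recommended_action(450, child_contact_likely=False)
```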
7 Of the 24 soil lead measurements below the instrument detection limit, 4 were entryway soil samples, 8 were drip
line samples, and 16 were remote location samples.
-------
Table 3. Detailed distribution of private housing units missing soil lead concentration
measurements by age, region, and urbanization

                                            Missing one,   Missing all    Missing no     Total number
                                            two, or three  three soil     soil lead      of homes in
                                            soil           lead           measurements   National
                                            measurements   measurements                  Survey
Total number of homes                       44             18             240            284
Building age
  pre-1940                                  24             12             53             77
  1940 to 1949                              6              2              24             30
  1950 to 1959                              7              2              50             57
  1960 to 1979                              7              2              113            120
Census region
  Northeast                                 23             15             30             53
  Midwest                                   8              1              61             69
  South                                     10             2              106            116
  West                                      3              0              43             46
Degree of urbanization
  Urban area in a large metropolitan city   33             14             60             93
  Suburban area in a large metropolitan
  city                                      7              4              59             66
  Urban area in a small metropolitan city   3              0              41             44
  Suburban area in a small metropolitan
  area                                      1              0              23             24
  Nonmetropolitan                           0              0              57             57
-------
Table 4. Detailed distribution of public housing units missing soil lead concentration
measurements by age and region

                          Missing one,   Missing all    Missing no     Total number
                          two, or three  three soil     soil lead      of homes in
                          soil           lead           measurements   National
                          measurements   measurements                  Survey
Total number of homes     72             68             25             97
Building age
  pre-1950                24             22             6              30
  1950-1959               20             20             4              24
  1960-1979               28             26             15             43
Census region
  Northeast               42             42             1              43
  Midwest                 5              4              6              11
  South                   23             21             9              32
  West                    2              1              9              11
-------
Table 5. Correlations between log-transformed soil lead measurements in private housing from
different soil locations around the same dwelling unit

                        Soil lead measurements
                        Exterior entrance   Drip line   Remote location
Soil lead entrance                          0.7148      0.6090
                                            0.0001      0.0001
                        260                 246         247
Soil lead drip line     0.7148                          0.6780
                        0.0001                          0.0001
                        246                 249         243
Soil lead remote        0.6090              0.6780
                        0.0001              0.0001
                        247                 243         253
Note: For each cell in Table 5, the top number is the correlation coefficient, the middle is the probability
that a sample correlation this far from zero might occur by chance if there were actually no correlation in
the underlying population, and the bottom number is the number of paired measurements used to
calculate the correlation.
-------
Table 6. Correlations between log-transformed soil lead measurements in public housing from
different soil locations around the same dwelling unit

                        Soil lead measurements
                        Exterior entrance   Drip line   Remote location
Soil lead entrance                          0.7430      0.4313
                                            0.0001      0.0278
                        26                  25          26
Soil lead drip line     0.7430                          0.7150
                        0.0001                          0.0001
                        25                  28          28
Soil lead remote        0.4313              0.7150
                        0.0278              0.0001
                        26                  28          29
Note: For each cell in Table 6, the top number is the correlation coefficient, the middle is the probability
that a sample correlation this far from zero might occur by chance if there were actually no correlation in
the underlying population, and the bottom number is the number of paired measurements used to
calculate the correlation.
-------
Using the data from the National Survey, an estimated 23 percent, or 18 million, of the private
homes in the United States built before 1980 exceed the 400 ppm "further evaluation" guideline; an
estimated 6 percent, or almost 5 million, of the private homes in the United States built before 1980
exceed the 2,000 ppm "interim control" guideline; and an estimated 3 percent, or approximately 2.5
million, of the private homes in the United States built before 1980 exceed the 5,000 ppm abatement
guideline. Table 7 tabulates the weighted number and percentages of private homes with one or more
soil lead concentrations above various levels that might be used as guidelines by EPA. Due to the
considerable amount of missing soil samples at public housing units, no national distribution of soil lead
prevalence levels is presented for public housing units.
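The prevalence estimates can be illustrated with a short sketch of the weighted calculation. The function name, the three example homes, and their weights are hypothetical; the elements taken from the report are that each home is represented by its highest concentration among the locations with data and that sample weights are applied. Confidence intervals, which depend on the survey design, are not attempted here.

```python
def weighted_percent_exceeding(homes, threshold_ppm):
    """Weighted percent of homes whose highest soil lead reading exceeds a threshold.

    `homes` is a list of (sample_weight, [ppm readings at sampled locations]).
    Homes with no soil data are assumed to have been excluded beforehand.
    """
    total_weight = sum(weight for weight, _ in homes)
    exceeding = sum(weight for weight, samples in homes
                    if max(samples) > threshold_ppm)
    return 100.0 * exceeding / total_weight

# Hypothetical homes: (sample weight, readings in ppm at the available locations).
homes = [
    (300.0, [500.0, 120.0, 80.0]),
    (250.0, [90.0, 60.0]),
    (450.0, [2500.0, 300.0, 100.0]),
]
pct_over_400 = weighted_percent_exceeding(homes, 400)
pct_over_2000 = weighted_percent_exceeding(homes, 2000)
```

Multiplying such a percentage by the estimated total number of homes gives the corresponding count of homes.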
Tables 8 and 9 show estimates of the weighted geometric mean soil lead concentrations for the
entryway, drip line, and remote location soil samples by building age, region, and degree of urbanization,
for private homes and public housing units. The estimates of the geometric means for public housing
presented in Table 9 are not precise due to the small sample sizes (n<10) in most of the building age and
Census region categories. As a result, the apparent relationships displayed in Table 9 within building age
and Census region categories should be interpreted with caution.
3.3 Housing Unit Characteristics
The housing unit characteristics of interest in this study included the building age of the housing
unit, the Census region, and degree of urbanization. The construction date and state and county locations
of each housing unit were collected by the National Survey and used to classify housing units according
to these categories. Using the construction date from the National Survey, each housing unit was
classified as being built in one of four time periods for private housing units (between 1960 and 1979,
between 1950 and 1959, between 1940 and 1949, or before 1940) and one of three time periods for
public housing units (between 1960 and 1979, between 1950 and 1959, or before 1950). The state in
which the housing unit was located was used to classify the housing unit into one of four Census regions:
the Northeast, Midwest, South, and West. The regions and the states in each region are shown below:
Census Region    States
Northeast        Maine, New Hampshire, Vermont, Rhode Island, Connecticut, New York,
                 Pennsylvania, New Jersey
Midwest          Ohio, Indiana, Illinois, Michigan, Wisconsin, Minnesota, Iowa, Missouri,
                 Kansas, Nebraska, North Dakota, South Dakota
South            Delaware, Maryland, the District of Columbia, Virginia, West Virginia,
                 North Carolina, South Carolina, Georgia, Florida, Mississippi, Alabama,
                 Tennessee, Kentucky, Arkansas, Louisiana, Oklahoma, Texas
West             Montana, Wyoming, Colorado, New Mexico, Arizona, Utah, Idaho, Washington,
                 Oregon, Nevada, California, Hawaii, Alaska
-------
Table 7. Estimated percent and number of U.S. homes built before 1980 exceeding various soil
lead concentrations

Soil Lead        Estimated percent of U.S. homes    Estimated number (000) of U.S.
Concentration    built before 1980 exceeding the    homes built before 1980 exceeding
(ppm)*           concentration (and 95 percent      the concentration (and 95 percent
                 confidence interval**)             confidence interval**)
400              23.4% (14.7%, 34.4%)               18,090 (11,363, 26,582)
500              20.3% (12.6%, 30.3%)               15,695 (9,746, 23,399)
1,000            11.3% (6.9%, 17.4%)                8,724 (5,329, 13,435)
2,000            7.7% (4.7%, 11.9%)                 5,943 (3,661, 9,175)
2,500            6.2% (3.9%, 9.6%)                  4,802 (2,984, 7,387)
3,000            3.4% (2.2%, 5.2%)                  2,652 (1,706, 3,991)
4,000            3.4% (2.2%, 5.2%)                  2,652 (1,706, 3,991)
5,000            3.1% (2.0%, 4.7%)                  2,424 (1,569, 3,632)
Total Homes      100%                               77,179

Note: Sample Size = 266 homes with data
* The soil lead concentration is the maximum concentration among the drip line, entrance, and remote
location samples for each household with soil lead data.
** The methodology used to calculate the confidence intervals is presented in Section 4.3.
-------
Table 8. Weighted geometric means for soil lead concentrations (ppm) by soil location and
housing unit characteristic for private housing units

                                                Soil Location
                                                Drip Line   Entryway   Remote Location
Building age
  pre-1940                                      480         393        183
  1940 to 1949                                  151         135        67
  1950 to 1959                                  70          74         44
  1960 to 1979                                  27          38         23
Census region
  Northeast                                     198         161        102
  Midwest                                       109         110        48
  South                                         51          63         38
  West                                          35          58         33
Degree of urbanization
  Urban area in a large metropolitan city       69          88         58
  Suburban area in a large metropolitan city    71          78         44
  Urban area in a small metropolitan city       130         118        53
  Suburban area in a small metropolitan area    64          72         38
  Nonmetropolitan                               60          77         39
Number of measurements                          249         260        253
-------
Table 9. Weighted geometric means for soil lead concentrations (ppm) by soil location and
housing unit characteristic for public housing units

                            Drip line   Entryway   Remote     Number of
                                                   location   measurements*
Building age
  pre-1950                  115         171        131        8
  1950-1959                 183         184        44         4
  1960-1979                 31          30         32         6
Census region
  Northeast                 45          230        9          1
  Midwest                   41          34         49         7
  South                     41          52         30         10
  West                      97          75         80         10
Number of measurements      28          26         29

* The number of measurements represents the average number across all soil locations of soil lead-level
readings used to estimate the geometric mean.
-------
The housing unit's county and related county statistics were used to designate the unit as
belonging to one of five urbanization categories: urban area in a large metropolitan city, suburban area
in a large metropolitan city, urban area in a small metropolitan city, suburban area in a small
metropolitan city, or nonmetropolitan area. These categories were defined based on i) the size of the
Primary Metropolitan Statistical Area (PMSA) or Metropolitan Statistical Area (MSA) in which the
county was located and ii) whether or not the county is in the central city of the PMSA or MSA.8 No
such designations were made for public housing units.
Degree of Urbanization                        Definition
Urban area in a large metropolitan city       Area located in a central city of a PMSA/MSA with a
                                              population of over 1 million.
Suburban area in a large metropolitan city    Area located in a PMSA/MSA with a population of over 1
                                              million, but not located in a central city.
Urban area in a small metropolitan city       Area located in a central city of a PMSA/MSA with a
                                              population of less than 1 million.
Suburban area in a small metropolitan city    Area located in a PMSA/MSA with a population of less
                                              than 1 million.
Rural/nonmetropolitan area                    Area not located in a PMSA/MSA.
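The definitions above can be read as a two-question classification, sketched here for illustration. The function and argument names are hypothetical, and the handling of a population of exactly 1 million is an assumption because the definitions leave that case open.

```python
def urbanization_category(in_msa, msa_population, in_central_city):
    """Assign a degree-of-urbanization category from county-level facts.

    Assumption: a PMSA/MSA population of exactly 1 million is treated as
    "small", since the definitions cover only "over" and "less than".
    """
    if not in_msa:
        return "Nonmetropolitan"
    size = "large" if msa_population > 1_000_000 else "small"
    place = "Urban" if in_central_city else "Suburban"
    return f"{place} area in a {size} metropolitan city"

# Hypothetical counties.
category_a = urbanization_category(True, 2_500_000, True)
category_b = urbanization_category(True, 400_000, False)
category_c = urbanization_category(False, 0, False)
```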
Other data derived from the National Survey and included in the analyses were the XRF and lead
paint hazard variables. The rationale for including these variables in the model was as follows: 1) to
examine the relationship between soil lead and the presence (defined using the XRF variable) and
condition (defined using the lead paint hazard variables) of interior and exterior lead-based paint and 2)
to control for these factors when assessing the effects of the housing unit characteristics.
The wet and dry room (interior) and exterior XRF variables are the natural logarithms of the
average of the XRF readings on all components weighted by the painted surface area of the components
in the sampled room. A household average XRF variable was calculated as the arithmetic mean of the
wet room, dry room, and exterior XRF variables. The wet and dry room (interior) and exterior lead paint
hazard variables are the natural logarithms of the average of the XRF readings on all components
weighted by the damaged paint surface area of the components in the sampled room. A household
average lead paint hazard variable was calculated as the arithmetic mean of the wet room, dry room, and
exterior lead paint hazard variables.
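One reading of the variable definitions above is sketched below. The function names and the example readings are hypothetical, and the sketch assumes that the area-weighted average is strictly positive so that its logarithm exists; how zero readings were handled is not stated in this chapter.

```python
import math

def log_area_weighted_xrf(components):
    """Natural log of the area-weighted average XRF reading for one sampled area.

    `components` is a list of (xrf_reading, surface_area) pairs. Passing the
    painted surface area gives the XRF variable; passing the damaged paint
    surface area gives the lead paint hazard variable.
    """
    total_area = sum(area for _, area in components)
    average = sum(reading * area for reading, area in components) / total_area
    return math.log(average)  # assumes average > 0

def household_average(wet_room, dry_room, exterior):
    """Arithmetic mean of the wet room, dry room, and exterior log variables."""
    return (wet_room + dry_room + exterior) / 3.0

# Hypothetical components: (XRF reading, painted surface area).
wet = log_area_weighted_xrf([(1.0, 10.0), (3.0, 10.0)])
dry = log_area_weighted_xrf([(0.5, 20.0), (0.5, 5.0)])
ext = log_area_weighted_xrf([(4.0, 30.0)])
household_xrf = household_average(wet, dry, ext)
```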
In an attempt to capture the effects of local traffic volume, the National Survey was
supplemented with data on traffic in the neighborhoods of the privately owned housing units in the
sample. The traffic volume, in vehicle miles per day, was calculated for each housing unit in the
following manner: the length of each road within an eighth of a mile of the housing unit was multiplied
by the average number of motor vehicles that passed along that road in a 24-hour period, and these
products were summed across all roads in the eighth of a mile radius of the dwelling unit.
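The traffic volume calculation is a sum of products, as the following sketch shows. The function name and the three example road segments are hypothetical; only the rule (segment length multiplied by vehicles per 24-hour period, summed over roads within an eighth of a mile) comes from the text.

```python
def vehicle_miles_per_day(roads):
    """Sum of (road length in miles) x (vehicles per 24 hours) over nearby roads.

    `roads` lists only the road segments within an eighth of a mile of the
    housing unit; selecting those segments is outside this sketch.
    """
    return sum(length_miles * vehicles_per_day
               for length_miles, vehicles_per_day in roads)

# Hypothetical segments: (length in miles, average vehicles per day).
roads = [(0.25, 4000), (0.10, 12000), (0.20, 500)]
traffic_volume = vehicle_miles_per_day(roads)
```

For these made-up segments the total is about 2,300 vehicle miles per day, most of it contributed by the short but busy second road.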
8 The largest city in each PMSA or MSA is designated a "central city." There may be additional central cities if
specified requirements are met. A more complete definition of "central city" can be obtained from the U.S. Office
of Management and Budget.
-------
The relationship between a household's traffic volume and its soil lead levels is expected to be
nonlinear. Consequently, the traffic volume data were transformed by centering the natural logarithm of
the average daily traffic count at zero to reduce the correlation between the linear and quadratic traffic
terms in the soil lead models discussed in Chapter 4. A more complete description of the traffic volume
data can be found in Data Analysis of Lead in Soil and Dust.9 Again, no such data were collected for
public housing units.
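The centering step can be sketched as follows. The sketch assumes that "centering at zero" means subtracting the sample mean of the logged traffic volumes, which is a common choice but is not spelled out here; the function name and the example volumes are hypothetical.

```python
import math

def centered_log_traffic(volumes):
    """Return centered log traffic volumes and their squares.

    Assumption: centering subtracts the (unweighted) mean of the logged
    volumes, so the linear term averages zero across the sample.
    """
    logs = [math.log(v) for v in volumes]
    mean_log = sum(logs) / len(logs)
    linear = [lv - mean_log for lv in logs]
    quadratic = [c * c for c in linear]
    return linear, quadratic

# Hypothetical traffic volumes in vehicle miles per day.
volumes = [150.0, 800.0, 2300.0, 9000.0, 40000.0]
linear, quadratic = centered_log_traffic(volumes)
```

Without centering, the logged volumes are all positive, so the linear and squared terms rise together and are nearly collinear; centering makes the squared term roughly symmetric around the middle of the data, which is the reduction in correlation the text refers to.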
3.4 Preliminary Analyses of Soil Lead Data and Housing Unit Characteristics
Simple correlations (defined by the product moment correlation coefficient r), which can be used
to identify potential relationships between housing unit characteristics and soil lead concentrations and
are a useful tool in the modeling process, are presented in Tables 10 and 11 for private and public
housing, respectively. The results from the correlation tables are descriptive of relationships in the data,
but these relationships may not apply to private or public housing in general.10 The variables are
divided into three categories: soil lead concentrations, housing characteristics, and lead-based paint
hazards. The soil lead concentrations are the natural logarithms of the household soil lead levels
analyzed throughout the report. The housing characteristics include the number of family units in the
development (for public housing), the vehicle miles per day (for private housing), and the decade in
which the development was built (for both public and private housing).11 The lead-based paint hazards
include the average household lead hazard and average household XRF variables.12
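The product moment correlation between logged soil lead levels and a housing characteristic can be sketched as below. The function name and the six example homes are hypothetical, and the calculation is unweighted; the age codes follow the decade coding given in footnote 11.

```python
import math

def pearson_r(xs, ys):
    """Product moment (Pearson) correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical homes: decade code (2 = 1970s ... 7 = before 1920) and soil lead (ppm).
age_code = [2, 3, 4, 5, 6, 7]
soil_ppm = [25.0, 40.0, 70.0, 150.0, 300.0, 480.0]
log_soil = [math.log(v) for v in soil_ppm]
r = pearson_r(age_code, log_soil)
```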
Private Housing
The building characteristic having the strongest relationship with household soil levels is the age
of the building (r=0.60, 0.60, and 0.55 for drip line, entryway, and remote locations, respectively). The
average daily traffic flow, average household lead hazard, and average household XRF reading (which
approximate the amount of lead due to traffic, the condition of lead-based paint in the building, and the
presence of lead-based paint in the building, respectively) were significantly correlated with the
household soil lead levels, although with a smaller correlation than with building age. Additional
correlations of interest were the age of the building and the average household lead hazard (r=0.28), the
age of the building and the average household XRF reading (r=0.19), and the average household lead
hazard and average household XRF reading (r=0.37).
Public Housing
Correlations in the public housing data display results similar to those from the private housing
correlation analyses. The building characteristic having the overall strongest relationship with household
soil lead levels is the age of the building (r=0.62, 0.53, and 0.28 for drip line, entryway, and remote
locations, respectively). The number of family units was significantly correlated with entryway soil lead
levels (r=0.53) and slightly correlated with drip line and remote location lead levels (r=0.37 and 0.29 for
drip line and remote locations, respectively). The average household paint lead hazard was significantly
9 Data Analysis of Lead in Soil and Dust. September, 1993. EPA Report number 747-R-93-011.
10 A discussion of the suitability of both the private and public housing data is presented in section 3.5.
11 The data are coded as follows: 2 for homes built between 1970 and 1979, 3 for homes built between 1960 and
1969, 4 for homes built between 1950 and 1959, 5 for homes built between 1940 and 1949, 6 for homes built
between 1920 and 1939, and 7 for homes built before 1920.
12 A description of these two variables can be found in section 3.3.
-------
Table 10. Correlations between soil lead concentrations and housing unit characteristics for private
housing units

                      Soil Lead Concentrations      Building            Lead-based paint
                                                    Characteristics     hazards
                      Drip      Entryway  Remote    Average   Age of    Average   Average
                      line                location  daily     building  household household
                                                    traffic             lead      XRF
                                                    flow                hazard    reading
Drip line                                           0.23754   0.59942   0.30009   0.35073
                                                    0.0002    0.0001    0.0001    0.0001
                                                    249       249       245       249
Entryway                                            0.20262   0.59511   0.29937   0.32922
                                                    0.0010    0.0001    0.0001    0.0001
                                                    260       260       255       260
Remote location                                     0.28047   0.54941   0.29756   0.32499
                                                    0.0001    0.0001    0.0001    0.0001
                                                    253       253       249       253
Average daily traffic 0.23754   0.20262   0.28047             0.19335   ***       ***
flow                  0.0002    0.0010    0.0001              0.0011
                      249       260       253                 276       276       276
Age of building       0.59942   0.59511   0.54941   0.19335             0.27500   0.19335
                      0.0001    0.0001    0.0001    0.0011              0.0001    0.0001
                      249       260       253       276                 276       284
Average household     0.30009   0.29937   0.29756   ***       0.27500             0.37416
lead hazard           0.0001    0.0001    0.0001              0.0001              0.0001
                      245       255       249       276       276                 276
Average household     0.35073   0.32922   0.32499   ***       0.19335   0.37416
XRF reading           0.0001    0.0001    0.0001              0.0001    0.0001
                      249       260       253       276       284       276

Note: In each cell of Table 10, the top number is the correlation coefficient, the middle is the
probability that a sample correlation this far from zero might occur by chance if there were
actually no correlation in the underlying population, and the bottom number is the number of
paired measurements used to calculate the correlation.
Cells in boldface are significant at the 0.05 level.
***: the correlation is between -0.10 and 0.10 and the p-value is greater than 0.1.
-------
Table 11. Correlations between soil lead concentrations and housing unit characteristics for public
housing units

                      Soil Lead Concentrations      Building            Lead-based paint
                                                    Characteristics     hazards
                      Drip      Entryway  Remote    Family    Age of    Average   Average
                      line                location  units in  the       household household
                                                    the       building  lead      XRF
                                                    building            hazard    reading
Drip line                                           0.36990   0.62071   0.34533   ***
                                                    0.0527    0.0143    0.0719
                                                    28        28        28        28
Entryway                                            0.53099   0.52885   0.49764   ***
                                                    0.0053    0.0055    0.0097
                                                    26        26        26        26
Remote location                                     0.29463   0.27882   0.26167   ***
                                                    0.1208    0.1430    0.1703
                                                    29        29        29        29
Family units in the   0.36990   0.53099   0.29463             0.17447   ***       ***
building              0.0527    0.0053    0.1208              0.0874
                      28        26        29                  97        97        97
Age of the building   0.62071   0.52885   0.27882   0.17447             ***       0.15895
                      0.0143    0.0055    0.1430    0.0874                        0.1199
                      28        26        29        97                  97        97
Average household     0.34533   0.49764   0.26167   ***       ***                 0.18390
lead hazard           0.0719    0.0097    0.1703                                  0.0714
                      28        26        29        97        97                  97
Average household     ***       ***       ***       ***       0.15895   0.18390
XRF reading                                                   0.1199    0.0714
                      28        26        29        97        97        97

Note: In each cell of Table 11, the top number is the correlation coefficient, the middle is the
probability that a sample correlation this far from zero might occur by chance if there were
actually no correlation in the underlying population, and the bottom number is the number of
paired measurements used to calculate the correlation.
Cells in boldface are significant at the 0.05 level.
***: the correlation is between -0.10 and 0.10 and the p-value is greater than 0.2.
-------
correlated with entryway soil lead levels (r=0.50) and slightly correlated with drip line and remote
location lead levels (r=0.35 and 0.25 for drip line and remote locations, respectively). The estimated
correlations between average household XRF and soil lead readings, however, were not significantly
different from zero.
3.5 Suitability of Soil Lead Data
One important measure of the usefulness of the data is how the distributions of the housing
characteristics in the National Survey compare to national distributions. National distributions were
obtained from the American Housing Survey for 1987, performed by the Bureau of the Census and HUD
for private housing units, and from HUD for public housing units.13 The distributions of building age
and Census region from the National Survey were compared to their respective national distributions.
Chi-square tests were used to determine how the distributions in the National Survey compared to those
from the American Housing Survey for private homes and the data provided by HUD for public housing
units. Variance inflation factors of 1.45 for private housing and 1.13 for public housing units were used
to deflate the observed chi-square values to adjust for the survey design effect.14 Results from the
chi-square tests are presented in Table 12 for private homes and Table 13 for public housing units.
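The design-adjusted chi-square calculation can be sketched with the building age figures reported in Table 12 for private housing. The function names are hypothetical, and the reading that the total statistic is the sum of the individual chi-square values divided by the variance inflation factor is an interpretation of the table note; under that reading the sketch gives values close to the reported statistic of 2.41 and p-value of 0.49. The p-value uses the closed-form upper tail of the chi-square distribution with 3 degrees of freedom, so it does not apply to other degrees of freedom.

```python
import math

def design_adjusted_chi_square(observed, population_counts, design_effect):
    """Pearson chi-square against national shares, deflated by the design effect."""
    n = sum(observed)
    total = sum(population_counts)
    expected = [n * count / total for count in population_counts]
    raw = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return raw / design_effect

def chi_square_upper_tail_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Building age categories: pre-1940, 1940 to 1949, 1950 to 1959, 1960 to 1979.
observed = [77, 30, 57, 120]                   # National Survey homes
ahs_thousands = [21215, 7945, 13056, 36965]    # American Housing Survey (1987)
statistic = design_adjusted_chi_square(observed, ahs_thousands, 1.45)
p_value = chi_square_upper_tail_3df(statistic)
```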
Private Housing Data
For private housing units, the distribution of households in the National Survey was not
significantly different from the distribution of households in the American Housing Survey with respect
to building age. However, differences with respect to the Census region were marginally significant
(p=0.07) in that more dwelling units located in the South were sampled in the National Survey than
expected based on the American Housing Survey. Because the distributions of households in the
National Survey were not significantly different from those found in the American Housing Survey and a
large amount of data was available (over 250 observations for each model), there are no apparent reasons
why inferences cannot be drawn from analyses for private homes.
Public Housing Data
For public housing units, differences between the distribution of sampled public housing units
and the distribution of all public housing units, provided by HUD, are significant (p=0.04) based on both
Census region and building age. Moreover, problems with the lack of soil lead measurements make
analyses of the data difficult to interpret. As noted earlier, soil lead samples were taken at only 30
percent (29 of 97) of the sampled public housing units. Given both the distributional inequality and the
relatively small number of public housing units where soil samples were taken (n=29), all conclusions
about public housing units and results from analyses of the public housing data should be viewed with
caution.
13 The data used to represent the national distributions of building age and region can be found in the reports of the
National Survey, primarily Tables 3-6 and 3-7 of the EPA Report on the National Survey of Lead-Based Paint in
Housing—Appendix II: Analysis.
14 The variance inflation factors (VIFs) were estimated in the original analysis of the National Survey data.
-------
Table 12. Chi-square results for building age and Census region variables for private housing units

Building Age     Housing Units     Estimated from American    Expected        Individual
                 Observed from     Housing Survey (1987)      frequencies*    chi-square
                 National Survey   (thousands)                                values*
pre-1940               77               21,215                    76.1           0.010
1940 to 1949           30                7,945                    28.5           0.079
1950 to 1959           57               13,056                    46.8           2.209
1960 to 1979          120               36,965                   132.6           1.194

*The chi-square statistic was calculated assuming a fixed total of 284 homes with data on building age (4
cells and 3 degrees of freedom).
Total chi-square statistic 2.41
P-value with 3 degrees of freedom 0.49

Census Region    Housing Units     Observed from American     Expected        Individual
                 Observed from     Housing Survey (1987)      frequencies**   chi-square
                 National Survey   (thousands)                                values**
Northeast              53               17,618                    63.2           1.644
Midwest                69               20,344                    73.0           0.216
South                 116               25,589                    91.8           6.390
West                   46               15,628                    56.1           1.804

**The chi-square statistic was calculated assuming a fixed total of 283 homes with data on region (4 cells
and 3 degrees of freedom).
Total chi-square statistic 6.93
P-value with 3 degrees of freedom 0.07
Note: The chi-square statistics represent the sum of the individual chi-square statistics weighted by the
design effect.
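The building age totals in Table 12 can be reproduced with a few lines of arithmetic. The Python sketch below recomputes the test from the observed counts and the American Housing Survey totals. It assumes that "weighted by the design effect" means dividing the summed chi-square by the 1.45 variance inflation factor quoted for private housing elsewhere in this report; the function names are illustrative and are not part of the original analysis.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        k, q = 2, math.exp(-half)                     # closed form for df = 2
    else:
        k, q = 1, math.erfc(math.sqrt(half))          # closed form for df = 1
    while k < df:                                     # step df up by 2 each pass
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def design_adjusted_chi2(observed, population, design_effect):
    """Goodness-of-fit test with the statistic divided by a design effect."""
    n, total = sum(observed), sum(population)
    expected = [n * p / total for p in population]
    cells = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    stat = sum(cells) / design_effect                 # assumed meaning of "weighted"
    return expected, cells, stat, chi2_sf(stat, len(observed) - 1)

observed = [77, 30, 57, 120]                          # National Survey counts
ahs_thousands = [21215, 7945, 13056, 36965]           # American Housing Survey (1987)
expected, cells, stat, p_value = design_adjusted_chi2(observed, ahs_thousands, 1.45)
print([round(e, 1) for e in expected], round(stat, 2), round(p_value, 2))
```

Under that assumption the expected frequencies, the total statistic, and the p-value agree with the building age panel of the table to the precision reported.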
-------
Table 13. Chi-square results for building age and Census region variables for public housing units

Building Age     Housing Units     From HUD's national        Expected        Individual
                 Observed from     database (thousands)       frequencies*    chi-square
                 National Survey                                              values*
pre-1950               30                 162                     19.7           5.433
1950-1959              24                 247                     30.0           1.213
1960-1979              43                 388                     47.3           0.391

*The chi-square statistic was calculated assuming fixed total of homes with data on building age (3 cells
and 2 degrees of freedom).
Total chi-square statistic 6.23
P-value with 2 degrees of freedom 0.04

Census Region    Housing Units     From HUD's national        Expected        Individual
                 Observed from     database (thousands)       frequencies**   chi-square
                 National Survey                                              values**
Northeast              43                 272                     30.1           5.483
Midwest                11                 152                     16.8           2.009
South                  32                 361                     40.0           1.618
West                   11                  90                     10.0           0.101

**The chi-square statistic was calculated assuming a fixed total of homes with data on region (4 cells
and 3 degrees of freedom).
Total chi-square statistic 8.15
P-value with 3 degrees of freedom 0.04
Note: The chi-square statistics represent the sum of the individual chi-square statistics weighted by the
design effect.
-------
3.6 Implications of Missing Soil Lead Data
The National Survey protocols specified sampling of soil on the selected property with a soil
coring device.15 Soil samples were not to be collected on neighboring properties if samples could not be
collected on the property selected. A percentage of both private housing and public housing buildings (6
and 70 percent respectively) were surrounded by pavement preventing any soil core samples. Two
questions arise as a result of the missing soil samples: i) are the soil samples taken representative of the
soil samples of interest, and ii) how do the missing soil samples affect the results?
Different uses of the data may have required alternative sampling protocols. Some alternative
sampling protocols include:
1) Sampling soil in the neighborhood of the housing development, even if only on neighboring
properties,
2) Sampling soil as a form of exterior dust in which the dust might be collected using a
vacuum or scrape sample from dwelling units with no soil areas, and
3) Sampling the vegetation and/or other soil coverings, as well as the soil to examine the entire
lead hazard.16
To the extent that the soil samples collected in the National Survey are similar to or
representative of the soil samples of interest, the results presented in later sections might be viewed as
applicable to public and private housing nationally. According to the 403 Interim Final Rule, soil
samples should be taken on bare soil in the area of concern. As a result, the soil samples collected in the
National Survey, core soil samples taken on the property, can be viewed as representative of samples
called for in the 403 Interim Final Rule.
If soil lead concentrations are higher near older homes, homes in the Northeast region, and
homes in large metropolitan urban areas — the housing unit characteristics associated with the bulk of the
missing data — the estimated impacts of building age, Census region, and degree of urbanization on soil
lead concentrations and the estimated number of homes exceeding the various soil lead concentrations
would be lower than the true impacts and the true number of homes, respectively. Since only six
percent of the privately-owned homes had no soil areas for soil core sampling, the impact of the missing
soil lead data is not expected to be significant and the descriptive statistics in the tables and the results
from the analyses can be viewed as applicable to private housing nationally. The results from public
housing data, however, should only be viewed as descriptive of those samples collected because i) the
sample of public housing is not representative of public housing developments nationally and ii) the
impact on the prevalence and distributions of soil lead levels as a result of missing almost 70 percent of
the soil lead data is expected to be significant.
15 It should be noted that at housing units where no soil samples were taken scrape sampling might have been
possible. Such sampling methods, however, would produce questions concerning the measurement comparability
between core and scrape samples. These questions, in turn, would make it difficult to compare the core and scrape
sample lead concentration measurements.
16 If soil has high concentrations of lead from external sources, such as lead in gasoline and lead in exterior or
interior paint, it is likely that the vegetation and/or other soil coverings would have high concentrations of lead as
well.
-------
4. Statistical Approach
This chapter discusses the modeling and testing procedures used to show the relationship
between housing unit characteristics and soil lead concentrations and explains how the confidence
intervals for classification percentages were estimated. Many researchers believe that soil lead comes
mainly from paint lead and automobile emissions. A review of the evidence in support of this hypothesis
can be found in the Comprehensive and Workable Plan for the Abatement of Lead-Based Paint in
Privately-Owned Housing.17 Similarly, interior dust levels are believed to be related to soil lead levels.
4.1 Private and Public Housing Model
The purpose of the private and public housing model is to produce estimates of the relative
strengths of the associations between the natural logarithm of the soil lead concentrations (response
variables) and the housing unit characteristics, XRF measurements, and paint lead hazards (explanatory
variables) to determine which of the explanatory variables are good predictors of soil lead. It is to be
noted, though, that a strong statistical association between the explanatory (housing unit characteristics
and paint lead variables) and response (natural logarithm of the soil lead concentrations) variables does
not by itself establish a causal relationship among them. The two variables may have a strong statistical
relationship but not a causal relationship. These variables may be caused by a third, unidentified
variable, or the relationship may be a statistical artifact.
Assume the following relationship between soil lead levels and housing unit characteristics and
other factors affecting soil lead:
(1) $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$
In this model, $Y$ represents the response variable, $X_1, X_2, \ldots, X_k$ represent the housing unit
characteristics and other factors affecting soil lead, $\alpha$ is the intercept, the parameters $\beta_1, \beta_2, \ldots, \beta_k$ are the
coefficients of $X_1, X_2, \ldots, X_k$ respectively, and $\varepsilon$ is the measurement error. Having knowledge of the
parameters allows the determination of which characteristics or factors play an important role in
determining or predicting soil lead concentrations. By combining the categorical characteristics and
factors into T and then assigning the leftover n (n
-------
The coefficients $a$, $c$, and $d_1, d_2, \ldots, d_n$ are the estimates of the model parameters $\alpha$, $\gamma$, and $\delta_1, \delta_2,
\ldots, \delta_n$ and are calculated so that the weighted variance of the prediction errors, or residuals, $e$, is
minimized. The weights in the model were the sampling weights. As a result of the sample design, a
variance inflation factor is applied to the variance estimates to generate unbiased estimates.
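As a rough illustration of the weighted fit described above, the Python sketch below minimizes the weighted sum of squared residuals for a single continuous predictor. The data, the one-predictor form, and the name `weighted_fit` are hypothetical; the report's models include several categorical and continuous terms, and the variance inflation step is not shown here.

```python
def weighted_fit(x, y, w):
    """Return (intercept, slope) minimizing sum(w * (y - a - b * x) ** 2)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: x = average XRF reading, y = ln(soil lead), w = sampling weight
xrf = [0.5, 1.0, 2.0, 4.0, 8.0]
log_lead = [3.6, 3.9, 4.1, 4.6, 5.2]
weights = [1200.0, 800.0, 1500.0, 600.0, 900.0]

intercept, slope = weighted_fit(xrf, log_lead, weights)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(xrf, log_lead)]
print(round(intercept, 3), round(slope, 3))
```

At the minimum, the weighted residuals sum to zero and are uncorrelated with the predictor, which is a quick check that the fit did what was intended.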
The parameter estimates will be unbiased estimates of the true parameters if all three of the
following conditions hold: 1) the natural logarithm of the soil lead, Y, is the only variable that has
measurement error, 2) the measurement errors, $\varepsilon$, are independent and the expected magnitude of the
measurement error is constant, and 3) the equation used in the model has the same independent variables
and mathematical form as the true relationship. Biased parameter estimates could lead to incorrect
conclusions about the relationships between soil lead concentrations and housing unit characteristics.
Although it is likely that some, if not all, of the continuous explanatory variables are measured
with error, the lack of knowledge about the true relationship between the explanatory and response
variables is the most important concern with respect to these models. Because of this lack of knowledge,
it is important to keep all variables in the analysis that might affect the response variable. If key
explanatory variables are left out, the estimates of the response variable based on the remaining
explanatory variables may be biased. If extra explanatory variables are included in the model, the model
estimates for the true explanatory variables will be unbiased, but only in the absence of measurement
error in the independent variables. The parameter estimates, though, will not be as precise as if the
extraneous variables were not in the model.
In the analysis of covariance model, parameter estimates are generated for all variables. These
estimates for continuous variables are unbiased (if all three of the above criteria are met) and have simple
interpretations. For these variables, the parameter estimates and 95 percent confidence intervals are
reported. The statistical significance of the categorical variables and the least squares means for each
level within a categorical variable are reported. The least squares means are estimates of the average
response (soil lead concentration) given the particular classification of the categorical variable of
interest, while holding all other variables at their averages.
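Because the response is the natural logarithm of soil lead, a least squares mean and its interval are formed on the log scale and then expressed in ppm. The Python sketch below assumes a simple exponential back-transformation and a normal 1.96 multiplier. The log-scale mean and standard error are hypothetical values, and `to_ppm` is an illustrative name rather than a routine from the original analysis.

```python
import math

def to_ppm(log_mean, log_se, z=1.96):
    """Back-transform a log-scale estimate and its interval to ppm."""
    return (math.exp(log_mean),
            math.exp(log_mean - z * log_se),
            math.exp(log_mean + z * log_se))

# Hypothetical log-scale least squares mean and standard error
mean_ppm, lower_ppm, upper_ppm = to_ppm(math.log(330.0), 0.175)
print(round(mean_ppm, 1), round(lower_ppm, 1), round(upper_ppm, 1))
```

An interval built this way is symmetric on the log scale, so in ppm the point estimate is the geometric mean of the two limits and the upper arm is longer than the lower arm.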
4.2 Modeling and Testing Procedures
All variables that conceptually have a significant impact on household soil lead levels and were
available in the National Survey database were used in the initial analyses. These variables included the
building or development's age (measured as the date of construction), Census region, and degree of
urbanization (for private housing), two-way interactions between the age and Census region, the Census
region and degree of urbanization (for private housing), and the age and degree of urbanization (for
private housing), a three-way interaction between the age, Census region, and degree of urbanization, the
building's average daily traffic flow (for private housing), the number of units in the development (for
public housing), and interior and exterior XRF and lead hazard variables,18 which approximate the
presence and condition of lead-based paint, respectively.
Because parameter estimates in models with extraneous variables are imprecise due to inflated
variance estimates, the extraneous variables in the soil location models were removed. Methods for
removing extraneous variables range from keeping all possibly relevant terms, regardless of their
18 An aggregated household average XRF and lead hazard variable replaced the interior and exterior XRF and lead
hazard variables.
-------
statistical significance, to keeping all significant terms regardless of their relevance. A method which
strikes a balance between these two bounds was used. The key variables of interest to the study (building
age, Census region, and, for private housing only, degree of urbanization) were always kept in
the model regardless of their statistical significance. Then, the most statistically insignificant variables
were removed one at a time, unless they were one of the key variables of interest in the study. Variables
that were significant in other soil location models were also kept to create more comparable models. As
a result, reasonably relevant terms with some degree of statistical significance, and terms significant in
any of the other soil location models, were kept in the final soil lead models.
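The selection procedure described above can be sketched as a backward elimination loop with a protected set of terms. In the Python sketch below, the removal threshold `alpha`, the term names, and the stand-in p-value function are all assumptions made for illustration; in practice each pass would refit the model and read the p-values from that fit.

```python
def backward_eliminate(terms, p_values_for, always_keep, alpha=0.05):
    """Drop the least significant removable term until none exceeds alpha."""
    current = list(terms)
    while True:
        p = p_values_for(current)                     # refit and collect p-values
        removable = [t for t in current
                     if t not in always_keep and p[t] > alpha]
        if not removable:
            return current
        current.remove(max(removable, key=lambda t: p[t]))  # one term at a time

# Hypothetical fixed p-values standing in for a model refit at each step
def fake_p_values(current):
    table = {"building_age": 0.0001, "census_region": 0.01, "urbanization": 0.40,
             "age_x_urbanization": 0.30, "xrf": 0.002, "traffic": 0.001,
             "lead_hazard": 0.60}
    return {t: table[t] for t in current}

key_terms = {"building_age", "census_region", "urbanization"}
# Terms significant in another soil location model would be added to this set.
protected = key_terms | set()
candidates = ["building_age", "census_region", "urbanization",
              "age_x_urbanization", "xrf", "traffic", "lead_hazard"]
final_terms = backward_eliminate(candidates, fake_p_values, protected)
print(final_terms)
```

With these made-up p-values the loop drops `lead_hazard` first and `age_x_urbanization` second, and it retains `urbanization` even though it is not significant, mirroring the treatment of the key study variables.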
A factor was considered a significant predictor of household soil lead if it was significant at the 5
percent level in the model fit and the overall regression F statistic was significant at the 5 percent level.
In all cases, the overall regression F statistic was significant at the 5 percent level. Levels within factors
were considered significantly different if the factor was significant at the 5 percent level in the model fit
and the difference between levels was significant at the 5 percent level. No other multiple comparison
procedure was used to evaluate differences in the factor levels.
For significant factors, differences among levels were discussed without stating statistical
significance. Differences that were not significant were occasionally discussed, but only within the
context of understanding the results of the model fit.
4.3 Confidence Intervals for Classification Percentages
The confidence intervals for the percentages reported in Table 7 were estimated using a series of
equations that accounted for measurement error, misclassification error due to measurement error (the
error associated with improper classifications of soil lead), and the expected asymmetry of the
confidence intervals. These calculations were performed for each of the concentration bounds presented
in Table 7 as described below.
The first step was to compute the misclassification error, $\sigma_e^2$, which was obtained in the
following manner:
(4) $\sigma_e^2 = \left( \sum_{i=1}^{n} p_i (1 - p_i) \right) / n^2$
The value $p_i$ is the probability that the observed maximum soil lead level is greater than the
specified concentration limit assuming a normal distribution with the mean equal to the observed
maximum value and the variance equal to 0.84.19 Further, $n$ is the number of homes with at least one
soil lead level observation. The variance of the proportion, $\sigma_p^2$, can be estimated using the
misclassification error as:
(5) $\sigma_p^2 = (1.45\, p (1 - p)) / n + \sigma_e^2$,
where $p$ is the observed proportion of homes in the survey with soil lead levels greater than the
concentration limit and 1.45 is a variance inflation factor used to adjust variance of the proportion.
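Equations (4) and (5) can be sketched in Python as follows. The sketch assumes that the 0.84 variance applies to the natural logarithm of the soil lead level, so each exceedance probability is computed on the log scale; that choice, the function names, and the example maxima are assumptions for illustration only.

```python
import math

MEASUREMENT_VARIANCE = 0.84   # variance of one soil lead measurement (from the report)
VIF = 1.45                    # variance inflation factor (from the report)

def exceed_prob(observed_max, limit):
    """P(level > limit) for a normal centred on the observed maximum.

    Assumption: the comparison is made on the natural log scale.
    """
    z = (math.log(limit) - math.log(observed_max)) / math.sqrt(MEASUREMENT_VARIANCE)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def proportion_variance(maxima, limit):
    """Return (p, misclassification error, variance of the proportion)."""
    n = len(maxima)
    probs = [exceed_prob(m, limit) for m in maxima]
    var_e = sum(pi * (1.0 - pi) for pi in probs) / n ** 2   # equation (4)
    p = sum(1 for m in maxima if m > limit) / n             # observed proportion
    var_p = VIF * p * (1.0 - p) / n + var_e                 # equation (5)
    return p, var_e, var_p

# Hypothetical maximum soil lead levels (ppm) for five homes
p, var_e, var_p = proportion_variance([50.0, 120.0, 400.0, 900.0, 2500.0], 400.0)
print(p, round(var_e, 4), round(var_p, 4))
```

Homes whose observed maximum sits close to the limit contribute most to the misclassification term, because their exceedance probability is near one half.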
19 This value is the square of the estimated standard deviation of one soil lead measurement. The calculations for
this estimate are found in Data Analysis of Lead in Soil and Dust.
-------
To generate an asymmetric confidence interval, the proportions, $p$, are transformed into
variables, $y$, which are approximately normally distributed. The transformation is
(6) $y(p) = \arcsin(\sqrt{p})$.
A 95 percent confidence interval for the transformed variables is calculated as
(7) $y(p) \pm 1.96\, \sigma_y$,
where $\sigma_y^2$ is the variance of the transformed variables and is calculated as
(8) $\sigma_y = (\partial y / \partial p)\, \sigma_p = (\partial \arcsin(\sqrt{p}) / \partial p)\, \sigma_p$.
Asymmetric lower and upper confidence limits for the proportion $p$ are calculated from the lower
and upper confidence limits of $y$ by inverting equation (6), that is, $p = \sin^2(y)$.
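A minimal Python sketch of equations (6) through (8) follows. It uses the derivative of the arcsine square-root transformation, 1 / (2 sqrt(p (1 - p))), and clamps the transformed limits to the valid range before converting back. The clamping, the function name, and the example inputs are assumptions, and the sketch requires 0 < p < 1.

```python
import math

def arcsine_interval(p, sigma_p, z=1.96):
    """Asymmetric interval for a proportion via the arcsine square-root transform."""
    y = math.asin(math.sqrt(p))                             # equation (6)
    sigma_y = sigma_p / (2.0 * math.sqrt(p * (1.0 - p)))    # equation (8)
    lo_y = max(0.0, y - z * sigma_y)                        # equation (7), clamped
    hi_y = min(math.pi / 2.0, y + z * sigma_y)
    return math.sin(lo_y) ** 2, math.sin(hi_y) ** 2         # back to proportions

# Hypothetical proportion and standard deviation
lower, upper = arcsine_interval(0.10, 0.03)
print(round(lower, 3), round(upper, 3))
```

For a proportion below one half the upper arm of the interval is longer than the lower arm, and at exactly one half the interval is symmetric.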
-------
5. Modelling Results
Soil lead concentrations were regressed on housing unit characteristics, including the building's
age, Census region, and degree of urbanization, and the presence and condition of lead-based paint.
Additional variables also considered to be related to soil lead and used in the analyses included the
average daily traffic flow in the neighborhood of the housing unit (for private housing only) and the
number of family units in the development (for public housing only). Soil lead concentrations from each
location, the drip line, entryway, and remote location, were analyzed separately. The natural logarithms
of the soil lead concentrations were used in all analyses as the response variables. In addition to
examining the relationship between soil lead levels and housing unit characteristics, this report also
examines the relationships between soil lead levels and interior and exterior paint lead levels.20
In each of the soil lead models, soil lead levels were regressed on the housing unit's region,
building age, degree of urbanization, building age by region interaction, building age by degree of
urbanization interaction (for private housing), average paint lead hazard, average XRF, average daily
traffic counts (for private housing), and the number of family units in the building (for public housing).
In this section, the results from the analysis of covariance models for the drip line, entryway, and remote
locations are presented in Tables 14 through 16 for private housing and Tables 17 and 18 for public
housing. These results include the significance and least-squares means of the categorical variables, the
parameter estimates of the continuous variables, and the model statistics.
There are two important concepts to remember in the discussions of the results. First, the
significance levels of the categorical variables show whether or not the levels of a categorical variable
have significantly different effects on the soil lead concentration. Second, the least-squares means show
how the levels of a categorical variable differ with respect to their effects on the soil lead concentration.
The categorical and continuous variables that are statistically significant at the 5 percent level are
shown in boldface in Table 14 for the private housing results and Table 17 for the public housing results.
At the bottom of these tables, the model statistics, the number of observations used in the analysis and
the R-square, are presented for each soil lead location model. In these analyses, the R-square is viewed
as the percent of variation explained by the model, not as a measure of comparison between models. The
least-squares means and 95 percent confidence intervals for the categorical variables in each private
housing soil lead model are presented in Figures 1 through 4 and Tables 15 and 16. The least-squares
means and 95 percent confidence intervals for the categorical variables in each public housing soil lead
model are presented in Figures 5 and 6 and Table 18. Simple correlations between housing unit
characteristics, paint lead hazards, and soil lead concentrations in public housing units are presented in
Table 15.
The variance estimates from the analysis of covariance models, the mean-square error, and
variances of the parameter estimates were inflated as a result of using the sampling weights in the
analysis. The private housing variance estimates were inflated by a factor of 1.45 and the public housing
variance estimates were inflated by a factor of 1.13.
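The effect of these factors on reported precision can be sketched in Python. The two inflation factors come from the text above, while the coefficient, its unadjusted standard error, the 1.96 multiplier, and the function names are hypothetical.

```python
import math

VARIANCE_INFLATION = {"private": 1.45, "public": 1.13}   # factors stated in the report

def inflate(variance, housing):
    """Scale a variance estimate by the factor for the housing type."""
    return variance * VARIANCE_INFLATION[housing]

def ci_95(estimate, unadjusted_se, housing):
    """95 percent interval using the inflated variance (normal multiplier assumed)."""
    se = math.sqrt(inflate(unadjusted_se ** 2, housing))
    return estimate - 1.96 * se, estimate + 1.96 * se

# Hypothetical coefficient and unadjusted standard error
low, high = ci_95(0.040, 0.009, "private")
print(round(low, 4), round(high, 4))
```

Because the factor multiplies the variance, standard errors and interval widths grow by its square root, about 20 percent for private housing and about 6 percent for public housing.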
20 Additional discussions and conclusions on the relationship between soil lead levels and paint lead levels can be
found in Data Analysis of Lead in Soil and Dust.
-------
5.1 Private Housing Results
The strongest predictor of soil lead for all soil sample locations was the age of the dwelling unit.
Dwelling unit age measures the length of time since the construction of the building and, in most cases,
the last major disturbance of soil. Thus, the dwelling unit age measures the length of time that lead
deposits — from dwelling unit and neighboring activity sources — have accumulated in the soil. In
addition, a two-way interaction involving the building age and Census region was significant in one of
the soil lead models. This two-way interaction, building age by region, provides a useful tool to quantify
the extent to which the factors of interest are not additive. The least squares means, the estimated
average of the soil lead measurements from a soil lead model, and 95 percent confidence intervals for the
building age, Census region, and degree of urbanization variables are presented in Figures 1, 2, and 3
respectively and in tabular form in Table 15. The least squares means for the interaction of building age
by region are presented graphically in Figure 4 and in tabular form in Table 16.
There were other significant predictors of soil lead in each of the soil location models. These
included the Census region, the presence of lead-based paint (as measured by the average XRF reading),
and the average daily traffic count. Which predictors were significant depended on the location from
which the soil samples were obtained. Although the degree of urbanization was not a significant
predictor, it was left in the three soil lead location models because it was one of the key variables of the
study. The significant predictors in the drip line and entryway soil lead models were nearly identical, but
were different from the remote location soil lead model.
Drip Line and Entryway Models
For both the drip line and entryway soil lead models, the Census region factor was statistically
significant, although more significant in the drip line soil lead model than the entryway soil lead model.
The building age by Census region interaction was not significant in either the drip line or entryway soil
lead models. In both models, the housing units in the Northeast region were shown to have significantly
higher soil lead concentrations than soil lead concentrations in the South and West regions and have
higher soil lead concentrations than the Midwest region after adjusting for the housing unit's age.
Many studies have shown that urban areas have higher soil lead concentrations than suburban
and rural areas.21 In this analysis, it was expected that homes in urban areas would have higher soil lead
concentrations than homes in suburban and rural areas. Similarly, homes in large metropolitan areas
would have higher soil lead concentrations than homes in small metropolitan areas. In the drip line and
entryway soil lead models, the degree of urbanization factor was not significant. As a result, soil lead
levels around homes in urban, suburban, and rural areas are not significantly different.
There are a number of possible explanations for this unanticipated result. One explanation might
be found in reviewing the distribution of the missing soil lead measurements. Generally, soil lead
concentrations are expected to be higher in large, highly urbanized areas. However, many such sites have
very little, if any, soil. The larger and more urbanized a site, and the more likely the soil is to have high
lead concentrations, the more likely it is that the soil has been paved over. As a result, average soil lead
21 Examples of such studies include HW Mielke, et al, "Lead concentrations in the inner-city soils as a factor in the
child lead problem," American Journal of Public Health, 1983, and ID Shellshear, et al, "Environmental Lead
Exposure in Christchurch children: soil lead a potential hazard," New Zealand Medical Journal, 1975.
-------
Table 14. Soil lead model statistics for private housing models

                                                     Soil Location
                                    Drip Line           Entryway            Remote Location
Significance of the Categorical Variables
  Building age                      .0001               .0001               .0001
  Census region                     .01                 .08                 .06
  Degree of urbanization            **                  **                  **
  Building age by Census region     **                  **                  .006
Parameter Estimates and 95 Percent Confidence Intervals for the Continuous Variables
  Average household XRF reading     0.036               0.042               0.030
                                    (0.014, 0.057)      (0.023, 0.062)      (0.009, 0.051)
  Traffic                           -0.861              -1.242              1.140
                                    (-1.371, -0.351)    (-1.703, -0.780)    (0.660, 1.619)
  Traffic squared                   0.091               0.112               -0.081
                                    (0.045, 0.137)      (0.071, 0.154)      (-0.506, 0.344)
Model Statistics
  R-Square                          .611                .542                .513
  Number of observations            249                 260                 253

** - not significant at the 0.10 level
-------
[Plot: soil lead concentration (log scale, 10 to 1,000) by building age (pre-1940, 1940 to 1949, 1950 to 1959, 1960 to 1979) for the drip line, entryway, and remote location, with 95% confidence intervals]
Figure 1. Least squares means and 95 percent confidence intervals for soil lead concentrations in
private housing for building age by soil location
-------
[Plot: soil lead concentration (log scale, 10 to 1,000) by Census region (Northeast, Midwest, South, West) for the drip line, entryway, and remote location, with 95% confidence intervals]
Figure 2. Least squares means and 95 percent confidence intervals for soil lead concentrations in
private housing for Census region by soil location
-------
[Plot: soil lead concentration (log scale, 10 to 1,000) by degree of urbanization (urban large metro, suburban large metro, urban small metro, suburban small metro, rural) for the drip line, entryway, and remote location, with 95% confidence intervals]
Figure 3. Least squares means and 95 percent confidence intervals for soil lead concentrations in
private housing for degree of urbanization by soil location
-------
Table 15. Least-squares means and 95 percent confidence intervals (ppm) for categorical variables
in the private housing unit models

                                                        Soil Lead Model
                                     Drip Line               Entryway                Remote Location
Building age
  1960-1979                          36.7 (28.9, 46.6)       47.0 (37.7, 58.7)       25.6 (20.3, 32.2)
  1950-1959                          75.1 (52.5, 107.6)      85.2 (61.5, 118.2)      39.4 (27.9, 55.5)
  1940-1949                          157.1 (95.5, 258.5)     152.7 (97.6, 239.1)     90.1 (55.4, 146.7)
  pre-1940                           329.5 (234.0, 463.8)    256.5 (189.1, 348.0)    210.1 (151.9, 290.6)
Census region
  Northeast                          177.1 (116.0, 270.3)    157.2 (107.4, 230.1)    117.8 (76.5, 181.4)
  Midwest                            147.5 (102.3, 212.8)    125.1 (90.1, 173.6)     50.6 (36.0, 71.1)
  South                              84.0 (60.9, 115.9)      77.2 (58.4, 102.0)      50.9 (38.0, 68.3)
  West                               65.0 (42.5, 99.2)       103.5 (70.1, 153.0)     62.9 (42.0, 94.3)
Degree of urbanization
  Urban area in a large
  metropolitan area                  103.8 (73.9, 145.7)     110.4 (81.9, 148.7)     78.7 (57.0, 108.6)
  Suburban area in a
  large metropolitan area            95.9 (70.3, 130.7)      104.0 (78.5, 137.7)     55.0 (40.8, 74.2)
  Urban area in a small
  metropolitan area                  145.7 (96.4, 220.0)     137.7 (95.6, 198.3)     53.6 (36.3, 79.2)
  Suburban area in a small
  metropolitan area                  142.3 (87.4, 231.6)     140.6 (91.2, 216.8)     66.8 (42.4, 105.3)
  Nonmetropolitan                    75.6 (50.3, 113.6)      79.2 (54.6, 114.9)      81.5 (55.1, 120.5)
-------
[Plot, three panels (drip line, entryway, remote location): soil lead concentration (log scale, 10 to 1,000) by Census region (Northeast, Midwest, South, West), with one line per building age class (1960 to 1979, 1950 to 1959, 1940 to 1949, pre-1940)]
Figure 4. Least squares means for soil lead concentrations in private housing for the building age
and Census region interaction by soil location
-------
Table 16. Least-squares means and 95 percent confidence intervals for building-age by Census
region interactions in the private housing unit models

Housing Unit Characteristic                       Soil Lead Concentrations (ppm)
Building age   Census     Drip line soil lead       Entryway soil lead        Remote soil lead
               region*
1960-1979      NE         81.2 (43.4, 152.0)        58.6 (33.1, 103.7)        49.7 (27.3, 90.5)
               MW         28.2 (17.8, 44.7)         41.8 (27.6, 63.5)         18.5 (11.9, 28.7)
               S          31.5 (22.9, 43.3)         45.5 (34.2, 60.5)         28.2 (20.8, 38.1)
               W          25.1 (15.7, 40.1)         43.9 (28.1, 68.8)         16.6 (10.6, 26.0)
1950-1959      NE         71.7 (34.0, 151.0)        72.1 (36.6, 141.8)        32.1 (15.5, 66.4)
               MW         132.1 (64.3, 271.1)       96.7 (51.0, 183.6)        49.3 (24.8, 98.0)
               S          85.5 (47.6, 153.5)        77.1 (45.6, 130.5)        37.9 (22.0, 65.5)
               W          39.4 (18.3, 84.7)         98.2 (47.8, 201.7)        40.1 (19.3, 83.3)
1940-1949      NE         291.2 (90.5, 937.2)       365.1 (126.3, 1054.9)     543.1 (154.6, 1908.0)
               MW         188.5 (72.8, 488.4)       174.8 (73.6, 415.2)       21.8 (9.3, 50.9)
               S          126.3 (53.3, 299.3)       73.1 (34.9, 153.2)        62.8 (28.9, 136.6)
               W          87.9 (31.3, 246.7)        116.7 (45.6, 298.5)       88.9 (33.1, 238.4)
pre-1940       NE         580.0 (322.6, 1042.8)     396.7 (239.0, 658.3)      222.0 (125.6, 392.6)
               MW         674.5 (373.4, 1218.6)     345.9 (205.7, 581.8)      330.4 (188.8, 578.2)
               S          146.7 (78.2, 275.5)       138.2 (79.3, 240.8)       100.0 (55.8, 179.3)
               W          205.3 (89.2, 472.7)       228.2 (107.0, 486.9)      265.3 (119.7, 587.8)

* NE - Northeast
MW - Midwest
S - South
W - West
-------
concentrations in large metropolitan urban areas (which are missing at least one soil lead
measurement at 33 of 93 sampled units) were found to be lower than those in small metropolitan areas
(which are missing at least one soil lead measurement at only 4 of 68 sampled units). A second
explanation might be that the correlation between the degree of urbanization and other factors, such as
traffic, is reducing the effect of highly urbanized areas. The unanticipated result might also be simply
due to the variation in the data associated with the random selection of the homes.
The parameter estimates of the remaining significant predictors of soil lead were relatively
consistent across the drip line and entryway models. The parameter estimate for the average XRF
reading variable was relatively consistent across both the drip line and entryway soil lead models. In
addition, the parameter estimates of both the linear and quadratic terms of the traffic variables were
significant and similar in magnitude across both models. Therefore, the relationship between the log
transformed traffic variable and log transformed soil lead response variable is nonlinear.
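The shape implied by a linear and a quadratic traffic term can be examined with a short Python sketch. The coefficients below are the drip line and entryway point estimates reported in Table 14. The function names are illustrative, and because the units and logarithm base of the traffic variable are not restated here, the turning point is expressed only in units of the transformed traffic variable.

```python
def traffic_effect(log_traffic, linear, quadratic):
    """Contribution of traffic to ln(soil lead), up to the model intercept."""
    return linear * log_traffic + quadratic * log_traffic ** 2

def turning_point(linear, quadratic):
    """Value of the transformed traffic variable where the slope is zero."""
    return -linear / (2.0 * quadratic)

drip_turn = turning_point(-0.861, 0.091)      # drip line model
entry_turn = turning_point(-1.242, 0.112)     # entryway model
print(round(drip_turn, 2), round(entry_turn, 2))
```

With a negative linear term and a positive quadratic term, the fitted contribution falls until the turning point and rises beyond it, which is one way to see why the relationship is described as nonlinear.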
Remote Location Model
For the remote soil location model, as well as in the drip line and entryway models, the housing
units in the Northeast region have significantly larger soil lead concentrations at the sampled remote
location than do the other regions. In addition, the building age by region interaction was significant. As
with the drip line and entryway models, the average household XRF reading variable was a significant
predictor of soil lead concentration. The effect of traffic was different, however, in that it was linear in
the remote location model.
5.2 Public Housing Results
As discussed in Sections 3.5 and 3.6, problems with the public housing data limit inferences that
can be drawn from an analysis of the public housing soil data. The results from the analyses of
covariance, presented in Table 17, are descriptive of relationships in the data, but these relationships may
not apply to public housing in general.
The results from the analysis of covariance are somewhat different than those from the
correlation analysis. The building age was significant in both the analysis of covariance and correlation
analysis. However, the average household lead hazard variable and the number of family units were
significant in correlation analysis but not significant in the analysis of covariance. The average
household XRF reading variable was not significant in either the analysis of covariance or the correlation
analysis. These three variables—the number of family units, the average household lead hazard, and the
average household XRF reading—do not explain any additional variation in the soil lead concentrations
in the presence of the building age and Census region. The analysis of covariance results are presented
in Table 17 and least squares means for building age and Census region are presented in Figures 5 and 6,
respectively, and in tabular form in Table 18.
When viewing the correlations and the results from the analysis, the reader should remember
that data from only 30 percent of the sampled units were used to estimate the correlations between
household soil lead concentrations and development characteristics and lead-based paint hazard
variables.
-------
Table 17. Soil lead model statistics for public housing models

                                                     Soil Location
                                    Drip Line           Entryway            Remote Location
Significance of the Categorical Variables
  Building age                      .0003               .002                .009
  Census region                     .04                 **                  .04
Parameter Estimates and 95 Percent Confidence Intervals for the Continuous Variables
  Average household paint lead      0.232               0.417               -0.026
  hazard                            (-0.240, 0.707)     (-0.124, 0.958)     (-0.473, 0.421)
  Average household XRF reading     -0.037              -0.051              -0.042
                                    (-0.116, 0.042)     (-0.137, 0.036)     (-0.117, 0.032)
Model Statistics
  R-Square                          .601                .517                .461
  Number of observations            27                  25                  28

** - not significant at the 0.10 level
-------
[Plot: soil lead concentration (log scale, 10 to 1000) by building age (pre-1950, 1950 to 1959, 1960 to 1979) for the drip line, entryway, and remote location, with 95% confidence intervals]
Figure 5. Least squares means and 95 percent confidence intervals for soil lead concentrations in
public housing for building age by soil location
-------
[Plot: soil lead concentration (log scale, 10 to 1000) by Census region* (Midwest, South, West) for the drip line, entryway, and remote location, with 95% confidence intervals]
Figure 6. Least squares means and 95 percent confidence intervals for soil lead concentrations in
public housing for Census region* by soil location
* No least-squares means were generated for the Northeast region because the one sampled public
housing unit with soil lead data was removed from the analysis.
-------
Table 18. Least-squares means and 95 percent confidence intervals for categorical variables in the
public housing unit models

                   Drip line soil lead     Entryway soil lead      Remote soil lead
                   (ppm)                   (ppm)                   (ppm)
Building age
  1960-1979        30.2 (19.7, 46.4)       30.1 (18.7, 48.3)       32.1 (20.7, 49.8)
  1950-1959        305.5 (121.2, 770.2)    186.4 (66.3, 524.0)     79.8 (30.2, 210.9)
  pre-1950         126.8 (59.6, 269.7)     173.2 (66.6, 450.3)     149.1 (67.4, 330.0)
Census region*
  Midwest          123.3 (55.1, 276.2)     92.0 (36.1, 234.1)      94.2 (40.5, 219.1)
  South            56.0 (32.3, 97.1)       78.7 (42.2, 146.8)      37.7 (21.5, 66.1)
  West             169.3 (110.3, 259.9)    134.1 (64.7, 278.0)     107.7 (56.0, 207.1)

* No least-squares means were generated for the Northeast region because the one sampled public
housing unit with soil lead data was removed from the analysis.
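
The intervals in Table 18 are asymmetric around each mean in ppm. That pattern is what one would expect if the models were fit to log-transformed concentrations and the results back-transformed, although that is an assumption here, since the transformation is not stated in this part of the report. Under that assumption each mean should sit at the geometric midpoint of its interval, sqrt(lower x upper). The Python sketch below runs that consistency check on the tabulated values; the data layout and function names are illustrative.

```python
import math

# (mean, lower, upper) in ppm, as read from Table 18.
# Order within each list: drip line, entryway, remote location.
# Layout and names are illustrative only.
TABLE_18 = {
    "1960-1979": [(30.2, 19.7, 46.4), (30.1, 18.7, 48.3), (32.1, 20.7, 49.8)],
    "1950-1959": [(305.5, 121.2, 770.2), (186.4, 66.3, 524.0), (79.8, 30.2, 210.9)],
    "pre-1950": [(126.8, 59.6, 269.7), (173.2, 66.6, 450.3), (149.1, 67.4, 330.0)],
    "Midwest": [(123.3, 55.1, 276.2), (92.0, 36.1, 234.1), (94.2, 40.5, 219.1)],
    "South": [(56.0, 32.3, 97.1), (78.7, 42.2, 146.8), (37.7, 21.5, 66.1)],
    "West": [(169.3, 110.3, 259.9), (134.1, 64.7, 278.0), (107.7, 56.0, 207.1)],
}
LOCATIONS = ("drip line", "entryway", "remote location")


def geometric_midpoint(lower: float, upper: float) -> float:
    """Midpoint of an interval that is symmetric on the log scale."""
    return math.sqrt(lower * upper)


def relative_gap(mean: float, lower: float, upper: float) -> float:
    """Relative difference between the reported mean and the geometric midpoint."""
    midpoint = geometric_midpoint(lower, upper)
    return abs(mean - midpoint) / midpoint


# Compute the gap for every cell of the table.
gaps = {
    (row, LOCATIONS[i]): relative_gap(*cell)
    for row, cells in TABLE_18.items()
    for i, cell in enumerate(cells)
}

worst_cell = max(gaps, key=gaps.get)
worst_gap = gaps[worst_cell]
print(f"largest relative gap: {worst_gap:.4f} at {worst_cell}")
```

For the values above every gap is well under 1 percent, which is consistent with rounding alone. A much larger gap in any cell would point to a transcription error or to a model that was not fit on the log scale.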
-------
50272-101
REPORT DOCUMENTATION
PAGE
1. REPORT NO.
EPA 747-R-96-003
4. Title and Subtitle
Distribution of Soil Lead in the Nation's Housing Stock
3. Recipient's Accession No.
5. Report Date
May 1996
6.
7. Author(s)
Westat, Inc.
8. Performing Organization Report No.
9. Performing Organization Name and Address
Westat, Inc.
1650 Research Boulevard
Rockville, MD 20850
10. Project/Task/Work Unit No.
11. Contract (C) or Grant (G) No.
(C) 68-D3-0011
12. Sponsoring Organization Name and Address
U.S. Environmental Protection Agency
Office of Pollution Prevention and Toxics
Washington, DC 20460
13. Type of Report & Period Covered
Technical Report
14.
15. Supplementary Notes
16. Abstract (Limit: 200 words)
In the National Survey of Lead-Based Paint in Housing, conducted by EPA and HUD, lead measurements were
collected on exterior soil, interior house dust, and interior and exterior paint for each sampled dwelling unit. In
addition, the dwelling unit's age, Census region, and degree of urbanization were obtained. This report presents findings
from the National Survey on the prevalence and concentrations of lead in soil in private and public housing units in the
United States. These findings include national estimates of the number of private housing units with various soil lead
concentrations and average soil lead concentrations by building age, Census region, and degree of urbanization. The
report also summarizes the statistical associations between soil lead concentrations and building age, degree of
urbanization, Census region, and the presence and condition of lead-based paint. An analysis of covariance model was
used to identify possible predictors of lead in soil. The age of the dwelling unit was the predominant predictor of soil lead.
Other statistically significant predictors of soil lead included the dwelling unit's Census region, the dwelling unit's average
lead paint levels, and local automobile emissions.
17. Document Analysis
a. Descriptors
Environmental Contaminants
b. Identifiers/Open-Ended Terms
Soil lead, lead-related hazards, National Survey of Lead-Based Paint, Title X, Section 403.
c. COSATI Field/Group
18. Availability Statement
Available to the public from NTIS, Springfield, VA
19. Security Class (This Report)
Unclassified
20. Security Class (This Page)
Unclassified
21. No. of Pages
62
22. Price
(See ANSI-Z39.18)
See Instructions on Reverse
OPTIONAL FORM 272 (4-77)
(Formerly NTIS-35)
Department of Commerce
-------