National Human Exposure Assessment Survey (NHEXAS) Arizona Study Quality Systems and Implementation Plan for Human Exposure Assessment Standard Operating Procedure SOP-IIT-A-9.0 Sampling Weight Calculation


United States Environmental Protection Agency
Office of Research and Development
National Human Exposure Assessment Survey
(NHEXAS)
Arizona Study
Quality Systems and Implementation Plan
for Human Exposure Assessment
The University of Arizona
Tucson, Arizona 85721
Cooperative Agreement CR 821560
Standard Operating Procedure	SOP-IIT-A-9.0
Title: Sampling Weight Calculation
Source: The University of Arizona
U.S. Environmental Protection Agency
Office of Research and Development
Human Exposure & Atmospheric Sciences Division
Human Exposure Research Branch
Notice: The U.S. Environmental Protection Agency (EPA), through its Office of Research and Development (ORD), partially funded
and collaborated in the research described here. This protocol is part of the Quality Systems Implementation Plan (QSIP)
that was reviewed by the EPA and approved for use in this demonstration/scoping study. Mention of trade names or
commercial products does not constitute endorsement or recommendation by EPA for use.

-------
SOP# IIT-A-9.0
Revision #0
February 27, 2001
Page: 1 of 8
STANDARD OPERATING PROCEDURE
FOR
SAMPLING WEIGHT CALCULATION
This Standard Operating Procedure (SOP) uses data that have been properly coded and
certified with appropriate QA/QC procedures by the University of Arizona NHEXAS
team.
Objective
Calculate sampling weights. The sampling weights are needed to obtain weighted statistics
of the NHEXAS data.
Introduction
Three types of sampling weights are defined here:
1.	Single-stage Weight: The inverse of the probability of selection of each sampled
unit at a particular stage.
2.	Total Weight: The total weight of a sampled unit at a particular stage is calculated in
one of two ways, depending on whether weights were adjusted in previous stages:
•	If no weight adjustment has been done in the previous stages, the total weight is
the product of the single-stage weights of that sampled unit at that stage and all
previous stages.
•	If sampling weights in the previous stages have been adjusted, the total weight is
the product of the single-stage weight of that sampled unit at that stage and the
adjusted total weight of that unit at the previous stage.
3.	Adjusted Total Weight: The total weight after it has been adjusted to best
represent the population of interest. This is the weight to be used in the
calculation of weighted statistics of data at each stage.
A diagram which explains development of sampling weights in each survey stage of
NHEXAS is presented in Table 6-1.

Table 6-1: Sampling Weight Diagram - Survey Stages

| STAGE | ACTIVITY | SINGLE-STAGE WEIGHT | TOTAL WEIGHT |
|---|---|---|---|
| Survey 1 | Select 50 Tracts from the State of Arizona | $W_{i,1}$ | |
| Survey 2 | Select 250 Blocks from the 50 Tracts | $W_{i,2}$ | $WT_{i,2} = (W_{i,1})(W_{i,2})$ |
| Survey 3 | Select 1225 HHs from the 250 Blocks | $W_{i,3}$ | $WT_{i,3} = (WT_{i,2})(W_{i,3})$ |
| Survey 4 | Select 1 primary respondent per HH | $W_{i,4}$ | $WP_{DES} = (WT_{DES}^{1})(W_{i,4})$ |
Calculation of Single-stage Weights
In random sampling, each sample unit has an equal probability of selection. When n units are
selected from a stratum which has N units, the probability of selection of each unit is:
$$\pi_i = \frac{n}{N} \tag{6-1}$$
When the "Probability Proportional to Size" (PPS) design is applied, sample units which
differ in size have unequal probabilities of selection. In NHEXAS, the number of
occupied housing units (OHU) is the measure of size. When unit i is selected from a
stratum which has N units, the probability of selection of that unit, $\pi_i$, is:
$$\pi_i = \frac{\text{size of the } i^{\text{th}} \text{ unit}}{\text{total size of all } N \text{ units}} \tag{6-2}$$
When n units are selected, the probability of selection of each unit is multiplied by n and
becomes the frequency of selection, $\pi_i^*$:
$$\pi_i^* = \frac{(n)(\text{size of the } i^{\text{th}} \text{ unit})}{\text{total size of all } N \text{ units}} \tag{6-3}$$
Footnote 1 (Table 6-1): $WT_{DES}$ is the adjusted total weight of unit i at survey stage 3, that is,
the adjusted $WT_{i,3}$. Sampling weight adjustments are explained in SOP#10.
The sampling weight of unit i, $W_i$, is the reciprocal of the probability of selection or
frequency of selection of that unit:
$$W_i = \frac{1}{\pi_i} \quad \text{or} \quad \frac{1}{\pi_i^*} \tag{6-4}$$
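As an illustration of Eq. 6-1 through 6-4, the following Python sketch computes selection probabilities and the resulting weights. The function names and the OHU counts are hypothetical and are not part of the NHEXAS data; exact fractions are used only to keep the arithmetic easy to check.

```python
from fractions import Fraction

def srs_probability(n, N):
    """Eq. 6-1: probability of selection when n of N units are drawn at random."""
    return Fraction(n, N)

def pps_frequency(n, sizes, i):
    """Eq. 6-3: frequency of selection of unit i under PPS (Eq. 6-2 is the case n = 1)."""
    return Fraction(n * sizes[i], sum(sizes))

def sampling_weight(p):
    """Eq. 6-4: the weight is the reciprocal of the probability or frequency of selection."""
    return 1 / p

# Hypothetical stratum of four units; sizes are made-up OHU counts.
ohu = [100, 300, 400, 200]
print(sampling_weight(srs_probability(2, 4)))    # 2
print(sampling_weight(pps_frequency(2, ohu, 1)))  # 5/3
```

Under PPS the larger units receive smaller weights, because they are more likely to be selected.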
The equations for weight calculations of each survey stage in NHEXAS are as follows:
Stage 1: Selection of tracts from each county (PPS design)
$$W_{i,1} = \frac{(\text{OHU/county})_i}{(\text{tract selected/county})_i \, (\text{OHU/tract 2})_i} \tag{6-5}$$
Stage 2: Selection of blocks from each tract (PPS design)
$$W_{i,2} = \frac{(\text{OHU/tract 2})_i}{(\text{block selected/tract})_i \, (\text{OHU/block 2})_i} \tag{6-6}$$
Stage 3: Selection of households from each block (random sampling)
$$W_{i,3} = \frac{(\text{OHU/block 2})_i}{(\text{OHU selected/block})_i} \tag{6-7}$$
Stage 4: Selection of one primary respondent from each household (random sampling)
$$W_{i,4} = \frac{(\text{member/OHU})_i}{1} = (\text{member/OHU})_i \tag{6-8}$$
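The four stage equations (Eq. 6-5 to 6-8) can be sketched as small Python functions. The function and argument names are illustrative only, and the input values are made up to show the arithmetic; they are not NHEXAS figures.

```python
def w_stage1(ohu_county, tracts_selected, ohu_tract2):
    """Eq. 6-5: PPS selection of tracts within a county."""
    return ohu_county / (tracts_selected * ohu_tract2)

def w_stage2(ohu_tract2, blocks_selected, ohu_block2):
    """Eq. 6-6: PPS selection of blocks within a tract."""
    return ohu_tract2 / (blocks_selected * ohu_block2)

def w_stage3(ohu_block2, ohu_selected):
    """Eq. 6-7: random selection of households within a block."""
    return ohu_block2 / ohu_selected

def w_stage4(members):
    """Eq. 6-8: one primary respondent chosen from the household members."""
    return members / 1

# Made-up inputs for a single household record.
w1 = w_stage1(ohu_county=120000, tracts_selected=4, ohu_tract2=1500)
w2 = w_stage2(ohu_tract2=1500, blocks_selected=5, ohu_block2=60)
w3 = w_stage3(ohu_block2=60, ohu_selected=5)
w4 = w_stage4(members=3)
print(w1, w2, w3, w4)  # 20.0 5.0 12.0 3.0
```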
Calculation of Total Weights
In multi-stage sampling, weighted statistics of the samples at any particular stage can be
calculated by using the total weight of each unit at that stage. That total weight is equal to
the product of the single-stage weights from that stage and all the preceding
stages. Mathematically:
$$WT_{i,m} = \prod_{s=1}^{m} W_{i,s} \tag{6-9}$$
where $WT_{i,m}$ = total weight of unit i at stage m, and $W_{i,s}$ = single-stage weight of unit i at
stage s.
Equivalently, the total weight of unit i at stage m can be considered as the product of the
single-stage weight from that stage and the total weight of that unit at stage m-1.
Mathematically:
$$WT_{i,m} = (WT_{i,m-1})(W_{i,m}) \tag{6-10}$$
If the total weight at stage m-1 is adjusted for nonresponse or adjusted with other
methods, the adjusted total weight will be used to calculate the total weight at the next
stage. Therefore, the equation becomes:
$$WT_{i,m} = (WT_{i,m-1,\text{adjusted}})(W_{i,m}) \tag{6-11}$$
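A minimal Python sketch of Eq. 6-9 through 6-11 follows. The single-stage weights and the adjustment factor of 1.25 are hypothetical placeholders; the actual adjustment method is defined in a separate SOP and is not reproduced here.

```python
from math import prod

def total_weight(single_stage_weights):
    """Eq. 6-9: product of the single-stage weights for stages 1..m."""
    return prod(single_stage_weights)

def next_total_weight(previous_total, single_stage_weight):
    """Eq. 6-10 / 6-11: previous_total is the (possibly adjusted) total weight at stage m-1."""
    return previous_total * single_stage_weight

# Made-up single-stage weights for stages 1 to 3.
w = [20.0, 5.0, 12.0]
wt3 = total_weight(w)

# The recursive form (Eq. 6-10) gives the same result as the product (Eq. 6-9).
running = 1.0
for ws in w:
    running = next_total_weight(running, ws)

# Hypothetical adjustment at stage 3, then Eq. 6-11 for stage 4.
wt3_adjusted = wt3 * 1.25
wp = next_total_weight(wt3_adjusted, 3.0)
print(wt3, running, wp)  # 1200.0 1200.0 4500.0
```

Once a weight has been adjusted, only the recursive form applies, because the plain product of single-stage weights no longer carries the adjustment.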
Variable List

| Variable | Description |
|---|---|
| COUNTY | County I.D., according to the Census. |
| COUNTY 2 | County I.D., after some changes are made (see details in Procedure). |
| TRACT | Tract I.D., according to the Census. |
| TRACT 2 | Tract I.D., after some changes are made (see details in Procedure). |
| BLOCK | Block I.D., according to the Census. |
| BLOCK 2 | Block I.D., after some changes are made (see details in Procedure). |
| HHID | Household I.D., according to the Census. |
| RESPONSE | Response status (enrolled or refused to answer the Descriptive Questionnaire). |
| OHU/CNTY | Number of occupied housing units (OHU) per county. |
| OHU/TRACT | Number of OHU per tract. |
| OHU/TRACT 2 | Number of OHU per tract, according to TRACT 2. |
| HH ENU/BLK | Number of households enumerated by "field truthing" per block. |
| VAC REP/BLK | Number of vacant households per block. |
| OHU/BLOCK | Number of OHU available for contact per block. This is equal to HH ENU/BLK minus VAC REP/BLK. |
| OHU/BLOCK 2 | Number of households available for contact per block, according to BLOCK 2. |
| TRACT SELECTED/CNTY | Number of tracts selected per county, according to TRACT 2. |
| BLOCK SELECTED/TRACT | Number of blocks selected per tract, according to TRACT 2 and BLOCK 2. |
| OHU SELECTED/BLOCK | Number of OHU selected per block, according to BLOCK 2. |
| MEMBER/OHU | Number of members per OHU. |
| $W_{i,1}$ | Single-stage weight of OHU i, resulting from sampling in stage 1. |
| $W_{i,2}$ | Single-stage weight of OHU i, resulting from sampling in stage 2. |
| $W_{i,3}$ | Single-stage weight of OHU i, resulting from sampling in stage 3. |
| $W_{i,4}$ | Single-stage weight of the primary respondent in OHU i, resulting from sampling in stage 4. |
| $WT_{i,3}$ | Total weight of OHU i at stage 3. |

Procedure
1. In SPSS, open ORIGINAL DQX and delete all variables except the following: TRACT, BLOCK,
HHID, and RESPONSE. Save the data as a new file called WEIGHT STRUCTURE MAIN.
2. In Excel, open WEIGHT STRUCTURE MAIN. The following variables will be added to the file
and their values will be entered: COUNTY, OHU/CNTY, and OHU/TRACT. Data for these three
variables will be obtained from the 1990 Census data file C_TRACT.DAT sent to IIT by
the UA research team. Next, the following variables will be added to the file and
their values will be entered: HH ENU/BLK and VAC REP/BLK. Data for these two variables will
be obtained from a document called "NHEXAS RECRUITMENT LOG SUMMARY",
which was sent to IIT by the UA research team. Also, a variable called OHU/BLOCK will
be created. Its value, for each household, is equal to the value in HH ENU/BLK minus the
value in VAC REP/BLK.
3. Because some tracts and blocks were combined, variables for the new tract and block I.D.
will be created and called TRACT 2 and BLOCK 2, respectively. The combination data are
obtained from the document "NHEXAS RECRUITMENT LOG SUMMARY". For
TRACT 2, all values will be the same as the original tract I.D. except in Santa Cruz
county, where two tracts (9962 and 9964) will be combined. They are considered as
one tract with the tract I.D. "99629964". For BLOCK 2, all single blocks will keep
their original block I.D., while all combined blocks will receive a
new I.D. There are 36 groups of combined blocks in total. Each group will be
given a block combination number. The new block I.D. will then be equal to the block
combination number plus 9900. For example, Pima county has 6 selected blocks: 101,
307, 113, 128, 119, and 201. The last two blocks are indicated as combined blocks.
This combination was given combination number 1. Therefore, these two
blocks will be considered as one block with I.D. 9901. Two important criteria for
weight calculation purposes are:
a) A block cannot be considered as a single block and a component of combined
blocks at the same time. For example, tract 4 in Yavapai county, which has 4
single blocks (107, 229, 220, and 231) and a group of combined blocks
(231+311+312), is considered as having 3 single blocks (107, 229, and 220) and
a group of combined blocks (231+311+312).
b) A block cannot be in different groups of combined blocks at the same time. For
example, tract 9611 in Navajo county has 2 single blocks (135 and 316) and 3
groups of combined blocks: 147+218, 211+218, and 239+218. Since block
218 appears in all 3 groups, the groups must be combined. As a result, this
tract is considered as having 2 single blocks (135 and 316) and one group of
combined blocks: 147+211+239+218.
The new tract and block I.D., which are identified by the variables TRACT 2 and BLOCK 2,
will be used for the rest of the weight calculation procedure.
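Criteria (a) and (b) and the "combination number plus 9900" rule can be sketched in Python as follows. This is only an illustration using the Navajo, Yavapai, and Pima examples from the text; the function names are hypothetical, and the SOP itself does not prescribe any program for this step.

```python
def merge_groups(groups):
    """Criterion (b): merge combined-block groups that share any block."""
    merged = []
    for group in groups:
        group = set(group)
        untouched = []
        for existing in merged:
            if existing & group:
                group |= existing      # overlapping groups become one group
            else:
                untouched.append(existing)
        merged = untouched + [group]
    return merged

def resolve_blocks(single_blocks, groups):
    """Criterion (a): a block that belongs to a group is no longer a single block."""
    merged = merge_groups(groups)
    in_a_group = set().union(*merged)
    singles = [b for b in single_blocks if b not in in_a_group]
    return singles, merged

def block2_id(combination_number):
    """New block I.D. for a combined group: combination number plus 9900."""
    return 9900 + combination_number

# Tract 9611, Navajo county: three groups share block 218.
navajo = resolve_blocks([135, 316], [(147, 218), (211, 218), (239, 218)])
# Tract 4, Yavapai county: block 231 is listed both as single and as combined.
yavapai = resolve_blocks([107, 229, 220, 231], [(231, 311, 312)])
print(navajo)
print(yavapai)
print(block2_id(1))  # 9901, the Pima county example
```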
4. The SUDAAN program requires that more than one Secondary
Sampling Unit (SSU) be selected from each Primary Sampling Unit (PSU).
In NHEXAS, the PSU is the county and the SSU is the tract. Therefore, any county
which has only one tract selected must be combined with another county. Based on
geographical attributes of the counties, combinations of counties were made and the
new county I.D. are:

| Combined Counties | New I.D. in COUNTY 2 |
|---|---|
| 1 and 17 | 1017 |
| 9 and 11 | 9011 |
| 15 and 25 | 15025 |
| 19 and 23 | 19023 |

Similarly, the new county I.D., which is identified by COUNTY 2, will be used for the rest
of the weight calculation procedure.
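The county recoding above amounts to a lookup table. A minimal Python sketch follows, assuming county I.D. values are stored as integers (an assumption; the actual file may store them differently). The name `county2` is hypothetical.

```python
# Combined counties from the table above; all other counties keep their I.D.
COUNTY_COMBINATIONS = {
    1: 1017, 17: 1017,
    9: 9011, 11: 9011,
    15: 15025, 25: 15025,
    19: 19023, 23: 19023,
}

def county2(county):
    """Return the COUNTY 2 value for an original COUNTY value."""
    return COUNTY_COMBINATIONS.get(county, county)

print(county2(17))  # 1017
print(county2(13))  # 13 (not listed in the table, so unchanged)
```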
5. In WEIGHT STRUCTURE MAIN, a new variable OHU/TRACT 2 will be created. Its values will be
the same as OHU/TRACT except in tract 99629963, where the value is the sum of the
OHU in tracts 9962 and 9963. In other words, this variable contains the number of
OHU per tract corresponding to TRACT 2. Next, a new variable OHU/BLOCK 2 will be
created. For single blocks, the variable will have the same values as OHU/BLOCK. For
groups of combined blocks, each group will have a value which is the sum of the
OHU in its component blocks. In other words, this variable contains the number of
OHU per block corresponding to BLOCK 2. The summation will be done
using available functions in Excel.
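The SOP performs this summation in Excel. Purely as an illustration, the same logic for OHU/BLOCK 2 can be sketched in Python. The sketch assumes one row per household, so each original block is counted once before summing, and it assumes every component block of a group appears in at least one row. The field names, the tract label, and the OHU counts are hypothetical.

```python
from collections import defaultdict

def ohu_per_block2(rows):
    """Sum OHU/BLOCK over the distinct original blocks that share a BLOCK 2 value."""
    per_block = {}
    for r in rows:
        # Household rows repeat the block-level count, so keep one entry per block.
        per_block[(r["tract2"], r["block"])] = (r["block2"], r["ohu_block"])
    totals = defaultdict(int)
    for (tract2, _block), (block2, ohu) in per_block.items():
        totals[(tract2, block2)] += ohu
    return dict(totals)

# Made-up household rows: blocks 119 and 201 are combined as block 9901.
rows = [
    {"tract2": "T", "block": 101, "block2": 101, "ohu_block": 40},
    {"tract2": "T", "block": 119, "block2": 9901, "ohu_block": 25},
    {"tract2": "T", "block": 119, "block2": 9901, "ohu_block": 25},
    {"tract2": "T", "block": 201, "block2": 9901, "ohu_block": 35},
]
print(ohu_per_block2(rows))  # {('T', 101): 40, ('T', 9901): 60}
```

Summing directly over household rows would count block 119 twice, which is why the sketch deduplicates by original block first.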
6. In WEIGHT STRUCTURE MAIN, using functions available in Excel, the following variables will
be created: TRACT SELECTED/CNTY, BLOCK SELECTED/TRACT, and OHU SELECTED/BLOCK.
7. Finally, in WEIGHT STRUCTURE MAIN, calculate the following sampling weights:
• $W_{i,1}$, calculated by using Eq. 6-5.
• $W_{i,2}$, calculated by using Eq. 6-6.
• $W_{i,3}$, calculated by using Eq. 6-7.
• $W_{i,4}$, calculated by using Eq. 6-8.
• $WT_{i,3}$, calculated by using Eq. 6-9.
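The three count variables created in step 6 feed the denominators of Eq. 6-5 to 6-7. A possible way to derive them from household rows is sketched below in Python. It assumes each row is one selected household and that counts are taken within COUNTY 2, TRACT 2, and BLOCK 2; both points are interpretations of the variable descriptions, and all names and values are hypothetical.

```python
from collections import defaultdict

def selection_counts(rows):
    """Return (tracts per county, blocks per tract, households per block)."""
    tracts = defaultdict(set)
    blocks = defaultdict(set)
    households = defaultdict(int)
    for r in rows:
        c, t, b = r["county2"], r["tract2"], r["block2"]
        tracts[c].add(t)               # distinct tracts selected in the county
        blocks[(c, t)].add(b)          # distinct blocks selected in the tract
        households[(c, t, b)] += 1     # one row per selected household
    return (
        {c: len(s) for c, s in tracts.items()},
        {k: len(s) for k, s in blocks.items()},
        dict(households),
    )

# Made-up rows for one county with two tracts.
rows = [
    {"county2": 1017, "tract2": 10, "block2": 101},
    {"county2": 1017, "tract2": 10, "block2": 101},
    {"county2": 1017, "tract2": 10, "block2": 9901},
    {"county2": 1017, "tract2": 20, "block2": 101},
]
tract_sel, block_sel, ohu_sel = selection_counts(rows)
print(tract_sel)  # {1017: 2}
print(block_sel)  # {(1017, 10): 2, (1017, 20): 1}
print(ohu_sel)    # {(1017, 10, 101): 2, (1017, 10, 9901): 1, (1017, 20, 101): 1}
```

Keying blocks by county and tract keeps two blocks with the same number in different tracts from being counted as one.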
Spreadsheet Format
In WEIGHT STRUCTURE MAIN:

| Column | Variable |
|---|---|
| 1 | COUNTY 2 |
| 2 | TRACT 2 |
| 3 | BLOCK 2 |
| 4 | HHID |
| 5 | RESPONSE |
| 6 | OHU/CNTY 2 |
| 7 | OHU/TRACT 2 |
| 8 | HH ENU/BLK |
| 9 | VAC REP/BLK |
| 10 | OHU/BLOCK, calculated from (HH ENU/BLK) - (VAC REP/BLK) |
| 11 | OHU/BLOCK 2, which is the values from OHU/BLOCK adjusted according to BLOCK 2 |
| 12 | TRACT SELECTED/CNTY |
| 13 | BLOCK SELECTED/TRACT |
| 14 | OHU SELECTED/BLOCK |
| 15 | MEMBER/OHU |
| 16 | $W_{i,1}$, calculated from (OHU/CNTY) / [(OHU/TRACT 2) x (TRACT SELECTED/CNTY)] |
| 17 | $W_{i,2}$, calculated from (OHU/TRACT 2) / [(OHU/BLOCK 2) x (BLOCK SELECTED/TRACT)] |
| 18 | $W_{i,3}$, calculated from (OHU/BLOCK 2) / (OHU SELECTED/BLOCK) |
| 19 | $W_{i,4}$, which is equal to (MEMBER/OHU) |
| 20 | $WT_{i,3}$, calculated from $W_{i,1}$ x $W_{i,2}$ x $W_{i,3}$ |
