Assessment of the Contribution to
   Personal Exposures of Air Toxics from
   Mobile Sources
United States
Environmental Protection
Agency

-------
                                  Assessment and Standards Division
                                 Office of Transportation and Air Quality
                                 U.S. Environmental Protection Agency
                                        Prepared for EPA by

                                       Clifford P. Weisel, PhD
                           Environmental & Occupational Health Sciences Institute
                                 Robert Wood Johnson Medical School
                            University of Medicine and Dentistry of New Jersey

                                    EPA Contract No. 68-C-03-149
                   NOTICE

                   This technical report does not necessarily represent final EPA decisions or
                   positions. It is intended to present technical analysis of issues using data
                   that are currently available. The purpose in the release of such reports is to
                   facilitate the exchange of technical information and to inform the public of
                   technical developments which may form the basis for a final EPA decision,
                   position, or regulatory action.
EPA420-R-05-025
December 2005

-------
Executive Summary:




   To evaluate the role of proximity to mobile source emissions on ambient  air




surrounding  residences,  statistical  analyses  using linear regression   models were




conducted for selected volatile  organic compounds, carbonyls, PM2.5 mass, elemental




carbon and organic carbon with mobile emission sources. The log transformed ambient




air  concentration  of individual  air  toxics  measured  in  Elizabeth, NJ  during  the




Relationship  of Indoor,  Outdoor and Personal  Air (RIOPA)  study was used as the




dependent variable and inverse distance to roadways, gas stations, and point sources and




meteorological  parameters as the independent variables in the regression models.  The




home, roadway, point and area sources in and around Elizabeth, NJ were geocoded using




Geographic Information System (GIS) techniques to determine the distance between the




homes and potential  ambient sources. Meteorological data (wind speed, wind direction,




temperature, and atmospheric pressure) were obtained from the NOAA Weather-Bureau-Army-Navy
(WBAN) station at the Newark Liberty International Airport, which is
immediately to the north of Elizabeth, and mixing height data were obtained for Brookhaven, NY (the




closest station to Elizabeth containing that type of data).  The meteorological data were




averaged over the 48 hour sampling period to provide  a single  value for each sample.




The roads were stratified into six roadway types based on  categories used in the EPA




Mobile 6 model. Quality assurance steps were taken to confirm the location of each
home and source, including direct visits to Elizabeth to verify the addresses and
coordinates. Various regression models (and selection criteria) were used to confirm that
a repeatable set of associations was obtained.
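
The distance and inverse-distance terms described above can be illustrated with a minimal sketch. It is not the GIS procedure used in the study: it assumes a simple point-to-point great-circle distance, whereas distances to roadways would normally be measured to the nearest road segment. The function names and the 1 m floor are hypothetical. The two coordinate pairs are the ones quoted later in this report for the Newark airport WBAN station and the representative point for Elizabeth.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inverse_distance(distance_m, floor_m=1.0):
    """Inverse distance (1/m). floor_m is a hypothetical guard against
    division by zero for a home geocoded on top of a source."""
    return 1.0 / max(distance_m, floor_m)


# Coordinates quoted in this report (not home locations, which are not given).
station = (40.72, -74.17)    # Newark Liberty International Airport WBAN station
elizabeth = (40.65, -74.20)  # representative coordinates of Elizabeth

d = haversine_m(elizabeth[0], elizabeth[1], station[0], station[1])
print(f"distance: {d / 1000:.1f} km, inverse distance: {inverse_distance(d):.2e} 1/m")
```

An inverse-distance predictor of this kind gives the most weight to homes closest to a source, which is consistent with the finding that the effect is confined to short distances.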

-------
   All target aromatic compounds (benzene, toluene,  ethyl benzene, m,p xylene,  o




xylene),  methyl tert butyl ether, PM2.5, and organic carbon were statistically associated




with the inverse distance to urban major arterials (FC14) or the interstate highway




(FC11);  methyl tert butyl ether  (MTBE), benzene, m,p xylene,  and o xylene  were




statistically  associated with the  inverse  distance to gasoline stations;  the carbonyl




compounds  (acetaldehyde, acrolein, and formaldehyde) were not associated with the




inverse distance to  roadways;  PM2.5  and elemental  carbon were associated with area




sources of diesel emissions based on truck or bus depot and idling activity; and two PAH
compounds (coronene, associated with gasoline emissions, and benzo[ghi]perylene, associated
with mobile emissions) were statistically associated with the inverse distance to FC11 and
to PM area sources. Two volatile compounds and two PM constituents without mobile sources
(carbon tetrachloride, tetrachloroethylene, sulfur and selenium), which were examined as
controls to check for spurious associations, were not associated with distance to roadways.




   The regression models had overall r2 values between 0.16 and 0.67, indicating that between
approximately 20% and 70% of the variability in the air concentrations was explained by
the models. However, the partial r2 values of the distance terms were less than 10%, as
meteorology was a more important factor in controlling the variations in the
concentrations than the distance between the home and a mobile emission source. The
effect of mobile source emissions appears to be confined to residences very close to the
sources (within 200 meters), though within that distance it can cause increases of several
µg/m3, depending on the source strength for that compound. Thus, for most homes in
Elizabeth, NJ, the influence of mobile sources is to raise the general background levels of
the compounds emitted, with the increase dependent upon the meteorological conditions,
especially the atmospheric stability, and there appear to be only small increases in
concentrations as the distance decreases. For homes in very close proximity (no
more than several hundred meters) to gas stations and highly trafficked roadways, the
regression models predict changes in the median ambient air concentration around the
homes from the typical background levels by 2 to 10 µg/m3.

-------
BACKGROUND




       The Relationship between Indoor, Outdoor, and Personal Air (RIOPA) study was




undertaken to determine the influence  of outdoor sources on indoor and personal  air




concentrations of a set  of volatile organic compounds  (VOCs), aldehydes,  and PM2.5




mass. Indoor/outdoor polyaromatic hydrocarbons (PAHs), and elemental carbon/organic




carbon (EC/OC) concentrations were also measured (Weisel et al. 2004a,b).  The study




collected data on indoor, outdoor,  and personal air concentrations for approximately 300




non-smoking homes in Los Angeles (CA), Elizabeth (NJ), and Houston (TX), visited
twice from the summer of 1999 to the spring of 2001.  Either one or two homes were




visited  on a  single day, though some days had  samples  collected from three  or four




homes.  Samples were collected throughout the year.  This report focuses on the analysis




of VOCs, carbonyls and PM2.5 associated with mobile source emissions and air samples




collected outside residences in Elizabeth, NJ.  One dominant source in Elizabeth, NJ, for




aromatic hydrocarbons and PM2.5 is mobile sources. Prior to  examining the ambient




source  contributions to indoor/personal VOCs, the association between source emission




and ambient VOC concentrations near residences should be established. One approach to




this  is  to evaluate  the role  of proximity to  the  potential  emission  sources and




meteorological  conditions on  ambient concentrations.  Precise  proximity information




between the residences where the samples were collected and potential emission  sources




is needed along with locally collected meteorological information to evaluate the effect




of proximity and  meteorological  conditions.  All roadway  classes bisect the  city  of




Elizabeth, NJ, so wide  distributions of distances to  each roadway type and gasoline




stations exist. The home selection  criteria included over-sampling homes close to heavily








Final Report                             1                                11/22/2004

-------
trafficked roadways and near gasoline stations. The association of air concentration




and proximity to mobile sources was examined by deriving linear regression equations




using air concentration as the dependent variable  and proximity to roadways, gasoline




stations, and point sources and meteorological parameters as the independent variables.




Previous attempts to find such statistical associations have met with only minimal success unless




the locations of the homes were very close to the roadways.  The RIOPA dataset was




designed to contain a substantial number of homes within 0.5 km of mobile sources,




thereby allowing for examination of the effect of proximity to mobile source emissions in




a northeast urban environment, Elizabeth, NJ.
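
The form of the regression described above can be sketched with a single predictor. This is a simplified illustration only: the models in this report include several distance terms and meteorological parameters, while the sketch fits the log-transformed concentration against one inverse-distance term by ordinary least squares. The function name and the numeric values are made up for illustration and are not RIOPA data.

```python
import math


def fit_simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r2


# Hypothetical illustrative values, NOT measurements from the study.
distance_m = [50.0, 100.0, 200.0, 400.0, 800.0]
conc_ug_m3 = [4.0, 3.0, 2.4, 2.1, 2.0]

inv_dist = [1.0 / d for d in distance_m]      # independent variable
log_conc = [math.log(c) for c in conc_ug_m3]  # dependent variable

intercept, slope, r2 = fit_simple_ols(inv_dist, log_conc)
print(f"ln(C) = {intercept:.3f} + {slope:.1f} * (1/d), r2 = {r2:.2f}")
```

A positive slope in such a fit indicates higher concentrations at homes closer to the source; the r2 value is the fraction of variability in the log concentration explained by the predictor.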









METHODOLOGY




Construction of the RIOPA Database




       The major components in the RIOPA database were sample information, analysis




results, and questionnaire responses. The database was implemented in Microsoft Access




97® and upgraded in Microsoft Access 2000®.  A decomposition process was  used to




remove internal duplication in a series of steps without loss of data. Every  tabular  record




was indexed with a unique data-independent primary key.  The unique, data-independent




primary key enables the linking, indexing, filtering and sorting of records in multiple




tables and their components.  A normalization process was used to reorganize the data




into a streamlined effective tabular structure. For decomposition and normalization, the




Access commands 'selection query' and 'make table query'  were most frequently used.




To find repetitions of an identical record in a table, the Access command 'find the
duplicate query' was used, while 'find unmatched query' was used to determine when








there were missing data.  Establishing relationships between tables by assigning a
unique primary key, such as an identification field, was essential for the performance
of the database.
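
The duplicate and unmatched checks described above were run as Access queries. A rough equivalent can be sketched in SQL through Python's built-in sqlite3 module. The table names, column names and rows below are hypothetical and serve only to show the two query patterns; they are not the RIOPA schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE home (home_id TEXT PRIMARY KEY);
CREATE TABLE sample (
    row_id    INTEGER PRIMARY KEY,  -- data-independent primary key
    sample_id TEXT,
    home_id   TEXT
);
INSERT INTO home VALUES ('NJ101'), ('NJ102');
INSERT INTO sample (sample_id, home_id) VALUES
    ('10001', 'NJ101'),
    ('10001', 'NJ101'),   -- repeated record
    ('10002', 'NJ102'),
    ('10003', 'NJ199');   -- home ID with no matching home record
""")

# Analogue of a "find duplicates" query: sample IDs entered more than once.
duplicates = conn.execute(
    "SELECT sample_id, COUNT(*) FROM sample "
    "GROUP BY sample_id HAVING COUNT(*) > 1"
).fetchall()

# Analogue of a "find unmatched" query: samples without a matching home row.
unmatched = conn.execute(
    "SELECT s.sample_id FROM sample AS s "
    "LEFT JOIN home AS h ON s.home_id = h.home_id "
    "WHERE h.home_id IS NULL"
).fetchall()

print("duplicates:", duplicates)
print("unmatched:", unmatched)
```

The separate row_id column mirrors the data-independent primary key described in the text: it stays unique even when a sample ID has been keyed in twice, so the repeated record can still be found and removed.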




       Each sampled home and each sample was assigned a unique identification number




(ID) prior to collecting the sample. Each unique sample number was linked to the home




ID so the samples  associated with each  home  could be  identified. The home ID  was




coded  to  identify the state  the  sample was  collected in using  the two letter state




abbreviations (CA, TX, and NJ), followed by a three-digit number unique for each home




in that state. Among the three digit numbers, the first digit represented whether the visit




was the first or a repeat visit (1, 2 respectively), and the second and third  digits  the




chronological order the house was selected in (00-99).  A unique  five digit sequential




number was assigned to each sample as the sample identifier. The first digit of the sample ID
was reserved to identify the sample type, while the remaining digits were randomly assigned so
that the analyst could determine neither where the sample came from nor the sample type




(indoor, outdoor, personal, blank, duplicate) prior to analysis. The descriptions of the  data




fields contained in the RIOPA database are listed in Tables 1 to 3.
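
The home ID coding described above can be decoded with a short sketch. It assumes the ID is written as the two-letter state abbreviation followed directly by the three digits with no separator (for example "NJ205"); the exact written form is an assumption, and the names used here are hypothetical.

```python
import re
from typing import NamedTuple


class HomeVisit(NamedTuple):
    state: str  # two-letter state abbreviation (CA, TX or NJ)
    visit: int  # 1 = first visit, 2 = repeat visit
    order: int  # chronological order in which the home was selected (0-99)


# Assumed layout: state code, one visit digit, two order digits.
_HOME_ID = re.compile(r"^(CA|TX|NJ)([12])(\d{2})$")


def parse_home_id(home_id: str) -> HomeVisit:
    """Split a home ID into state, visit number and selection order."""
    match = _HOME_ID.match(home_id)
    if match is None:
        raise ValueError(f"not a valid home ID: {home_id!r}")
    return HomeVisit(match.group(1), int(match.group(2)), int(match.group(3)))


print(parse_home_id("NJ205"))
```

Under this assumed layout, "NJ205" would denote the repeat visit to the New Jersey home with selection order 05.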




Quality Assurance of Database




       The following quality assurance protocols  were followed at each data entry and




modification  step to find data entry errors and repeated or missing data. All the sampling




information, analysis  results and  questionnaire data were transferred into the database.




Quality assurance at the data entry level was performed by having an individual who did




not enter the data compare the original written  sampling records to the  electronic  data




files. Validation equations  were used in  Access Query to identify potential data entry








errors, especially for the fields containing calculated values. Access commands were used




to find duplicate data entries ("find the duplicate query"), which were deleted from the
database, and missing data across different tables ("find the unmatched query").




All  detailed  information  concerning  the sample  collection,  sample  analysis  and




questionnaires was consolidated and compiled into the main database in an organized
manner, as illustrated in Figure 1 (adapted from Weisel et al. 2004b, RIOPA final report).




The final database was reviewed by research associates experienced in analyzing each




specific type of sample. This review included cross checking keyed data entries against




the original printed hard copy  of the analytical data. The research associate double-




checked all the calculations  used  to  transform the analytical data into the  reported




ambient air concentrations. Finalized data were confirmed by reapplying all of the




calculations to the original analytical data. After the research associate completed his or




her verification, the initial database was then classified as the preliminary database.




         The field  teams validated  the preliminary database  by reviewing  the  field




sampling information and confirming the calculations that incorporated the  information




from the field sampling  sheets. The field teams then made any necessary corrections and




noted the change, which was then reported back to the originator for further confirmation




of the needed  correction. After the field teams made their comments and corrections, the




principal investigators randomly checked the data by cross-referencing the electronic data




for a subset of samples with the respective original data from the analytical results or




sampling information sheets.
Table 1. Components and Data Fields of the Sampling Information in the RIOPA Database

Data Fields            Description
Home ID                Unique identification number with state abbreviation
Source ID, location    PFT source ID number (alpha-numeric) and location (floor-room)
CAT ID                 Capillary absorbent tube ID (numeric)
Sample ID              Unique 5 digit number linked to the house ID, identifying
                       contaminant category measured (VOC, DNPH, DNSH, Teflon and
                       Quartz filter for PM2.5, PUF)
Sampling date, time    Date (mm/dd/yy) and time (hh:mm) sampling started and ended
Sample duration        Calculated duration of sampling in minutes
Flow rate              Initial, final and average flow rate of pump (cc/min, or L/min)
Sample volume          Calculated volume of sample (L, or m3)
Pump elapsed time      Pump elapsed time recorded on the pump counter in minutes
Pump recorded volume   Pump recorded volume of air sampled (m3)
Sample type            Sample type (indoor, outdoor, personal adult, child, duplicate,
                       blank, control)
Equipment ID           Pump, head and battery IDs
Leak test              Leak test check done before and after sampling (yes/no)

-------
Table 2. Components and Data Fields of the Information of the Analysis Results in the RIOPA Database

Data Fields          Description
VOCs                 Concentration (ppb, µg/m3) of 1,3-butadiene, methylene chloride,
                     chloroprene, methyl tert butyl ether, carbon tetrachloride,
                     chloroform, benzene, m,p-xylene, toluene, trichloroethylene,
                     tetrachloroethylene, ethylbenzene, o-xylene, styrene, α-pinene,
                     β-pinene, d-limonene, 1,4-dichlorobenzene
Carbonyls            Concentration (ppb, µg/m3) of formaldehyde, acetaldehyde,
                     acetone, acrolein, propionaldehyde, crotonaldehyde, benzaldehyde,
                     hexaldehyde, glyoxal, methylglyoxal
PM2.5                PM2.5 mass; concentration (ppb, µg/m3) of organic carbon (OC)
                     and elemental carbon (EC); elements: Ag, Al, As, Ba, Be, Bi, Br,
                     Ca, Cd, Cl, Co, Cr, Cs, Cu, Fe, Ga, Ge, Hg, In, K, La, Mn, Mo, Ni,
                     P, Pb, Pd, Rb, S, Sb, Se, Si, Sn, Sr, Ti, Tl, U, V, Y, Zn, Zr
PAHs                 Concentration (ppb, µg/m3) of gas/particle phase polycyclic
                     aromatic hydrocarbons: Dibenzothiophene, Phenanthrene,
                     Anthracene, 2-Methylanthracene, 1-Methylanthracene,
                     1-Methylphenanthrene, 9-Methylanthracene,
                     4,5-Methylenephenanthrene, 3,6-Dimethylphenanthrene,
                     9,10-Dimethylanthracene, Fluoranthene, Pyrene, Benzo[a]fluorene,
                     Retene, Benzo[b]fluorene, Cyclopenta[c,d]pyrene,
                     Benzo[a]anthracene, Chrysene+Triphenylene,
                     Benzo[b]naphtho[2,1-d]thiophene, Benzo[b+k]fluoranthene,
                     Benzo[e]pyrene, Benzo[a]pyrene, Perylene,
                     Indeno[1,2,3-c,d]pyrene, Dibenzo[a,c+a,h]anthracene,
                     Benzo[g,h,i]perylene, Coronene
House Information    Air exchange rate (1/hr) and the volume of house (m3)
Meteorological       Temperature and relative humidity measured inside and outside of
Information          house

-------
Table 3. Components and Data Fields of the Questionnaire Data in the RIOPA Database

Data Fields              Description
Technician Walkthrough   Evaluation of the house and its usage and a description of the
                         neighborhood regarding possible sources.
Baseline Survey          Household and participant characteristics; demographics and
                         socioeconomic status; housing characteristics, facilities and usage;
                         personal exposure activities before the study period; and
                         respiratory health status of participant
Activity Questionnaire   A detailed series of questions related to activities, duration and use
                         of consumer products
Time Diary               48-hour activity log listing the time spent in each
                         microenvironment

-------
[Figure 1: flow diagram not reproduced. Recoverable box labels: Field Sampling/Data
Collection; Questionnaires (Technician Walkthrough, Baseline, Time Activity Diary);
Field Sampling Information Sheets (Subject ID, Sampling Date/Time, Sampler Type, Sample
Location); Samples (Aldehydes, VOC, PM, PAH, AER); Questionnaire Database (restricted
access); Initial Database; Initial Verification by Research Associates; Preliminary
Database; Verification and Comments by Field Teams; Corrections When Necessary; Sub
Database (CA, NJ, TX); Final Verification by PIs; Analysis (Summary Tables, Plots,
Statistical Analysis). Notes in the diagram: original questionnaires were stored in a
locked file cabinet with access restricted to PIs and a designated field technician;
questionnaire data were entered into the database by a designated field technician; PIs
retained sampling information sheets until completion of sample analysis; samples were
transported from the field to the laboratory on ice in coolers; CA field samples were
shipped on blue ice to the NJ analytical lab via overnight carrier; data were keyed in
and consolidated by a laboratory technician.]

Figure 1. The Flow Diagram of the Transference of Information from the Field Sampling to Database Construction and
the Quality Assurance Processes (adapted from Final Report of the RIOPA study)

-------
Data Integration in the RIOPA Database




       To expand the utility of the RIOPA database and to facilitate data analysis with




meteorological and geographical datasets, different databases in the public domain were




either imported into or linked to the RIOPA database. The details of the integration of the




databases are illustrated in Figure 2. The databases included were the National Emission




Inventory of 1999 (version 3.0 final for HAPs and criteria pollutants, US EPA), National




Climatological data obtained from the National Oceanic and Atmospheric




Administration (NOAA),  2000 US Census data, 2000 TIGER/Line  data,  and Roadway




Information & Transportation data obtained from NJ DOT (Table 4).









National Emission Inventory of 1999




       The emission  data of the states of New Jersey and New York from mobile, area,




and point sources were obtained from the 1999 National Emission Inventory (NEI, the




final version 3.0 for the hazardous air pollutants, released in December 2003; the final version
3.0 for the criteria pollutants, released in February 2004). The datasets were divided into four
categories (On-road, Non-road, Point, Non-Point) and were available from the Technology




Transfer Network, Clearinghouse  for Inventories and Emission Factors (TTN CHIEF,




http://www.epa.gov/ttn/chief/net/1999  inventory.html).   The  emission   sources  of




compounds collected in the RIOPA study were selected from the inventory datasets of




the counties containing or adjacent to the RIOPA study  area.  The counties were Union,




Essex, and Hudson Counties, New Jersey, and Richmond County, New York.




       The emissions from on  road mobile sources were calculated to evaluate which




road types to consider in the regression models.  Actual  emission rates were not used as








inputs in the models since only statistical associations were examined in this analysis and




not a  comparison of predicted  to measured  concentrations.   The  emissions  were




calculated by multiplying emission factors (g/mile), estimated by US EPA using the MOBILE
6.2 model, by vehicle miles traveled (VMT, 10^6 miles). The VMT were estimated from
the sampled traffic counts of road segments in the Federal Highway Administration
(FHWA) Highway Statistics 1999 (US EPA, Documents for NEI, 2003). The emission




estimates for each county were stratified by road types (6 urban categories of public roads




were present in Union County) and by twelve vehicle types. The emission rate per unit




length  of public road by functional classification was estimated from the total roadway




mileages of Union County and the annual total emission from  on-road sources in Union




County. The emission rates of selected VOCs by roadway class are listed in Table 5. The




emission rates calculated from the major  roadways (FC11, urban interstate highways;




FC12, urban other freeways and expressways; FC14, urban major arterials) were roughly
6 to 90 fold higher than the emission rates from the minor classes of roadways




(FC16, urban minor arterial; FC17, urban collector; and FC19, urban local). To apportion




the annual total amount of emissions from the  on-road mobile  sources countywide to




Elizabeth, the ratio of the roadway mileage in Elizabeth to the roadway mileage in Union




County was  calculated for each category of functional classification. The public roadway




lengths in Elizabeth were 11.5, 3.7, and 22.3 kilometers for FC11, FC12, and FC14,
respectively. The percentages of the public roadway miles in Elizabeth classified as




FC11, FC12 and FC14 were 3.5%, 11% and 6.7%, respectively. The proportion of major




roadway miles in Elizabeth was larger than that in Union County as a whole (FC11,
1.4%; FC14, 3.6%), in New York-Northeast New Jersey (FC11, 1.3%; FC14, 5.4%) and








in the composite of New Jersey urban areas (FC11, 1.2%; FC14, 5.4%). As a result, the




proportion of urban local roads (FC19) was lower (64%) than the proportion of local




roads of  other  metropolitan areas mentioned (over 70%)  (Table  6).  The largest




contributions to  on-road source emissions in Elizabeth were from roadways  of FC14




(about 33%), followed by contributions from roadways of FC11 (about 30%). More than




75% of aromatic compounds and MTBE were emitted from  major roadways (FC14,




FC11, FC12) according to the emission inventory data and public roadway information of




New Jersey.
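
The arithmetic described above (emission factor times VMT, conversion to a rate per unit road length, and apportionment to Elizabeth by roadway mileage) can be sketched as follows. The function names and all input numbers are hypothetical and chosen only to illustrate the units; they are not values from the NEI or from Table 5.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
METERS_PER_MILE = 1609.344


def annual_emission_g(emission_factor_g_per_mile, vmt_million_miles):
    """Annual emission (g) = emission factor (g/mile) x VMT (10^6 miles)."""
    return emission_factor_g_per_mile * vmt_million_miles * 1e6


def rate_per_length_ug_s_m(annual_g, road_miles):
    """Average emission rate per unit road length, in ug/(sec-m)."""
    road_m = road_miles * METERS_PER_MILE
    return annual_g * 1e6 / (SECONDS_PER_YEAR * road_m)


def apportion_to_city(county_annual_g, city_road_miles, county_road_miles):
    """Scale a county total by the city's share of roadway mileage
    within the same functional class."""
    return county_annual_g * city_road_miles / county_road_miles


# Hypothetical inputs for one compound on one functional class.
county_g = annual_emission_g(emission_factor_g_per_mile=0.05,
                             vmt_million_miles=500.0)
rate = rate_per_length_ug_s_m(county_g, road_miles=20.0)
city_g = apportion_to_city(county_g, city_road_miles=5.0, county_road_miles=20.0)
print(f"county total: {county_g:.3g} g/yr, rate: {rate:.1f} ug/(sec-m), "
      f"city share: {city_g:.3g} g/yr")
```

Dividing by road length is what makes the rates comparable across functional classes of very different total mileage.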




      Emissions from a specific area source were estimated from the annual  emission




estimate for  Elizabeth divided by the total number of area sources in Union County. The




population ratio of Elizabeth to Union County was used to apportion the annual  emission




for  specific area sources in Elizabeth. The national emission inventory of point sources




provided the annual  generation  and the coordinates. The  daily emission from a  point




source was estimated by dividing the annual total by 365 days, which assumes that the




facility operated every day. The emission from the non-road mobile sources was ignored
because the emissions from the non-road sources (lawn and garden equipment, snowmobiles,
snow blowers, construction equipment, etc.) in the study area, an urban center, were
considerably lower than the on-road emissions or the off-road emissions for the more




suburban regions of Union County.




      A number of non-point sources for diesel emissions were identified in  and near




Elizabeth, NJ.  These included: a truck depot and bus depot in  north-east Elizabeth, the




Port Authority-Marine Terminal in East Elizabeth and the Newark Liberty International




Airport located north-northeast of Elizabeth.  All of these locations were north to
northeast of the majority of sampling locations, though the truck and bus depots were close to
a subset of homes. No residences are intermingled with either the seaport or airport.









The Meteorological Data for New Jersey




Surface Observation Data




       Meteorological  data  for  Elizabeth,  New  Jersey,  were  obtained  from




NCDC/NOAA  (National  Climatic  Data Center, National  Oceanic and Atmospheric




Administration). The data are part of the quality assured national climatological database.




The datasets contain hourly  observation tables, along with daily  and monthly summary




tables covering the entire  period of the RIOPA Study. The hourly observation  datasets




were  used because those could  be matched  to the  exact  48-hour sampling  time  of




individual samples. The ASCII data files were linked to the RIOPA weather database for




data extraction. First, the meteorological data were selected from the observation station




that was closest to the study area, the Weather-Bureau-Army-Navy (WBAN) station in




the Newark Liberty International Airport (EWR, 14734; latitude 40.72°, longitude
-74.17°). Next, a series of selection queries in Access was used to retrieve the hourly




observation dataset corresponding to each individual sample according to the date/time




the sampling was started and ended.




       Among  the meteorological data extracted, the variables considered as possibly




influencing the  ambient air concentrations were:  the dry bulb temperature (°F), relative




humidity (%), precipitation (inches), station atmospheric pressure (inHg),  resultant wind




speed (knots), and resultant wind direction (tens of degrees from true north). English
units were converted to SI units. The meteorological values averaged over each 48-hour
sampling period were wind speed (U, m/s), temperature (K), atmospheric




pressure (mmHg), and relative humidity (RH, %). The precipitation was totaled for the




48-hour sampling period.
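
The extraction, unit conversion and 48-hour aggregation described above can be sketched as follows. This is an illustration under stated assumptions, not the Access queries used in the study: the record layout and function names are hypothetical, wind speed is averaged as a simple scalar mean, precipitation is converted to millimeters, and wind direction is left out because averaging a direction needs a circular mean.

```python
from datetime import datetime, timedelta


def fahrenheit_to_kelvin(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


def knots_to_m_per_s(speed_kt):
    return speed_kt * 0.514444  # 1 knot = 1852 m per 3600 s


def inhg_to_mmhg(pressure_inhg):
    return pressure_inhg * 25.4  # 1 inch = 25.4 mm


def summarize_window(hourly, start, end):
    """Average temperature, wind speed and pressure, and total the
    precipitation, over hourly records with start <= time <= end."""
    rows = [r for r in hourly if start <= r["time"] <= end]
    n = len(rows)
    if n == 0:
        raise ValueError("no hourly observations inside the sampling window")
    return {
        "temp_k": sum(fahrenheit_to_kelvin(r["temp_f"]) for r in rows) / n,
        "wind_m_s": sum(knots_to_m_per_s(r["wind_kt"]) for r in rows) / n,
        "pressure_mmhg": sum(inhg_to_mmhg(r["pressure_inhg"]) for r in rows) / n,
        "precip_mm_total": sum(r["precip_in"] for r in rows) * 25.4,
        "n_hours": n,
    }


# Hypothetical constant-valued hourly records around one 48-hour sample.
start = datetime(2000, 1, 1, 8, 0)
end = start + timedelta(hours=48)
hourly = [
    {"time": start + timedelta(hours=h), "temp_f": 32.0, "wind_kt": 10.0,
     "pressure_inhg": 30.0, "precip_in": 0.01}
    for h in range(-3, 52)
]
summary = summarize_window(hourly, start, end)
print(summary)
```

Because both endpoints are included, a 48-hour window that starts and ends on the hour picks up 49 hourly observations in this sketch; whether the study included both endpoints is not stated.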









Mixing Height Data




       The mixing height data were obtained from NCDC/NOAA. The mixing height




data were computed from source code made available by the US  EPA. The dataset was




computed using the upper air data of Brookhaven, NY and the surface data of Newark,




NJ. Brookhaven, NY, was the  closest monitoring station  to the RIOPA  study  site




recording  the upper level air data. Mixing heights were reported as AM and PM mixing




heights. The values were averaged for individual homes over the corresponding
48-hour sampling duration.









Atmospheric Pasquill Stability




       The Atmospheric Pasquill Stability classes with a time resolution of 3 hours were




retrieved from the NOAA Air Resources Laboratory's READY (Real-time Environmental
Applications and Display system) web site (http://www.arl.noaa.gov/ready.html). The




archived datasets were EDAS (Eta Data Assimilation  System) meteorological data




(80km, 3  hourly, US). The representative coordinates  of Elizabeth (Latitude; 40.65°,




Longitude; -74.20°) were used as the location. The text results were tabulated and the




stability time-series plots were saved for individual sample dates when available. The 48-




hour average  stability was calculated  from the stability time-series  classes for each




sample. The classification of the atmospheric stability is described in Table 7.
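
One way to average a series of stability classes is sketched below. The numeric coding is an assumption made for illustration (A = 1 through G = 7); the coding actually used in this analysis is the one defined in Table 7. With a 3-hour time resolution, a 48-hour sample corresponds to about 16 class values. The names are hypothetical.

```python
# Assumed numeric coding of Pasquill classes, for illustration only.
PASQUILL_SCORE = {c: i for i, c in enumerate("ABCDEFG", start=1)}


def mean_stability(classes):
    """Average the numeric scores of a series of Pasquill classes.

    Entries that are not valid class letters (e.g. missing values) are
    skipped; returns None when no valid class is available."""
    scores = [PASQUILL_SCORE[c] for c in classes if c in PASQUILL_SCORE]
    if not scores:
        return None
    return sum(scores) / len(scores)


# Hypothetical 3-hourly classes for part of one sampling period.
print(mean_stability(["D", "D", "E", "C"]))
```

Returning None for an empty series keeps samples without archived stability data distinguishable from samples with a genuine average.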








Table 4. Description of Integrated Data from Databases in the Public Domain

Databases               Description

National Climatological Data
Hourly observations     ASOS; WBAN number, date, time in local standard time, sky
                        conditions, visibility, significant weather types, dry bulb
                        temperature, dew point temperature, wet bulb temperature, relative
                        humidity, wind speed, wind direction, wind characteristic gusts,
                        value for wind character, station pressure, pressure tendency, sea
                        level pressure, report type, precipitation totals in inches
Hourly precipitation    ASOS; WBAN number, date, time, hourly precipitation
Daily table             ASOS; WBAN number, date, temperature (maximum, minimum,
                        average, departure from normal, average dew point, average wet
                        bulb), degree days (heating, cooling), significant weather types,
                        snow/ice depth and water equivalent, precipitation snowfall, pressure
                        (average station and average sea level), resultant wind speed,
                        resultant wind direction, average speed, maximum 5 second, 2
                        minute speed and direction
Mixing height           Morning and afternoon mixing height (meters) produced from
                        surface air and upper air data by NCDC/NOAA
Atmospheric stability   Atmospheric Pasquill stability class from NOAA Air Resources
                        Laboratory

National Emission Inventory Data
On-road sources         County level estimates are stratified by type of roadways and
                        vehicles; NEI for criteria pollutants and HAPs for year 1999 (version
                        3 final)
Non-road sources        NEI for criteria pollutants and HAPs for year 1999 (version 3 final)
Point sources           County level estimates from registered point sources; NEI for criteria
                        pollutants and HAPs for year 1999 (version 3 final)
Non-point sources       County level estimates of non-point sources; NEI for criteria
                        pollutants and HAPs for year 1999 (version 3 final)

Geographic Information and the Spatial Data
Transportation data     Public roadway mileages, functional class of roadways, vehicle miles
                        traveled by stratified vehicle types; NJ DOT
Census 2000 TIGER data  Line features (roadways, railroads, hydrography etc.), municipality
                        from US Census Bureau

-------
Table 5. Estimated Emission Rates (µg/sec-m) of Selected VOCs for Public Roadways of
Union County by its Functional Classes. (Estimation based on 1999 NEI v3 Final)

VOCs            FC11    FC12    FC14    FC16    FC17    FC19
Xylene          50.9    69.8    29.2    5.0     3.4     0.9
Toluene         88.2    121.0   50.4    8.7     5.9     1.6
MTBE            44.4    60.9    25.6    4.4     3.0     0.8
Benzene         32.2    44.1    18.0    3.1     2.1     0.5
Ethylbenzene    13.3    18.3    7.7     1.3     0.9     0.2
Formaldehyde    17.6    24.2    11.1    1.92    1.31    0.3
Acetaldehyde    5.12    7.02    3.22    0.56    0.38    0.1
Acrolein        0.70    0.95    0.49    0.09    0.06    0.01

Table 6. Percent Contribution of On-road Source Emission by Roadway Types in the City
of Elizabeth, NJ (Estimation based on 1999 NEI v3 Final)

VOCs            FC11    FC12    FC14    FC16    FC17    FC19    Total
Xylene          29.4    13.1    32.6    9.9     5.2     9.8     100
Toluene         29.5    13.1    32.6    9.9     5.2     9.7     100
MTBE            29.2    13.0    32.5    9.9     5.2     10.2    100
Benzene         29.8    13.3    32.4    9.8     5.2     9.4     100
Ethylbenzene    29.2    13.0    32.7    9.9     5.2     9.9     100
Formaldehyde    17.4    14.3    27.4    16.5    7.2     17.1    100
Acetaldehyde    17.4    14.4    27.4    16.5    7.2     17.0    100
Acrolein        16.0    13.2    28.4    17.1    7.5     17.8    100

-------
Table 7. The Description of the Classification of the Atmospheric Pasquill Stability

 Pasquill Stability Class     Description                               Code
 A                         Extremely unstable conditions             1
 B                         Moderately unstable conditions            2
 C                         Slightly unstable conditions              3
 D                         Neutral conditions                        4
 E                         Slightly stable conditions                5
 F                         Moderately stable conditions              6
 G                         Extremely stable conditions               7

-------
[Figure 2: flow chart of the data collection and integration processes. Inputs: National
Emission Inventory (HAPs and criteria pollutants, 1999; on-road, non-road, point, and
non-point sources), extracted by state, county (Union, Essex, Hudson Co, NJ, and Richmond
Co, NY) and pollutant (VOCs, carbonyls, PM2.5); National Climatological Data (NOAA ASOS
data, 1999 ~ 2001, with mixing height and atmospheric stability data from NOAA), extracted
by WBAN station and coordinate and averaged over each 48-hour sampling period; and
Geographical Information (2000 U.S. Census and TIGER/Line data, NJ DOT transportation
data, the Union County HAZMAT list of small businesses, and DOQQ aerial photos), overlaid
and verified with GPS re-visits. The GIS layers and geo-database (proximity calculation,
spatial analysis) are joined with the RIOPA database (sampling information, sample
analysis data, questionnaire database) for statistical analysis.]
Figure 2. Data Integration Processes of the Public Databases into the RIOPA Database for Data Analysis of New Jersey Site

-------
Geographical Information Systems




       ArcView GIS (version 3.1, ESRI, Inc.) was used to build the geographical inputs
for statistical analysis. The Spatial Analyst extension was used for geo-processes such as
dissolve, merge, clip, union, spatial join, and select themes. Downloaded scripts were
used to measure the distances between geographical locations. The geographic
coordinates were projected onto NAD83 (North American Datum 1983), New Jersey State
Plane 1983, with units of decimal degrees and feet, using the ArcScript Addxycoo
(ESRI). The GIS application itself also provided a powerful database tool for integration of
datasets by joining and linking databases.









Census 2000 TIGER/Line® Datasets




The  Census 2000  TIGER®  (Topologically  Integrated  Geographic  Encoding  and




Referencing  system) datasets  were downloaded  from the Geography  Network  (US




Census Bureau, Geography Division,  http://www.census.gov/geo/www/tiger).  The line




features included were roads,  railroads, and hydrography. The polygon features were




municipal boundaries such as county, township, and city borderlines. Not only were the




spatial data of  Union County, NJ included in the resulting map, but also the  spatial




features of adjacent counties (Essex, Hudson Counties, NJ and  Richmond County, NY)




since the proximity  information and  source  emissions were also  reviewed  for these




counties (figure 3).

-------
Digital Images




       Digital orthophoto quarter quadrangles (DOQQs) combine the image of a
photograph with the geometric qualities of a map. The primary digital orthophotoquad has a




1-meter ground resolution, quarter-quadrangle (3.75-minutes of latitude by 3.75-minutes




of longitude) image cast on the Universal Transverse Mercator Projection (UTM) on the




North American Datum of 1983 (NAD83). For the RIOPA study area in New Jersey, the




corresponding 1997 DOQQs were downloaded from the New Jersey Image Warehouse




site  of the  NJ  DEP,  Bureau of GIS (http://njgin.nj.gov/OITJW/index.jsp).  The




downloaded DOQQs are listed in Table 8. Figure 4 illustrates the digital image of the




City of Elizabeth with municipal borderlines.









New Jersey Road Network




       The functional classes of roadways (Table 9) in Elizabeth were obtained from the




functional classification  map of Union County from the Bureau of Transportation Data




and Development in Department of Transportation of New Jersey (http://www.state.nj.us/




transportation/refdata). The functional  class information was assigned to the appropriate




road segments using the roadway line feature layer of the ArcView GIS® project file. The




Straight Line Diagrams provided a graphical representation of state, toll, and county




roads and showed intersecting  streets,  administrative and geometric characteristics. The




Straight Line Diagrams  provided the width of the roadways for estimating the general




offset distance from the centerline of roadways. The offset distance used was one half the
roadway width and was required to specify the location of each home relative to the
roadway centerline. This allowed the home to be placed on the correct side of the
roadway rather than on the centerline, and allowed the distance from the home to the
centerline of the roadway to be calculated. Offset distances of 20 to 30 meters were used based on the




functional classes of the roadways. The customized map of the  public roadways  in the




study area is illustrated in Figure 5.









Location of Area and Point Sources




       Lists of  the  street  locations  of service  stations were  obtained  from  visual




observation  and written records made during the sampling, from  web sites  that list




gasoline     stations      by     zip      code      for     price      comparison




(http://www.gaspricewatch.com/USGas_index.asp),  and  from   the   yellow   pages




(http://www.yellowbook.com)  for  Elizabeth,  New  Jersey.  After  combining and




comparing the information  contained in these lists, it was determined a more reliable




compilation   was   still   needed.   This   was   obtained   from   the  Emergency




Response/HAZMAT  of Union  County,  Division  of  Environmental   Health and




Emergency Management. A list of the dry cleaning facilities actually operating in the
City of Elizabeth, NJ, was also obtained from the HAZMAT Team of Union County. Figures
6 and 7 map the gas stations and dry cleaning facilities identified and
located in the study area. The emission inventory database provided the latitude and
longitude of each point source identified in the study area; these coordinates were used to
generate customized maps by creating event themes (Table 10, Figure 8).

-------
Quality Assurance of Geographical Data




       To evaluate the effect of proximity and meteorological conditions simultaneously,




the relative locations of sources and sampling  sites  should be defined  precisely.  All




downloaded geographical layers were overlaid on the New Jersey State Plane of NAD 83.




The TIGER maps placed road centerlines substantial distances (15 to more than 50 meters)
from their actual locations as seen in the aerial photos (DOQQs). Therefore, to obtain the needed accuracy of the




proximity  data  acquisition,  TIGER  data were evaluated  before  geo-coding   and




calculating the distance between road centerline and receptor location. The errors of 2000




TIGER/Line® data were corrected by following the centerlines of the roadways observed




on the overlaid  DOQQs as reference themes. The point  themes were  finalized after




correcting the locations based on the street information collected during confirmation




trips done by driving to each address listed  in the RIOPA dataset, digital orthophoto, the




Elizabeth City engineer's map, pictures taken from the sampling,  and the GPS readings




from the confirmation trip. The GPS unit used to read the coordinates was a GeoStats
wearable GeoLogger™. The GPS readings were used solely as an aid to locate the houses
during the quality assurance visit to Elizabeth. The values retrieved from the GPS were
not used in the data analysis; rather, the longitude and latitude obtained from the GIS
mapping were used. The corrected point themes were the locations of the outdoor samplers,
point sources, gas stations, and dry cleaning facilities (Figures 5 - 8). Approximate
receptor (outdoor sampler) locations are given for each residence in Figure 9 for
illustration purposes, to maintain the confidentiality of the subjects; the actual location
coordinates were used to determine proximity to sources.

-------
Measurement and Calculation of Geographical Data




       The locations of the residences, point sources, and area sources were determined
by the address-matching technique within ArcView on the corrected and quality assured




line files from Census 2000 TIGER/Line® as the reference theme using US streets with




zones. The spatial coordinates of the point themes, such as residences, point sources, gas




stations, and dry cleaning facilities were determined by "Addxycoo", a commonly used




ArcScript. The distances from point theme to point theme and from point
theme to line theme were measured with "the nearest features", an extension
available on ESRI's site for ArcScripts (http://arcscripts.esri.com).
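
The point-to-line measurement can be sketched in a few lines. The example below is not the ArcScript used in the study; it is a generic planar calculation that assumes the coordinates are already in a projected system (such as state plane feet), and the road geometry and home locations are hypothetical.

```python
import math


def point_to_segment_distance(px, py, ax, ay, bx, by):
    """Shortest planar distance from point P to the segment from A to B."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: A and B coincide.
        return math.hypot(px - ax, py - ay)
    # Project P onto the line through A and B, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy)


def point_to_polyline_distance(px, py, vertices):
    """Shortest distance from point P to a road centerline given as vertices."""
    return min(
        point_to_segment_distance(px, py, *a, *b)
        for a, b in zip(vertices, vertices[1:])
    )


# Hypothetical centerline and home locations in projected units.
road = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
distance_mid = point_to_polyline_distance(50.0, 30.0, road)
distance_end = point_to_polyline_distance(-30.0, 40.0, road)
```

Clamping the projection to the segment matters for homes that lie beyond the end of a road segment, where the nearest point is a vertex rather than a perpendicular foot.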

-------
Table 8. The List of Digital Orthoquarter Quadrangles Used in this Study for Quality
          Assurance (Source: NJ DEP)
 QQ Number             QQ Name
 514                    SE ROSELLE NJ
 521                    NW ELIZABETH NJ-NY
 522                    NE ELIZABETH NJ-NY
 523                    SW ELIZABETH NJ-NY
 524                    SE ELIZABETH NJ-NY

Table 9. The Functional Classification of Public Roadways in Urban Area (Source: NJ
          DOT)
 Functional Class         Description
 FC 11                  Urban Interstate Highways
 FC 12                  Urban Other Highways/Freeways
 FC 14                  Urban Major Arterial
 FC 16                  Urban Minor Arterial
 FC 17                  Urban Collector
 FC 19                  Urban Local

-------
Table 10. The Point Sources of Selected VOCs Used for Data Analysis (Source: 1999
NEI for HAPs version 3 Final)

PS ID       Emissions   Facility/Process                    X         Y
Xyl_PS1     1.95        Refinery                            -74.22    40.64
Xyl_PS2     0.94        Tanker Terminal                     -74.25    40.63
Xyl_PS3     0.91        Industry                            -74.19    40.67
Xyl_PS4     0.24        Aviation Service                    -74.17    40.70
Tol_PS1     4.05        Refinery                            -74.22    40.64
Tol_PS2     3.03        Tanker Terminal                     -74.25    40.63
Tol_PS3     2.14        Industry                            -74.19    40.69
Tol_PS4     0.50        Industry                            -74.22    40.63
Tol_PS5     0.38        Aviation Service                    -74.17    40.70
Bzn_PS1     4.55        Refinery                            -74.22    40.64
Bzn_PS2     1.73        Tanker Terminal                     -74.25    40.63
Bzn_PS3     0.20        Joint Meeting of Essex and Union    -74.20    40.64
Bzn_PS4     0.10        Aviation Service                    -74.17    40.70
Ebz_PS1     0.60        Refinery                            -74.22    40.64
Ebz_PS2     0.27        Industry                            -74.19    40.67
MTBE_PS1    43.50       Refinery                            -74.22    40.64
PCE_PS1     1.03        Refinery                            -74.22    40.64

PS ID: Point Source ID, Emissions are annual total generations in metric tons, X:
Longitude, Y: Latitude.

-------
[Figure 3: map; image not reproduced.]
Figure 3. The Location of Union County and City of Elizabeth in New Jersey

-------
[Figure 4: aerial image with a scale bar in kilometers; image not reproduced.]
Figure 4. Digital Image of Study Area, the City of Elizabeth, New Jersey (Source of DOQQs: NJ DEP, jpeg97)

-------
[Figure 5: map; image not reproduced. The legend distinguishes major roadways by
functional class (FC11, FC12, FC14); labeled features include Hillside Township, Newark
(Essex County), Newark Liberty International Airport, the Elizabeth Port Authority Marine
Terminal, and Staten Island.]
  Figure 5. Major Public Roadways in Study Area, the City of Elizabeth, New Jersey

-------
[Figure 6: map; image not reproduced.]
  Figure 6. Identified Gas Stations in Study Area, the City of Elizabeth, New Jersey
               (Source: HAZMAT List of Union County)

-------
[Figure 7: map; image not reproduced.]
  Figure 7. Identified Dry Cleaning Facilities in Study Area, the City of Elizabeth, New
               Jersey (Source: HAZMAT List of Union County)

-------
[Figure 8: map; image not reproduced. The legend sizes point sources by annual emissions:
<1 ton, 1 - 4.55 ton, and >4.55 ton.]
  Figure 8. Identified and Selected Point Sources of VOCs Studied in Study Area, the
               City of Elizabeth, New Jersey (Source: 1999 NEI for HAPs, Version 3
               Final)

-------
[Figure 9: map; image not reproduced.]
             Figure 9. Approximate Locations of Outdoor Samplers in the City of
             Elizabeth, New Jersey (Locations randomly shifted by small amount for
             illustration purpose to preserve subjects' confidentiality, Source: RIOPA
             Questionnaire Database, 2003)

-------
Statistical Analysis




Statistical Treatment of Data




       The SAS system for Windows (version 8.02) and SPSS for Windows (version
12.0) were used for all statistical analyses. The blank subtracted, temperature adjusted,
and uncensored ambient air concentrations (µg/m³) of the selected air toxics and PM2.5
were evaluated.




       The distributions of the residential ambient air concentrations were examined by




the one-sample Kolmogorov-Smirnov (K-S) test to evaluate their normality. Natural log-




transformation of the concentrations was performed because it provided distributions that




were closer to a normal distribution with more constant variance than the un-transformed




concentrations. Any zero values in the uncensored dataset were replaced with one half




the minimum detection limit prior to the statistical analysis.
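
The zero-substitution and log-transformation steps described above can be sketched as follows. This is a minimal illustration, not the SAS or SPSS code used in the study; the function name, the detection limit, and the concentrations are hypothetical.

```python
import math


def ln_transform(concentrations, min_detection_limit):
    """Natural-log transform after replacing zeros with half the detection limit."""
    substitute = 0.5 * min_detection_limit
    transformed = []
    for c in concentrations:
        # Only exact zeros are replaced, as described in the text.
        value = substitute if c == 0 else c
        transformed.append(math.log(value))
    return transformed


# Hypothetical concentrations (µg/m³) and detection limit.
example_values = [2.0, 0.0, 0.5]
example_ln = ln_transform(example_values, min_detection_limit=0.2)
```

The substitution is needed because the logarithm of zero is undefined; any negative value left after blank subtraction would still raise an error here and would need separate handling.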




       The sample means,  standard deviations, median,  percentiles, the minimum and




maximum values  for the variables  were  computed. The scatter plots  of residential




ambient air concentration and  each independent variable were examined for obvious




associations.




       Bivariate Pearson correlation coefficients and the significance of the statistics




were computed to examine the  correlations between the response variables and the




predictor variables for the purpose of preliminary  selection of the  more influential




explanatory predictors among the groups of candidate variables. Correlations among the
untransformed, ln-transformed, inverse, squared, and inverse-squared forms of the concentration
and predictor variables were examined.
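
A screening of this kind can be sketched as below. For brevity the example transforms only the predictor and leaves the concentration as supplied, which is a simplification of the procedure in the text; the helper names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Bivariate Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Candidate transformations named in the text.
TRANSFORMS = {
    "untransformed": lambda v: v,
    "ln": math.log,
    "inverse": lambda v: 1.0 / v,
    "squared": lambda v: v * v,
    "inverse squared": lambda v: 1.0 / (v * v),
}


def correlations_by_transform(concentration, predictor):
    """Correlate the concentration with each transformed form of a predictor."""
    return {
        name: pearson_r(concentration, [f(p) for p in predictor])
        for name, f in TRANSFORMS.items()
    }


# Hypothetical distances (km) and concentrations that fall off as 1/distance.
distances = [0.1, 0.2, 0.4, 0.8]
concentrations = [1.0 / d for d in distances]
screening = correlations_by_transform(concentrations, distances)
```

In this constructed example the inverse transformation gives a correlation of essentially 1, which is the kind of pattern the screening is meant to reveal for distance-type predictors.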




       Two samples were collected from most homes several months apart. On most
days one or two homes were visited, though occasionally three or four homes were sampled
on a single day. The mixed model procedure (PROC MIXED) in SAS was run with home identification
number and with date as the repeated measure to evaluate whether multiple samples at
the same location or on the same date affected the results. No effect was observed.









Multiple Linear Regression Analyses




       Multiple regression analysis  was used to examine the association between the




ambient air concentrations and the proximity and meteorological  variables. A multiple




linear regression equation that expresses the response variable as a linear combination of




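(p - 1) predictor variables, has the form:

              $$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i$$

              where:

              $Y_i$ is the response in the $i$th trial

              $\beta_0, \beta_1, \ldots, \beta_{p-1}$ are the parameters (regression coefficients)

              $X_{i1}, X_{i2}, \ldots, X_{i,p-1}$ are the values of the predictor variables

              $\varepsilon_i$ is the error term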
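
The same model can be restated in matrix form, which is how regression software typically solves it. This is a standard textbook restatement rather than a formula from the report; it assumes $\mathbf{X}$ is the $n \times p$ design matrix whose first column is all ones, and that the response is the ln-transformed concentration described earlier.

```latex
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}

% With a ln-transformed response, a one-unit increase in predictor X_k
% (all other predictors held fixed) multiplies the predicted concentration C by:
\frac{C(X_k + 1)}{C(X_k)} = e^{\beta_k}
```

Under that assumption, a coefficient of, for example, $\beta_k = -0.1$ on a distance predictor would correspond to roughly a 10% lower predicted concentration per unit of distance, since $e^{-0.1} \approx 0.905$.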





       This  equation  assumes  that  the  relationship  of independent  variables with




response variable is linear, and that the distribution of error terms is normal with equal




variance. Two of the explanatory variable groups considered  important for predicting




residential  ambient air concentrations of the  selected VOCs were the  proximity of a




residence to the emission sources and the  corresponding meteorological  conditions.




Distances from residences to mobile, area, and point emission sources identified from the




emission inventories, wind speed, atmospheric  stability,  mixing height,  temperature,







-------
relative humidity, precipitation, and atmospheric pressure were used as the independent




variables.




       Selection of the predictors associated with elevated ambient air concentrations
around residences was examined using several multiple linear regression
methods (forward selection, backward elimination, stepwise selection, R-square, and
maximum R² improvement) to verify that consistent results were obtained
independent of the type of regression model used. Final models were determined using




stepwise selection.  The default criteria of each  method in the SAS  program (version




8.02) were used for  selecting variables to be included  in  the  resulting model. The




parameter  selection  criteria  used for forward selection,  backward  elimination, and




stepwise selection were p<0.50, p<0.10, and p<0.15,  respectively. Due to the different




levels in selection criteria, the number of predictors included in resulting models differed.




The models selected by the different selection methods were compared and evaluated by




the p values of parameter  estimates of  predictor variables and the composition  of




variables in the model. When the best-fitting model was selected for a VOC compound,




the model  and  the corresponding statistics were also evaluated.  The equality of error




variances of the best-fitting model was visually examined on the appropriate diagnostic




plots and statistics computed.  See Appendix A for discussion of multicollinearity which




was identified among several meteorological variables.









Identification and Tests of Outlying Observations




       Details on how outliers were determined are given in Appendix A. Standardized
residuals were examined against a criterion of tinv(0.95, n-p-1) = ±1.654, based on a
minimum of 170 degrees of freedom, to determine whether a value was a statistical outlier.
The presence of outliers suggests that other processes, not accounted for by the independent
variables selected, were contributing to the concentration or that there was analytical
uncertainty in the measurement. The final model chosen excluded those values (which
were <10% of the measurements) to determine the strength of the model for the data that
could be predicted, as the focus of this analysis was to establish how proximity affected
concentration. A separate analysis could be informative to indicate why outliers to the
regression analysis exist. The actual degrees of freedom for each compound were as
follows: m,p-xylene (171); o-xylene (174); toluene (174); benzene (175); ethylbenzene
(171); MTBE (169); and PCE (161).
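
The screening rule can be illustrated with a short sketch. It takes the critical value of 1.654 directly from the text rather than recomputing it, and the residuals shown are hypothetical.

```python
# Critical value quoted in the text for tinv(0.95, n-p-1) with 170 degrees of freedom.
T_CRITICAL = 1.654


def flag_outliers(standardized_residuals, critical=T_CRITICAL):
    """Return True for each residual whose magnitude exceeds the critical value."""
    return [abs(r) > critical for r in standardized_residuals]


def outlier_fraction(standardized_residuals, critical=T_CRITICAL):
    """Fraction of observations flagged as outliers."""
    flags = flag_outliers(standardized_residuals, critical)
    return sum(flags) / len(flags)


# Hypothetical standardized residuals from a fitted model.
example_residuals = [0.2, -1.7, 1.654, 2.3, -0.9]
example_flags = flag_outliers(example_residuals)
example_fraction = outlier_fraction(example_residuals)
```

The comparison is strict, so a residual exactly at the critical value is retained; the report does not state how ties were treated, so that choice is an assumption.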




       To test whether removing the outliers from the multiple regression model biased the
model outcome, analysis of variance (ANOVA) tests were used to compare the means of the
predictor (independent) variables between the group of outliers and the group of
non-outliers. These tests were used to verify that removing the outlying
observations did not eliminate specific conditions or situations. The regression model
was then run excluding the outliers
to obtain the final, best-fit equation for each compound.









Diagnostics of Unequal Error Variances and Multicollinearity




       To test the assumption of equal error variance, the heteroscedasticity of the
parameter estimates was tested to determine whether the error variance was constant
over all cases (Neter et al., 1996). The null hypothesis for this test is that the errors are
homoscedastic and independent of the predictors. Therefore, equal error variance was
assumed in the best-fitting model when the probability (p) of the chi-square test was
greater than 0.05.




       The multicollinearity, which results from linear interactions between the predictor




variables, was tested because codependency might be detrimental when interpreting the




resulting regression model.  First, the  bivariate  Pearson correlations between pairs of




predictors included in the final models were  examined to identify  the highly correlated




pairs of the predictors.  Second, the magnitude  of variance inflation factor (VIF) was




examined to determine  if it was greater than 10. Third, the  condition index  and




eigenvalue were examined from the collinearity diagnostics. A condition index  greater




than  100   and  an  eigenvalue   smaller  than  0.01   was  considered  evidence  of




multicollinearity in the  model since  those values indicate the presence  of highly




correlated variables when the proportion of variation is greater than 0.5.
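
The variance inflation factor check can be illustrated for the simplest case of two predictors, where the VIF reduces to 1 / (1 - r²) with r the Pearson correlation between them. This is a sketch of the definition only; with more predictors each VIF comes from regressing one predictor on all the others. The variable names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Bivariate Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values (for example, temperature and mixing height ranks).
predictor_a = [1.0, 2.0, 3.0, 4.0, 5.0]
predictor_b = [2.0, 1.0, 4.0, 3.0, 5.0]
example_vif = vif_two_predictors(predictor_a, predictor_b)
exceeds_threshold = example_vif > 10  # threshold used in the text
```

Here the correlation is 0.8, giving a VIF of about 2.8, well below the threshold of 10; a pairwise correlation of roughly 0.95 or more would be needed to exceed it.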









Use of Dummy Variables (for Seasonality)




         Three indicator (dummy) variables were introduced to the finalized
best-fitting models of selected VOCs. To avoid a less-than-full-rank model, dummy
variables for spring, summer, and fall were generated by assigning 1 for the season of the
sampled date and 0 for the other seasons. Winter was therefore
defined by all three indicator variables being zero.
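
The coding scheme can be sketched as follows. The report does not state how dates were assigned to seasons, so the month-based (meteorological) season boundaries used here are an assumption, and the function names are hypothetical.

```python
from datetime import date


def season_of(sample_date):
    """Assign a season by calendar month (assumed meteorological seasons)."""
    month = sample_date.month
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    if month in (9, 10, 11):
        return "fall"
    return "winter"


def season_dummies(sample_date):
    """Three indicator variables; winter is the reference (all zeros)."""
    season = season_of(sample_date)
    return {
        "spring": int(season == "spring"),
        "summer": int(season == "summer"),
        "fall": int(season == "fall"),
    }


# Hypothetical sampling dates.
winter_row = season_dummies(date(2000, 1, 15))
summer_row = season_dummies(date(2000, 7, 4))
```

Using three indicators for four seasons keeps the design matrix at full rank: a fourth indicator would sum with the others to the intercept column.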

-------
RESULTS




Dataset Extraction for Data Analysis




        The  RIOPA  database  was  integrated with source  emission  inventory  and




meteorological information to provide datasets for statistical data analyses that contained




accurate proximity information of emission sources of each sample with corresponding




meteorological  conditions for  each 48-hour  sampling  period. The  blank  subtracted,




temperature adjusted,  uncensored  residential ambient  air concentrations  of  selected




VOCs: m,p-xylene, o-xylene, toluene, benzene, ethylbenzene, and MTBE, with carbon
tetrachloride and PCE (as control compounds); aldehydes: formaldehyde, acetaldehyde,
and acrolein; and particulate matter: PM2.5, elemental carbon, organic carbon, and two
PAHs were examined. The distances from residences to identified mobile, area, and point
sources were determined, as were the averages of the meteorological variables for each time
period in which a sample was collected.









Descriptive Statistics




       The sample means, standard deviations, medians, percentiles, and maximum
values for the concentrations (µg/m³) of the selected target compounds measured in
residential ambient air are listed in Table 11. The sample means, standard deviations,
medians, percentiles, and minimum and maximum values of the closest distances from the
location of the RIOPA sampler to the public roadways by functional class and by
roadway name are listed in Tables 12 and 13, respectively. The sample means, standard
deviations, medians, percentiles, and minimum and maximum values of the distances from the
sampler to the closest area and point sources are listed in Table 14. The sample means,









-------
Table 11. Concentrations of Selected VOCs in Residential Ambient Air (µg/m3, N=183)

                               Standard    Percentiles                           NJ Urban Comparison
Compounds               Mean   Deviation   25      50      75      90   Maximum  Concentration (A)
m,p-Xylene              3.25   4.29        1.51    2.37    3.97    6.44   51.21   2.6
o-Xylene                1.71   6.51        0.59    0.94    1.38    2.16   80.98   1.2
Toluene                 6.82   5.83        2.59    4.83    9.36   14.67   32.88   5.7
Benzene                 1.50   1.54        0.69    1.22    1.90    2.68   18.06   0.62
Ethylbenzene            1.34   2.74        0.46    0.99    1.74    2.51   36.24   0.92
MTBE                    5.75   5.34        2.23    4.35    7.51   12.13   27.17   6.83
Tetrachloroethylene     1.10   3.09        0.50    0.74    1.11    1.50   41.82   0.40
Carbon Tetrachloride    0.84   2.28        0.48    0.69    0.81    0.94   39.1    0.09
Formaldehyde            6.35   2.81        2.71    7.09    8.29    9.33   10.7    2.3
Acetaldehyde            8.88   6.50        3.05    7.86   10.2    14.6    38.7    1.1
Acrolein                0.89   1.29        0.13    0.39    0.78    1.69    6.21   -
PM2.5 Mass             20.4   10.7        13.8    18.2    25.5    30.9    71.7   15.8
Elemental Carbon        1.36   0.64        0.92    1.29    1.72    1.96    3.51   -
Organic Carbon          3.33   1.73        2.07    3.00    4.00    5.61    9.46   -

A: NJDEP mean concentrations reported in Elizabeth, NJ, 2001
(www.state.nj.us/dep/airmon/toxics01.pdf)

-------
Table 12. The Closest Distances from Sampler Location to the Public Roadways by
Functional Classes (km, N=183)

                 Standard               Percentiles
Roads    Mean    Deviation   Minimum    25      50      75      Maximum
FC11     1.53    1.05        0.04       0.68    1.33    2.28    3.70
FC12     2.53    1.16        0.02       1.47    2.87    3.44    5.58
FC14     0.50    0.54        0.01       0.11    0.33    0.65    2.49
FC16     0.19    0.17        0.01       0.07    0.13    0.32    0.78
FC17     0.29    0.22        0.02       0.11    0.25    0.39    0.97
FC19     0.03    0.02        0.00       0.02    0.03    0.04    0.13

Table 13. The Closest Distances from Sampler Location to Individual Public Roadways (km,
N=183)

                 Standard               Percentiles
Roads    Mean    Deviation   Minimum    25      50      75      Maximum
I95 a    1.89    1.23        0.05       0.86    1.73    2.88    5.33
Rt1 b    1.10    0.83        0.03       0.42    0.93    1.72    3.62
Rt27 b   1.23    0.83        0.04       0.50    1.02    1.85    3.40
Rt28 b   1.54    0.87        0.10       0.88    1.51    2.08    3.59
Rt439 b  0.93    0.81        0.01       0.24    0.61    1.61    2.86

a: Interstate (FC11), b: Major Arterial (FC14)

-------
Table 14. The Closest Distances from Sampler Location to Area Sources and Point
Sources Likely to Impact Elizabeth, NJ (km, N=183)

                                             Standard               Percentiles
Emission Sources                     Mean    Deviation   Minimum    25      50      75      Maximum
Gas Station                          0.36    0.21        0.03       0.22    0.36    0.49    1.01
Dry Cleaning Facilities              0.55    0.39        0.06       0.25    0.43    0.77    1.69
Refinery a                           2.98    1.12        0.84       2.06    3.07    3.77    5.76
Tanker Terminal b                    4.78    1.14        3.23       3.78    4.58    5.72    7.69
Industry c                           2.51    1.00        0.62       1.75    2.60    3.26    5.63
Aviation Service d                   5.92    1.15        2.80       5.03    6.18    6.91    8.63
Industry e                           4.19    1.11        0.81       3.50    4.43    5.08    6.64
Industry f                           3.27    1.14        0.99       2.46    3.24    3.98    6.04
Joint Meeting of Essex and Union g   2.77    1.20        0.40       1.91    2.36    3.80    5.81

a: Refinery = Xyl_PS1, Tol_PS1, Bzn_PS1, Ebz_PS1, MTBE_PS1, PCE_PS1; b: Tanker
Terminal = Xyl_PS2, Tol_PS2, Bzn_PS2; c: Industry = Xyl_PS3, Ebz_PS2; d: Aviation
Service = Xyl_PS4, Tol_PS5, Bzn_PS4; e: Industry = Tol_PS3; f: Industry = Tol_PS4; g:
Joint Meeting of Essex and Union = Bzn_PS3
Table 15. The Meteorological Variables (N=183)

                                         Standard                Percentiles
Variable, Unit                  Mean     Deviation   Minimum     25       50       75       Maximum
Temperature, K                  284.2    8.0         265.5       279.4    284.6    289.9    303.3
Wind Speed, m/s                 4.3      1.1         1.9         3.6      4.4      5.1      8.0
Relative Humidity, %            66.3     12.6        42.7        58.1     66.4     75.9     91.8
Atmospheric Pressure, mmHg      762.3    4.5         750.3       759.6    761.6    765.5    773.1
Precipitation, mm               0.0207   0.0249      0.000       0.000    0.010    0.040    0.130
Mixing Heights, km              1.027    0.362       0.414       0.767    0.948    1.214    2.099
Pasquill Stability Class        5.028    0.444       3.867       4.706    5.000    5.300    6.063




-------
standard deviations, median, percentiles, the minimum and maximum  values  of the




meteorological condition variables are listed in Table 15.
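The summary statistics reported in Tables 11 to 15 can be illustrated with a short sketch. The percentile convention used in the report is not stated, so linear interpolation between order statistics is assumed here; the function names and the example distances are hypothetical and are not study data.

```python
import statistics


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics.

    The interpolation rule is an assumption; the report does not state one.
    """
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def describe(values):
    """Mean, sample standard deviation, minimum, quartiles and maximum."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "p25": percentile(values, 25),
        "p50": percentile(values, 50),
        "p75": percentile(values, 75),
        "max": max(values),
    }


# Made-up distances in km, for illustration only.
example_km = [0.1, 0.5, 1.0, 2.0, 4.0]
summary = describe(example_km)
print(summary)
```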
DISCUSSION




Prior to establishing the best-fit linear regression equations for each compound, bivariate




Pearson Correlations were conducted to guide the inclusion of different variables and




examine associations among the variables.  The ln-transformed concentration data were




used since the concentration  distribution was  consistent with a log normal distribution




and linear regression analysis assumes a normal distribution for the dependent variable.




The inverse distance was used since concentration declines inversely with distance from line sources,




such as roadways, or as the square of the inverse for point sources based on an idealized




Gaussian dispersion.  The square of the inverse distance was also examined, but no




differences in results were observed, so only the inverse distance was retained in the final




mathematical models.  As described in the Methods section and Appendix A, outliers were




identified  for the regression  model  calculated from the entire data set and a second




regression model was determined after eliminating the outliers.  The variables selected in




the model were examined for multicollinearity.  Details for the  model evaluated for each




compound are given in  Appendix A.
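The transformation and screening steps described above can be sketched for a single proximity variable. This is a simplified illustration, assuming one predictor (the inverse distance) and synthetic data generated from a known relationship; the study models used several predictors and a stepwise selection that is not reproduced here. All names and numbers in the block are hypothetical.

```python
import math


def pearson_r(x, y):
    """Bivariate Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_ln_conc_on_inverse_distance(dist_m, conc):
    """Least-squares fit of ln(C) = b0 + b1 * (1/d); returns (b0, b1)."""
    x = [1.0 / d for d in dist_m]
    y = [math.log(c) for c in conc]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    b0 = my - b1 * mx
    return b0, b1


# Synthetic data generated from ln(C) = 0.5 + 10/d (not study data).
dist_m = [20.0, 50.0, 100.0, 200.0, 500.0]
conc = [math.exp(0.5 + 10.0 / d) for d in dist_m]

b0, b1 = fit_ln_conc_on_inverse_distance(dist_m, conc)
r = pearson_r([1.0 / d for d in dist_m], [math.log(c) for c in conc])
print(b0, b1, r)
```

Because the synthetic data follow the assumed form exactly, the fit recovers the generating coefficients; real data would scatter around the fitted line.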









Test of Outlying Observations




       To verify that the outlying observations were not eliminated based on  specific




conditions, the ANOVA tests were performed on the means of the predictor variables









-------
between the group of outliers and the group of non-outliers. Duncan's multiple range test results




indicated  that means of predictor variables were not significantly different between




groups of outliers and non-outliers for selected VOCs. The frequency of outliers removed




is listed by season in Table  16.
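The comparison of predictor means between outliers and non-outliers rests on a one-way ANOVA F statistic, which can be sketched as below. This is an assumption-laden illustration: it computes only the F ratio and its degrees of freedom, it does not implement Duncan's multiple range test, and the group values are invented.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented wind speeds (m/s) for an outlier group and a non-outlier group.
outliers = [1.0, 2.0, 3.0]
non_outliers = [2.0, 3.0, 4.0]
f_stat, df_b, df_w = one_way_anova_f([outliers, non_outliers])
print(f_stat, df_b, df_w)
```

A small F relative to the critical value for the given degrees of freedom would indicate that the predictor means do not differ between the two groups.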









Model Summaries




       The relative  contribution  to  residential  ambient  air  concentrations  due  to




proximity to ambient sources on the selected air toxics and PM2.5  with  corresponding




meteorological conditions were determined by multiple linear regression analyses (Table




17).  The F statistics were significant for the overall models except carbon tetrachloride




(p<0.0001). Probabilities for parameter estimates were more significant and the r2 larger




for the meteorological variables than the proximity variables. This implies that a greater




percentage of the  explanatory power of the  regression equations for these compounds




was due to changes in the meteorological conditions than to the distance to a source (see




below). There were some  interactions between the predictor variables in the best-fitting




model,  especially  between the meteorological  variables.  The  model coefficients  of




determination for the compounds that included proximity predictors varied between 0.16




and 0.47 (Table  17).  The samples and meteorological data were averaged  over 48 hours,




reducing the possibility of  accounting for shorter term variability that could alter the air




concentrations.




       Among the variables associated with  proximity to mobile source  emissions, the




inverse distance to major urban arterial roadways (FC14) was selected as a significant




predictor  in best-fitting models of residential ambient  air concentrations of all of the









-------
aromatic compounds and the inverse distance to the NJ Turnpike (FC11) for  PM2.5,




organic carbon and the two individual PAHs examined (coronene and benzo[ghi]perylene).




The inverse distance to the closest gas station was included as a predictor in the models




of residential ambient air concentrations of m,p-xylene, o-xylene, benzene, and MTBE.




The inverse distance to areas in Elizabeth that had high truck traffic that included loading




and unloading and therefore idling  trucks  was included in  the models for PM2.5 and




elemental carbon while the inverse distance to the refinery in Linden, NJ was included in




the regression equation for elemental carbon.  The inverse distance  to the closest dry




cleaning facility was selected  as a  significant  predictor variable  in the model  of




residential  ambient air concentration of PCE in Elizabeth, NJ.  No variables associated




with the inverse distance to sources were identified for the three aldehyde  compounds.




Nor were any of the proximity factors included for the control compounds that did not have




mobile source emissions, carbon tetrachloride, tetrachloroethylene, particulate sulfur and




particulate selenium.




       Among the meteorological condition variables, atmospheric stability, mixing




height, temperature, wind speed, and relative humidity were significantly associated with




one or more of the residential ambient air concentrations. The atmospheric stability and




temperature were consistently included as statistically significant predictors in  the best-




fitting models of  the  aromatic compounds, MTBE and  the particulate species, while




mixing  height was selected for acrolein.  Atmospheric stability is calculated based on




mixing height and temperature. Wind speed was included with a negative coefficient in




most models.

-------
       A consistency in the parameter estimates of the proximity variables is observed




among the aromatic compounds.  The order of mobile emission strength in Elizabeth, NJ




(Table 2) is toluene, xylenes (m,p xylenes is greater than o xylene), benzene and ethyl




benzene, the same order as the magnitude of the coefficients in the regression equation,




though the sum of the coefficients of o and m/p xylene exceeds that of toluene.  The order




of the coefficients for proximity to gasoline stations (GS^-1) is MTBE, m,p xylene,




benzene and o xylene, with GS^-1 not included in the regression equation for toluene.




MTBE is the compound with the highest concentration in gasoline and has the highest




vapor pressure (0.309 atm) of the VOCs studied. The next most prevalent compound of




those that included GS^-1 is m,p xylene.  Lastly, while o xylene might be at a higher




concentration than benzene in gasoline, benzene has a higher vapor pressure (0.125 atm)




than the xylenes or ethyl benzene (0.0109-0.0125 atm).  Thus, the parameter coefficient




order is consistent with the abundance of these compounds in gasoline as modified by the




vapor pressure. It is unclear why proximity to gas  stations was not included in toluene's




regression equation since its concentration in gasoline is second only to MTBE and  its




vapor pressure is between that of benzene and m,p xylene. The lack of inclusion of GS^-1 in




the regression equation for ethyl benzene might reflect its lower concentration in




gasoline and the lower air concentration with more values being below detection.  The




regression equation for MTBE did not include distance from arterial roadways, while the




aromatic compounds did, but did include distance from the interstate highway.  These




differences may  reflect the  more efficient  combustion and removal in the catalytic




converter of MTBE compared to the aromatic compounds and lower tailpipe  emissions




along with a (Poulopoulos and Philippopoulos 2003).









-------
         Two polyaromatic hydrocarbons (PAHs) measured in the PM2.5, coronene and




benzo[ghi]perylene, were evaluated for effects of proximity to mobile sources. Coronene




has been used as an index PAH compound to differentiate between gasoline and diesel




vehicles because coronene is found in emissions from gasoline powered vehicles, but has




not been detected in diesel emissions (Rogge et al, 1993). Benzo[ghi]perylene is present




in both diesel vehicle and gasoline vehicle exhaust, so should be an individual compound




representative of mobile source emissions (Harrison et al. 1996).  It should be noted that




these compounds are also emitted from other combustion sources, so may not be solely




from mobile sources. Both compounds included the inverse distance to FC11 as well as




atmospheric stability, temperature and wind speed in the regression equation (Table 17).




The elemental carbon was associated with FC14. FC14 has three (Rt 1/9, Rt 27, Rt 439)




roadways that are major truck thoroughfares. A number of the homes in the study are very




close to FC14, and FC14 was the mobile source area associated with the aromatic




hydrocarbons. No clear difference in what sources contributed to PM2.5 mass and the two




individual PAHs that may be markers of mobile sources was identified.  The weakest




associations were observed for organic carbon, which is expected to have more sources




besides  combustion and diesel emissions than the  other components of PM.  All PM




components were influenced by a variety of meteorological factors,  proximity to the NJ




Turnpike, major arterial roads and/or truck loading/unloading areas.




      As a check on the possibility that there was an inherent bias in the sampling or




analyses  that caused the associations between the  proximity to mobile sources in the




regression equations, two volatile compounds without mobile sources, carbon tetrachloride




and tetrachloroethene, were evaluated.  Carbon tetrachloride has few industrial or









-------
commercial uses and therefore minimal sources in Elizabeth, NJ, while tetrachloroethene




is  the primary solvent  used in the  dry  cleaning industry.   No  parameters,  neither




proximity nor meteorological variables, were included in the regression equations for




carbon tetrachloride at a p<0.5 criterion, indicating that neither local sources nor distance to




roadways influenced the  variability in its measured concentration. This is consistent with




the lack of local  sources.   Meteorological  variables were included in the regression




equation for tetrachloroethene, but not proximity to roadways or gasoline stations. Since




tetrachloroethene is used in dry cleaning, the distance between the sampling locations and




dry cleaning facilities was determined and evaluated in the regression equation.  The final




regression equation for tetrachloroethene included the inverse distance to dry cleaning




facilities, atmospheric stability,  temperature, wind speed and relative humidity with




similar partial r2 and coefficients for the meteorological variables to those identified for the regression




equations for the compounds derived from mobile sources (Table 17).




       To evaluate whether all  particulate components  might show associations with




mobile sources, selenium, an element measured in PM2.5 that is not expected to be




associated with mobile sources, was also examined.  Its regression equation did not




include proximity to mobile sources.









Effects of Source Proximity




       The common  interpretation of a regression coefficient is that  it  estimates  the




change in the response variable per unit increase in the predictor variable. This estimation




has limitations when  the predictor variables are  seriously intercorrelated. When highly




correlated  predictor variables vary  together, the  magnitude  of the outcome  variable









-------
change  with a single predictor variable is altered.  Since the multicollinearity in the




models was not serious enough to require immediate remedial measures (Appendix A), based on the




previous diagnostics, the effect of individual parameter estimates on the concentrations




was evaluated by holding all the variables constant except for the variable being




evaluated. This approach allows for the model to be evaluated for the effect of a  single




variable across its range of values when considering all other variables to be constant. It is




a type of sensitivity analysis. The assigned constant values used were the median value




for the meteorological variables and the maximum value for the distance.  The maximum




value of the distance was used since the smallest changes in concentrations with distance




would be expected at the furthest distance from the source. Plots of the predicted air




concentration  with distance for  each  aromatic compound,  MTBE  and the  PM2.5




components derived from the best fit regression equation are given in figures 11-19.
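The sensitivity analysis described above can be sketched as follows. The model form ln(C) = b0 + b_road/d_road + b_gs/d_gs + meteorological terms follows the text, but every coefficient in COEF is a hypothetical placeholder rather than a fitted value. The meteorological medians and the maximum gas station distance (1.01 km) are taken from Tables 15 and 14; the function names are illustrative.

```python
import math

# Hypothetical coefficients (placeholders, not fitted values from the study).
COEF = {"b0": 5.0, "road": 8.0, "gs": 17.0, "stab": 0.5, "temp": -0.02, "wind": -0.1}

# Median stability class, temperature (K) and wind speed (m/s) from Table 15.
MET_MEDIAN = {"stab": 5.0, "temp": 284.6, "wind": 4.4}


def predict_conc(d_road_m, d_gs_m, met=MET_MEDIAN, coef=COEF):
    """Predicted concentration from a model fitted on ln(C)."""
    ln_c = (coef["b0"]
            + coef["road"] / d_road_m
            + coef["gs"] / d_gs_m
            + coef["stab"] * met["stab"]
            + coef["temp"] * met["temp"]
            + coef["wind"] * met["wind"])
    return math.exp(ln_c)


def sweep_road_distance(distances_m, d_gs_max_m):
    """Vary roadway distance only; hold the other distance at its maximum."""
    return [(d, predict_conc(d, d_gs_max_m)) for d in distances_m]


# Maximum gas station distance from Table 14 is 1.01 km, i.e. 1010 m.
curve = sweep_road_distance([20, 50, 100, 150, 200, 250], d_gs_max_m=1010.0)
for d, c in curve:
    print(d, round(c, 3))
```

With a positive roadway coefficient the predicted concentration falls steeply over the first tens of meters and flattens beyond roughly 200 m, which is the shape the text describes.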




       The  shape  of the  decline with distance  follows an exponential form  since the




regression equations included distance  as  an inverse  term and the concentration was




expressed as its natural logarithm. For the roadways (both FC11 and FC14) and




gasoline stations, the  decline in predicted concentration is rapid during the first  200m




with little change  due to roadways after that distance. The magnitude of this change




between 20 meters, the distance of the closest samples, and 200 meters was a factor of




approximately two for the PM2.5 constituents to four  for the  aromatic  compounds and




MTBE.  The scatter plots of concentration with distance for the actual data (figure 20 to




45) are  consistent  with the rapid falloff in concentration with  distance predicted by the




regression equations, although the falloff may be at a slightly further distance and the




changes  in concentrations appear to be  small.  The predicted effect for the PM area









-------
sources of truck loading and unloading for PM2.5 and elemental carbon is over a longer




distance than the roadways.  This is probably a statistical artifact of the multiple area




sources associated with truck loading  and unloading that are all to the east/north east of




Elizabeth and the difficulty in assigning the appropriate distance to the site since it covers




a large area (figures 46 and 47).  The FC19 roadways, which are small local roads, show a




maximum effect on PM2.5 mass at 10 to 20 meters.  This is suspect as a true mobile




source of PM2.5 from streets in this category is minimal since the traffic is very light with




little if any truck traffic.  The scatter plot suggests that only a few data points are




responsible for the observation, so it may be a statistical association with other causes.  All




homes will be close to roadway FC19, even if they are near major roadways as well, as is




evident from the maximum distance to a roadway classified as FC19 for any home being less




than 100 meters.
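The factor-of-two-to-four change between 20 and 200 meters discussed in this section follows directly from the model form: when ln(C) contains a term beta/d and everything else is held fixed, the ratio of concentrations at two distances is exp(beta * (1/d_near - 1/d_far)). The sketch below evaluates that ratio. The value 14.7 is read from the FC14^-1 entry that appears to belong to toluene in Table 17 (distance in meters); treating it as the only distance term is a simplifying assumption, and the function name is hypothetical.

```python
import math


def concentration_ratio(beta, d_near_m, d_far_m):
    """C(d_near) / C(d_far) when ln(C) includes beta/d and all else is fixed."""
    return math.exp(beta * (1.0 / d_near_m - 1.0 / d_far_m))


# Inverse-distance coefficient as read from Table 17 (assumed to be toluene).
ratio_toluene = concentration_ratio(14.7, 20.0, 200.0)
print(round(ratio_toluene, 2))
```

For this coefficient the ratio comes out at a little under two; compounds whose equations also include a gasoline station term would show a larger combined factor for homes close to both source types.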









       One meteorological parameter that  we could not adequately incorporate in the




models was wind direction. Several different approaches to examine wind direction were




tried, including categorizing wind direction based on the amount of variability in




wind direction during the sampling period as  well as  evaluation  of the dominant




direction.  However,  the micrometeorology around the sample  sites  could not be




definitively represented by the meteorological station at Newark Airport since directional




changes are expected around buildings and  roadways.  Thus, the effect of wind direction




could not be adequately represented  in  the model  and therefore final models did not




include that term.

-------
       The meteorological variables contributed more to the explanatory power of the




regression equations than the proximity variables.  One possible reason for this is that there




were more homes  sampled at distances greater than  200  meters, than closer than 200




meters, the distance with the maximum predicted effect of the roadway.  If only homes




within 200 m of a major roadway were included in the study it is possible that the effect




of proximity would be  stronger.  The regression equations  suggest that the effect of




distance due to mobile sources is minimal for homes further than 200 meters from major




roadways and that concentration changes near homes more than 200 meters from




roadways or gasoline stations would be dependent upon meteorology which controls the




urban background levels for constant emission  sources  within an urban center and




transport of pollutants from outside Elizabeth, NJ. Exploratory analyses of only homes




within 200 meters and homes within 500 meters of FC11, the NJ Turnpike, suggest that




inverse distance to the NJ Turnpike was a potential predictor variable, but the term FC11




did not reach statistical significance at p<0.15 for the aromatic compounds, probably due to




the small n in that sub-sample of homes.









       None  of the regression models for the three aldehyde  compounds  studied,




formaldehyde, acetaldehyde and acrolein,  included the inverse  distance to any of the




mobile source proximity terms, even though they are exhaust emission products.  The




positive association with the distance to FC11  roadways for formaldehyde implies that




roadways  are not a source of formaldehyde.  It is more likely that it is a result of an




association among FC11 distance, formaldehyde concentration and a third variable not




evaluated. It is possible though that photochemical production of formaldehyde increases









-------
with distance from roadway in a manner similar to ozone, which is higher away from




roadways than directly adjacent to roadways, as there is a time component to its




maximum concentration. To attempt to evaluate whether the effect of proximity to




roadways could be observed in the absence of photochemistry, regression equations were




also determined for data when the mean temperature during sampling was <10°C, days




when photochemistry is expected to be minimal.  Again, only meteorological variables




were included in the regression equations for formaldehyde and acetaldehyde with a




p<0.15 (Table 17). These analyses had a smaller sample size and so had less statistical power to identify




an association.

-------
Summary




       Mobile sources (cars, trucks and gasoline stations) are a main source for aromatic




hydrocarbons, methyl tert butyl ether, and PM2.5, elemental carbon and selected PAHs in




Elizabeth, NJ. Meteorological factors, in particular atmospheric stability, wind speed and




temperature, were statistical predictors of the overall concentration of these pollutants in




the ambient air surrounding homes in the area.  The air concentrations at homes that were




very close to roadways and gasoline stations, within 200 to 500 meters, were inversely




related to the distance to those sources.  Increases in the concentrations for the closest




residences are predicted to be factors of two to four above what might be considered the




background levels for the area. Area sources that were associated with truck activity or




possibly other mobile sources (airport or shipping terminal) also appear to increase the




PM levels associated with diesel  emissions.  These increases in ambient air for homes




near ambient sources  could  potentially  result in  corresponding increases in personal




exposure  for  individuals living  in  homes  without smokers   since the  ambient air




surrounding homes penetrates into the home and  a strong association has been found




between  ambient air concentrations outside a portion of the homes studied during the




RIOPA study with both indoor  and personal  air for these compounds  (examples in




Figures 48 to 50 - from Weisel et al. 2004b).

-------
 Table 16. Frequency of Non-detects, Outliers, and Non-outliers by Season

m,p-Xylene      Non Detects       Outliers          Non-Outliers
Season          N      %          N      %          N      %
Fall            1      25         4      30.77      51     30.72
Spring                            1      7.69       36     21.69
Summer          1      25         6      46.15      46     27.71
Winter          2      50         2      15.38      33     19.88

o-Xylene        Non Detects       Outliers          Non-Outliers
Season          N      %          N      %          N      %
Fall                              4      26.67      52     31.14
Spring                            1      6.67       36     21.56
Summer          1      100        4      26.67      48     28.74
Winter                            6      40         31     18.56

Toluene         Non Detects       Outliers          Non-Outliers
Season          N      %          N      %          N      %
Fall                              1      6.67       55     33.33
Spring                            7      46.67      30     18.18
Summer          3      100        4      26.67      46     27.88
Winter                            3      20         34     20.61

Benzene         Non Detects       Outliers          Non-Outliers
Season          N      %          N      %          N      %
Fall                              5      27.78      51     31.1
Spring                            1      5.56       36     21.95
Summer                            5      27.78      48     29.27
Winter          1      100        7      38.89      29     17.68

Ethylbenzene    Non Detects       Outliers          Non-Outliers
Season          N      %          N      %          N      %
Fall            3      42.86      5      29.41      48     30.19
Spring                            1      5.88       36     22.64
Summer          3      42.86      7      41.18      43     27.04
Winter          1      14.29      4      23.53      32     20.13

MTBE            Non Detects       Outliers          Non-Outliers
Season          N      %          N      %          N      %
Fall                              6      21.43      50     34.01
Spring          6      75         7      25         24     16.33
Summer          1      12.5       10     35.71      42     28.57
Winter          1      12.5       5      17.86      31     21.09

PCE             Non Detects       Outliers          Non-Outliers
Season          N      %          N      %          N      %
Fall            5      33.33      3      23.08      48     30.97
Spring          1      6.67       5      38.46      31     20
Summer                            3      23.08      50     32.26
Winter          9      60         2      15.38      26     16.77

-------
Table 17.  Summary of Finalized Best-fitting Models of Selected VOCs (p<0.15 used as
criterion for inclusion)
Pollutant
Total r2
m,p-Xylene
0.33

o-Xylene
0.42

Toluene
0.31

Benzene
0.41

Ethylbenzene
0.16

MTBE
0.25

PERC
0.31

Formaldehyde
0.15

Acetaldehyde
0.13

Acrolein
0.046

Row
Heading
X,
Pi(SE)
p_r2
x,
Pi(SE)
P-r2
x,
Pi(SE)
P-r2
x,
P,(SE)
P-r2
X,
P,(SE)
P-r2
X,
Pi(SE)
P-r2
X,
P i(SE)
P-r2
X,
Pi(SE)
P-r2
X,
P i(SE)
P-r2
X,
P i(SE)
P-r2
Intercept
Po
4.9(1. 7)b

Po
4.5(1.4)b

Po
3.1(1. 8)d

Po
10.(1.3)a

Po
6.0(2.4)c

Po
-2.7(2.3)**

Po
2.5(1.3)c

Po
9.2(1.1)

Po
2.2(0.3)

Po
-.68(.33)

Mobile/Area/Point
Source
FC141
7.9(4.4)d
0.01
FC141
7.4(4.5)d
0.01
FC141
14.7(4.4)b
0.04
FC141
10.1(1.5)*

FC142
9.7(5.6)c
0.02
FC142
22.3(14.3)c
0.01












GS1
17.4(6.3)t
0.04
GS1
9.5(5.5)c
0.02



GS1
5.5(3.3)d
0.01



GS1
33.6(8.3)a
0.09
DCF1
32.7(12.4)b
0.04









Meteorological Variables
Stab
0.54(0.11)*
0.18
Stab
0.52(0.09)a
0.27
Stab
0.71(0.12)a
0.22
Stab
16.1(5.6)b
0.03
Stab
0.44(0.16)b
0.08
Stab
0.24(0.15)c
0.01
Stab
0.14(0.09)*
0.01






MH
0.61(.32)
.046
K
-0.02(0.005)*
0.09
K
-0.02(0.004)a
0.09
K
-0.02(0.006)b
0.03
K
0.30(0.10)*
0.10
K
-0.03(0.007)b
0.06
K
0.01(0.007)c
0.01
K
-0.01(0.004)b
0.05
K
-0.11(.03)
.094
K
.023(.007)
.093






U
-0.12(0.04)b
0.04



U
-0.04(0.004)*
0.25
U
-0.11(0.07)c
0.015
U
-0.19(0.06)*
0.12
U
-0.14(0.04)"
0.18
U
-0.31(.22)
.014
U
-.13(.06)
.036









RH
0.01(0.005)c
0.02









RH
0.01(0.003
)b
0.03










-------
Pollutant
Total r2
PM2.5
0.47

EC
0.40

OC
0.33

Coronene
0.67

Benzo[ghi]perylene
0.66

Sulfur
0.52

Selenium
0.41

Row
Heading
X,
P i(SE)
P-r2
X,
P i(SE)
P-r2
X,
P i(SE)
P-r2
X,
P i(SE)
P-r2
X,
P i(SE)
P-r2
Xi
P i(SE)
P-r2
Xi
P i(SE)
P-r2
Intercept
Po
1.0(0.6)

Po
-2.5(0.6)

Po
-2.2(.7)

Po
24(4)

Po
22(4)

Po
4.3(2.2)

Po
1.2(3.5)

Mobile/Area/Point Source
FC111
20(11)
.016
REF1
630(26)
.033
FC111
39(21)
.043
FC111
133(42)
0.091
FC111
123(38)
.094
03
29(5)
.090
03
27(9)
.059
FC191
4.2(1.7)
.052

















TRUCK1
51(30)
.016
TRUCK1
78(36)
.050














Meteorological Variables
Stab
0.43(.09)
.32
Stab
0.32(0.13)
.078
Stab
0.66(0.14)
.25
Stab
0.81(0.28)
.06
Stab
0.71(.26)
.060
Stab
0.44(.ll)
.14
Stab
0.93(.19)
.29
U
-0.13(.04)
.066
RH
.011(.004)
.24
Precip
.013(.006)
.035
K
-O.lO(.Ol)
.28
K
-0.087(.01)
.26
K
-0.023(.008)
.23
K
-0.15(.08)
.029









U
-0.41(.10)
.24
U
-0.38(.10)
.25
U
-O.ll(.OS)
.039
U
-0.019(.007)
.034
Analysis of aldehyde data for days when the temperature was <10°C, to evaluate role of photochemistry
Formaldehyde
0.13

Acetaldehyde
0.13

Acrolein
0.046

X,
Pi(SE)
P-r2
X,
P i(SE)
P-r2
X,
Pi(SE)
P-r2
Po
1.5(0.5)

Po
2.2(0.3)































K
-0.03(.02)
.063
K
-0.03(0.02)
0.068



U
-0.13(.07)
.007






MH
0.43(.23)
.03







-------
Xi, i-th predictor variable; β0, intercept of model; βi, parameter estimate of i-th predictor;
SE, standard error of parameter estimates; r2, coefficient of determination; P-r2, partial r
square of the variable.
^-1 indicates inverse values; ^-2 indicates inverse square values.
p<0.15 used as selection criterion for inclusion of a variable in the model
FC14^-1 is the inverse distance (m) to the nearest major arterial roadways
GS^-1 is the inverse distance (m) to the nearest gasoline station
PS^-1 is the inverse distance (m) to a point source (Linden Refinery)
DCF^-1 is the inverse distance (m) to the nearest dry cleaning facility
TRUCK^-1 is the inverse distance (m) to the major truck loading areas
Airport^-1 is the inverse distance (m) to Newark International Airport
Stab is atmospheric stability
K is temperature (°K)
U is the wind speed (m/s)
MH is the mixing height (km)
Precip is precipitation (total mm)

-------
Regression Model Predictions Figures 10-20

These figures show the change in concentration as predicted by the regression models
while varying the variable indicated from the minimum to maximum value observed
during the study and holding all other variables in the model constant (median value for
the meteorological variables or maximum value for the distance variables).  The side bar
is a box and whisker plot of the measured concentrations during the study (mean, median,
5th, 25th, 75th, 95th percentiles) for comparison.

Scatter Plots of Distance to Concentration Figures 21-45
These figures are the scatter plots of the concentration measured with the determined
nearest distance  between each home and the nearest roadway in each class or gasoline
station for all values in the study.  The figures provide a visualization of the association
of concentration with distance to mobile sources without consideration for meteorology, a
major factor that influences concentration.

-------
[Plot: Effect of Distance to Sources on Residential Ambient Air Concentration of m,p-Xylene. Horizontal axis: Distance, meters (0 to 250). Curves: FC14inv, GSIinv. Side box plot with Mean marked.]
 Figure 10: Effect of the Distance to the Emission Sources on the Residential Ambient Air Concentration of m,p-Xylene Estimated by the Best-fitting
                          Model (Box Plot Shows Mean and Quartiles of Distribution of m,p-Xylene Concentrations)
Final Report
                                          58
                                                                              11/19/2004

-------
[Plot: "Effect of Distance to Sources on Residential Ambient Air Concentration of Benzene". X-axis: Distance, meters. Curves: FC14inv, GSIinv. Box plot: Mean.]
Figure 11: Effect of the Distance to the Emission Sources on the Residential Ambient Air Concentration of Benzene Estimated by the Best-fitting Model
(Box Plot Shows Mean and Quartiles of Distribution of Benzene Concentrations).

-------
[Plot: "Effect of Distance to Sources on Residential Ambient Air Concentration of MTBE". X-axis: Distance, meters. Curves: GSIinv, FC11inv. Box plot: Mean.]
Figure 12: Effect of the Distance to the Emission Sources on the Residential Ambient Air Concentration of MTBE Estimated by the Best-fitting Model
(Box Plot Shows Mean and Quartiles of Distribution of MTBE Concentrations).

-------
[Plot: "Effect of Distance to Sources on Residential Ambient Air Concentration of o-Xylene". X-axis: Distance, meters. Curves: FC14inv, GSIinv.]
Figure 13: Effect of the Distance to the Emission Sources on the Residential Ambient Air Concentration of o-Xylene Estimated by the Best-fitting
Model (Box Plot Shows Mean and Quartiles of Distribution of o-Xylene Concentrations).

-------
[Plot: "Effect of Distance to FC14 on Residential Ambient Air Concentration of Toluene". X-axis: Distance, meters. Box plot: Mean.]
Figure 14: Effect of the Distance to the Emission Sources on the Residential Ambient Air Concentration of Toluene Estimated by the Best-fitting
Model (Box Plot Shows Mean and Quartiles of Distribution of Toluene Concentrations)

-------
[Plot: "Effect of Distance to FC14 on Residential Ambient Air Concentration of Ethylbenzene". X-axis: Distance, meters. Box plot: Mean.]
Figure 15: Effect of the Distance to the Mobile Source Emission on the Residential Ambient Air Concentration of Ethylbenzene Estimated by the
Best-fitting Model (Box Plot Shows Mean and Quartiles of Distribution of Ethylbenzene Concentrations)

-------
[Plot: "Effect of Distance to Sources on Residential Ambient Air Concentration of PM2.5". X-axis: Distance, meters. Curves: FC19inv, Truck Inv, FC11inv. Box plot: Mean.]
Figure 16: Model Prediction Of PM2.5 Concentration With Distance To FC11, FC19 And Truck Loading And Unloading Region Estimated By The Best-Fitting
Model (Box Plot Shows Mean And Quartiles Of Distribution Of PM2.5 Concentrations)

-------
[Plot: "Effect of Distance to FC14 on Residential Ambient Air Concentration of EC". X-axis: Distance, meters. Box plot: Mean.]
Figure 17: Model prediction of elemental carbon concentration with distance to Truck loading/dock area Estimated By The Best-Fitting Model
(Box Plot Shows Mean And Quartiles Of Distribution Of Elemental Carbon Concentrations)

-------
[Plot: "Effect of Distance to Sources on Residential Ambient Air Concentration of OC". X-axis: Distance, meters. Box plot: Mean.]
Figure 18: Model Prediction Of Organic Carbon Concentration With Distance To FC11 Roadways Estimated By The Best-Fitting Model (Box Plot Shows
Mean And Quartiles Of Distribution Of Organic Carbon Concentrations)

-------
[Plot: "Effect of Distance to FC11 on Residential Ambient Air Concentration of Coronene". X-axis: Distance, meters. Box plot: Mean.]
Figure 19: Model Prediction Of Coronene Concentration With Distance To FC11 Estimated By The Best-Fitting Model (Box Plot Shows Mean And
Quartiles Of Distribution Of Coronene Concentrations)

-------
[Plot: "Effect of Distance to FC11 on Residential Ambient Air Concentration of Benzo[ghi]pyrene". X-axis: Distance, meters. Curve: FC11I.]
Figure 20: Model Prediction Of Benzo[ghi]pyrene Concentration With Distance To FC11 Estimated By The Best-Fitting Model (Box Plot Shows Mean And
Quartiles Of Distribution Of Benzo[ghi]pyrene Concentrations)

-------
[Scatter plot: "mp Xylene FC11". X-axis: Distance FC11. Y-axis: Concentration.]
Figure 21. Scatter plot of m/p xylene with distance from FC11 Roadways, major urban arterial.

-------
[Scatter plot: "Benzene". X-axis: Distance FC11. Y-axis: Concentration.]
Figure 22. Scatter plot of benzene with distance from FC11 Roadways, major urban arterial.

-------
[Scatter plot: "PM2.5 with FC11". X-axis: Distance to FC11 (km). Y-axis: PM2.5 concentration.]
Figure 23. Scatter plot of PM2.5 with distance from FC11 Roadways, major urban arterial.

-------
[Scatter plot: "EC FC11". X-axis: FC11 Distance. Y-axis: EC concentration.]
Figure 24. Scatter plot of elemental carbon with distance from FC11 Roadways, major urban arterial.

-------
[Scatter plot: "mp Xylene FC14". X-axis: Distance FC14. Y-axis: Concentration.]
Figure 25. Scatter plot of m/p xylene with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "o Xylene FC14". Y-axis: Concentration. Caption not recovered.]
-------
[Scatter plot: "Ben-FC14". X-axis: Distance FC14. Y-axis: Concentration.]
Figure 27. Scatter plot of benzene with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "Toluene FC14". X-axis: Distance FC14. Y-axis: Concentration.]
Figure 29. Scatter plot of toluene with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "Ethyl Benzene FC14". X-axis: Distance FC14. Y-axis: Concentration.]
Figure 30. Scatter plot of ethyl benzene with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "MTBE FC14". X-axis: Distance FC14. Y-axis: Concentration.]
Figure 31. Scatter plot of methyl tert butyl ether (MTBE) with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "PM2.5 and FC14". X-axis: Distance to FC14 (km). Y-axis: PM2.5 concentration.]
Figure 32. Scatter plot of PM2.5 mass with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "EC FC14". X-axis: FC14 Distance. Y-axis: EC concentration.]
Figure 33. Scatter plot of elemental carbon with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "OC FC14". X-axis: FC14 Distance. Y-axis: OC concentration.]
Figure 34. Scatter plot of organic carbon with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "Tetrachloroethene FC14". X-axis: Distance FC14. Y-axis: Concentration.]
Figure 35. Scatter plot of tetrachloroethylene with distance from FC14 Roadways, interstate.

-------
[Scatter plot: "PM with F19". X-axis: Distance to F19 (km). Y-axis: PM2.5 concentration.]
Figure 36. Scatter plot of PM2.5 Mass with distance from FC19 Roadways, small local roads.

-------
[Scatter plot: "OC FC19". X-axis: FC19 Distance. Y-axis: OC concentration.]
Figure 38. Scatter plot of organic carbon with distance from FC19 Roadways, small local roads.

-------
[Scatter plot: "m/p Xylene". X-axis: Distance Gas Station (km). Y-axis: Concentration.]
Figure 39. Scatter plot of m/p xylene with distance from closest gasoline station.

-------
[Scatter plot: "o Xylene". X-axis: Distance Gas Station (km). Y-axis: Concentration.]
Figure 40. Scatter plot of o xylene with distance from closest gasoline station.

-------
[Scatter plot: "Benzene". X-axis: Distance Gas Station (km). Y-axis: Concentration.]
Figure 41. Scatter plot of benzene with distance from closest gasoline station.

-------
[Scatter plot: "MTBE". X-axis: Distance Gas Station (km). Y-axis: Concentration.]
Figure 42. Scatter plot of methyl tert butyl ether with distance from closest gasoline station.

-------
[Scatter plot: "Tetrachloroethene". X-axis: Distance Gas Station (km). Y-axis: Concentration.]
Figure 43. Scatter plot of tetrachloroethylene with distance from closest gasoline station.

-------
[Scatter plot: "EC PM02 Distance". X-axis: PM02 Distance. Y-axis: Concentration.]
Figure 44. Scatter plot of PM2.5 Mass with distance from PM02, truck loading area.

-------
[Scatter plot: "PM PM Source 03". X-axis: Distance PM03. Y-axis: Concentration.]
Figure 44. Scatter plot of PM2.5 Mass with distance from PM 03 truck loading area and dock.

-------
[Scatter plot: "PM Source 1-3 Plots". X-axis: Dist PM2 or PM1. Series: PM2 vs PM3, PM1 vs PM2, PM1 vs PM3.]
Figure 46. Scatter plot of distance from homes to PM 1, 2 and 3 source regions. The patterns are the same, indicating a high correlation for
homes close to these sources (within a few hundred yards).
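
How closely the distances to two source regions track each other can be quantified with a Pearson correlation coefficient. The sketch below is illustrative only: the distance values are hypothetical, and the report does not state which statistic, if any, was computed for this figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical distances (km) from four homes to two nearby source regions.
dist_pm1 = [0.10, 0.20, 0.30, 0.40]
dist_pm2 = [0.12, 0.22, 0.32, 0.42]

r = pearson_r(dist_pm1, dist_pm2)  # close to 1 when the two distances move together
```

A coefficient near 1 means that a model cannot readily separate the effects of the two sources, because a home close to one is also close to the other.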
                                                                   11/19/2004

-------
[Scatter plot: "PM Source 1-3 Plots". X-axis: Dist PM2 or PM1.]
Figure 47. Scatter plot of distance from homes to PM 1, 2 and 3 source regions to 2 km distance.

-------
[Three scatter plot panels, each titled "Methyl tert butyl ether" and drawn with a 1:1 line: indoor vs. outdoor concentration (n=505), personal vs. outdoor concentration (n=504), and personal vs. indoor concentration (n=502). All axes in µg/m3.]
Figure 48. Scatter plots of MTBE for indoor/outdoor, outdoor/personal and indoor/personal showing that there are homes around the 1:1 line so that
pollutants arising from outdoor will affect personal exposure.

-------
    Figure 49. Scatter plots of m,p-xylene for indoor/outdoor (n=505), outdoor/personal (n=504) and indoor/personal (n=502), showing that a subset of homes lies around the 1:1 line so that pollutants
                                                         arising from outdoor will affect personal exposure. Axes: concentration (ug/m3).

-------
  Figure 50. Scatter plots of PM2.5 mass for indoor/outdoor (n=292), outdoor/personal (n=256) and indoor/personal (n=246), showing that some homes lie parallel to the 1:1 line so that
                                   pollutants arising from outdoor will affect personal exposure. Axes: concentration (ug/m3).

-------
References:

Harrison, RM, Smith, DJR and Luhana, L, Source apportionment of atmospheric
polycyclic aromatic hydrocarbons collected from an urban location in
Birmingham, UK. Environmental Science and Technology, 30, 825-832, 1996.

Neter, J, Kutner, MH, Nachtsheim, CJ and Wasserman, W, Applied Linear Statistical Models.
4th edition, R.D. Irwin, Inc, Homewood, IL, 1996.

Poulopoulos, SG and Philippopoulos, CJ, "The Effect of Adding Oxygenated
Compounds to Gasoline on Automotive Exhaust Emissions", Transactions of the ASME,
125, 344-350, 2003.

Rogge, WF, Hildemann, LM, Mazurek, MA, Cass, GR and Simoneit, BRT,
Sources of fine organic aerosol. 5. Natural gas home appliances. Environmental
Science and Technology, 27, 636-651, 1993.

Weisel, CP, Zhang, JJ, Turpin, BJ, Morandi, MT, Colome, S, Stock, TH, Spektor, DM,
Korn, L, Winer, A, Alimokhtari, S, Kwon, J, Mohan, K, Harrington, R, Giovanetti, R,
Cui, W, Afshar, M, Maberti, S, Shendell, D, "The Relationships of Indoor, Outdoor and
Personal Air (RIOPA) Study: Study Design, Methods and Quality Assurance/Control
Results", Journal of Exposure Analysis and Environmental Epidemiology, In Press 2004.

Weisel, Zhang, Turpin, Morandi, Colome, Stock and Spektor, "Relationships of Indoor,
Outdoor and Personal Air (RIOPA)", HEI Final Report, In Press 2004.

-------
                                                                                  A-1

                                     Appendix A

m,p-Xylene

Bivariate Pearson Correlation

       The correlation coefficient between the ln-transformed m,p-xylene concentrations
and the distance to urban interstate (FC11) roadways was -0.20 (p=0.007). The correlation
coefficients between the ln-transformed m,p-xylene concentrations and the inverse distance
to urban major arterial (FC14) roadways and the inverse distance to urban collector (FC17)
roadways were 0.19 (p=0.01) and 0.22 (p=0.0034), respectively. The correlations between the
ambient air concentration of m,p-xylene and the distances to individual roadways were examined
for the major roadway classes (I-95 for FC11; Rt.1, Rt.27, Rt.28, Rt.439 for FC14). The
distance to I-95 was statistically significantly correlated with the ln-transformed
concentration of m,p-xylene in the residential ambient air (-0.194, p=0.0093). The distance to
US Highway Route 1 was also statistically significantly correlated with the ln-transformed
concentration of m,p-xylene in the residential ambient air (-0.272, p=0.0002).




       The correlation coefficient between the ln-transformed ambient air concentration of
m,p-xylene and the inverse distance to the closest gas station was 0.28 (p=0.0002). For
m,p-xylene, only point sources that were within 3 km of any of the sampled homes
and had emissions larger than 0.9 tons of annual total generation were considered in the data
analysis. Two point sources met the above criteria: a refinery in Linden and an industrial
emission source in Elizabeth. Only the distance between the refinery and the residences had a
statistically significant correlation with the ln-transformed m,p-xylene concentrations (-0.17,
p=0.022).
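
The bivariate screening described above can be sketched as follows. This is a minimal illustration, assuming a plain Pearson coefficient computed between ln-transformed concentrations and either the distance or the inverse distance to a source; the function name and the data values are hypothetical and are not RIOPA measurements. The p-values reported in the text would additionally require a t-distribution routine, which is omitted here.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, for illustration only (not study data).
distance_m = [120.0, 300.0, 450.0, 800.0, 1500.0, 2200.0]
conc_ug_m3 = [6.1, 4.0, 3.2, 2.9, 1.8, 2.1]

# Transformations used in the appendix: natural log of the concentration
# and the inverse of the distance.
ln_conc = [math.log(c) for c in conc_ug_m3]
inv_distance = [1.0 / d for d in distance_m]

r_distance = pearson_r(distance_m, ln_conc)    # negative when levels fall with distance
r_inverse = pearson_r(inv_distance, ln_conc)   # positive for the same data
```

A negative coefficient for distance and a positive coefficient for inverse distance describe the same tendency, which is why the two forms appear with opposite signs in the text.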

-------
                                                                                  A-2






       The Pearson correlation coefficients of the meteorological variables that were
statistically significantly correlated with the ln-transformed m,p-xylene concentrations at
α = 0.05 were atmospheric stability, 0.348 (p<0.0001); mixing height, -0.254 (p=0.0009);
wind speed, -0.235 (p=0.0014); and temperature, -0.19 (p=0.0101). The correlation
coefficients of precipitation and relative humidity were 0.125 (p=0.091) and 0.129 (p=0.082),
respectively. Atmospheric pressure was not correlated with the ambient m,p-xylene
concentration (p=0.47).










Preliminary Selection of Predictors




       The preliminary regression analysis was performed on the ln-transformed m,p-xylene
concentration to determine the relative importance of variables within the same type
(proximity and meteorological) of independent variables. The distances to the roadways,
either original or transformed, were grouped by functional class (FC) to examine the importance of
the proximity of the mobile sources to the m,p-xylene air concentration. When the distances to
the functional classes were analyzed, the distances to the urban interstates (FC11) and the
urban principal arterials (FC14) were included in the resulting linear regression model
(p<0.15) with r2 = 0.0819. For the gas stations and the point sources, the inverse form of the
closest distance was always selected as the largest explanatory predictor variable in the model
regardless of the selection method. The proximity to the refinery was also selected as the
larger explanatory variable along with an industry site in the model. Among the
meteorological variables, the 48-hour averaged mixing height, temperature, and wind speed
were selected from the preliminary regression analysis. When the 48-hour averaged
atmospheric stability was introduced to the initial group of other meteorological variables,
the model was improved (increased r2), but the mixing height was eliminated from the
resulting model.










Selection of the Best-fitting Model




       The variables selected by the different regression methods were relatively consistent.
Atmospheric stability, temperature and wind speed were included as predictors in the model
as were the inverse distances to the major roadways (FC11) and to gasoline stations (Table
A-1). The association of the distance to the refinery was not significant. The parameters and
analysis of variance of the regression equations for the m,p-xylene ambient air concentration
for the best-fitting model with 6 variables selected are given in Table A-1. The C(p), which
is Mallows' Cp statistic, associated with this particular subset of variables was determined to
be 7.0. The resulting model was appropriate in number of parameters, because the number
of parameters (p) including the intercept in the best-fitting model exactly matched
the value of the C(p). The diagnostic plots (the residual plot against the predicted values,
the normal probability-probability (PP) plot and the normal quantile-quantile (QQ) plot of the
residuals) were generated and visually examined (Figures A-1,2 and Appendix B). The
residuals were randomly distributed without showing any obvious trend or any particular
pattern (Figure A-1), indicating close to a normal distribution and constant variances.
The PP plot was nearly linear, so the error term of the model could be considered to follow a
normal distribution. Based on the visual diagnosis, there was no significant evidence of lack
of fit or of significant unequal error variance for the best 7-parameter regression model.
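
The comparison of C(p) with the number of parameters can be illustrated with the usual definition of Mallows' statistic, Cp = SSE_p / MSE_full - (n - 2p), where p counts the intercept. The sketch below assumes that definition; the function name and all numbers are hypothetical and are not taken from the tables in this appendix.

```python
def mallows_cp(sse_subset, mse_full, n_obs, n_params):
    """Mallows' Cp = SSE_p / MSE_full - (n - 2p), with p counting the intercept."""
    return sse_subset / mse_full - (n_obs - 2 * n_params)

# Hypothetical numbers, for illustration only.
n_obs = 183
full_params = 7                              # parameters in the largest candidate model
sse_full = 100.0                             # its error sum of squares
mse_full = sse_full / (n_obs - full_params)  # error mean square of that model

# For the largest candidate model the formula returns p by construction.
cp_full = mallows_cp(sse_full, mse_full, n_obs, full_params)

# A smaller subset with a clearly larger SSE gives Cp well above its own p (here 4),
# which is read as a sign of bias from omitted predictors.
cp_small = mallows_cp(130.0, mse_full, n_obs, 4)
```

Subsets whose Cp is close to p are generally read as having little bias, which is the criterion the text applies.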





       Possible outliers were found (Figure A-2) using the test statistic (± tinv, 0.95, n-p-1 =
175) of ± 1.6545. The regression equation was recalculated after removal of the seventeen
outliers. The parameters for the best fit equation are given in Table A-2. An increase in the
r2 was obtained after removal of the outliers. The residuals plotted against the predicted
values (Figure A-3) seemed more randomly distributed than those in Figure A-1.
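
A simplified version of this outlier screen is sketched below. It assumes the residuals are scaled by the root mean square error and compared with the ±1.6545 cutoff quoted above; a statistics package would typically use leverage-adjusted studentized residuals instead, so this is an approximation of the idea rather than the procedure used in the study. The function name and the data are hypothetical.

```python
import math

def flag_outliers(observed, predicted, n_params, cutoff):
    """Return indices whose standardized residual exceeds +/- cutoff.

    Residuals are divided by the root MSE (a simplification that ignores leverage).
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    dof = len(residuals) - n_params
    rmse = math.sqrt(sum(e * e for e in residuals) / dof)
    return [i for i, e in enumerate(residuals) if abs(e / rmse) > cutoff]

# Hypothetical ln-concentrations and model predictions, for illustration only.
observed = [1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 3.0]
predicted = [1.0] * 8

# 1.6545 is the cutoff quoted in the text; n_params=2 is an arbitrary choice here.
flagged = flag_outliers(observed, predicted, n_params=2, cutoff=1.6545)
```

After flagged observations are removed, the regression would be refitted on the remaining data, as described for Table A-2.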










Diagnostics of Equal Variances and Multicollinearity Diagnostics




       To test the assumption of equal variance, the heteroscedasticity of the parameter
estimates was tested, as was multicollinearity (Appendix C). The chi-square was 18 with a
probability of 0.19, a value greater than 0.05. Therefore, the variances of the parameter
estimates could be concluded as not being significantly different. As a consequence,
equal error variances in parameter estimates were assumed in the best-fitting 6-parameter
regression model. The multicollinearity of predictor variables in the best-fitting model was
tested. The bivariate Pearson correlations between pairs of predictors included in the model
were examined for any significant correlation between the predictors.




       The variance inflation factors for all predictors were close to 1, a value smaller than 10,
suggesting that there was no significant collinearity between the predictors in the model.
However, the collinearity diagnostics suggest that there were possible co-dependences,
which might overspecify the model outcome. In particular, the meteorological conditions
were somewhat correlated, and commercial enterprises, such as gasoline stations, are
preferentially located on or near major roadways, so correlations in the proximity variables
could exist.
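
The variance inflation factor referred to above is conventionally defined as VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing predictor j on the remaining predictors. The short sketch below assumes that definition; the function name and the example value of 0.07 are hypothetical.

```python
def variance_inflation_factor(r_squared_j):
    """VIF_j = 1 / (1 - R_j^2); R_j^2 comes from regressing predictor j on the others."""
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)

vif_uncorrelated = variance_inflation_factor(0.0)   # 1.0: predictor unrelated to the others
vif_threshold = variance_inflation_factor(0.9)      # about 10: the rule-of-thumb limit in the text
vif_example = variance_inflation_factor(0.07)       # hypothetical weakly related predictor
```

Under this definition a VIF close to 1 means that almost none of a predictor's variance is explained by the other predictors, and the limit of 10 corresponds to R_j^2 = 0.9.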




       In an attempt to reduce the multicollinearity diagnosed, the temperature, which
had a proportion of variation larger than 0.5, was removed from the best-fitting model and the
multicollinearity of the resulting model was diagnosed. When the temperature was removed from
the model, the coefficient of determination (r2) of the resulting 5-parameter model decreased
from 0.33 to 0.24, and the condition index decreased from 116 to 41. The interaction
between the predictors in the 5-parameter model appeared to decrease after removal of the
temperature from the model, but the eigenvalue was still smaller than 0.01 (0.0023) and the
proportion of variation of the stability (0.97) was still greater than 0.5.




       The multicollinearity diagnostics described above exhibited divergent results. The
largest condition index and proportion of variation indicated that potential collinearity may exist
among the predictors. However, the variance inflation factors were much smaller than 10 for all
five parameter estimates, indicating that multicollinearity may not be a problem. Neter et
al. (1996) suggested that even when there is serious multicollinearity, the fitted model may
be useful for estimating mean responses or making predictions, if the inferences of the fitted
regression model are restricted to the same multicollinearity pattern as the data on which the
regression model is based. Consequently, it was concluded that retaining all predictors
included in the best-fitting model with its qualitative characteristics is more beneficial for
the explanatory observational purposes of this research than dropping the potentially
intercorrelated predictors from the model. Similar considerations were used for the other
compounds as well.

-------
                                                                               A-6
Table A-1. Results of the Best-fitting 7-Parameter Model for m,p-Xylene

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               5          35.9789        7.19578      12.81    <.0001
Error             177         99.43418        0.56178
Corrected Total   182         135.4131

Root MSE                    0.74952    R-Square             0.2657
Dependent Mean              0.81562    Adjusted R-Square    0.2450
Coefficient of Variation   91.89477

Parameter Estimates

Variable    Label                             DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                          1         5.56153            2.24241        2.48     0.0141
F14_1mInv   (Distance to FC14)^-1              1        14.56178            5.17879        2.81     0.0055
GS1mInv     (Distance to Gas Station)^-1       1        22.46222            8.56119        2.62     0.0095
Stab4       Atmospheric Stability              1         0.52630            0.14753        3.57     0.0005
K5          Temperature                        1        -0.02472            0.00671       -3.69     0.0003
U4          Wind speed                         1        -0.12254            0.05923       -2.07     0.0400

Summary of Stepwise Selection

Step   Variable Entered                 Partial R-Square   Model R-Square        Cp   F Value   Pr > F
1      Atmospheric Stability                  0.1317           0.1317       27.9107     27.46   <.0001
2      (Distance to Gas Station)^-1           0.0460           0.1778       18.9388     10.08   0.0018
3      Temperature                            0.0362           0.2140       12.3164      8.24   0.0046
4      (Distance to FC14)^-1                  0.0340           0.2479        6.2173      8.04   0.0051
5      Wind Speed                             0.0178           0.2657        3.9860      4.28   0.0400
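
The summary statistics in Table A-1 can be cross-checked from its sums of squares and degrees of freedom using the standard analysis-of-variance identities. The sketch below is only a consistency check on the tabulated values; the variable names are chosen here for illustration.

```python
import math

# Values copied from the Analysis of Variance block of Table A-1.
ss_model, df_model = 35.9789, 5
ss_error, df_error = 99.43418, 177

ss_total = ss_model + ss_error          # corrected total sum of squares
df_total = df_model + df_error          # 182

ms_model = ss_model / df_model          # mean square for the model
ms_error = ss_error / df_error          # mean square error

r_squared = ss_model / ss_total
adj_r_squared = 1.0 - ms_error / (ss_total / df_total)
f_value = ms_model / ms_error
root_mse = math.sqrt(ms_error)
```

The computed values agree with the tabulated R-Square (0.2657), Adjusted R-Square (0.2450), F Value (12.81) and Root MSE (0.74952) to within rounding.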

-------
                                                                                   A-7
[Plot: Residual Plot of the Best Fit Model of m,p-Xylene. Residual vs. Predicted Value.
Fitted equation: ln(m,p-xylene) = 5.5615 + 14.562 F14_1mInv + 22.462 GS1mInv + 0.5263 Stab4 - 0.0247 K5 - 0.1225 U4.
N = 183, Rsq = 0.2657, AdjRsq = 0.2450, RMSE = 0.7495.]

Figure A-1. Residual vs. Predicted Plot of the Best-fitting 7-Parameter Model of m,p-Xylene
[Plot: Outliers of Model for m,p-Xylene.
Fitted equation: ln(m,p-xylene) = 5.5615 + 14.562 F14_1mInv + 22.462 GS1mInv + 0.5263 Stab4 - 0.0247 K5 - 0.1225 U4.]
-------
                                                                                  A-8
Table A-2. Results of the Best-fitting 5-Parameter Model for m,p-Xylene after Removing the
Twenty Outliers

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               5         22.98982        4.59796      15.99    <.0001
Error             162         46.58357        0.28755
Corrected Total   167         69.57338

Root MSE                    0.53624    R-Square             0.3304
Dependent Mean              0.86365    Adjusted R-Square    0.3098
Coefficient of Variation   62.08976

Parameter Estimates

Variable    Label                             DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                          1         4.94236            1.70161        2.9      0.0042
F14_1mInv   (Distance to FC14)^-1              1         7.94739            4.43103        1.79     0.0747
GS1mInv     (Distance to Gas Station)^-1       1        17.43615            6.29951        2.77     0.0063
Stab4       Atmospheric Stability              1         0.53744            0.11065        4.86     <.0001
K4          Temperature                        1        -0.0232             0.00507       -4.58     <.0001
U4          Wind Speed                         1        -0.0653             0.04438       -1.47     0.1431

Summary of Stepwise Selection

Step   Variable Entered                 Partial R-Square   Model R-Square        Cp   F Value   Pr > F
1      Atmospheric Stability                  0.1813           0.1813       32.3859     36.76   <.0001
2      Temperature                            0.0871           0.2684       13.4982     19.64   <.0001
3      (Distance to Gas Station)^-1           0.0385           0.3068        6.2732      9.10   0.0030
4      (Distance to FC14)^-1                  0.0147           0.3215        4.7579      3.52   0.0624
5      Wind Speed                             0.0090           0.3304        4.6109      2.17   0.1431

-------
                                                                                      A-9
[Plot: Residual Plot of the Best Fit Model of m,p-Xylene, after removal of outliers.
Fitted equation: ln(m,p-xylene) = 4.9424 + 7.9474 F14_1mInv + 17.436 GS1mInv + 0.5374 Stab4 - 0.0232 K5 - 0.0653 U4.
N = 168, Rsq = 0.3304, AdjRsq = 0.3098.]
-------
                                                                                 A-10






o-Xylene




Bivariate Pearson Correlation




       The correlation coefficients between ln-transformed o-xylene concentrations and the
distance to urban interstate (FC11) roadways and the distance to urban major arterial (FC14)
roadways were -0.147 (p=0.048) and -0.148 (p=0.046), respectively. The distance to US
Highway Route 1 also had a statistically significant correlation coefficient of -0.266
(p=0.0003). The correlation coefficient between the ln-transformed ambient air concentration of
o-xylene and the inverse distance to the closest gas station was 0.24 (p=0.0011). The refinery
was the only point source whose distance to the residences had a statistically significant
correlation with the ln-transformed o-xylene concentrations (0.174, p=0.019).




       The meteorological variables that were statistically significantly correlated with
o-xylene concentrations were wind speed (-0.30, p<0.0001), atmospheric stability (0.427,
p<0.0001), mixing height (-0.28, p=0.0002), relative humidity (0.16, p=0.027), and
temperature (-0.11, p<0.15). Precipitation and atmospheric pressure were not correlated with
the residential ambient air concentration of o-xylene.










Preliminary Selection of Predictors




       A series of preliminary regression analyses for each group of variables was
performed using the ln-transformed o-xylene concentrations to determine which variables to
include in the model. The distances to the closest gas station, the refinery, and the urban
major arterial roadways (FC14) were selected as important predictors among the variables
that describe the distance between sources and residences. From the meteorological
variables, wind speed, temperature, and stability were selected as predictor variables.

-------
                                                                                  A-11
Selection of the Best-fitting Model




       The predictor variables selected by the different regression model selection methods
for the residential ambient air concentration of o-xylene were relatively consistent. The
meteorological variables that were consistently included in the series of regression models
were atmospheric stability, temperature, and wind speed, in order of selection. The C(p)
was 7, the same as the number of parameters included in the model. The parameter estimates
were significant (p<0.05), except for the intercept (p=0.26). The model statistics are
summarized in Table A-3. As illustrated in Figure A-5, the residuals were distributed
relatively randomly. The PP plot was nearly linear, implying that the error term of the model
followed a normal distribution. Twelve data points were identified as possible outliers by
using a test statistic of ±1.645 (0.95, df=175, Figure A-6). The analysis of variance,
parameter estimates, and the summary of model statistics for the best-fitting 7-parameter
model for o-xylene after removal of outliers are listed in Table A-4. The selected model was
statistically significant (p<0.0001). The residuals for the model with the outliers removed
were randomly distributed without showing any obvious trend or any particular pattern
(Figure A-7). The standardized residuals of the best-fitting model were close to a normal
distribution and had constant variances. Based on a visual diagnosis of the residual plot,
probability plot, and quantile plot, there was no evidence of a lack of fit or unequal error
variance for the best-fitting 7-parameter regression model for the ambient residential
o-xylene. The Mallows' Cp statistic associated with this particular subset of variables was
determined to be 7.0, indicating that the resulting model had the appropriate number of
parameters.

-------
                                                                               A-12






Diagnostics of Equal Variances and Multicollinearity Diagnostics




       To test the assumption of equal variance, the heteroscedasticity of the parameter
estimates was tested. The chi-square was 28 with a probability of 0.104, a value greater than
0.05 (Appendix C). Therefore, the variances of the parameter estimates could be concluded
as not being significantly different. As a consequence, equal error variances in parameter
estimates were assumed in the best-fitting 7-parameter model. The same considerations
about parameter correlations expressed for m,p-xylene also apply to o-xylene.

-------
                                                                              A-13
Table A-3. Results of the Best-fitting 7-Parameter Model for o-Xylene

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               5         39.22208        7.84442      17.24    <.0001
Error             177          80.5563        0.45512
Corrected Total   182         119.7784

Root MSE                     0.67463    R-Square             0.3275
Dependent Mean              -0.09397    Adjusted R-Square    0.3085
Coefficient of Variation   -717.895

Parameter Estimates

Variable    Label                             DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                          1         2.28427            2.01835        1.13     0.2593
F14_1mInv   (Distance to FC14)^-1              1        20.31620            4.66133        4.36     <.0001
GS1mInv     (Distance to Gas Station)^-1       1        13.94798            7.70577        1.81     0.0720
Stab4       Atmospheric Stability              1         0.63575            0.13279        4.79     <.0001
K5          Temperature                        1        -0.01848            0.00604       -3.06     0.0025
U4          Wind speed                         1        -0.11480            0.05331       -2.15     0.0326

Summary of Stepwise Selection

Step   Variable Entered                 Partial R-Square   Model R-Square        Cp   F Value   Pr > F
1      Atmospheric Stability                  0.1907           0.1907       31.3278     42.65   <.0001
2      (Distance to FC14)^-1                  0.0802           0.2709       12.4773     19.81   <.0001
3      Temperature                            0.0285           0.2994        7.0815      7.27   0.0077
4      Wind Speed                             0.0156           0.3150        5.0189      4.06   0.0454
5      (Distance to Gas Station)^-1           0.0124           0.3275        3.7836      3.28   0.0720

-------
                                                                            A-14
[Plot: Residual Plot of the Best Fit Model of o-Xylene. Residual vs. Predicted Value.
Fitted equation: ln(o-xylene) = 2.2843 + 20.316 F14_1mInv + 13.948 GS1mInv + 0.6357 Stab4 - 0.0185 K5 - 0.1148 U4.
N = 183, Rsq = 0.3275, AdjRsq = 0.3085, RMSE = 0.6746.]

           Figure A-5. Residual Plot of the Model of o-Xylene
[Plot: Outliers of Model for o-Xylene, plotted against Observation Number.
Fitted equation: ln(o-xylene) = 2.2843 + 20.316 F14_1mInv + 13.948 GS1mInv + 0.6357 Stab4 - 0.0185 K5 - 0.1148 U4.
N = 183, Rsq = 0.3275, AdjRsq = 0.3085, RMSE = 0.6746.]

                Figure A-6. Outliers of Model of o-Xylene
-------
                                                                               A-15
Table A-4. Results of the Best-fitting 7-Parameter Regression Model for o-Xylene after
Removal of the Outliers

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               5         23.35913        4.67183      23.69    <.0001
Error             162         31.95114        0.19723
Corrected Total   167         55.31027

Root MSE                      0.44410    R-Square             0.4223
Dependent Mean               -0.09736    Adjusted R-Square    0.4045
Coefficient of Variation   -456.16000

Parameter Estimates

Variable    Label                             DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                          1         4.45813            1.40740        3.17     0.0018
F14_1mInv   (Distance to FC14)^-1              1         7.44373            4.48291        1.66     0.0988
GS1mInv     (Distance to Gas Station)^-1       1         9.54244            5.47996        1.74     0.0835
Stab4       Atmospheric Stability              1         0.52092            0.09234        5.64     <.0001
K5          Temperature                        1        -0.02352            0.00419       -5.62     <.0001
U4          Wind speed                         1        -0.12197            0.03697       -3.30     0.0012

Summary of Stepwise Selection

Step   Variable Entered                 Partial R-Square   Model R-Square        Cp   F Value   Pr > F
1      Atmospheric Stability                  0.2717           0.2717       38.3416     61.92   <.0001
2      Temperature                            0.0877           0.3594       15.9706     22.59   <.0001
3      Wind Speed                             0.0353           0.3947        8.1589      9.57   0.0023
4      (Distance to Gas Station)^-1           0.0178           0.4125        5.2175      4.93   0.0277
5      (Distance to FC14)^-1                  0.0098           0.4223        4.4861      2.76   0.0988

-------
                                                                                    A-16
[Plot: Residual Plot of the Best Fit Model of o-Xylene, after removal of outliers. Residual vs. Predicted Value.
Fitted equation: ln(o-xylene) = 4.4581 + 7.4437 F14_1mInv + 9.5424 GS1mInv + 0.5209 Stab4 - 0.0235 K5 - 0.122 U4.
N = 168, Rsq = 0.4223, AdjRsq = 0.4045, RMSE = 0.4441.]

Figure A-7. Residual vs. Predicted Plot of the Best-fitting 7-Parameter Model of o-Xylene
after Removing the Outliers
[Plot: Cp Plot with Reference Lines (CP = P and CP = 2P - (P for full model) + 1).
Fitted equation: ln(o-xylene) = 4.4581 + 7.4437 F14_1mInv + 9.5424 GS1mInv + 0.5209 Stab4 - 0.0235 K5 - 0.122 U4.
N = 168, Rsq = 0.4223, AdjRsq = 0.4045, RMSE = 0.4441.]

-------
                                                                                                           A-17
   Table 4.A- Results of the Test of Multicollinearity of Predictor Variables Included in the
                                      Best-fitting Model for o-Xylene

                    Parameter Estimates

 Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
 Intercept    1         5.19346           1.43598         3.62     0.0004       .             0
 f14_1mInv    1         8.75248           4.30991         2.03     0.0439     0.92783       1.07778
 GS1mInv      1         9.80788           5.25048         1.87     0.0636     0.91647       1.09114
 Stab4        1         0.53003           0.08902         5.95     <.0001     0.72367       1.38185
 K4           1        -0.02635           0.00443        -5.94     <.0001     0.93148       1.07355
 U4           1        -0.13256           0.03548        -3.74     0.0003     0.69736       1.43397

                     Collinearity Diagnostics

                                       ---------------------------- Proportion of Variation ----------------------------
 Number   Eigenvalue   Condition Index   Intercept     f14_1mInv    GS1mInv   Stab4        K4            U4
   1      4.83871          1.00000       0.00002176    0.01278      0.01262   0.00022105   0.00002821    0.00172
   2      0.63347          2.76376       0.00004494    0.37172      0.34247   0.00041480   0.00005795    0.00400
   3      0.46833          3.21432       3.48031E-9    0.60663      0.62097   0.00000123   4.828461E-8   0.00002712
   4      0.05572          9.31914       0.00038414    0.00002814   0.00518   0.01731      0.00064391    0.57135
   5      0.00347         37.36627       0.02094       0.00109      0.01645   0.88898      0.05375       0.27248
   6      0.00030792     125.35642       0.97861       0.00776      0.00231   0.09308      0.94552       0.15042
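
The condition indices in the collinearity diagnostics follow from the eigenvalues as the square root of the largest eigenvalue divided by each eigenvalue. The sketch below reproduces that relationship from the tabulated eigenvalues; the function name is chosen here for illustration, and small differences in the last digits are expected because the tabulated eigenvalues are rounded.

```python
import math

# Eigenvalues as tabulated in the collinearity diagnostics for o-xylene.
eigenvalues = [4.83871, 0.63347, 0.46833, 0.05572, 0.00347, 0.00030792]

def condition_indices(eigs):
    """Condition index_i = sqrt(largest eigenvalue / eigenvalue_i)."""
    largest = max(eigs)
    return [math.sqrt(largest / e) for e in eigs]

indices = condition_indices(eigenvalues)
```

A very small eigenvalue therefore translates directly into a large condition index, which is why the sixth dimension, with high proportions of variation for the intercept and temperature, has the largest index.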
Table 4.4.4. Results of the Test of Heteroscedasticity of Parameter Estimates Determined in
the Best-fitting Model for o-Xylene
                                  Consistent Covariance of Estimates

  Variable    Intercept       f14_lmlnv       GSlmlnv         Stab4           K4              U4
  Intercept    1.6675971894    0.6946045492   -1.338990966    -0.041187547    -0.004861092    -0.017907343
  f14_lmlnv    0.6946045492   12.512236841    -2.924001768     0.0253987847   -0.003166084     0.0004076095
  GSlmlnv     -1.338990966    -2.924001768    15.630366869    -0.007104998     0.0047403892   -0.010061741
  Stab4       -0.041187547     0.0253987847   -0.007104998     0.0068550523    3.7178761E-7    0.0013929763
  K4          -0.004861092    -0.003166084     0.0047403892    3.7178761E-7    0.0000168767    0.000016424
  U4          -0.017907343     0.0004076095   -0.010061741     0.0013929763    0.000016424     0.0015020694

                          Test of First and Second Moment Specification

                                DF    Chi-Square    Pr > ChiSq
                                20    28.23         0.1040
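
The probability reported for the specification test is the upper-tail area of a chi-square distribution. As a hedged illustration, the sketch below recomputes it for the values in the table above (chi-square 28.23, 20 degrees of freedom) using a closed form that is valid only for even degrees of freedom. The function name is an assumption made for this example and is not part of the report.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum over i < df/2 of (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)**i / i! incrementally
        total += term
    return math.exp(-half) * total

p_value = chi2_sf_even_df(28.23, 20)    # values from the table above
equal_variance_rejected = p_value < 0.05
```

A probability above 0.05 means the hypothesis of equal error variance is not rejected at that level.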







4.5. Toluene




4.5.1. Bivariate Pearson Correlation




       The correlation coefficients between ln-transformed toluene concentration and the
distance to FC14 roadways and the inverse distance to FC14 roadways were -0.177 (p=0.018) and
0.172 (p=0.02), respectively. The distance to US Highway Route 1 was also significantly
correlated with the ln-transformed concentration of toluene in the residential ambient air
(-0.162, p=0.03). The correlation coefficient between ln-transformed toluene concentration and
the distance to the closest gas station was -0.18 (p=0.015). The refinery was the only point
source whose distance from the residence had a statistically significant correlation with the
ln-transformed toluene concentration in ambient air (-0.15, p=0.04).




       The meteorological variables that correlated with toluene concentration were
atmospheric stability (0.31, p<0.0001), wind speed (-0.198, p<0.01), mixing height (-0.19,
p<0.01), relative humidity (0.17, p<0.05), temperature (-0.135, p<0.1), and precipitation
(0.115, p<0.15). Atmospheric pressure was not correlated with the residential ambient air
concentration of toluene.
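
As a hedged illustration of the bivariate analysis described above, the sketch below computes Pearson correlation coefficients between an ln-transformed concentration and both a distance and its inverse. The data values, the variable names, and the distance unit are hypothetical and are not taken from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, for illustration only (not data from the study)
distance_m = [50.0, 120.0, 300.0, 450.0, 800.0]
toluene = [6.1, 5.2, 4.4, 4.6, 3.0]

ln_toluene = [math.log(c) for c in toluene]
inverse_distance = [1.0 / d for d in distance_m]

r_distance = pearson_r(distance_m, ln_toluene)        # expected to be negative
r_inverse = pearson_r(inverse_distance, ln_toluene)   # expected to be positive
```

A negative coefficient for distance paired with a positive coefficient for inverse distance is the pattern reported for the FC14 roadways.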










Preliminary Selection of Predictors




       The distance to the refinery and the distance to major urban arterial roadways
(FC14) were selected as important predictors among the variables that describe the distance
between sources and residences. From the meteorological variables, wind speed and
atmospheric stability were selected as predictor variables (p<0.15).










Selection of the Best-fitting Model

       The predictor variables selected by the different selection methods for the regression
model of the residential ambient air concentration of toluene, from the proximity variables
and the meteorological variables, were consistent. The meteorological variables included in
the regression model were atmospheric stability and temperature. The inverse distance
to the closest major urban arterial roadway (FC14) was included in the model as a predictor
among the major roadway proximity variables. The inverse distance to the refinery
was included as a significant predictor variable among the point source distance variables.
The distance to the gas station was not selected as a predictor for the toluene model.
The model was statistically significant (p<0.0001) with an r2 of 0.199 (adjusted r2 of
0.181).




       The parameter estimates for the meteorological predictors were significant at p<0.01,
and the parameter estimates for the proximity variables in the model were significant at
p<0.15. The model statistics are summarized in Table A-5. The residuals were distributed
relatively randomly (Figure A-9). The error term of the model followed a normal distribution,
based on the linearity observed in the PP plot (Appendix B). Possible outliers were
identified using a test statistic of ±1.645 (0.95, df = 165), and the model was refitted
after removing the less contributing outliers to improve the fit (Figure A-10). The analysis
of variance, parameter estimates, and summary of model statistics for the best-fitting
5-parameter model for toluene after removal of the outliers are listed in Table A-6. The
removal of the outliers improved the r2 from 0.20 to 0.33. The residuals were randomly
distributed, without any obvious trend or pattern, based on a visual examination
(Figure A-11). The standardized residuals of the best-fitting model appeared to be close to a
normal distribution and had constant variance. The residual, PP, and QQ plots showed no
visual evidence of lack of fit or unequal error variance. The Mallows' Cp statistic associated
with this particular subset of variables was determined to be 5, indicating that the
resulting model had an appropriate number of parameters.
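
The outlier screening described above can be sketched as follows. This is a minimal illustration that assumes residuals are standardized by the root mean square error and flagged when their absolute value exceeds 1.645; the report does not state exactly how its residuals were standardized, and the residual values and names below are hypothetical.

```python
import math

def flag_outliers(residuals, error_df, cutoff=1.645):
    """Return indices whose standardized residual exceeds the cutoff in absolute value.

    Residuals are divided by the root mean square error, sqrt(SSE / error_df).
    """
    sse = sum(r * r for r in residuals)
    rmse = math.sqrt(sse / error_df)
    return [i for i, r in enumerate(residuals) if abs(r / rmse) > cutoff]

# Hypothetical residuals, for illustration only
residuals = [0.1, -0.2, 0.05, 3.0, -0.1, 0.15]
flagged = flag_outliers(residuals, error_df=5)
```

After flagged observations are removed, the model is refitted, which is why the parameter estimates and r2 differ between the tables before and after outlier removal.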




Diagnostics of Equal Variances and Multicollinearity Diagnostics




       To test the assumption of equal variance, the parameter estimates were tested for
heteroscedasticity, and the predictors were tested for multicollinearity (Appendix C). The
chi-square was 28.15 with a probability of 0.014, a value smaller than 0.05. Therefore, the
variances of the parameter estimates were concluded to be significantly different, and equal
error variances could not be assumed for the best-fitting 5-parameter model of toluene. The
bivariate Pearson correlations between pairs of predictors included in the model identified
a statistically significant correlation between 'the inverse distance to the refinery' and
'the closest distance to the urban major arterial roadways (FC14)' (-0.284, p<0.0001).




       The variance inflation factors for the predictor variables were close to 1 (1.01 to
1.11), well below 10. Based on the variance inflation factors, there was no significant
collinearity between the predictors in the model. However, in the collinearity diagnostics,
the condition index was 107, which was greater than 100, and the eigenvalue was close to
zero (0.00038), which was smaller than 0.01. The proportions of variation for the intercept
(0.98) and temperature (0.97) were greater than 0.5, indicating that the two parameters
interacted. Therefore, there were possible co-dependencies in the model, which might
overspecify the model outcome.

Table A-5. Results of the 5-Parameter Multiple Linear Regression Model for Toluene

Analysis of Variance
                            Sum of       Mean
Source              DF      Squares      Square     F Value   Pr > F
Model                4       39.48395    9.87099    12.03     <.0001
Error              178      146.1108     0.82085
Corrected Total    182      185.5947

Root MSE                     0.90601     R-Square             0.2127
Dependent Mean               1.52498     Adjusted R-Square    0.1951
Coefficient of Variation    59.41107

Parameter Estimates
                                            Parameter   Standard
Variable    Label                     DF    Estimate    Error      t Value   Pr > |t|
Intercept   Intercept                  1     6.29278    2.41687     2.60     0.0100
f14_lmlnv   (Distance to FC14)^-1      1    16.66451    6.14926     2.71     0.0074
Stab4       Atmospheric Stability      1     0.65082    0.16197     4.02     <.0001
K4          Temperature                1    -0.03208    0.00830    -3.87     0.0002
RH5         Relative Humidity          1     0.01554    0.00613     2.54     0.0120

Summary of Stepwise Selection
                                  Partial     Model
Step   Variable Entered           R-Square    R-Square    Cp         F Value   Pr > F
1      Atmospheric Stability      0.1162      0.1162      23.6494    23.8      <.0001
2      Temperature                0.0384      0.1546      16.8377     8.18     0.0047
3      (Distance to FC14)^-1      0.0296      0.1843      12.0408     6.50     0.0116
4      Relative Humidity          0.0285      0.2127       7.5148     6.44     0.0120
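
The summary statistics in Table A-5 follow from its analysis of variance entries by standard regression arithmetic. The sketch below is a worked check under that assumption; the variable names are illustrative and the input numbers are copied from the table.

```python
import math

# Sums of squares and degrees of freedom from the Table A-5 analysis of variance
ss_model, ss_error = 39.48395, 146.1108
df_model, df_error = 4, 178

ss_total = ss_model + ss_error
df_total = df_model + df_error

r_square = ss_model / ss_total
# Adjusted R-square compares the error mean square with the total mean square
adj_r_square = 1.0 - (ss_error / df_error) / (ss_total / df_total)
f_value = (ss_model / df_model) / (ss_error / df_error)
root_mse = math.sqrt(ss_error / df_error)
```

These reproduce, to rounding, the reported R-Square (0.2127), Adjusted R-Square (0.1951), F Value (12.03), and Root MSE (0.90601).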

[Plot: Residual Plot of the Best Fit Model of Toluene. Horizontal axis: Predicted Value
(0.25 to 3.00). Plot annotations: N = 183, Rsq = 0.2127, AdjRsq = 0.1951, RMSE = 0.906.]

Figure A-9. Residual vs. Predicted Plot of the 5-Parameter Model of Toluene
[Plot: Outliers of Model for Toluene. Horizontal axis: Observation Number (25 to 200).
Plot annotations: N = 183, Rsq = 0.2127, AdjRsq = 0.1951, RMSE = 0.906.]

                 Figure A-10. Outliers of Model of Toluene

Table A-6. Results of the Best-fitting 5-Parameter Model for Toluene after Removing the
Outliers

Analysis of Variance
                            Sum of       Mean
Source              DF      Squares      Square     F Value   Pr > F
Model                4      30.71102     7.67776    18.42     <.0001
Error              162      67.51842     0.41678
Corrected Total    166      98.22944

Root MSE                     0.64559     R-Square             0.3126
Dependent Mean               1.60929     Adjusted R-Square    0.2957
Coefficient of Variation    40.11608

Parameter Estimates
                                            Parameter   Standard
Variable    Label                     DF    Estimate    Error      t Value   Pr > |t|
Intercept   Intercept                  1     3.11017    1.84905     1.68     0.0945
f14_lmlnv   (Distance to FC14)^-1      1    14.72149    4.43634     3.32     0.0011
Stab4       Atmospheric Stability      1     0.70584    0.12046     5.86     <.0001
K4          Temperature                1    -0.02067    0.00635    -3.25     0.0014
RH5         Relative Humidity          1     0.01116    0.00480     2.33     0.0212

Summary of Stepwise Selection
                                  Partial     Model
Step   Variable Entered           R-Square    R-Square    Cp         F Value   Pr > F
1      Atmospheric Stability      0.2245      0.2245      18.6321    47.76     <.0001
2      (Distance to FC14)^-1      0.0376      0.2620      11.8339     8.35     0.0044
3      Temperature                0.0276      0.2897       7.3616     6.34     0.0128
4      Relative Humidity          0.0230      0.3126       3.9805     5.42     0.0212

[Plot: Residual Plot of the Best Fit Model of Toluene, fitted after removing the outliers.
Plot annotations: N = 167, Rsq = 0.3126, AdjRsq = 0.2957.]






4.6. Benzene




4.6.1. Bivariate Pearson Correlation




       A negative correlation coefficient of -0.196 (p=0.008) was determined between the
distance to US Highway Route 1 and the ln-transformed benzene concentrations. The
correlation coefficient between ln-transformed benzene concentration and the inverse
distance to the closest gas station was significant (0.259, p=0.0004). Among the distances to
the four identified point sources, none was significantly correlated with benzene
concentration.




       The meteorological variables that correlated with benzene concentrations were
temperature (-0.392, p<0.0001), atmospheric stability (0.263, p=0.0003), mixing height
(-0.224, p=0.0024), wind speed (-0.167, p=0.025), and atmospheric pressure (0.165, p=0.026).
Precipitation and relative humidity were not significantly correlated with the benzene
concentration.










Preliminary Selection of Predictors




       The distances to the closest gas station, the refinery, and US Highway Route 1
were selected as important predictors of ambient benzene concentration (p<0.15). Wind
speed, temperature, and atmospheric stability were selected as predictor variables from the
meteorological variables (p<0.15).










Selection of the Best-fitting Model




       The predictor variables selected by the various selection methods in models for the
residential ambient air concentration of benzene were relatively consistent. The
meteorological variables that were consistently included were temperature, atmospheric
stability, and wind speed, in order of selection. The C(p) suggested that the 6-parameter
model was appropriate. The model statistics and parameter estimates are summarized in
Table A-7. The residuals were distributed relatively randomly (Figure A-13), suggesting
equal variance in the residuals. The probability plot (Appendix B) was linear, indicating
that the error term of the model followed a normal distribution.





       Possible outliers were identified based on a test statistic of ±1.645 (0.95, df = 175)
(Figure A-14). After the removal of sixteen possible outliers, the model became the
5-parameter model because wind speed was not included in the best-fitting regression
model. The analysis of variance, parameter estimates, and summary of model statistics for
the best-fitting 5-parameter model for benzene are listed in Table A-8. A visual
examination of the residuals indicated that they were randomly distributed without any
obvious trend or pattern (Figure A-15). The standardized residuals of the best-fitting model
were close to a normal distribution with constant variance. There was no visual evidence of
a lack of fit or of significantly unequal error variance for the best-fitting 5-parameter
regression model for the residential ambient air benzene concentration. The Mallows' Cp
statistic associated with this particular subset of variables was determined to be 5,
suggesting that the resulting model included an appropriate number of parameters.
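
The comparison of Mallows' Cp with the number of parameters can be illustrated with the usual definition, Cp = SSE_p / MSE_full - n + 2p, where p counts the intercept. The sketch below is a minimal example under that assumption; the numbers and names are hypothetical and are not taken from the report.

```python
def mallows_cp(sse_subset, mse_full, n_obs, n_params):
    """Mallows' Cp = SSE_p / MSE_full - n + 2p, with p counting the intercept."""
    return sse_subset / mse_full - n_obs + 2 * n_params

# Hypothetical numbers, for illustration only
n_obs, n_params = 169, 5
mse_full = 0.23
# A subset model that fits as well as the full model has SSE_p = MSE_full * (n - p)
sse_subset = mse_full * (n_obs - n_params)

cp = mallows_cp(sse_subset, mse_full, n_obs, n_params)
```

When the subset model carries little bias relative to the full model, Cp is close to p, which is the criterion applied in the text.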










Diagnostics of Equal Variances and Multicollinearity Diagnostics




       To test the assumption of equal variance, the parameter estimates were tested for
heteroscedasticity, and the predictors were tested for multicollinearity. The chi-square was
18.85 with a probability of 0.17, a value greater than 0.05. Therefore, the variances of the
parameter estimates were concluded to be not significantly different, and equal error
variances were assumed in the best-fitting 5-parameter model. The bivariate Pearson
correlations between pairs of predictors included in the model showed a statistically
significant correlation between 'the inverse distance to the closest gas station' and 'the
inverse distance to the closest urban major arterial roadways (FC14)' (0.184, p=0.013).




       The variance inflation factors for the predictor variables were close to 1 (1.01 to
1.06), well below 10. Based on the variance inflation factors, there was no significant
collinearity between the predictors in the model. However, in the collinearity diagnostics,
the condition index was 103, which was greater than 100, and the eigenvalue was close to
zero (0.00036), which was smaller than 0.01.

Table A-7. Results of the Best-fitting 6-Parameter Model for Benzene

Analysis of Variance
                            Sum of       Mean
Source              DF      Squares      Square     F Value   Pr > F
Model                3       26.86251    8.95417    20.62     <.0001
Error              179       77.73533    0.43428
Corrected Total    182      104.5978

Root MSE                      0.6590     R-Square             0.2568
Dependent Mean                0.1278     Adjusted R-Square    0.2444
Coefficient of Variation    515.6321

Parameter Estimates
                                                  Parameter   Standard
Variable    Label                           DF    Estimate    Error      t Value   Pr > |t|
Intercept   Intercept                        1    11.46110    1.74199     6.58     <.0001
GSlmlnv     (Distance to Gas Station)^-1     1    26.16590    7.30908     3.58     0.0004
K4          Temperature                      1    -0.03743    0.00585    -6.4      <.0001
U4          Wind Speed                       1    -0.17239    0.04456    -3.87     0.0002

Summary of Stepwise Selection
                                         Partial     Model
Step   Variable Entered                  R-Square    R-Square    Cp         F Value   Pr > F
1      Temperature                       0.1407      0.1407      27.2654    29.64     <.0001
2      Wind Speed                        0.0629      0.2036      14.1672    14.22     0.0002
3      (Distance to Gas Station)^-1      0.0532      0.2568       3.3947    12.82     0.0004

[Plot: Residual Plot of the Best Fit Model of Benzene. Plot annotations: N = 183,
Rsq = 0.2568, AdjRsq = 0.2444.]
Table A-8. Results of the Best-fitting 5-Parameter Model for Benzene after Removing the
Outliers

Analysis of Variance
                            Sum of       Mean
Source              DF      Squares      Square     F Value   Pr > F
Model                5      26.0494      5.20987    22.8      <.0001
Error              163      37.2387      0.22846
Corrected Total    168      63.2881

Root MSE                      0.4780     R-Square             0.4116
Dependent Mean                0.1438     Adjusted R-Square    0.3936
Coefficient of Variation    332.5000

Parameter Estimates
                                                  Parameter   Standard
Variable    Label                           DF    Estimate    Error      t Value   Pr > |t|
Intercept   Intercept                        1    10.07440    1.49805     6.73     <.0001
F14_lmlnv   (Distance to FC14)^-1            1     5.49770    3.32981     1.65     0.1007
GSlmlnv     (Distance to Gas Station)^-1     1    16.14780    5.57504     2.90     0.0043
Stab4       Atmospheric Stability            1     0.30356    0.09971     3.04     0.0027
K4          Temperature                      1    -0.03914    0.00447    -8.76     <.0001
U4          Wind Speed                       1    -0.08488    0.03966    -2.14     0.0338

Summary of Stepwise Selection
                                         Partial     Model
Step   Variable Entered                  R-Square    R-Square    Cp         F Value   Pr > F
1      Temperature                       0.2510      0.2510      41.1680    55.95     <.0001
2      Atmospheric Stability             0.0996      0.3506      15.7541    25.46     <.0001
3      (Distance to Gas Station)^-1      0.0340      0.3846       8.3926     9.12     0.0029
4      Wind Speed                        0.0172      0.4018       5.6629     4.71     0.0314
5      (Distance to FC14)^-1             0.0098      0.4116       4.9545     2.73     0.1007
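
Two internal consistency checks can be made on the stepwise summary in Table A-8. The sketch below assumes the usual relationships for ordinary least squares: the model R-Square at each step is the running sum of the partial R-Square values, and the F value for the last variable entered equals the square of that variable's t value in the final model. The variable names are illustrative and the input numbers are copied from the table.

```python
# Partial R-Square values in order of entry, from the Table A-8 stepwise summary
partial_r_square = [0.2510, 0.0996, 0.0340, 0.0172, 0.0098]

model_r_square = []
running = 0.0
for partial in partial_r_square:
    running += partial
    model_r_square.append(round(running, 4))

# t value reported for (Distance to FC14)^-1, the last variable entered
t_last_entered = 1.65
f_from_t = t_last_entered ** 2
```

The squared t value (about 2.72) agrees with the reported F value of 2.73 to within the rounding of the printed t value, and both share the same probability (0.1007).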

[Plot: Residual Plot of the Best Fit Model of Benzene. Horizontal axis: Predicted Value
(-0.75 to 1.50). Plot annotations: N = 169, Rsq = 0.4116, AdjRsq = 0.3936, RMSE = 0.478.]

Figure A-15. Residual vs. Predicted Plot of the Best-fitting 5-Parameter Model of Benzene
after Removing the Outliers
[Plot: Cp Plot with Reference Lines. Vertical axis: Cp (0 to 50); horizontal axis: P (2.0 to
6.0). Reference lines: CP = P and CP = 2P - (P for full model) + 1. Plot annotations:
N = 169, Rsq = 0.4116, AdjRsq = 0.3936, RMSE = 0.478.]

Figure A-16. Cp Plot for the Best-fitting 5-Parameter Model for Benzene after Removing the
Outliers







4.7. Ethylbenzene




4.7.1. Bivariate Pearson Correlation




       Ethylbenzene concentration was not significantly correlated with any of the roadway
classification proximity variables, either directly or following any transformation. The
inverse distance from the sampler location to US Highway Route 1 (one of the individual
FC14 roadways) was significantly correlated with the untransformed ambient air
concentration of ethylbenzene (0.167, p=0.024). The inverse distance to the closest gas
station was also significantly correlated with the untransformed ambient air concentration
of ethylbenzene (0.206, p=0.005). There were two identified point sources of ethylbenzene in
the study area, but only the distance to the refinery was correlated with ethylbenzene
concentration (p<0.1).




       The meteorological variables that correlated with ln-transformed ethylbenzene
concentrations were atmospheric stability (0.27, p=0.0003), wind speed (-0.17, p=0.024), and
temperature (-0.14, p=0.06). Mixing height, relative humidity, precipitation, and atmospheric
pressure were not significantly correlated with the ambient air concentration of
ethylbenzene.










4.7.2. Preliminary Selection of Predictors




       A series of preliminary regression analyses for each group of variables was
performed using the ln-transformed ethylbenzene concentrations to determine which
variables to include in the model. The distance to the closest gas station was selected as an
important predictor among the variables that describe the distance between sources and
residences. From the meteorological variables, atmospheric stability was selected as a
predictor variable (p<0.15).

4.7.3. Selection of the Best-fitting Model




       The predictor variables selected in the models by the different selection methods for
the residential ambient air concentration of ethylbenzene were relatively consistent.
Atmospheric stability and temperature were selected among the meteorological variables as
predictor variables. The inverse square distance to the urban major arterial roadways (FC14)
was selected as a significant predictor in the model of ethylbenzene among the source
proximity variables. The C(p) was 4.3, close to the number of parameters (4) included in
the model. The parameter estimates were significant at p<0.05 for the meteorological
variables (atmospheric stability and temperature). The model statistics are summarized in
Table A-9. The residuals were distributed relatively randomly (Figure A-17), suggesting
equal variance in the residuals. The probability plot (Appendix C) was linear, indicating
that the error term of the model followed a normal distribution. Possible outliers were
identified using a test statistic of ±1.645 (0.95, df = 175) (Figure A-18). The analysis of
variance, parameter estimates, and summary of model statistics for the best-fitting
4-parameter model for ethylbenzene after removal of the outliers are listed in Table A-10.
The removal of the seventeen outliers did not improve the r2 as much as was observed for
the models of the other VOCs in this research. However, the probability of the parameter
estimates of the best-fitting model was improved for all variables (p<0.05).




       The residuals for the model with the outliers removed were more randomly
distributed (Figure A-19) than they were before removal (Figure A-17). The
standardized residuals of the best-fitting model were close to a normal distribution and had
constant variance. The linearity of the probability plot showed that the error term followed
a normal distribution. Based on a visual diagnosis, there was no evidence of a lack of fit or
unequal error variance for the best-fitting 4-parameter regression model for the ambient
residential ethylbenzene. The Mallows' Cp statistic associated with this particular subset of
variables was determined to be 4 (Figure A-20), indicating that the resulting model had the
appropriate number of parameters.










Diagnostics of Equal Variances and Multicollinearity Diagnostics




       To test the assumption of equal variance, the parameter estimates were tested for
heteroscedasticity, and the predictors were tested for multicollinearity. The chi-square was
16.88 with a probability of 0.0506, a value slightly greater than 0.05. Therefore, the
variances of the parameter estimates were concluded to be not significantly different, and
equal error variances were assumed in the best-fitting 4-parameter model. The bivariate
Pearson correlations between pairs of predictors did not identify any statistically
significant correlations between predictors in the model at α = 0.05.




       The variance inflation factors for the predictor variables were close to 1 (1.007 to
1.023), well below 10. Based on the variance inflation factors, there was no significant
collinearity between the predictors in the model. In the collinearity diagnostics, the
condition index was 91, which was smaller than 100. However, the eigenvalue was close
to zero (0.00037), which was smaller than 0.01. The proportions of variation for the
intercept (0.98) and temperature (0.97) were greater than 0.5, indicating that the two
parameters interacted. As described for m,p-xylene, some codependency among the variables
exists.

Table A-9. Results of the Best-fitting 4-Parameter Model for Ethylbenzene

Analysis of Variance
                            Sum of       Mean
Source              DF      Squares      Square     F Value   Pr > F
Model                4       32.8909     8.22271    6.56      <.0001
Error              178      223.166      1.25374
Corrected Total    182      256.057

Root MSE                       1.1197    R-Square             0.1285
Dependent Mean                -0.2723    Adjusted R-Square    0.1089
Coefficient of Variation    -411.2200

Parameter Estimates
                                            Parameter   Standard
Variable    Label                     DF    Estimate    Error      t Value   Pr > |t|
Intercept   Intercept                  1     5.32543    3.34905     1.59     0.1136
F14_lmlnv   (Distance to FC14)^-1      1    13.31980    7.59097     1.75     0.0810
Stab4       Atmospheric Stability      1     0.52795    0.21804     2.42     0.0165
K5          Temperature                1    -0.02653    0.00999    -2.66     0.0086
U4          Wind Speed                 1    -0.17165    0.08826    -1.94     0.0534

Summary of Stepwise Selection
                                  Partial     Model
Step   Variable Entered           R-Square    R-Square    Cp        F Value   Pr > F
1      Atmospheric Stability      0.0741      0.0741      7.2130    14.49     0.0002
2      Temperature                0.0204      0.0945      5.1088     4.06     0.0455
3      Wind Speed                 0.0189      0.1134      3.3142     3.81     0.0525
4      (Distance to FC14)^-1      0.0151      0.1285      2.2823     3.08     0.0810

[Plot: Residual Plot of the Best Fit Model of Ethylbenzene. Horizontal axis: Predicted Value
(-1.50 to 1.00). Plot annotations: N = 183, Rsq = 0.1285, AdjRsq = 0.1089, RMSE = 1.1197.]

Figure A-17. Residual vs. Predicted Plot of the 4-Parameter Model of Ethylbenzene
[Plot: Outliers of Model for Ethylbenzene. Horizontal axis: Observation Number (25 to 200).
Plot annotations: N = 183, Rsq = 0.1285, AdjRsq = 0.1089, RMSE = 1.1197.]

           Figure A-18. Outliers of 4-Parameter Model of Ethylbenzene

Table A-10. Results of the Best-fitting 4-Parameter Multiple Linear Regression Model for
Ethylbenzene after Removing the Outliers

Analysis of Variance
                            Sum of       Mean
Source              DF      Squares      Square     F Value   Pr > F
Model                4       20.4924     5.12309    7.96      <.0001
Error              164      105.564      0.64368
Corrected Total    168      126.056

Root MSE                       0.8023    R-Square             0.1626
Dependent Mean                -0.1066    Adjusted R-Square    0.1421
Coefficient of Variation    -752.6100

Parameter Estimates
                                            Parameter   Standard
Variable    Label                     DF    Estimate    Error      t Value   Pr > |t|
Intercept   Intercept                  1     5.98125    2.43806     2.45     0.0152
F14_lmlnv   (Distance to FC14)^-1      1     9.68110    5.62338     1.72     0.0870
Stab4       Atmospheric Stability      1     0.43775    0.16287     2.69     0.0079
K4          Temperature                1    -0.02747    0.00732    -3.75     0.0002
U4          Wind Speed                 1    -0.11372    0.06587    -1.73     0.0861

Summary of Stepwise Selection
                                  Partial     Model
Step   Variable Entered           R-Square    R-Square    Cp         F Value   Pr > F
1      Atmospheric Stability      0.0774      0.0774      13.4719    14.01     0.0002
2      Temperature                0.0552      0.1326       4.8002    10.56     0.0014
3      Wind Speed                 0.0149      0.1474       3.9236     2.88     0.0917
4      (Distance to FC14)^-1      0.0151      0.1626       2.9959     2.96     0.0870

[Plot: Residual Plot of the Best Fit Model of Ethylbenzene. Horizontal axis: Predicted Value
(-1.0 to 0.8). Plot annotations: N = 169, Rsq = 0.1626, AdjRsq = 0.1421, RMSE = 0.8023.]

Figure A-19. Residual vs. Predicted Plot of the Best-fitting 4-Parameter Model of
Ethylbenzene after Removing the Outliers
[Figure: Cp Plot with Reference Lines for the ethylbenzene model. x-axis: P (number of
parameters); y-axis: Cp. Reference lines: CP = P and CP = 2P - (P for full model) + 1.
N = 169, Rsq = 0.1626, AdjRsq = 0.1421, RMSE = 0.8023]

Figure A-20. Cp Plot for the Best-fitting 4-Parameter Model for Ethylbenzene after
Removing the Outliers

-------
                                                                                 A-39






Methyl tert Butyl Ether (MTBE)







Bivariate Pearson Correlation




       The correlation coefficient between the untransformed ambient air concentration of
MTBE and the distance, and the inverse distance, to the nearest FC11 roadway (urban
interstate highways) was 0.22 (p=0.0027). The only identified point source of MTBE within
3 kilometers of Elizabeth, NJ, that generated more than 0.9 tons in 1999 was the refinery in
Linden. However, the distance to the refinery was not significantly correlated with the MTBE
air concentration.




       The meteorological variables that were significantly correlated with the
ln-transformed MTBE air concentrations were: atmospheric stability (0.296, p<0.0001), wind
speed (-0.265, p=0.0004), relative humidity (0.196, p=0.0094), and temperature (0.173,
p=0.022). The atmospheric pressure and precipitation were not correlated with the MTBE
air concentrations.
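
The bivariate correlations above can be illustrated with a short sketch. It computes a
Pearson coefficient on made-up data and then the t statistic implied by a reported
coefficient. The sample values are hypothetical, and the sample size of 183 is an assumption
taken from the degrees of freedom in Table A-11; the report does not state the n used for
each correlation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r = 0."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical inverse distances and ln-concentrations, for illustration only.
inv_distance = [0.010, 0.004, 0.020, 0.008, 0.002, 0.015]
ln_conc = [1.6, 1.1, 2.0, 1.3, 0.9, 1.5]
r_example = pearson_r(inv_distance, ln_conc)

# t statistic implied by the reported r = 0.22, assuming n = 183.
t_reported = t_statistic(0.22, 183)
print(f"example r = {r_example:.3f}, t for r=0.22, n=183: {t_reported:.2f}")
```

A t statistic near 3 on roughly 180 degrees of freedom is in line with the small p-value
reported for the FC11 correlation.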










Preliminary Selection of Predictors




       The distance from the air sampler to the closest gas station was selected as a
predictor of the ambient air concentration of MTBE at p<0.15. The distances to the major
roadways and the refinery were not selected as significant predictors. Atmospheric stability
was selected as a predictor from the meteorological variables.










Selection of the Best-fitting Model




       The predictor variables selected by the different regression model selection methods
for the residential ambient air concentration of MTBE were relatively consistent. The
meteorological variables that were consistently included in the series of regression models

-------
                                                                                 A-40






were atmospheric stability, temperature, and wind speed, in order of selection. The distance
to the closest interstate roadway (FC11) and the distance to the closest major urban arterial
roadway (FC14) were not selected as significant predictor variables in the model of MTBE. The
distance to the closest gas station was included as a significant predictor variable in the
model of MTBE. The model statistics are summarized in Table A-11. The residuals were
relatively randomly distributed but showed some irregular pattern in the error variances
(Figure A-21). The PP plot was nearly linear, implying that the error term of the model
followed a normal distribution (Appendix B). Fourteen data points were identified as possible
outliers using test statistics of ±1.655 at 0.95, df=165 (Figure A-22). The analysis of
variance, parameter estimates, and summary of model statistics for the best-fitting
5-parameter model for MTBE after removing the outliers are listed in Table A-12. After
removing the outliers, the parameter estimates became more significant for all variables. A
visual examination of the residuals indicated that they were randomly distributed without
showing any obvious trend or particular pattern (Figure A-23). The standardized residuals of
the best-fitting model were close to a normal distribution with constant variance. The PP
plot was nearly linear, implying that the error term of the model followed a normal
distribution (Appendix C). There was no visual evidence of lack of fit or unequal error
variance for the best-fitting 5-parameter regression model for the residential ambient air
MTBE concentrations. The Mallows' Cp statistic associated with this particular subset of
variables was 5.0. Since the number of parameters (p), including the intercept, in the
best-fitting model was 5, the resulting model had the appropriate number of parameters.

-------
                                                                                 A-41






Diagnostics of Equal Variances and Multicollinearity Diagnostics




       To test the assumption of equal variance, the heteroscedasticity of the parameter
estimates was tested, as was multicollinearity. The chi-square was 15.55 with a probability
of 0.34, a value greater than 0.05. Therefore, the variances of the parameter estimates could
be considered not significantly different, and equal error variances in the parameter
estimates were assumed in the best-fitting 5-parameter model. The bivariate Pearson
correlations between pairs of predictors included in the model identified statistically
significant correlations between wind speed and atmospheric stability (-0.51, p<0.0001),
and between wind speed and temperature (-0.25, p=0.0007).




       The variance inflation factors for seven predictor variables were close to 1
(1.03 ~ 1.46), well below the reference value of 10. Based on the variance inflation
factors, there was no significant collinearity between the predictors in the model. However,
the collinearity diagnostics gave a condition index of 115, which was greater than 100, and
an eigenvalue close to zero (0.00033), which was smaller than 0.01. The proportions of
variation of the intercept (0.98) and temperature (0.94) were greater than 0.5, indicating
that the two parameters interacted. Therefore, there were possible co-dependences between
some predictors, which might overspecify the model outcome.
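
The two collinearity screens described above can be sketched as follows. The formulas are
the standard definitions (VIF = 1/(1 - R^2) for a predictor regressed on the others;
condition index = sqrt(largest eigenvalue / eigenvalue)). The function names, the example
eigenvalues, and the default limits are illustrative assumptions, not the report's SAS
procedure.

```python
import math

def variance_inflation_factor(r_squared):
    """VIF of a predictor whose regression on the other predictors has this R^2."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)

def condition_index(largest_eigenvalue, eigenvalue):
    """Condition index for one eigenvalue of the scaled cross-product matrix."""
    if eigenvalue <= 0 or largest_eigenvalue <= 0:
        raise ValueError("eigenvalues must be positive")
    return math.sqrt(largest_eigenvalue / eigenvalue)

def collinearity_flags(vifs, indices, vif_limit=10.0, index_limit=100.0):
    """Return (any VIF above limit, any condition index above limit)."""
    return (any(v > vif_limit for v in vifs),
            any(c > index_limit for c in indices))

# With only two predictors, R^2 is the squared pairwise correlation.
# The wind speed / stability correlation of -0.51 gives a VIF of about 1.35.
vif_wind_stability = variance_inflation_factor((-0.51) ** 2)

# Hypothetical eigenvalues, chosen only to show the calculation.
ci_example = condition_index(4.0, 0.0004)

print(round(vif_wind_stability, 3), round(ci_example, 1))
print(collinearity_flags([1.03, 1.46], [115.0]))
```

The last line reproduces the situation in the text: the VIF screen passes while the
condition index screen does not, so the two diagnostics can disagree.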

-------
                                                                            A-42
Table A-11. Results of the Best-fitting 5-Parameter Model for MTBE

Analysis of Variance

Source             DF    Sum of Squares   Mean Square   F Value   Pr > F
Model               2         23.4697       11.7349       8.23    0.0004
Error             180        256.619         1.42566
Corrected Total   182        280.089

Root MSE                    1.1940    R-Square            0.0838
Dependent Mean              1.2495    Adjusted R-Square   0.0736
Coefficient of Variation   95.5620

Parameter Estimates

Variable    Label                          DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                       1         1.97438            0.35393         5.58    <.0001
GSlmlnv     (Distance to Gas Station)^-1    1        39.49380           13.21040         2.99    0.0032
U4          Wind speed                      1        -0.21592            0.07758        -2.78    0.0060

Summary of Stepwise Selection

Step   Variable Entered                Partial R-Square   Model R-Square   Cp       F Value   Pr > F
1      (Distance to Gas Station)^-1    0.0444             0.0444           5.9898   8.40      0.0042
2      Wind Speed                      0.0394             0.0838           0.3584   7.75      0.0060
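
As an illustration of how the fitted equation in Table A-11 would be applied, the sketch
below evaluates the ln-linear model for a given distance and wind speed and back-transforms
the result. The coefficients come from the table; the function names are hypothetical, and
the units of distance, wind speed, and concentration are not stated in this section, so the
inputs are placeholders only.

```python
import math

# Coefficients from Table A-11 (model before outlier removal).
INTERCEPT = 1.97438
B_INV_DISTANCE = 39.49380   # multiplies 1 / (distance to closest gas station)
B_WIND_SPEED = -0.21592     # multiplies wind speed (U4)

def predict_ln_mtbe(distance, wind_speed):
    """Predicted ln(MTBE concentration); input units are assumed, not documented here."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return INTERCEPT + B_INV_DISTANCE / distance + B_WIND_SPEED * wind_speed

def predict_mtbe(distance, wind_speed):
    """Back-transform the ln prediction to the concentration scale."""
    return math.exp(predict_ln_mtbe(distance, wind_speed))

# Placeholder inputs: closer to a gas station or calmer wind gives a higher prediction.
near = predict_mtbe(distance=50.0, wind_speed=4.0)
far = predict_mtbe(distance=500.0, wind_speed=4.0)
print(f"near: {near:.2f}, far: {far:.2f}")
```

Because the model is fitted on the ln scale, exponentiating the prediction gives a
median-type estimate rather than an arithmetic mean; with an R-square of 0.0838 the
predictions carry wide uncertainty in any case.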

-------
                                                                                A-43
[Figure: Residual Plot of the Best Fit Model of MTBE. x-axis: Predicted Value; y-axis:
Residual. N = 183, Rsq = 0.0838, AdjRsq = 0.0736, RMSE = 1.194]

Figure A-21. Residual vs. Predicted Plot of the Best-fitting 5-Parameter Model of MTBE
[Figure: Outliers of Model for MTBE. x-axis: Observation Number. Rsq = 0.0838,
AdjRsq = 0.0736]

Figure A-22. Outliers of 5-Parameter Model of MTBE

-------
                                                                             A-44
Table A-12. Results of the Best-fitting 5-Parameter Model for MTBE after Removing the
Outliers

Analysis of Variance

Source             DF    Sum of Squares   Mean Square   F Value   Pr > F
Model               5         29.1548       5.83095      10.97    <.0001
Error             165         87.6716       0.53134
Corrected Total   170        116.826

Root MSE                    0.7289    R-Square            0.2496
Dependent Mean              1.4525    Adjusted R-Square   0.2268
Coefficient of Variation   50.1840

Parameter Estimates

Variable    Label                          DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                       1        -2.74300            2.27071        -1.21    0.2288
F11_lmlnv   (Distance to FC11)^-1           1        22.25470           14.30720         1.56    0.1217
GSlmlnv     (Distance to Gas Station)^-1    1        33.56270            8.28221         4.05    <.0001
Stab4       Atmospheric Stability           1         0.24348            0.14963         1.63    0.1056
K5          Temperature                     1         0.01239            0.00669         1.85    0.0659
U4          Wind speed                      1        -0.18702            0.06022        -3.11    0.0022

Summary of Stepwise Selection

Step   Variable Entered                Partial R-Square   Model R-Square   Cp        F Value   Pr > F
1      Wind Speed                      0.1226             0.1226           23.1543   23.62     <.0001
2      (Distance to Gas Station)^-1    0.0897             0.2123            5.7156   19.13     <.0001
3      Temperature                     0.0136             0.2259            4.7657    2.94     0.0885
4      Atmospheric Stability           0.0126             0.2386            4.0309    2.75     0.0991
5      (Distance to FC11)^-1           0.0110             0.2496            3.6458    2.42     0.1217
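
The summary statistics in Table A-12 follow from its sums of squares and degrees of freedom.
The sketch below recomputes them as a consistency check; the function name and dictionary
keys are illustrative, and the formulas are the usual ordinary least squares definitions
rather than anything specific to the report's software.

```python
import math

def anova_summary(ss_model, ss_error, df_model, df_error):
    """Derive F, R-square, adjusted R-square and root MSE from an ANOVA table."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "f_value": ms_model / ms_error,
        "r_square": ss_model / ss_total,
        "adj_r_square": 1.0 - ms_error / (ss_total / df_total),
        "root_mse": math.sqrt(ms_error),
    }

# Sums of squares and degrees of freedom as listed in Table A-12.
mtbe = anova_summary(ss_model=29.1548, ss_error=87.6716, df_model=5, df_error=165)
for name, value in mtbe.items():
    print(f"{name}: {value:.4f}")
```

The recomputed values agree with the tabulated F value (10.97), R-square (0.2496), adjusted
R-square (0.2268), and root MSE (0.7289) to within rounding.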

-------
                                                                                 A-45
[Figure: Residual Plot of the Best Fit Model of MTBE. x-axis: Predicted Value; y-axis:
Residual. N = 171, Rsq = 0.2496, AdjRsq = 0.2268, RMSE = 0.7289]

Figure A-23. Residual vs. Predicted Plot of the Best-fitting 5-Parameter Model of MTBE
after Removing the Outliers
[Figure: Cp Plot with Reference Lines for the MTBE model. x-axis: P (number of parameters);
y-axis: Cp. Reference lines: CP = P and CP = 2P - (P for full model) + 1. N = 171,
Rsq = 0.2496, AdjRsq = 0.2268, RMSE = 0.7289]

Figure A-24. Cp Plot for the Best-fitting 5-Parameter Model for MTBE after Removing the
Outliers

-------
                                                                                 A-46






4.9. Tetrachloroethylene (PCE)




4.9.1. Bivariate Pearson Correlation




       The distance to the closest dry cleaning facility (DCF1) was statistically
significantly correlated with the ln-transformed ambient air concentration of PCE at
α = 0.01, regardless of the form of transformation of the distance variable. The refinery
was the only identified point source of PCE, with a combined annual generation of 1.0 ton.
The distance to the refinery was not significantly correlated with PCE in any of the
analyses. Statistically significant correlations (p<0.01) were also found for the distance
to US Highway Route 1 and the distance to the closest gas station. Other roadway proximity
variables were not significantly correlated.




       The meteorological variables that were significantly correlated with PCE
concentrations were wind speed (-0.373, p<0.0001), relative humidity (0.313, p<0.0001),
atmospheric stability (0.282, p=0.0001), mixing height (-0.254, p=0.0005), and precipitation
(0.235, p=0.0013). Temperature and atmospheric pressure were not correlated with the
residential ambient air concentration of PCE.










4.9.2. Preliminary Selection of Predictors




       A series of preliminary regression analyses was performed on the ln-transformed
PCE concentration to determine which variables to include in the model. The distance to
the closest dry cleaning facility was selected as an important predictor of ambient PCE
concentration from the variables that describe the distance between sources and residences.
Wind speed, precipitation, and stability were selected as predictor variables (p<0.15)
from the meteorological variables.

-------
                                                                                 A-47






Selection of the Best-fitting Model




       The predictor variables selected by the different regression model selection methods
for the residential ambient air concentration of PCE were relatively consistent. The
meteorological variables that were consistently included in the series of regression models
were wind speed, temperature, atmospheric stability, and relative humidity, in order of
selection. The inverse distance to the closest dry cleaning facility was selected as a
significant predictor variable in the best-fitting model of PCE, while the distances to major
roadways and gas stations were, as expected, not selected. The model statistics are
summarized in Table A-13. The residuals were relatively randomly distributed but showed a
few outliers in the error variances (Figure A-25). The PP plot was nearly linear, implying
that the error term of the model followed a normal distribution. Eight data points were
identified as possible outliers using test statistics of ±1.655 at 0.95, df=165
(Figure A-26). The analysis of variance, parameter estimates, and summary of the model
statistics for the best-fitting 6-parameter model for PCE after removing the outliers are
listed in Table A-14. The selected model was statistically significant (p<0.0001).
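
The outlier screen described above amounts to flagging observations whose test statistic
falls outside ±1.655. A minimal sketch follows, assuming the statistic is a studentized
residual (the report does not say exactly how it was computed); the residual values are
hypothetical.

```python
def flag_outliers(studentized_residuals, critical_value=1.655):
    """Return indices of observations whose |residual| exceeds the critical value."""
    return [i for i, r in enumerate(studentized_residuals)
            if abs(r) > critical_value]

# Hypothetical studentized residuals, for illustration only.
residuals = [0.2, -1.7, 1.655, 2.4, -0.9]
flagged = flag_outliers(residuals)
print(flagged)
```

A cutoff near 1.655 corresponds to a one-sided 0.95 quantile of the t distribution with
about 165 degrees of freedom, so roughly 10 percent of well-behaved observations would be
flagged by chance when the cutoff is applied on both sides.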




       A visual examination of the residuals indicated that they were randomly distributed
without showing any obvious trend or particular pattern (Figure A-27). The standardized
residuals of the best-fitting model were close to a normal distribution with constant
variance. The PP plot was nearly linear, implying that the error term of the model followed
a normal distribution. There was no visual evidence of lack of fit or unequal error variance
for the best 6-parameter regression model for the residential ambient air PCE concentration.
The Mallows' Cp statistic associated with this particular subset of variables was 6.0
(Figure A-28). Since the number of parameters (p), including the intercept, in the
best-fitting model was 6, the resulting model had the appropriate number of parameters.

-------
                                                                                 A-48
Diagnostics of Equal Variances and Multicollinearity Diagnostics




       To test the assumption of equal variance, the heteroscedasticity of the parameter
estimates was tested, as was multicollinearity. The chi-square was 18.1 with a probability
of 0.58, a value greater than 0.05. Therefore, the variances of the parameter estimates could
be considered not significantly different, and equal error variances in the parameter
estimates were assumed in the best-fitting 6-parameter model. The bivariate Pearson
correlations identified statistically significant correlations between wind speed and
atmospheric stability (-0.51, p<0.0001), and between wind speed and temperature (-0.25,
p=0.0007). Relative humidity was also significantly correlated with atmospheric stability
(0.36, p<0.0001) and with wind speed (-0.43, p<0.0001).




       The variance inflation factors for seven predictor variables were close to 1
(1.04 ~ 1.44), well below the reference value of 10. Based on the variance inflation
factors, there was no significant collinearity between the predictors in the model. However,
the collinearity diagnostics gave a condition index of 130, which was greater than 100, and
an eigenvalue close to zero (0.00033), which was smaller than 0.01.

-------
                                                                             A-49
Table A-13. Results of the Best-fitting 6-Parameter Model for PCE

Analysis of Variance

Source             DF    Sum of Squares   Mean Square   F Value   Pr > F
Model               4         25.689        6.42224      11.85    <.0001
Error             178         96.5014       0.54214
Corrected Total   182        122.190

Root MSE                      0.7363    R-Square            0.2102
Dependent Mean               -0.3394    Adjusted R-Square   0.1925
Coefficient of Variation   -216.9400

Parameter Estimates

Variable    Label                   DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                1        -0.65455            0.86751        -0.75    0.4515
DCFlmlnv    (Distance to DCF)^-1     1        54.38580           21.18230         2.57    0.0111
Stab4       Atmospheric Stability    1         0.21071            0.14312         1.47    0.1427
U4          Wind speed               1        -0.22587            0.05600        -4.03    <.0001
PrecipS     Precipitation            1         0.01026            0.00388         2.65    0.0088

Summary of Stepwise Selection

Step   Variable Entered         Partial R-Square   Model R-Square   Cp        F Value   Pr > F
1      Wind Speed               0.1389             0.1389           13.2138   29.20     <.0001
2      (Distance to DCF)^-1     0.0314             0.1704            8.1961    6.82     0.0098
3      Precipitation            0.0303             0.2006            3.4399    6.78     0.0100
4      Atmospheric Stability    0.0096             0.2102            3.2933    2.17     0.1427

-------
                                                                                  A-50
[Figure: Residual Plot of the Best Fit Model of PCE. N = 183, Rsq = 0.2102,
AdjRsq = 0.1925]
-------
                                                                               A-51
Table A-14. Results of the Best-fitting 6-Parameter Model for PCE after Removing the
Outliers

Analysis of Variance

Source             DF    Sum of Squares   Mean Square   F Value   Pr > F
Model               5         12.5816       2.51633      14.36    <.0001
Error             158         27.6904       0.17526
Corrected Total   163         40.2721

Root MSE                      0.4186    R-Square            0.3124
Dependent Mean               -0.2064    Adjusted R-Square   0.2907
Coefficient of Variation   -202.8800

Parameter Estimates

Variable    Label                   DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                1         2.49450            1.34715         1.85    0.0659
DCFlmlnv    (Distance to DCF)^-1     1        32.67340           12.39640         2.64    0.0092
Stab4       Atmospheric Stability    1         0.14442            0.08831         1.64    0.1040
K5          Temperature              1        -0.01229            0.00416        -2.96    0.0036
U4          Wind speed               1        -0.14410            0.03588        -4.02    <.0001
RH4         Relative Humidity        1         0.00913            0.00301         3.04    0.0028

Summary of Stepwise Selection

Step   Variable Entered         Partial R-Square   Model R-Square   Cp        F Value   Pr > F
1      Wind Speed               0.1847             0.1847           25.6505   36.70     <.0001
2      (Distance to DCF)^-1     0.0381             0.2228           18.9710    7.90     0.0056
3      Relative Humidity        0.0317             0.2545           13.7554    6.80     0.0100
4      Temperature              0.0463             0.3008            5.2151   10.53     0.0014
5      Atmospheric Stability    0.0116             0.3124            4.5652    2.67     0.1040

-------
                                                                                    A-52
[Figure: Residual Plot of the Best Fit Model of PCE. x-axis: Predicted Value; y-axis:
Residual. N = 164, Rsq = 0.3124, AdjRsq = 0.2907, RMSE = 0.4186]

Figure A-27. Residual vs. Predicted Plot of the Best-fitting 6-Parameter Model of PCE after
Removing the Outliers
[Figure: Cp Plot with Reference Lines for the PCE model. x-axis: P (number of parameters);
y-axis: Cp. Reference lines: CP = P and CP = 2P - (P for full model) + 1. N = 164,
Rsq = 0.3124, AdjRsq = 0.2907, RMSE = 0.4186]

Figure A-28. Cp Plot for the Best-fitting 6-Parameter Model for PCE after Removing the
Outliers

-------
                                                                                A-53






PM2.5 Mass




Bivariate Pearson Correlation




       The correlation coefficients between the ln-transformed PM2.5 mass and the inverse
of the distance to urban interstate roadways (FC11), minor arterial roads (F16), and local
roads (F19) were 0.21 (p=0.03), 0.22 (p=0.03), and 0.23 (p=0.02), respectively.




       The meteorological variables that were statistically significantly correlated with
the ln-transformed PM2.5 mass, with their Pearson correlation coefficients, were: atmospheric
stability, 0.56 (p<0.0001); mixing height, -0.26 (p=0.01); wind speed, -0.50 (p<0.0001); and
relative humidity, 0.39 (p<0.0001).










Preliminary Selection of Predictors




       The preliminary regression analysis was performed on the ln-transformed PM2.5 mass
to determine the relative importance of variables within the same type (proximity or
meteorological) of independent variables. The distances to the urban interstates (FC11) and
local roadways (FC19) were included in the linear regression model (p<0.15). The inverse
distance to a truck loading/unloading area (PM03) was also selected. Among the
meteorological variables, atmospheric stability, temperature, atmospheric pressure, and wind
speed were selected.










Selection of the Best-fitting Model





       The variables selected by the different regression methods were relatively
consistent. Atmospheric stability was the most important factor in the regression model,
with a partial r2 of 0.318. The wind speed, temperature, and atmospheric pressure were also
included as predictors in the model. The model also included the inverse distances to the

-------
                                                                             A-54






major roadways (FC11), local roadways (F19), and truck loading area (PM03). The
parameters and analysis of variance of the regression equation for the PM2.5 mass
ambient air concentration for the best-fitting model with 6 variables selected are given in
Table A-15. The C(p), which is Mallows' Cp statistic, associated with this particular
subset of variables was determined to be 8.0. The resulting model was appropriate in
number of parameters, because the number of parameters (p), including the intercept, in
the best-fitting model matched the C(p) value. The diagnostic plots, the residual plot
against the predicted values and the normal probability-probability (PP) plot, were
generated and visually examined (Figure A-30 and Appendix B). The residuals were
randomly distributed without showing any obvious trend or particular pattern (Figure
A-30) and were close to a normal distribution with constant variance. The PP plot was
nearly linear, so the error term of the model could be considered to follow a normal
distribution. Based on the visual diagnosis, there was no significant evidence of lack of fit
or of unequal error variance for the best 8-parameter regression model.




       No evidence of outliers was found.
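
The Mallows' Cp values quoted for the PM2.5 models can be reproduced from the sums of squares
in Tables A-15 and A-16. The sketch below uses the standard definition
Cp = SSE_p / MSE_full - (n - 2p), where p counts the intercept; the function and variable
names are illustrative, and treating the 8-parameter model of Table A-15 as the "full" model
is an assumption consistent with its Cp of 8.0000.

```python
def mallows_cp(sse_subset, mse_full, n_obs, n_params):
    """Mallows' Cp for a subset model with n_params parameters (intercept included)."""
    return sse_subset / mse_full - (n_obs - 2 * n_params)

N_OBS = 102                      # corrected total DF (101) + 1
SSE_FULL = 12.57688              # error sum of squares, Table A-15
DF_ERROR_FULL = 94               # error degrees of freedom, Table A-15
MSE_FULL = SSE_FULL / DF_ERROR_FULL

# Full model (7 predictors + intercept): Cp equals the parameter count by construction.
cp_full = mallows_cp(SSE_FULL, MSE_FULL, N_OBS, n_params=8)

# Subset with 5 predictors + intercept; its error sum of squares is listed in Table A-16.
SSE_SUBSET = 13.26860
cp_subset = mallows_cp(SSE_SUBSET, MSE_FULL, N_OBS, n_params=6)

print(f"Cp full = {cp_full:.4f}, Cp subset = {cp_subset:.4f}")
```

The subset value matches the Cp of 9.1699 listed at step 5 of the stepwise summary in
Table A-15. A Cp equal to p for the largest candidate model is automatic, so it does not by
itself show that the model is well specified; Cp is informative mainly for the smaller
subsets.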









Diagnostics of Equal Variances and Multicollinearity Diagnostics




       The standardized residuals of the "best-fitting" model were close to a normal
distribution and had constant variance. A Shapiro-Wilk W test for normality was also
performed, and the p-value was large (0.61), indicating that the hypothesis that the
residuals are normally distributed cannot be rejected. Based on these results, there was no
evidence of a lack of fit or unequal error variance for this 8-parameter regression model.




       The multicollinearity of the eight predictor variables was tested by checking their




variance inflation factors (vif), which varied between 0 and 1.35 and were never higher

-------
                                                                             A-55






than 10 (reference value). This indicates that there was no significant collinearity
between the predictors in the model. However, the collinearity diagnostic tests showed a
condition index of 651 (much higher than 30, the reference value) and an eigenvalue of
0.00002 (much smaller than 0.01, the reference value). The proportions of variation for the
intercept (0.99) and for the pressure (0.98) were greater than 0.5 (reference value),
indicating that the two parameters were probably interacting and that the 8-parameter
model could be overly specified. However, to a certain extent this might be unavoidable,
because it is extremely unlikely for all parameters to be completely independent of
(uncorrelated with) each other.




       In order to lessen the degree of multicollinearity diagnosed, the atmospheric
pressure, which showed some interaction with the intercept, was removed from the
predictor variables and a new multiple regression analysis was run. The coefficient of
determination (r2) of the resulting "best-fitting" 7-parameter model decreased from
0.50 to 0.49, and the condition index decreased from 651 to 127.3. The interaction
between the predictors in this new 7-parameter model appeared to decrease after the
removal of the pressure from the model, but the eigenvalue was still smaller than 0.01
(0.0003), and the proportions of variation of the intercept (0.98) and the temperature
(0.94) were still greater than 0.5. Therefore, the temperature was removed from the
7-parameter model and a new multiple regression analysis was run again. This time, the
coefficient of determination (r2) for the resulting 6-parameter model decreased from
0.49 to 0.47, the condition index decreased from 127.3 to 43.2 (much closer to 30, the
reference value), and the eigenvalue increased to 0.002 (much closer to 0.01, the
reference value). However, the proportion of variation of the intercept (0.949) and that

-------
                                                                                    A-56


of the atmospheric stability (0.839) were still greater than 0.5. Eliminating the stability
parameter from the model resulted in a drastic decrease of the coefficient of determination
(r2); thus, we concluded that the 6-parameter regression equation probably better describes
the ln-transformed outdoor concentration of PM2.5 (Table A-16).



[Figure: Residual vs. Predicted Plot. Fitted model: LnPM = 13.914 + 25.773 F11_llnv
+ 3.6275 F19_llnv + 58.636 PM03DIS_Inv - 0.0093 K4 - 0.163 U4 - 0.0131 mmHG4
+ 0.4138 Stab4. x-axis: Predicted Value; y-axis: Residual. N = 102, Rsq = 0.5010,
AdjRsq = 0.4639, RMSE = 0.3658]

Figure A-30. Residual vs. Predicted Plot of the Best-fitting 6-Parameter Model of PM2.5 Mass

-------
                                                                             A-57
Table A-15. Results of the Best-fitting 7-Parameter Model for LnPM2.5

Analysis of Variance

Source             DF    Sum of Squares   Mean Square   F Value   Pr > F
Model               7        12.62973       1.80425      13.48    <.0001
Error              94        12.57688       0.13380
Corrected Total   101        25.20661

Parameter Estimates

Variable      Label                   DF   Parameter Estimate   Standard Error   F Value   Pr > F
Intercept     Intercept                1        13.91429            6.84008         4.14    0.0447
F11_llnv      Distance to FC11         1        25.77270           11.93020         4.67    0.0333
F19_llnv      Distance to FC19         1         3.62748            1.65616         4.80    0.0310
PM03DIS_Inv                            1        58.63622           29.32827         4.00    0.0485
K4            Temperature              1        -0.00927            0.00476         3.79    0.0545
U4                                     1        -0.16301            0.03855        17.88    <.0001
mmHG4                                  1        -0.01311            0.00827         2.52    0.1160
Stab4         Atmospheric Stability    1         0.41383            0.09362        19.54    <.0001

Summary of Stepwise Selection

Step   Variable Entered   Partial R-Square   Model R-Square   C(p)      F Value   Pr > F
1      Stab4              0.3184             0.3184           30.4114   46.71     <.0001
2      U4                 0.0655             0.3839           20.0725   10.52     0.0016
3      F19_llnv           0.0524             0.4363           12.1924    9.12     0.0032
4      F11_llnv           0.0210             0.4573           10.2394    3.75     0.0557
5      PM03DIS_Inv        0.0163             0.4736            9.1699    2.97     0.0880
6      K4                 0.0141             0.4877            8.5164    2.61     0.1094
7      mmHG4              0.0134             0.5010            8.0000    2.52     0.1160

-------
Table A-16. Results of the Best-fitting 5-Parameter Model for PM2.5 after Removing
Atmospheric Pressure and Temperature

Analysis of Variance

                              Sum of        Mean
Source              DF       Squares      Square    F Value    Pr > F
Model                5      11.93801     2.38760      17.27    <.0001
Error               96      13.26860     0.13821
Corrected Total    101      25.20661

Parameter Estimates

                                           Parameter     Standard
Variable       Label               DF       Estimate        Error    t Value    Pr > |t|
Intercept      Intercept            1        1.06000      0.57297       3.42      0.0674
F11_1Inv       Distance to FC11     1       25.27425     12.03839       4.41      0.0384
F19_1Inv       Distance to FC19     1        4.19851      1.65726       6.42      0.0129
PM03DIS_Inv                         1       50.94389     29.55414       2.97      0.0880
U4                                  1       -0.13037      0.03636      12.86      0.0005
Stab4                               1        0.42820      0.09491      20.35      <.0001

Summary of Stepwise Selection

        Variable         Partial       Model
Step    Entered         R-Square    R-Square         Cp    F Value    Pr > F
   1    Stab4             0.3184      0.3184    26.3067      46.71    <.0001
   2    U4                0.0655      0.3839    16.3623      10.52    0.0016
   3    F19_1Inv          0.0524      0.4363     8.7980       9.12    0.0032
   4    F11_1Inv          0.0210      0.4573     6.9713       3.75    0.0557
   5    PM03DIS_Inv       0.0163      0.4736     6.0000       2.97    0.0880
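
The analysis-of-variance entries of Table A-16 are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, the F value is the ratio of the two mean squares, and the model R-square is the model sum of squares over the corrected total. A minimal sketch of that arithmetic follows; the function name `anova_summary` is hypothetical, and the adjusted R-square it returns is a derived quantity that the table itself does not report.

```python
def anova_summary(ss_model, ss_error, df_model, df_error):
    """Derive mean squares, F, R-square and adjusted R-square from an ANOVA table."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "ms_model": ms_model,
        "ms_error": ms_error,
        "f_value": ms_model / ms_error,
        "r2": ss_model / ss_total,
        # Adjusted R-square penalizes for the number of fitted parameters
        "adj_r2": 1.0 - ms_error / (ss_total / df_total),
    }


# Sums of squares and degrees of freedom as reported in Table A-16
summary = anova_summary(ss_model=11.93801, ss_error=13.26860, df_model=5, df_error=96)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```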

-------






Elemental Carbon




Bivariate Pearson Correlation




       The correlation coefficients between the ln-transformed elemental carbon concentration
and the inverse distance to urban interstate (FC11) roadways and to minor arterial roads
(FC16) were 0.28 (p=0.04) and 0.29 (p=0.03), respectively. Distance to hamburger restaurants
where meats are broiled had a correlation coefficient of 0.35 (p=0.01).

       The meteorological variables that were statistically significantly correlated with
the ln-transformed elemental carbon concentration, and their Pearson correlation
coefficients, were: atmospheric stability, 0.43 (p<0.0001); mixing height, -0.34 (p=0.01);
wind speed, -0.33 (p=0.02); relative humidity, 0.49 (p<0.0001); and precipitation, 0.29
(p=0.03).
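
As an illustration of how such bivariate screening works, the sketch below computes a Pearson correlation between an inverse-distance variable and a ln-transformed concentration, and converts a correlation coefficient into its t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2). The distances and concentrations are made-up values used only to exercise the functions, and n = 52 is an assumption taken from the number of observations in the elemental carbon regression; the sample size behind each reported correlation is not stated here.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing that the correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical data: distance to a roadway (m) and a pollutant concentration
distance_m = [50.0, 100.0, 200.0, 400.0, 800.0]
concentration = [2.1, 1.6, 1.3, 1.2, 1.0]

inverse_distance = [1.0 / d for d in distance_m]
ln_concentration = [math.log(c) for c in concentration]
r_toy = pearson_r(inverse_distance, ln_concentration)
print(f"toy correlation: {r_toy:.3f}")

# t statistic for r = 0.28 (FC11), assuming n = 52
print(f"t for r = 0.28, n = 52: {t_statistic(0.28, 52):.3f}")
```

The resulting t statistic would be compared with a t distribution on n - 2 degrees of freedom to obtain the two-sided p-value.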










Preliminary Selection of Predictors




       The preliminary regression analysis was performed on the ln-transformed elemental
carbon concentration to determine the relative importance of variables within the same type
(proximity or meteorological) of independent variable. The distance to urban major arterial
roadways (FC14) was included in the resulting linear regression model (p<0.15). Inverse
distance to a truck loading/unloading seaport area (PM02) was also selected. Among the
meteorological variables, atmospheric stability and relative humidity were selected.










Selection of the Best-fitting Model





       The variables selected by the different regression methods were relatively
consistent. Atmospheric stability and relative humidity were included as predictors in the
model. The model also included the inverse distances to major arterial roadways (FC14) and
to the truck loading/seaport area (PM02). The parameters and analysis of variance of the
regression equation for elemental carbon ambient air concentration for the best-fitting
model with 4 variables selected are given in Table A-17. Mallows' Cp statistic, C(p),
associated with this particular subset of variables was determined to be 5. The resulting
model was appropriate in number of parameters, because the number of parameters (p),
including the intercept, in the best-fitting model matches the C(p) value. The diagnostic
plots (the residual plot against the predicted values and the normal
probability-probability (PP) plot) were generated and visually examined (Figures A- and
Appendix B). The residuals were randomly distributed without showing any obvious trend or
particular pattern (Figure A-31), were close to a normal distribution, and had constant
variance. The PP plot was nearly linear, so the error term of the model can be considered to
follow a normal distribution. Based on the visual diagnosis, there was no significant
evidence of lack of fit or of unequal error variance for the best 5-parameter regression
model.

       No evidence of outliers was found.
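
The C(p) criterion used above can be illustrated with the values reported for elemental carbon in Table A-17. The sketch assumes the common definition C(p) = SSE_p / s^2 - n + 2p, with s^2 taken as the mean square error of the final 4-variable model; that choice of s^2 is an assumption, since the text does not state which reference model was used. Under it, the last step returns C(p) = p by construction, so the agreement of C(p) with p for the final model is expected rather than an independent check.

```python
def mallows_cp(sse_subset, mse_reference, n_obs, n_params):
    """Mallows' C(p); n_params counts the intercept."""
    return sse_subset / mse_reference - n_obs + 2 * n_params


# Values reported in Table A-17 (elemental carbon)
ss_total = 10.82257          # corrected total sum of squares
sse_final = 6.74801          # error sum of squares of the 4-variable model
n_obs = 52                   # corrected total DF (51) + 1
mse_reference = sse_final / (n_obs - 5)   # assumed reference variance

model_r2 = [0.2357, 0.2919, 0.3281, 0.3765]     # Model R-square by step
reported_cp = [9.6140, 7.3748, 6.6483, 5.0000]  # Cp column by step

computed_cp = []
for n_predictors, r2 in enumerate(model_r2, start=1):
    sse_step = (1.0 - r2) * ss_total             # SSE implied by the step's R-square
    computed_cp.append(mallows_cp(sse_step, mse_reference, n_obs, n_predictors + 1))

for calc, rep in zip(computed_cp, reported_cp):
    print(f"computed Cp = {calc:.3f}, reported Cp = {rep:.4f}")
```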









Diagnostics of Equal Variances and Multicollinearity Diagnostics




       The standardized residuals of the "best-fitting" model are close to a normal
distribution and have constant variance. A Shapiro-Wilk W test for normality showed a
very large p-value (0.65), which is consistent with normally distributed residuals.

       From the figure above (PP plot) we see that the distribution of the residuals does
not appear heteroscedastic and, therefore, we accept the hypothesis of homogeneity of
variance of the residuals in the 5-parameter model. Also in this case we checked the
variance inflation factors (vif) of the five model parameters, which varied between 0 and
1.29, indicating that there was no significant collinearity between the predictors in the
model. The collinearity diagnostic tests showed a condition index of 37 (a little higher
than 30) and an eigenvalue of 0.003 (smaller than 0.01). The proportions of variation for
the intercept (0.92) and for atmospheric stability (0.95) were greater than 0.5. However,
because of the very low vif values we concluded that there was no significant collinearity
between the predictors in the model and that the 5-parameter regression equation shown
before is basically adequate to describe the variation in outdoor concentration of LnEC.
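
The condition index and eigenvalue quoted above are related. The sketch below assumes the usual definition, in which the condition index of a component is the square root of the ratio of the largest eigenvalue to that component's eigenvalue, with eigenvalues taken from the cross-product matrix of the regressors (intercept included) scaled to unit length. Whether the report's software used exactly this scaling is an assumption, and the function names are hypothetical. Because the eigenvalue is reported to only one significant figure, the implied largest eigenvalue is approximate.

```python
import math


def condition_index(largest_eigenvalue, eigenvalue):
    """Condition index of one component: sqrt(lambda_max / lambda_j)."""
    return math.sqrt(largest_eigenvalue / eigenvalue)


def implied_largest_eigenvalue(cond_index, eigenvalue):
    """Invert the definition to recover lambda_max from a reported pair."""
    return cond_index ** 2 * eigenvalue


# Reported for the elemental carbon model: condition index 37, eigenvalue 0.003
lambda_max = implied_largest_eigenvalue(37.0, 0.003)
print(f"implied largest eigenvalue: {lambda_max:.3f}")

# With unit-length scaling the eigenvalues sum to the number of parameters (5 here),
# so the implied value should not exceed 5.
print(f"round trip: {condition_index(lambda_max, 0.003):.1f}")
```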


                                 Residual vs. Predicted Plot
         LnEC = -2.7687 + 8.4105 f14_1Inv + 944.59 PM02DIS_Inv + 0.0137 RH4 + 0.3547 Stab4
         N = 52    Rsq = 0.3765    AdjRsq = 0.3234    RMSE = 0.3789
         (x-axis: Predicted Value)
 Figure A-31. Residual vs. Predicted Plot of the Best-fitting 6-Parameter Model of Elemental
                                        Carbon
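
Because the response is ln-transformed, each coefficient of the fitted equation acts multiplicatively on the elemental carbon concentration. The sketch below, using the estimates from Table A-17, converts a coefficient into the percent change in concentration per unit increase of its predictor and back-transforms a prediction. The predictor values passed to `predict_ec` are hypothetical placeholders (the units of the inputs are not restated here), and exponentiating the predicted ln value gives a median-type estimate rather than an arithmetic mean.

```python
import math

# Parameter estimates from Table A-17 (elemental carbon)
INTERCEPT = -2.76874
COEFFICIENTS = {
    "F14_1Inv": 8.41047,
    "PM02DIS_Inv": 944.59427,
    "RH4": 0.01371,
    "Stab4": 0.35474,
}


def percent_change_per_unit(coefficient):
    """Percent change in concentration for a one-unit increase in the predictor."""
    return (math.exp(coefficient) - 1.0) * 100.0


def predict_ec(predictors):
    """Back-transformed prediction exp(LnEC) for a dict of predictor values."""
    ln_ec = INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in predictors.items())
    return math.exp(ln_ec)


print(f"Stab4: {percent_change_per_unit(COEFFICIENTS['Stab4']):.1f}% per stability class")
print(f"RH4:   {percent_change_per_unit(COEFFICIENTS['RH4']):.2f}% per unit relative humidity")

# Hypothetical inputs, for illustration only
example = {"F14_1Inv": 0.01, "PM02DIS_Inv": 0.0005, "RH4": 60.0, "Stab4": 4.0}
predicted = predict_ec(example)
print(f"predicted concentration for the example inputs: {predicted:.2f}")
```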

-------
Table A-17. Results of the Best-fitting 4-Parameter Model for Elemental Carbon

Analysis of Variance

                              Sum of        Mean
Source              DF       Squares      Square    F Value    Pr > F
Model                4       4.07456     1.01864       7.09    0.0001
Error               47       6.74801     0.14357
Corrected Total     51      10.82257

Parameter Estimates

                                                Parameter     Standard
Variable       Label                    DF       Estimate        Error    t Value    Pr > |t|
Intercept      Intercept                 1       -2.76874      0.68589      16.30      0.0002
F14_1Inv       Distance to FC14          1        8.41047      3.84635       4.78      0.0338
PM02DIS_Inv                              1      944.59427    494.54016       3.65      0.0622
RH4                                      1        0.01371      0.00498       7.60      0.0083
Stab4          Atmospheric Stability     1        0.35474      0.14301       6.15      0.0168

Summary of Stepwise Selection

        Variable         Partial       Model
Step    Entered         R-Square    R-Square         Cp    F Value    Pr > F
   1    RH4               0.2357      0.2357     9.6140      15.42    0.0003
   2    Stab4             0.0562      0.2919     7.3748       3.89    0.0542
   3    F14_1Inv          0.0362      0.3281     6.6483       2.58    0.1145
   4    PM02DIS_Inv       0.0484      0.3765     5.0000       3.65    0.0622
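
One way to read the test-statistic column of Table A-17 is to compare it with the ratio of each estimate to its standard error. In the sketch below the reported values agree with the square of that ratio, which is an F-type statistic on one degree of freedom and gives the same p-value as the corresponding two-sided t test. This is an observation from the numbers, not a statement from the report, and the helper name `squared_ratio` is hypothetical.

```python
# (variable, estimate, standard error, reported statistic) from Table A-17
rows = [
    ("Intercept", -2.76874, 0.68589, 16.30),
    ("F14_1Inv", 8.41047, 3.84635, 4.78),
    ("PM02DIS_Inv", 944.59427, 494.54016, 3.65),
    ("RH4", 0.01371, 0.00498, 7.60),
    ("Stab4", 0.35474, 0.14301, 6.15),
]


def squared_ratio(estimate, std_error):
    """Square of estimate / standard error (a 1-DF F-type statistic)."""
    return (estimate / std_error) ** 2


computed = [squared_ratio(est, se) for _, est, se, _ in rows]
for (name, _, _, reported), value in zip(rows, computed):
    print(f"{name:12s} (estimate/SE)^2 = {value:6.2f}   reported = {reported:6.2f}")
```

Differences in the second decimal reflect rounding of the published estimates and standard errors.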

-------






Organic Carbon




Bivariate Pearson Correlation




       The correlation coefficient between the ln-transformed organic carbon concentration
and the inverse distance to minor urban arterial (FC16) roadways was 0.29 (p=0.04). The
meteorological variables that were statistically significantly correlated (at α=0.05) with
the ln-transformed organic carbon concentration, and their Pearson correlation
coefficients, were atmospheric stability, 0.51 (p<0.0001), and relative humidity, 0.49
(p<0.0001).









Preliminary Selection of Predictors




       The preliminary regression analysis was performed on the ln-transformed organic
carbon concentration to determine the relative importance of variables within the same type
(proximity or meteorological) of independent variable. The distance to interstate roadways
(FC11) was included in the resulting linear regression model (p<0.15). Atmospheric
stability was included from the meteorological variables.









Selection of the Best-fitting Model




       The variables selected by the different regression methods were relatively
consistent. Atmospheric stability was included as a predictor in the model. The model
also included the inverse distance to the interstate (FC11). The parameters and analysis of
variance of the regression equation for organic carbon ambient air concentration for
the best-fitting model with 2 variables selected are given in Table A-18. Mallows' Cp
statistic, C(p), associated with this particular subset of variables was determined
to be 3. The resulting model was appropriate in number of parameters, because the
number of parameters (p), including the intercept, in the best-fitting model matches the
C(p) value. The diagnostic plots (the residual plot against the predicted values and the
normal probability-probability (PP) plot) were generated and visually examined (Figure
A-32 and Appendix B). The residuals were randomly distributed without showing any
obvious trend or particular pattern (Figure A-32), were close to a normal distribution,
and had constant variance. The PP plot was nearly linear, so the error term of the model
can be considered to follow a normal distribution. Based on the visual diagnosis,
there was no significant evidence of lack of fit or of unequal error variance for
the best 3-parameter regression model.

       No evidence of outliers was found.









Diagnostics of Equal Variances and Multicollinearity Diagnostics




       The standardized residuals of the "best-fitting" model are close to a normal
distribution and have constant variance. A Shapiro-Wilk W test for normality showed a
very large p-value (0.79), which is consistent with normally distributed residuals. From
the figure above (PP plot) we see that in this case, too, we can accept the hypothesis of
homogeneity of variance of the residuals.

       The vif values of the three model parameters varied between 0 and 1.006 (never
higher than 10, the reference value). The collinearity diagnostic tests showed a condition
index of 25.86 (lower than 30, the reference value) but an eigenvalue of 0.004 (a little
smaller than 0.01, the reference value). Even though the proportions of variation for the
intercept and for atmospheric stability were greater than 0.5 (both were 0.998),
we can still assume that there was no significant collinearity between the predictors in
the model because of the extremely low vif values. Therefore, the 3-parameter regression
equation shown before is basically adequate to describe the variation in outdoor
concentration of LnOC.
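
The variance inflation factor has a particularly simple form in a model with two predictors, which makes the reported value easy to interpret. The sketch assumes the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-square from regressing predictor j on the remaining predictors; with two predictors, R_j^2 is the squared correlation between them. The function names are hypothetical.

```python
import math


def vif_two_predictors(r12):
    """VIF shared by both predictors when their mutual correlation is r12."""
    return 1.0 / (1.0 - r12 ** 2)


def implied_r_squared(vif):
    """R-square among predictors that corresponds to a given VIF."""
    return 1.0 - 1.0 / vif


# Largest vif reported for the organic carbon model (FC11 inverse distance, stability)
r2_between_predictors = implied_r_squared(1.006)
print(f"implied R-square between predictors: {r2_between_predictors:.4f}")
print(f"implied |correlation|: {math.sqrt(r2_between_predictors):.3f}")

# The reference value of 10 corresponds to 90% shared variance among predictors
print(f"R-square at VIF = 10: {implied_r_squared(10.0):.2f}")
```

A vif of 1.006 therefore implies that the two predictors share well under 1% of their variance.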
                                   Residual vs. Predicted Plot
               LnOC = -1.6752 + 26.646 f11_1Inv + 0.5557 Stab4
               N = 52    Rsq = 0.2889    AdjRsq = 0.2599    RMSE = 0.4041
               (x-axis: Predicted Value)
  Figure A-32. Residual vs. Predicted Plot of the Best-fitting 6-Parameter Model of Organic
                                         Carbon

-------
Table A-18. Results of the Best-fitting 3-Parameter Model for Organic Carbon

Analysis of Variance

                              Sum of        Mean
Source              DF       Squares      Square    F Value    Pr > F
Model                2       3.25109     1.62555       9.95    0.0002
Error               49       8.00212     0.16331
Corrected Total     51      11.25321

Parameter Estimates

                                                Parameter     Standard
Variable       Label                    DF       Estimate        Error    t Value    Pr > |t|
Intercept      Intercept                 1       -1.67517      0.66211       6.40      0.0147
F11_1Inv       Distance to FC11          1       26.64584     19.36632       1.89      0.1751
Stab4          Atmospheric Stability     1        0.55571      0.13474      17.01      0.0001

Summary of Stepwise Selection

        Variable         Partial       Model
Step    Entered         R-Square    R-Square         Cp    F Value    Pr > F
   1    Stab4             0.2614      0.2614     2.8931      17.70    0.0001
   2    F11_1Inv          0.0275      0.2889     3.000        1.89    0.1751

-------






Coronene and Benzo-ghi-Pyrene




Bivariate Pearson Correlation





       The correlation coefficients between the ln-transformed Coronene (COR) and
Benzo-ghi-Pyrene (B-ghi-P) concentrations and the inverse distance to urban collector
(FC17) roadways were 0.44 (p=0.04) for B-ghi-P and 0.42 (p<0.0001) for COR. The
meteorological variables that were statistically significantly correlated (at α=0.05) with
the ln-transformed Coronene and Benzo-ghi-Pyrene concentrations, and their Pearson
correlation coefficients, were atmospheric stability, 0.40 (B-ghi-P) and 0.44 (COR)
(p<0.0001); temperature, -0.42 (both) (p<0.0001); and mixing height, -0.29 (B-ghi-P) and
-0.31 (COR) (p<0.04).










Preliminary Selection of Predictors




       The preliminary regression analysis was performed on the ln-transformed Coronene
and Benzo-ghi-Pyrene concentrations to determine the relative importance of variables
within the same type (proximity or meteorological) of independent variable. The distances
to interstate roadways (FC11) and to Newark Airport (PM01) were included in the resulting
linear regression model (p<0.15). Atmospheric stability, temperature and precipitation
were included from the meteorological variables.










Selection of the Best-fitting Model





       The variables selected by the different regression methods were relatively
consistent. Atmospheric stability was included as a predictor in the model. The model
also included the inverse distances to the interstate (FC11) and to PM01. The parameters
and analysis of variance of the regression equations for Coronene and Benzo-ghi-Pyrene
ambient air concentrations for the best-fitting models with 5 variables selected are given
in Tables A-19 and A-20. Mallows' Cp statistic, C(p), associated with this particular
subset of variables was determined to be 6. The resulting model was appropriate in number
of parameters, because the number of parameters (p), including the intercept, in the
best-fitting model matches the C(p) value. The diagnostic plots (the residual plot against
the predicted values and the normal probability-probability (PP) plot) were generated and
visually examined (Figure A-33 and Appendix B). The residuals were randomly distributed
without showing any obvious trend or particular pattern (Figure A-33), were close to a
normal distribution, and had constant variance. The PP plot was nearly linear, so the
error term of the model can be considered to follow a normal distribution. Based on the
visual diagnosis, there was no significant evidence of lack of fit or of unequal error
variance for the best 6-parameter regression models.

       No evidence of outliers was found.









Diagnostics of Equal Variances and Multicollinearity Diagnostics




       The standardized residuals of the "best-fitting" models are close to a normal
distribution and have constant variance. Shapiro-Wilk W tests for normality gave very
large p-values (0.72 and 0.75 for LnB-ghi-P and LnCOR, respectively), which is consistent
with normally distributed residuals.

       From the residual versus predicted value plots shown above we see that the
distributions of the residuals do not appear overly heteroscedastic and, therefore, we can
accept the hypothesis of homogeneity of variance of the residuals in both 6-parameter
models.

       The vif values varied between 0 and 1.27 for both PAHs, which indicates that
there was no significant collinearity between the predictors in the two models. Once
again, the collinearity diagnostic tests showed condition indexes and eigenvalues that
were, respectively, slightly higher and lower than the reference values (30 and 0.01), and
proportions of variation for two of the predictor variables that were higher than 0.5 (the
reference value). However, because of the very low vif values obtained, we concluded
that there was no significant collinearity between the predictors in either of the two
models and that the two 6-parameter regression equations shown before are basically
adequate to describe the variations in outdoor concentrations of LnB-ghi-P and LnCOR.
          Residual vs. Predicted Plot (one panel per compound)
 Figure A-33. Residual vs. Predicted Plot of the Best-fitting 6-Parameter Model of COR and
                                         B-ghi-P

-------
Table A-19. Results of the Best-fitting 5-Parameter Model for Coronene

Parameter Estimates

                                           Parameter     Standard
Variable       Label               DF       Estimate        Error    t Value    Pr > |t|
Intercept      Intercept            1       14.23612      4.72659       9.07      0.0045
F11_1Inv       Distance to FC11     1      125.00731     53.04655       5.55      0.0236
PM01DIS_Inv                         1      563.35265    371.56318       2.30      0.1375
Precip4                             1      -13.04686      5.65791       5.32      0.0265
K4                                  1       -0.08337      0.01603      27.04      <.0001
Stab4                               1        1.63208      0.35474      21.17      <.0001

Summary of Stepwise Selection

        Variable         Partial       Model
Step    Entered         R-Square    R-Square         Cp    F Value    Pr > F
   1    K4                0.2140      0.2140    25.6691      11.71    0.0014
   2    Stab4             0.1966      0.4106    10.9925      14.01    0.0005
   3    F11_1Inv          0.0588      0.4694     8.0059       4.54    0.0391
   4    Precip4           0.0437      0.5131     6.2988       3.59    0.0654
   5    PM01DIS_Inv       0.0271      0.5402     6.0000       2.30    0.1375

-------
Table A-20. Results of the Best-fitting 5-Parameter Model for Benzo-ghi-Pyrene

Parameter Estimates

                                                Parameter     Standard
Variable       Label                    DF       Estimate        Error    t Value    Pr > |t|
Intercept      Intercept                 1       13.55955      4.27716      10.05      0.0030
F11_1Inv       Distance to FC11          1      125.65797     48.00264       6.85      0.0125
PM01DIS_Inv                              1      629.84523    336.23325       3.51      0.0685
Precip4                                  1      -12.18382      5.11993       5.66      0.0223
K4             Temperature               1       -0.07630      0.01451      27.65      <.0001
Stab4          Atmospheric Stability     1        1.37241      0.32101      18.28      0.0001

Summary of Stepwise Selection

        Variable         Partial       Model
Step    Entered         R-Square    R-Square         Cp    F Value    Pr > F
   1    K4                0.2160      0.2160    24.5663      11.85    0.0013
   2    Stab4             0.1613      0.3773    13.0761      10.88    0.0020
   3    F11_1Inv          0.0719      0.4493     9.0590       5.36    0.0257
   4    Precip4           0.0424      0.4917     7.5090       3.34    0.0751
   5    PM01DIS_Inv       0.0420      0.5337     6.0000       3.51    0.0685
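
In a stepwise summary, the model R-square at each step is the running total of the partial R-square values of the variables entered so far. The short check below applies that relation to the Benzo-ghi-Pyrene values of Table A-20, reading the partial and model R-square columns as 0.2160, 0.1613, ... and 0.2160, 0.3773, ... respectively; it is a consistency sketch rather than part of the original analysis.

```python
from itertools import accumulate

# Partial and model R-square columns of Table A-20, in order of entry
variables = ["K4", "Stab4", "F11_1Inv", "Precip4", "PM01DIS_Inv"]
partial_r2 = [0.2160, 0.1613, 0.0719, 0.0424, 0.0420]
model_r2 = [0.2160, 0.3773, 0.4493, 0.4917, 0.5337]

# Running total of the partial contributions
cumulative_r2 = list(accumulate(partial_r2))

for name, running, reported in zip(variables, cumulative_r2, model_r2):
    print(f"{name:12s} cumulative = {running:.4f}   reported = {reported:.4f}")
```

Differences of 0.0001 arise because each partial R-square is rounded to four decimals.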

-------
                              APPENDIX B. PP (Probability), QQ (Quantile) Plots

[Figure B-1, four panels: PP Plot of the Best Fit Model of m,p-Xylene and QQ Plot of the Best Fit Model of m,p-Xylene, before (A, B) and after (C, D) removal of outliers. PP plot axis: Cumulative Distribution of Residual. QQ plot axis: Normal Quantile.]
Before removal of outliers: N = 183, Rsq = 0.2657, AdjRsq = 0.2450, RMSE = 0.7495
After removal of outliers: Rsq = 0.3304, AdjRsq = 0.3098, RMSE = 0.5362
Figure B-1. Plots for Model of m,p-Xylene before (A,B) and after (C,D) removal of outliers

-------
[Figure B-2, four panels: PP Plot of the Best Fit Model of o-Xylene and QQ Plot of the Best Fit Model of o-Xylene, before (A, B) and after (C, D) removal of outliers. PP plot axis: Cumulative Distribution of Residual. QQ plot axis: Normal Quantile.]
Before removal of outliers: N = 183, Rsq = 0.3275, AdjRsq = 0.3085, RMSE = 0.5746
After removal of outliers: N = 168, Rsq = 0.4223, AdjRsq = 0.4045, RMSE = 0.4441
Figure B-2. Plots for Model of o-Xylene before (A,B) and after (C,D) removal of outliers

-------
[Figure B-3, four panels: PP Plot of the Best Fit Model of Toluene and QQ Plot of the Best Fit Model of Toluene, before (A, B) and after (C, D) removal of outliers. PP plot axis: Cumulative Distribution of Residual. QQ plot axis: Normal Quantile.]
Before removal of outliers: N = 183, AdjRsq = 0.1951, RMSE = 0.906
After removal of outliers: AdjRsq = 0.2957, RMSE = 0.6456
Figure B-3. Plots for Model of Toluene before (A,B) and after (C,D) removal of outliers

-------
Figure B-4. Plots for Model of Benzene before (A,B) and after (C,D) removal of outliers
[Figure: (A, C) PP plots and (B, D) QQ plots of the best fit model of benzene. PP plot axis: Cumulative Distribution of Residual. QQ plot axis: Normal Quantile.]
Before removal (A,B): Ln(Benzene) = 11.461 + 26.166 GSlmlnv - 0.0374 K5 - 0.1724 U4   (Rsq 0.2568, AdjRsq 0.2444, RMSE 0.659)
After removal (C,D):  Ln(Benzene) = 10.074 + 5.4977 f14_lmlnv + 16.148 GSlmlnv + 0.3036 Stab4 - 0.0391 K5 - 0.0849 U4   (Rsq 0.4116, AdjRsq 0.3936, RMSE 0.478)

-------
Figure B-5. Plots for Model of Ethylbenzene before (A,B) and after (C,D) removal of outliers
[Figure: (A, C) PP plots and (B, D) QQ plots of the best fit model of ethylbenzene. PP plot axis: Cumulative Distribution of Residual. QQ plot axis: Normal Quantile.]
Before removal (A,B): Ln(Ethylbenzene) = 5.3254 + 13.32 f14_lmlnv + 0.5279 Stab4 - 0.0265 K5 - 0.1716 U4   (Rsq 0.1285, AdjRsq 0.1089, RMSE 1.1197)
After removal (C,D):  Ln(Ethylbenzene) = 5.9813 + 9.6811 f14_lmlnv + 0.4377 Stab4 - 0.0275 K5 - 0.1137 U4   (AdjRsq 0.1421, RMSE 0.8023)

-------
Figure B-6. Plots for Model of MTBE before (A,B) and after (C,D) removal of outliers
[Figure: (A, C) PP plots and (B, D) QQ plots of the best fit model of MTBE. PP plot axis: Cumulative Distribution of Residual. QQ plot axis: Normal Quantile.]
Before removal (A,B): Ln(MTBE) = 1.9744 + 39.494 GSlmlnv - 0.2159 U4   (Rsq 0.0838, AdjRsq 0.0736, RMSE 1.194)
After removal (C,D):  Ln(MTBE) = -2.743 + 22.255 f11_lmlnv + 33.563 GSlmlnv + 0.2435 Stab4 + 0.0124 K5 - 0.187 U4   (N 171, Rsq 0.2496, AdjRsq 0.2268)

-------
Figure B-7. Plots for Model of PCE before (A,B) and after (C,D) removal of outliers
[Figure: (A, C) PP plots and (B, D) QQ plots of the best fit model of PCE. PP plot axis: Cumulative Distribution of Residual. QQ plot axis: Normal Quantile.]
Before removal (A,B): Ln(PCE) = -0.6546 + 54.386 DCFlmlnv + 0.2107 Stab4 - 0.2259 U4 + 0.0103 RH5   (Rsq 0.2102, AdjRsq 0.1925, RMSE 0.7363)
After removal (C,D):  Ln(PCE) = 2.4945 + 32.673 DCFlmlnv + 0.1444 Stab4 - 0.0123 K5 - 0.1441 U4 + 0.0091 RH5   (N 164, Rsq 0.3124, AdjRsq 0.2907, RMSE 0.4186)
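The Rsq and AdjRsq values shown in the plot insets can be cross-checked against each other. The sketch below applies the usual adjusted R-square formula for a model with an intercept. It assumes that the unlabeled counts 164 (PCE) and 171 (MTBE) in the insets are the sample sizes N after outlier removal, and that each of these two models has five predictors, as in the equations; the function name is illustrative only.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-square for a linear regression that includes an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# PCE after outlier removal: assumed N = 164, Rsq = 0.3124, five predictors
pce_adj = adjusted_r_squared(0.3124, 164, 5)

# MTBE after outlier removal: assumed N = 171, Rsq = 0.2496, five predictors
mtbe_adj = adjusted_r_squared(0.2496, 171, 5)

print(f"PCE  AdjRsq ~ {pce_adj:.4f} (inset value 0.2907)")
print(f"MTBE AdjRsq ~ {mtbe_adj:.4f} (inset value 0.2268)")
```

Both results agree with the inset values to within the rounding of the four-digit Rsq inputs, which supports reading 164 and 171 as sample sizes.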

-------
Figure B-7. Plots for Model of PM2.5 Mass
[Figure: PP plot of the model of PM2.5 mass.]

Figure B-7. Plots for Model of Elemental Carbon
[Figure: PP plot of the model of elemental carbon.]

Figure B-7. Plots for Model of Organic Carbon
[Figure: PP plots of the model of organic carbon. Axis: Cumulative Distribution of Residual.]

-------
                              APPENDIX C. Diagnostic Results of Equal Variance and Multicollinearity
M,p-Xylene

Consistent Covariance of Estimates
Variable     Intercept   f14_lmlnv   GSlmlnv     Stab4       K5          U4
Intercept    2.4504      0.30152     -0.86165    -0.08012    -0.00668    -0.03337
f14_lmlnv    0.30152     10.3825     -2.90185    0.04758     -0.00222    0.00483
GSlmlnv      -0.86165    -2.90185    35.1448     -0.05396    0.00323     0.01211
Stab4        -0.08012    0.04758     -0.05396    0.00971     7.5E-05     0.00231
K5           -0.00668    -0.00222    0.00323     7.5E-05     2.1E-05     4.8E-05
U4           -0.03337    0.00483     0.01211     0.00231     4.8E-05     0.00197

Test of First and Second Moment Specification
DF    Chi-Square    Pr > ChiSq
20    23.57         0.2616

Parameter Estimates
Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
Intercept    1    4.94236              1.70161          2.9       0.0042     .           0
f14_lmlnv    1    7.94739              4.43103          1.79      0.0747     0.95826     1.04356
GSlmlnv      1    17.4362              6.29951          2.77      0.0063     0.93148     1.07356
Stab4        1    0.53744              0.11065          4.86      <.0001     0.74522     1.34188
K5           1    -0.0232              0.00507          -4.58     <.0001     0.91088     1.09784
U4           1    -0.0653              0.04438          -1.47     0.1431     0.70158     1.42535
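The Tolerance and Variance Inflation columns are reciprocals of one another, and each t Value is the parameter estimate divided by its standard error. The following is a minimal sketch that recomputes both from the m,p-xylene values reported above; the dictionary layout and function names are illustrative, not part of the original analysis.

```python
def variance_inflation(tolerance: float) -> float:
    """Variance inflation factor as the reciprocal of the tolerance."""
    return 1.0 / tolerance


def t_value(estimate: float, std_error: float) -> float:
    """t statistic for the null hypothesis that the coefficient is zero."""
    return estimate / std_error


# (estimate, standard error, reported t, tolerance, reported VIF) for m,p-xylene
mp_xylene = {
    "f14_lmlnv": (7.94739, 4.43103, 1.79, 0.95826, 1.04356),
    "GSlmlnv": (17.4362, 6.29951, 2.77, 0.93148, 1.07356),
    "Stab4": (0.53744, 0.11065, 4.86, 0.74522, 1.34188),
    "K5": (-0.0232, 0.00507, -4.58, 0.91088, 1.09784),
    "U4": (-0.0653, 0.04438, -1.47, 0.70158, 1.42535),
}

for name, (est, se, t_rep, tol, vif_rep) in mp_xylene.items():
    print(f"{name:10s} t = {t_value(est, se):6.2f} (reported {t_rep}), "
          f"VIF = {variance_inflation(tol):.5f} (reported {vif_rep})")
```

All recomputed variance inflation values are close to 1, consistent with the reported columns.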

-------
Collinearity Diagnostics
                                        Proportion of Variation
Number   Eigenvalue   Condition Index   Intercept   f14_lmlnv   GSlmlnv   Stab4     K5        U4
1        4.77579      1                 2.5E-05     0.01274     0.0131    0.00023   3.4E-05   0.00179
2        0.64666      2.7176            4.1E-05     0.66273     0.14372   0.00036   5.6E-05   0.0034
3        0.51787      3.03677           7.3E-06     0.31247     0.81381   4.6E-05   1.1E-05   0.0003
4        0.05567      9.26198           0.00043     0.00407     0.00016   0.01681   0.00081   0.58238
5        0.00365      36.1517           0.0199      0.00799     0.02775   0.86081   0.06304   0.22528
6        0.00035      116.442           0.9796      1.3E-06     0.00147   0.12174   0.93606   0.18686
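Each condition index in the collinearity diagnostics is the square root of the ratio of the largest eigenvalue to the eigenvalue in that row. A short sketch using the m,p-xylene eigenvalues reported above is given below; small differences in the last two rows are expected because the printed eigenvalues carry only two or three significant digits.

```python
import math


def condition_indices(eigenvalues):
    """Condition index per eigenvalue: sqrt(largest eigenvalue / eigenvalue)."""
    largest = max(eigenvalues)
    return [math.sqrt(largest / value) for value in eigenvalues]


# Eigenvalues and condition indices as printed for the m,p-xylene model
eigenvalues = [4.77579, 0.64666, 0.51787, 0.05567, 0.00365, 0.00035]
reported = [1.0, 2.7176, 3.03677, 9.26198, 36.1517, 116.442]

computed = condition_indices(eigenvalues)
for number, (calc, rep) in enumerate(zip(computed, reported), start=1):
    print(f"{number}: computed {calc:8.4f}, reported {rep}")
```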

-------
o-Xylene

Consistent Covariance of Estimates
Variable     Intercept   f14_lmlnv   GSlmlnv     Stab4       K5          U4
Intercept    1.67737     0.71942     -1.41352    -0.05168    -0.00462    -0.01991
f14_lmlnv    0.71942     13.7652     -3.15844    0.00684     -0.00276    -0.00947
GSlmlnv      -1.41352    -3.15844    15.3048     0.00274     0.00464     -0.0015
Stab4        -0.05168    0.00684     0.00274     0.00738     2.7E-05     0.00141
K5           -0.00462    -0.00276    0.00464     2.7E-05     1.5E-05     2.1E-05
U4           -0.01991    -0.00947    -0.0015     0.00141     2.1E-05     0.0016

Test of First and Second Moment Specification
DF    Chi-Square    Pr > ChiSq
20    26.89         0.1384

Results of Multicollinearity Test on the Final Model of o-Xylene
Parameter Estimates
Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
Intercept    1    4.45813              1.4074           3.17      0.0018     .           0
f14_lmlnv    1    7.44373              4.48291          1.66      0.0988     0.93057     1.07461
GSlmlnv      1    9.54244              5.47996          1.74      0.0835     0.91381     1.09431
Stab4        1    0.52092              0.09234          5.64      <.0001     0.72058     1.38777
K5           1    -0.02352             0.00419          -5.62     <.0001     0.9089      1.10023
U4           1    -0.12197             0.03697          -3.3      0.0012     0.68247     1.46527
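The Pr > ChiSq values in the Test of First and Second Moment Specification are upper-tail chi-square probabilities for the stated DF. Because the DF values in this appendix (20 and 14) are even, the tail probability has a closed form as a finite sum, so it can be checked without a statistics library. The sketch below is an illustrative check only; it does not reproduce the test statistic itself, and the function name is an assumption.

```python
import math


def chi2_upper_tail_even_df(chi_square: float, df: int) -> float:
    """Upper-tail chi-square probability, valid only for even, positive df.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires an even, positive df")
    half = chi_square / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


p_o_xylene = chi2_upper_tail_even_df(26.89, 20)   # reported 0.1384
p_mp_xylene = chi2_upper_tail_even_df(23.57, 20)  # reported 0.2616
p_toluene = chi2_upper_tail_even_df(26.13, 14)    # reported 0.0249
print(f"{p_o_xylene:.4f} {p_mp_xylene:.4f} {p_toluene:.4f}")
```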

-------
Collinearity Diagnostics
                                        Proportion of Variation
Number   Eigenvalue   Condition Index   Intercept   f14_lmlnv   GSlmlnv   Stab4     K5         U4
1        4.83844      1                 2.4E-05     0.01282     0.01258   0.00022   3.4E-05    0.0017
2        0.63267      2.76544           5E-05       0.36289     0.35101   0.00041   6.9E-05    0.00403
3        0.46846      3.21379           1.97E-10    0.61709     0.60919   1.9E-06   3.02E-08   1.3E-05
4        0.05647      9.25612           0.00042     2.5E-05     0.00486   0.01685   0.00079    0.5583
5        0.0036       36.6669           0.02017     0.00148     0.01935   0.85845   0.06507    0.24109
6        0.00035      116.918           0.97934     0.00569     0.00301   0.12406   0.93403    0.19487

-------
Toluene

Consistent Covariance of Estimates
Variable     Intercept   f14_lmlnv   Stab4       K5          RH5
Intercept    2.53139     0.99584     -0.03324    -0.00899    0.00266
f14_lmlnv    0.99584     15.3306     -0.0417     -0.00354    0.00176
Stab4        -0.03324    -0.0417     0.00931     -1.1E-05    -0.00015
K5           -0.00899    -0.00354    -1.1E-05    3.5E-05     -1.2E-05
RH5          0.00266     0.00176     -0.00015    -1.2E-05    2E-05

Test of First and Second Moment Specification
DF    Chi-Square    Pr > ChiSq
14    26.13         0.0249

Results of Multicollinearity Test on the Final Model of Toluene
Parameter Estimates
Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
Intercept    1    3.11017              1.84905          1.68      0.0945     .           0
f14_lmlnv    1    14.7215              4.43634          3.32      0.0011     0.98854     1.01159
Stab4        1    0.70584              0.12046          5.86      <.0001     0.84859     1.17842
K5           1    -0.02067             0.00635          -3.25     0.0014     0.84105     1.18898
RH5          1    0.01116              0.0048           2.33      0.0212     0.73064     1.36867

-------
Collinearity Diagnostics
                                        Proportion of Variation
Number   Eigenvalue   Condition Index   Intercept   f14_lmlnv   Stab4     K5        RH5
1        4.31674      1                 3.8E-05     0.01532     0.00035   4E-05     0.00122
2        0.65659      2.56408           2.5E-05     0.97236     0.00024   2.5E-05   0.00092
3        0.02114      14.291            0.00462     0.00271     0.01229   0.00315   0.81896
4        0.00516      28.9175           0.01539     0.0002      0.91675   0.02567   0.03874
5        0.00038      106.883           0.97993     0.00941     0.07038   0.97112   0.14016

-------
Benzene

Consistent Covariance of Estimates
Variable     Intercept   f14_lmlnv   GSlmlnv     Stab4       K5          U4
Intercept    1.92094     0.77079     0.1021      -0.08161    -0.00501    -0.02127
f14_lmlnv    0.77079     7.55125     -2.74611    -0.02564    -0.00231    -0.0074
GSlmlnv      0.1021      -2.74611    28.7272     -0.07682    0.00096     -0.03097
Stab4        -0.08161    -0.02564    -0.07682    0.0103      7.3E-05     0.00224
K5           -0.00501    -0.00231    0.00096     7.3E-05     1.6E-05     1.3E-05
U4           -0.02127    -0.0074     -0.03097    0.00224     1.3E-05     0.00161

Test of First and Second Moment Specification
DF    Chi-Square    Pr > ChiSq
20    25.52         0.1824

Results of Multicollinearity Test on the Final Model of Benzene
Parameter Estimates
Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
Intercept    1    10.0744              1.49805          6.73      <.0001     .           0
f14_lmlnv    1    5.4977               3.32981          1.65      0.1007     0.95886     1.0429
GSlmlnv      1    16.1478              5.57504          2.9       0.0043     0.94282     1.06065
Stab4        1    0.30356              0.09971          3.04      0.0027     0.7092      1.41003
K5           1    -0.03914             0.00447          -8.76     <.0001     0.9031      1.10729
U4           1    -0.08488             0.03966          -2.14     0.0338     0.66435     1.50522

-------
Collinearity Diagnostics
                                        Proportion of Variation
Number   Eigenvalue   Condition Index   Intercept   f14_lmlnv   GSlmlnv   Stab4     K5        U4
1        4.74689      1                 2.6E-05     0.01248     0.01339   0.00023   3.5E-05   0.00175
2        0.66641      2.66891           3.5E-05     0.7678      0.07717   0.00029   4.6E-05   0.00289
3        0.52523      3.00627           1.3E-05     0.20451     0.88566   8.8E-05   1.9E-05   0.00068
4        0.0576       9.0781            0.0004      0.00464     8.8E-06   0.01655   0.00077   0.54143
5        0.00351      36.755            0.02129     0.00542     0.02252   0.86748   0.0673    0.25551
6        0.00036      114.882           0.97823     0.00515     0.00125   0.11536   0.93183   0.19774

-------
Ethylbenzene

Consistent Covariance of Estimates
Variable     Intercept   f14_lmlnv   Stab4       K5          U4
Intercept    6.25969     0.16325     -0.23772    -0.01664    -0.07101
f14_lmlnv    0.16325     31.8971     0.0485      -0.00188    -0.02487
Stab4        -0.23772    0.0485      0.02746     0.00027     0.00441
K5           -0.01664    -0.00188    0.00027     5.2E-05     8.2E-05
U4           -0.07101    -0.02487    0.00441     8.2E-05     0.00625

Test of First and Second Moment Specification
DF    Chi-Square    Pr > ChiSq
14    22.44         0.0700

Table E.10. Results of Multicollinearity Test on the Final Model of Ethylbenzene
Parameter Estimates
Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
Intercept    1    5.98125              2.43806          2.45      0.0152     .           0
f14_lmlnv    1    9.6811               5.62338          1.72      0.087      0.99172     1.00835
Stab4        1    0.43775              0.16287          2.69      0.0079     0.75403     1.32621
K5           1    -0.02747             0.00732          -3.75     0.0002     0.92634     1.07952
U4           1    -0.11372             0.06587          -1.73     0.0861     0.71056     1.40735

-------
Collinearity Diagnostics
                                        Proportion of Variation
Number   Eigenvalue   Condition Index   Intercept   f14_lmlnv   Stab4     K5        U4
1        4.29372      1                 3.4E-05     0.01575     0.0003    4.6E-05   0.00227
2        0.64669      2.57673           2.4E-05     0.9744      0.0002    3E-05     0.0019
3        0.05557      8.79047           0.00046     0.00241     0.01763   0.00084   0.58435
4        0.00365      34.3084           0.0224      0.00158     0.87486   0.06731   0.23986
5        0.00038      106.199           0.97709     0.00585     0.10702   0.93178   0.17163
-------
Methyl tert Butyl Ether (MTBE)

Consistent Covariance of Estimates
Variable     Intercept   f11_lmlnv   GSlmlnv     Stab4       K5          U4
Intercept    4.96409     0.29828     -1.29803    -0.13099    -0.01409    -0.06632
f11_lmlnv    0.29828     210.102     -18.3631    0.08589     -0.00205    -0.09811
GSlmlnv      -1.29803    -18.3631    61.7254     -0.34534    0.00883     0.03525
Stab4        -0.13099    0.08589     -0.34534    0.0222      3.8E-05     0.0028
K5           -0.01409    -0.00205    0.00883     3.8E-05     4.6E-05     0.00014
U4           -0.06632    -0.09811    0.03525     0.0028      0.00014     0.00261

Test of First and Second Moment Specification
DF    Chi-Square    Pr > ChiSq
20    18.20         0.5745

Results of Multicollinearity Test on the Final Model of MTBE
Parameter Estimates
Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
Intercept    1    -2.743               2.27071          -1.21     0.2288     .           0
f11_lmlnv    1    22.2547              14.3072          1.56      0.1217     0.97446     1.02621
GSlmlnv      1    33.5627              8.28221          4.05      <.0001     0.95972     1.04197
Stab4        1    0.24348              0.14963          1.63      0.1056     0.70196     1.42459
K5           1    0.01239              0.00669          1.85      0.0659     0.89917     1.11213
U4           1    -0.18702             0.06022          -3.11     0.0022     0.6496      1.53942
-------
Collinearity Diagnostics
                                        Proportion of Variation
Number   Eigenvalue   Condition Index   Intercept   f11_lmlnv   GSlmlnv   Stab4     K5        U4
1        4.66055      1                 2.7E-05     0.01139     0.01397   0.00024   3.8E-05   0.00183
2        0.73182      2.52358           2E-05       0.94942     0.00242   0.00018   2.8E-05   0.00096
3        0.54484      2.92472           2.6E-05     0.0308      0.95844   0.00018   3.7E-05   0.00209
4        0.0588       8.90275           0.00038     0.00666     0.00263   0.01602   0.00077   0.53461
5        0.00364      35.8027           0.01965     0.00015     0.02173   0.85101   0.06764   0.24151
6        0.00036      113.396           0.9799      0.00157     0.00081   0.13237   0.93148   0.21899

-------
Tetrachloroethylene (PCE)

Consistent Covariance of Estimates
Variable     Intercept   DCFlmlnv    Stab4       K5           U4          RH5
Intercept    1.73111     -0.03794    -0.0554     -0.00463     -0.02507    -0.00017
DCFlmlnv     -0.03794    200.107     -0.25514    0.0035       -0.01538    -0.00311
Stab4        -0.0554     -0.25514    0.00855     2.7E-05      0.00145     -8.6E-06
K5           -0.00463    0.0035      2.7E-05     1.5E-05      3.5E-05     -1.5E-06
U4           -0.02507    -0.01538    0.00145     3.5E-05      0.00135     3.1E-05
RH5          -0.00017    -0.00311    -8.6E-06    -1.49E-06    3.1E-05     7.86E-06

Test of First and Second Moment Specification
DF    Chi-Square    Pr > ChiSq
20    11.97         0.9170

Results of Multicollinearity Test on the Final Model of PCE
Parameter Estimates
Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Tolerance   Variance Inflation
Intercept    1    2.4945               1.34715          1.85      0.0659     .           0
DCFlmlnv     1    32.6734              12.3964          2.64      0.0092     0.97021     1.0307
Stab4        1    0.14442              0.08831          1.64      0.104      0.74611     1.34028
K5           1    -0.01229             0.00416          -2.96     0.0036     0.85224     1.17338
U4           1    -0.1441              0.03588          -4.02     <.0001     0.71002     1.40842
RH5          1    0.00913              0.00301          3.04      0.0028     0.75523     1.3241

-------
Collinearity Diagnostics
                                        Proportion of Variation
Number   Eigenvalue   Condition Index   Intercept   DCFlmlnv   Stab4     K5        U4        RH5
1        5.53937      1                 1.9E-05     0.0088     0.00017   2.4E-05   0.00132   0.0008
2        0.371        3.86407           3.4E-05     0.94315    0.00023   4.5E-05   0.00492   0.0016
3        0.06942      8.933             1.7E-05     0.01382    0.00338   6.8E-05   0.42245   0.0973
4        0.01614      18.5246           0.00354     0.02428    0.06688   0.00359   0.23735   0.84095
5        0.00374      38.4987           0.02173     0.00461    0.79268   0.05781   0.22567   0.00917
6        0.00034      127.603           0.97466     0.00533    0.13666   0.93846   0.10829   0.05018

-------
4.10. PM2.5 Mass
   Summary of Stepwise Selection
        Variable
   Step Entered
Variable
Removed
        Stab4

       Ufl9_linv
        fll_llnv
        PM03DlS_inv

       KmmHG4
            Number  Partial    Model
Label        Vars  In R-Square  R-Square   C(p)
F Value Pr > F
Stab4
U4
f!9 linv
fll_llnv
PM03DIS inv
K4
mmHG4
1
2
3
4
5
6
7
0
0
0
0
0
0
0
.3184
.0655
.0524
.0210
.0163
.0141
.0134
0,
0,
0,
0,
0,
0,
0,
.3184
.3839
.4363
.4573
.4736
.4877
.5010
30
20
12
10
9
8
8
.4114
.0725
.1924
.2394
.1699
.5164
.0000
46,
10,
9,
3,
2,
2,
2,
.71
.52
.12
.75
.97
.61
.52
<,
0,
0,
0,
0,
0,
0,
.0001
.0016
.0032
.0557
.0880
.1094
.1160
                                     Analysis of Variance
Sum of Mean
Source DF Squares Square F Value
Model T 12.62973 1.80425 13.48
Error 94 12.57688 0.13380
Corrected Total 101 25.20661
Parameter Standard
Variable Estimate Error Type II SS F Value Pr
intercept 13.91429 6.84008 0.55366 4.14 0.
fll linv 25.77270 11.93020 0.62441 4.67 0.
f!9 linv 3.62748 1.65616 0.64188 4.80 0.
PM03DIS inv 58.63622 29.32827 0.53482 4.00 0.
-0.00927 0.00476 0.50718 3.79 0.
K4
U4mmHG4
Stab4


Variable
Step Entered
Stab4
1
2 Ufl9_linv
-0.16301 0.03855 2.39254 17.88 <.
-0.01311 0.00827 0.33669 2.52 0.
0.41383 0.09362 2.61421 19.54 <.
Bounds on condition number: 1.5607, 58.934
Summary of Stepwise Selection
Variable Number Partial Model
Removed Label Vars In R-Square R-Square C(p)
Stab4 1 0.3184 0.3184 26.3067
U4 2 0.0655 0.3839 16.3623
f!9 linv 3 0.0524 0.4363 8.7980
Pr > F
<.0001
> F
0447
0333
0310
0485
0545
0001
1160
0001



F Value
46.71
10.52
9.12






Pr > F
<.0001
0.0016
0.0032
Final Report
                                            B-25

-------
        fll_llnv
        PM03DlS_inv
   fll_llnv
   PM03DlS_inv
                       0.0210
                       0.0163
                   0.4573
                   0.4736
              6.9713
              6.0000
          3.75 0.0557
          2.97 0.0880
         Source

         Model
         Error
         Corrected Total
         Analysis  of Variance

                Sum of           Mean
     DF        Squares         Square

      5       11.93801        2.38760
     96       13.26860        0.13821
    101       25.20661
                                        F Value     Pr  >  F

                                          17.27     <.0001
              Variable

              Intercept
              fll_llnv
              f!9_linv
              PM03DlS_inv

            U4Stab4
                               Find The Best Fitted Model  for PM
                                    Stepwise Selection:  Step 5
Parameter
 Estimate
  1.
 25.
  4.
  .06000
  .27425
  .19851
50.94389
-0.13037
 0.42820
             Standard
                Error
 0.57297
12.03839
  .65726
  .55414
 0.03636
 0.09491
               1.
              29.
           Type II SS  F Value  Pr > F
0.47304
0.60922
0.88708
0.41068
1.77726
2.81332
42
41
                                        6.42
                                        2
                                       12
97
86
                                       20.35
0.0674
0.0384
0.0129
0.0880
0.0005
<.0001
                           Bounds on condition number:  1.3453,  28.976

-------
C-11 Elemental Carbon

                         Summary of Stepwise Selection

      Variable     Variable                Number   Partial     Model
Step  Entered      Removed   Label        Vars In  R-Square  R-Square     C(p)   F Value  Pr > F

   1  RH4                    RH4                1    0.2357    0.2357   9.6140     15.42  0.0003
   2  Stab4                  Stab4              2    0.0562    0.2919   7.3748      3.89  0.0542
   3  f14_1inv               f14_1inv           3    0.0362    0.3281   6.6483      2.58  0.1145
   4  PM02DIS_inv            PM02DIS_inv        4    0.0484    0.3765   5.0000      3.65  0.0622

                           Analysis of Variance

                              Sum of        Mean
Source              DF       Squares      Square   F Value   Pr > F

Model                4       4.07456     1.01864      7.09   0.0001
Error               47       6.74801     0.14357
Corrected Total     51      10.82257

               Parameter    Standard
Variable        Estimate       Error   Type II SS   F Value   Pr > F

Intercept       -2.76874     0.68589      2.33955     16.30   0.0002
f14_1inv         8.41047     3.84635      0.68647      4.78   0.0338
PM02DIS_inv    944.59427   494.54016      0.52380      3.65   0.0622
RH4              0.01371     0.00498      1.09096      7.60   0.0083
Stab4            0.35474     0.14301      0.88334      6.15   0.0168

                Bounds on condition number: 1.2892, 19.405

-------
C-12 Organic Carbon

                         Summary of Forward Selection

      Variable                  Number   Partial     Model
Step  Entered      Label       Vars In  R-Square  R-Square     C(p)   F Value  Pr > F

   1  Stab4        Stab4             1    0.2614    0.2614   2.8931     17.70  0.0001
   2  f11_1inv     f11_1inv          2    0.0275    0.2889   3.0000      1.89  0.1751

                           Analysis of Variance

                              Sum of        Mean
Source              DF       Squares      Square   F Value   Pr > F

Model                2       3.25109     1.62555      9.95   0.0002
Error               49       8.00212     0.16331
Corrected Total     51      11.25321

                     Find The Best Fitted Model for PM
                         Forward Selection: Step 2

               Parameter    Standard
Variable        Estimate       Error   Type II SS   F Value   Pr > F

Intercept       -1.67517     0.66211      1.04536      6.40   0.0147
f11_1inv        26.64584    19.36632      0.30915      1.89   0.1751
Stab4            0.55571     0.13474      2.77773     17.01   0.0001

                Bounds on condition number: 1.0061, 4.0244

-------
C-13 PAHs

                         Summary of Stepwise Selection

      Variable     Variable                Number   Partial     Model
Step  Entered      Removed   Label        Vars In  R-Square  R-Square     C(p)   F Value  Pr > F

   1  K4                     K4                 1    0.2160    0.2160  24.5663     11.85  0.0013
   2  Stab4                  Stab4              2    0.1613    0.3773  13.0761     10.88  0.0020
   3  f11_1inv               f11_1inv           3    0.0719    0.4493   9.0590      5.36  0.0257
   4  Precip4                Precip4            4    0.0424    0.4917   7.5090      3.34  0.0751
   5  PM01DIS_inv            PM01DIS_inv        5    0.0420    0.5337   6.0000      3.51  0.0685

                             The REG Procedure
                               Model: MODEL1
                   Dependent Variable: LnBghiPP LnBghiPP

                         Stepwise Selection: Step 5

               Parameter    Standard
Variable        Estimate       Error   Type II SS   F Value   Pr > F

Intercept       13.55955     4.27716      6.21906     10.05   0.0030
f11_1inv       125.65797    48.00264      4.24028      6.85   0.0125
PM01DIS_inv    629.84523   336.23325      2.17136      3.51   0.0685
Precip4        -12.18382     5.11993      3.50416      5.66   0.0223
K4              -0.07630     0.01451     17.10874     27.65   <.0001
Stab4            1.37241     0.32101     11.31041     18.28   0.0001

                Bounds on condition number: 1.3725, 30.547

-------
                         Summary of Stepwise Selection

      Variable     Variable                Number   Partial     Model
Step  Entered      Removed   Label        Vars In  R-Square  R-Square     C(p)   F Value  Pr > F

   1  K4                     K4                 1    0.2140    0.2140  25.6691     11.71  0.0014
   2  Stab4                  Stab4              2    0.1966    0.4106  10.9925     14.01  0.0005
   3  f11_1inv               f11_1inv           3    0.0588    0.4694   8.0059      4.54  0.0391
   4  Precip4                Precip4            4    0.0437    0.5131   6.2988      3.59  0.0654
   5  PM01DIS_inv            PM01DIS_inv        5    0.0271    0.5402   6.0000      2.30  0.1375

                             The REG Procedure
                               Model: MODEL1
                     Dependent Variable: LnCORP LnCORP

                         Stepwise Selection: Step 5

               Parameter    Standard
Variable        Estimate       Error   Type II SS   F Value   Pr > F

Intercept       14.23612     4.72659      6.85515      9.07   0.0045
f11_1inv       125.00731    53.04655      4.19648      5.55   0.0236
PM01DIS_inv    563.35265   371.56318      1.73710      2.30   0.1375
Precip4        -13.04686     5.65791      4.01818      5.32   0.0265
K4              -0.08337     0.01603     20.43059     27.04   <.0001
Stab4            1.63208     0.35474     15.99544     21.17   <.0001

                Bounds on condition number: 1.3725, 30.547

-------
Pearson Correlation between X Variables
Variables: f11_1, f12_1, f14_1, f16_1, f17_1, f19_1, GS1, DCF1, Tol_PS1
Prob > |r| under H0: Rho=0
[Correlation matrix: cell values not recoverable from the source.]

-------
Correlation between X Variables
The CORR Procedure
Simple Statistics
Variables: f11_1mInv, f12_1mInv, f14_1mInv, f16_1mInv, f17_1mInv, f19_1mInv, GS1mInv, DCF1mInv, Tol_PS1mInv
Prob > |r|
[Correlation matrix: cell values not recoverable from the source.]

-------
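Correlation between X Variables
The CORR Procedure
Legible variables: Stab4, U4, RH5, and a precipitation variable (name garbled in the source)
Prob > |r| under H0: Rho=0
[Correlation matrix: cell values not recoverable from the source.]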

-------
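The CORR Procedure
Legible variables: oXO, BznO, EbzO, MTBEO, PCEO
Prob > |r| under H0: Rho=0
Legible coefficients (row and column pairing lost): 0.61564 (p <.0001), 0.43479 (p <.0001), 0.83145
[Correlation matrix: remaining cell values not recoverable from the source.]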

-------
Simple Statistics
Correlation Coefficients, N = 183
Variables: LnmpXO, LnoXO, LnTolO, LnBznO, LnEbzO, LnMTBEO, LnPCEO, LnCCl4O
Prob > |r| under H0: Rho=0
Legible coefficients (row and column pairing lost): .62676, 0.27904
[Correlation matrix: remaining cell values not recoverable from the source.]

-------
General Comments:

Overall, the report reflects a substantial amount of high-quality work, and reflects good
practices in ensuring the quality of geographic data for use in subsequent analysis. The
statistical approaches are supported by independent data, including relative vapor
pressures for BTEX species.

General comments follow, with detailed in-line comments following.

1.
The choice of ordinary (multiple) linear regression for analysis of this data  set is
acceptable, but several caveats are in order.  While a full description of the  RIOPA data
collection protocol has not yet been published, it is EPA's understanding that several
homes were monitored concurrently for 48 hr (say n per subset), after which a new set of
homes was monitored. To conduct monitoring at 100 homes, 100/n = p different rounds
of home data collection would have to be undertaken.  This process introduces an issue of
non-independence of data for homes monitored during the same one of the p rounds of
data collection. During the same 48 hr period  of monitoring, the homes being monitored
shared the same meteorological data (used in the current analysis). As such, these data
may be analogous to the "clustering" phenomenon in surveys (associated with a loss of
sampling efficiency).  While this is unlikely to have a significant impact on the
magnitude  of the regression coefficients, it may have a substantial effect on their
estimated standard errors.  One way to significantly strengthen the current analysis would
be to include for 2-3 compounds (say, one species each of PM, VOC, and PAH) a
sensitivity analysis in which a mixed effects model is applied to the data sets, to account
for random within-"cluster" variation.  In SAS, the PROC MIXED procedure would be
used for such analysis. Addition of 1st order autocorrelation for data collected
simultaneously would also be appropriate  here.

It is now reported that typically 1 or 2 homes were sampled on a single day, though some
days had 3 or 4 homes sampled, so clustering should not be an issue.  PROC MIXED was
run with date as the repeated variable and no autocorrelation was found (page 33-34).
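
The survey "clustering" analogy in comment 1 can be made concrete with the Kish design effect, deff = 1 + (m - 1) * rho, where m is the number of homes monitored concurrently and rho is the intraclass correlation. The sketch below is illustrative only: the rho value is a hypothetical assumption, not a RIOPA estimate, the function names are not from the report, and the formula strictly applies to a mean, so it is only a rough guide for regression standard errors.

```python
import math


def design_effect(homes_per_round: float, icc: float) -> float:
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (homes_per_round - 1.0) * icc


def se_inflation(homes_per_round: float, icc: float) -> float:
    """Approximate factor by which a naive standard error is understated."""
    return math.sqrt(design_effect(homes_per_round, icc))


# Hypothetical intraclass correlation; not estimated from the study data.
ASSUMED_ICC = 0.5

for m in (1, 2, 4, 10):
    deff = design_effect(m, ASSUMED_ICC)
    print(f"homes per round = {m:2d}  deff = {deff:.2f}  "
          f"SE inflation = {se_inflation(m, ASSUMED_ICC):.2f}")
```

With one or two homes per sampling day, as described in the response, the design effect stays close to 1 even under a fairly large assumed correlation, which is consistent with the conclusion that clustering is a minor concern.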

1.5
    On a related note, the low partial R-square values for most of the regression coefficients
    should be further explored.  One interpretation is that spatial patterns are relatively small
    contributors to overall variability in ambient concentrations. Another is that given the
    RIOPA sampling approach, the small number of concurrently-monitored homes
    resulted in assignment of a larger portion of explained variability to day-to-day
    ("sampling cluster" to "sampling cluster") variation than would be observed given a
    "balanced" design in which spatial and temporal variability would be more separable.
    Recently, the Battelle Memorial Institute conducted an analysis of sources of
    variability in EPA's pilot project for air toxics monitoring in ambient air within
    several  cities nationally.  At a series of fixed sites with simultaneous measurements,
    within-city spatial variability contributed almost as great a fraction of total variability
    as temporal variability (Battelle Memorial  Institute and Sonoma Technologies, Inc.

-------
    (2003) Draft technical report for Phase II air toxics monitoring data: analyses and
    network design recommendations. Prepared for Lake Michigan Air Directors
    Consortium, Des Plaines, Illinois 60018).  A discussion of the role of study design in
    interpretation of these results is appropriate in the report.
More details of the study design are reported and a copy of a paper in press detailing
that information is included.  The clustering due to either date or location is not a
problem based on the study design, as homes were selected throughout the 18 month
study period from all sections of the city without concentrating on any portion of the city
during individual time periods.

2.
Appendix A and the results section discuss diagnostic procedures applied to regression
outputs to determine multicollinearity.  However,  neither the Appendix nor main report
provide reasoning for decisions to apply corrective measures or not.  For instance, it is
mentioned that the distance to gas stations is significantly correlated with distance to
major roadways.  What was the strength of this association?  If greater than about 0.85,
this could lead to unstable coefficients. The relative significance of the associations
provides some assurance that variances are not super-inflated, but when one of the
distance terms was removed, did the  other remain stable? Such description is necessary
for the reader to be able to properly interpret the regression results.  Other areas where
further rationale is needed include decisions not to correct multicollinearity in cases with
failing diagnostics (e.g. condition index).
More details on the reason for the decisions have  been included in Appendix A.
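
For two predictors, the variance inflation factor follows directly from their correlation, VIF = 1 / (1 - r^2), so the 0.85 threshold raised in comment 2 corresponds to a VIF of about 3.6. The sketch below computes both quantities; the distance values are made up for illustration and are not RIOPA data, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r):
    """Variance inflation factor when a model has exactly two predictors."""
    return 1.0 / (1.0 - r * r)


# Hypothetical distances (arbitrary units) to a gas station and a major road.
dist_gas_station = [100, 250, 400, 800, 1200]
dist_major_road = [80, 300, 350, 900, 1000]

r = pearson_r(dist_gas_station, dist_major_road)
print(f"r = {r:.3f}, VIF = {vif_two_predictors(r):.1f}")
print(f"VIF at r = 0.85: {vif_two_predictors(0.85):.2f}")
```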

3.
Why were "traditional" residual diagnostics not employed? Cook's Distance, etc.
provide the standard approach to such diagnosis, but the rationale for not using them is
not provided here.
 The approach used to examine the residuals was a traditional residual diagnostic and is now
more clearly stated.  Cook's Distance was not used, as it was more time consuming
and was not thought to provide information beyond what was obtained for the objective
being considered: to derive a cohesive data base to examine the role of proximity on
ambient concentration.  The exclusion of outliers, which probably had other variables
affecting their concentrations, was undertaken to address this fundamental issue. It is a
restrictive approach to identifying outliers, and probably classifies some values as outliers that
were not.
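
For reference, the Cook's Distance diagnostic raised in comment 3 can be sketched for a one-predictor regression as D_i = (e_i^2 / (p * MSE)) * h_i / (1 - h_i)^2, with p = 2 fitted parameters. The example below is a minimal illustration, not the procedure used in the report: the data are invented, and the 4/n cutoff is a common rule of thumb rather than a criterion taken from the study.

```python
def cooks_distance(x, y):
    """Cook's distance for each point of a simple linear regression y ~ x."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - p)
    # Leverage of each point in a one-predictor model.
    leverage = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [
        (e * e / (p * mse)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverage)
    ]


# Invented data: the last observation is both high-leverage and off-trend.
x_vals = [1, 2, 3, 4, 5, 6, 7, 8]
y_vals = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 12.0]

distances = cooks_distance(x_vals, y_vals)
cutoff = 4.0 / len(x_vals)  # rule-of-thumb threshold, an assumption here
flagged = [i for i, d in enumerate(distances) if d > cutoff]
print("flagged indices:", flagged)
```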

4.
Please include a separate reference section, rather than citing the entire source in the text
itself.
Provided

-------
COMMENTS OF RICH COOK, EPA OTAQ

Chad -

I only had a chance to skim this before going on AL, but I have a few comments:

1) In the section "National Emissions Inventory for 1999," I think a little more
discussion of how county level VMT is developed would help. Pechan actually
starts with State level VMT reported in HPMS by States (from sampling), which  is
then allocated to the county level using roadway miles for 12 functional classes
and vehicle class splits. This is briefly outlined in the technical documentation
for the  NEI. Joe Somers can help with a description if needed.
The VMT analysis was used as a guide to indicate which roadways to group
together in the statistical analyses.  The actual emissions were not included
in the regression equations.  This is now stated in the text.  Thus, a more detailed
description of how they were derived is not warranted.
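
The county allocation described in comment 1 (State-level HPMS VMT apportioned by roadway miles) amounts to a proportional split within each functional class. A minimal sketch for one functional class follows; the county names, mileages, and State total are hypothetical, the function name is not from the NEI documentation, and the vehicle class split step is omitted.

```python
def allocate_vmt(state_vmt, county_road_miles):
    """Split a State VMT total across counties in proportion to roadway miles."""
    total_miles = sum(county_road_miles.values())
    if total_miles <= 0:
        raise ValueError("roadway miles must sum to a positive value")
    return {
        county: state_vmt * miles / total_miles
        for county, miles in county_road_miles.items()
    }


# Hypothetical inputs for a single functional class (not NEI data).
road_miles = {"County A": 120.0, "County B": 60.0, "County C": 20.0}
county_vmt = allocate_vmt(1_000_000.0, road_miles)

for county, vmt in county_vmt.items():
    print(f"{county}: {vmt:,.0f} VMT")
```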

2) Table 10 - residential ambient air concentrations - I think it would be  helpful
to compare these data to local ambient monitor data, or maybe even national
averages from AIRS,  presuming resources permit.  Aldehyde concentrations are
much higher than typically seen at ambient monitors.  I wonder why?
Concentrations in the area measured by NJDEP have been added to the table.

3) When discussing why there is not a roadway proximity relationship for
aldehydes, it might be worth presenting some estimates of the secondary
contribution. I know that some modeling has estimated 90% of formaldehyde is
secondary. Again, this is subject to resource availability.
It is not clear to me how to include more on secondary contribution to aldehydes
using the approach taken other than what was done, examining the data by
splitting it into days above 10 °C and below, where different amounts of secondary
production should occur. More detailed source emission modeling, which
includes secondary production for formaldehyde, is being done by Dr. Panos
Georgopoulos with funding from the ACC and may address this issue in the
future.

4) I am surprised there is a signal for the two PAH compounds they measured.
Nationwide, less than 20% of PAH emissions are from mobile sources. This
suggests a pretty strong roadway effect, I think, given all the noise.
The effect does seem strong, but the compounds were selected as ones with
major mobile contributions.

5) Cliff says that coronene (which is misspelled in several places) is associated
more with gasoline vehicles and benzo(ghi)perylene more with diesels, but that
their analysis saw no clear difference in source contributions. I checked the
emission factors we used in the 1999 NEI and found the following:

-------
      a) Average emission rate for light duty vehicles and trucks (Norbeck, J. M.,
T. D. Durbin, and T. J. Truex.  1998. Measurement of Primary Particulate Matter
Emissions from Light Duty Motor Vehicles. Prepared by College of Engineering, Center
for Environmental Research and Technology, University of California, for Coordinating
Research Council and South Coast Air Quality Management District. Tables 1 6 and
    = 0.017 mg/mi
      b) Average emission rate for heavy duty diesels (Watson, J. D., E. Fujita, J.
C. Chow, and B. Zielinska. 1998. Northern Front Range Air Quality Study. Desert
Research Institute.  See Table 4.4-4, page 4-41.) = 0.013 mg/mi

So I am wondering what the source of data is that shows benzo(g,h,i)perylene is
coming mostly from diesels.  There is no reference in the report.
The text has been altered to indicate that coronene is predominantly derived from
gasoline vehicles, with the appropriate reference, while benzo(ghi)perylene is
derived from both gasoline and diesel vehicles.

This is a good product.  I hope these  comments help.

Rich Cook
Environmental Scientist
U.S. EPA
Office of Transportation and Air Quality
2000 Traverwood Drive
Ann Arbor, MI 48105
Phone:  734-214-4827  Fax: 734-214-4939

-------
Stephen Graham
09/13/2004 10:53 AM

To: Chad Bailey/AA/USEPA/US@EPA
 cc: Janet Burke/RTP/USEPA/US@EPA
 Subject: Re: RIOPA draft report

Hi Chad,
Some brief comments and questions. Overall, the draft needs work on sentence structure
in both the text and the descriptions in the tables/figures.

1) Should have more about the sample collection design (what samples were collected and
when, for those included in this work) in background
More has been included and an in press paper is provided to give greater details.

2) All emission rate estimates in Table 4 are generally correlated (based on the Mobile
6.2 modeling, I assume).
 a) there is artificial variability introduced for lesser emitted chemicals (e.g. the
aldehydes) due to rounding
 b) since they are different roadways, should they not have different distributions of
vehicle classes on them, resulting in different emission distributions for those chemicals
listed?
 c) unsure why this was done since not used in regressions
As indicated in the  response to a comment by Richard Cook, this was done to facilitate
the grouping of the roadways and individual emission rates were not included in the
regression model so the effect of rounding is not important.  The individual road classes
are expected to have different vehicle distributions but each chemical and road class was
individually examined so this effect should be accounted for by the analyses.

3) If using statistics for "normal" data, then one  should use normal data or at least the
most normal data.  Several transformations were mentioned on page 33 and then
correlations performed on each of the transformed variables. Why do all possible
pairwise correlations, other than to 'see what gives the highest R'?
Only the Ln transformation of the concentration data was used in the analyses. As part
of the exploratory work to make sure that an association was not missed more extensive
correlations were evaluated.

4) In using multiple regression approaches (forward selection,  backward elimination, etc)
a statement about what each does to the estimate of variance is warranted.
Only the stepwise was used for deriving the final models.  The  others were run to verify
that consistent results were obtained independent of how the regression equations were
derived.
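
As a consistency check on the stepwise output, the F value reported for each entering variable can be recomputed from the R-square columns of the selection summary: F = partial R-square / ((1 - model R-square) / (n - k - 1)), where k is the number of variables in the model after entry. The sketch below applies this to the elemental carbon summary in the appendix (n = 52, from the corrected total DF of 51). The function name is illustrative, and small differences from the listed F values reflect the four-decimal rounding of R-square.

```python
def partial_f(partial_r2, model_r2, n_obs, n_vars_in):
    """F statistic for the variable entered at a step, from R-square values."""
    df_error = n_obs - n_vars_in - 1  # residual degrees of freedom after entry
    return partial_r2 / ((1.0 - model_r2) / df_error)


# (variable, partial R-square, model R-square, reported F) taken from the
# elemental carbon stepwise summary; n = 52 observations.
EC_STEPS = [
    ("RH4", 0.2357, 0.2357, 15.42),
    ("Stab4", 0.0562, 0.2919, 3.89),
    ("f14_1inv", 0.0362, 0.3281, 2.58),
    ("PM02DIS_inv", 0.0484, 0.3765, 3.65),
]

recomputed = [
    partial_f(p_r2, m_r2, 52, k)
    for k, (_, p_r2, m_r2, _) in enumerate(EC_STEPS, start=1)
]

for (name, _, _, reported), f_val in zip(EC_STEPS, recomputed):
    print(f"{name:12s} recomputed F = {f_val:6.2f}   reported F = {reported:6.2f}")
```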

5) For influential ("outliers") statistics, why not  use something more standard like  Cook's
D (apparently similar to what was used in this study; Cook's D uses the F distribution rather
than t), DFFITS, DFBETAS, COVRATIO?

-------
See the explanation provided to the 1st reviewer.

6) condition number of 10-30 indicates mild collinearity, 30-100 moderate, >100 severe.
Impact of excluding/flagging only severe category should be mentioned, although it looks
like even parameters with severe collinearity were indeed included in the "final" models.
We agree that collinearity did exist, but it was predominantly in the meteorology
variables, so the models were deemed acceptable for examining, particularly in a semi-
quantitative manner, the role of proximity.
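
The reviewer's bands in comment 6 (10-30 mild, 30-100 moderate, above 100 severe) can be written as a small classifier. This is only a sketch of that rule of thumb: how a boundary value such as exactly 30 is assigned is an assumption made here, the label for values under 10 is also an assumption, and the input values are hypothetical rather than diagnostics from the report.

```python
def collinearity_category(condition_number: float) -> str:
    """Classify a condition number using the reviewer's rule-of-thumb bands."""
    if condition_number > 100:
        return "severe"
    if condition_number >= 30:  # boundary assigned to "moderate" by assumption
        return "moderate"
    if condition_number >= 10:
        return "mild"
    return "below the mild range"


# Hypothetical condition numbers, for illustration only.
for value in (5.0, 12.5, 45.0, 250.0):
    print(f"{value:6.1f} -> {collinearity_category(value)}")
```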

7) coronene was misspelled several differing ways.
Fixed

8) number of outliers in table 15 is not consistent with Appendix regression outliers. For
example Table 15 lists 13 outliers for m,p-xylene, page A-3 states 17, Table A-2 states
20. QA check should be done here.
Fixed

9) In observing some of the stats for m,p-xylene, it seems that a 5-parameter model was
best and used, rather than a 7 parameter mentioned in page A-3
Fixed

10) Table 16
 a) lists 5 different significance levels ranging from 0.0001 through 0.105 (and I think in
the text it is mentioned on occasion as highly significant, more significant, etc.).
Establish a level of significance (e.g., p<0.05), and either something is statistically
significant or not, rather than varying degrees of significant.
For the final model, which was based on a stepwise procedure, p<0.15 was used as the
criterion for inclusion of a variable.  The other routines were allowed to have less stringent
significance criteria as part of the exploratory analyses.

 b) does not indicate significance level for aldehyde, PM, PAH, and OC/EC parameter
estimates
All used p<0.15

 c) precip units are not mentioned, it is apparently a significant parameter for the PAHs
only, but it did not really rain/snow that much over the study period (maximum listed in
table 14 is 0.13 mm if units are correct).  Would one expect that much washout from so
little precipitation? Even if it were inches, the median is 0.01, barely trace-level
precipitation.  If it is real, why no impact to the PM since essentially these PAH would
all be associated with some form of particulate matter? I suspect that the precipitation is
acting as a surrogate for some other parameter that has not been measured or possibly
systematic error in PAH measurements.
Units now given. The regression suggests an association, not an explanation. It could
be another variable with which both correlate.

-------
 d) for ethyl benzene, inverse squared transformation was used and coefficient estimate is
 167.14 in table A-10 and also in Appendix C, however is listed as 0.17 in Table 16. This
 should be corrected, but I have a comment: It is good to see a general consistency among
BTEX coefficient estimates as expected, however, not sure why the inverse squared was
used outside of "it made a better model".  It is not very significant (r2=0.16) and would
rather see it in the same units as  the others.
All now use the inverse of the transformed variable, not the square.

 e) in general the distance parameters (FC, GS, DCF, Truck, etc) did not add very much
to explaining variation in residential concentrations, even for the true mobile source
chemicals.  This is not surprising since the chemical is more than likely to never travel on a
direct vector from highway A to home 1.  This tells us immediately that if we want to
know the impact a roadway is having on a residence, we need to do a better job of
measuring this in the future (i.e., the 'dilution' or mixing with other air not originating
from this source as a function of distance and micrometeorological conditions
(estimated?), the time-of-day, day-of-week, month-of-year (i.e., modified AADT)),
otherwise we are really just taking stabs at it in the dark.
Agree

 f) cannot remember why ambient concentration is not used as a parameter,  even for a
 single central site monitor since  it will probably do more for the model than the distance
parameters.
No central site data were available.
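
Item 10 d) concerns the inverse-distance terms. If the natural log of concentration is regressed on the inverse of a distance, ln C = b0 + b1 / d + ..., then the ratio of predicted concentrations at two distances, with all other terms held fixed, is exp(b1 * (1/d_near - 1/d_far)). The sketch below evaluates that ratio. It assumes the transformed variable enters as 1/d; the coefficient and distances are hypothetical placeholders, since the units of the distance variables are not restated in these comments.

```python
import math


def concentration_ratio(coef, d_near, d_far):
    """Ratio C(d_near) / C(d_far) implied by ln C = b0 + coef / d + ..."""
    return math.exp(coef * (1.0 / d_near - 1.0 / d_far))


# Hypothetical coefficient and distances (arbitrary but consistent units).
ASSUMED_COEF = 10.0

for near, far in ((50.0, 500.0), (100.0, 500.0), (250.0, 500.0)):
    ratio = concentration_ratio(ASSUMED_COEF, near, far)
    print(f"d = {near:5.0f} vs d = {far:5.0f}: ratio = {ratio:.3f}")
```

Because the term is 1/d, the implied effect falls off quickly with distance: most of the predicted difference occurs among homes closest to the source.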

 11) correlation was mentioned among some of the input parameters; I would like to see
what the actual correlations between FC14 and GS for the residences are rather than a
brief mention. This may be evident in the predictions given in figures 10, 11, and 13 that
 show no effective difference in using either the FC or GS distance parameter. What
about stability and temperature by season, are there correlations here?
A correlation matrix has now been included in Appendix D.

 12) not sure where ridge regression was used  (technique mentioned on page 37)
Not used for the data presented,  section has been removed.

 13) no reference section included
Reference section added.

-------