          Guidelines for the Use of
      Anticipated Residues
                 in
  Dietary Exposure Assessment
                 EPA

-------
                            ACKNOWLEDGMENTS
Many people  participated in  the preparation of this document.
Anticipated  Residues  work group co-chairpersons  Edward Zager
(Chief, Chemistry Branch 2 - Reregistration Support, Office of
Pesticide Programs) and Ingrid Schultze  (Statistical Policy
Branch, Office of Policy, Planning, and Evaluation) provided
technical expertise as well as direction and coordination.  The
appendices were prepared by Research Triangle Institute (RTI,
Research Triangle Park, NC) and were revised by the work group to
reflect current Agency policy.   In addition, the following  work
group members provided valuable input  into this  effort:

Janet Auerbach      Regulation Management Branch, Office of Water
Paul White          Exposure Assessment Group, Office of Research and
                    Development
John Faulkner       Economic Analysis Branch, Office of Pesticide Programs
Jim Kariya          Science Analysis and Coordination Branch, Office of
                    Pesticide Programs
Paul Parsons        Science Analysis and Coordination Branch, Office of
                    Pesticide Programs
Steve Dapson        Toxicology Branch II, Office of Pesticide Programs
Maureen Clifford    Science Analysis and Coordination Branch, Office of
                    Pesticide Programs
Joe Reinert         Pesticide Policy Branch, Office of Policy, Planning,
                    and Evaluation
Debra Edwards       Chemistry Branch 1 - Tolerance Support, Office of
                    Pesticide Programs
Andrew Rathman      Chemistry Branch 2 - Reregistration Support, Office of
                    Pesticide Programs
Richard Schmitt     Health Effects Division, Office of Pesticide Programs
Richard Griffin     Chemistry Branch 1 - Tolerance Support, Office of
                    Pesticide Programs
Susan Hummel        Chemistry Branch 2 - Reregistration Support, Office of
                    Pesticide Programs
Michael Metzger     Chemistry Branch 2 - Reregistration Support, Office of
                    Pesticide Programs

-------
                               TABLE OF CONTENTS

                                                                           Page

      Abstract	      vi
I.     Purpose	       1
II.   Pesticide Registration and Tolerances	       3
   A.  Pesticide Registration	       3
   B.  Tolerances	       3
III.  Food Consumption	       5
   A.  The Dietary Risk Evaluation System (DRES)	       5
   B.  Other Food Consumption Estimates	       9
IV.   Types of Risk	       9
V.     Types of Data	      11
   A.  Metabolism Studies in Plants and Animals	      11
   B.  Analytical Methodology	      13
   C.  Residue Field Trials	      13
   D.  Processing Studies	      14
   E.  Feeding Studies.	      15
   F.  Monitoring Data	      17
   G.  Residue Degradation/Reduction Studies	      20
   H.  Pesticide Usage Data	      22
VI.   Use of Data	      24
   A.  Anticipated Residue Determination:  Sequence of Events in
       Determining Dietary Exposure	      24
     1.  Monitoring Studies	      29
     2.  Residue Field Trial and Degradation-Reduction Studies	      34
References	      39

APPENDIX 1:  EXAMPLES OF CALCULATIONS TO DETERMINE TOLERANCES AND
ANTICIPATED RESIDUES

Example 1:  Tolerance/Anticipated Residue Determination Using Field
Trial/Degradation Data (Residues of Pesticide A in Grapefruit)	      1-1
Example 2:  Anticipated Residue Determination Using Monitoring Data
(Pesticide B in Grapes)	      1-3
Example 3:  Determining Tolerances and Anticipated Residues in Animal
Commodities	      1-4

APPENDIX 2:  MOVEMENT OF COMMODITIES IN COMMERCE

2.1      Introduction	      2-1
2.2      Production and Regional/Local Distribution Information	      2-1
2.2.1    Fresh and Processed Produce	      2-1
2.2.2    Fresh Produce Production and Distribution	      2-3
2.2.3    Processed Produce Production and Distribution	      2-7
2.3      Storage Information	      2-12
2.4      Marketing Channels	      2-15
2.4.1    Fresh Produce	      2-15
2.4.2    Processed Produce	      2-18

APPENDIX 3:  STATISTICAL DESIGN AND ANALYSIS OF SURVEYS FOR PESTICIDE
RESIDUES IN THE DIET

3.1      Introduction	     3-1
3.1.1    Purpose of Appendix	     3-1
3.1.2    General Approach to Survey Sampling	     3-2
3.1.2.1  Survey Populations	     3-2
3.1.2.2  Sampling Frames	     3-3
3.1.2.3  Stratification	     3-5
3.1.2.4  Cluster Sampling	     3-6
3.1.2.5  Sample Selection Procedures	     3-8
3.1.2.6  Types of Estimates	     3-11
3.1.2.7  Determining Sample Sizes	     3-13
3.1.2.8  Estimation	     3-16
3.2      Sampling Points in the Food Processing and Distribution Chain     3-18
3.2.1    Overview	     3-18
3.2.2    Farm Level Sampling	     3-20
3.2.3    Wholesale Food Establishments	     3-22
3.2.4    Retail Food Stores	     3-27
3.3      Selecting the Appropriate Sampling Point	     3-31
3.4      References	     3-33

APPENDIX 4:  EXISTING SOURCES OF PESTICIDE RESIDUE DATA

4.1      Introduction	     4-1
4.2      Crop Field Trials	     4-1
4.2.1    Introduction	     4-1
4.2.2    Guidelines for Design, Implementation and Reporting	     4-2
4.2.3    Field Studies as a Source of Data for Characterizing
         Anticipated Residues	     4-3
4.2.4    Statistical Issues in Crop Field Trial Design	     4-4
4.2.4.1  Plot Selection	     4-4
4.2.4.2  Variations in Design and Reporting	     4-5
4.2.4.3  Compositing	     4-6
4.2.5    Assessing Chronic Exposure	     4-8
4.2.5.1  Analysis Based on Residue Data from Plots Treated with the
         Maximum Recommended Application Rate and the Minimum
         Registered PHI (or Registered PHI Reflecting the Highest
         Pesticide Residue)	     4-8
4.2.5.2  Analysis Based on Residue Data from Multiple Application
         Rates and/or Multiple PHIs	     4-12
4.2.6    Assessing Acute Exposure	     4-16
4.3      FDA Pesticide Monitoring Program	     4-21
4.3.1    Introduction	     4-21
4.3.2    Regulatory Monitoring	     4-22

4.3.2.1  Overview	     4-22
4.3.2.2  Basis for Sampling	     4-26
4.3.2.3  Variables of Interest in FDA's Data Base	     4-27
4.3.2.4  Data Assessment	     4-29
4.3.2.5  Conclusions	     4-34
4.3.3    Total Diet Study (TDS)	     4-34
4.3.3.1  Description of Program	     4-34
4.3.3.2  Suitability of TDS Data for Estimating Residues in
         Table-Ready Foods	     4-36
4.4      References	     4-38
         Appendix 4A	     4A-1

APPENDIX 5:  DEGRADATION OF PESTICIDE RESIDUES DURING STORAGE

         Preface	     5-i
5.1      Introduction	     5-1
5.2      Use of Information from Degradation Studies	     5-2
5.3      Approaches for Designing and Estimating Degradation Models...     5-8
5.3.1    Deterministic Decay Models for Pesticide Residues	     5-8
5.3.2    Deterministic Models for Toxic Metabolites	     5-15
5.3.3    Statistical Degradation Models	     5-19
5.3.4    Estimation of the Degradation Model	     5-26
5.4      Design of Degradation Studies	     5-36
5.4.1    Choosing Batches for a Degradation Study	     5-37
5.4.2    Sampling Plans for Commodity Sampling	     5-43
5.4.3    Choosing Time Points	     5-48
5.5      References	     5-66
         Appendix 5A  Illustrative Statistical Analyses	     5A-1

-------
                         ABSTRACT
     The U.S. Environmental Protection Agency (EPA) has the
responsibility of balancing risks and benefits from the uses of
pesticides.  This requires a constant weighing of the benefits to the
consumer of enjoying a wide variety of fresh fruits and vegetables,
at a reasonable cost, with the risk to public health from consuming
pesticide residues.  EPA's Office of Pesticide Programs has developed
the concept of "anticipated residues" to estimate dietary exposure to
the consumer.  In determining anticipated residues, data from many
sources are examined at successive stages in the risk assessment
process until a conclusive statement may be made concerning the
potential risk of the pesticide.  First, tolerances are used to
estimate exposure and risk.  If tolerance level exposure estimates
indicate that the pesticide use exceeds certain threshold levels of
concern, then residue field trial data, percent crop treated data,
processing studies, degradation studies, monitoring studies, and
other types of data which would help provide a more realistic
estimate of exposure are used to determine anticipated residues.
Reliable data which are available are used prior to requiring
submission of additional data by the registrant.  The goal of
determining the best estimate of residues "at the plate" requires
weighing the usefulness of the available data sets.  Because various
types of data are available, and because these data may vary in
quality, considerable scientific judgement is required in the
assessment of dietary exposure.

-------
I. Purpose

     EPA has the responsibility of balancing the risks and benefits
from the use of pesticides.  One component of the risk/benefit
assessment is an analysis of the dietary risk from exposure to
pesticide residues in/on foods due to the use of pesticides in
agriculture.  Over the past several years, the EPA's Office of
Pesticide Programs has shifted its emphasis in dietary risk
assessment towards generating estimates that reflect actual pesticide
residue exposure to the U.S. population, and away from reliance on
"theoretical upper bound" exposure estimates.  To accomplish this,
the concept of anticipated residues has been developed.  Anticipated
residues are estimates of the residues in foods at the time of
consumption, and more realistically reflect consumption of pesticide
residues than do tolerance levels.
     Since anticipated residues are not necessarily upper bounds, a
smaller safety factor is built into these residue level estimates.
Therefore, it is important that the data used in estimating
anticipated residues be scientifically sound and reflective of
residues likely to be consumed.  The goal is to achieve the best
possible estimate of dietary exposure to the pesticide residue.
However, we realize that strict adherence to rigorous statistical
criteria such as those described in the Appendices to these
Guidelines may be extremely costly in time and resources.  The
Agency will exercise its judgment in balancing the need for such
statistical rigor with the costs of obtaining adequate data, and with
potential hazards from consumption of pesticide residues, when
assessing anticipated residues.
     The purposes of these Guidelines are (1) to discuss the
approaches currently used in dietary exposure assessment and
determination of anticipated residues, (2) to discuss the limitations
in these approaches and the direction the Agency is taking to
overcome these limitations, and (3) to provide guidance for
generating residue data which are adequate to determine anticipated
residues.  The anticipated residue guidelines will provide detailed
guidance regarding development of a realistic approach to estimating
"anticipated residues", taking into account the processes occurring
between pesticide application and food consumption which might
influence the pesticide residues consumed.
     These Guidelines are expected to be an evolving document.  Many
issues regarding anticipated residues and dietary exposure assessment
require further discussion and may result in a series of issue
papers.  These issues include, among others, the following:

     (1)   statistical design and evaluation of residue surveys at
          various levels in the chain of commerce (discussed in
          Appendix 3);
     (2)   statistical design and evaluation of residue
          degradation/reduction studies (discussed in Appendix 5);
     (3)   a more detailed discussion of criteria for use of FDA and
          other existing monitoring data for dietary exposure
          assessment (discussed in Appendix 4);
     (4)   criteria for use of existing field trial data, percent crop
          treated data, feeding studies, and processing studies in
          dietary exposure assessment (discussed in Appendix 4);
     (5)   use of data on pesticide usage and distribution;
     (6)   the strengths and weaknesses of the Dietary Risk Evaluation
          System (DRES), how DRES can be used to evaluate risks to
          more highly exposed population subgroups,  and variations in
          risk due to geographic variability of residue levels or
          food consumption;
     (7)   residue values (e.g. average vs. 95th percentile) to use in
          exposure assessment considering the toxic effect and the
          type and quality of the available residue data;
     (8)   methods to estimate upper bound food consumption for
          chronic risk assessment;
     (9)   methods for obtaining a consistent set of residue data
          across chemicals;
     (10)  appropriate expression and communication of risks: average
          risk vs. risk to highly exposed individuals.

-------
II.  Pesticide Registration and Tolerances

     A.   Pesticide Registration

          Pesticide products must be registered by the Environmental
Protection Agency before they may be sold or distributed in the
United States.  The authority of EPA to require pesticide
registration is described in the Federal Insecticide, Fungicide, and
Rodenticide Act, as Amended (FIFRA, 1988).  Data requirements for
pesticide registration are provided in 40 CFR Part 158, and
Guidelines have been developed for the data required.  These data
include toxicity, product chemistry, and residue chemistry data, as
well as other information (see 40 CFR 158.108 for a list of available
Guidelines and ordering information).  In addition to the required
data, 40 CFR 158.690(b) contains a conditional requirement for
"reduction of residue" data.  Reduction of residue data are required
when unreasonable risks are estimated assuming all foods contain
pesticide residues at the tolerance levels.  Reduction of residue
data include any residue data which would allow a more realistic
determination of pesticide residues as consumed (i.e. anticipated
residues) than would assumption of tolerance level residues.

     B.   Tolerances

          A tolerance is the maximum pesticide residue likely to
occur in an agricultural commodity as a result of registered
pesticide uses.  If residues exceed the tolerance, or if no tolerance
has been established, the commodity is considered to be adulterated
and is subject to seizure by FDA, USDA, or state regulatory
authority.  A tolerance is required before a pesticide may be
registered on a food or feed crop.  Tolerances are established by EPA
under the authority of the Federal Food, Drug, and Cosmetic Act
(FFDCA), and are used by the Food and Drug Administration (FDA) to
regulate the movement of agricultural commodities in interstate
commerce.  Section 408 of the FFDCA applies to raw agricultural
commodities (racs), and Section 409 to processed commodities.  The
residue data submitted under FIFRA and described in 40 CFR 158 are
used to determine tolerances.
      Tolerances are  normally  established as a result of a tolerance
 petition which contains  all of the data needed to establish the
 tolerance  (see Section V).  These data are usually generated by
 pesticide  registrants  (usually major chemical companies) who wish to
 market the pesticide product.  For minor uses, including small scale,
 infrequently  needed, or  specialty pesticide uses for which there is
 insufficient  economic incentive  for timely development of data by
 chemical companies, the  U.S.  Department of Agriculture  (USDA) submits
 petitions  to  EPA under the Interregional Project #4 (IR-4) program.
     Tolerances are required  for raw and processed agricultural
 commodities,  animal feeds, and animal products (meat, milk, poultry,
 eggs, and  fish) in which pesticide residues could be found as a
 result of  registered pesticide uses.  Tolerances are necessary for
 processed  commodities only if the residue concentrates in the
 processed  commodity  (i.e. the residue is greater in the processed
 commodity  than in the raw agricultural commodity) or if the pesticide
 is applied directly to a processed commodity such as can occur during
 the fumigation of a food storage warehouse; otherwise, the tolerance
 for the raw commodity also applies to the processed commodity.  In
 all cases, the tolerance represents the maximum residue likely to be
 found in these products  as a  result of registered pesticide uses
(discussed further in Section V).  However, the tolerance is not
 necessarily the maximum  safe  level since tolerances are set no higher
than necessary to accomplish  the intended result of representing the
maximum residue likely to result from registered uses.
     Many tolerances for older chemicals were established based on
 residue data  which are no longer considered adequate due to advances
 in toxicological and chemical technology as well as more recently
obtained toxicological and exposure information.   In some cases,
tolerances may not be adequately protective of human health and the
environment.  For these  reasons, hundreds of tolerances for older
chemicals are being reevaluated as part of the Agency's
reregistration process.  Any missing or inadequate data (data gaps)
are being required of the pesticide registrant ("called-in") in order
for the pesticide registrations to be continued.   In the
reregistration of pesticide products, pesticide active ingredients
have been divided into 4 lists as mandated by FIFRA 1988,  based on
the Agency's concern for these pesticides (Lists A, B, C and D).
List A includes those pesticides for which Registration Standards
were completed by December 24, 1988  (194 chemical cases, 350 active
ingredients).   Both lists A and B include important food-use
pesticides.  Lists C and D include pesticides of lower concern to the
Agency than those on lists A and B.  Further information regarding
the status of pesticide reregistration is available in Status of
Pesticides in Reregistration and Special Review (USEPA, OPPTS, 700-R-
92-004, March, 1992).
     The specific uses of different types of data in determining
tolerances are discussed in Section V.

III. Food Consumption

     A.   The Dietary Risk Evaluation System (DRES)

          The Dietary Risk Evaluation System (DRES) is a computerized
system which combines estimates of the level of pesticide residues on
crops (and percent crop treated data) with information about how much
of each crop a person eats.  It then compares the resulting exposure
estimate to a Reference Dose (RfD) or other toxicologically
significant reference point.  All of the information about
anticipated residues for each crop is entered into DRES.  An
explanation of how DRES is constructed may therefore lead to an
understanding of how the estimates of anticipated residues are used.

     DRES uses consumption patterns derived from a survey conducted
by the U.S. Department of Agriculture in 1977-78 which involved 3-day
dietary records for 30,770 individuals and 3734 food items.  DRES can
handle separate residue estimates for a number of different food
forms and food items for each commodity (24).  For example, the
different DRES food forms for apples include fresh apples and cooked
apples, and food items include apple juice.  Dietary exposure for
DRES is expressed in terms of quantity of pesticide consumed per unit
body weight per day (mg pesticide/kg body weight/day).  Dietary
exposure to a pesticide in a specific food or food form is calculated
by multiplying the average amount of the food consumed daily by an
estimate of the amount of pesticide in that food or food form.  The
total dietary exposure for the pesticide is the sum of these products
over all foods or food forms for which there are tolerances for the
pesticide in question.  As a first approximation of dietary exposure,
tolerance level residues are entered into DRES.  DRES uses
"anticipated residues" in place of tolerance level residues to
generate a more realistic dietary risk assessment.
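
     The exposure calculation described above can be sketched in a few
lines of Python.  This is an illustration only and not the DRES system
itself: the food forms, consumption values, residue levels, and RfD
below are hypothetical, and consumption is assumed to be already
expressed per kilogram of body weight.

```python
from dataclasses import dataclass


@dataclass
class FoodForm:
    name: str
    consumption_g_per_kg_bw_day: float  # average daily consumption (hypothetical)
    residue_mg_per_kg: float            # residue level in the food, i.e. ppm


def dietary_exposure(food_forms):
    """Sum of (consumption x residue) over food forms, in mg/kg body weight/day."""
    total = 0.0
    for food in food_forms:
        # Convert g food to kg food, then multiply by mg pesticide per kg food.
        kg_food_per_kg_bw_day = food.consumption_g_per_kg_bw_day / 1000.0
        total += kg_food_per_kg_bw_day * food.residue_mg_per_kg
    return total


def percent_of_rfd(exposure, rfd):
    """Express an exposure estimate as a percentage of the Reference Dose."""
    return 100.0 * exposure / rfd


RFD = 0.01  # mg/kg body weight/day (hypothetical value)

# First approximation: every food form carries the tolerance-level residue.
tolerance_level = [
    FoodForm("apples, fresh", 1.0, 2.0),
    FoodForm("apple juice", 0.5, 2.0),
]
# Refinement: anticipated residues replace the tolerance-level residues.
anticipated = [
    FoodForm("apples, fresh", 1.0, 0.4),
    FoodForm("apple juice", 0.5, 0.1),
]

tolerance_exposure = dietary_exposure(tolerance_level)
anticipated_exposure = dietary_exposure(anticipated)

print("tolerance-level % of RfD:", percent_of_rfd(tolerance_exposure, RFD))
print("anticipated % of RfD:", percent_of_rfd(anticipated_exposure, RFD))
```

     With these made-up inputs the anticipated residue estimate is well
below the tolerance-level estimate, which is the pattern the text
describes when more realistic residue values are substituted.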
     DRES can estimate dietary exposure for the U.S. population and
22 subgroups of the population.  The 22 subgroups include groupings
by season (Spring, Summer, Fall, Winter), geographical region
(Northeast, North Central, Southern, and Western), ethnicity
(Hispanic, non-Hispanic whites, non-Hispanic blacks, and non-Hispanic
others), and age/sex (10 subgroups).  DRES cannot estimate exposures
for combinations across groupings such as Western region/Hispanics.
However, DRES can account for varying residue levels in a commodity
for subgroups within a given grouping such as by region or season.
To accomplish this, DRES analyses must be repeated for each residue
level/subgroup combination, and the calculated exposure summed for
all subgroups (e.g. summed for all regional subgroups, or all
seasonal subgroups, etc.).  For example, if a higher residue level in
apples were found in the Southern region than in the rest of the
U.S., a separate DRES analysis could be performed for the Southern
region using this higher residue value.  The calculated exposure for
the Southern region could then be added to the calculated exposure
for the other 3 regional subgroups (calculated using the lower
residue level) to obtain the total U.S. population exposure.  As
previously stated, however, these types of analyses could not be
performed for varying residue levels across groupings (e.g.
non-Hispanic whites/Northeast region).
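
     The regional example can be sketched as follows.  The text says
only that the regional exposures are "added"; the sketch assumes this
means combining per-capita regional exposures weighted by each
region's share of the population, which is an interpretation and not
a documented DRES procedure.  All consumption values, residue levels,
and population shares are hypothetical.

```python
REGIONS = ("Northeast", "North Central", "Southern", "Western")


def regional_exposure(consumption_g_per_kg_bw_day, residue_mg_per_kg):
    """Per-capita exposure for one region, in mg/kg body weight/day."""
    return consumption_g_per_kg_bw_day / 1000.0 * residue_mg_per_kg


def national_exposure(exposure_by_region, population_share):
    """Population-weighted combination of regional exposures (assumed method)."""
    if abs(sum(population_share.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(exposure_by_region[r] * population_share[r]
               for r in exposure_by_region)


# Hypothetical inputs: same apple consumption everywhere, higher residue
# in the Southern region than in the rest of the U.S.
apple_consumption = {region: 1.0 for region in REGIONS}
apple_residue = {"Northeast": 0.4, "North Central": 0.4,
                 "Southern": 1.2, "Western": 0.4}
population_share = {"Northeast": 0.20, "North Central": 0.25,
                    "Southern": 0.35, "Western": 0.20}

# One analysis per region, each using that region's residue level.
exposure_by_region = {
    region: regional_exposure(apple_consumption[region], apple_residue[region])
    for region in REGIONS
}
us_exposure = national_exposure(exposure_by_region, population_share)
print("U.S. population exposure (mg/kg bw/day):", us_exposure)
```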
     The precision and accuracy of the exposure calculations by DRES
for certain scenarios is limited in part by the 3-day time period
over which the consumption data were generated.  For example, the
number of people who consumed certain minor commodities such as kiwi
fruit or macadamia nuts during the 3-day survey period was small, and
therefore, the variance of the consumption estimates for these
commodities is large.  If these commodities were to have
significantly higher residues than other commodities, the dietary
exposure estimate could be significantly affected by the imprecise
consumption data for the minor commodities.  An analogous situation
could occur for some major commodities which are consumed by
relatively few people within certain population subgroups (e.g.
grapefruit consumption by infants).  In these cases again, the low
incidence (e.g. infants who consumed grapefruit over the 3-day survey
period) may lead to relatively high uncertainty about the exposure to
this subgroup.  Uncertainties related to the short three-day period
over which consumption data were generated are of particular concern
in assessing acute toxicity.
     The accuracy of extrapolating from a 3-day survey to the longer
periods of time that would be needed to cause chronic effects (weeks,
months, or - in the case of carcinogenicity - a 70-year lifetime) is
questionable.  However, because data on long-term food consumption
patterns are not available, DRES assumes that the average consumption
values for chronic consumption by the general U.S. population and
each of the 22 subgroups are equal to the average consumption values
for the 3-day period over which the survey was taken.  It does not
assume, however, that the entire distribution of consumption values
for the 3-day period reflects the distribution of consumption values
over a longer period of time.  The assumption is that while some
people may eat far more or less than average amounts of a crop for a
short time, few if any will continue to consume such extreme amounts
over a long time period.
     Another limitation of the survey on which DRES is based is the
lack of resolution within broad geographical areas.  This leads to
the inability to calculate pesticide exposures for local populations.
DRES can divide the U.S. into four major regions, but further
geographical breakdown has not been attempted.  This becomes a
particular problem when a high percentage of a locally produced
commodity is consumed locally, as in the case of freshwater fish
consumption around the Great Lakes.  We note that there are not only
potential geographical consumption "hot spots" but also potential
geographical residue level "hot spots", both of which can result in
higher (or lower) local/regional exposure relative to the national
average.
      For  certain  commodities such as apples, squash, and particularly
 fish, the residue levels may vary considerably among different
varieties or species.  However, DRES does not have a breakdown by
 different varieties or  species.  Therefore, exposures resulting from
 the different  residue  levels cannot be directly calculated.  To
 estimate  these exposures, an average residue level may be used
 (possibly weighted by the relative quantities of different varieties
 produced), or  a conservative assumption may be made based on the
 variety or species containing the highest residue level.
      EPA  also  uses information on the percent of a crop that is
treated with a particular pesticide in carrying out a DRES analysis.
 The assumption that the percent of crop treated with a pesticide
 accurately reflects the percent of crop eaten that is contaminated
with  the pesticide leads to an overestimation of the risk for those
people who eat a higher percentage of untreated commodity, and to an
underestimation of risk for those people who eat a higher percentage
of treated commodity.
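
     The effect of the percent crop treated assumption can be
illustrated with a small calculation.  The sketch below is not part of
DRES; the residue level, the percent of crop treated, and the two
example consumers are hypothetical.

```python
def effective_residue(residue_on_treated_mg_per_kg, fraction_treated_in_diet):
    """Average residue in the commodity as eaten, assuming untreated
    commodity carries no residue of the pesticide."""
    if not 0.0 <= fraction_treated_in_diet <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return residue_on_treated_mg_per_kg * fraction_treated_in_diet


residue_on_treated = 0.8   # mg/kg on treated commodity (hypothetical)
percent_crop_treated = 25  # percent of the national crop treated (hypothetical)

# Value used in the assessment: the diet is assumed to mirror the national crop.
assumed = effective_residue(residue_on_treated, percent_crop_treated / 100.0)

# Two hypothetical consumers whose diets differ from the national mix.
mostly_untreated = effective_residue(residue_on_treated, 0.05)
mostly_treated = effective_residue(residue_on_treated, 0.90)

print("assumed in assessment:", assumed)
print("consumer eating mostly untreated commodity:", mostly_untreated)
print("consumer eating mostly treated commodity:", mostly_treated)
```

     The assessment value overstates the residue for the first consumer
and understates it for the second, as described in the text.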
     In spite of the limitations discussed, DRES is the standard
assessment system to which refinements for individual analyses are
applied.  The large amount of consumption information available in
DRES, and its ability to incorporate and manipulate residue data,
usage data, and toxicological reference values into its dietary
exposure assessments, make DRES a flexible and sophisticated dietary
risk assessment tool.  A more detailed description of the strengths
and weaknesses of DRES is presented in references 15 through 20.

-------
     B.   Other Food Consumption Estimates

          Other means also have been used in the past to estimate
consumption.  Prior to the development of the Dietary Risk Evaluation
System, the Food Factor System was used.  The Food Factor method of
exposure analysis utilized two types of data to determine food
consumption nationally.  First, food consumption was estimated from
the retail weights calculated from agricultural production figures
(from USDA) adjusted for loss during distribution.  Secondly,
household surveys (USDA 1955, 1965/66, 1976/77) were conducted
(personal interviews with household members) to determine food
consumption measured at the level at which food enters the kitchen.
In the household surveys, food consumption was expressed as lbs.
commodity/week/household, and a food factor was determined by
dividing the consumption value for a particular commodity by the
total food consumption (97.85 lbs./week/household, 3.27 meal
equivalents per person per household per day).  Total residue intake
was determined by multiplying the food factor by the residue level
for each commodity,  and then summing the resulting residue levels for
the individual commodities.  This system is no longer being used
because it does not account for processed forms of foods or
differences in consumption by population subgroups (regional, age or
ethnic subgroups), and for other reasons. (21, 22)
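
     The Food Factor arithmetic described above can be sketched as
follows.  Only the total of 97.85 lbs./week/household comes from the
text; the commodity consumption values and residue levels are
hypothetical.  Because the food factor is a dimensionless fraction of
the total diet, the sum is a residue concentration in the total diet
(ppm); converting it to a mass intake would need a total daily food
mass, which is not given here.

```python
TOTAL_FOOD_LBS_PER_WEEK_PER_HOUSEHOLD = 97.85  # from the household surveys


def food_factor(commodity_lbs_per_week_per_household):
    """Fraction of the total diet contributed by one commodity."""
    return (commodity_lbs_per_week_per_household
            / TOTAL_FOOD_LBS_PER_WEEK_PER_HOUSEHOLD)


def total_diet_residue(commodities):
    """Sum of (food factor x residue level) over all commodities.

    `commodities` maps a name to (lbs/week/household, residue in ppm).
    """
    return sum(food_factor(lbs) * residue_ppm
               for lbs, residue_ppm in commodities.values())


# Hypothetical commodities: (consumption, residue level).
commodities = {
    "apples": (2.5, 0.4),
    "potatoes": (5.0, 0.1),
}

diet_residue_ppm = total_diet_residue(commodities)
print("apple food factor:", food_factor(2.5))
print("residue in total diet (ppm):", diet_residue_ppm)
```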

IV.  Types of Risk

     For the purpose of determining the residue value to be used in a
risk assessment, risk is broadly categorized into carcinogenic risk,
non-carcinogenic chronic risk, and acute risk.
     The Agency's current models of carcinogenesis relate the
frequency of carcinogenesis to the amount of pesticide exposure over
a long time period.   At any one meal, lower or higher levels of
pesticide residue may be consumed, but over a period of time, residue
consumption will likely approach an average residue level.
Therefore, the anticipated residue values used for quantitative
 carcinogenic  risk  assessment  are estimates of average residue levels
 in  foods  at the time  of  consumption.  Although some regional
 variability in  the average  residue level is likely due to variations
 in  environmental conditions and agricultural practice as well as
 distribution  of commodities in commerce, this has not been considered
 by  the Agency in past risk  assessments because of the lack of
 adequate  regional  residue data, use  information, and food
 distribution  information, all of which would be required for this
 type of an  analysis.  Regional variations in residue levels may be
 increasingly  incorporated into risk  assessment as better regional
 residue,  use, food distribution, and consumption data are available.
     In determining exposure  for non-carcinogenic chronic effects,
 the Agency  currently  uses either the average residue from field trial
 data reflecting the worst case use pattern, or the 95th percentile
 residue level from monitoring data.  Since the field trial data used
 are for applications  at  the maximum application rate, maximum number
of applications, and minimum pre-harvest interval (PHI), the average residue levels found in
 these commodities  will likely exaggerate the average residue level
 actually  present in foods at  the time of consumption.  In practice,
 pesticides  are  commonly  applied at application rates less than the
 maximum rate, less than  the maximum number of applications are made,
 and crops are harvested  at  PHIs which are longer than those
 registered, all  leading  to  lower residues.  Additionally, pesticides
 may degrade between the  time of harvest and consumption.  In
 monitoring  studies, the  more conservative upper 95th percentile
 residue levels  are usually  used since the treatment history and
 representativeness of the samples are usually not known (e.g.
 uncertainties in percent of crop treated, application rates and PHIs
used or storage times).  The Agency currently is reevaluating the use
 of these residue estimates  for non-carcinogenic chronic effects
 considering the nature of the particular toxic effect (e.g. exposure
 time and dose known to cause the effect), and the residue data which
are currently available or may be required of the registrant.  As in
the case of carcinogenicity, regional exposure analyses may be
performed if adequate regional data are available.
     The anticipated residue values currently used for acute exposure
are the tolerance levels or the best estimates of the maximum
residues in foods at the time of consumption since the acute effect
could be induced by short-term exposure to the pesticide (over a few
days or possibly at a single meal).   However, in most cases, residue
field trials and monitoring studies are not designed to estimate the
maximum likely residue with sufficient accuracy to allow
determination of the fraction of the U.S. population in which the
toxic effect will likely be seen because samples are usually
composited and an insufficient number of samples are usually
collected.  This inaccuracy is compounded when multiple foods
containing high residues may be consumed.  Guidelines are currently
being written which address issues pertinent to exposure and risk
assessment for acutely toxic pesticides.
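
     The choice of residue value by risk type described in this
section can be summarized in a short sketch.  The function name, the
category labels, and the sample data are hypothetical.  Using the
sample maximum for acute risk is a simplification: the text calls for
the tolerance level or the best estimate of the maximum residue, and
notes that typical data sets estimate that maximum poorly.

```python
import statistics


def residue_for_assessment(residues_mg_per_kg, risk_type, data_source):
    """Pick the residue statistic for a risk type, following Section IV."""
    if risk_type == "carcinogenic":
        # Long-term exposure: average residue at the time of consumption.
        return statistics.mean(residues_mg_per_kg)
    if risk_type == "chronic":
        if data_source == "field_trial":
            # Average of field trials reflecting the worst case use pattern.
            return statistics.mean(residues_mg_per_kg)
        if data_source == "monitoring":
            # 95th percentile: last of the 19 cut points that split the
            # data into 20 equal-probability groups.
            return statistics.quantiles(
                residues_mg_per_kg, n=20, method="inclusive")[-1]
        raise ValueError("unknown data source: " + data_source)
    if risk_type == "acute":
        # Stand-in for the best estimate of the maximum residue.
        return max(residues_mg_per_kg)
    raise ValueError("unknown risk type: " + risk_type)


# Hypothetical residue measurements (mg/kg).
sample = [0.05, 0.10, 0.10, 0.20, 0.30, 0.40, 0.60, 0.90, 1.20, 2.00]

carcinogenic_value = residue_for_assessment(sample, "carcinogenic", "monitoring")
chronic_value = residue_for_assessment(sample, "chronic", "monitoring")
acute_value = residue_for_assessment(sample, "acute", "monitoring")
print(carcinogenic_value, chronic_value, acute_value)
```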

V.   Types of Data

     Sections V.A through V.G below discuss the major types of residue
data and how these data are used to establish
tolerances.  Use of these data in determining anticipated residues
and dietary exposure is discussed in Section VI.

     A.   Metabolism Studies in Plants and Animals

Plant and animal metabolism studies are designed to characterize the
chemical composition of the pesticide residue in plants and animals.
In plant metabolism studies, the plant is treated with the pesticide,
usually radiolabelled with 14C, in a manner similar to the proposed
use.  For example, if corn were to be treated with a pesticide using
foliar spray applications, foliar applications of the radiolabelled
pesticide would be made to corn in the metabolism study.  Following
pesticide treatment, the plant is managed as closely as possible to
 the  way  the plant would  be managed  in the field and samples of
 important plant commodities  are obtained  (e.g. corn grain, forage,
 silage and fodder).  The samples usually are collected at times which
 correspond to normal harvest times.  The samples then are analyzed to
 determine the chemical structures and quantities of metabolites
 present  in the total residue.
     Two types of animal metabolism studies normally are conducted.
 If an animal is to receive dermal pesticide treatments (sprays, dips,
 etc.), the radiolabelled pesticide must be applied to the animal
 dermally.  If the animal will consume the pesticide or pesticide
 residue  orally, oral administration is required.  Following pesticide
 treatment for a sufficient length of time, animal tissue, milk, and
 egg samples are obtained and analyzed to determine the chemical
 structures and quantities of metabolites present in the total
 pesticide residue.
     The tolerance expression which is published in 40 CFR Part 180
 for each pesticide describes which chemical components of the
 pesticide must be regulated.  Metabolites are included in the
 tolerance expression depending on their toxicological significance,
 their percentage of the  total residue, and whether analytical
 methodology can be developed to measure residues of the metabolite in
 agricultural commodities.  Methodology is essential for metabolites
 which are both toxicologically significant and present at significant
 levels.  The active ingredient and significant metabolites are called
 the total toxic residue.  If one component of the residue is
 significantly more toxic than the other components, two levels may be
 necessary in the tolerance expression.
     More detailed information regarding how to conduct and evaluate
 metabolism studies can be found in the Pesticide Assessment
 Guidelines, Subdivision O, Residue Chemistry (including Addendum 7 on
 Data Reporting, NTIS No. PB89-124598), and Hazard Evaluation Division
 Standard Evaluation Procedures Qualitative Nature of the Residue:
 Plant Metabolism (EPA-540/09-88-102) and Metabolism in Food Animals:
 Qualitative Nature of the Residue (EPA-540/09-89-061, NTIS
 PB90-103292).  Additionally, OPP is currently preparing a more detailed
document describing the conduct of plant and animal metabolism
studies which is scheduled to be distributed in the form of a PR
Notice in the summer of 1992.

     B.   Analytical Methodology

     Chemical components of the pesticide residue which must be
included in the total toxic residue are determined in metabolism
studies.  Once the total toxic residue has been determined,
analytical methods must be developed to allow determination of
residues of these components in agricultural commodities (raw,
processed or animal) for which tolerances are required.  These
analytical  methods are necessary to provide residue data in residue
field trials and as a means of enforcement of the tolerances.
Detailed information regarding analytical methods may be found in the
Pesticide Assessment Guidelines, Subdivision O, Residue Chemistry
(including Addendum 2 on Data Reporting, NTIS No. PB86-248192) and
Health Effects Division Standard Evaluation Procedure Analytical
Method(s) (EPA-540/09-89-062, NTIS PB90-103284).

     C.   Residue Field Trials

     After the metabolism studies have indicated what to look for and
analytical methods have been developed to measure the total toxic
residue, the actual residue field trials are carried out.  These are
studies in which the pesticide is applied to crops in a manner
similar to the directions for use which will eventually appear on the
label; then, samples are obtained and analyzed for total residues.
The purpose of residue field trial studies is to determine the
appropriate tolerance level which is the maximum legally allowable
pesticide residue and is used to regulate the commodity as it travels
in interstate commerce.  Data normally are required for each crop (or
for representative commodities in a crop group as defined in 40 CFR
180.34(f)(9)) for which a tolerance and registration are requested.
Data also are required for each raw agricultural commodity (rac)
 derived  from  the  plant  (for  example, corn residue data would be
 required for the grain and the forage, silage, and fodder).  Samples
 generally are placed  in frozen  storage immediately after collection
 to minimize loss  or dissipation of the pesticide residue prior to
 analysis.  The field trial data must reflect the use conditions that
 could  lead to the highest residues and must represent the  highest
 application rate, the maximum number of applications, and  the
 shortest time intervals between applications and between the last
 application and harvest to be included on the label.  The  residue
 data also must be representative of major growing areas and seasons,
 major types or varieties of  the rac, the general types of  pesticide
 formulations  for  which  registration is requested, and the  types of
 applications  to be made (e.g. ground applications, aerial
 applications, ultra-low volume aerial applications).  Further
 information regarding field trial data is available in the Pesticide
 Assessment Guidelines, Subdivision O, Residue Chemistry (including
 Addendum #2 on Data Reporting, EPA-540/09-86-151, NTIS No.
 PB86-248192) and Hazard Evaluation Division Standard Evaluation Procedure
 Magnitude of  the  Residue: Crop  Field Trials (EPA-540/09-85-021).
 Additional information  regarding the use of field trial data in
 determining tolerances  and anticipated residues is included in
 Appendices 1  and  4.

     D.   Processing Studies

     Processing studies are  designed to determine the concentration
 (or reduction) of residues when the raw agricultural commodity is
 processed commercially.  Typically, a raw agricultural commodity
 containing weathered residues,  frequently resulting from field
 applications  at exaggerated  (higher than maximum label) pesticide
 application rates to assure  obtaining detectable residues, is
processed using a method which  closely simulates commercial
 processing.    Important  processed fractions are obtained at various
points in the process and analyzed for the total toxic residue.  The
ratio of the  residue in the  processed commodity to the residue in the
raw commodity is the concentration  (or reduction) factor.  If the
ratio is greater than 1, the residue is said to concentrate upon
processing.  If the ratio is equal to or less than 1, there is no
concentration of residues.  These ratios, if greater than 1, are then
multiplied by the tolerance for the raw agricultural commodity to
obtain the tolerances for the processed commodities.  Tolerances are
not required for processed commodities unless the residue
concentrates (concentration factor > 1) upon processing.  Additional
information regarding processing studies can be found in the
Pesticide Assessment Guidelines, Subdivision O, Residue Chemistry
(including Addendum #4 on Data Reporting, EPA-540/09-88-004, NTIS No.
PB88-117270) and Hazard Evaluation Division Standard Evaluation
Procedure Magnitude of the Residue: Processed Food/Feed Studies
(EPA-540/09-86-145).
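
     The concentration factor arithmetic described above can be
sketched as follows.  This is a minimal illustration only; the
function names and the residue and tolerance values are hypothetical
and are not taken from any actual processing study.

```python
def concentration_factor(processed_ppm, raw_ppm):
    """Ratio of the residue in the processed commodity to the residue
    in the raw agricultural commodity (both in ppm)."""
    if raw_ppm <= 0:
        raise ValueError("raw commodity residue must be above zero")
    return processed_ppm / raw_ppm


def processed_tolerance(rac_tolerance_ppm, factor):
    """Tolerance for the processed commodity, or None when the residue
    does not concentrate (factor <= 1) and no separate tolerance is
    required."""
    if factor > 1:
        return rac_tolerance_ppm * factor
    return None


# Hypothetical study: 2.0 ppm in the raw commodity, 6.0 ppm in the
# processed fraction, and a 5.0 ppm tolerance on the raw commodity.
factor = concentration_factor(processed_ppm=6.0, raw_ppm=2.0)
print(factor)                            # 3.0
print(processed_tolerance(5.0, factor))  # 15.0
```

A factor of 1 or less returns None here, reflecting the statement
that tolerances are not required for processed commodities unless
the residue concentrates.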

     E.   Feeding Studies

     In animal feeding studies, pesticide residue levels are
determined which are likely to be found in meat, milk, poultry, and
eggs as a result of ingestion of treated feeds by animals.  The
maximum residue levels in animal commodities likely to result from
ingestion of animal feeds treated at the maximum application rates
(and shortest PHIs) are used to determine tolerances for animal
commodities (except in cases where dermal applications are also made
to the animal in which cases residues from dermal applications also
would have to be incorporated into the tolerance level).
     In general, animals are dosed with the pesticide for a period of
time, and the resultant residues in eggs, milk, and animal tissues
are measured (the total toxic residue as determined in the animal
metabolism studies).  If metabolism studies show that there are plant
metabolites which are not also animal metabolites, the animal must be
dosed with the metabolites which are plant metabolites and not animal
metabolites, as well as with the parent compound.
     The livestock dietary burden (residue intake from treated feeds)
is determined by multiplying the tolerance level for livestock feed
 items by the maximum fraction of each feed item in the livestock diet
 (found in Table 2 of the Residue Chemistry Guidelines, Subdivision O,
 Pesticide Assessment Guidelines, Oct., 1982).  Then the residue
 contributions from each commodity are summed to obtain the total
 dietary burden of the animal.  The feeding levels to be used in the
 livestock feeding studies are based on the estimated dietary burden
 of the pesticide in the livestock feed.  The levels used should be
 approximately 1x, 3x, and 10x of the estimated dietary burden, where
 1x is the worst case estimate of potential livestock dietary exposure
 based on the assumption that all components of the feed contain
 tolerance level residues.  The exaggerated feeding levels are
 particularly important if non-detectable residues are reported at the
 1x feeding level; they help show whether residues in tissues vary
 linearly with the feeding level.  Additionally, exaggerated feeding
 levels will allow for future tolerance requests (the animal dietary
 residue burden must be less than the maximum feeding level used in
 the feeding studies or additional feeding studies may be required).
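
     The dietary burden calculation and the resulting feeding levels
can be sketched as follows.  This is an illustration under stated
assumptions: the feed items, tolerances, and diet fractions are
hypothetical placeholders and are not the values in Table 2 of the
Guidelines.

```python
def dietary_burden(feed_items):
    """Sum of (tolerance in ppm) x (maximum fraction of the diet)
    over all feed items."""
    return sum(tolerance * fraction for tolerance, fraction in feed_items)


def feeding_levels(burden_ppm, multiples=(1, 3, 10)):
    """Approximate 1x, 3x, and 10x feeding levels for the study."""
    return [m * burden_ppm for m in multiples]


# Hypothetical feed items: (tolerance in ppm, maximum fraction of diet)
feeds = {
    "corn grain": (2.0, 0.5),
    "corn forage": (10.0, 0.25),
}
burden = dietary_burden(feeds.values())
print(burden)                  # 3.5
print(feeding_levels(burden))  # [3.5, 10.5, 35.0]
```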
     The dietary burden is compared to the levels fed in the
 livestock feeding study, and the residue in each tissue, in milk, and
 in eggs is determined from a graph or linear regression analysis.
 Sometimes a simple ratio is used if the estimated dietary burden is
 close to one of the levels in the livestock feeding study or is
 significantly lower than the lowest level in the feeding study.  The
 residue estimated in this manner for meat and poultry tissues, milk,
and eggs is rounded upward and becomes the tolerance level.  However,
the tolerance level is not set at a level lower than the limit of
quantification of the analytical method.
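
     One way the regression step could be carried out is sketched
below.  It assumes an ordinary least-squares line through the feeding
study results; the function names and data are hypothetical, and the
upward rounding to a tolerance level is left to the reviewer.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residue_at_burden(feeding_ppm, tissue_ppm, burden_ppm, loq_ppm):
    """Tissue residue predicted at the dietary burden, never reported
    below the limit of quantification of the analytical method."""
    slope, intercept = fit_line(feeding_ppm, tissue_ppm)
    return max(slope * burden_ppm + intercept, loq_ppm)


# Hypothetical feeding study at 1x, 3x, and 10x of a 3.5 ppm burden
feeding = [3.5, 10.5, 35.0]   # ppm in the diet
tissue = [0.07, 0.21, 0.70]   # ppm found in the tissue
print(residue_at_burden(feeding, tissue, burden_ppm=3.5, loq_ppm=0.05))
```

With these made-up data the fitted line passes through the origin
with a slope of about 0.02, so the estimate at the 3.5 ppm burden is
about 0.07 ppm before rounding.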
     At this time, no information is available regarding the overall
 relative amounts of various agricultural commodities used as animal
 feeds nationally.  For example, a greater amount of corn forage
probably is used than is peanut hay for animal feeds.  The Dietary
 Exposure Branch is currently investigating the relative amounts of
use of various animal feeds.  When this project is completed, we may
be able to take this into account so that more realistic estimates of
animal dietary burdens can be made.  Further information regarding
animal feeding studies is available in the Pesticide Assessment
Guidelines, Subdivision O, Residue Chemistry (including Addendum #8
on Data Reporting) and Dietary Exposure Branch Standard Evaluation
Procedure Residues in Meat, Milk, Poultry and Eggs: Feeding
Studies/Feed-throughs (EPA-540/09-90-087, NTIS No. 90-208943).

     F.   Monitoring Data
          In a pesticide residue monitoring study, samples of
agricultural commodities are obtained at various times and from
various locations and analyzed for pesticide residues.  The specific
commodities sampled, sampling locations and times, numbers of
samples, sample sizes, and many other sampling parameters depend on
the purpose of the study.  Purposes for which pesticide residues are
commonly monitored in foods include enforcement of tolerances and
effluent discharges, trend analyses, assessment of environmental
contamination and dietary exposure assessment.  Although our focus
here is on dietary exposure assessment, monitoring data obtained
specifically for this purpose are not always available for many
commodities and pesticides.  Therefore, monitoring data designed for
other purposes commonly are used taking into account the
uncertainties or bias introduced because of the different purposes
for which the data were generated.  Further discussion of the use of
monitoring data in dietary exposure assessment is included in Section
VI and in Appendix 4.  Below we discuss some of the major existing
monitoring programs and the factors which determine their usefulness
in dietary exposure assessment.
     The most widely available monitoring data are those from the FDA
and USDA.  The Food and Drug Administration (FDA), as part of their
enforcement program for pesticides, collects four types of monitoring
data:  domestic surveillance, domestic compliance, import
surveillance, and import compliance.  Compliance data generally are
the result of targeting collection towards commodities suspected of
containing illegal residues, while surveillance samples are collected
without any suspicion that a particular shipment contains illegal
 residues.   They are,  however,  selected partly on the basis of volume
 of  production of a  commodity  and  partly on the basis of prior residue
 problems with a certain  food  commodity and growing region.   In their
 surveillance monitoring program, FDA monitors a wide variety of
 agricultural  and processed  commodities for numerous contaminants,
 including pesticides,  using primarily multiresidue methods of
 analysis which are  capable  of  determining a variety of contaminants
 from a single sample  analysis.  In  its surveillance monitoring
 program, FDA  also conducts  incidence/level monitoring to acquire
 information on specific  pesticides, commodities, or
 pesticide/commodity/country combinations.  Among recent
 incidence/level  monitoring  conducted by FDA are monitoring for
 residues of aldicarb  (potatoes),  captan (cherries), benomyl  (apples,
 grapes, peaches), captafol  (apples, cherries, rice), an aquaculture
 survey, a milk survey, and  a processed food survey.  Although routine
 monitoring and incidence/level monitoring provide data for many
 pesticides, other pesticides are  not monitored by FDA or only limited
 data are available.   Domestic  samples are collected as closely as
 possible to the  point  of production in the food distribution chain
 since the prime  objective is to monitor fresh food being shipped in
 interstate commerce for  compliance with EPA tolerances.  Therefore,
 additional degradation which could occur between the collection point
 and the "dinner  plate" is possible.  Information which would allow
 determination  of  the  location at  which a sample was grown is not
 readily available.  Import  samples are collected at the point of
 entry into U.S.  commerce (12,  13, 14).
     A major objective of the FDA monitoring program is to prevent
 foods that contain illegal  residues from entering interstate
 commerce.  Although the overall program is not designed to provide
 truly representative  sampling of  commodities for the purpose of
dietary exposure  assessment, FDA's FY'92 program includes a  trial
effort to provide statistically based monitoring data in pears and
tomatoes.  Bias may enter if the  compound of concern was targeted for
FDA monitoring and higher than typical levels were seen.  If the
compound being assessed were not  given priority in sample collection,
and monitoring were directed towards other competing compounds, the
FDA surveillance data  for  the first compound may  show infrequent
"detects" and artificially low average  residue  levels.
     FDA also conducts the Total Diet Study, also called the Market
Basket Survey, in which pesticide residues  are  determined in food
prepared for consumption.   The Total  Diet Study is designed to
estimate dietary intake of selected pesticides  by various U.S. age-
sex groups.  Foods are  collected  four times per year  in retail
markets at  12 locations throughout the  U.S. and are prepared as
table-ready  (cooked) before analysis.   Each market basket consists of
234 foods that represent at least 90% of the items in the American
diet (14).  These data  are useful to  the FDA for  making trend
analyses; however, since so few samples of  each commodity are
obtained, these data have  limited use for risk  assessment.
     FDA monitoring includes  few  samples of meat  and  poultry (these
commodities commonly are included only  in the Total Diet  Study).
Monitoring data may be  available  for  animal commodities from USDA for
chemicals included in their routine monitoring  programs.   Pesticide
monitoring data from USDA's Food  Safety and Inspection Service (FSIS)
primarily include analyses for chlorinated  pesticides in  animal  fat,
and other selected pesticides in  liver samples.   USDA's Agricultural
Marketing Service (AMS)  monitors  shelled eggs and egg products for
pesticides, while FDA monitors for pesticide residues in  in-shell
eggs.
     The USDA, in cooperation with the EPA  and  FDA, has recently
(1991) initiated the Pesticide Data Program (PDP).  This new program
is designed to provide actual residue monitoring and usage data to
help form the basis for conducting more realistic risk assessments.
Briefly, EPA provides USDA with pesticide/commodity combinations for
which the EPA desires data; and USDA generates these data (including
residue monitoring and usage data).  These data are then provided to
EPA.  The PDP monitors residues in fresh fruits and vegetables.  An
important aspect of this program  is that it is  designed to meet  the
data quality and random sampling  criteria required for monitoring
data used for risk assessment.  Since this  is a new and rapidly
 evolving  program,  additional  experience is required to determine the
 means and extent to which these data will be used in anticipated
 residue determination.
     Monitoring data also  may be generated by other sources including
 states, registrants, and other interested parties such as food
 processors and consumer or environmental groups.  Monitoring data
 generated by the states  (CA,  FL, NJ and others) are available; some
 of these  data are  incorporated into a data base acquired through an
 FDA contract.  FDA is  working currently with several states to
 coordinate data collection and compile the data to increase their
 availability and usefulness  (FOODCONTAM project).
     EPA  has the authority to require pesticide registrants to
generate  market level  surveys of pesticide residues and recently has
 exercised this authority in issuing "Data Call-Ins" requiring
 statistically based national  surveys for residues of specific
 pesticides.  Appendix  3 provides guidance on the design and
 evaluation of pesticide residue surveys.  A discussion of the use of
 existing  monitoring data in dietary exposure assessment is presented
 in Appendix 4.

     G.   Residue  Degradation-Reduction Studies

     Residue degradation/reduction describes any change in the amount
 and composition of the total  toxic residue from harvest to the point
of consumption.  Therefore, many types of processes are grouped under
degradation-reduction studies including storage, commercial
processing, washing, peeling, trimming, cooking, and others.  In the
case of post-harvest pesticide applications, degradation/reduction
describes the changes from pesticide application to consumption.  The
pesticide may degrade to form less toxic products or to form more
toxic products.
     Two  general mechanisms are responsible for the
degradation/reduction of pesticide residues in a commodity:  physical
processes and chemical processes.  Physical processes include
washing, volatilization, and removal of parts of a commodity such as
peels, hulls or outer leaves.  The pesticide also may react
chemically in the presence of moisture, heat, light, acids, bases,
enzymes, oxidizing or reducing agents, metal ions, or under other
conditions which may decrease or modify the residue.  The major
chemical degradation pathways are oxidation and hydrolysis, both of
which can occur by enzymatic or non-enzymatic mechanisms.  Most
enzymes responsible for pesticide degradation would lose their
activity permanently after being heated to 100°C or  above.
     The kinetics of pesticide degradation generally are assumed to
be pseudo first order for a particular degradation process depending
only on the pesticide residue level (which would be very low relative
to other chemicals involved in the degradation process such as
water).   However, many degradation processes can occur at the same
time.  Therefore, in order to determine the overall kinetics of
degradation, a mean half-life value (obtained by averaging half-lives
calculated from a series of sets of points along the curve of log
(residue concentration)  vs. time) may be used cautiously as an
estimate of the half-life of the composite degradation process.
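
     The mean half-life estimate described above can be sketched as
follows.  The sketch assumes that the "sets of points" are consecutive
pairs of observations and that each pair follows first-order decline;
the function names and the residue data are hypothetical.

```python
import math


def pairwise_half_lives(times, residues):
    """Half-life for each consecutive pair of points, taken from the
    slope of ln(residue) versus time (first-order decline)."""
    points = list(zip(times, residues))
    half_lives = []
    for (t1, c1), (t2, c2) in zip(points, points[1:]):
        if c2 >= c1:
            raise ValueError("residue must decline between sampling times")
        rate = (math.log(c1) - math.log(c2)) / (t2 - t1)
        half_lives.append(math.log(2) / rate)
    return half_lives


def mean_half_life(times, residues):
    """Average of the pairwise half-lives, to be used cautiously as an
    estimate for the composite degradation process."""
    values = pairwise_half_lives(times, residues)
    return sum(values) / len(values)


# Hypothetical storage study: days after harvest and residue in ppm
days = [0, 7, 14, 21]
ppm = [8.0, 4.0, 2.0, 1.0]
print(mean_half_life(days, ppm))  # about 7 days
```

When several degradation processes act at once, the pairwise values
will generally differ from one interval to the next, which is why the
averaged figure should be treated only as an approximation.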
     After harvest, commodities can be stored (sometimes for extended
periods of time), transported, commercially processed, waxed, washed,
peeled,  cooked, and treated in other commodity-specific ways.  Time
and temperature considerations are important when examining the
effects of storage, transportation, commercial processing, and
cooking.  Humidity may be important when examining storage and
transportation.  The point at which wax (with or without pesticides)
is applied to some commodities must be considered (e.g. apples,
cucumbers).  The typical way(s)  commodities are washed, peeled, and
cooked (e.g. boiled, fried, roasted, etc.) are important
considerations.  Other processes also may be important for specific
commodities (e.g. shelling nuts, removal of the outer leaves from
lettuce and cabbage, removal of the thick part of the stem from
broccoli and asparagus).  Residue degradation/reduction studies for
representative commodities within a crop group may be sufficient to
characterize residue degradation/reduction within the entire crop
 group  if  commercial  and  home  preparation practices are similar for
 the  different  commodities.
     A residue degradation/reduction study should take a treated
 sample through all of  the processes from harvest to consumption and
 should simulate typical  commercial and home practices as closely as
 possible.  Subsamples  should  be removed at each important point for
 residue determination  in the  edible portion of the commodity.  In all
 cases,  but particularly  when  degradation products are more toxic than
 the  parent, application  rates should be chosen which are close to the
 maximum registered rates so that metabolite ratios which approximate
 those  likely to result from typical applications can be determined.
 Residues  in the raw  commodities should be well above the analytical
 method limit of detection at  the beginning of the study so that the
 decline in residues  can  be measured accurately.  Analytical methods
 must have sufficiently low limits of detection (LODs) so that an
 acceptable risk can  be estimated using the LOD, considering combined
 risk from all  foods.
     EPA  is working  with the  National Food Processors Association
 (NFPA)  to develop protocols for commercial processing studies for
 some commodities.  These protocols are not yet available for
 distribution.
     Design of  studies to measure residue degradation/reduction in
 commodity storage is presented in Appendix 5.  Additional discussion
 of the integration of field trial, storage, processing, cooking, and
 other data is required.

     H.   Pesticide Usage Data

          Pesticide usage data describe the amount of pesticide
 applied per unit time (lbs. a.i. per year, for example), the number of
 acres of each crop treated (or the percentage of the crop treated),
and similar information.  This is to be distinguished from use data
which describe  the specific way the pesticide is used on a crop such
as the type of  application ("in-furrow", for example) or the timing
of applications.  Pesticide usage data are collected by the Agency
for use in human risk/benefit analyses and environmental exposure/risk
analyses, and as an input for design and planning activities
for monitoring and enforcement efforts (25).
     Usage data are available from many sources.  Proprietary sources
of usage information include those from Doane Marketing Research,
Inc., Maritz Marketing Research, Inc. and Technomic Consultants.
Doane and Maritz provide current estimated use and usage data for
major crops and some small acreage crops.  Doane also provides
livestock usage data.  Estimates generally are based on
surveys/panels and may include some expert opinion, especially
those from Technomic.  Survey data are available from USDA covering
major field crops, and more recently other crops.  Usage information
is available from many states, but the usefulness of these data
frequently is limited for many reasons including pesticide usage not
being reported by crop, sporadic collection of data, the availability
of only older data (5-10 years old),  and collection of data only for
"major crops".  The Census Bureau estimates usage by pesticide
classes, not specific pesticides, and can conduct special surveys for
selected states when funds are available.  Battelle provides
primarily foreign pesticide usage data.  Information sometimes is
obtained through phone calls to cooperative extension personnel, but
the information usually is based on opinion rather than on hard data.
Finally, registrants provide data under Section 7 of FIFRA giving the
amounts of pesticide that are produced and distributed, but the
amounts used on specific crops are not provided.
     These data are most useful for estimating ranges of percent of
crop treated on a national and regional basis for major chemicals on
major crops (major crops as defined here include field corn, wheat,
soybeans, peanuts, cotton, sorghum, barley, oats/rye, alfalfa, and
perhaps rice, plus a few specialty crops such as potatoes, tobacco,
and citrus as a group).  Data are limited for specialty (minor)
crops, postharvest applications (except apples, oranges, grapes, and
some grain fumigants), and livestock (while there are data on percent
of crop treated for feed, there is little information on which
animals are fed the treated feed).
     The  use of  percent crop  treated  information in dietary exposure
 assessment  is described in  Sections VI.A. and VI.A.2.  The usefulness
 of pesticide usage data for dietary exposure assessment has been
 limited to  national estimates of percent crop treated because of the
 reasons discussed above, and  because  there has been no information
 connecting  the treated crop to  its distribution in commerce and
 processing.
VI.  Use of Data

     The following sections discuss how various types of data are
used in dietary exposure assessment.  First an overview of the
sequence of events in determining dietary exposure is presented,
followed by a more detailed discussion of how the various types of
data are used.

     A.   Anticipated Residue Determination: Sequence of Events in
          Determining Dietary Exposure

          Tolerances, as explained, often do not accurately reflect
actual residues likely to be found in ready to eat foods.  However,
an accurate prediction of likely crop residues is vital when
estimating dietary exposure to pesticide residues for the purpose of
risk assessment so that realistic risk estimates can be obtained.  To
this end, "anticipated residues" are determined.  An "anticipated
residue" is simply the best estimate of the pesticide residue likely
to be consumed.
     The sequence of events normally followed in developing dietary
exposure/anticipated residue estimates for pesticide chemicals is the
following:
(1)       Exposure assessment based on tolerance level residues

(2)       Reassessment of exposure using adjustments for the percent
          of crop treated

(3)(a)    Reanalysis of the residue field trial data to determine
          averages or upper limits on the residue levels, depending
          on the toxic effect

   (b)    Adjustment of the residue levels for the effects of
          washing, cooking, processing, storage and other factors
          depending on the available data

   (c)    Use of existing monitoring data from FDA, USDA, the States,
          etc., when available and reliable

   (d)    Adjustments for the percent of crop treated

   (e)    Reassessment of anticipated residues and comparison of
          anticipated residues estimated from monitoring data and
          field trial/degradation data (if both are available) to
          determine consistency between the data sets, and if
          inconsistent, which anticipated residues to use

   (f)    Reassessment of exposure based on anticipated residues
          determined in (3a) to (3e) above

(4)       Requiring monitoring or other studies to be carried out by
          the pesticide registrant

     Exposure assessment is carried out in a stepwise manner in order
to assure that no unreasonable risk results from use of the pesticide
while not requiring the registrants to generate unnecessary data.  In
performing the sequence of events above, the process is stopped at
whatever point in the refinement of the exposure estimate that the
exposure results in acceptable risk levels.  Examples of some of the
calculations used in determining anticipated residues are presented
in Appendix 1.
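
     The stop-when-acceptable logic of the stepwise process can be
sketched as follows.  This is only an illustration: the tier names,
the exposure values, and the single numeric "acceptable exposure"
threshold are hypothetical simplifications of what is in practice a
risk management judgment.

```python
def first_acceptable_tier(tiers, acceptable_exposure):
    """Walk the refinement tiers in order and stop at the first one
    whose exposure estimate is acceptable.

    tiers: ordered list of (name, function returning an exposure).
    Returns (name, exposure), or (None, last exposure) when no tier
    is acceptable and further data or mitigation would be needed.
    """
    exposure = None
    for name, estimate in tiers:
        exposure = estimate()
        if exposure <= acceptable_exposure:
            return name, exposure
    return None, exposure


# Hypothetical exposure estimates (mg/kg body weight/day) per tier
tiers = [
    ("tolerance level residues", lambda: 0.020),
    ("percent of crop treated", lambda: 0.008),
    ("anticipated residues", lambda: 0.003),
]
print(first_acceptable_tier(tiers, acceptable_exposure=0.010))
# ('percent of crop treated', 0.008)
```

Later tiers are never evaluated once an acceptable estimate is
reached, which mirrors the aim of not requiring unnecessary data.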
     As a first step in estimating the dietary exposure to
pesticides, the Agency traditionally has assumed that residues would
 be at the tolerance level.   This  conservative  assumption  leads to
 unrealistically  high estimates  of dietary  exposure  (for chronic
 exposure) for a number of reasons.  For example, pesticides are not
 always applied at the maximum rate or  minimum  PHI,  not all crops are
 treated,  and  residues on food at  the time  of consumption  often are
 significantly lower than the level of  residue  on the rac.  The latter
 is due to breakdown of the  pesticide during shipping and  storage, and
 other processing procedures such  as peeling, trimming, cooking, and
 canning which may reduce the pesticide residue.
      If the dietary exposure analysis  conducted using tolerance level
 residues  leads to an estimate of  dietary exposure which is considered
 to be acceptable,  then the  Agency does not attempt  to further refine
 the dietary exposure assessment.   However, if  the exposure estimated
 using tolerances  is of concern, tolerance  levels would be adjusted
 for percent crop  treated and the  exposure  would again be  estimated.
 Risk  management  decisions based on anticipated residues corrected for
 percent crop  treated are made considering  potential changes in the
 percent crop  treated as well as on the available pesticide
 alternatives.
      If estimations using tolerances and percent crop treated data
 show  exposure  to  result in  risk levels of  concern, anticipated
 residues would be  determined.
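      The stepwise refinement described above can be sketched in a few
 lines of Python.  The sketch is illustrative only: the class and
 function names, and the use of a single "acceptable residue"
 threshold in place of a full exposure and risk calculation, are
 assumptions made for this example and are not Agency procedures.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CommodityInputs:
    """Hypothetical inputs for one commodity (names are illustrative)."""
    tolerance_ppm: float                      # legal maximum residue
    fraction_crop_treated: Optional[float]    # 0..1, None if unknown
    anticipated_residue_ppm: Optional[float]  # None if not yet determined


def refine_residue(inputs: CommodityInputs,
                   acceptable_ppm: float) -> Tuple[str, float]:
    """Return (tier, residue) for the first tier that is acceptable.

    `acceptable_ppm` is a stand-in for whatever residue level would
    correspond to an acceptable risk; deriving it is outside this sketch.
    """
    # Tier 1: residues assumed to be at the tolerance level.
    residue = inputs.tolerance_ppm
    if residue <= acceptable_ppm:
        return "tolerance", residue

    # Tier 2: tolerance adjusted for percent crop treated.
    if inputs.fraction_crop_treated is not None:
        residue = inputs.tolerance_ppm * inputs.fraction_crop_treated
        if residue <= acceptable_ppm:
            return "tolerance x %CT", residue

    # Tier 3: anticipated residue determined from available data.
    if inputs.anticipated_residue_ppm is not None:
        residue = inputs.anticipated_residue_ppm
        if residue <= acceptable_ppm:
            return "anticipated residue", residue

    # No tier is acceptable: more data or risk mitigation would be needed.
    return "unresolved", residue
```

 The function stops at the first tier that gives an acceptable result,
 mirroring the principle that refinement ends as soon as the exposure
 estimate is of no concern.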
      Prior to  requiring submission  of  new  data, all available data
 would  be examined  for  its usefulness in  determining anticipated
 residues.  This would  include field  trial  data, processing studies,
 monitoring data,   feeding studies,  percent  crop treated data,
 information regarding  typical application  rates and methods, and any
 other  type of data  which would provide a more  realistic estimate of
 residues "at the plate".  If these data were determined to be
 adequate, a more accurate exposure estimate would be made based on
 the anticipated residues calculated  from these data.  Otherwise,
 additional data would  be required  of the registrant to maintain
registration of the  pesticide.   If the available data were considered
adequate to determine  anticipated  residues, and if exposure estimated
 from these anticipated residues were still of  concern, it then would
be determined whether additional data could provide a still more
accurate anticipated residue estimate which might indicate acceptable
risk.  If so, these data would be required of the registrant in order
to maintain the pesticide registration.  Otherwise, methods other
than improving the accuracy of the anticipated residues would be
utilized to mitigate the risk.
     For a typical exposure assessment consisting of one pesticide
and many commodities, anticipated residues are determined for each
commodity using either monitoring data or field trial/degradation
studies, depending on the data which are available for each
individual commodity (both monitoring and field trial data may be
used for different commodities in a single exposure assessment for a
pesticide).
     In some cases, anticipated residues determined from monitoring
data which are considered sufficiently precise, representative, and
free from bias, and which were generated in a manner such that the
residues seen are likely to reflect actual residues "at the plate",
are substantially different from anticipated residues determined
using other types of data.  This difference frequently can be
attributed to a lack of sufficient information regarding pesticide
degradation between harvest and consumption and the resulting
inaccuracy in the anticipated residue estimate based on field
trial/degradation data.  In these cases, the anticipated residue
estimate from monitoring data is considered more accurate and is
used.  If both the monitoring and field trial/degradation data are
considered adequate but give conflicting results which cannot be
attributed to some uncertainty in one of the data sets, the more
conservative estimate of the anticipated residue is used.
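     The choice between the two estimates described in this paragraph
can be summarized as a small decision function.  This is a sketch
under stated assumptions: the function name and the boolean flags
(whether each data set is adequate, and whether the difference is
attributable to uncertainty in the field trial/degradation data) are
hypothetical, and in practice each of these judgments is made case by
case.

```python
from typing import Optional


def select_anticipated_residue(monitoring_ppm: float,
                               field_trial_ppm: float,
                               monitoring_adequate: bool,
                               field_trial_adequate: bool,
                               gap_explained: bool) -> Optional[float]:
    """Pick the anticipated residue to use for one commodity.

    gap_explained: True when the difference between the two estimates
    can be attributed to missing degradation information in the field
    trial/degradation estimate (a hypothetical flag for this sketch).
    """
    if monitoring_adequate and field_trial_adequate:
        if gap_explained:
            # Monitoring data are considered more accurate.
            return monitoring_ppm
        # Unexplained conflict: use the more conservative (higher) value.
        return max(monitoring_ppm, field_trial_ppm)
    if monitoring_adequate:
        return monitoring_ppm
    if field_trial_adequate:
        return field_trial_ppm
    # Neither data set is adequate; additional data would be needed.
    return None
```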
     When data necessary to determine anticipated residues are
required of the registrant, it is the registrants' responsibility to
develop an acceptable protocol, although the type of data needed may
be specified by EPA.  Registrants are encouraged to submit protocols
to the Agency for review prior to the initiation of studies.
Additionally, to help assure that the registrants' resources are not
wasted on studies which will not be acceptable to the Agency, OPP has
drafted  "Acceptance Criteria"  for all types of residue studies.
These documents were prepared  as part of the Phase 3 Guidelines of
FIFRA 88.  These criteria must be met before the studies will be
accepted (the studies may still be rejected for other reasons even
though they meet the minimum requirements described in the
"Acceptance Criteria").
     The approach to estimating the anticipated residue generally is
governed by the type of data available for a given pesticide/crop
combination.  The types of data utilized are illustrated in Figure 1
by a series of concentric circles in which the outer boundary
represents the highest permissible legal residue, and the center
reflects the actual residue intake by the consumer.  As one nears the
center of the circle, the anticipated residue more realistically
estimates the actual residue intake.    Residue field trial and
processing data are available  for most pesticides in the tolerance
petitions.  Monitoring data, cooking studies, and studies of the
change in residues during transport and ambient storage generally are
not available in tolerance petitions.  Monitoring data typically are
available from FDA for pesticide chemicals which are capable of being
analyzed by FDA multiresidue or single-residue methodology.  These
include most chlorinated hydrocarbons, N-methyl carbamates, phenolic,
organophosphate and carboxylic acid-containing pesticides.
Monitoring data sometimes are  available from other sources including
the USDA, State agencies, and  registrants.
     The choice of the appropriate data bases to use for estimating
dietary exposure and the manner in which these data bases are treated
are issues which require considerable scientific judgment and are
decided on a case-by-case basis.  In general, the database selected
must have sufficient information to determine the desired anticipated
residue with reasonable reliability.  This is discussed further in
Sections VI.A.(1)  and (2).
     A flow diagram depicting  some of the ideas discussed is shown in
Figure 2.  Also shown are generalized equations for determining
anticipated residues in plant  and animal commodities.
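     The generalized equations of Figure 2 can be written out as a
short Python sketch.  The symbols follow the figure (XF, XM, %CT, Sf,
Pf, F); the function names, the default factor of 1.0 (no change),
and the convention of passing %CT as a percentage are assumptions
made for this example.

```python
def ar_crops_field_trial(xf: float, pct_ct: float,
                         sf: float = 1.0, pf: float = 1.0,
                         f: float = 1.0) -> float:
    """AR for crops from field trial or farmgate data.

    xf:     average residue from field trials (ppm)
    pct_ct: percent crop treated (0-100)
    sf, pf, f: storage, processing, and food preparation factors
    """
    return xf * (pct_ct / 100.0) * sf * pf * f


def ar_crops_monitoring(xm: float, pf: float = 1.0,
                        f: float = 1.0) -> float:
    """AR for crops from monitoring data.

    xm: average monitoring residue, including non-detectable residues.
    The figure's equation has no %CT or Sf term for monitoring data.
    """
    return xm * pf * f


def ar_animal(feed_intake: float, residue_transfer: float) -> float:
    """AR for animal commodities.

    feed_intake:      residue intake from feed, already corrected for %CT
    residue_transfer: transfer factor obtained from feeding studies
    """
    return feed_intake * residue_transfer
```

     For example, a field trial average of 2.0 ppm with 50 percent
crop treated, a storage factor of 0.5, and a food preparation factor
of 0.5 gives ar_crops_field_trial(2.0, 50, sf=0.5, f=0.5) = 0.25 ppm.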

          1.   Monitoring Studies

               Monitoring data are the preferred source of data for
anticipated residue estimates, assuming sampling is representative
and sufficiently extensive, because these studies measure the residue
that actually is present in foods in the chain of commerce.  The
closer to the "dinner plate" the data are obtained, the more likely
the data will reflect realistic residue consumption.  Many factors
must be considered and weighed when determining the usefulness of
available monitoring data, and in designing a monitoring program;
formulation of a "cookbook" process for these purposes, which
includes all contingencies which might be encountered, is not
feasible.  Below we discuss some factors which must be considered
when determining the adequacy of monitoring data in determining
dietary exposure.
     Descriptions of the FDA Surveillance and Compliance Monitoring
Programs were provided in Section V.F. and are discussed in more
detail in Appendix 4.  As discussed, these programs were designed for
purposes other than dietary exposure assessment.  However, reliable
dietary exposure information can be obtained from these data in many
cases provided the limitations in the data base, which are discussed
below, are considered.
     An important consideration in determining the usefulness of any
monitoring data in dietary exposure is the geographical
representativeness of the data.  Determination of geographical
representativeness must be made on a case-by-case basis since crops
grown and pesticides used vary with location.  Since the location in
which a crop sample was grown generally is not available with FDA
monitoring data, absolute assurance of geographical
representativeness is not possible.  However, in many cases, the
Agency has concluded that FDA data were likely to be reasonably
geographically representative of pesticide residues in a commodity.
These conclusions were made considering the FDA collection districts
and states from which the samples were obtained.

Figure 1:  Approach to Targeting Realistic Residues to Use in
           Dietary Risk Assessment
           (concentric circles; outer boundary labeled "Tolerance
           Level")

Figure 2:  Determination of Anticipated Residues for Quantitative
           Carcinogenic Risk
           (flow diagram: Best Estimate from Field or Monitoring
           Study, Mean Residue (X); % Crop Treated; Commercial
           Storage (Sf); Commercial Processing (Pf); Food
           Preparation (F); Fresh Market; Calculate Feed Intake;
           Meat, Milk, Poultry, Eggs)

          ARcrops = XF x %CT x Sf x Pf x F

          ARcrops = XM x Pf x F

          AR(animal commodities) = Feed intake (corrected for %CT)
                    x residue transfer (obtained from feeding studies)

     AR  = Anticipated residue
     XF  = Average residue from field trial or farmgate monitoring
     XM  = Average residue in monitoring studies, including
           non-detectable residues
     %CT = Percent crop treated
     Sf  = Storage factor - corrects for change in residues during
           storage
     Pf  = Processing factor - corrects residue in raw agricultural
           commodity for concentration or dilution of residues in
           processed foods/feeds
     F   = Food preparation factor - corrects for changes in
           residues during food preparation (e.g. cooking,
           trimming, etc.)

     First, the
collection  districts must represent those in which the commodities
are  known to be grown and could be treated with the pesticide.  If
major growing areas are  not  included, the data would be used only if
pesticide usage data indicated that either the pesticide was not used
in those areas or that pesticide use in those areas was similar
enough to use in other represented areas so that the residue
information could be translated to the non-represented area.
Secondly, a sufficient number of samples from each collection
district must be available to assure the reliability of the
anticipated residues determined.  Again, the number of samples
required depends on the  crop being considered, as well as on the
percent of  that crop treated.  The number of samples needed also will
depend on the toxicological effect of concern since the number of
samples required for reliable assessment of chronic exposure will
differ from the number required for acute exposure.  In general, the
Agency requires analysis of a pesticide in at least 100 samples of a
particular  commodity in  FDA monitoring data before use of the data
will be considered.  Thirdly, consideration must be given to the
season or collection times of samples in each collection district.
If samples  were collected only at times when pesticide residues were
not  likely  to be found in a commodity, the data would have limited
usefulness.   Also, if a  large number of samples were obtained from a
specific local study, the data might not be representative of
residues throughout the collection district.
     Another important consideration in determining the usefulness of
FDA or other monitoring data in determining dietary exposure is the
analytical  methodology used.  Two factors are important: the limit of
detection (LOD)  and the chemical components which are measured by the
method.   The analytical method LOD must be sufficiently low to allow
unambiguous determination that the risk is acceptable.  In many cases
in which no detectable residues were found in a commodity, risks
estimated assuming non-detectable residues at the LOD, or even at 1/2
the LOD,  would be of concern.  Also, the method must measure all of
the components of the total toxic residue.  If only the parent
compound is determined, as is the case with some pesticides monitored
by FDA, a significant portion of the total residue may not be
measured and the data will have limited usefulness.
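     The effect of the value assigned to non-detectable residues can
be illustrated with a short calculation.  This is a minimal sketch:
the function name and arguments are hypothetical, and the choice of
substituting the full LOD or 1/2 the LOD for each non-detect is
passed in as a parameter rather than prescribed.

```python
from typing import Sequence


def mean_residue(detected_ppm: Sequence[float], n_nondetect: int,
                 lod_ppm: float, nd_fraction: float = 0.5) -> float:
    """Average residue with non-detects set to nd_fraction x LOD.

    detected_ppm: residues measured at or above the LOD
    n_nondetect:  number of samples with no detectable residue
    nd_fraction:  1.0 assigns the LOD, 0.5 assigns 1/2 the LOD
    """
    n = len(detected_ppm) + n_nondetect
    if n == 0:
        raise ValueError("no samples")
    total = sum(detected_ppm) + n_nondetect * lod_ppm * nd_fraction
    return total / n
```

     When every sample is a non-detect, the result is simply
nd_fraction x LOD.  If the risk estimated from that value is still of
concern, the method LOD is too high for the data to show that the
risk is acceptable.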
     If FDA or other monitoring data are determined to be adequate to
determine anticipated residues based on the considerations discussed
above, anticipated residues could be determined for raw commodities,
processed commodities, animal products, or animal feeds.  Anticipated
residues are determined directly from the monitoring data for raw
commodities.  For processed commodities or animal feeds, anticipated
residues can be determined directly if adequate monitoring data for
the processed commodity or animal feed are available, or by
multiplying the anticipated residue for the raw commodity by the
concentration/reduction factor from processing studies available in
the tolerance petitions.  For animal products, anticipated residues
can be determined directly if adequate monitoring data are available
for these commodities or they can be determined by using anticipated
residues for animal feeds (determined from monitoring studies) in
conjunction with animal feeding studies (see Section V.E.).
     For a limited number of pesticides, monitoring data are
available from the U.S. Department of Agriculture (USDA) for animal
fat (or liver) and certain forms of eggs.  If a sufficient number of
samples are available, these data can be used to determine
anticipated residues in animal commodities in a manner similar to the
way FDA data are used to determine anticipated residues for raw and
processed agricultural commodities.
     Monitoring data from sources other than the FDA and USDA have
been used by EPA for dietary exposure assessments.  In some cases,
data generated by the registrants have been used.
     For monitoring to reflect real-world exposure it is important
that significant market disruptions have not occurred (8,11).  A
recent case where market disruption has occurred is Alar.  Longer-term
monitoring will be necessary in these situations, and monitoring
should be continued until some time after normal use resumes, i.e.,
the market disruption is over, in order to obtain the most accurate
estimate of the anticipated residue.  It may be possible, however, to
 correct  the  data  for  the  effect  of  the market disruption, if the
 percent  of crop treated  is  accurately known both before and after the
 market disruption.
     The discussion presented above of the Agency's use of pesticide
 monitoring data for dietary exposure assessments provides general
 information  and guidance.   However, it must be emphasized that the
 adequacy of  the available data in dietary exposure assessment must be
 determined on a case-by-case basis  and requires considerable
 scientific judgment.  The process of dietary exposure assessment and
 use of monitoring data has  evolved  over the years and is continuing
 to evolve as additional degradation, monitoring, usage, and
 consumption  data become available.  Recent changes include the move
 towards  determining anticipated  residues rather than using tolerance
 levels,  and  towards the development of more statistically sound
 approaches to use of  these  data.  Statistical issues in the use of
 existing monitoring data and in  the design of monitoring studies for
 the purpose  of dietary exposure  assessment are presented in
 Appendices 4 and 3 respectively.

          2.   Residue Field Trial and Degradation/Reduction Studies

               As stated earlier, residue degradation/reduction
 describes any change  in the  amount  and/or composition of the total
 toxic residue from harvest  to  consumption.  Numerous factors must be
 considered including  field  preparation, storage and transport (which
 can occur at several  points  between harvest and consumption),
 commercial processing (bottling, canning, cooking, drying, shelling,
 fractionation, extraction,  deodorizing, and many other processes),
 and home preparation  (peeling, washing, various types of cooking,
 etc.).   A commodity may follow any  of several pathways between
 harvest and consumption.
     The Agency has not before considered residue
degradation/reduction, as defined here, in a possible design for a
single study to determine anticipated residues, although data for the
separate components (e.g. commercial processing) are used frequently
to determine tolerances and anticipated residues.  Descriptions of
the major residue degradation/reduction processes used in determining
anticipated residues have been provided in Section V.D.  (Commercial
Processing Studies), V.G. (Residue Degradation/Reduction Studies),
and VI.A. (Anticipated Residue Determination: Sequence of Events in
Determining Dietary Exposure).  Specific information regarding use of
these data,  as well as a preliminary discussion of the design of
degradation studies, are presented below.
     The data the Agency uses most frequently in determining
anticipated residues are field trial data, storage stability data
(frozen rather than ambient storage temperatures), commercial
processing studies, and feeding studies, as well as percent crop
treated data.
     The first step in anticipated residue determination by this
method is analysis of field trial data to determine either an average
or upper bound residue level  (see Section IV) reflecting the
registered use which would lead to the highest residues.  These
values are determined for each crop/commodity.  More than one value
may be obtained for a particular crop if the crop is known to be
treated in different ways and if sufficient information is available
to relate the different pesticide use patterns and residue levels to
different residue consumption by population subgroups.  As discussed
for monitoring data, the analytical method detection limit can limit
the usefulness of the residue data, particularly if all or a large
portion of the residues are not detectable.  If the limit of
detection (LOD) is too high, estimated risks may be unacceptable even
assuming non-detectable residues are at the LOD (or at 1/2 the LOD).
     The second step in this process is the incorporation of percent
crop treated data (for chronic risks only).  The average residue
determined from the field trial data generally is multiplied by the
percent crop treated for each commodity to obtain a residue value
which reflects an aggregate population exposure.  Using percent crop
treated in a dietary exposure assessment artificially "spreads" the
exposure over the entire U.S. population.  Higher consumption of
treated commodities by some population subgroups is addressed
separately if adequate data are available to make these evaluations.
Chronic dietary exposure analyses generally are done using percent
crop treated data for two reasons.  First, adequate pesticide usage
data and consumption data rarely are available which would allow
determination of dietary exposure to highly exposed subgroups.
Secondly, since the registered uses leading to the highest residues
are used to determine average residues, conservatism already is
incorporated into the anticipated residue determination.  Compounding
the conservatism already incorporated into the toxicological
reference values and residue levels with the additional conservative
assumption of 100% crop treated would lead to risk estimates which
exaggerate the aggregate U.S. population risk and would also likely
exaggerate the risks to many of the more highly exposed population
subgroups.  If adequate data are available to estimate exposures for
highly exposed population subgroups, these data are considered in the
risk assessment / risk management process.
     Percent crop treated data are used for raw and processed
agricultural commodities as well as for animal feeds (prior to
determining the dietary burden for the animal).  Two dietary burdens
frequently are calculated for dairy cattle reflecting animal
consumption of (1) feed items which contain high residues but are fed
only in limited geographical areas ("local milk shed" diet), and (2)
major feed items consumed in many parts of the country ("typical
national diet").  Two sets of average residues in milk are calculated
which show average residues which might be found in particular
localities as a result of feeding high-residue, locally-grown animal
feeds which have limited importance on a national basis, and national
average residues likely to be found in animal commodities resulting
from feeding cattle major national feed items.  This approach is
important for fresh milk since milk generally is shipped short
distances prior to consumption.
     When a range of percent crop treated values is provided, the
highest value (most conservative) is used.
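     The percent crop treated adjustment for chronic assessments,
including the rule for a reported range, can be sketched as follows.
The function name and the representation of the range as a pair of
percentages are assumptions made for this example.

```python
from typing import Tuple


def chronic_residue(mean_field_trial_ppm: float,
                    pct_ct_range: Tuple[float, float]) -> float:
    """Average residue adjusted for percent crop treated.

    pct_ct_range: (low, high) percent crop treated, each 0-100.
    The highest (most conservative) value of the range is used.
    """
    pct_ct = max(pct_ct_range)
    if not 0.0 <= pct_ct <= 100.0:
        raise ValueError("percent crop treated must be between 0 and 100")
    return mean_field_trial_ppm * pct_ct / 100.0
```

     With an average field trial residue of 1.55 ppm and a reported
range of 20 to 40 percent crop treated, the adjusted residue is
1.55 x 0.40 = 0.62 ppm.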
     The use of field trial/percent crop treated data does not
account for exposures from imported commodities.  However, monitoring
data are available from FDA for many commodity/pesticide
combinations.  The issue of anticipated residue determination for
imported commodities requires further discussion.
     Storage stability data (frozen storage) are required in
tolerance petitions in order to assure that the pesticide residues in
crop samples from residue field trials are stable for the length of
time that the samples are stored prior to analysis.  Some change in
the quantity or composition of the pesticide residue frequently is
seen during frozen storage.  These data are used for two purposes.
First, the data are used to correct the results of the residue field
trials for any possible degradation during frozen storage.  Secondly,
the data can be used to provide an estimate of the minimum likely
pesticide degradation under ambient storage conditions if the storage
times are known for the commodity.
     Commercial processing studies also are required in tolerance
petitions if residues could concentrate in processed fractions (see
Section V.D.).  The concentration/reduction factors determined in
these studies are used differently in determining anticipated
residues for chronic and acute risks.  For acute risk, the maximum
concentration/reduction factors found in the processing study are
used to determine anticipated residues in the processed fraction.
For chronic risks, the average concentration/reduction factors are
used.
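     The different treatment of processing factors for acute and
chronic risk can be sketched as follows.  The function name, the
string labels for the assessment type, and the example factors are
hypothetical.

```python
from typing import Sequence


def processed_residue(raw_residue_ppm: float,
                      factors: Sequence[float],
                      assessment: str) -> float:
    """Residue in a processed fraction from a raw commodity residue.

    factors:    concentration/reduction factors observed in the
                processing study (one per trial)
    assessment: "acute" uses the maximum factor, "chronic" the average
    """
    if not factors:
        raise ValueError("at least one processing factor is required")
    if assessment == "acute":
        factor = max(factors)
    elif assessment == "chronic":
        factor = sum(factors) / len(factors)
    else:
        raise ValueError("assessment must be 'acute' or 'chronic'")
    return raw_residue_ppm * factor
```

     For a raw commodity residue of 2.0 ppm and observed factors of
1.0, 2.0, and 3.0, the acute value is 6.0 ppm and the chronic value is
4.0 ppm.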
     Other types of studies have been used in determining anticipated
residues.  The effects of washing, peeling, and trimming have been
incorporated into some dietary exposure assessments.  In some cases,
conversion of residues during cooking has been an issue as in the
cases of Alar (UDMH) and EBDCs (ETU).  When the degradation product
is more toxic than  the parent compound, the Agency has used 100%
conversion as the first approximation of residues of the degradation
product on cooking.  Reduction of residues on cooking has also been
considered for some commodities for which processing studies were
available such as apples and tomatoes.
     Part of the difficulty in arriving at an accurate dietary
exposure estimate for pesticide residues at the time of consumption
is the variety of methods that may be used in food preparation and
the fact that very few commodities are eaten individually.  For
example, soup and cake consist of a mixture of commodities.
Nevertheless, information on the effect of trimming, peeling,
washing, cooking (boiling, baking, frying) may be used in arriving at
anticipated residues.  The information is most useful if the studies
correspond to the appropriate DRES food forms.
     For additional guidance regarding field trial, residue
degradation/reduction, and processing studies see Subdivision O of
the Pesticide Assessment Guidelines and associated Addenda and
Standard Evaluation Procedures (10) and Appendices 4 and 5.

REFERENCES  (Includes references  for background reading as well as
those specifically cited  in these guidelines)

1.  Guidance on Anticipated Residues and Their Use in Risk
Assessment. C. L. Trichilo, Ph.D., Chief, Residue Chemistry Branch,
to RCB Reviewers, December 2, 1987.
2.  Guidelines for Predicting Dietary Intake of Pesticide Residues.
WHO draft,  1988.

3.  Codex Limits for Pesticide Residues in Food and Consumer Safety.
Codex Alimentarius Commission, FAO, WHO, Feb., 1986.

4.  Techniques for Deriving Realistic Estimates of Pesticide Intakes.
J. P. Frawley and E. Duggan, Advances in Pesticide Science. Part 3,
Pergamon Press, Oxford and New York, 1979.

5.  Guidelines for Developing  Pesticide Residue Data in Foods and
Consumed. Canadian Food Directorate, Ottawa, July 22, 1988.

6.  Recommended Approach  to the  Appraisal of Risks to Consumer from
Pesticide Residues in Crops and  Food Commodities. J. A. R. Bates and
S. Gorbach, ((c) IUPAC),  1987.

7.  Methodology for Estimating the Dietary Intake of Pesticide
Residues. R. D. Schmitt and M. J. Nelson, EPA, Washington, DC, 1981.

8.  Residue Data for Risk Assessment. C. L. Trichilo, Ph.D., Chief,
Residue Chemistry Branch, speech given to National Agricultural
Chemicals Association, October 16, 1987.

9.  Introduction to the Tolerance Assessment System. D. Stephen
Saunders, Ph.D., EPA and B. Petersen, Ph.D., Petersen and
Associates, March, 1987.

10.  Pesticide Assessment Guidelines. Subdivision O: Residue
Chemistry, EPA, October, 1982, and

Addenda Nos. 1-8 on Data Reporting:

     Addendum 1:  Product Chemistry (EPA-540/09-88-048, NTIS No.
                  PB88-191705)
     Addendum 2:  Storage Stability Study (EPA-540/09-86-151, NTIS
                  No. PB86-248192)
                  Analytical Method(s) (EPA-540/09-86-151, NTIS No.
                  PB86-248192)
                  Magnitude of the Residue: Crop Field Trials
                  (EPA-540/09-86-151, NTIS No. PB86-248192)
     Addendum 3:  Nature of the Residue: Plants (EPA-540/09-87-199,
                  NTIS No. PB87-208641)
     Addendum 4:  Magnitude of the Residue: Processed Food/Feed Study
                  (EPA-540/09-88-004, NTIS No. PB88-117270)
     Addendum 5:  Specialty Applications: (I) Classification of Seed
                  Treatments and Treatment of Crops Grown for Seed
                  Use Only as Non-Food or Food Use, (II) Magnitude of
                  the Residue: Post-Harvest Fumigation of Crops and
                  Processed Foods and Feeds, (III) Magnitude of the
                  Residue: Post-Harvest Treatment (Except Fumigation)
                  of Crops and Processed Foods and Feeds
                  (EPA-540/09-88-008, NTIS No. PB88-124003)
     Addendum 6:  Directions for Use (EPA-540/09-88-049, NTIS No.
                  PB88-191713)
     Addendum 7:  Metabolism (Qualitative Nature of the Residue):
                  Food Animals (EPA-540/09-89-009, NTIS No.
                  PB89-124598)
     Addendum 8:  Residues in Meat, Milk, Poultry and Eggs: Livestock
                  Feeding Studies (EPA-540/09-89-010, NTIS No.
                  PB89-124606)

Hazard Evaluation Division (or Health Effects Division) Standard
Evaluation Procedures:

     Product Chemistry (EPA-540/09-86-143)
     Directions for Use (EPA-540/09-86-144)
     Qualitative Nature of the Residue: Plant Metabolism
     (EPA-540/09-88-102)
     Metabolism in Food Animals: Qualitative Nature of the Residue
     (EPA-540/09-89-061)
     Analytical Method(s) (EPA-540/09-89-062)
     Magnitude of the Residue: Crop Field Trials (EPA-540/09-85-021)
     Magnitude of the Residue: Processed Food/Feed Studies
     (EPA-540/09-86-145)
     Residues in Meat, Milk, Poultry and Eggs: Feeding Studies/
     Feed-throughs
     Residues in Meat, Milk, Poultry and Eggs: Dermal Treatments
     (EPA-540/09-092)
     Specialty Applications: (1) Classification of Seed Treatments
     and Treatment of Crops Grown for Seed Use Only as Non-Food or
     Food Uses (2) Magnitude of the Residue: Post-Harvest Fumigation
     of Crops and Processed Foods and Feeds (3) Magnitude of the
     Residue: Post-Harvest Treatment (Except Fumigation) of Crops and
     Processed Foods and Feeds (EPA-540/09-86-142)

11.  Collection and Use  of  Residue Data. C. L.  Trichilo,  Ph.D.,
Chief,  Residue Chemistry Branch, speech given  at  Conference for the
Food Industry Forum on Pesticide Residues, November  18,  1987.

12.  Residues in Foods - 1990.  JAOAC 74, Sept/Oct  1991,  FDA.

13.  FDA Research Program:  Pesticide  Analytical  Methodology.  FDA,
Oct., 1988.

14.  The FDA Pesticides Monitoring Program. Reed, D. V., Lombardo,
P., Wessel, J. R., Burke, J. A., and McMahon, B., J. Assoc. Off.
Anal. Chem. 70: 1072-1081 (1987).

15.  Interim Report No. 1: The Construction of a Raw Agricultural
Commodity Consumption Data Base. White, S.B., Peterson, B.J.,
Clayton, C.A., & Duncan, D.F., Research Triangle Institute, April 27,
1983

16.  Interim Report: Issues Concerning the Development of Statistics
for Characterizing "Anticipated Actual Residues". Clayton, C.A.,
Peterson, B.J., & White, S.B., Research Triangle Institute, June 5,
1984

17.  Documentation of Analysis Files and Statistical Methods in the
Tolerance Assessment System, prepared for the Environmental
Protection Agency by Research Triangle Institute, not dated

18.  Documentation of the Food Consumption Files used in the
Tolerance Assessment System, Alexander, B.V., & Clayton,  C.A.,
Research Triangle Institute, December, 1986

19.  An Independent Assessment of Food Consumption Estimates Used in
the Tolerance Assessment System, prepared for the Environmental
Protection Agency by SRA Technologies, February, 1986

20.  Issues in the Use of Crop Residue Data to Estimate Human
Exposure to Pesticides. Norwood, Charles R., Radix data Inc., August
31, 1986

21.  Memorandum from Richard D. Schmitt to O. E. Paynter, 1977

22.  Memorandum from Richard D. Schmitt to Acting Chief, Toxicology
Branch, 5/1/78

23.  The FDA Pesticide Program: Goals and New Approaches, Lombardo,
P., J. Assoc. Off. Anal. Chem. 72(3), 1989

24.  Tolerance Assessment System Food Form Codes, Chaisson, C.F., and
Peterson, B.J.

25.  Pesticide Usage Data. Faulkner, J., 12/20/89 Summary Paper

26.  Agricultural Statistics, 1989, United States Department of
Agriculture, United States Government Printing Office, 1989.

27.  1987 Census of Agriculture. U.S. Department of Commerce, Bureau
of the Census, 1987.

28.  Federal Food, Drug, and Cosmetic Act. As Amended. 1980  (FFDCA)

29.  Federal Insecticide, Fungicide, and Rodenticide Act, as Amended
(1988) (FIFRA 1988)

              APPENDIX 1

EXAMPLES OF CALCULATIONS TO DETERMINE
 TOLERANCES AND ANTICIPATED RESIDUES

     The following three examples illustrate determination of
tolerances or anticipated residues from field trial, processing,
or monitoring data.  These examples represent specific
applications of the more general methods discussed in the body
and appendices of these guidelines.  Depending on the usage,
chemistry, or hazard of a particular pesticide, additional data
may be required.  Scientific judgement is necessary to determine
the adequacy of available data.  We note that proposed Guidelines
for the Collection of Residue Data for Acutely Toxic Pesticides
are currently being prepared.  These guidelines will affect the
manner in which anticipated residues for acutely toxic pesticides
are determined.

Example 1:     Tolerance/Anticipated Residue Determination Using
               Field Trial/Degradation Data (Residues of
               Pesticide A in Grapefruit)

     Pesticide A has been proposed for registration on
grapefruit.  Grapefruit are grown primarily in AZ (4%), CA (14%),
FL (76%), and TX (6%).  (Percent of total production is shown in
parentheses).  Ten field trials were conducted at the usage level
likely to produce maximum residues,  1 in AZ, 2 in CA, 6 in FL,
and 1 in TX.   The number of field trials per geographic area was
roughly proportional to production.   The residue levels reported
in the field trials at the requested PHI (pre-harvest interval)
of 7 days were as follows (averages of triplicate analyses):

     States sampled:        AZ, CA, FL, TX
     Residue levels (ppm):  2.11, 3.76, 1.23, 0.76, 1.35, 1.86, 1.57,
                            0.84, 0.97, 1.05


-------
The maximum residue reported was 3.76 ppm, and the average
residue, 1.55 ppm.  The 95th percentile residue level would be
3.33 ppm assuming a normal distribution of the data.  Processing
studies also were conducted.  The following results were
reported.

              Results of Grapefruit Processing Study

                            Concentration/Reduction Factor
     Commodity                 Average        Maximum
     pulp (not required)        0.07           0.10
     dried pulp                 1.5            2.0
     peel                       2.5            3.0
     oil (refined)             <0.003         <0.003
     molasses                   0.3            0.5
     juice                     <0.003         <0.003

     Determination of tolerance level.  Based on the results of
the residue field studies and the processing study, tolerance
levels would be determined.  Chemistry Branch would recommend a
tolerance level for grapefruit of 4 ppm (round 3.76 ppm up to a
single digit of 4 ppm).  In the processing study, concentration
of residues was observed in dried pulp and peel.  A feed additive
tolerance of 8 ppm would be required for dried citrus pulp, which
is obtained by multiplying the 4 ppm tolerance for grapefruit by
the maximum concentration factor of 2.0.  Although concentration
is observed in grapefruit peel, no food or feed additive
tolerance would be required since citrus peel is not defined as a
processed commodity in Subdivision O (Residue Chemistry) of the
Pesticide Assessment Guidelines.  Since no concentration was
observed in grapefruit oil, molasses, or juice, no food additive
tolerances would be required and the 4 ppm tolerance for the raw
agricultural commodity would apply to these processed commodities.
These tolerance values then would be provided for a Dietary Risk
Evaluation System (DRES) analysis to make a "first cut" risk
estimate to determine the acceptability of the proposed use of
the pesticide.
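
The summary statistics and tolerance arithmetic in this example can be checked with a short Python sketch.  It uses the ten field-trial values reported above.  It assumes that the 3.33 ppm "95th percentile" was obtained as the mean plus two sample standard deviations; that convention reproduces the reported figure, but the formula is not stated in this example.  Variable names are illustrative only.

```python
import math
import statistics

# Field-trial residues (ppm) for Pesticide A in grapefruit, from Example 1.
residues = [2.11, 3.76, 1.23, 0.76, 1.35, 1.86, 1.57, 0.84, 0.97, 1.05]

max_residue = max(residues)
mean_residue = statistics.mean(residues)
sample_sd = statistics.stdev(residues)  # sample SD (n - 1 denominator)

# Assumption: upper bound taken as mean + 2 sample standard deviations.
upper_bound = mean_residue + 2 * sample_sd

# Tolerance for the raw commodity: maximum residue rounded up to a whole ppm.
rac_tolerance = math.ceil(max_residue)

# Feed additive tolerance for dried pulp:
# raw-commodity tolerance x maximum concentration factor from the processing study.
max_factor_dried_pulp = 2.0
dried_pulp_tolerance = rac_tolerance * max_factor_dried_pulp

print(f"maximum residue       = {max_residue:.2f} ppm")
print(f"average residue       = {mean_residue:.2f} ppm")
print(f"mean + 2 SD           = {upper_bound:.2f} ppm")
print(f"grapefruit tolerance  = {rac_tolerance} ppm")
print(f"dried pulp tolerance  = {dried_pulp_tolerance:g} ppm")
```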


-------
     Determination of anticipated residues based on field
studies.  The anticipated residues for grapefruit and its
processed commodities for quantitative carcinogenic risk
assessment would be determined by multiplying the average residue
level in grapefruit found in the field studies by the average
concentration/reduction factor found in the processing study.

     Grapefruit (edible portion)    1.55 ppm x 0.07   = 0.10 ppm
     Grapefruit oil                 1.55 ppm x <0.003 = <0.005 ppm
     Grapefruit juice               1.55 ppm x <0.003 = <0.005 ppm

     For a pesticide which already was registered, the
anticipated residues determined above for quantitative
carcinogenic risk assessment could be multiplied by the percent
of crop treated.  This operation is normally done within the
DRES.
     For other chronic effects, the anticipated residue for
grapefruit and its processed commodities would be the same as
those determined for carcinogenic risk since field trial data are
used in this example.
     For acute exposures, the anticipated residue for grapefruit
and its processed commodities would be determined by multiplying
the maximum residue level in grapefruit found in the field
studies by the maximum concentration/reduction factor found in
the processing study.
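
The chronic and acute calculations described above can be sketched as follows.  The residue levels and processing factors are those of Example 1; the function name and the 25 percent crop-treated figure are hypothetical and serve only to illustrate the adjustment.  The unrounded chronic product for the edible portion is 0.1085 ppm, which the example reports as 0.10 ppm.  The acute values are not given in the example; they follow from its stated rule.

```python
# Average and maximum field-trial residues in whole grapefruit (ppm), Example 1.
AVERAGE_RESIDUE = 1.55
MAXIMUM_RESIDUE = 3.76

# commodity: (average factor, maximum factor, True if factor is a "<" bound)
factors = {
    "grapefruit (edible portion)": (0.07, 0.10, False),
    "grapefruit oil": (0.003, 0.003, True),
    "grapefruit juice": (0.003, 0.003, True),
}


def anticipated_residue(residue_ppm, factor, fraction_treated=1.0):
    """Residue in the raw commodity x processing factor x fraction of crop treated."""
    return residue_ppm * factor * fraction_treated


# Chronic (including carcinogenic) endpoints: average residue x average factor.
chronic = {c: anticipated_residue(AVERAGE_RESIDUE, avg)
           for c, (avg, _mx, _bound) in factors.items()}

# Acute endpoint: maximum residue x maximum factor.
acute = {c: anticipated_residue(MAXIMUM_RESIDUE, mx)
         for c, (_avg, mx, _bound) in factors.items()}

# Hypothetical percent-crop-treated adjustment for an already registered use.
HYPOTHETICAL_FRACTION_TREATED = 0.25
adjusted = anticipated_residue(AVERAGE_RESIDUE, 0.07, HYPOTHETICAL_FRACTION_TREATED)

for commodity, (_avg, _mx, is_bound) in factors.items():
    prefix = "<" if is_bound else ""
    print(f"{commodity}: chronic {prefix}{chronic[commodity]:.4f} ppm, "
          f"acute {prefix}{acute[commodity]:.4f} ppm")
```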

Example 2:     Anticipated Residue Determination Using Monitoring
               Data (Pesticide B in Grapes)

     Since pesticide A above is being proposed for registration,
and therefore is not registered already, no monitoring data would
be available from any source and therefore could not be used for
the determination of anticipated residues.  Pesticide B, which
already is registered, will be used as an example for the use of
monitoring data in determining anticipated residues.  Pesticide B

-------
 has been registered for many years for use on grapes.   A
 tolerance has been established for residues on grapes at 2 ppm
 and a food additive tolerance has been established for raisins at
 10 ppm.   No degradation studies are available for pesticide B on
 grapes.   The 2 ppm tolerance is based on residue levels in field
 studies  where the maximum residue was 1.8 ppm and the  average was
 0.5 ppm.   Pesticide B can be analyzed by the Luke method (one of
 the FDA  multiresidue methods).   Consequently, FDA has  analyzed
 330 samples of grapes for pesticide B in the past 3 years.  Since
 a variety of sampling districts was associated with the samples
 reflecting the major growing areas and various sampling times,
 and since several varieties of grapes were examined, we conclude
 that these samples likely are adequately representative
 geographically and in terms of the types and rates of  pesticide
 application which may be employed.  The average residue reported
 by FDA in their surveillance monitoring was 0.2 ppm, the maximum,
 1.0 ppm,  and the 95th percentile,  0.8 ppm.  Based on the FDA
 monitoring data,  Chemistry Branch would use 0.2 ppm as the
 anticipated residue for quantitative carcinogenic risk
 assessment,  0.8 ppm as the anticipated residue for other chronic
 effects,  and 2.0 ppm (tolerance level)  as the anticipated residue
 for acute exposure.   The difference between the results of the
 monitoring study and the field study can be attributed to a
 number of factors including use of application rates and PHIs
 other than the maximum used in the field trials,  and degradation
 of residues between harvest and consumption.
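
The choice of residue value by toxicological endpoint in this example can be summarized in a brief sketch.  The numbers are those reported for Pesticide B; the function and endpoint labels are illustrative names, not terms defined by these guidelines.

```python
# FDA surveillance monitoring summary for Pesticide B on grapes (ppm), Example 2.
monitoring = {"average": 0.2, "p95": 0.8, "maximum": 1.0}
TOLERANCE_PPM = 2.0


def anticipated_residue(endpoint):
    """Return the residue value Example 2 assigns to each endpoint."""
    if endpoint == "carcinogenic":
        return monitoring["average"]   # average monitoring residue
    if endpoint == "other chronic":
        return monitoring["p95"]       # 95th percentile of monitoring data
    if endpoint == "acute":
        return TOLERANCE_PPM           # tolerance level
    raise ValueError(f"unknown endpoint: {endpoint}")


for endpoint in ("carcinogenic", "other chronic", "acute"):
    print(f"{endpoint}: {anticipated_residue(endpoint)} ppm")
```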

 Example 3:     Determining Tolerances and Anticipated Residues in
                Animal Commodities

      Tolerances have been established for residues of  pesticide A
 on apples and citrus at 5 ppm.   Processing studies have been
 conducted.   The maximum concentration factors for Pesticide A are
 5x in apple pomace,  2x in citrus pulp,  and 0.5x in citrus
 molasses  (reduced by 2x).  Feed additive tolerances are required

-------
as follows:  25 ppm for apple pomace (5 ppm x 5x); 10 ppm for
citrus pulp (5 ppm x 2x).  No feed additive tolerance is needed
for citrus molasses, since concentration was not observed.  The
tolerance for the raw agricultural commodity (RAC) will adequately
cover residues in citrus molasses.
     The dietary burden for beef cattle is estimated as follows:

     Feed item          % in diet    residue (ppm)    dietary burden (ppm)
     apple pomace          50%            25               12.5
     citrus pulp           33%            10                3.3
     citrus molasses       15%             5                0.75
                                 Total dietary burden =    16.6 ppm

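
The dietary burden is the sum, over feed items, of the fraction of the diet multiplied by the residue in that item.  A minimal sketch with the figures above (variable names are illustrative); the unrounded total is 16.55 ppm, reported in the example as 16.6 ppm.

```python
# (feed item, fraction of beef cattle diet, residue in feed item in ppm), Example 3.
feed_items = [
    ("apple pomace", 0.50, 25.0),
    ("citrus pulp", 0.33, 10.0),
    ("citrus molasses", 0.15, 5.0),
]

# Contribution of each item = fraction of diet x residue level.
contributions = {name: fraction * residue for name, fraction, residue in feed_items}
total_burden = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name}: {value:.2f} ppm")
print(f"total dietary burden: {total_burden:.2f} ppm")
```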
A livestock feeding study was conducted with feeding levels of
10, 30, and 100 ppm of pesticide A in the diet of dairy cattle.
Three cattle were dosed at each level.  The average residue of
pesticide A found in tissues was as follows:

                         Feeding level
     Tissue      10 ppm      30 ppm      100 ppm
     muscle      <0.05       <0.05        0.15
     liver        0.052       0.167       0.652
     kidney       0.056       0.178       0.703
     fat         <0.05       <0.05        0.053
     milk         0.075       0.223       0.825

For muscle, the estimated residue for a dietary burden of 16.6
ppm would be 0.15 ppm x (16.6/100) = 0.025 ppm, which would be
the anticipated residue for quantitative carcinogenic risk
assessment.  The tolerance would be set at 0.05 ppm, the limit of
detection of the analytical method.  For liver, the anticipated
residue would be based on linear regression or interpolation.
The linear regression equation is y = ax + b, where a is the
slope of the dose-response curve and b is the y intercept, which
should be equal to zero.  For liver, a = 0.0067 and b = -0.024,
with a correlation coefficient of 0.999.  The estimated residue

-------
then would be 0.09 ppm  in liver.  The tolerance for pesticide A
in liver would be set at 0.1 ppm.  Residue estimates in the other
tissues would be determined in a similar manner.  For milk, a
different diet for dairy cattle would be used to estimate the
dietary burden because  dairy cattle and beef cattle eat different
diets.
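
The muscle and liver estimates can be reproduced with an ordinary least-squares fit, sketched below in plain Python.  The inputs are the liver residues from the feeding study and the 16.6 ppm dietary burden; the function name is illustrative.

```python
# Liver residues (ppm) at each feeding level (ppm in diet), from the feeding study.
feeding_levels = [10.0, 30.0, 100.0]
liver_residues = [0.052, 0.167, 0.652]
DIETARY_BURDEN = 16.6  # ppm, beef cattle


def least_squares(x, y):
    """Fit y = a*x + b; return slope a, intercept b, and correlation coefficient r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


slope, intercept, r = least_squares(feeding_levels, liver_residues)
liver_estimate = slope * DIETARY_BURDEN + intercept

# Muscle: only the 100 ppm level gave a detectable residue (0.15 ppm),
# so the estimate is scaled in proportion to the dietary burden.
muscle_estimate = 0.15 * DIETARY_BURDEN / 100.0

print(f"a = {slope:.4f}, b = {intercept:.3f}")
print(f"liver estimate  = {liver_estimate:.2f} ppm")
print(f"muscle estimate = {muscle_estimate:.3f} ppm")
```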
     In the example given above for the determination of
tolerance and anticipated residue levels, it is not likely that
the same livestock would consume both apple pomace and citrus
processed products because apples and citrus generally are not
grown in the same area.  Consequently, we could base the
livestock dietary burden on residues in apple pomace only.  Note
that this would still be a conservative choice since residues in
processed products of citrus are lower than residues in processed
products of apples.

-------
            APPENDIX 2

MOVEMENT OF COMMODITIES IN COMMERCE

-------
2.1   Introduction

      An understanding of the movement of commodities in commerce
is important to estimating pesticide residue levels in foods at
the time they are eaten because levels may be affected by how the
foods are transported, stored, and processed.  The purpose of
this  chapter is to provide an overview of the movement of
commodities in U.S. commerce, with an emphasis on the geographic
distribution, storage, and wholesale/retail marketing of fresh
and processed fruits and vegetables.
     Data have been collected from a variety of sources including
federal agencies, trade groups, books, and professional journals.
     This chapter is arranged in three sections.  Section 2.2
addresses questions about where fresh and processed commodities
are produced and consumed.  Section 2.3 addresses questions about
how and for how long fresh and processed commodities are stored.
Section 2.4 addresses questions about distribution points through
which fresh and processed commodities typically pass on their way
from growers to consumers.  Information about how different
varieties of commodities may differ with respect to
transportation, storage, etc. is provided when available in all
three sections.
2.2  Production and Regional/Local Distribution Information

     This section addresses questions about how much produce is
consumed in fresh versus processed forms, and where fresh and
processed commodities are produced and consumed.
     2.2.1  Fresh and Processed Produce

     Based on consumption data estimated by the U.S. Department
of Agriculture's Economic Research Service (USDA/ERS) and
published in Food Consumption, Prices, and Expenditures,
1966-1987 (USDA, 1988(b)), per capita U.S. consumption in 1987 of all

-------
fresh fruits totaled about  99 pounds.  This  compares  to  about  9
pounds of canned and chilled fruit,  4.1 pounds of  frozen fruit,
and 3.1 pounds of dried  fruit  (mainly raisins).  Americans also
consume about 20 pounds  of  canned,  5 pounds  of chilled,  and 40
pounds of frozen fruit juices each year.  Estimates of per capita
consumption of individual fresh and processed fruits are
available from the same data.  These data should be valuable in
establishing the proportions of individual fruits  produced and
consumed fresh and in various processed forms.
     Processed vegetable consumption exceeds fresh consumption by
about one-third.  In 1987,  per capita fresh, frozen,  and canned
vegetable consumption (excluding potatoes) was about  79  pounds,
17 pounds, and 87 pounds, respectively.  Fresh and frozen
vegetable consumption is apparently on the rise over  time, at the
expense of canned vegetables.  Certain vegetables  like corn and
tomatoes are predominantly  processed, while  others like  carrots
and lettuce are predominantly consumed fresh.  The ERS/USDA data
show per capita consumption of processed forms for individual
commodities, including carrots, celery, corn, lettuce, spinach,
tomatoes, and other vegetables.
     Farm-weight basis potato consumption in 1987  was 124 pounds
per capita.  Processed potato consumption exceeds  fresh
consumption significantly.  All in  farm-weight basis, the total
includes approximately 48 pounds fresh, 46 pounds  frozen, 18
pounds in chips, 10 pounds  dehydrated, and 2 pounds canned (USDA,
1988(b)).
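
The 1987 per capita figures quoted above can be cross-checked with simple arithmetic.  The sketch below uses only the numbers in the text; variable names are illustrative.

```python
# 1987 per capita vegetable consumption, excluding potatoes (pounds).
vegetables = {"fresh": 79, "frozen": 17, "canned": 87}
processed_vegetables = vegetables["frozen"] + vegetables["canned"]
# Fraction by which processed consumption exceeds fresh consumption.
processed_excess = processed_vegetables / vegetables["fresh"] - 1

# 1987 per capita potato consumption, farm-weight basis (pounds).
potatoes = {"fresh": 48, "frozen": 46, "chips": 18, "dehydrated": 10, "canned": 2}
potato_total = sum(potatoes.values())
processed_potato_share = (potato_total - potatoes["fresh"]) / potato_total

print(f"processed vegetables exceed fresh by {processed_excess:.0%}")
print(f"potato total: {potato_total} lb, processed share {processed_potato_share:.0%}")
```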
     Another source of information  on the disposition of
harvested fruits and vegetables is  the Preserved Fruits  and
Vegetables Industry Series of the 1982 (and preliminary 1987)
Census of Manufactures (U.S. Department of Commerce, 1982).
Table 7 of the series is the "materials consumed by kind" table.
It shows the quantities and costs of various individual fruits,
vegetables, sugar,  meats, and other  ingredients utilized  by SIC
(Standard Industrial Classification).  The industries covered are
Canned Specialties (SIC  2032), Canned Fruits and Vegetables (SIC

-------
2033), Dehydrated Fruits, Vegetables, and Soups (SIC 2034),
Pickles, Sauces, and Salad Dressings (SIC 2035), Frozen Fruits
and Vegetables (SIC 2037), and Frozen Specialties (SIC 2038).
The data show,  for example, that the  "canning industry" (SIC
2033) utilized  1,639,200 short tons of sweet corn  in 1982, while
the "freezing industry"  (SIC  2037) utilized 792,000 short tons of
the same commodity.   These  data could be used to estimate the
shares of selected commodities canned, frozen, or  otherwise
processed.
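
As an illustration of how such shares could be estimated, the sketch below uses the two sweet corn figures quoted above.  It assumes, for illustration only, that canning and freezing are the only processing uses being compared; quantities used by other industries are ignored.

```python
# 1982 sweet corn utilization (short tons), from the Census of Manufactures figures above.
utilization = {
    "canned (SIC 2033)": 1_639_200,
    "frozen (SIC 2037)": 792_000,
}

total = sum(utilization.values())
# Share of each processed form among the two forms considered.
shares = {form: tons / total for form, tons in utilization.items()}

for form, share in shares.items():
    print(f"{form}: {share:.1%}")
```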
     The Almanac of the Canning, Freezing, and Preserving
Industries  (Judge, 1989) is another valuable source of
information on the production and disposition (fresh and
processed) of fruits  and vegetables.  Data are presented on
harvested acreage, yield per acre, and production  tonnage for
1979-1988 by commodity  (and sometimes by variety), by state, by
year, and sometimes by processed form.  These data show, for
example, that 601  thousand  tons of snap beans were processed, and
that 31 percent were  frozen and 69 percent were canned in 1988.
They also show the production of processed snap beans in each
state each year from  1979-1988, but do not indicate quantities
frozen and canned  by  state.  Similar  information is available for
many other fruits  and vegetables.  Some varietal information is
available from the same source (e.g. temple, Valencia, navel, and
other oranges).
     2.2.2  Fresh Produce Production and Distribution

     Data on acres of production and quantities harvested by
state are available for many individual crops in the Census of
Agriculture (U.S. Department of Commerce, 1987) and Agricultural
Statistics (USDA, 1988(a)).  The former source also has this
information by county.
     Information on the production and distribution of fresh
produce (fresh fruits and vegetables) can be obtained or derived
from data collected by the USDA's Agricultural Marketing

-------
  Service's (AMS) Fresh Fruit and Vegetable Division.   Four
  publications  are especially valuable.
       Fresh Fruit and Vegetable Shipments (USDA, 1989 (a)) reports
  container  net weights for commodities  and weights of domestic
  rail,  truck,  piggyback,  air and boat shipments of fresh fruits
  and  vegetables by state,  commodity,  and month.   One  can
  determine,  for example,  that 117,000 cwt (hundredweight)  of
  oranges  were  moved by truck from Texas producers in  March of
  1988,  or that 230,000 cwt of carrots were shipped by rail from
  California producers in November of 1988.   Data are reported for
  all  states and all months for each of  about 80 specific
  commodities.   The data are also available by variety for some
  varieties  of  some commodities (e.g.  iceberg v.  romaine lettuce,
  and  bell v. "other" peppers),  though the varietal information is
  very limited.   Imports of individual commodities by  country of
  origin and by month are also reported.   Truck and air shipments
  and  exports for all commodities are  not available; therefore,
  those that are reported  should not be  interpreted as complete.
       Edwards  (1989)  reports that it  is very reasonable to assume
  that the state of origin in the "Shipments" data is  the state
  where the  commodity was  harvested. Very little commodity is grown
  in one state  and shipped from another.   Edwards also reports that
  the  month  of  shipment coincides closely with the month of
  harvest, since only a few commodities  of any significance (e.g.
  potatoes and  apples)  can be stored very long.   Consequently,
  Edwards  believes that the Shipments  data are the best single
  source to  determine which commodities  are harvested  where (state)
  and  when (month).   USDA  has not formally researched  the coverage
  or accuracy of the Shipments data.   Edwards reports  that
  shipments  are reported by all large  produce shippers,  most
  medium-size shippers,  but relatively few small shippers.
       The USDA also publishes data on arrivals at 20  U.S.  cities
  of fresh fruits and vegetables by month,  by state of origin,  and
  by several modes of transportation (rail,  truck, and air)  (USDA
  1989 (b)).  The USDA data show, for example, the quantity of

-------
 fresh  apples that arrived by truck  in  Chicago,  IL  each  month  in
 1988 from each  of 15  states  (plus from 3  foreign countries).
 Data are  provided for about  75  specific commodities  (including
 tomatoes,  potatoes, spinach,  apples, and  bananas)  arriving  in the
 cities of Chicago, Dallas, Denver, Colorado, Los Angeles, New
 Orleans,  St.  Louis, San  Francisco/Oakland, Atlanta,
 Baltimore/Washington  DC,  Boston, Detroit,  Buffalo, Cincinnati,
 Columbia  (SC),  Miami,  New York  City/Newark, Philadelphia, and
 Pittsburgh.   Like the shipments data,  the state of origin is
 supposed  to  be  (and believed by Edwards to be)  the state where
 the commodity was harvested.  These publications also report
 container net weights for commodities.
     Beilock  and  Portier  (1989) report that the arrival data  from
 USDA (USDA 1989  (a) and  (b))  offer  the best source (with
 limitations)  of  information  on  the  geographic distribution  of
 fresh fruits and vegetables.  The cities at which arrivals data
 are gathered  act  as distribution points;  however,  there are no
 data on the distribution  of  the fresh  produce from these cities.
 They demonstrate  a fairly straightforward method to  determine the
 amount of  each  commodity  shipped from  each of the  50 states to
 each region of  the country (South,  Northeast, Lakes, and West).
     Edwards  (1989) reports  that the metropolitan  areas covered
 in the Arrivals data  have a  total population  of about 80 million
 people.   Based  on this and other information  along with
 reasonable assumptions, Edwards believes  that about  30  percent of
 all fresh  fruit and vegetable "receipts"  are  reflected  in the
 data.
     It should  be possible to use the  Shipments and  Arrival data
 to determine, subject  to  data inaccuracies, whether  regions (and
 possibly  states)  tend  to  consume "local"  produce.  One  would
 first  rely on the shipments  data to estimate  any given  state's
 production of a single commodity (or any  group  of  commodities).
 Next,  one would rely  on the  arrivals data  to  determine  the  origin
 of this same  commodity (or group of commodities) arriving in  a
particular city.   If the  share  of a commodity (group of

-------
commodities) arriving in a city originating in-state is greater
than the share of the total U.S. production originating in that
state, that would be evidence that the commodity (group of
commodities) tends to be locally consumed in that state.  By
examining a group of commodities across a number of cities over a
period of time, one could make generalizations about the fresh
produce in question and answer commodity-specific questions.
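
The comparison described above can be sketched as follows.  All quantities in the example are hypothetical and chosen only to show the mechanics; the function names are illustrative and do not correspond to any USDA data format.

```python
def in_state_share(arrivals_by_origin, state):
    """Share of a city's arrivals of a commodity that originated in `state`."""
    total = sum(arrivals_by_origin.values())
    return arrivals_by_origin.get(state, 0.0) / total


def production_share(shipments_by_state, state):
    """Share of total U.S. shipments (proxy for production) originating in `state`."""
    total = sum(shipments_by_state.values())
    return shipments_by_state.get(state, 0.0) / total


def suggests_local_consumption(arrivals_by_origin, shipments_by_state, state):
    """True if the in-state arrival share exceeds the state's production share."""
    return (in_state_share(arrivals_by_origin, state)
            > production_share(shipments_by_state, state))


# Hypothetical figures (thousand cwt) for one commodity and one city in state "GA".
city_arrivals = {"GA": 120.0, "CA": 300.0, "FL": 180.0}
us_shipments = {"GA": 500.0, "CA": 6000.0, "FL": 3500.0}

print(in_state_share(city_arrivals, "GA"))       # in-state share of arrivals
print(production_share(us_shipments, "GA"))      # state share of U.S. shipments
print(suggests_local_consumption(city_arrivals, us_shipments, "GA"))
```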
     Alternatively, one can examine fresh-produce arrivals in a
given location through the year to determine whether in-state
arrivals increase during local harvest months.  As an example,
consider tomato arrivals in Chicago.  In all of 1988, 54,900 tons
of tomatoes arrived in Chicago.  Of this total, 0.5 percent
originated in Illinois.  However, all of these "local" tomatoes
arrived in just three months: July, August, and September.
Taking this three-month period alone, 1.7 percent of tomato
arrivals in Chicago originated in-state.  This is evidence that
at least one fresh commodity (tomatoes) is locally consumed, when
available, in at least one area  (Chicago).
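
The Chicago tomato figures imply the following quantities.  The in-state tonnage follows directly from the numbers above; the three-month arrival total is not reported in the text and is back-calculated here from the 1.7 percent share, so it should be read as an approximation.

```python
# 1988 tomato arrivals in Chicago, from the text.
ANNUAL_ARRIVALS_TONS = 54_900
ANNUAL_IN_STATE_SHARE = 0.005        # 0.5 percent originated in Illinois
THREE_MONTH_IN_STATE_SHARE = 0.017   # July-September share originating in-state

# All in-state arrivals occurred in July, August, and September.
in_state_tons = ANNUAL_ARRIVALS_TONS * ANNUAL_IN_STATE_SHARE

# Derived, not reported: total arrivals during those three months.
implied_three_month_arrivals = in_state_tons / THREE_MONTH_IN_STATE_SHARE

# How much larger the in-season in-state share is than the annual share.
seasonal_ratio = THREE_MONTH_IN_STATE_SHARE / ANNUAL_IN_STATE_SHARE

print(f"in-state arrivals: {in_state_tons:.1f} tons")
print(f"implied July-September arrivals: {implied_three_month_arrivals:,.0f} tons")
print(f"in-season share is {seasonal_ratio:.1f} times the annual share")
```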
     Another USDA publication, "Economic Indicators of the Farm
Sector: State Financial Summary, 1988"  (USDA 1989(c)), may prove
useful in establishing the origins of commodities, but not their
distribution.  It shows 1988 farm-level cash receipts in each
state by commodity groups (e.g. "vegetables", "fruits", "poultry"
and "milk") and by commodity for some selected commodities (e.g.
potatoes, lettuce, grapefruit, chickens, and turkeys).  In states
where receipts warrant it, there is even greater detail.  In
California for example, data are reported individually for
winter, spring, summer, and fall potatoes; spring and summer
onions; freestone and clingstone peaches; etc.  Amounts are given
in dollar values rather than weight or acreage.
     A final valuable source of information on raw production by
state, commodity, and year is the Raw Product Statistics section
of the 1989 Almanac of the Canning, Freezing, and Preserving
Industries (Judge, 1989).  While most of the information

-------
contained therein is available from the USDA, it is well
organized and readily accessible in the Almanac.
     2.2.3  Processed Produce Production and Distribution

     Kohls and Uhl  (1985) describe the fruit and vegetable
processing industry as canners, bottlers, dehydrators, and
freezers.  These activities take place primarily in three 4-digit
SIC industries:  canned fruits and vegetables  (SIC 2033),
dehydrated fruits, vegetables, and soups  (SIC  2034), and frozen
fruits and vegetables (SIC 2037).  Fruit and vegetable products
traditionally bottled rather than canned  (e.g. catsup, chili
sauce, jams, jellies, etc.) are also classified in SIC 2033.
Relatively small quantities of fruits and vegetables are
processed in SIC 2032 (canned specialties), SIC 2035  (pickles,
sauces, and salad dressings), and SIC 2038  (frozen specialties).
No more than 4 percent of all fresh fruits and vegetables
destined for processing are shipped to any one of these latter
three industries (U.S. Department of Commerce, 1982 Census of
Manufactures).
     The processing industry for each fruit and vegetable
commodity tends to be located in areas with a  comparative
advantage in production of that crop (Carman and French, 1988).
Rhodes (1987) reports that the typical midwestern vegetable
processor procures mainly within a 50-mile radius of the plant.
Lopez and Henderson (1989) conducted a survey  of food processing
plants and found that proximity to raw agricultural commodities
was a "very important" or "important" location factor for about
three-quarters of the respondents.
     Hodgen (1989) confirms that fruit and vegetable processing
facilities are located very near growers.  He  states that fruits
and vegetables destined for canning or freezing are generally
processed within 24 hours of harvest.  Major exceptions are
potatoes and apples, both of which are purchased throughout the

-------
year from storage facilities.  It  is unusual according to Hodgen
for processors to store produce for more than a day or so.
     Hodgen indicates that  individual processing facilities tend
to specialize in one (or a few) commodities, and that freezing
and canning generally do not take  place in  the same facility.
This latter point is supported by  the 1982  Census of Manufactures
(U.S. Department of Commerce, 1982), which  shows that only 2.2
percent of all canned fruits and vegetables are produced in
facilities classified in the frozen fruits  and vegetables
industry, and that only 4.6 percent of all  frozen fruits and
vegetables are produced in  facilities classified in the canned
fruits and vegetables industry.  Hodgen also indicates that
thorough washing of fruits  and vegetables just prior to
processing is routine practice, and that the washing activity is
a water-intensive process.  Most fruits and vegetables being
processed are harvested by growers with no company affiliation
with the processors.  Instead, most produce destined to be
processed has been contracted for  a year in advance by individual
processing facilities.
     Conner (1988) estimated state-by-state 1982 production of
frozen fruits and vegetables, canned fruits and vegetables, and
dried fruits and vegetables.  These estimates have been combined
by summation into a single  variable titled  "processing value of
shipments" to compare with  state estimates of acres of land
producing vegetables and fruits in 1982 and state population to
help establish whether produce processing plants are concentrated
in areas of raw produce production.
     A multiple regression with 50 (state) observations of
processing value of shipments (in  millions of 1972 dollars) on
acres of land in fruit and  vegetable production (in thousands of
acres)  and population (in thousands)  yields the following result:

     processing value
     of shipments      =  39.098 + 1.543 acres + .012 population
                                   (t=15.639)    (t=1.345)


-------
R-square = .925, N=50 (states), DF=49.
Note the highly significant coefficient on the acres variable (as
proxy for where produce is grown) and the insignificant
coefficient on the population variable  (as proxy for where
processed produce is consumed).  The results are consistent with
a hypothesis that processing is concentrated in places  (states)
of fruit and vegetable production.
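
The significance judgment above can be illustrated by comparing each reported t statistic with a critical value.  The sketch assumes a two-sided test at the 5 percent level, for which the critical t value with roughly 47 to 49 degrees of freedom is approximately 2.01; the significance level is an assumption, since the text does not state one.

```python
# Reported t statistics for the regression coefficients.
t_statistics = {"acres": 15.639, "population": 1.345}

# Assumption: two-sided 5 percent test; critical t is about 2.01 near 47-49 df.
CRITICAL_T = 2.01

# A coefficient is judged significant if |t| exceeds the critical value.
significant = {name: abs(t) > CRITICAL_T for name, t in t_statistics.items()}

for name, flag in significant.items():
    verdict = "significant" if flag else "not significant"
    print(f"{name}: t = {t_statistics[name]}, {verdict}")
```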
     Most fruits (97%) and vegetables (85%) are processed in two
industries: SIC 2033  (canned fruits and vegetables) and SIC 2037
(frozen fruits and vegetables).  The size (through-put)
distributions of facilities classified in these two industries are
depicted in Figures 1 and 2, respectively.  About 86.5 percent of
all fruit and vegetable canning is done in fruit and vegetable
canning facilities, 2.2 percent is done in fruit and vegetable
freezing facilities, 3.5 percent is done in specialty canning
facilities, and negligible amounts of canning are done in other
industries.  About 93 percent of all fruit and vegetable freezing
occurs in fruit and vegetable freezing facilities, 4.6 percent
occurs in fruit and vegetable canning facilities, and negligible
amounts are done in other industries (1982 Census of
Manufactures).
     The Directory of the Canning, Freezing, and Preserving
Industries  (Judge, 1988) gives the name, location, volume,
product lines and other information on almost every canning,
freezing, and preserving facility in the U.S.  These data would
prove useful in identifying where different commodities are
processed, assessing the proximity of processors in relation to
growers, determining what types of processed commodities are
produced in the same plant, examining the size distribution of
factories producing different types of commodities, and exploring
other questions.

-------
[Figure 1.  Sales-Size Distribution of Fruit and Vegetable Canning
Facilities (Canned Fruits & Vegetables, SIC 2033).  Vertical axis:
number of establishments; horizontal axis: annual sales ($).
Source: 1982 Census of Manufactures.]

-------
[Figure 2.  Sales-Size Distribution of Fruit and Vegetable Freezing
Facilities (Frozen Fruits & Vegetables, SIC 2037).  Vertical axis:
number of establishments; horizontal axis: annual sales ($).
Source: 1982 Census of Manufactures.]

-------
     Canned and  frozen  fruits  and  vegetables  are also imported
into the U.S.  By  value,  about 7 percent  of canned  foods  consumed
in the U.S. in 1987  were  imported  as  canned foods.   Six products
accounted  for 50 percent  of  all canned  food imports:  orange  juice
(20%), apple juice (9%), mushrooms (7%), pineapples (7%), olives
(4%), and  tomato products (3%).  This same year,  5  countries
accounted  for over half of all canned food imports:  Brazil  (19%),
Spain  (10%), Mexico  (9%),  Taiwan (7%),  and the  Philippines  (6%)
(U.S. Department of  Commerce,  1988).
     The share of  all frozen foods consumed in  the  U.S. imported
from other countries is small  - about 3 percent in  1987.
Virtually all imported frozen foods (90% in 1987) come from 5
countries: Mexico  (56%),  Canada (17%),  Taiwan (7%),  Guatemala
(5%), and  New Zealand (4%).  Broccoli was  the largest imported
frozen item, accounting for  32 percent  of  the total,  followed by
strawberries (24%),  peas  (8%),  and cauliflower  (7%).   Nearly all
of the largest two imports,  broccoli  and strawberries,  came  from
Mexico (U.S. Department of Commerce,  1988).
     Annual data on  imports  into the  U.S.  of  specific commodities
by country of origin are  conveniently summarized  in  the
International Trade  section  of the Almanac of the Canning,
Freezing,  and Preserving  Industries  (Judge, 1989).  For example,
one can determine that the United States imported 7.3 million (24
lb.) cases of canned tomatoes from six specified and other
unspecified countries in  1988,  and that 59 percent of these  (4.3
million cases)  came  from  Italy.
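
The import figures in this section can be cross-checked with simple arithmetic.  The sketch uses only the percentages and quantities quoted above; variable names are illustrative.

```python
# 1988 canned tomato imports (million 24-lb. cases) and the share from Italy.
canned_tomato_cases_million = 7.3
italy_share = 0.59
italy_cases_million = canned_tomato_cases_million * italy_share

# 1987 shares (percent) of canned food imports by product and by country.
canned_product_shares = {"orange juice": 20, "apple juice": 9, "mushrooms": 7,
                         "pineapples": 7, "olives": 4, "tomato products": 3}
canned_country_shares = {"Brazil": 19, "Spain": 10, "Mexico": 9,
                         "Taiwan": 7, "Philippines": 6}

# 1987 shares (percent) of frozen food imports by country.
frozen_country_shares = {"Mexico": 56, "Canada": 17, "Taiwan": 7,
                         "Guatemala": 5, "New Zealand": 4}

print(f"cases from Italy: {italy_cases_million:.1f} million")
print(f"six canned products: {sum(canned_product_shares.values())}%")
print(f"five canned-food countries: {sum(canned_country_shares.values())}%")
print(f"five frozen-food countries: {sum(frozen_country_shares.values())}%")
```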
2.3  Storage Information
     The main purpose of storage is to make goods available at
desired times.  All commodities are "stored" in some sense, if
only during transportation, processing, wholesaling, and
retailing - all of which take time.  It is appropriate to think
of everyone from farmer to consumer as doing some storing, though

-------
   most is done by specialized agricultural businesses (Rhodes,
   1987).
        Rhodes (1987) reports that while U.S. farmers store enormous
   quantities of feeds and grain, produce (with the exception of
   potatoes) is rarely stored by farmers.  Lace (1989) indicates
   that the three main storable commodities (potatoes, apples, and
   onions) are generally stored in combination packing/storage shed
   operations which may be co-operatively owned by farmers or owned
  by private specialty companies.  He advises that these storage
  facilities are almost always located very near the farms where
  the commodities are harvested.
       Kohls and Uhl  (1985) also indicate that storage is carried
  out at every level of the food industry,  and that storage is
  interrelated with other  marketing functions  like transportation
  and processing.   They report it is extremely difficult to
  estimate the amount  of food in  storage  at any  time,  or even  to
  measure  total food storage  capacity available.
      Establishments  engaged primarily in  the unrefrigerated
  storage  of produce are classified in SIC  4221  "Farm  Product
  Warehousing and  Storage."   Those  primarily engaged in
  refrigerated storage of produce are classified in SIC 4222
  "Refrigerated Warehousing and Storage."
      Once fresh commodities  leave storage, they are usually
 consumed fresh, or processed, within a week, according to Lace
 (1989).  It is rare for fresh produce destined for
 processing  to be stored any appreciable  length of time by
 processors  themselves.   Processors receive their  produce from the
 same storage  facilities as wholesalers of  fresh-form produce.
 Lace could not suggest any written references that describe the
 numbers or characteristics of storage facilities.   Hodgen  (1989)
 confirms  this  information.   He states that most fruits and
 vegetables are processed  within  24 hours of harvest,  and that
 storage of fresh  crop per  se is  rare at  processing  facilities.
     "Perishable" commodities like berries, bananas,  lettuce,
spinach, and tomatoes are not "stored" in the same sense as
apples, potatoes, and onions.  The year-round availability of
perishable fresh-form produce, whether for consumption or
processing, depends on year-round harvesting in different states,
on imports, and on methods like greenhouse production.
Perishables are transferred as quickly as possible from one point
in commerce to the next and are not intentionally stored (Lace,
1989).
     In September 1986, the Agricultural Research Service of the
USDA revised its handbook titled "The Commercial Storage of
Fruits, Vegetables, and Florist and Nursery Stocks" (USDA, 1986).
The handbook describes how refrigeration, chemical treatment and
fumigation, atmospheric control and modification, waxing and
surface coating, irradiation, and protective packing can be and
are used to prolong the shelf life of fresh produce.
     The handbook discusses the normal and state-of-the-art
storage practices for about 40 individual fruits, including
varieties of certain fruits.  For example, normal and maximum
storage times are provided for 17 varieties of fresh apples.
Most fruits can be stored successfully for more than a week but
less than 6 weeks.  Of the 40 fruits discussed, only 11 have
approximate storage lives given in months rather than weeks:
apples, cranberries, coconuts, dates, vinifera grapes, kiwifruit,
lemons, pears, persimmons, pomegranates, and quinces.  Fruits
with storage lives of only days include several types of berries,
cherries, and figs.
     The handbook also discusses the normal and state-of-the-art
storage practices for about 65 individual vegetables, including
some varieties.  The approximate storage life of vegetables under
ideal conditions varies greatly across commodities.  Important
vegetables with storage lives under 1 month include asparagus,
beans,  broccoli, cauliflower, sweet corn, cucumbers, lettuce,
green peas, spinach, and tomatoes.   Important vegetables with
storage lives in excess of three months include cabbage, carrots,
onions, potatoes, and sweet potatoes.
      Edwards (1989) reports that in practice only potatoes,
 apples, and some onions, peaches, and pears are stored.  Most
 other commodities are  shipped  from the point of origin very
 quickly.  He  also reports that almost all commodities that are
 stored are stored in the  state where they are harvested.

 2.4   Marketing  Channels
     2.4.1  Fresh Produce
     As many as four generic types of businesses are typically
involved in bringing produce from growers to ultimate consumers
for fresh-form consumption: shipping-point establishments,
terminal markets, wholesalers, and retailers.  Each of these  four
types of businesses can be meaningfully sub-divided, and some may
be by-passed in the chain of commerce under certain
circumstances.
     Shipping point establishments receive large volumes of
produce from numerous growers, prepare the products for market
(performing such functions as sorting, grading, cleaning, and
packaging), and distribute the products to points further
downstream.  Dickerson  (1990)  indicates that shipping point
establishments can be classified in either SIC 0723 or 5159.
     Facilities classified in SIC 0723 (Crop Preparation Services
for Market) are contract-service establishments that do not "buy"
and "sell" produce, but rather handle it on a fee-for-service
basis.  Facilities classified in SIC 5159 (Farm Product Materials
Wholesale Trade, Not Elsewhere Classified) perform the same types
of services, but actually purchase, handle, and re-sell produce.
In 1987, an estimated 518 establishments were classified in
"wholesale trade of other farm product raw materials, excluding
hides, tobacco, wool, and cotton".  It is unknown how many of
these 518 establishments deal in fruits and vegetables (U.S.
Department of Commerce, 1987(b)).  These 518 establishments are
described as assemblers of farm products that purchase directly
from farmers and market farm products at wholesale (U.S.
Department of Commerce, 1987 Census of Wholesale Trade).  Most of
these establishments (436) are described as "merchant
wholesalers" that "buy and sell merchandise on their own
account".  The other 82 facilities are described as "agents,
brokers, and commission merchants" that, unlike merchant
wholesalers, are essentially hired by others to buy and sell
merchandise.
     Shipping-point firms distribute fresh produce to three main
types of down-stream markets: wholesalers, large retailers, and
terminal markets.
     Wholesale firms receive produce from shipping-point firms,
break them into smaller lots, and distribute them to retail
establishments.  Most retailers including but not limited to
grocery stores, vegetable and fruit markets, restaurants, and
institutions are too small to cost-effectively buy truck-loads of
produce directly from shipping point firms.  Wholesale
establishments provide this link between shipping points and
retailers.
     In 1987, there were 5,838 wholesale fresh fruit and
vegetable establishments with an average of 17 employees and
sales of $5.2 million each (U.S. Department of Commerce, 1987
Census of Wholesale Trade).  The size (sales) distribution of
fresh fruit and vegetable wholesale establishments in 1987 is not
available from the preliminary Census.  The number of
establishments in each state, however, is available from the Census.
About one-fifth of all wholesale fruit and vegetable
establishments (1,130) are located in one state: California.
Florida and New York are second and third, with 538 and 500
establishments, respectively.
     Facilities classified in other industries also wholesale
fresh fruits and vegetables as "secondary" activities.  These
industries include 5141 (groceries) and possibly others.
     Some retailers are sufficiently large to purchase directly
from shipping-points, by-passing wholesalers per se.  Herein
referred to as  "large retailers", they include large grocery
chains, large restaurant chains, and large institutions.  Some
grocery chains, referred to as integrated wholesale-retail firms,
are large enough to buy produce by the truck-load directly from
shippers.  They purchase fresh produce from shipping points,
break it into smaller lots, and distribute it to their own
stores.  Some large restaurant chains and institutions  (e.g.
military bases) perform the same activities (Erickson, 1990).
     Wholesalers are the main link between shipping-points and
most retailers.  Food retailers are classified in one of two
broad two-digit SIC industries: 54 (food stores) and 58  (eating
and drinking places).
     According to the 1987 Census of Retail Trade (U.S.
Department of Commerce, 1987(a)), there were 190,706 food stores
in the U.S. in  1987.  Of these 190,706 establishments, 72 percent
(137,584) were  full-line grocery stores selling a wide variety of
goods, including fresh vegetables and fruits.  While grocery
stores comprise 72 percent of all food store establishments, they
account for 95  percent of all food sales.
     Other types of food stores for which Census data are
available are seafood markets, retail bakeries, fruit and
vegetable markets, candy stores, dairy stores, and miscellaneous
food stores.  It is reasonable to assume that besides grocery
stores, only fruit and vegetable markets sell a significant
amount of fresh produce.
     Fruit and  vegetable markets are establishments primarily
selling fresh fruits and vegetables,  but may also carry a limited
line of grocery items.  Roadside stands of farmers are not
classified in this industry.  In 1987, 1.7 percent of all food
stores (3,271)  were fruit and vegetable markets.  Fruit and
vegetable market retail sales in 1987 totaled $1.8 billion, only
six-tenths of one percent of all retail food store sales.  The
number of establishments and value of retail sales in 1987 in
each state for  each type of food store is available in the 1987
Census of Retail Trade.
     According to the Census of Retail Trade, there were twice as
many eating and drinking places in the U.S. as grocery stores in
1987, but their retail sales were about one-half those of
grocery stores.  Ninety-five percent of all eating and drinking
places are primarily food rather than beverage servers.  Of the
332,611 eating places surveyed in the 1987 Census, twelve percent
were cafeterias, caterers, contract food service establishments,
or ice cream stands.  The other eighty-eight percent of eating
places were restaurants or refreshment places.
     Of the 292,825 restaurants and refreshment places, 53
percent were the former and 47 percent were the latter.
Restaurants are those establishments where waiters/waitresses
take orders from customers as they are seated.  Refreshment
places are establishments that "...prepare items such as chicken
and hamburgers for consumption either on or near the premises or
for take-home consumption..." and do not have "...waiter/waitress
service where the patron's order is taken while the patron is
seated at a table, booth, or counter."  Mean annual sales of
restaurants in 1987 was about $429,000, compared to about
$412,000 for refreshment places.
     2.4.2  Processed Produce
     Most produce processors (canners, freezers, dehydrators,
etc.) contract with and receive produce directly from growers,
with the raw commodities by-passing shipping-points altogether
(Erickson, 1990).  The activities, locations, and other
information about produce processors have been described above.
This section concentrates on the marketing channels of processed
produce once it leaves processing plants.
     Essentially all processed (mainly canned and frozen) fruits
and vegetables pass through wholesalers on their way to
retailers.  Types of ownership vary more across establishments
than types of activities.  All wholesalers of processed produce
purchase  large shipments from processing plants and distribute
smaller shipments to multiple retail outlets.
     Kenney  (1990) advises that wholesale establishments are
generally owned either by wholesale companies or retail
companies.  Wholesale companies are companies whose primary
business  is wholesaling.  Retail companies are companies for whom
wholesaling is a secondary business, their primary business being
retailing.  Only the largest retail grocery chains are vertically
integrated into wholesaling.  They buy directly from food
processors and distribute exclusively to their own retail stores.
Small chains, even those with 15-20 stores, are too small to buy
directly from food processors, and buy instead from independent
wholesalers.
     In 1987, there were 42,075 independent establishments
classified in wholesale trade of grocery products (SIC 514).
Within this three-digit industry are three four-digit industries
that wholesale most processed fruits and vegetables: SIC 5141
(general line groceries), SIC 5142 (packaged frozen foods), and
SIC 5149  (groceries and related products, not elsewhere
classified).  Other four-digit industries specialize in dairy
products, poultry products, seafood,  meat, etc.
     Using the 1987 wholesale Census data, we can infer that
there are as many as 8,821 establishments that wholesale
significant amounts of canned and frozen produce.  Of this total,
approximately 76 percent are "merchant" wholesalers.  They
purchase processed food commodities of their own accord, and sell
small shipments to multiple retail companies and establishments.
About 20 percent of the establishments are independently owned
but operate as agents or brokers on a contract basis exclusively
for one or several retailers (agents).  The other 4 percent of
establishments are actually owned by food processing companies
(company).  While there are far fewer agent and company
establishments than merchant establishments, the former are
typically considerably larger.  Annual per-establishment sales
for merchant, agent, and company wholesale grocery establishments
are $6.9 million, $18.8 million, and $14.1 million, respectively
(1987 Census of Retail Trade).
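
     The statement that agent and company establishments are fewer
but larger can be illustrated with a rough calculation.  The
sketch below assumes, for illustration only, that the percentage
breakdown of the 8,821 establishments and the per-establishment
sales figures quoted above can be combined directly; the text does
not confirm that the two sets of figures cover exactly the same
establishments, so the resulting shares are indicative at best.

```python
# Inputs quoted in the text; combining them is an assumption of this sketch.
total_establishments = 8821
establishment_share = {"merchant": 0.76, "agent": 0.20, "company": 0.04}
sales_per_establishment = {"merchant": 6.9, "agent": 18.8, "company": 14.1}  # $ million

# Implied total sales ($ million) for each ownership type.
implied_sales = {
    kind: total_establishments * establishment_share[kind] * sales_per_establishment[kind]
    for kind in establishment_share
}
grand_total = sum(implied_sales.values())

# Share of implied sales handled by each ownership type.
sales_share = {kind: amount / grand_total for kind, amount in implied_sales.items()}

for kind in establishment_share:
    print(f"{kind}: {establishment_share[kind]:.0%} of establishments, "
          f"{sales_share[kind]:.1%} of implied sales")
```

Under these assumptions, agents account for a noticeably larger
share of sales than of establishments, which is the pattern the
text describes.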
     While some of  the largest  food  store chains perform their
own wholesaling function,  few if any eating and drinking places
companies do so.  Kenney (1990) estimates that about 95 percent
of all restaurants, refreshment places, etc. purchase processed
foods from merchant wholesalers.   The  other 5 percent of all such
retail establishments, owned by the largest restaurant chains,
also purchase  from  wholesale establishments they do not own.
They do, however, purchase more exclusively under contracts with
agent wholesalers.
     Some processed fruit and vegetable products are  more likely
to be distributed to and consumed  in restaurants and  other eating
places than sold as groceries in food  stores, while the opposite
is true for other commodities.  The  1989 Almanac of the Canning,
Freezing, and Preserving Industries (Judge, 1989) contains data
that can be used to estimate quantities of some (but  not all)
processed food products distributed  to retailers versus
restaurants and institutions.  The Almanac provides information
on quantities  shipped in various pack-size classifications.  Some
pack sizes are suitable mainly for retail sale, while others are
suitable mainly for food service/restaurant sale.
     The Almanac indicates, for example, that 87 percent of all
frozen french  fries shipped in 1988  were in restaurant/food
service packs.  Frozen green bean  shipments were about evenly
split between retail and food service packs.  About 58 percent of
all 1988 frozen spinach was shipped  in retail packs.  Estimates
of shipments of canned and jarred commodities by size of
container are also  available from the Almanac, and could be used
to estimate shipments to retail food stores versus restaurants
and institutions.
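
     A minimal sketch of how the pack-size shares might be turned
into a retail versus food service split follows.  The three shares
are those quoted above; the function name and the 5-point
tolerance used to call a split "even" are hypothetical choices
made for illustration.

```python
# Share of 1988 frozen shipments in retail packs, from the figures in the text.
retail_pack_share = {
    "frozen french fries": 1 - 0.87,  # 87 percent shipped in food service packs
    "frozen green beans": 0.50,       # "about evenly split"
    "frozen spinach": 0.58,
}

def dominant_channel(retail_share, tolerance=0.05):
    """Label a commodity by the channel that takes the larger share.

    The tolerance is a hypothetical cutoff for treating a split as even.
    """
    if abs(retail_share - 0.5) <= tolerance:
        return "even"
    return "retail" if retail_share > 0.5 else "food service"

for commodity, share in retail_pack_share.items():
    print(f"{commodity}: {share:.0%} retail, {1 - share:.0%} food service "
          f"-> {dominant_channel(share)}")
```

Multiplying each share by the total quantity shipped would give
estimated quantities moving through each channel.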
               MOVEMENT  OF COMMODITIES IN COMMERCE
                     ANNOTATED BIBLIOGRAPHY
Beilock and Portier,  1989.  Richard Beilock and Kenneth Portier.
Using USDA Fresh  Fruit and Vegetable Arrivals to Determine the
Distribution of a State's Production.  Northeastern Journal of
Agricultural and Resource Economics 18(1), April, pp. 35-45.
They report that  the  arrival data  (USDA  1989  (a) and  (b)) offer
the best source (with limitations) of  information on  the
geographic distribution of fresh fruits  and vegetables.  The
cities at which arrivals data are gathered act as distribution
points; however,  there are no data on  the distribution of the
fresh produce from these cities.  They demonstrate a  fairly
straightforward method to determine the  amount of each commodity
shipped from each of  the 50 states to  each region of  the country
(South, Northeast, Lakes, and West).

Carman and French, 1988.  Hoy F. Carman  and Ben C. French.
Economics of Fruit and Vegetable Processing in the United States.
In Economics of Food Processing in the United States.  Chester O.
McCorkle, Jr., ed.  Academic Press.

Conner, 1988.  John M. Conner.  Food Processing: An Industrial
Powerhouse in Transition.  Lexington:  Lexington Books.

Dickerson, 1990.   Telephone conversation between Donald W.
Anderson (RTI) and Dale Dickerson, Industry and Commodity
Classification Branch, Bureau of the Census.  January 9.

Edwards, 1989.  Telephone conversation between Donald W. Anderson
(RTI) and Douglas Edwards, Market News Branch, Fruit  and
Vegetable Division, Agricultural Marketing Service, USDA.
November 14.

Erickson, 1990.   Telephone conversation  between Donald W.
Anderson (RTI) and Lynn Erickson, Market News Branch, Fruit and
Vegetable Division, Agricultural Marketing Service, USDA.
January 9.

Hodgen, 1989.  Telephone conversation  between Donald  W. Anderson
(RTI) and Donald  Hodgen, Consumer Commodities Branch,
International Trade Administration, US Department of  Commerce.

Judge, 1988.  The Directory of the Canning, Freezing, and
Preserving Industries: 1988-1989.  Edward E. Judge and Sons, Inc.
Westminster, MD.   Gives the name, location, volume, product lines
and other information on almost every  canning, freezing, and
preserving facility in the U.S.  This  data could be used to
identify where different commodities are processed, assess the
proximity of processors in relation to growers, determine what
types of processed commodities are produced in the same plant,
examine the size distribution  of  factories  producing  different
types of commodities,  and  explore other questions.

Judge, 1989.  The Almanac  of the  Canning, Freezing, and
Preserving Industries:  1989.   Edward  E. Judge and Sons,  Inc.
Westminster, MD.  Provides information on the production and
disposition (fresh and processed) of fruits and vegetables by
commodity  (and  sometimes variety), state, year, and sometimes
processed  form.  Section IX, International  Trade and  World Packs,
provides annual data of specific  commodities on exports  from the
U.S. by country of destination and imports  into the U.S. by
country of origin.

Kenney, 1990.   Telephone conversation between Donald  W.  Anderson
(RTI) and  Cornelius Kenney, International Trade Administration,
US Department of Commerce.  January 11.

Kohls and  Uhl,  1985.   Richard  L.  Kohls and  Joseph N.  Uhl.
Marketing  of Agricultural  Products.   New York. MacMillan.

Lace, 1989.  Telephone conversation between Donald W. Anderson
(RTI) and Larry Lace,  Fresh Products  Branch, Fruit and Vegetable
Division, Agricultural  Marketing  Service, USDA.  November 20.

Lopez and Henderson, 1989.  Rigoberto A. Lopez and Nona  R.
Henderson.  The Determinants of Location Choices for  Food
Processing Plants.  Agribusiness  5(6).  pp. 619-632.  "This
article examines the determinants  of  location choices for new
food processing plants  using the  results of a telephone  survey.
Six categories  of business climate factors  (market,
infrastructure, labor,  personal,  environmental regulation, and
fiscal policy)  containing  41 specific location factors are
considered.  The survey responses  are analyzed in their  entirety,
by types of raw products processed, and by plant size.   Findings
indicate that plant location choices  are driven by market and
infrastructural factors.   Fiscal  policies such as tax and
development incentives  are insignificant.   Implication of the
findings for devising  incentive packages to attract new  plants
are given."  Proximity  to  raw  agricultural commodities was an
important location factor  for  about three-quarters of the
respondents.

Rhodes, 1987.    V. James Rhodes.  The  Agricultural Marketing
System.  Wiley  and Sons.

USDA, 1986.  Agricultural Research Service, US Department of
Agriculture Handbook Number 66.  The  Commercial Storage  of
Fruits, Vegetables, and Florist and Nursery Stock.  Describes how
refrigeration,  chemical treatment  and fumigation, atmospheric
control and modification, waxing  and  surface coating,
irradiation, and protective packing can be and are used  to
prolong the shelf life of fresh produce.  This handbook discusses
the normal and state-of-the-art storage practices for about 40
individual fruits and about 65 individual vegetables, including
varieties of certain fruits and vegetables.  The approximate
 storage  life of fruits and vegetables under ideal conditions
 varies greatly across commodities.

 USDA, 1988(a).  Agricultural Statistics, 1988.  Provides acres
 planted  and harvested, and quantities produced of many individual
 crops and some processed commodities by state and sometimes by
 country.  Also provides data on exports and imports of some
 crops.  Updated annually.

 USDA, 1988(b).  Economic Research Service Statistical Bulletin
 No. 773.  Food Consumption,  Prices, and Expenditures, 1966-1987.
 US Department of Agriculture.  Contains estimated per capita
 consumption of individual fresh and processed foods, which could
 be used  to establish the proportions of individual foods consumed
 fresh and processed in various forms.

 USDA, 1989(a).  Fresh Fruit  and Vegetable Shipments.  Calendar
 Year 1988.  Market News Branch, Fruit and Vegetable Division,
 Agricultural Marketing Service, US Department of Agriculture.
 April.   Reports domestic rail, truck, piggyback, air, and exports
 of fresh fruits and vegetables by state, commodity, month, and
 mode of  transportation.  Data are reported for all states and all
 months for each of about 80  specific commodities.  The data are
 also available by variety for some varieties of some commodities
 (e.g. iceberg v. romaine lettuce, and bell v. "other" peppers),
 though the varietal information is very limited.  Imports of
 commodities by country of origin and by month are also reported.
 Truck and air shipments and  exports for all commodities are not
 available; therefore, those  that are reported should not be
 interpreted as complete.

 USDA, 1989(b).  Fresh Fruit  and Vegetable Arrivals in Western
 Cities, Calendar Year 1988; and Fresh Fruit and Vegetable
 Arrivals in Eastern Cities, Calendar Year 1988.  Market News
 Branch,   Fruit and Vegetable  Division, Agricultural Marketing
 Service, US Department of Agriculture. April.  These documents
 (one for western cities and  the other for eastern cities) provide
data on  arrivals at 20 US cities on about 75 specific fresh
commodities by month, by state or country of origin, and by
 several modes of transportation (rail, truck, and air).
USDA, 1989(c).  Economic Indicators of the Farm Sector: State
Financial Summary, 1988.  Economic Research Service, US
Department of Agriculture.  Shows 1988 farm-level cash receipts
in each state by commodity groups (e.g. "vegetables", "fruits",
"poultry" and "milk") and selected commodities.  In states where
receipts warrant it, there is greater detail and some commodities
are reported by variety and season.   This publication could be
used to establish the origins of commodities, but not their
distribution.  Amounts are given in dollar values rather than
weight or acreage.

U.S. Department of Commerce, 1982. 1982 Census of Manufactures.
Processed Fruits and Vegetables Industries Series.  Provides
information on the disposition of harvested fruits and
vegetables.   Table 7 of the series ("Materials Consumed by
Kind") shows the quantities and costs of various fruits,
vegetables, sugar, meats, and other ingredients utilized by
facilities classified in several food processing industries.
The industries covered are Canned Specialties  (SIC 2032), Canned
Fruits and Vegetables (SIC 2033),  Dehydrated Fruits, Vegetables,
and Soups (SIC 2034), Pickles, Sauces, and Salad Dressings (SIC
2035), Frozen Fruits and Vegetables (SIC 2037), and Frozen
Specialties  (SIC 2038).  These data could be used to estimate the
shares of selected commodities that are canned, frozen, or
otherwise processed.

U.S. Department of Commerce, 1987. 1987 Census of Agriculture.
Provides acres, quantities, and value of many individual crops
harvested by state and by county.

U.S. Department of Commerce, 1987(a).  1987 Census of Retail
Trade.

U.S. Department of Commerce, 1987(b).  1987 Census of Wholesale
Trade.

U.S. Department of Commerce, 1988.  International Trade
Administration.  1988 Industrial Outlook.

-------
                APPENDIX 3

STATISTICAL DESIGN AND ANALYSIS OF SURVEYS
    FOR PESTICIDE RESIDUES IN THE DIET

-------
3.1  Introduction
     3.1.1  Purpose of Appendix
     The purpose of this appendix is to provide an overview of
the statistical development of surveys to estimate the amount of
pesticide residues in food at the time of consumption.  In most
situations, these surveys will apply to pesticides that are in
general use at the time of the survey but require more refined
residue estimates to justify their continued use.  Such surveys
may be part of the Reregistration or Special Review processes,
and may be specified by EPA in a Data Call-in Notice  (DCI), under
the authority of Section 3(c)(2)(B) of FIFRA.
     The appendix begins with a brief discussion of survey
sampling.  The discussion is purposely nontechnical and includes
such topics as defining a survey population, constructing a
sampling frame, stratification, clustering, sample selection
methodology, sample size considerations, types of estimates, and
sampling and nonsampling errors.  Specific consideration is given
to how issues such as percent of crop treated with pesticide,
minor crops, population subgroups, residue hotspots, and factors
that cause changes in pesticide use (e.g., unusual weather)
affect the survey design.
     Next, the advantages and disadvantages of sampling raw
agricultural commodities, wholesale establishments, retail food
stores, and households are discussed.  Specific methodological
issues associated with sampling at each of these points in the
food processing and distribution chain are discussed, including
acquiring and developing a sampling frame, defining strata and
clusters, and sample selection techniques.
     Decision rules for determining the appropriate sampling
point for a given pesticide also are presented.  Factors
considered in the decision rule include:  persistence of the
pesticide, limits of quantitation, type of crop to which
pesticide is applied, and patterns of movement of affected
commodities in commerce.
     This Appendix presents the issues to be considered when
designing pesticide residue surveys, describes the concerns that
need to be addressed, and identifies possible biases to be avoided.
Since the purpose of each survey is unique, survey designs must
be tailored to fit each situation to provide statistically valid
results.
     3.1.2  General Approach to Survey Sampling
          3.1.2.1   Survey Populations
          A survey population is a set of elements about which
information is sought.  For a survey of a pesticide residue in
the diet, the survey population might comprise all adults and
children living in the United States; the information sought
might be the mean concentration of the pesticide residue in food
consumed by members of the population over a certain period of
time.  Because the desired information (e.g. mean) is a function
of the amount of pesticide residue consumed by each member of the
survey population, it is regarded as a population parameter.
Survey sampling deals with methods for selecting and observing a
part (sample) of the population in order to estimate such
parameters.
     A sample value is an estimate of a population parameter that
is computed from the elements in the sample.  To be considered
valid in a statistical sense, any inferences drawn from a sample
must be supported by the probability structure that gave rise to
its selection.  The underlying probability structure provides
the required link between the sample and the population.  The
specification of the probability structure is referred to as the
sample design.  The requirements for a valid sample design are
that:
     •    every element of the population be assigned a non-zero
          probability of selection into the sample, and,
     •    the randomization procedure used to select the sample
          generates, on average, the assigned probabilities for
          each element of the population.

Given these requirements, specific design issues then center on
assigning the probabilities in such a manner as to obtain
acceptable levels of precision for acceptable levels of cost.
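
     The two requirements above can be illustrated with a small
simulation.  This is a minimal sketch, not a recommended survey
design: it assumes independent (Poisson) selection of each
element, uses a hypothetical five-element population with made-up
residue values and selection probabilities, and weights each
sampled value by the inverse of its selection probability (the
Horvitz-Thompson estimator) to estimate the population total.

```python
import random

def poisson_sample(probabilities, rng):
    """Include element i independently with its assigned probability."""
    return [i for i, p in enumerate(probabilities) if rng.random() < p]

def horvitz_thompson_total(values, probabilities, sample):
    """Estimate the population total, weighting each sampled value by 1/p."""
    return sum(values[i] / probabilities[i] for i in sample)

rng = random.Random(12345)

# Hypothetical residue amounts for a five-element population.
values = [0.2, 0.5, 0.1, 0.9, 0.4]
# Requirement 1: every element has a non-zero probability of selection.
probabilities = [0.3, 0.5, 0.2, 0.8, 0.4]
assert all(0 < p <= 1 for p in probabilities)

replications = 20000
counts = [0] * len(values)
estimates = []
for _ in range(replications):
    sample = poisson_sample(probabilities, rng)
    for i in sample:
        counts[i] += 1
    estimates.append(horvitz_thompson_total(values, probabilities, sample))

# Requirement 2: on average, the procedure reproduces the assigned probabilities.
observed = [c / replications for c in counts]
mean_estimate = sum(estimates) / replications

print("assigned probabilities:", probabilities)
print("observed frequencies:  ", [round(o, 3) for o in observed])
print("true total:", round(sum(values), 3), "mean estimate:", round(mean_estimate, 3))
```

When both requirements hold, the average of the weighted estimates
over repeated samples settles near the true population total,
which is the link between sample and population that the
probability structure provides.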
          3.1.2.2   Sampling Frames
          A listing of the elements of a population, or sampling
frame, is indispensable for probability sampling.  Sampling frames
can be lists, and/or procedures that account for all of the
elements in a population without the physical effort of listing
them.  For example, a sampling frame for the population of retail
food stores in the U.S. could be enumerated with a listing of all
counties or county-equivalents in the country and a set of maps
that identify block and segment boundaries and all roads,
landmarks, and buildings within them.  Then, procedures could be
developed to uniquely identify all retail food outlets within
each block or segment.  Usually, such procedures require on-site
enumeration within the selected blocks or segments and have
special instructions for including population elements not on the
maps.  The use of geographic units for the enumeration of a
population is known as area sampling.
     Alternatively, the sampling frame could be a national
listing of retail food outlets.  Such lists are available from a
number of companies (e.g. Dun & Bradstreet) for a wide range of
commercial establishments.  In addition to providing the name and
location of an establishment, such lists often contain additional
information such as total sales, chain affiliation, and other
data that are useful for clustering, stratifying, and selecting
the sample.  The use of lists to enumerate a population is known
as list sampling.
     At first glance, the ease of selecting a sample from a list
would seem to make list sampling the procedure of choice whenever
possible.  However, list sampling frames can be vulnerable to
certain kinds of non-coverage that can be avoided with area
sampling.  Non-coverage, which is the systematic exclusion of a
subset of population elements, can result in biased sample
estimates if the characteristics under study differ between the
elements included and those excluded from the sampling frame.
For example, a listing of retail stores may only include
establishments that exceed a certain sales volume.  If the small,
excluded food stores tend to have, say, different food items or
longer shelf times than their large, included counterparts,
biased estimates may result.
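
     The size of a non-coverage bias can be sketched with simple
arithmetic.  The store counts and mean shelf times below are
hypothetical, chosen only to show that the bias depends on both
the share of the population excluded and the difference between
the included and excluded groups.

```python
def noncoverage_bias(n_included, mean_included, n_excluded, mean_excluded):
    """Return (true mean, frame mean, bias) when a frame omits a subgroup."""
    total = n_included + n_excluded
    true_mean = (n_included * mean_included + n_excluded * mean_excluded) / total
    # A sample drawn only from the frame estimates the included group's mean.
    frame_mean = mean_included
    return true_mean, frame_mean, frame_mean - true_mean

# Hypothetical: 800 listed large stores with a 5-day mean shelf time and
# 200 unlisted small stores with a 9-day mean shelf time.
true_mean, frame_mean, bias = noncoverage_bias(800, 5.0, 200, 9.0)
print(f"true mean {true_mean:.1f} days, frame mean {frame_mean:.1f} days, "
      f"bias {bias:+.1f} days")

# If the excluded stores do not differ, exclusion introduces no bias.
_, _, no_bias = noncoverage_bias(800, 5.0, 200, 5.0)
print(f"bias when groups are alike: {no_bias:+.1f} days")
```

The second call shows the condition stated in the text: bias
arises only if the characteristic under study differs between
included and excluded elements.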
     Farmers' markets and roadside stands would be particularly
vulnerable to exclusion from business lists.  Further, certain
characteristics of these establishments such as the relatively
short time between harvest and sale (when nonpersistent
pesticides are being investigated) and a more rural customer base
could make them significantly different than other types of food
stores.  As a result, systematically excluding them from a
sampling frame could result in the survey underestimating the
amount of nonpersistent residues or failing to detect a
difference in the residue levels between the urban and rural
subpopulations.
     If area sampling is used to enumerate retail food outlets,
procedures can be developed to include any establishment within
the selected geographic area.  Of course, the effectiveness of
such procedures will depend on careful consideration of factors
like the size of the area to be enumerated or the seasonal
availability of the food items being surveyed.
     In addition to providing a means to enumerate the elements
of a population, a sampling frame also can provide
characteristics of the elements which, when used properly, can
greatly enhance the efficiency of a sample design.  For example,
the annual sales of a retail food store can provide an indication
of the size of the store.  If all of the stores in a population
can be classified on the basis of annual sales as, say, large,
medium, or small, steps can be taken to control the distribution
of the sample of stores with respect to their size.  The most
common uses of such  characteristics in survey sampling are in
stratification and clustering.

          3.1.2.3    Stratification

          Stratification is essentially a precision-increasing
device by which a population  is divided into several
subpopulations or strata, each of which is then  separately
sampled and the results properly combined to give estimates for
the population as a  whole.  Stratification may be used to:

     •    Decrease the variances of sample estimates;
     •    Enable  the use of different methods and procedures in
          different  strata; and
     •    Control the distribution of the sample with respect to
          different  subpopulations of interest.

     A stratified sample may or may not resemble the population
from which it was selected.  In proportionate sampling, a sample
is selected separately from each stratum,  and is made
proportionate to the population size of the stratum.   The
variance of the overall estimate is decreased to the degree that
the stratum means diverge and that homogeneous population
elements exist within strata.  For example, when estimating
pesticide levels for the total volume of a commodity sold, food
items that have seasonal peaks in consumption (e.g. turkeys or
watermelons)  may need to be proportionally allocated across
seasons to accurately reflect their consumption patterns.
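     A minimal sketch of proportionate allocation follows.  The
seasonal sales figures and the function name are hypothetical and
serve only to illustrate the arithmetic; they are not data from
any survey.

```python
def proportional_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # Unrounded shares; rounding to whole units is left to the designer.
    return {name: n * size / total for name, size in stratum_sizes.items()}


# Hypothetical seasonal sales volumes (arbitrary units) for one commodity.
sales_by_season = {"winter": 100, "spring": 200, "summer": 600, "fall": 100}
allocation = proportional_allocation(sales_by_season, 200)
print(allocation)
```

Here the summer stratum, which accounts for 60 percent of the
assumed sales, receives 60 percent of the sample.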
     Disproportionate sampling can result in a sample
distribution that is quite different from the population
distribution.  In an optimum allocation, where different sampling
rates are used deliberately in the different strata, the variance
per unit cost is decreased by increasing the sampling fractions
in strata having higher variation or lower sampling cost.  When
estimating pesticide residue levels, variation usually increases
with the amount of pesticide used.  If fresh produce is
regionally distributed, and if the concentration of pesticide
residue on fresh produce is thought to be greater in the
southeast than in other regions of the country, the sampling
could be concentrated in the southeast even though the northeast
accounts for a greater proportion of the total consumption of fresh
produce.
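     One common way to express an optimum allocation is to make
each stratum's sample size proportional to N_h * S_h / sqrt(c_h),
where N_h is the stratum size, S_h its standard deviation, and
c_h the cost per sampled unit.  The sketch below assumes that
textbook formula; the regional figures are hypothetical.

```python
import math


def optimum_allocation(strata, n):
    """strata maps name -> (N_h, S_h, c_h); returns unrounded n_h values."""
    # Score each stratum by size * standard deviation / sqrt(unit cost).
    score = {name: N * S / math.sqrt(c) for name, (N, S, c) in strata.items()}
    total = sum(score.values())
    return {name: n * s / total for name, s in score.items()}


# Hypothetical values: the southeast holds less of the population
# but has much more variable residue levels; unit costs are equal.
strata = {"southeast": (300, 4.0, 1.0), "northeast": (700, 1.0, 1.0)}
alloc = optimum_allocation(strata, 100)
print(alloc)
```

Under these assumed inputs the southeast receives the larger
share of the sample even though it is the smaller stratum.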
     Alternatively, a  sample  may be disproportionately allocated
to enable the detection of  differences between two or more
subpopulations.  For example, assume that the residue levels in
local brands of a  food item are thought to be different than in
national brands, but that the local brands account for only a
small proportion of the total sales.  A sample that is
proportionally  allocated on the basis of sales will contain too
few local brand food items  to allow accurate estimates of
residues in local  brands.   As a result, the ability (or power) of
the sample to detect a difference between the brands will be low.
If the sample is equally, rather than proportionally, allocated
to each stratum, the increased accuracy of the local brand
estimates will  outweigh the decreased accuracy of the national
brand estimates, assuming equal variances in each stratum.  As a
result, the detection capability (power) of the sample will be
increased.
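     The gain from equal allocation can be checked with a short
calculation.  The sketch assumes independent samples from the two
strata with the same per-unit variance, and a hypothetical total
of 100 items of which local brands hold a 10 percent sales share.

```python
def variance_of_difference(n_local, n_national, unit_variance=1.0):
    """Variance of (local mean - national mean) for independent samples."""
    return unit_variance / n_local + unit_variance / n_national


# Proportional allocation: 10 local and 90 national items.
proportional = variance_of_difference(10, 90)
# Equal allocation: 50 items from each stratum.
equal = variance_of_difference(50, 50)
print(proportional, equal)
```

The smaller variance under equal allocation is what raises the
power to detect a difference between the brands.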

          3.1.2.4   Cluster Sampling

          Cluster  sampling  has a superficial resemblance to
stratified sampling in that a cluster,  like a stratum, is a
grouping of the  elements of  a  population.  However, the  selection
process is quite different for the  two methods.  When  a
population is stratified, every  stratum  is sampled;  when a
population is clustered, selections are  made among the clusters
in the same way  that  individual  population elements are  selected.
As a result, the two  sampling  methods have sharply contrasting
properties:  stratified sampling usually results in increased
precision;  cluster sampling usually results in decreased
precision.  Thus, cluster sampling  is useful only when there are
compensating savings  in data collection  costs which more than
offset the loss  in precision,  so that the precision per  unit cost
is actually increased.
     For most surveys of pesticide  residues in the diet, a
geographically clustered sample  will result in sizeable
reductions in data collection  costs compared to simple random
sampling, regardless  of whether  the population elements  are
crops, food items, or persons.   Compare  two samples:   the first
is a random sample of independently selected elements, say retail
food stores;  the second is  a  cluster sample with the  same number
of stores selected from a sample of counties.  Because the number
of counties is controlled in the cluster sample, there will be
significant differences in data  collection costs (e.g.,  travel
costs) between the two samples.  However, the difference between
the variances of the  sample  estimates also may be significant.
     When clusters are sampled,  homogeneity of population
elements, which  is a desirable attribute for strata, is  an
undesirable attribute for clusters.  In  fact, for maximum
precision in cluster  sampling, clusters  should be formed so that
the elements within each cluster vary as much as possible (Stuart
1984).  Cluster  sampling will be less precise than random
sampling if the  elements within a cluster vary less on average
than do the elements  in the  population as a whole.
     For example, consider the situation where a regional
estimate of the  pesticide levels in milk is desired.   Also,
assume that there is a single dairy within each county in the
region which supplies most of the fresh milk to retailers in that
county.  If a sample of counties is selected and each retail
store within a selected county is visited, very little additional
variation in the pesticide residue levels of milk consumed in the
county will be accounted for after the first retail store in the
county has been sampled.  As a result, the effective sample size
of the clustered sample will be less than that of a random sample
of the same size.  Of course, a random sample of retail stores in
the region would include many more counties, on average, than the
clustered sample.
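     The loss of information in the milk example can be
approximated with the widely used relation
deff = 1 + (m - 1) * rho, where m is the number of elements taken
per cluster and rho is the intraclass correlation.  This relation
is not given in the text above and is introduced here as an
assumption; the numbers are hypothetical.

```python
def design_effect(cluster_take, rho):
    """Approximate design effect for equal-sized cluster subsamples."""
    return 1.0 + (cluster_take - 1) * rho


def effective_sample_size(n, cluster_take, rho):
    """Number of independent observations the clustered sample is worth."""
    return n / design_effect(cluster_take, rho)


# 100 stores taken as 10 stores in each of 10 counties.
# rho = 1.0 mimics one dairy supplying every store in a county.
print(effective_sample_size(100, 10, 1.0))
# rho = 0.0 means stores within a county are no more alike than others.
print(effective_sample_size(100, 10, 0.0))
```

With complete within-county homogeneity the 100 stores carry no
more information than 10 independently selected stores, one per
county.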
     To reduce the redundancy of information that often
accompanies cluster sampling, consider the following two-stage
sampling process:  first choose a sample of counties and then, by
another selection procedure, choose a sample of retail stores
from each selected county.  The concept of two-stage sampling can
be extended to three or more stages of selection.  In such
multi-stage sampling, relatively large clusters (say, states) may
be selected at the first stage; from selected states, a second
stage sample of smaller clusters (say, counties) may be selected;
from selected counties, a third-stage sample of retail stores may
be selected; and finally, at the fourth stage, a sample of food
products from the selected stores.   While no new sampling
principles are involved in multi-stage sampling, considerable
flexibility in the sample design can be obtained at the expense
of greater complexity in the selection and analysis of the
sample.

          3.1.2.5   Sample Selection Procedures

     Any sample selected by a randomization procedure with known
probabilities of selection is called a random sample.  The
probabilities of selection need not be equal for all possible
samples from a population, so long as they are knowable.  A
sample selection procedure which gives every possible sample  (of
a given size from a given population) the same probability of
selection is called a simple random sample  (srs).  While totally
free of selection bias, an srs, because of  its randomness,  is
problematic in multi-stage sampling.
     The total sample size in a multi-stage sample design can be
subject to large variation if it is based on an  srs of clusters
that differ greatly in size (Kish 1966).  If the selected
clusters are subsampled at a fixed rate, the expected size of the
cluster subsamples will be proportional to the cluster size.
These fluctuations in the distribution of a sample across
clusters can cause administrative inefficiencies in the field.
For example,  if retail food stores are sampled at a fixed rate
within an srs of counties, the urban counties will have too many
stores for efficient data collection while the rural counties
will not have enough.  In addition,  statistical efficiency  (i.e.,
precision per unit cost)  tends to decrease when there are large
inequalities in the size of clusters (Hansen et al., 1953).
     Variations in the sample size of a multi-stage design may be
controlled and decreased in the following ways:

          •    Split large clusters and combine small clusters.
               The practicality of forming new clusters that are
               more equal in size than the originals will depend
               on the nature of the clusters.   For example,  if
               clusters correspond to counties, large
               metropolitan areas can be subdivided fairly
               easily.   However,  in rural areas,  several counties
               may have to be combined to satisfy a minimum size
               criterion.   The resulting "super cluster" may
               cover so large an  area that it  requires special
               data collection procedures.
          •    Stratify clusters  by  size.   Stratification of
               clusters by size can  reduce the variation in
               sample size without having to form new clusters.
               Control  of the within-cluster sample size may be
               achieved by varying the subsampling rates within
               each stratum.
          •    Select clusters with probabilities proportional to
               size (pps).  By selecting clusters pps, the sample
               will contain a preponderance of large clusters.
               However, by making the subsampling rates inversely
               proportional to the cluster selection rates, the
               same probability of selection is assigned to every
               element of the population regardless of the
               cluster it belongs to (i.e. a self-weighting
               sample is achieved).
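
     The self-weighting property of pps selection can be verified
directly.  The sketch below uses hypothetical cluster sizes and
exact fractions; it assumes that a fixed number of elements is
taken from each selected cluster, so that the subsampling rate is
inversely proportional to cluster size.

```python
from fractions import Fraction


def element_probabilities(cluster_sizes, n_clusters, take_per_cluster):
    """Overall selection probability of an element in each cluster."""
    total = sum(cluster_sizes)
    probs = []
    for size in cluster_sizes:
        # First stage: probability proportional to cluster size.
        p_cluster = Fraction(n_clusters * size, total)
        # Second stage: fixed take, so the rate is inverse to size.
        p_within = Fraction(take_per_cluster, size)
        probs.append(p_cluster * p_within)
    return probs


# Hypothetical counties with 40, 160, and 800 retail food stores.
probs = element_probabilities([40, 160, 800], n_clusters=1, take_per_cluster=20)
print(probs)
```

Every store has the same overall probability (here 20/1000)
regardless of the size of its county.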

     In practice, the pps selection of primary clusters often is
made systematically from an ordered frame (Cochran, 1977).
Besides being easy to draw, pps systematic selection methods,
like stratified sampling, can yield significant improvements in
precision when the frame ordering increases the heterogeneity of
the sample.
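     A minimal sketch of systematic pps selection from an ordered
frame is given below.  It assumes the usual cumulative-size
method with a random start; the function name and the cluster
sizes are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(sizes, n, rng=random):
    """Return indices of n clusters chosen systematically with pps."""
    total = sum(sizes)
    step = total / n                      # sampling interval
    start = rng.random() * step           # random start in [0, step)
    cumulative = list(accumulate(sizes))  # upper bound of each cluster
    picks = []
    for j in range(n):
        point = start + j * step
        # Find the cluster whose cumulative range contains the point.
        index = bisect_right(cumulative, point)
        picks.append(min(index, len(sizes) - 1))  # guard against rounding
    return picks


# Hypothetical frame of six clusters, ordered (for example) by region.
sizes = [120, 80, 300, 200, 100, 200]
picked = pps_systematic(sizes, 2, random.Random(1))
print(picked)
```

Because the selection points are evenly spaced along the ordered
frame, the sample is spread across the ordering.  A cluster
larger than the sampling interval can be hit more than once.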
     The ratio of the actual variance of a sample estimate to the
variance of the same estimate from an srs of the same number of
elements is known as the design
effect.  Stratification tends to reduce the design effect because
of the variance reduction due to stratification.  Clustering
tends to increase the design effect, reflecting the losses in
precision due to clustering.  When a survey sample is both
stratified and clustered, the design effect will reflect the net
effect of the techniques.  Typically, design effects for a
complex survey design will be greater than one and may be as high
as two, depending on the objectives and resources of the survey.
The loss in efficiency that is reflected by a design effect may
be thought of as a reduction in the effectiveness of the sample
size.  Thus, it is convenient to divide the sample size by the
design effect to calculate an effective sample size for use in
srs variance formulas.
          3.1.2.6   Types of Estimates

          For most surveys of pesticide residues, sample
estimates will be means, although for some pesticides, where
acute toxicity is the primary endpoint of concern, estimates of
the full distribution of residues including extreme values  (e.g.
the 95th or 99th percentile) may be desired.  Unlike extreme
values, sample means are fairly stable statistics in that they
are not unduly affected by moderately small or large values, and
this stability increases with increasing sample size.  However,
care must be taken when interpreting a mean value because this
stability can mask significant differences among the various
subpopulations that contribute to the mean.
     National pesticide usage ("percent crop treated") estimates
give no information about how pesticide usage is distributed
across the commodity supply at the retail level.  For example, a
survey may be designed to estimate the national mean
concentration of a pesticide residue for a food commodity with a
prespecified precision.  However, if there is only regional use
of the pesticide, and the food commodity is regionally
distributed, the national mean, although well below acceptable
levels, may disguise a problem in the region where the pesticide
is used.
     Geographic heterogeneity in pesticide usage and distribution
can lead to regions or areas where the average residue levels are
higher (possibly much higher) than the national mean.  This is an
issue that could affect many surveys of pesticide residue levels.
For an acutely toxic pesticide, higher mean regional residue
levels can be of immediate concern because a single exposure
could cause a health effect.  For a pesticide where chronic
exposure is of concern, higher exposures must be sustained for
longer periods of time.
     In general, regional residue variations will not be detected
by a survey designed to generate national estimates unless the
reasons for their occurrence are incorporated into the sample
design.  Because pesticides typically are designed for use
against pests in specific situations, their application is often
regionally and seasonally restricted.  For example, pesticides
that are used on tomatoes in the spring and summer may not be
needed for tomatoes grown during the winter; or pesticides that
are routinely applied to grapes in California may not be needed
on grapes that are grown in New York.
     Insofar as the portions of a crop that are subject to
different levels of pesticide treatments are not combined or
mixed as they travel through the food processing and distribution
chain, information about pesticide usage patterns may be
incorporated into the sample design to increase the precision of
residue estimates.  In the examples above, the sample of tomatoes
could be concentrated in the spring and summer, while the sample
of grapes could be restricted to those grown in California.
     The pesticide usage issue becomes more complicated when
portions of a crop with different levels of pesticides are
combined at some point during the food manufacturing or
distribution process.  If the mixing or physical compositing
process results in food items with fairly consistent residue
concentrations, the sample mean will be generally applicable
across the population as long as the measurement is made on the
final food item.  However, if crops with differing pesticide
levels are used serially during the manufacturing or distribution
process,  the final food item can be subject to periodic
fluctuations or "hot spots" in residue concentrations (e.g., a
processing plant that uses imported tomatoes during the off
season and a locally grown domestic crop in season).  In this
situation, as in the one above, prior knowledge of variations in
the processing stage can, where possible, be incorporated into
the sample design.
     It should be noted that if an estimate other than a national
mean residue level is desired from a survey (e.g., a regional
estimate), this must be taken into account in the design of the
study.  Estimation of means for subdivisions (or domains) of a
population will also increase sample size needs.

          3.1.2.7   Determining  Sample Sizes

          Accuracy is important in any survey, and accuracy is
generally assumed when the precision is good.  However, a survey
that imposes unrealistically stringent precision requirements
will place an unnecessary and, perhaps, prohibitive financial
burden on the pesticide registrant.  A good sample design must
stay in the narrow realm that lies between inadequate precision
and excessive cost.
     The precision  of a sample estimate is determined by three
factors:

          1)   The  amount of inherent variation in the
               characteristic  being estimated,
          2)   The  way the sample is  stratified, clustered, and
               selected, i.e.  the design effect, and,
          3)   The  sample size.

Supplemental data (e.g. field  trials  or the results of a prior
survey), if available during the planning stages of the survey,
can provide estimates of the amount of variation in the pesticide
residues being studied.  Typically, this variance is multiplied
by an assumed design effect to incorporate the complexities of
the proposed sample design.  The use of the design effect  enables
the use of simple formulas (Cochran,  1977) to estimate sample
sizes for a complex sample design.
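     As a sketch of this calculation, the function below uses the
standard srs formula for estimating a mean to within a chosen
margin of error, n = (z * S / e)^2, and then inflates the result
by an assumed design effect.  The standard deviation, margin, and
design effect are hypothetical inputs.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sd, margin, deff=1.0, confidence=0.95):
    """Sample size to estimate a mean within +/- margin."""
    # Two-sided normal critical value, about 1.96 for 95 percent.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n_srs = (z * sd / margin) ** 2
    # Multiplying the variance by deff multiplies the sample size by deff.
    return math.ceil(deff * n_srs)


# Hypothetical planning values: sd of 0.5 ppm, margin of 0.1 ppm.
print(sample_size_for_mean(0.5, 0.1))
print(sample_size_for_mean(0.5, 0.1, deff=1.5))
```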
     In the absence of specific  information about variation, a
dichotomous variable can be used to estimate the proportion of
the population with a certain  attribute.  Because the variance of
a proportion is a function of  the proportion itself, the required
sample size can be  considered  for a range of potential
proportions.  An example is the expected proportion of retail
food items with detectable levels of a specified pesticide
residue.  If this proportion is thought to be between 0.30 and
0.70, conventional sample size formulas using the normal
approximation to the binomial distribution may be used to
determine acceptable confidence intervals.  However, if the
proportion of detects  is thought to be small (e.g. less than
0.10), the normal approximation will underestimate the true
confidence interval and the required sample size.  In this
situation, techniques  for sampling rare events such as inverse
sampling (Cochran, 1977) should be used.
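     For proportions in the middle range, the required sample
size can be tabulated over several candidate values.  The sketch
assumes the normal-approximation formula n = z^2 p(1-p) / e^2 and
a hypothetical margin of error of 0.05; it is not appropriate for
small proportions, as noted above.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, deff=1.0, confidence=0.95):
    """Normal-approximation sample size for estimating a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n_srs = z ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(deff * n_srs)


# The variance p(1-p) peaks at p = 0.5, so that value is the most demanding.
for p in (0.3, 0.5, 0.7):
    print(p, sample_size_for_proportion(p, 0.05))
```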
     Sometimes, the quantity to be estimated from a survey is a
percentile of a distribution, rather than the mean.  To determine
sample sizes when percentiles are of interest, an easy-to-use
nonparametric statistical method such as computing tolerance
limits (Conover, 1971) should be considered.  Tolerance limits
differ from confidence intervals in that tolerance limits provide
an interval within which at least a given proportion of the
population lies, with  some prespecified probability.  A one-sided
tolerance limit is identical to a one-sided confidence interval
for a percentile and is of the form, "At least a proportion q of
the population is less than the maximum sample value, with
probability $1-\alpha$."
     A typical application of tolerance limits might be to
determine the sample size needed to be 95 percent sure that 95
percent of the residue values in the survey population are less
than the maximum value found in the sample.  Notationally, this
can be written as:

          $$P[\text{at least } q \text{ of the population} < X^{(n)}] \geq 1-\alpha,$$

where $X^{(n)}$ is the maximum residue value found in a sample of
size $n$.
     It can be shown, with the aid of calculus, that the sample
size $n$ depends on the solution to:

          $$\alpha \geq q^{n}$$

     In this example, $\alpha=0.05$ and $q=0.95$, so the required
effective sample size is 59.  If a design effect of 1.5 is
assumed, an initial sample size of (1.5)(59)=89 is needed.  Using
similar assumptions, an initial sample size of 449 is needed to
be 95 percent sure that 99 percent of the residue values in the
population are less than the sample maximum.
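     The sample sizes quoted above can be reproduced by finding
the smallest n for which q raised to the power n is no larger
than alpha, and then multiplying by the design effect.  The
function name below is hypothetical.

```python
import math


def tolerance_sample_size(q, alpha, deff=1.0):
    """Return (effective n, initial n) for a one-sided tolerance limit."""
    # Smallest integer n with q**n <= alpha.
    n_effective = math.ceil(math.log(alpha) / math.log(q))
    # Inflate by the design effect to get the initial sample size.
    return n_effective, math.ceil(deff * n_effective)


print(tolerance_sample_size(0.95, 0.05, deff=1.5))
print(tolerance_sample_size(0.99, 0.05, deff=1.5))
```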
     Tolerance limits can also be used to determine sample sizes
when the residue levels of a pesticide under study are thought to
be at or near the limit of detection.  In this situation, a
sample without any detectable levels may result.  If this
happens, the upper limit at which detectable residue levels may
have existed but were not included in the sample becomes
important.  Tolerance limits can be used to determine this upper
limit simply by assuming that the sample maximum is not
detectable.  For example, with a design effect of 1.5, an initial
sample size of 449 is needed to be 95 percent certain of
selecting at least one detectable level if detectable levels
exist for at least one percent of the population.  Conversely, if
no detectable levels are found in a sample of 449 observations,
one may conclude, with 95 percent confidence, that detectable levels exist in less than one
percent of the population, if they exist at all.  The same logic
can be extended to "high" residue levels - if no "high" residue
levels (say, over the pesticide tolerance)  are found in a sample
of 449 observations, then one may conclude that such residue
levels exist in less than one percent of the population.
     The discussion above centered on determining sample sizes
needed to achieve a specified level of precision for estimation
of a sample mean or percentile.   However,  statistical issues are
not the only considerations in determining sample sizes.  For
example, if the pesticide under study is not very persistent and
is applied to many different kinds of food items,  a limiting
factor may be the ability of a lab to analyze many food samples
in a sufficiently short time to avoid substantial degradation of
the pesticide.

          3.1.2.8   Estimation

          A central feature of probability sampling is the
ability to estimate the precision of a population estimate from
the sample data.  For example, a sample mean has a sampling
distribution because it varies from sample to sample with the
results of a random selection process.  A common measure of this
variability is the variance of the sampling distribution, which
is defined as the average of the squared deviations from the
mean of that distribution.  The more a distribution varies about
its mean, the larger its variance.  Thus, greater variability in
the population results in greater variability in the sample mean.
     The relationship between the variability of the population
and the variability of the sampling distribution has direct
implications for the precision of a sample estimator.  In fact,
the sampling variance of a sample mean is equal to the population
variance multiplied by the factor (1/n) x (1 - n/N), where N is the
population size and n is the sample size.  For most surveys of
pesticide residues in the diet, the sample size n will be a
negligible fraction  of the population size N, so that the
sampling variance  is effectively the population variance
multiplied by 1/n.   In other words, it is the size of sample, not
the proportion of  the population it contains, that affects the
precision of an estimate.
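     A short numerical check illustrates how little the
population size matters once it is large relative to the sample.
The population variance and sizes below are hypothetical.

```python
def variance_of_mean(pop_variance, n, N=None):
    """Sampling variance of an srs mean, with optional finite population factor."""
    # The factor (1 - n/N) is dropped when the population is treated as infinite.
    fpc = 1.0 if N is None else 1.0 - n / N
    return pop_variance / n * fpc


print(variance_of_mean(4.0, 100))               # infinite population
print(variance_of_mean(4.0, 100, N=1_000_000))  # nearly identical
print(variance_of_mean(4.0, 400, N=1_000_000))  # larger n, much smaller variance
```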
     For most complex sample designs,  sample weights are
necessary for the unbiased estimation of population parameters.
Weights, which are computed  as the inverse of the expected
selection frequency, serve to differentially weight the data of
sample members to reflect the level of disproportionality in the
sample relative to the survey population.  In the case of a
strictly self-weighting sample, all weights are equal and are not
needed for unbiased estimates of population parameters.  However,
for surveys of pesticide residues  in the diet, distortions
between the sample and the population will almost always occur,
whether by design  (e.g., disproportionate sampling) or because of
the differential impact of non-response and non-coverage.  For
example, non-response occurs when  a sample specimen is lost or
contaminated, while non-coverage occurs when a portion of the
survey population is systematically excluded from the sample.
Thus, some type of weighting will  nearly always be necessary in
estimating population parameters.
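     The effect of weighting can be seen in a small example.  The
sketch assumes that each observation's selection probability is
known and uses the weighted (ratio-type) mean; the residue values
and probabilities are hypothetical.

```python
def weighted_mean(values, selection_probs):
    """Mean weighted by the inverse of each unit's selection probability."""
    weights = [1.0 / p for p in selection_probs]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Two oversampled local-brand items (probability 0.1) with higher residues,
# and two national-brand items (probability 0.01).
residues = [2.0, 2.0, 1.0, 1.0]
probs = [0.1, 0.1, 0.01, 0.01]
print(sum(residues) / len(residues))   # unweighted mean overstates the level
print(weighted_mean(residues, probs))  # weighted mean restores the balance
```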
     Most statistical software packages provide variance
estimates that are based on a simple random sample selected from
an infinite population.  When used on data collected as part of a
complex sample survey, these variances are usually smaller than
the true variance.  Three well known techniques have been
developed to provide relatively unbiased methods for estimating
the variances of descriptive statistics from a complex survey.
     Balanced repeated replication (BRR) and jackknife variance
estimation are two general approaches to forming replicate
estimates (see Cochran, 1977, p.320).  The basic idea behind the
replication approach is to use specified portions (subsamples) of
the sample to obtain different estimates of the parameter of
interest.  The variation of the subsample estimates about the
full sample estimate is used to measure the variance of the full
sample estimate.  Different ways of creating these subsamples
from the full sample yield different estimates of variance.
These subsamples are called replicates and the estimates
calculated from the replicates are called replicate estimates.
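     The replication idea can be sketched with a delete-one
jackknife.  This simplified version assumes independent,
equally weighted observations; in a complex survey, whole primary
sampling units would be dropped within strata and the weights
adjusted, which is not shown here.

```python
def jackknife_variance(values, estimator):
    """Delete-one jackknife variance of estimator(values)."""
    n = len(values)
    full = estimator(values)
    # Each replicate leaves out one observation.
    replicates = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    # Variation of the replicate estimates about the full-sample estimate.
    return (n - 1) / n * sum((r - full) ** 2 for r in replicates)


def mean(xs):
    return sum(xs) / len(xs)


data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(jackknife_variance(data, mean))
```

For a simple mean, the jackknife result agrees with the usual
formula of the sample variance divided by n.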
     The WESVAR procedure (Flyer & Mohadjer, 1988) is a
user-written SAS procedure (SAS, 1985) developed by Westat, Inc. to
compute basic survey estimates and their associated sampling
errors for user-specified characteristics,  using either BRR or
jackknife replication.  The procedure does not calculate the
sample weights necessary for estimation, but must be used with a
previously weighted (including replicate weights)  computer file.
     The Taylor series approach to variance estimation is based
on a first-order Taylor series approximation of the deviations of
estimates from their expected values.  This approximation for
large samples is well known (see Kendall and Stuart, 1961,
p.231).  Woodruff (1971) presented applications of this technique
to sample surveys.  This method provides one of the best known
numerical approximations for ratio estimates currently available
in the statistical literature.
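     For a ratio estimate R = sum(y) / sum(x), the first-order
approximation replaces each observation by the linearized value
y - R*x.  The sketch below assumes a simple random sample and
ignores the finite population correction, so it illustrates the
technique rather than a full complex-survey implementation.

```python
def ratio_and_linearized_variance(y, x):
    """Ratio estimate and its Taylor-series (linearized) variance under srs."""
    n = len(y)
    r = sum(y) / sum(x)
    x_bar = sum(x) / n
    # Linearized residuals; these sum to zero by construction.
    residuals = [yi - r * xi for yi, xi in zip(y, x)]
    s2 = sum(e * e for e in residuals) / (n - 1)
    return r, s2 / (n * x_bar ** 2)


# With x fixed at 1 the ratio reduces to a mean and the variance to s^2/n.
print(ratio_and_linearized_variance([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
```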
     The SUDAAN procedures (Shah et al., 1989), developed by the
Research Triangle Institute, compute survey estimates along with
their associated variance estimates using the Taylor series
approximation.  Like the WESVAR procedure, SUDAAN does not
calculate the sample weights necessary for estimation.

3.2   Sampling Points in the Food Processing and Distribution
      Chain

     3.2.1   Overview

     Pesticide tolerances are based on field trials in which the
pesticide is applied to crops at known application rates, in a
manner similar to the directions for use.  Because field trial
data are commonly designed to reflect the use conditions that
produce the greatest residue levels if the directions on the
label are followed, the data represent the highest application
rate, the maximum number of applications, and the shortest time
between the last application and harvest as indicated on the
label.  If the dietary exposure analysis using these "worst case"
residue levels leads to an acceptable dietary exposure, EPA does
not require further exposure assessment.  Otherwise, the Agency
requires a more realistic estimate of the amount of residue on
food at the time of consumption.
     To estimate the mean concentration of a given pesticide
residue that is consumed by persons, consider the following
sample design.  The survey population for a survey of pesticide
residues in the diet comprises all adults and children living in
the U.S.  The population is stratified with proportional
representation from demographic categories such as region, age
group, sex, race, and ethnic background.  There are special
provisions for oversampling subpopulations (e.g. pregnant women)
that may be particularly vulnerable to the pesticide residue
being studied.  The data collection protocol consists of sampling
portions of all food eaten by sample members at the time of
consumption.  The protocol also describes methods for ensuring
that the food samples include snacks and food eaten outside the
home.  The sampling process is continued for several days or even
weeks to ensure that the daily variations in each person's diet
are included in the measurements.  Finally, to include seasonal
fluctuations, data collection is done at least twice over the
course of a year.
     Such a survey may be difficult to implement for a number of
reasons.  Among them are excessive time and expense, the burden
placed on sample members, and the problems associated with
measuring relatively diminished residue amounts in prepared food.
However, thinking about such a survey helps to illustrate the
kinds of design and analysis issues that need to be considered if
people cannot be sampled directly.
     If food items, rather than people, are treated as the
elements of a survey population, the survey residue estimates
must be analyzed in light of the consumption patterns of the food
items being studied.  Otherwise, the survey estimates may distort
(up or down) the actual risk to people.  For example, an
infrequently consumed food item with a high residue concentration
may pose less of a health risk than a heavily consumed item with
a moderate residue level.
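     The point can be illustrated by multiplying residue
concentration by the amount consumed.  The sketch assumes that
exposure from a food item is simply concentration times daily
consumption; the food items and all figures are hypothetical.

```python
def daily_exposure(residue_mg_per_kg, consumption_kg_per_day):
    """Residue intake in mg/day from a single food item."""
    return residue_mg_per_kg * consumption_kg_per_day


# Rarely eaten item with a high residue concentration.
rare_item = daily_exposure(5.0, 0.002)
# Heavily consumed item with a moderate residue concentration.
staple_item = daily_exposure(0.5, 0.2)
print(rare_item, staple_item)
```

Under these assumed values the staple contributes ten times the
intake of the rarely eaten item despite its lower concentration.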
     In the following sections, a brief description of the issues
to be considered in sampling food items at different points in
the food processing and distribution chain is presented.  The
sampling points include raw agricultural commodities (racs) at
the farm level, wholesale food outlets, and retail food outlets.
The implications of sampling at each point also are presented.

          3.2.2   Farm Level Sampling

          In most cases, the pesticide tolerance will
overestimate the actual residue levels on a crop at harvest.
When a pesticide tolerance level is combined with an estimate of
"percent of crop treated", which is usually based on limited
data, the resulting residue estimates for the crop may be
inaccurate.  A properly designed survey can overcome these
inaccuracies by directly measuring pesticide residue levels in a
probability sample of the raw agricultural commodities to which
the pesticide is applied.
     The first, and perhaps most important step in such a survey
design is the construction of an appropriate sampling frame.  For
example, a national listing of farms could serve as a list frame
because every field where a crop is grown (except for gardens)
and every head of livestock could be uniquely associated with a
farm or farming operation.  While such a list would be an
efficient sampling tool, acquiring a complete list of farms would
be difficult, if not impossible.
     Alternatively, complete coverage could be attained if the
country were divided into geographic segments with identifiable
boundaries.  Then, a probability sample of segments could be
selected and all fields and/or livestock associated with the rac
of interest enumerated within each selected segment.  Of course,
the time and expense associated with the development of such an
area frame could be sizeable, depending on the size and number of
segments selected.
     The Agricultural Surveys conducted by the National
Agricultural Statistics Service (USDA/NASS, 1989) of the USDA use
a multiple frame sampling method that combines the efficiency of
a list frame with the accuracy of an area frame.  The most
critical aspects of multiple frame sampling are the unique
association of farm operators with land and the determination of
overlap between frames.  These aspects complicate survey
procedures because data collectors must obtain detailed
information about the operator of the land, other names the
operator may use, and other people associated with the operation.
In spite of this, the use of a multiple frame sample enables the
Agricultural Surveys to attain virtually complete coverage of all
major raw agricultural commodities raised in the U.S.
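
     The role of overlap determination can be illustrated with a generic dual-frame calculation.  The sketch below is not the procedure documented for the Agricultural Surveys; it assumes a common form of multiple-frame estimator in which the list-frame total is added to the area-frame total for operators who are not on the list.  All acreages, weights, and the `on_list` flag are hypothetical.

```python
# Each list-frame record: (acres reported, sampling weight).
list_sample = [(120.0, 10.0), (300.0, 4.0), (80.0, 10.0)]

# Each area-frame record: (acres, sampling weight, on_list), where on_list
# marks operators who could also have been selected from the list frame.
area_sample = [(50.0, 100.0, True), (20.0, 100.0, False), (35.0, 100.0, False)]

# Weighted total for the part of the population covered by the list frame.
list_total = sum(acres * weight for acres, weight in list_sample)

# Weighted total for the non-overlap part only; area-frame operators who
# are also on the list are excluded so that they are not counted twice.
nonoverlap_total = sum(
    acres * weight for acres, weight, on_list in area_sample if not on_list
)

multiple_frame_total = list_total + nonoverlap_total
print(f"estimated total acres: {multiple_frame_total:,.0f}")
```

     The calculation shows why misclassifying overlap matters:  an operator wrongly flagged as not on the list would be counted in both components.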
     The list frame for the Agricultural Surveys is compiled at
the state level from various listings of farm operators (e.g.,
farm subsidies), and is updated on a continuous basis.  In
addition to name, address, and telephone number, each state
maintains control data that account for the field crops, grain
storage, cattle, milk cows, hogs, sheep, and goats raised on the
land used by the operator.
     The area frame for the Agricultural Surveys is based on
satellite imagery and other information that delineate different
types of land use on aerial photographs.  Each geographic segment
is classified according to the standard set of stratum
definitions that is shown in Table 3.1.  The target size for
segments varies across strata.  Segments in cultivated land
strata are about 1 square mile, while range segments are 2 square
miles or more.  Urban and agri-urban segments vary in size, but
are generally 1/10 to 1/4 square mile.  Typically, range and
cultivated land segments are sampled at higher rates than other
types of segments.
     At present, the Agricultural Surveys do not collect data
about the use of specific pesticides on raw agricultural
commodities.  Further, listings of farm operators cannot be
released because of privacy restrictions.  However, conversations
with NASS staff indicate that it is possible to "buy in" to one
or more of the ongoing surveys.  For a mail or telephone survey,
questions concerning the use of certain pesticides could be added
to the instrument;  for an on-site survey, actual rac samples
could be collected and chemically analyzed for residue levels.
     The primary advantage to sampling raw agricultural
commodities at the farm level is that an accurate estimate of
pesticide usage on eventual food items can be obtained.  If
pesticide usage is very low, localized, or clustered, a
farm-level survey can target monitoring toward areas where the
pesticide is known to be used.  Sampling at higher levels in the
chain of commerce under these circumstances could (depending on
the distribution of treated crop when it leaves the farm) cause
residue levels to be underestimated.  Also, if a pesticide
degrades rapidly to levels that are still toxic but difficult to
detect, sampling immediately after harvest may be the only
option, unless the limit of detection of the analytical method
can be lowered further.
     To obtain residue estimates in food as consumed, residue
estimates taken from raw agricultural commodities can be adjusted
for the effects of processing, storage, and cooking.  In the case
of major commodities like potatoes, that have many processed
forms, it may be difficult to adjust for the multitude of
possible changes that are made to the rac.  For some crops, like
spinach, that are primarily fresh, canned, or frozen, it may be
possible to develop degradation models that reflect changes in
residue levels.
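
     One simple way to express such an adjustment is as a chain of multiplicative factors applied to the rac residue.  The sketch below assumes that each step (washing, blanching, storage) scales the residue by a fixed fraction; the factor names and values are hypothetical and would in practice come from processing and storage studies.

```python
from math import prod


def residue_as_consumed(rac_residue_ppm, factors):
    """Multiply the rac residue by each adjustment factor in turn.

    A factor below 1 represents a reduction at that step; a factor above 1
    represents concentration of the residue.
    """
    return rac_residue_ppm * prod(factors.values())


# Hypothetical adjustment factors for one processed form of a commodity.
frozen_spinach_factors = {"washing": 0.6, "blanching": 0.7, "frozen_storage": 0.9}

estimate_ppm = residue_as_consumed(1.0, frozen_spinach_factors)
print(f"estimated residue as consumed: {estimate_ppm:.3f} ppm")
```

     A commodity with many processed forms would need one such set of factors per form, which is the practical difficulty noted above for potatoes.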
     Another shortcoming of sampling raw agricultural commodities
at the farm level is that the survey population is confined to
the U.S.  Raw agricultural commodities that are grown outside the
U.S.  and then imported may be subject to different pests and
climatic conditions, and may thus exhibit different residue
levels.
Table 3.1  Standard Land Use Strata for the U.S.D.A. Agricultural
           Surveys

Type of Land        Stratum
----------------    ---------------------------------------------
Cultivated Land     Dryland grain, 33-100% cultivated.
                    75-100% cultivated.
                    50-75% cultivated.
                    50-100% cultivated.
                    50-100% cultivated, 50% irrigated.
                    50-100% cultivated, 25-50% irrigated.
                    50-100% cultivated, 10-25% irrigated.
                    Orchards.
                    Vineyards.
                    Vegetables.
                    15-49% cultivated.
                    33-49% cultivated.
                    10-33% cultivated.

Cities and Towns    Agri-Urban, more than 100 dwellings per
                    square mile, residential mixed with
                    agriculture.
                    Residential/commercial, more than 100
                    dwellings per square mile.
                    Resort, more than 100 dwellings per square
                    mile.

Range               Open range/pasture, 0-15% cultivated.
                    Woodland range/pasture, 0-15% cultivated.
                    Desert range, 0-15% cultivated.
                    Public grazing lands administered by Forest
                    Service or BLM, including some small parcels
                    of privately owned land, 0-15% cultivated.

Non-Agricultural    Public land with no known agricultural
                    activity.
                    Non-agricultural.

Water               Lakes, reservoirs, canals, etc., under
                    construction.
                    Existing lakes, rivers, canals, etc.
                    Swamps.

     3.2.3    Wholesale Food Establishments

     The wholesale level might be described as the
"bottleneck" in the food processing and distribution chain in
that most food commodities pass through a wholesale establishment
before being disseminated to retailers, restaurants, and
institutions.  When compared to other survey populations in the
chain (two million farms or a quarter million retail food
outlets), the 38,516 wholesale food establishments (1982
estimate) provide a relatively small population.  In addition,
sampling wholesale outlets generally provides more complete
coverage of consumed food than sampling raw agricultural
commodities, which exclude imported food items, or sampling
retail food outlets, which exclude restaurants and institutions.
On the other hand, commodities such as fresh milk and other local
products may not be distributed by wholesalers.
      Wholesale  food  establishments are divided by the U. S.
 Department of Commerce  into nine Standard Industrial
Classification (SIC) categories, which range from specific food
 commodities like confectioneries or seafood to general line
 groceries.  The distribution  of wholesale establishments by SIC
 category  is shown in Table  3.2.  As the table shows, the
 wholesale market is  dominated by a relatively small number of
 large firms.  Consider: the fifty largest wholesale
 establishments account  for  27 percent of all sales at the
 wholesale  level;  the twenty  largest wholesalers of dairy
 products  account for 85 percent of sales in that category.
 Because sales is a fairly good indicator of food volume, this
 skewness  in the wholesale sales distribution can be exploited by
 the sample design.

Table 3.2  Distribution of Wholesale Food Establishments

                                                     20         50
                          Establish-     Sales     Largest    Largest
Kind of Business            ments     (million $)  (% of      (% of
                           (number)                 sales)     sales)
------------------------  ----------  -----------  --------   --------
Groceries, General Line       4,084       70,573       37         55
Frozen Foods                  2,570       22,629       27         40
Dairy Products                  819        8,991       85         96
Poultry and Poultry
  Products                    1,544        8,752       25         38
Confectionery                 2,506       10,877       38         66
Fish and Seafoods             2,062        5,706       22         35
Meats and Meat Products       4,789       38,585       29         39
Fresh Fruits and
  Vegetables                  5,664       24,154       14         21
Other Groceries              11,596       84,443       31         44
------------------------  ----------  -----------  --------   --------
Groceries and Related
  Products                   38,516      288,659       17         27

Source:   U. S. Department of Commerce
          1982 Census of Wholesale Trade.

     The relatively small number of wholesale food establishments
enables a list sampling frame to be seriously considered.  Toward
this end, there are several commercial mailing lists (e.g.,
American Business Lists) that maintain current machine-readable
listings of wholesalers.  In addition to name and address, these
lists usually contain type of business (i.e., SIC code) and sales
data.  With such a list in hand, the following sample design is
meant to provide a general framework for sampling wholesale food
establishments.  Variations in the design may be warranted,
depending on the objectives of a particular survey.
      Assuming that  a  list  of targeted  food items has been
 developed,  the first  step  in the development of a usable list
 frame of wholesale  food  establishments would be to  identify the
 type of establishments (i.e., SIC codes) that carry one or more of
 the food items.   If,  for example, poultry products  are being
 studied, there would  be  relatively few SIC codes to consider;  if
 potatoes are  being  studied,  there could be a very large number of
 codes to consider.  The  final set of SIC codes would be used to
 subset the  desired  population of establishments from the original
 list  of all wholesalers.
      Because  large  wholesale establishments account for a
 significant proportion of  sales, the sales distribution of the
 population would  be examined.  The very large firms might be
 placed in a separate  stratum so that they could be  over-
 represented in the  sample.   (Because large firms account for more
 food volume than  small firms, their over-representation in the
 sample will improve the  precision of the residue estimates.) The
 other firms might be  stratified on the basis of census region,
 type of business, or  size.   While the  very large firms could be
 sampled directly  (i.e.,  in one stage), a two-stage  cluster sample
 could be drawn  from the  remaining firms in order to control the
 geographic distribution  of the sample.  Counties, which can be
derived from a firm's ZIP code, could serve as the first-stage
units for the  smaller firms  with the total sales of all eligible
 firms in a county serving as  a county-specific size measure.  The
 frame of counties could be stratified by census region to ensure
geographic representation.
     Sampling  counties with  probability proportional to size
 (pps)  at the first  stage would over-represent counties with
greater-than-average wholesale sales activity.  At  the second
stage, a simple random sample of firms could be selected from
each selected  county.  (Optionally, firms could be  stratified by
size within each county.)  If the same number of firms is
selected from each county, the overall selection probability
assigned to each firm will be approximately the same regardless
of where it is located.
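
     The two-stage selection probabilities described above can be worked through for a small hypothetical frame.  In the sketch below the county names, sales figures, and firm counts are invented, and average sales per firm is deliberately the same in every county; that assumption is what makes the overall probabilities exactly equal here.  Where average sales per firm differs across counties, the probabilities would be only approximately equal.

```python
def overall_selection_probability(n_counties, county_size, total_size,
                                  firms_per_county, firms_in_county):
    """P(firm selected) = P(county selected) x P(firm selected | county).

    Stage 1 is pps, so a county's probability is n_counties * size / total.
    Stage 2 is a simple random sample of a fixed number of firms.
    Assumes no county is large enough for its stage-1 probability to exceed 1.
    """
    p_county = n_counties * county_size / total_size
    p_firm_given_county = firms_per_county / firms_in_county
    return p_county * p_firm_given_county


# Hypothetical frame: county -> (total wholesale sales, number of firms).
frame = {"County A": (600.0, 60), "County B": (300.0, 30), "County C": (100.0, 10)}
total_sales = sum(sales for sales, _ in frame.values())

probabilities = {
    county: overall_selection_probability(
        n_counties=1, county_size=sales, total_size=total_sales,
        firms_per_county=5, firms_in_county=n_firms)
    for county, (sales, n_firms) in frame.items()
}

for county, p in probabilities.items():
    print(f"{county}: {p:.3f}")
```

     The large county is more likely to be selected, but each of its firms is less likely to be chosen once it is, and the two effects offset.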
     As stated above, there are several advantages to sampling
wholesale establishments.  However, in certain situations,
sampling at this point in the food processing and distribution
chain may not be appropriate.  For example, because many
wholesale establishments are national distribution centers,
regional residue estimates may be unattainable.  Also, many
commodities are consumed in several forms (e.g., fresh, canned,
and frozen), and would require a different type of wholesale
establishment for each form of commodity.  If many commodities
are of interest in a particular survey, each with numerous forms,
the relative advantage of sampling at the wholesale level may be
reduced vis-a-vis other sampling points.
     3.2.4    Retail Foodstores

     The food retailing industry is composed of foodstores which,
by definition, are retail outlets with at least 50 percent of
sales in food products that are intended for off-premise
consumption (USDA, 1988).  There are two main categories of
foodstores:

     •    Grocery stores, which sell a variety of food products,
          including meat, produce, packaged and canned foods,
          frozen foods, other processed foods, and non-food
          products; and,
     •    Specialized foodstores, which are primarily engaged in
          the retail sale of a single food category such as meat
          and seafood stores, dairy stores, candy and nut stores,
          and retail bakeries.
Grocery stores are further subdivided into supermarkets,
convenience stores, and superettes.  The definitions of these
types of stores and the estimated 1987 distribution of foodstores
by number and sales are shown in Table 3.3.
     The relative numbers of foodstores by category contrast
sharply with their relative sales.  In 1987, supermarkets had the
largest share of sales but represented the smallest category by
number.  The number of supermarkets, which accounted for 11
percent of retail foodstores, has remained stable in recent years
as store size increased.  Specialized foodstores were the
second-largest category by number but had the smallest share of
sales of all foodstore categories.  Superettes represented 39
percent of all foodstores in 1987.  These outlets had many of the
same departments found in supermarkets but lacked the minimum
annual sales to qualify as supermarkets.  Superettes accounted
for only 13 percent of foodstore sales.  Convenience stores
accounted for over 20 percent of
foodstores, but only 11 percent of foodstore sales.  While other
types of foodstores have declined, convenience stores have grown
rapidly, increasing more than 6 percent in 1987.  However, the
share of sales of convenience stores remains among the lowest of
all foodstore categories.
     The contrasting distributional characteristics of retail
foodstores make coverage of this 236,000-element survey
population especially problematic.  While listings of foodstores
are available from a number of sources (e.g., Dun and Bradstreet),
an area sampling frame is frequently necessary to ensure
complete coverage of this survey population.
     A typical sample design for retail foodstores might be a
two-stage cluster sample with counties serving as clusters.
Counties are well suited for this purpose because they have well
defined borders and because the Census of Retail Trade maintains
counts of the number and sales of foodstores in each county.  If
sales are assumed to be proportional to food volume, each county
could be assigned a size measure based on the total sales of its
foodstores.  To ensure representation, the first-stage frame of
counties could be stratified by region of the country and by
degree of urbanicity.  The actual number of counties selected
into a sample would depend on the research objectives of the
survey.

Table 3.3  Estimated Distribution of Retail Foodstores in 1987¹

                             Establishments            Sales
Type of Food Store         (thousands)   (%)    (billion $)   (%)
-----------------------    -----------  -----   -----------  -----
Grocery Stores:
  Supermarkets²                 27        11        219        70
  Convenience Stores³           50        20         36        11
  Superettes⁴                   89        39         41        13
Specialized Food Stores⁵        70        30         18         6
-----------------------    -----------  -----   -----------  -----
Total                          236       100        314       100

¹Source:   USDA ERS Food Marketing Review, 1988.

²A grocery store, primarily self-service in operation, providing
a full range of departments, and having at least $2.5 million in
annual sales (1985 dollars).

³A small grocery store selling a limited variety of food and non-
food products, typically open extended hours.

⁴A grocery store, primarily self-service in operation, selling a
wide variety of food and non-food products with annual sales
below $2.5 million (1985 dollars).

⁵Primarily engaged in the retail sale of a single food category
such as meat and seafood stores, dairy stores, candy and nut
stores, and retail bakeries.

      To develop the  second-stage frame of foodstores, an
 enumeration of  all foodstores would be required within each
 selected  county.   Specialized foodstores would be included in the
 frame if  one or more of their food products were targeted by the
 survey.   Mailing or  business  lists of foodstores could be helpful
 in this process, but ultimately, some sort of county-wide
 systematic search  would be  necessary.  During the enumeration
 process,  foodstores  could be  classified by type, size, and chain
 affiliation.  These  characteristics then could be used to form
 second-stage strata within each county.  Approximately equal
 numbers of foodstores could be selected from each selected
 county, yielding a self-weighting sample of foodstores within
 each  stratum.
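
     The first-stage selection of counties with probability proportional to a sales-based size measure can be sketched as follows.  The code uses systematic pps selection, which is one common way to draw such a sample and is an assumption here rather than a method prescribed by this appendix.  The county names and sales figures are hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size.

    sizes maps each unit to its size measure.  A random start is chosen
    within the sampling interval and every interval-th point along the
    cumulated sizes is selected.  A unit whose size exceeds the interval
    can be selected more than once.
    """
    total = sum(sizes.values())
    interval = total / n
    start = rng.random() * interval          # uniform on [0, interval)
    points = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0.0
    i = 0
    for unit, size in sizes.items():
        cumulative += size
        # Select this unit once for every selection point that falls in it.
        while i < n and points[i] < cumulative:
            selected.append(unit)
            i += 1
    return selected


# Hypothetical county size measures (total foodstore sales, any unit).
county_sales = {"County A": 400, "County B": 300, "County C": 200, "County D": 100}

sample = pps_systematic(county_sales, n=2, rng=random.Random(0))
print(sample)
```

     With these size measures and a sample of two, each county's selection probability is twice its share of total sales (for example, 0.8 for County A and 0.2 for County D).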
      Sampling retail foodstores  offers several advantages over
 sampling  raw agricultural commodities or even wholesale
 establishments.  Prominent  among them is the relatively short
 time  interval between purchase and consumption.  In fact, because
 most  food is consumed soon  after purchase, only the effects of
 food  preparation and cooking  would not be taken into account in
 residue estimates obtained  from  a retail survey.  Also, retail
 foodstores are  the terminal point in the food processing and
 distribution chain.  Thus,  sampling foodstores enables regional
 residue estimates to be made.
      One  disadvantage of sampling retail foodstores is the high
 cost  of developing an area  sampling frame.  In certain
 situations, much of this cost can be avoided by using an
 existing  survey  sample.  For  example, the A.C. Nielsen Co., as
primary contractor for  the  USDA's National Constituent
Laboratory's food survey research projects, has developed a
Nielsen Food Index (NFI) that is designed to estimate nationally
representative  levels of trace chemicals in foods.  The NFI is a
stratified, two-stage probability sample of 1,050 grocery stores
(presumably, specialized food stores are not included) selected
from approximately 600 counties nationwide.  The NFI is an
example of a scientifically valid survey sample that can be used
effectively for many surveys of pesticide residues in the diet.
However, if subsampling from an existing sample is proposed, care
must be taken to ensure that the subsample does not
systematically exclude elements of the survey population and that
it is large enough to achieve the research objectives of the
survey.
     Another potential disadvantage of sampling at the retail
level is that, unless the use and distribution of the pesticide
of interest is relatively widespread and uniform across the
country, it may be necessary to take into account the pesticide
distribution in designing a retail survey, or poor estimates of
mean residue levels may be obtained.  For example, if a pesticide
is used exclusively on certain apple varieties grown and
distributed on the East Coast, a survey of residue levels of this
pesticide should not sample primarily apples from the West Coast.
However, at present, information about pesticide usage and
commercial distribution by region or commodity variety is not
necessarily available.
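
     The effect described in the apple example can be shown with a short calculation.  The sketch assumes that regional shares of the apple supply are known, which, as noted above, is often not the case; the residue values and shares are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical residues (ppm) from a sample that is mostly West Coast apples.
east_residues = [0.9, 1.1]            # pesticide used only on East Coast apples
west_residues = [0.0] * 8             # no use of the pesticide in the West

# Hypothetical shares of the national apple supply by region.
supply_share = {"east": 0.30, "west": 0.70}

# Treats every sampled apple alike, so the West is over-represented (80
# percent of the sample against 70 percent of supply).
unweighted_mean = mean(east_residues + west_residues)

# Weights each regional mean by that region's share of supply.
weighted_mean = (supply_share["east"] * mean(east_residues)
                 + supply_share["west"] * mean(west_residues))

print(f"unweighted mean: {unweighted_mean:.2f} ppm")
print(f"weighted mean:   {weighted_mean:.2f} ppm")
```

     Under these assumed figures the unweighted mean understates the supply-weighted mean by about one third, because East Coast apples make up 20 percent of the sample but 30 percent of the supply.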
     Another disadvantage of sampling retail foodstores is
incomplete coverage, in that restaurant and institutional
consumption is excluded.
3.3  Selecting the Appropriate Sampling Point

     In the preceding sections, the implications of sampling at
different points in the food processing and distribution chain
were discussed.  Although the final determination of the
appropriate sampling point must be made on an individual basis,
the following factors should be included in the decision-making
process:

     •    Persistence of pesticide
     •    Pesticide usage patterns
     •    Commodity distribution in commerce
     •    Types of estimates desired.

     Pesticide residues that degrade rapidly can pose
considerable measurement problems, especially for pesticides
where levels at or lower than the limit of detection are still
considered harmful.  In this situation, sampling raw agricultural
commodities at the farm level may be the only option, unless the
limit of detection can be lowered reliably.  Alternatively, if a
pesticide is relatively persistent over time and across a variety
of storage and processing conditions, other factors should also
be considered in deciding on the appropriate sampling point.
     The spatial and temporal distribution of pesticide usage and
the subsequent distribution of the pesticide-treated commodity in
the food supply also affect the feasibility of different sampling
points.  If usage of a pesticide of interest is very low,
localized or clustered, sampling at higher levels in the chain of
commerce may cause residue levels to be underestimated, unless
enough is known about the commercial distribution of treated crop
to take this into account in the survey design.  Alternatively, a
farm-level survey could be targeted toward areas where the
pesticide is known to be used.  However, this would not yield any
information about regional consumption of treated commodities.
     Apart from the pesticide distribution on treated
commodities, the commercial movement of a commodity itself can
also make some sampling points less efficient or even infeasible
in some cases.  For example, bananas are imported, and cannot be
sampled at the farm level within the U.S.  Fresh milk, which is
locally produced and sold, usually bypasses wholesalers entirely.
     Many food commodities like corn, potatoes, and apples also
appear in a multitude of food products.  As such, they are
canned, cooked, frozen, mixed, and exposed to varying storage
conditions.  Also, for some pesticides, levels of toxic
metabolites may increase upon storage or processing.  The
combined effects of these factors make sampling processed food
products, as opposed to or in addition to raw agricultural
commodities, important, unless there are adequate data about
changes in residue levels with processing or storage.
     It should also be noted that if residue estimates for
subgroups of the population  (rather than national means) are
desired, this can also affect the choice of sampling point.  For
example, if a commodity is regionally distributed, sampling
retail-level foods (which are representative of local
consumption) will provide better regional residue estimates than
sampling farms or wholesale establishments, from which most
commodities still move in interstate commerce.
     Clearly, the tradeoffs between sampling points must be
carefully evaluated.  The accurate estimation of pesticide
residue levels in foods is an important component in analyzing
the risks and benefits of pesticide use.  Toward this end, this
appendix is meant to provide an overview of the design and
implementation of statistically valid surveys of pesticide
residues in the diet.  Because such surveys often require a
complex and technically demanding effort, a professional
statistician should be actively involved in the development of
the survey design and in the analysis of the data.
3.4  References
Cochran, W.G. (1977).  Sampling Techniques, Third Ed., John
Wiley & Sons, New York.

Conover, W.J. (1971).  Practical Nonparametric Statistics, John
Wiley & Sons, New York.

Flyer, F. and Mohadjer, L. (1988).  "The WESVAR Procedure,"
Westat, Inc., Rockville, MD.

Hansen, M.H., Hurwitz, W.N., and Madow, W.G. (1953).  Sample
Survey Methods and Theory, I, John Wiley & Sons, New York.

Kendall, M.G. and Stuart, A. (1961).  The Advanced Theory of
Statistics, II, Charles Griffin, London.

Kish, L. (1965).  Survey Sampling, John Wiley & Sons, New York.

SAS Institute, Inc. (1985).  "SAS User's Guide:  Basics," Version
5 Edition, Cary, NC.

Shah, B.V., LaVange, L.M., Barnwell, B.C., Killinger, J.F., and
Wheeless, S.C. (1989).  "SUDAAN: Procedures for Descriptive
Statistics User's Guide," Research Triangle Institute, Research
Triangle Park, NC.

Stuart, A. (1984).  The Ideas of Sampling, Macmillan, New York.

U.S. Department of Agriculture, National Agricultural Statistics
Service (1989).  "Agricultural Surveys Supervising and Editing
Manual."

U.S. Department of Agriculture, Economic Research Service (1989).
"Food Marketing Review, 1989."

U.S. Department of Commerce (1982).  "1982 Census of Wholesale
Trade."

Woodruff, R.S. (1971).  "A Simple Method for Approximating the
Variance of a Complicated Estimate," Journal of the American
Statistical Association, 66, 411-414.

                APPENDIX 4
EXISTING SOURCES OF PESTICIDE RESIDUE DATA

4.1  Introduction

     This appendix examines programs within the EPA and outside
the Agency that provide pesticide residue data for foods and
animal feeds on a continuing basis.  Particular attention is
given to determining if and to what extent the data generated in
these programs can be effectively used in estimating pesticide
residue levels in food at the time of consumption.  Section 4.2
describes crop field trials conducted by petitioners (or
registrants) to support pesticide tolerance and registration
requests.  This is an important source of pesticide residue
levels as data from these field trials are used by the EPA to
establish tolerances and register pesticides and may be used to
determine anticipated residues for dietary exposure assessment.
The pesticide monitoring program conducted by the Food and Drug
Administration (FDA) is described in section 4.3.  References for
obtaining more detailed information are given in section 4.4.

4.2  Crop Field Trials
     4.2.1    Introduction

     Under the authority of the Federal Food, Drug, and
Cosmetic Act, the EPA  is responsible for establishing tolerances
on pesticide residues  in food/feed distributed in interstate
commerce, and under the authority of the Federal Insecticide,
Fungicide and Rodenticide Act the EPA is responsible for
registering pesticides marketed in the United States.  All
pesticides marketed in the United States must be registered by
EPA.  In order to register a pesticide for use in the United
States,  a petitioner must submit to the EPA, among other things,
field measurement residue (field trial)  data for each crop (or a
representative crop) included in the request.  Data typically
show residues resulting from the maximum application rate and the
minimum preharvest interval (PHI); however, other application
rates and/or PHIs may be included in crop field trials.

     4.2.2    Guidelines for Design, Implementation and Reporting

     Residue data from field trials are required by the
EPA as part of a pesticide tolerance or registration request.  To
aid the petitioner in providing the appropriate information,
guidance on crop field trial design, implementation and reporting
is given in two reports:  the EPA Pesticide Assessment Guidelines,
Subdivision O (Residue Chemistry), prepared by the EPA's Office
of Pesticide Programs [1], and the FAO Guidelines prepared by the
Food and Agriculture Organization (FAO) of the United Nations
[2].  The FAO Guidelines are incorporated by reference in the EPA
Guidelines.  In these documents, the level of specification is
well-defined in some areas while in other areas petitioners are
given considerable leeway in the design and implementation of the
field trials.  For example, the FAO guidelines are rather
specific with regard to issues such as pesticide formulation,
method of application, dosage rates, amount of product or
commodity to include in a sample, packing, and shipping.  The FAO
guidelines also specify the information to be reported and
include a suggested format.  However, the guidelines are more
general for issues relating to the number of test sites needed,
the selection of test sites, and the representativeness of field
samples.  This is evident from the following statements
contained in the FAO guidelines document:

     The number of sites needed depends on the range of
     conditions to be covered, the uniformity of crops,
     agricultural practices, and data already available.
     Sufficient data must be available to confirm that patterns
     can be determined to hold for all regions and the total
     range of conditions, including those that are likely to give
     rise to the highest residues.
     Trials in at least two growing seasons are normally needed.
     Trials should be carried out in major areas of cultivation
     or production and should be sited to cover the range of
     representative conditions (climatic, seasonal, soil,
     cropping system, farming, etc.) likely to be met in the
     intended use of the pesticide.
     Representative samples of the crop in each plot must be
     taken by a recognized procedure.  Although each plant or
     fruit should normally have an equal chance of being chosen,
     emphasis should be directed toward identifying the highest
     residue levels.

     4.2.3    Field Studies as a Source of Data for
              Characterizing Anticipated Residues

     Crop field trials represent a major source of
information on pesticide residues in foods at harvest because of
their direct use in establishing crop tolerances.  However, the
extent to which field studies can be effectively used to assess
levels of pesticides in foods at the time of consumption has not
been well established.  There are basically three reasons for
this.  First, unlike in field trials, pesticides in the real
world are not always applied according to label conditions to all
crops for which there is a registered use.  Usage varies all the
way from the recommended (or occasionally higher) rates and at
times to zero application.  Second, the interval from treatment
to harvest (TSI, or PHI (preharvest interval)) is often longer
than the minimum PHI specified on the label.  Third, there is a
time gap between harvesting and consumption.  During this time
period, food commodities may undergo some form of processing
(e.g., washing, peeling, canning, freezing, milling) and/or
storage that could range from a few days to several months.  Both
processing and storage can cause changes in residue levels.
     Field trial data are generally available on crops for which
there are tolerances.  In order to effectively use field trial
data to estimate residue levels in table-ready foods, there
should be mechanisms for adequately determining (1) the effects
of departure from 100 percent treatment of all crops according to
the most extreme use directions on the label, (2) the effects of
various types of processing on residue levels, and (3) the
effects of various storage times on residue levels of the parent
compound and any toxic breakdown products.  This appendix
addresses issue (1).  Issue (2) is addressed in the main body of
these Guidelines and in the Pesticide Assessment Guidelines
Addenda on Standard Evaluation Procedures and Data Reporting
Guidelines for Processing Studies.  Issue (3) is addressed in
detail in Appendix 5 from the standpoint of the design of
experimental studies for estimating degradation rates of
pesticide residues during storage.
     4.2.4  Statistical Issues in Crop Field Trial Design
              This subsection identifies some statistical issues
which should be considered in the design of crop field trials.
              4.2.4.1  Plot Selection
              To validly extend the results observed in a sample
to a population of interest, every member of the population must
have some known positive chance (not necessarily an equal chance)
of being selected into the sample.  In the case of crop field
trials, plots are not selected on the basis of probability.
Rather, they are conducted on experimental farms or by growers
willing to participate.  Failure to use a probability sample of
plots is not surprising.  It would take a prohibitive amount of
time and resources to (1) identify all the possible plots in all
areas of the country where the crop under study could be grown,
(2) select a probability-based sample of plots, (3) get
landowners' consent for use of the selected plots, and  (4)
conduct the study.
     In reference to the selection and number of plots,  the FAO
guidelines indicate that "trials should be carried out  in major
areas of cultivation or production and should be sited to cover
the range of representative conditions (climate, seasonal, soil,
 cropping system, farming, etc.) likely to be met in the intended
 use of the pesticide. ... the number of sites needed depends upon
 the  condition  to  be  covered, the uniformity of crops and
 agricultural practice,  and the data already available."  Since it
 is not  practical  to  use a  probability sample of plots for
 conducting field  evaluations of  pesticides, the issue of
 representativeness becomes even  more important.
     The  initial  step  in assuring that plot selections are
 representative of the  population of available plots is to
 identify  major growing regions by grouping geographic areas that
 are  similar with  respect to factors that could influence residue
 levels  (e.g.,  climate,  farming practices, growing conditions,
 rainfall).  Plot  selections are  then made within each region.
 Grouping  similar  areas into a  region increases the chances that
 plots selected in a  nonrandom  fashion will be representative of
 that region.  Information on production down to the State level
 is available in Agricultural Statistics [4].  Suggested locations
 (States) for crop field trials, which are representative of
 major growing areas, are provided in Attachment 4 of the Phase 3
 Technical Guidance [5].  Another  way of assuring that plot
 selections are representative  of the population of available
 plots is to select a number of plots within a region that
 reflects the proportion of  national production grown in  the
 region.
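     As an illustration only, allocating plots to growing regions in
proportion to production can be sketched as shown below.  The function
name allocate_plots, the minimum of two plots per region (borrowed from
the standard error requirement in section 4.2.5.1), and the rule used to
hand out the remaining plots are assumptions of this sketch, not Agency
procedure.  The production shares are those of the hypothetical Example A.

```python
def allocate_plots(production_share, total_plots, min_per_region=2):
    """Allocate plots to regions roughly in proportion to production.

    production_share: dict of region -> share of national production
    (shares are assumed to sum to 1).  Every region first receives
    min_per_region plots; each remaining plot goes to the region that
    is currently furthest below its proportional quota.
    """
    regions = list(production_share)
    if total_plots < min_per_region * len(regions):
        raise ValueError("too few plots to give every region the minimum")
    quota = {r: production_share[r] * total_plots for r in regions}
    alloc = {r: min_per_region for r in regions}
    for _ in range(total_plots - min_per_region * len(regions)):
        neediest = max(regions, key=lambda r: quota[r] - alloc[r])
        alloc[neediest] += 1
    return alloc


# Hypothetical shares from Example A: 10%, 25%, and 65% of production.
plots = allocate_plots({"A": 0.10, "B": 0.25, "C": 0.65}, total_plots=10)
print(plots)
```

With ten plots this reproduces the allocation used in Example A (two
plots each in Regions A and B and six in Region C).  Other rounding
rules would be equally defensible.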
              4.2.4.2  Variations in Design and Reporting
              In their examination and assessment of crop field
trials, SRA Technologies  [3] reviewed  15 petitions  for pesticide
use covering more than 30 different crops.  This review revealed
significant variability in the major design issues  in field
trials.  For instance, differences were noted in  (1) the number
of application rates and PHIs used, (2) the size of the trial  in
terms of the total number of plots, (3) how plots were allocated
to (or distributed among) major growing regions,  (4) the number
  of samples within plots,  and (5)  the number of plots treated with
  the recommended maximum application rate and minimum PHI.   In
  addition to variations in field study design,  the  petitions also
  revealed variations  in the amount and type  of  data reported.
  This variability is  demonstrated  in the  following  listings  that
  show data elements that are generally reported  and data elements
  that are rarely or inconsistently reported.  Information
  generally reported includes:
       •   pesticide formulation,
       •   application  rates  (including  number  and timing),
       •   chemical analytic  methods  (including rate  of recovery,
          instrument calibration, and sample chromatograms),
       •   harvesting (including date and condition of crop),
       •   preparation  of  samples,
       •   storage, and
       •   shipping.
 Information rarely or inconsistently reported includes:
      •  basis for determining the number  of  plots  treated  with
         the recommended maximum application  rate and minimum PHI,
      •  basis for defining growing regions,
      •  rationale for plot allocations among growing regions
          (i.e.,  number of plots assigned to the  various growing
         regions),
      •  rationale for selection and representativeness of  plots
         within  growing regions,
      •  methods of sampling crops  within  plots,
      •  other pesticide usage in current  and  previous year,
      •  method of irrigation,
      •  soil, and
      •  weather.
              4.2.4.3  Compositing
              The FAO guidelines specify that individual items of
product be composited to form a sample from an individual plot.
 For example, in the case of large vegetables (e.g., cabbage,
 cucumbers, melons) the guidelines recommend a 5 kg sample
 consisting of not less than five items selected from all over the
 plot.  For the purpose of dietary exposure assessment for acutely
 toxic pesticides, compositing is acceptable for crops that are
 generally processed before consumption (e.g., grains).  However,
 for those crops in which an individual item may be consumed by
 one person (e.g., apples, peaches, cantaloupe), information
 provided from composited  samples may not be adequate to determine
 the  maximum likely residue level in an individual item.  For
 instance, if six cucumbers (of equal size) are composited to form
 each sample in a crop field trial, the highest residue measurement in
 the study is the arithmetic average of the residue values of the
 six cucumbers making up that sample.  Since variation
 always exists (although it may not be detectable with the measurement
 method), at least one of those cucumbers has a residue greater than the
 composite mean that is used to estimate the maximum.  Information relating
 to the variation in  residues from one item of product to another
 (in  this case, cucumbers)  is not currently collected as part of
 field trials.  If  acute exposure is a concern for pesticides
 applied to crops in which an individual item  (e.g., sweet potato)
 may  be consumed by a person, then residue levels on individual
 items may be  needed by  the Agency in characterizing anticipated
 residues.  An alternative method for estimating the variability
 in individual  samples based  on variation in composite samples is
 discussed in  section 4.2.6.    EPA is currently preparing
guidelines for the collection of residue data for acutely toxic
pesticides which address  this issue.
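     The masking effect of compositing can be illustrated with a short
calculation.  The six residue values below are hypothetical, and the
items are assumed to be of equal size so that the composite result is
the arithmetic mean of the individual values.

```python
from statistics import mean

# Hypothetical residues (ppm) on six equal-sized cucumbers from one plot.
item_residues_ppm = [0.4, 0.9, 1.1, 1.6, 2.3, 5.7]

# A composite of equal-sized items measures the arithmetic mean.
composite_ppm = mean(item_residues_ppm)

# The highest single item is what matters for acute exposure.
highest_item_ppm = max(item_residues_ppm)

# Factor by which the composite understates the highest item.
understatement = highest_item_ppm / composite_ppm

print(composite_ppm, highest_item_ppm, understatement)
```

In this made-up case the composite reports about 2 ppm although one
cucumber carries 5.7 ppm, which is the kind of gap that matters when
a single item may be eaten by one person.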
     However, it is important to recognize that tolerances are
enforced by the Food and Drug Administration (FDA), which also
composites samples.  The drawbacks to sample compositing relate to
dietary exposure assessment  for acutely toxic pesticides, not to
tolerance enforcement.  A major disadvantage of analyzing
 individual samples is the significantly higher cost.  Minimal
uncertainty is generally  added to the dietary risk assessment for
chronically toxic or carcinogenic pesticides as a result of
sample compositing.
     4.2.5  Assessing Chronic Exposure
              Different types of crop residue data or different
methods of analyzing residue data are needed for chronic exposure
than for acute exposure.  Whereas the mean residue level is
satisfactory for assessing chronic exposure, an upper point in
the distribution of residue levels of individual servings is
needed for assessing exposure to an acutely toxic pesticide.
This is especially important for crops such as fruits and
vegetables in which a substantial portion of an individual item
(e.g., apple, melon, potato  (baked)) may be consumed by an
individual over a short period of time.  For crops that are 100
percent processed before consumption  (e.g., grains), this
requirement does not apply since an individual serving is, in
itself, a composite sample.
     This section provides guidance in analyzing field trial data
for chronic exposure.  The assessment of acute exposure is
discussed in section 4.2.6.
     4.2.5.1  Analysis Based on Residue Data from Plots Treated
              with the Maximum Recommended Application Rate and
              the Minimum Recommended PHI (or Registered PHI
              Reflecting the Highest Pesticide Residue)
              Suppose field trial data for a particular crop are
being considered for use in estimating residue levels  in foods at
the time of consumption.  It is recommended that one begin the
analysis by estimating the mean (or median or geometric mean if
the data are approximately lognormally distributed) and
associated standard error using data from plots in which the
maximum recommended application rate and the minimum recommended
PHI (or registered PHI reflecting the highest pesticide residue)
were employed.  The standard error provides a measure of the
precision or "adequacy" of the estimate.
     This section describes a method for estimating the
population mean residue level and its standard error.  A
numerical example is used to illustrate how the method can be
applied.  The critical assumptions in applying the procedure are
noted.
     Most crop field trials involve some variation of a common
design.  This design consists of dividing the U.S. into growing
regions, selecting plots within each growing region, and
selecting samples within each plot.  The procedure described
below can be used to analyze data from this design provided that
(1) there are at least two plots within each growing region, (2)
plots are randomly selected within each growing region, (3)
samples are randomly selected within each plot, and  (4) the
number of samples within each plot is constant for all plots
within a growing region.  Requirement 1 is necessary to obtain an
estimate of the standard error using this method.  The issue of
probability sampling (requirement 2) was discussed in section
4.2.4.1.  Requirement 3 does not appear to pose a serious problem
since samples are generally randomly selected within a plot.
Requirement 4 is made to simplify the analysis but could be
relaxed.  Frequently these requirements are not all met.  In such
cases,  this method must be modified and resulting uncertainties
in the exposure analysis must be noted.
     Let:
     $\bar{x}_{hi}$  =  mean residue level averaged over all samples in the
              i-th plot in the h-th growing region.  When there is
              only one sample selected from a plot, the
              measurement for that sample is taken as the plot
              mean,
     $H$     =  number of growing regions,
     $n_h$   =  number of plots selected from the h-th growing
              region, and
     $w_h$   =  proportion of total production in the h-th growing
              region.
     The mean residue level for the h-th growing region is
estimated as:
  (1)    $\bar{x}_h = \frac{1}{n_h} \sum_{i=1}^{n_h} \bar{x}_{hi}$
     The variance of the estimated mean for the h-th growing region
is estimated as:
  (2)    $s^2_{\bar{x}_h} = \frac{s_h^2}{n_h}$,  where  $s_h^2 = \frac{1}{n_h - 1} \sum_{i=1}^{n_h} (\bar{x}_{hi} - \bar{x}_h)^2$
     The estimated mean residue level over all growing regions
(i.e., population mean) is given by:
  (3)    $\bar{x} = \sum_{h=1}^{H} w_h \bar{x}_h$
and the estimated variance of $\bar{x}$ is given by:
  (4)    $s^2_{\bar{x}} = \sum_{h=1}^{H} w_h^2 \, s^2_{\bar{x}_h}$
     Finally, the standard error of $\bar{x}$ is obtained by taking the
square root of the estimated error variance given in (4), namely:

  (5)    $s_{\bar{x}} = \sqrt{s^2_{\bar{x}}}$

     In summary, the above procedure uses plot means  to  obtain a
mean and associated variance of the mean  for each  growing  region.
Growing region production information  can then be  used to  combine
regional estimates to derive estimates of the population mean  and
associated standard error.  Plots are  assumed to be
 representative  of  the  growing  region; the validity of this
 assumption  is enhanced by  selecting more plots within a growing
 region.
      Example A
     Ten  field trials have been conducted using pesticide x on
crop y at the recommended maximum application rate and minimum
PHI.  There are three major growing areas (designated as Region
A, B, and C) in the United States.  Region A produces
approximately 10% of crop y, Region B produces about 25%, and the
remaining 65% is produced in Region C.  Two of the ten trials (or
plots) were located in Region A, two were located in Region B,
and the remaining six plots were located in Region C.  Within
each plot two samples were selected and analyzed.  The residue
measurements (hypothetical) for this scenario are given in the
upper portion of Table 4-1.
     The  lower portion of Table 4-1 shows the calculations
required and gives the analysis results.  The population mean
residue level is estimated to be 3.21 ppm with a standard error
of 1.12 ppm.  The standard error can be used for judging the
precision or reliability of an estimate — the smaller the
standard error, the "better" the estimate.  As a general rule,
the estimated mean plus two to three standard errors provide a
meaningful upper bound on the true mean.  Thus, for this example
the true population mean is very likely to be less than 6.6 ppm [=
3.21 + (3 x 1.12)].
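     The calculations in Example A can be retraced with the sketch
below, which applies equations (1) through (5) to the hypothetical
residue data of Table 4-1.  The variable and function names are
illustrative only.

```python
from math import sqrt
from statistics import mean, variance

# Table 4-1: region -> list of plots -> (sample 1, sample 2), in ppm.
table_4_1 = {
    "A": [(2.50, 2.95), (0.10, 0.40)],
    "B": [(2.50, 2.00), (5.65, 6.40)],
    "C": [(1.70, 1.55), (4.15, 3.35), (9.15, 11.95),
          (0.30, 1.10), (1.40, 1.95), (0.60, 0.30)],
}
production_share = {"A": 0.10, "B": 0.25, "C": 0.65}   # w_h


def stratified_mean_and_se(data, shares):
    """Return (population mean, standard error) per equations (1)-(5)."""
    pop_mean = 0.0
    pop_var = 0.0
    for region, plots in data.items():
        plot_means = [mean(samples) for samples in plots]
        n_h = len(plot_means)                        # needs n_h >= 2
        region_mean = mean(plot_means)               # equation (1)
        var_of_mean = variance(plot_means) / n_h     # equation (2)
        pop_mean += shares[region] * region_mean     # equation (3)
        pop_var += shares[region] ** 2 * var_of_mean # equation (4)
    return pop_mean, sqrt(pop_var)                   # equation (5)


x_bar, se = stratified_mean_and_se(table_4_1, production_share)
upper_bound = x_bar + 3 * se
print(round(x_bar, 2), round(se, 2), round(upper_bound, 1))
```

Rounded to the precision used in the text, this gives a mean of 3.21
ppm, a standard error of 1.12 ppm, and an upper bound of 6.6 ppm.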
     When the precision of the estimated population mean is
deemed inadequate, another analysis procedure may be employed if
the field study provides residue data for other application rates
and/or other PHIs.  This alternative procedure is discussed in
the next section.
     4.2.5.2  Analysis Based on Residue Data from Multiple
              Application Rates and/or Multiple PHIs
              The analysis of the previous section utilizes data
only from plots treated at the recommended label conditions,
i.e., maximum application rate and minimum PHI.  In practice,
however, crop field trials often include multiple application
rates — one of which is the recommended level.  The FAO
guidelines for field trials recommend that at least two
application rates be used.  Similarly, multiple PHIs are
sometimes incorporated in the field study design.
     As a result of using multiple application rates and/or
multiple PHIs,  the number of plots treated at the recommended
label conditions may be too small to give an estimate of the
population mean with the desired level of precision (using the
analysis of the previous section).  When this happens, the use of
linear regression (simple or multiple) is recommended for
estimating the population mean residue level at the most extreme
use pattern on the label, since this procedure utilizes all
available data from the field trial.
     If a linear relationship between residue level and
application rate or PHI can be assumed (frequently the natural
log of the residue level is assumed to be linearly related to the
PHI), then linear regression can be used to estimate the
parameters of the assumed model.  The parameter estimates can
then be used to estimate the expected or mean residue level  for
the recommended application rate and PHI.  One model that may be
considered is:
  (6)    $Y_{hij} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon_{hij}$
where:
     $Y_{hij}$   =  observed residue for the j-th sample from the i-th
                  plot in growing region h,
     $X_1$       =  application rate (lbs/acre),
     $X_2$       =  preharvest interval (number of days between
                  application and harvest),
     $\beta_0, \beta_1, \beta_2$  =  parameters to be estimated, and
     $\varepsilon_{hij}$  =  error associated with the j-th sample from the i-th plot
                  in growing region h.

In using this model to make estimates, to conduct tests of
hypothesis regarding the parameters, and to construct confidence
interval estimates, it is assumed that the error, $\varepsilon_{hij}$, is a
normally and independently distributed random variable with mean
zero and unknown variance, $\sigma^2$.  Once the parameters have been
estimated, the mean residue expected under label conditions can
be predicted from:

                        $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2$

by setting $X_1$ equal to the maximum recommended application rate
(lbs/acre) and $X_2$ equal to the minimum recommended PHI (days).
Most statistical texts give procedures for constructing
confidence limits on Y.
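     A minimal sketch of fitting model (6) by ordinary least squares
and predicting the residue at label conditions is given below.  The
plot data, the label values (2.0 lbs/acre and a 7-day PHI), and the
function names fit_model_6 and predict_label_residue are all
hypothetical.  The normal equations are solved directly so that the
sketch needs no statistical package; in practice a regression routine
that also reports standard errors would normally be used.

```python
def fit_model_6(rows):
    """Least-squares estimates (b0, b1, b2) for Y = b0 + b1*X1 + b2*X2.

    rows: list of (x1, x2, y) with x1 = application rate (lbs/acre),
    x2 = PHI (days), y = observed residue (ppm).  The design must vary
    in both x1 and x2, otherwise the system below is singular.
    """
    design = [(1.0, x1, x2) for x1, x2, _ in rows]
    y = [row[2] for row in rows]
    # Augmented normal equations [X'X | X'y].
    a = [[sum(d[r] * d[c] for d in design) for c in range(3)]
         + [sum(d[r] * yi for d, yi in zip(design, y))]
         for r in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))


def predict_label_residue(coeffs, max_rate, min_phi):
    """Predicted mean residue at the maximum rate and minimum PHI."""
    b0, b1, b2 = coeffs
    return b0 + b1 * max_rate + b2 * min_phi


# Hypothetical plot data: (rate in lbs/acre, PHI in days, residue in ppm).
trial_rows = [
    (1.0, 7, 2.9), (1.0, 14, 2.2), (1.0, 21, 1.4),
    (2.0, 7, 5.1), (2.0, 14, 4.3), (2.0, 21, 3.6),
]
coeffs = fit_model_6(trial_rows)
label_estimate = predict_label_residue(coeffs, max_rate=2.0, min_phi=7)
print(coeffs, label_estimate)
```

The point of the sketch is that every plot contributes to the estimate
at label conditions, not only the plots treated at the maximum rate and
minimum PHI.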
     In some instances, model (6) may not be applicable because
the assumption of a constant variance, $\sigma^2$, is not valid.  In this
case, another model that may be considered is:

(7)      $\ln Y_{hij} = \ln \beta_0 + \beta_1 \ln X_1 + \beta_2 \ln X_2 +$
Table 4-1.  Residue Data and Analysis Results - Example A

                            Observed Residue (ppm)
  Growing                                              $\bar{x}_{hi}$
  Region     Plot No.     Sample 1     Sample 2     (plot mean)

    A           1           2.50         2.95          2.725
                2           0.10         0.40          0.250

    B           1           2.50         2.00          2.250
                2           5.65         6.40          6.025

    C           1           1.70         1.55          1.625
                2           4.15         3.35          3.750
                3           9.15        11.95         10.550
                4           0.30         1.10          0.700
                5           1.40         1.95          1.675
                6           0.60         0.30          0.450

  Growing                              $\bar{x}_h$          $s^2_{\bar{x}_h}$
  Region        $n_h$        $w_h$     (equation 1)     (equation 2)

    A            2          .10         1.4875           1.5314
    B            2          .25         4.1375           3.5627
    C            6          .65         3.1250           2.4303

              Population mean (equation 3):  $\bar{x}$ = 3.21 ppm
              Variance of $\bar{x}$ (equation 4):  $s^2_{\bar{x}}$ = 1.2648
              Standard error of $\bar{x}$ (equation 5):  $s_{\bar{x}}$ = 1.12 ppm
              Relative standard error:  $s_{\bar{x}} / \bar{x}$ = 0.35 or 35%
The assumption under this model  is that the error,  
     4.2.6  Assessing Acute Exposure
              When a pesticide is applied to crops in which the
individual item can be consumed directly by one person  (e.g.,
cucumbers, oranges, sweet potatoes) and when that pesticide has
the potential for being acutely toxic, then the residues present
in individual items  (i.e., the distribution of residues) are of
greater interest than, say, the mean residue level used in
assessing chronic exposure.  A measurement made on a composite
sample of several peaches, for example, provides a mean residue
level but does not adequately address the risks of eating, say, 3
or 4 fresh peaches in a single evening.
     The variability in residue levels among composite samples is
always less (i.e., the distribution of residue levels is more
compact)  than the variability among samples of individual items
that were composited.  The magnitude of this reduction in
variability is governed by how much variation there is in residue
levels among  individual items being composited (in the case of
field trials,  individual items within a plot are composited); the
reduction in  variability due to compositing increases as the
variation in  residue levels among individual items within plots
increases.  This section describes a method for estimating the
variability in residue levels among individual items using data
from field trials in which individual items from within a plot
have been composited prior to analysis.  This method requires an
independent estimate of measurement error whenever field trials
do not provide multiple measurements on the same sample.
     As noted earlier, most crop field trials involve dividing
the U.S.  into growing regions, selecting plots within regions and
selecting samples within plots.  In some studies, replicate
measurements may also be made on a single sample.  With this type
of design, the total variation in an observation can be expressed
as the sum of several components of variation.  For example, let
     $\sigma^2_{TOT(I)}$  =  total variance of an observation made on a sample
               consisting of an individual item (e.g., apple).
     $\sigma^2_{TOT(C)}$  =  total variance of an observation made on a
               composite sample of m individual items (e.g., a
               blending of m apples).
The components making up these total variances are
  (8)    $\sigma^2_{TOT(I)} = \sigma^2_R + \sigma^2_{P/R} + \sigma^2_{S/P} + \sigma^2_m$
  (9)    $\sigma^2_{TOT(C)} = \sigma^2_R + \sigma^2_{P/R} + \frac{\sigma^2_{S/P}}{m} + \sigma^2_m$
where:
     $\sigma^2_R$      =  component due to differences among regions,
     $\sigma^2_{P/R}$  =  component reflecting plot-to-plot variation within
                    a region,
     $\sigma^2_{S/P}$  =  component reflecting sample-to-sample variation
                    (of individual items) within a plot,
     $\sigma^2_m$      =  component reflecting measurement variation, and
     $m$           =  number of individual items making up a composite
                    sample.
Expressions  (8)  and (9)  are derived from additive models that
assume that  an  observation consists of an overall mean plus
effects due  to  growing regions,  plots within regions, samples
within plots and a  measurement error.
     The objective is to estimate $\sigma^2_{TOT(I)}$ using data from tests made
on composite samples.  We assume that the number of items being
composited, m, is known.  Combining (8) and (9) we obtain
  (10)   $\sigma^2_{TOT(I)} = \sigma^2_{TOT(C)} + \frac{m - 1}{m} \, \sigma^2_{S/P}$
The first term on the right hand side of (10), the total variance
of a composite sample, may be estimated as
  (11)   $\hat{\sigma}^2_{TOT(C)} = \sum_h \sum_i \sum_j \sum_k w_{hijk} (y_{hijk} - \bar{y})^2$
where:
     $y_{hijk}$  =  k-th determination of residue made on the j-th
                  sample in the i-th plot within stratum h,
     $w_{hijk}$  =  sample weight assigned to $y_{hijk}$, and
     $\bar{y}$     =  weighted mean ( = $\sum_h \sum_i \sum_j \sum_k w_{hijk} \, y_{hijk}$ ).
 The other term in (10) which must be estimated before $\sigma^2_{TOT(I)}$ can be
 determined is $\sigma^2_{S/P}$, variation among samples (of individual items)
 within plots.
     Two procedures are given for estimating $\sigma^2_{S/P}$: one procedure
 for the case where replicate measurements are made on each
 composite sample and another procedure for the case where only
 one measurement is made on each sample.  If q measurements (q ≥
 2) are made on each composite sample, the ANOVA (analysis of
 variance) procedure can be used to estimate $\sigma^2_{S/P}$ as follows:
  (12)   $\hat{\sigma}^2_{S/P}$ = m ( (sample-within-plot mean square) - error mean square
In most  field  trials  only one measurement is made on each
composite  sample.   With only one measurement per sample (i.e.,
q=1), the error mean square from ANOVA provides the following:
     error mean square = $\frac{\hat{\sigma}^2_{S/P}}{m} + \hat{\sigma}^2_m$
  (13)   $\hat{\sigma}^2_{S/P} = m \, (\text{error mean square} - \hat{\sigma}^2_m)$
Whereas the error mean square in (13) is obtained directly from
ANOVA procedures (for the case of a single measurement per
sample), $\sigma^2_m$ must either be known or estimated from other data
before $\sigma^2_{S/P}$ can be determined.  $\sigma^2_m$ is the measurement error
variance and represents the failure of replicate measurements
taken on the same sample to be equal.  If r measurements (r ≥ 2)
are made on each of k different samples, the variance of the r
measurements on a given sample is an estimate of $\sigma^2_m$; thus, there
are k independent estimates of $\sigma^2_m$.  The k estimates can be
combined to form an average or "pooled" estimate by adding the
sum of squares (corrected for the mean) for each sample and
dividing by the sum of the degrees of freedom for each sample.
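
     The pooling step just described can be sketched as follows.  The
replicate measurements are hypothetical and the function name
pooled_measurement_variance is illustrative.

```python
def pooled_measurement_variance(replicates):
    """Pooled estimate of the measurement error variance.

    replicates: list of lists; each inner list holds the repeat
    measurements (two or more) made on one sample.  The corrected sums
    of squares are added over samples and divided by the summed
    degrees of freedom.
    """
    ss_total = 0.0
    df_total = 0
    for values in replicates:
        sample_mean = sum(values) / len(values)
        ss_total += sum((v - sample_mean) ** 2 for v in values)
        df_total += len(values) - 1
    return ss_total / df_total


# Hypothetical replicate determinations (ppm) on k = 3 samples.
replicate_ppm = [[1.0, 1.4], [2.1, 1.9, 2.0], [0.5, 0.7]]
var_m = pooled_measurement_variance(replicate_ppm)
print(var_m)
```

For these made-up values the corrected sums of squares are 0.08, 0.02,
and 0.02 on 1, 2, and 1 degrees of freedom, so the pooled estimate is
0.12 / 4 = 0.03.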

     Example A - (Continued)

     The data for Example  A, shown in Table 4-1, will be used to
illustrate how measurements  of  residue  levels on composite
samples can be used  to calculate the total variation  associated
with an individual item.
     From (11), $\sigma^2_{TOT(C)}$ is estimated as
            $\hat{\sigma}^2_{TOT(C)} = 9.75$
In applying  (11), the subscript k drops out since there  is  only
one determination per sample.  Also, the weights for  individual
observations  (not shown  in Table 4-1) are:  0.025 for each
observation  in Region A; 0.0625 for each observation  in  Region B;
and 0.05417  for  each observation in Region C.
     For purposes of this example, we assume that a composited
sample consists of m = 10 individual items and also assume a
measurement error variance of $\sigma^2_m$ = 0.07.
     The error mean square (i.e.,  variance of the two sample
measurements within each plot, pooled or averaged over all  ten
plots) is estimated to be 0.532.  Thus, from  (13) we  obtain:

                    $\hat{\sigma}^2_{S/P} = 10 \, (0.532 - 0.07) = 4.62$
Finally, from (10) we obtain:
     $\hat{\sigma}^2_{TOT(I)} = \hat{\sigma}^2_{TOT(C)} + \frac{m - 1}{m} \, \hat{\sigma}^2_{S/P} = 9.75 + \frac{9}{10} (4.62) = 13.91$

     In this example, the population standard deviation of
composite samples is estimated at 3.12 ppm calculated from:
     $\hat{\sigma}_{TOT(C)} = \sqrt{9.75} = 3.12$
If individual items had been analyzed, we would estimate the
population standard deviation to be 3.73 ppm calculated from:
     $\hat{\sigma}_{TOT(I)} = \sqrt{13.91} = 3.73$

Thus, in this example, the spread  in the distribution of residues
for  individual  items  (as measured  by the estimated standard
deviation) is about 20 percent  larger than the corresponding
standard deviation for composite samples.  (Note:  in using
actual field trial data, the  "population" to which these standard
deviations apply is undefined since probability sampling is not
employed in the selection of  plots within growing regions.)
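     The continued Example A can be retraced with the sketch below,
which applies equations (11), (13), and (10) to the hypothetical data
of Table 4-1.  Variable names are illustrative.  The observation
weights are taken as each region's production share divided equally
among its observations, which reproduces the weights quoted in the
example.

```python
from math import sqrt

# Table 4-1: region -> list of plots -> (sample 1, sample 2), in ppm.
table_4_1 = {
    "A": [(2.50, 2.95), (0.10, 0.40)],
    "B": [(2.50, 2.00), (5.65, 6.40)],
    "C": [(1.70, 1.55), (4.15, 3.35), (9.15, 11.95),
          (0.30, 1.10), (1.40, 1.95), (0.60, 0.30)],
}
production_share = {"A": 0.10, "B": 0.25, "C": 0.65}
m = 10            # items per composite sample (assumed in the example)
var_meas = 0.07   # measurement error variance (assumed in the example)

# (weight, observation) pairs; weights sum to one.
obs = []
for region, plots in table_4_1.items():
    n_obs = sum(len(plot) for plot in plots)
    for plot in plots:
        for y in plot:
            obs.append((production_share[region] / n_obs, y))

y_bar = sum(w * y for w, y in obs)
var_composite = sum(w * (y - y_bar) ** 2 for w, y in obs)   # equation (11)

# Error mean square: within-plot sums of squares pooled over all plots.
ss_within = 0.0
df_within = 0
for plots in table_4_1.values():
    for plot in plots:
        plot_mean = sum(plot) / len(plot)
        ss_within += sum((y - plot_mean) ** 2 for y in plot)
        df_within += len(plot) - 1
error_ms = ss_within / df_within

var_sp = m * (error_ms - var_meas)                          # equation (13)
# The text carries the rounded value 9.75 forward into equation (10).
var_item = round(var_composite, 2) + (m - 1) / m * var_sp   # equation (10)

sd_composite = sqrt(round(var_composite, 2))
sd_item = sqrt(var_item)
print(round(var_composite, 2), round(error_ms, 3), round(var_sp, 2),
      round(var_item, 2), round(sd_composite, 2), round(sd_item, 2))
```

Rounded as in the text, the results are 9.75, 0.532, 4.62, 13.91,
3.12 ppm, and 3.73 ppm.  Without the intermediate rounding of 9.75
the individual-item variance comes to about 13.90; the standard
deviations are unchanged at two decimal places.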
     In estimating exposure for assessing acutely toxic
pesticides, a point in the upper tail of the distribution of
residues of individual items  is recommended.  Typically, using
the estimated residue level corresponding to the mean plus three
standard deviations would be  appropriate, although use of this
value is not absolute since other  factors such as the nature of
the toxic effect must be considered.  In our numerical example,
the estimated mean is 3.21 (Table  4-1) and the estimated standard
deviation for individual items  (as derived in this section) is
 3.73.   Thus,  for  this  example, the residue value for acute
 exposure  assessment  becomes  14.4  (=3.21 + 3 x 3.73).  The basis
 for  choosing  three standard  deviations above the mean is simply
 to obtain a residue  level that will likely encompass most of the
 treated crop  moving  in interstate commerce.  Four or perhaps five
 standard  deviations  may be a better choice since field trial data
 cannot  be used to make confidence statements regarding the
 proportion of all of the treated crop moving in interstate
 commerce  that exceeds  a specified residue level.
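     The effect of the choice of multiplier can be seen with a short
calculation using the estimates from Example A.  The function name
acute_residue_value is illustrative, and the choice among three, four,
or five standard deviations remains a matter of judgment as noted
above.

```python
def acute_residue_value(mean_ppm, sd_ppm, k=3):
    """Upper-tail residue value: mean plus k standard deviations."""
    return mean_ppm + k * sd_ppm


# Estimates from Example A: mean 3.21 ppm, individual-item SD 3.73 ppm.
for k in (3, 4, 5):
    print(k, round(acute_residue_value(3.21, 3.73, k), 2))
```

For k = 3, 4, and 5 the values are 14.40, 18.13, and 21.86 ppm,
respectively.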
     It should be reemphasized that all discussions thus far
 relate  to the use of crop field trial data in estimating
 anticipated residues in treated crops moving in interstate
 commerce.  Whether (or how)  these estimates can be validly
 translated into estimates of anticipated residues in table-ready
 foods involves major issues  which are outside the scope of this
 appendix (e.g., portions of the crop of interest may not be
 treated or treated at  levels different from those used in crop
 field trials, effects  of commercial processing, storage, and food
 preparation practices).
4.3  FDA Pesticide Monitoring Program
     4.3.1  Introduction
              The responsibility of regulating pesticides is
shared by three  federal agencies.  Under the authority of the
Federal Insecticide, Fungicide, and Rodenticide Act, the
Environmental Protection Agency  (EPA) is responsible for
registering pesticides used in the United States.  If pesticide
uses can lead to residues in food or feed, EPA is responsible
under the Federal Food, Drug, and Cosmetic Act for setting
tolerances which limit the levels of residues that may be legally
present in these commodities in interstate commerce.  The Food
and Drug Administration (FDA) has the responsibility of enforcing
all tolerances established by EPA except those on meat and
poultry for which the Department of Agriculture  (USDA) has
enforcement responsibility.  The USDA Food Safety and Inspection
Service (FSIS) has the responsibility of enforcing pesticide
tolerances on meat and poultry.  The USDA Agricultural Marketing
Service (AMS) has the responsibility of enforcing pesticide
tolerances on eggs.  The USDA monitoring programs were not
examined in detail in the preparation of this appendix and are
not discussed further.
     In carrying out its tolerance enforcement responsibility,
FDA has been monitoring pesticide residues for almost four
decades.  During this period, the programs have increased in size
and scope, and have undergone extensive revisions.  The FDA's
current pesticide program has two components — regulatory
monitoring and the Total Diet Study.  An overview of these
program components, focusing primarily on the data they produce
and its usefulness to EPA in risk assessment, is given in
sections 4.3.2 and 4.3.3.  References [6] through [15] cited in
section 4.4 provide a detailed discussion of FDA's pesticide
monitoring efforts.
     4.3.2  Regulatory Monitoring
              4.3.2.1  Overview
              The primary objective of FDA's regulatory
monitoring is the enforcement of pesticide tolerances in foods
moving in interstate commerce.  FDA officials make it clear that
this program component is not designed to resemble a
probability-based sample¹ of foods in interstate commerce.  An
important secondary objective of FDA's program is to make the
monitoring data (or summaries of these data)  readily available to
other governmental agencies,  interested parties and to the
general public.  Internally, FDA uses these data routinely for
program planning and evaluation.  We note that FDA has recently
included in its program a trial effort to provide statistically
based monitoring data in pears and tomatoes.  This type of
monitoring by FDA will be more useful for risk assessment
purposes.

     ¹In a probability-based survey the chance of every item/member
being selected can be calculated.  This permits making inferences
from the sampled data to the general population, in this case, all
foods in interstate commerce.

       Currently,  FDA collects and  analyzes about 19,000 samples
  each year.   About 8,000 are domestic samples  and the remaining
  11,000 samples come from imported commodities.  The various
  elements of regulatory monitoring are shown in the diagram below.

                         Regulatory Monitoring
                                (19,000)*

     Domestic (8,000)
          Core (1,000)
          Selective Surveys (1,000)
          District (6,000)
     Import (11,000)
          General (8,000)
          Mexican (3,000)

     *Typical numbers of samples analyzed each year are shown in
     parentheses.

     Each of FDA's 21 districts is required to collect at least 12
shell egg samples, at least 24 samples of milk and/or cheese, and at
least 12 samples of fish and/or shellfish.  These commodity-specific
requirements make up the core element.
     Selective surveys are special surveys initiated by headquarters
to monitor particular pesticide-commodity combinations.  A number of
factors (e.g., incidents of misuse, special requests from EPA,
information regarding potential risk of a pesticide) may trigger a
selective survey.  Some examples of pesticides targeted in 1987
selective surveys include aldicarb in potatoes, benomyl in various
fruits, and captan in cherries.
     The district element makes up the major portion of domestic
sampling.  This encompasses all the sampling/analysis/reporting
activities planned and carried out by the individual districts.
Except for special surveys initiated by headquarters, each district
determines what commodities to sample, how many samples to take, and
the analysis method (generally, one (or more) of the multiresidue
analytical methods are employed).   Most of the samples are termed
"surveillance" samples in that collections are made without
suspicion that the shipment contains illegal residues (i.e.,
residues exceeding established tolerances or exceeding zero if
tolerances have not been established).  However, a small portion of
each district's resources is set aside for "compliance" sampling
which refers to followup samples collected after an illegal residue
has been found or when there is strong evidence of a violation.  The
selection of samples is discussed in more detail later in this
report.
     The import element (general and Mexican) of regulatory
monitoring employs basically the same strategies as described above
for the domestic element - namely, periodic headquarters-initiated
surveys covering specific commodities and/or specific pesticides, a
minimum of two special emphasis surveys in districts having ports of
entry, and import sampling as determined by the individual district.
The basis for sample selection is discussed later in this report.
     A review of FDA's pesticide monitoring program (domestic and
import) was completed by the General Accounting Office (GAO) in
1986.
     FDA has responded positively to GAO's suggestions and has
implemented several program changes since 1986.  To ensure broader
coverage of foods sampled, requirements are now in place for
regional offices to develop annual sampling plans (with inputs from
individual districts).  These plans are submitted to FDA
headquarters for review at the beginning of the fiscal year and
updated at the middle and again at the end of the year.
Modifications to the sampling plans are made by headquarters on an
as-needed basis to ensure adequate coverage of domestic and imported
foods.  FDA began implementing these requirements in FY 1988.
     A regional sampling plan shows which food commodities are to be
sampled, the number of samples for each commodity, time of sampling
(i.e., fiscal year quarter), sampling rationale, and analytical
method (for multiresidue methods, the PAM reference number is given,
and for single residue methods, the targeted pesticide or pesticide
class is given).  This information is reported in the sampling plan
for  each state within the region from which one or more samples of a
food commodity are expected to be collected.  Mid-year and final
updates of the plan reflect the actual numbers of samples
collected/analyzed and provide for explanations of any sampling plan
modification.
     Based on name alone, sampling rationale would appear to be one
piece of information that may be useful to EPA, since it might give
the reason why the sample was selected.  Seven codes are used to
describe sampling rationale.  It should be noted that one or more of
the codes can be used in the sampling plan to describe why a
particular crop was chosen to be sampled.  These codes are:

     P =      violative  findings  within  a  district  from  the previous
              fiscal year for  a commodity-pesticide combination,
     V =      high  production  volume  within  a  district,
     C =      commodity  is  on  an  FDA  prepared  list  of  commodities
              that  may be considered  for surveillance  sampling,
     L =      lack  of prior sampling  of  the  particular
              commodity-pesticide combinations within  a  district,
     S =      sampling is based on the Surveillance Index,  or the
              pesticide  is  otherwise  significant,
     H =      headquarters-initiated  survey, and
     D =      special emphasis survey.

Clearly, codes H and D allow identification of headquarters-
initiated and special emphasis surveys.  Further discussion with FDA
is necessary to determine if additional information regarding
sampling strategy can be obtained allowing EPA to better determine
the appropriateness of using these data.
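
     The seven rationale codes lend themselves to a simple lookup.
The sketch below is a hypothetical illustration, not a description of
any FDA or EPA system: it assumes that a plan entry records its codes
as a comma-separated string such as "V,S" (the actual recording format
is not described here), and it flags entries carrying code H or D.

```python
# Sampling rationale codes as listed in the text.  The entry format
# ("V,S") and the helper names are assumptions made for illustration.
RATIONALE_CODES = {
    "P": "violative findings within a district from the previous fiscal year",
    "V": "high production volume within a district",
    "C": "commodity is on an FDA-prepared list for surveillance sampling",
    "L": "lack of prior sampling of the commodity-pesticide combination",
    "S": "based on the Surveillance Index, or pesticide otherwise significant",
    "H": "headquarters-initiated survey",
    "D": "special emphasis survey",
}

# Codes that identify headquarters-initiated and special emphasis surveys.
SURVEY_CODES = {"H", "D"}


def parse_rationale(entry):
    """Split an entry such as 'V,S' into a set of valid rationale codes."""
    codes = {c.strip().upper() for c in entry.split(",") if c.strip()}
    unknown = codes - set(RATIONALE_CODES)
    if unknown:
        raise ValueError(f"unknown rationale code(s): {sorted(unknown)}")
    return codes


def is_directed_survey(entry):
    """Return True if the entry carries code H or D."""
    return bool(parse_rationale(entry) & SURVEY_CODES)
```

For example, is_directed_survey("H") is true, while
is_directed_survey("V, S") is false.  The remaining five codes do not
by themselves say which pesticide, if any, a sample was targeted
toward.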
     Greater emphasis  is being given to establishing FDA-State
cooperative pesticide monitoring programs to obtain maximum
commodity-pesticide coverage with available resources.  Guidance in
establishing (or enhancing) cooperative programs is provided to
regions and districts by headquarters.  As part of its annual
sampling plan, each region is now required to submit a listing of
planned activities for each state at the beginning of the fiscal
year and then submit a revised listing of actual activities at the
end of the year.  A copy of the region's sampling plan covering
FDA/State cooperative activities in Georgia is shown in Exhibit 4-1.
A detailed discussion of the FDA-State cooperative pesticide
monitoring programs is given in Annex I of Attachment A in [8].

              4.3.2.2     Basis for Sampling

              Under general guidelines  for  program  planning and a
resource budget provided by headquarters, each district generally
decides what commodities to sample, how many samples to collect of
each commodity, and where to select each sample; the exceptions are
headquarters-initiated surveys.  The total number of samples over
all commodities is governed by the amount of resources made
available by headquarters (also, a portion of each district's
resources is set aside for special surveys and compliance (i.e.,
followup) sampling).  Some of the factors/considerations that
districts take into account in reaching their decisions about what,
where and how much to sample include:

     • intelligence information on local pesticide usage, i.e., who
       are the users and what are they using, (sources of
       information include state/local governments, the agricultural
       extension service, universities, retail sales information,
       etc.)
     • dietary importance of the commodity
     • commodities of local origin are preferred over those in
       transit
     • past violations within the district
     • Surveillance Index (highest priority is given to pesticides
       with greatest potential health risk)
     • production volume of the commodity
     • coverage of major raw and processed foods produced in the
       state
     • long-range coverage (i.e., over multiple years) of commodity-
       pesticide combinations

Districts are urged  to make  collections reflective of  pesticides
being used and  to use growers or  packing sheds  as collection  sites
for fruits and  vegetables.   Although the above  considerations cannot
be meaningfully ranked in  terms of  importance,  FDA focuses
considerable attention on  intelligence gathering within districts.
With enforcement  of  tolerances as its primary mission, it  is
important for FDA to determine, to  the extent possible, what
pesticides are  being used  in a district, where  they are being used,
likely misuses, and  any  other information  that  will increase  the
chances of selecting and analyzing  food samples from shipments with
illegal residue levels.

              4.3.2.3     Variables of Interest in FDA's Database

              FDA's pesticide monitoring data are stored in a
computerized database.  This database contains residue measurements
along with a number of other characteristics or variables relating
to each commodity  sample/analysis  (e.g., sample identification,
collection date, collection district, location (state), analysis
laboratory) which  are used in generating various kinds of data
tabulations.  A variable of particular  interest to EPA in
determining geographic coverage of FDA  samples is where a given
sample was grown.  The origin of a sample is not explicitly  included
in the data file;  however, two allied variables - state and  FDA
district where the sample was collected - are contained in the file.
     Since enforcement is FDA's primary mission,  collecting
information that will enable violative  samples to be traced  to the
responsible party  is a vital activity performed within each
district.  Sample "collection reports", prepared by and maintained
(for approximately two years) within district offices, reportedly
provide information that can often allow origins of samples to be
determined.  In any event, since samples of local origin are
preferred by FDA, it appears that the state (district) where a raw
commodity was collected is a reasonably good indicator of the state
(district) where the commodity was grown.

     [Exhibit 4-1.  FDA/State Cooperative Program Description]

              4.3.2.4     Data Assessment

              In carrying out its mission of pesticide tolerance
enforcement, FDA does not employ probability-based sampling.  Its
goal can be achieved more efficiently by selecting, to the extent
possible, samples known to have been treated with a particular
pesticide.  Samples selected in this manner are more likely to
contain illegal residues than samples selected using a
probability-based procedure.  When selecting samples of a particular
crop, FDA will often focus (i.e., target) its efforts toward a
particular pesticide.  In this section we explore ways  of utilizing
this information to identify  major biases in the FDA monitoring data
and to identify segments of the data which might be useful to EPA
for dietary exposure assessment.
     Many of FDA's food  samples are analyzed by multiresidue
analytical methods — methods which can identify the presence of a
group of pesticides in a single analysis.  When residues of interest
cannot be determined by  multiresidue methods, single residue methods
are employed.  An assessment  of the usefulness to EPA of data
generated by each type of method is given.
     In determining the potential presence or absence of major
biases in FDA's monitoring data from the standpoint of EPA needs, it
may be useful to ask FDA the  following questions.  These are global
questions that relate to how  a particular set of available FDA
monitoring data in a particular district was targeted.
     a)  Were the samples of the crop of interest targeted toward the
        specific pesticide of interest?  (Recall that EPA's concern
        is a particular crop-pesticide combination.)
     b)  Were the samples of the crop of interest targeted toward
        competing pesticides?
     c)  Were the samples of the crop of interest targeted toward
        noncompeting pesticides?
     d)  Were the samples of the crop of interest selected to assure
        commodity coverage rather than targeted toward specific
        pesticides?

     The FDA database and the annual sampling plan submitted by each
region (in particular, the section dealing with sampling rationale)
do not contain the information required to answer these
questions.  Whether data are presently available or whether data
could, if necessary, be collected for answering these questions are
issues that must be addressed with FDA.
     In order to determine the potential presence or absence of
major biases in the FDA data, one of the above four questions must
be answered in the affirmative for a particular crop analyzed by
multiresidue methods.  For food samples analyzed by single residue
methods, question (a) would be answered "yes", given FDA's goal of
selecting treated crop samples.
     There may be occasions when questions are answered "no" because
some samples of a particular crop are targeted to a competing
pesticide and the others are targeted to a noncompeting pesticide.
For such situations, an alternative approach is to ask the above
four questions about each individual sample of the crop of interest.
This is probably an unreasonable request because of the burden it
would place on FDA.  Also, the questions are probably unanswerable
at the individual sample level.
     These questions should be directed to individual districts
since most decisions regarding what commodities to sample and sample
sizes are made at the district level and the reason for sampling a
particular crop  (i.e., answers to the above questions) would vary
from one district to another district.  The exceptions are those
situations where headquarters provides directions regarding the
reason for sample selection that would apply to all districts.  For
these cases, headquarters  is the obvious point of contact for the
desired  information.
     The four questions discussed below address idealized sampling
situations which are not likely to occur.  These questions are meant
to provide a starting point from which to evaluate the usefulness of
available data for dietary exposure assessment on a case-by-case
basis.

     Question (a):   Were the samples of the crop of interest
targeted toward the specific pesticide of interest?  A "yes" answer
to this question implies that FDA knew (or strongly believed), at
the time of sample selection, that crop samples were likely to have
been treated with the pesticide of interest to EPA.  However, these
are not randomly selected samples of all treated crop grown in a
district but, rather, purposive samples chosen by FDA district
inspectors.  It appears reasonable to conclude in this case that
residues in the reported data are at least as high as would be
expected under random sampling of the crop.  This, then, is a
situation in which data reported by FDA on a pesticide-crop
combination may be useful to EPA when aggregated across all
applicable districts.  This question may be answered "yes" or "no"
for crop samples analyzed by multiresidue methods.  However, in
keeping with FDA's strategy of selecting treated samples, this
question should be answered "yes" for all crops analyzed by single
residue methods.
     Question (b):   Were the samples of the crop of interest
targeted toward competing pesticides?  When this question is
answered "yes" and when both the targeted pesticide and the
pesticide of interest to EPA (i.e., nontargeted pesticide) are not
used on the same crop sample during a single growing season, the
results reported by FDA for the pesticide of interest are biased and
should not be used by EPA.  The bias in reported values for the
pesticide of interest is reduced if both the targeted and
nontargeted pesticides may be used on some of the samples, and in
these situations, the data may be of some value in estimating
dietary exposure.
     Question (c):   Were the samples of the crop of interest
targeted toward noncompeting pesticides?  Suppose FDA inspectors in
a particular district answer "yes" to this question.  Furthermore,
assume that treatment of a sample with the targeted pesticide does
not influence the decision to use the noncompeting pesticide of
interest on that sample  (i.e., uses of the two pesticides are
independent).  Under this assumption, it is reasonable to conclude
that reported data for the pesticide of interest might be useful to
EPA (when aggregated across districts); at least there is no strong
evidence of a significant bias.  This conclusion is not supported by
statistical evidence since probability sampling is not employed by
FDA.  Rather, it assumes that samples are "representative" of the
crop of interest.  If one cannot assume that treatment of a sample
with the targeted pesticide has no effect on whether the sample is
treated with the pesticide of interest, the reported data for the
pesticide of interest may be biased, and the reporting of a dietary
exposure assessment based on these data must indicate the
uncertainties introduced by this bias.
     Question (d):   Were the samples of the crop of interest
selected to assure commodity coverage rather than targeted toward
specific  pesticides?  A "yes" answer to this question implies that
crop samples were selected for the sole purpose of commodity
coverage and that FDA disregarded all information relating to
treatment history - namely, were the samples treated and, if so,
what pesticides  were used?  In the event this sampling strategy is
employed, it is  reasonable to conclude that reported data for the
pesticide of interest might be useful to EPA when aggregated across
all districts.   This conclusion is not supported by statistical
evidence since probability sampling is not employed by FDA.  Rather,
it assumes that  samples are "representative" of the crop of
interest.
     It should be reemphasized that the above comments corresponding
to FDA's answers to questions (a) through (c) assume that FDA
inspectors are effective in selecting samples treated with the
targeted pesticide(s).  In the case of question (d) the comments
assume that FDA  inspectors ignore all pesticide usage information in
making sample selections.
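
     The reasoning attached to questions (a) through (d) can be
summarized as a small decision rule.  The sketch below is one possible
encoding under stated assumptions: the names, the outcome labels, and
the two yes/no flags (co_use_possible, uses_independent) are
hypothetical shorthand for the conditions discussed above, and the
rule inherits the assumption that FDA's answers are correct.

```python
from enum import Enum


class Targeting(Enum):
    """How a district's samples of the crop were targeted (questions a-d)."""
    PESTICIDE_OF_INTEREST = "a"
    COMPETING_PESTICIDE = "b"
    NONCOMPETING_PESTICIDE = "c"
    COMMODITY_COVERAGE = "d"


# Outcome labels are illustrative shorthand, not Agency terminology.
POTENTIALLY_USEFUL = "potentially useful when aggregated across districts"
SOME_VALUE = "bias reduced; may be of some value"
DO_NOT_USE = "biased; should not be used"
REPORT_UNCERTAINTY = "may be biased; uncertainty must be reported"


def targeting_for_method(single_residue_method, reported):
    """Single residue methods imply targeting of the pesticide of interest."""
    return Targeting.PESTICIDE_OF_INTEREST if single_residue_method else reported


def assess_district_data(targeting, co_use_possible=False, uses_independent=True):
    """Map a district's targeting answer to the assessment given in the text.

    co_use_possible:  for (b), the targeted and nontargeted pesticides may
                      both be used on some samples in one growing season.
    uses_independent: for (c), use of the targeted pesticide does not
                      influence use of the pesticide of interest.
    """
    if targeting is Targeting.PESTICIDE_OF_INTEREST:
        # Residues are expected to be at least as high as under random sampling.
        return POTENTIALLY_USEFUL
    if targeting is Targeting.COMPETING_PESTICIDE:
        return SOME_VALUE if co_use_possible else DO_NOT_USE
    if targeting is Targeting.NONCOMPETING_PESTICIDE:
        return POTENTIALLY_USEFUL if uses_independent else REPORT_UNCERTAINTY
    if targeting is Targeting.COMMODITY_COVERAGE:
        return POTENTIALLY_USEFUL
    raise ValueError(f"unrecognized targeting value: {targeting!r}")
```

A "potentially useful" outcome for (c) or (d) rests on the samples
being "representative" of the crop; it is not supported by statistical
evidence, since probability sampling is not employed.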
     FDA monitoring data on a particular crop-pesticide combination
 judged  to  be  potentially useful  to  EPA must be aggregated across
 applicable districts  to obtain a mean residue level on a nationwide
 basis.  The use of a nationwide estimate derived by weighting
 district means by their corresponding crop production values is
 suggested.  Procedures described in section 4.2.5.1 can be used  (by
 substituting  the word district for  growing region) to calculate a
 population mean  and standard error.  Before making a nationwide
 estimate,  one should  verify that all major growing regions are
 represented in the data to be combined.  In assessing data from each
 district (using  answers from the four questions regarding the
 pesticides to which the crop samples are targeted), one might find
 districts  with potentially useful data, districts with data which
 are judged unacceptable, and districts that did not sample the crop
 of interest.  Thus, it is important to look closely at the location
 of those districts with potentially useful data and determine if
 they are representative of the growing regions in the United States.
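
     A production-weighted nationwide mean can be sketched as
follows.  Section 4.2.5.1 is not reproduced here, so the standard
error shown uses the usual stratified form, sqrt(sum(W_d^2 * s_d^2 /
n_d)), as an assumption; the procedure in that section may differ.
The function name, the dictionary keys, the units (ppm) and all
figures in the example are hypothetical.

```python
import math


def nationwide_estimate(districts):
    """Combine district means into a production-weighted nationwide mean.

    districts: list of dicts with keys
        'production' - crop production value for the district (the weight)
        'mean'       - mean residue found in the district's samples
        'sd'         - standard deviation of those residues
        'n'          - number of samples in the district
    Returns (weighted mean, standard error).  The standard error uses the
    usual stratified form and is an assumption, not the text's formula.
    """
    total_production = sum(d["production"] for d in districts)
    if total_production <= 0:
        raise ValueError("total production must be positive")
    mean = 0.0
    variance = 0.0
    for d in districts:
        weight = d["production"] / total_production
        mean += weight * d["mean"]
        variance += weight ** 2 * d["sd"] ** 2 / d["n"]
    return mean, math.sqrt(variance)


# Hypothetical figures, for illustration only (residues in ppm).
example = [
    {"production": 600.0, "mean": 0.10, "sd": 0.04, "n": 16},
    {"production": 300.0, "mean": 0.20, "sd": 0.06, "n": 9},
    {"production": 100.0, "mean": 0.50, "sd": 0.10, "n": 4},
]
mean_ppm, se_ppm = nationwide_estimate(example)
```

With these made-up inputs the weights are 0.6, 0.3 and 0.1, and the
weighted mean is about 0.17 ppm.  Because FDA's samples are not
probability-based, a standard error computed this way describes only
the variability among the samples actually taken.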
     Also,   one should determine  if  the sample size is "adequate"
 before making a  nationwide estimate.  The number of samples required
 will depend on the specific crop, the percent of the crop which is
 treated, the  toxicological effect,  the geographical distribution of
 pesticide  use, and other factors.   There are 21 FDA districts in the
 United States.   Since some crops may be grown in all districts while
 others may  be grown only in a small subset of the districts, it is
 difficult  to  establish meaningful lower boundaries on the number of
 samples per district.  Generally, a large total number of samples is
 required before  FDA monitoring data should be used as the primary
 source of data for exposure assessment.
     FDA's  answers to questions  (a), (b)  and (c)  are assumed to be
 correct in  the absence of conflicting information.  However, this
 assumption  may not always be valid.  In reality, crop samples
 thought to  be treated with the targeted pesticide(s) may have been
 treated with  other pesticides or they may not have been treated at
 all.  This  is a  source of misclassification bias (i.e., errors due
 to misclassifying the treatment  history of selected samples).
 Errors of misclassification can  also occur when question (d) is
answered "yes".   Although sample selection may not be formally
targeted toward specific pesticides in question  (d), FDA inspectors
may make selections that favor crops treated with pesticides of
interest to FDA.  The extent of these misclassification biases can
never be determined.  The use of global type questions (e.g.,
questions (a) through (d)) makes it easier for FDA to provide an
answer regarding pesticide treatment but, at the same time, reduces
the level of data quality.

              4.3.2.5     Conclusions

              Each year several  thousands  of  food samples  covering a
wide variety of commodities are analyzed by FDA  for a large number
of pesticide residues.  The results are entered  into a computerized
database and made available to the public.  In assessing whether EPA
can effectively utilize these data in estimating anticipated
residues, one must consider the number of samples available, the
geographical distribution of the residue data relative to locations
where the pesticide would likely be applied,  the availability of
historical FDA monitoring data over several years showing similar
trends from year to year, the likelihood that a competing pesticide
was targeted for most  samples thus introducing bias into the data,
whether the components analyzed correspond to the major components
of toxicological concern, the adequacy of the limits of
detection/quantification of the analytical method used, whether FDA
data are consistent with other available residue data, and whether
the FDA data are the best data available in a particular situation.
EPA will utilize FDA monitoring data for dietary exposure assessment
when consideration of these factors indicates that use of these data
is appropriate.

     4.3.3     Total Diet Study (TDS)

              4.3.3.1     Description of Program

              The second component of the FDA's program to monitor
pesticide residues in foods is the Total Diet  Study  (sometimes
referred to as the Market Basket  Survey).  The objectives  of the  TDS
 are to (1)  determine the dietary intake  of  pesticides,  (2)  compare
 intakes with Acceptable Daily Intakes  (ADI),  (3)  identify  trends,
 (4)  provide a measure of effectiveness of U.S.  regulations on
 pesticides,  and (5)  provide consumer confidence in  the  safety  of the
 food supply.   The TDS is unique  in  that  it  measures a broad range of
 pesticides  in foods  in the form  as  they  are consumed  [e.g.,  cooked
 pinto beans (using dried beans),  orange  juice  (reconstructed from
 frozen concentrate),  raw carrots].  The  TDS program began  about  30
 years ago when foods making up a diet of teenage males  were analyzed
 for  pesticide residues and for radionuclides from the fallout  from
 atmospheric nuclear  tests.
      Over the  past three decades, the TDS has undergone many
 changes.  The latest and most extensive  revision was  in 1982 when
 the program was completely redesigned using food consumption data
 and  eating  patterns  from two nationwide  surveys —  the  1977-78 USDA
 survey and  the 1976-80 National  Health and  Nutrition  Examination
 Survey.  About 5,000  different foods were identified  in these  two
 surveys.  The 5,000  foods  were broken down  into 234 groups on  the
 basis of type and nutrient content; the  foods making up each group
 are  thus considered  similar with respect to these characteristics.
 The  individual food  within each  group having the greatest
 consumption was then selected for inclusion in  the  TDS.  For
 example, fruit pies  and pastries with fruit are represented by apple
 pie.   Various  pasta  dishes are represented  by spaghetti and
 meatballs in  tomato  sauce.   With this approach,  all foods  identified
 in the  two  nationwide surveys are "represented" by  the  234
 individual  foods  which are analyzed for  pesticides  in the  TDS.   In
 the  latest  program revision,  diets were  developed for eight age-sex
 groups.  A  listing of the  234 foods analyzed for pesticides in the
 TDS  is  given  in Appendix  4A.
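
     The selection rule described above (one food per group, chosen
by greatest consumption) can be sketched as follows.  The function
name is hypothetical, the consumption figures are invented, and
"cherry pie" and "lasagna" are made-up group members added only so
that each group has more than one candidate.

```python
def select_representatives(foods):
    """Pick, for each food group, the member with the greatest consumption.

    foods: iterable of (group, food, consumption) tuples.
    Returns a dict mapping each group to its representative food.
    """
    best = {}
    for group, food, consumption in foods:
        if group not in best or consumption > best[group][1]:
            best[group] = (food, consumption)
    return {group: food for group, (food, _) in best.items()}


# Invented consumption figures, for illustration only.
survey_foods = [
    ("fruit pies and pastries", "apple pie", 9.0),
    ("fruit pies and pastries", "cherry pie", 4.0),
    ("pasta dishes", "spaghetti and meatballs in tomato sauce", 12.0),
    ("pasta dishes", "lasagna", 7.0),
]
representatives = select_representatives(survey_foods)
```

Under this rule a group's residue results rest entirely on its single
representative food.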
      Individual food  items  plus  some additional  ingredients  needed
 to prepare  the 234 foods  listed  in Appendix 4A  are  purchased from
 retail  stores;  collectively,  the group of 234 foods is  referred to
 as a "Market Basket".  Four Market Baskets are collected each
 year—one in  each of  four  geographic areas  of the United States
 (northeast, north central,  south, and west).  The order in which
geographic areas  are  sampled is  rotated  each year to ensure
uniformity over seasons.  Collection schedules are modified
occasionally to accommodate seasonal crops.  Samples of each of the
234 individual foods are collected in each of three cities within a
geographic area, shipped to the FDA Total Diet Laboratory in Kansas
City, MO, and then composited (i.e., blended) to form a single sample
of each food item per geographic area.  Where possible, a portion of
each food item from each city is retained for future analysis, in
case the composite sample gives a residue measurement above some
predetermined level.
     The current scheme for determining the three sampling locations
within a geographic area is as follows.  First, three Metropolitan
Statistical Areas (MSA) of different sizes are selected.  In the
next year three different MSAs, again of different sizes, will be
selected and the process of sampling without replacement continues
in subsequent years until all 261 MSAs have been used and the
process starts all over again.  It will, of course, take several
years to complete this cycle.  Next, one retail store is selected
from each MSA.  This is not, however, a random selection process.
For economic reasons, store selections are made from retail stores
large enough to be likely to have all the items necessary to prepare
the 234 foods and, at the same time, conveniently located so as to
minimize travel times of the food collectors.  Individual item
selections within the store are made by the food collectors as brand
names are not specified.
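
     The scale of the collection scheme follows from a few
multiplications.  The sketch below uses only the counts stated in this
section; the names are hypothetical, and the cycle length assumes that
the 261 MSAs form one nationwide pool drawn down at the combined rate
of all four geographic areas, which the text does not state
explicitly.

```python
import math

FOODS_PER_BASKET = 234
BASKETS_PER_YEAR = 4    # one Market Basket per geographic area
CITIES_PER_BASKET = 3   # one MSA, and one retail store, per city
TOTAL_MSAS = 261

# Retail stores visited each year.
stores_per_year = BASKETS_PER_YEAR * CITIES_PER_BASKET

# Composite samples analyzed each year (one per food per geographic area).
composites_per_year = FOODS_PER_BASKET * BASKETS_PER_YEAR

# Individual food collections feeding those composites.
collections_per_year = FOODS_PER_BASKET * stores_per_year

# Years needed to use every MSA once, sampling without replacement.
# Assumes a single nationwide pool of MSAs (see the note above).
years_per_cycle = math.ceil(TOTAL_MSAS / stores_per_year)

print(stores_per_year, composites_per_year, collections_per_year, years_per_cycle)
```

Under these assumptions, 12 stores are visited per year, 936
composites are formed from 2,808 individual collections, and a full
pass through the MSAs takes about 22 years.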

              4.3.3.2     Suitability of TDS Data for Estimating
                         Residues in Table-Ready Foods

              The  procedures employed  by FDA in obtaining  samples  of
the 234 food commodities for the TDS are nonrandom - selection of
MSAs within each region, selection of the one retail store within
each MSA, and selection of the food commodities within each retail
store.   Under the current sampling procedures, there are many
situations/conditions for which the 234 food commodities will not be
selected into the TDS sample - for example, retail stores located
outside an MSA (approximately 25% of the U.S. population live
outside an MSA),  small retail stores in an MSA that would not likely
have all of the 234  foods, roadside fruit/vegetable stands, and home
grown foods.
     Suppose EPA's objective is to estimate the anticipated (e.g.,
mean) residue of a given pesticide in a population of food defined
as "the totality of a particular commodity (e.g., canned green
beans) consumed in the United States over,  say, a one-year period."
In contrast to this  population of interest, the population actually
sampled in the TDS is the totality of a particular commodity  (e.g.,
canned green beans)  sold by the 12 selected retail stores over a
one-year period.
     Without employing some probability mechanism for making
selections at each stage in a survey, results observed in the
sampled population cannot be validly extended to the entire
population of interest.  This derives from  the fundamental
principles of survey design which are discussed at some length in
Appendix 3 of these  guidelines.
     Thus,  we conclude that the TDS data have limited usefulness as
the sole database for estimating anticipated residues in table-ready
foods.  These data are most useful as supplementary information to
support or call into question anticipated residues estimated from
other data.

-------
4.4 References

 1. U.S. Environmental Protection Agency, Office of Pesticide
    Programs (1982).  "Pesticide Assessment Guidelines, Subdivision
    O, Residue Chemistry."  EPA-540/9-82-023, and addenda:

    "Addendum 1 on Data Reporting:  Product Chemistry."
    EPA-540/09-88-048.

    "Addendum 2 on Data Reporting:  Magnitude of the Residue:  Crop
    Field Trials, Analytical Method(s), and Storage Stability
    Study."  EPA-540/09-86-151.

    "Addendum 3 on Data Reporting:  Nature of the Residue:  Plants."
    EPA-540/09-87-199.

    "Addendum 4 on Data Reporting:  Magnitude of the Residue:
    Processed Food/Feed Study."  EPA-540/09-88-004.

    "Addendum 5 on Data Reporting:  Specialty Applications:  (I)
    Classification of Seed Treatments and Treatment of Crops Grown
    for Seed Use Only as Non-Food or Food Uses; (II) Magnitude of
    the Residue:  Post Harvest Fumigation of Crops and Processed
    Foods and Feeds; (III) Magnitude of the Residue:  Post-Harvest
    Treatment (Except Fumigation) of Crops and Processed Foods and
    Feeds."  EPA-540/08-88-008.

    "Addendum 6 on Data Reporting:  Directions for Use."
    EPA-540/09-88-049.

    "Addendum 7 on Data Reporting:  Nature of the Residue:  Food
    Animals."  EPA-540/09-89-009.

    "Addendum 8 on Data Reporting:  Residues in Meat, Milk, Poultry,
    and Eggs:  Livestock Feeding Studies."

 2. Food and Agriculture Organization of the United Nations, Codex
    Committee on Pesticide Residues (1981).  "Guidelines on
    Pesticide Residue Trials to Provide Data for the Registration of
    Pesticides and Establishment of Maximum Residue Limits."  FAO
    Plant Protection Bulletin. 29, 12-27.

 3. Hersey, JC, Muenz, LR, and Held, K (1987).  "Statistical Design
    and Analysis of Crop Field Trials."  Report prepared for the
    Environmental Protection Agency.  SRA Technologies.

 4. USDA, 1988(a).  Agricultural Statistics, 1988.

 5. U.S. Environmental Protection Agency (1989).  "FIFRA
    Accelerated Reregistration Phase 3 Technical Guidance."
    EPA-540/09-90-078.
                                 4-38

-------
 6. Lombardo, P (1989).  "The FDA Pesticides Program:  Goals and
    New Approaches."  Journal of the Association of Official
    Analytical Chemists 72, 518-520.

 7. McMahon, BM and Burke, JA (1987).  "Expanding and Tracking the
    Capabilities of Pesticide Multiresidue Methodology Used in the
    Food and Drug Administration Pesticide Monitoring Programs."
    Journal of the Association of Official Analytical Chemists 70,
    1072-1081.

 8. U.S. Food and Drug Administration  (1988).  "Pesticides and
    Industrial Chemicals  in Domestic Foods, Compliance Program
    Guidance Manual 7304.004."

 9. U.S. Food and Drug Administration  (1987).  "Pesticides and
    Industrial Chemicals  in Imported Foods, Compliance Program
    Guidance Manual 7304.016."

10. U.S. Food and Drug Administration  (1988).  "Pesticides in
    Mexican Produce, Compliance Program Guidance Manual 7304.008."

11. Pennington, JAT (1981).  "Documentation for the Revised Total
    Diet Study Food List  and Diets."   PB82-192154, National
    Technical Information Service.  Springfield, VA.

12. Pennington, JAT (1983).  "Revision of the Total Diet Study Food
    List and Diets."  Journal of the American Dietetic Association
    82, 166-173.

13. U.S. Food and Drug Administration  (1986).  "Total Diet Study,
    Compliance Program Guidance Manual 7304.83."

14. Lombardo, P (1986).   "The FDA Total Diet Study Program."
    Proceedings of the Symposium on Exposure Measurement and
    Evaluation Methods for Epidemiology.  Environmental
    Epidemiology.  Lewis  Publishers, Inc.  Chelsea, Michigan.

15. Gunderson, EL (1988).  "FDA Total  Diet Study, April 1982-April
    1984, Dietary Intakes of Pesticides, Selected Elements, and
    Other Chemicals."  Journal of the  Association of Official
    Analytical Chemists 71, 1200-1209.
                                4-39

-------
       APPENDIX 4A

Listing of foods analyzed
 in the total diet study
          4A-1

-------
food
001 whole milk, fluid
002 low-fat milk, 2% fat, fluid
003 chocolate milk, fluid, low-fat milk
004 skim milk, fluid
005 buttermilk, fluid
006 yogurt, plain, low-fat
007 milkshake, chocolate, fast-food type
008 evaporated milk, canned
009 yogurt, sweetened, strawberry, prestirred
010 cheese, American, processed
011 cottage cheese, creamed, 4% milkfat
012 cheese, Cheddar, (sharp/mild)
013 beef, ground, regular hamburger, cooked
in patty shape
014 beef chuck roast, oven roasted
015 beef round steak, stewed in water
016 beef (loin/sirloin) steak, pan cooked with
added fat
017 pork, ham, cured, not canned,
-------
Total diet study food list with gram quantities for specified age-sex groups (continued)
food
040 cowpeas (blackeyed peas), boiled from
dried
041 lima beans, mature, boiled from dried
042 lima beans, immature, frozen, boiled
043 navy beans, boiled from dried
044 red beans, boiled from dried
045 peas, green, canned
046 peas, green, frozen, boiled
047 peanut butter, creamy, commercial in jar
048 peanuts, dry roasted in jar, salted
049 pecans, packaged, unsalted
050 rice, white, enriched, cooked
051 oatmeal, cooked
052 farina, enriched, cooked
053 corn grits (hominy grits), enriched,
054 corn, (fresh/frozen), boiled
055 corn, canned
056 corn, cream style, canned
057 popcorn, popped in oil
058 white bread, enriched
059 rolls, white, soft, enriched
060 cornbread, southern style, homemade
061 biscuits, baking powder, enriched,
refrigerated type, baked
062 whole wheat bread
063 tortilla, flour
064 rye bread
065 muffins (blueberry/plain)
066 saltine crackers
067 corn chips
068 pancakes made from mix with addition of
egg, milk, and oil
069 noodles, egg, enriched, cooked
070 macaroni, enriched, cooked
071 corn flakes

073 Shredded Wheat cereal
074 Raisin Bran cereal
075 crisped rice cereal
076 granola, with raisins
078 apple, red, with peel, raw
079 orange, raw, (navel/Valencia)
*-f« yr.
f
1.374
0.543
0.575
0.954
2.007
2.944
0.477
2.170
0.317
0.41Q
16.732
3.226
2.300
4.215
5.246
4.540
4.106
1:107
30.751
13.240
4.432
3.660
2.156
2.221
0.606
0.914
2.521
2.094
5.736
1.747
5.027
3.235
1.441
1.300
1.040
0.991
0.216
1.242
12.343
7.002
5.195
5.919
3.436
3.935
3.701
1.490
1.553
2.792
1.123
3.344
0.703
1.193
2.977
0.914
0.225
M
25-30 yr.
f
2.626 1.669
0.764
0.614
2.702
1.601
5.541
1.397
4.643
0.045
0.505
19.504
5.054
3.744
6.430
0.037
7.222
1.036
1.500
55.670
21.495
6.076
4.054
3.377
3.557
0.773
0.724
3.022
2.354
0.012
3.427
0.442
6.101
3430
3.124
2.971
2.196
0.294
3.949
16.640
9.634
0.101
3.154
4.410
2.244
4.070
3.010
1.255
3.724
1.173
1.770
2.549
0.542
1.474
0.694
0.474
0.409
1.242
0.710
1.396
5.319
1.671
1.214
0.731
0.944
15.495
4.340
* 1.921
3.130
5.475
2.440
1.615
1.690
34.424
' 10.055
3.049
3.063
4.176
2.306
1.711
0.947
3.442
0.009
4.442
1.459
5.312
1.344
0.240
1.515
1.231
0.774
0.547
0.447
13.945
7.974
3.449
3.447
2.402
3.434
2.441
2.140
1.442
3.917
1.194
3.134
0.794
0.734
6.201
1.394
0.592
M
3.356
0.023
1.431
2.203
2.554
5.09*
1.399
2.529
1.633
0.701
22.943
2.021
2.530
3.744
7.277
5.029
2.194
1.432
36.730
20.267
6.450
S.031
7.742
3.945
2.666
0.945
3.912
2.350
6.976
2.499
6.299
1.419
1.944
2.757
1.311
0.941
1.641
0.924
12.115
9.264
6.716
3.530
4,757
3.494
2.393
2.343
1.231
3.020
0.643
2.422
1.471
0.945
4.277
0.445
0.474
car
60-65 yr.
f
2.267
1.137
1.634
1.362
1 325
5.347
2.224
1.194
0.541
0.500
12.466
10.039
3.660
4.010
5.733
.1.341
1.435
0.125
34.900
6.029
0.313
4.080
5.490
0.400
3.103
1.014
3.559
0.200
3.610
1.210
2.641
1.793
0.067
2.046
2.301
1.054
0.753
0.345
15.750
0.055
12.719
7.265
6.679
6.924
4.795
3.700
4.309
5.509
1.057
0.510
2.743
1.051
rf.766
2.234
0.745
tfnMrf
M
3.M0
1.547
2.069
1.390
1.446
' 1.166
1.631
1.823
1.269
0.955
14.296
13.570
4.901
6.311
7.277
2.257
3.390
0.253
50.673
9.030
11.927
>.ao6
6.507
0.999
3.603
0.030
4.059
0.510
6.296
2.326
3.544
2.904
0.064
2.437
4.344
0.650
0.460
0.291
15.516
9.454
13.510
4.757
6.450
7.053
5.490
3.039
4.739
4.942
1.317
6.901
3.114
1.666
10.209
1.594
0.754
4A-3

-------
Total diet study food list with gram quantities for specified age-sex groups (continued)
food
6-11
mo.
2
yr.
14-16 yr.
F
M
25-30 yr.
F
M
60-65 yr.
F
M

095 raisins, dried
096 prunes, dried, uncooked
097 avocado, raw
098 orange juice, frozen, reconstituted
099 apple juice, canned, unsweetened
100 grapefruit juice, frozen, reconstituted
101 grape juice, canned
102 pineapple juice, canned
103 prune juice, bottled

canned
105 lemonade, frozen, reconstituted
106 spinach, canned
107 spinach, (fresh/frozen), boiled
108 collards, (fresh/frozen), boiled
109 lettuce, raw
110 cabbage, boiled from raw

112 sauerkraut, canned
113 broccoli, (fresh/frozen), boiled
114 celery, raw
115 asparagus, (fresh/frozen), boiled
116 cauliflower, (fresh/frozen), boiled
117 tomato, raw

119 tomato sauce, canned
120 tomatoes, canned
121 beans, snap green, (fresh/frozen), boiled
122 beans, snap green, canned
123 cucumber, raw, pared
124 squash, summer, (fresh/frozen), boiled
125 sweet pepper, green, raw
126 squash, winter, (Hubbard/acorn),
(fresh/frozen), boiled
127 carrots, raw
128 onion, raw
129 vegetables, mixed, canned
130 mushrooms, canned
131 beets, canned
m^^^L -_^
fVrai v fw
133 onion rings, breaded and fried, frozen,
134 French fries, frozen, commercial, heated
135 mashed potatoes with margarine and
milk, pnpand feaja) maaw
136 boiled potatoes without peel
137 baked potatoes with peel
* «gk •^••^gBifc 4«]e44aj« tfajaa^BB^BBVjaB^aiJ

140 f^poaaa. bahadlnite 	
141 ffA*Mt pOlflDf CaVawaM* flOMMRIfldat

V43 DaHf aVV ^4fKeawlw MaTWp flOflMflHOat
144 jSL**1**1*' *M"I> oomm-ltil''
145 chili con carne, beef and beans, canned
146 macaroni and cheese, prepared from box
mix
wMa tefl wi* aamiih, te4ood type
1 lal inalaatlnBBf tllaaf h rbf¥l Bill • llal
149 spaghetti in tomato sauce, canned


0.167
0.064
0.017
16.115
17.094
0.769
2.885
1.037
0.000
9447
a»»^^*T
OJ07
0.049
0.197
0.401
0.12S
0.238
O OA1 •
^».w»i
0.030
0.359
0.030
0.045
0.000
0.351
0.131
0.154
0.000
0.347
1.515
0.144
0.059
0.007
0.653

1.22*
0.013
1.0*4
0.000
0.019
0.000
0.050
2.475
7.3M

2.623
0.601
0.1*6
1.023
0.3*7
0.095
4 7M
^./ ^v
2.631
0.484
0.931
3.545

'
1 193
•. • »*
2.259
0.2*8
1.158
0.094
0.112
59.370
17.594
1.119
6.714
2.301
0.29*
24 9S4
4l~B#«^V
4.641
0.341
0.51*
0.501
2.619
0.6*7
0147
v«^^»
0.0*4
1.254
0.173
0.171
0.317
3.944
1.005
1.777
0.152
0.905
3.234
0.711
0.403
0.0*5
0.22*

2.400
0.221
1.112
0.124
0.337
0.037
0.097
13.515
9.179

6.5*2
3.171
2.0S7
2 049
e»*W^V
0.453
0.355
1* 410
4 W«V IV
5.791
3.191
2.22*
7.411
A «7-t
**^/*»
3 913
«P« (F*JieF
4.619
1.213
0.291
0.069
0.522
42.121
5.371
2.0*7
2.925
0.312
0.032
19O14
1 7*V •"
9.203
0.315
1,590
0.720
13.49*
1.803
1 9*2
9 >^W*
0.2*9
1.415
0.54*
0.119
0.404
9.500
0.888
1.124
0.154
1.209
4.6S4
1.730
0.5*1
0.27*
0.409

2.*47
1.312
2.732
0.543
0.733
0.273
0.6*7
19.954
15.004

9.495
5.420
4.191
4.145
0.9*4
0.322
17 477
1 / «^r m
8.881
12.445
7.M5
1.449
11 AM
1 J.V79
& 494
1 3^222
1.741
0.322 0.257
0.017
0.055
60.734
4.090
3.0*7
3.9*2
1.213
0.071
20 444
4M4\*VIJ^V
11.547
0.66*
1.937
1.353
12.247
1.7*5
f ^aU
0.737
0.901
0.335
0.2*5
0.303
9.712
1 632
' »a»ipa>
2.941
0.335
2.757
6.61*
0.9*0
0.829
0.652
0.027

2.9*4
1.755
3.533
0.301
0.413
0.13*
1.170
31.212
24:521

1.235
4.823
6.352
5171
^•Wr •
0.801
0.947
13570
+lf*m* V
10.89*
39.971
14.307
12.025


103*9
• W*^^4F
5.390
4.159
0.091
1.311
50.531
6.454
6.77*
4.547
1.23*
0.1*4
It 444
• J.^4W
6.890
0.831
2.415
0.903
23.540
3.03*
3 «199
0.656
3.954
0.116
0.805
0.131
16.244
3*51
••ajaw •
2.439
0-88*
2.341
5.252
3.421
2.245
0.911
0.987

3.799
2.374
S.9M
1.359
0.817
0.294
0.520
15.701
11.610

8.353
7.203
2.194
5 1*4
^* f^^
1.103
0.5*2
1C CAf
1 Jv^nfl
8.074
9.207
5.341
7.400


5591
»i*aT*
4.141
2.355
0.359
0.10*
1.055
58.954
3.280
3.504
4.419
1.802
0.115

.04*
10.707
0.175
2.405
1.349
23.311
2.603
f ala%9
1JS3
2.795
0.621-
0.757
0.712
17.193
* 4 7BJ
*.» ww
2.911
0.373
3.036
5.705
2.710
1.421
I.J73
1.422

3.023
3.373
5.623
1.337
1J01
0.422
0.927
32.750
18.835

11.727
1.117
4.037
7424-
* «^e»^
1.270
1.004
23 917
m&uW ••
11.313
13.511
14.3*9
1.699

t
12.2M
4.211
CO
0.455
1.014
0.915
54.00*
3.263
6.902
3.943
1.174
2.256
• tie
.335
6.983
0.724
3.017
2.748
20.885
4.162
2494
.674
0.915
2.452
1.3*7
1.327
1.380
22.594
S SSaf
1*420
1.619
3.507
6.803
4.062
3.827
0.895
1.913

3.993
2.342
6.115
0.496
1.741
0.602
0.145
6.011
10.405

15.604
7JH
0.449
* **•
2^573
0.609
ft. 414
V«7^V
10.901
2.003
4.730
5.106

"
4.342
1.175
2.153
no»o»d
0.410
0.904
1.282
46.214
2.333
7.495
2.609
1.090
1.422
5,^ •
.9M
5.431
1.416
2.539
J.017
21.837
3.727

1.415
2.824
1.848
1.744 1
1.0Q|J
2'i^Baal

llaW
1.943
4.687
7.995
3.524
4.250
0.803
2.333

3.9*2
3.631
6.263
0.693
1.430
0.6*7
0.112
14.518
17.015

19.800
6.321
0.553
7 6*4
2^49*
0.593
naas
't^P«
14.517
2.231
9.085
4.473 1
6 *•* L
M
7.0a^l
3.27P1
1.776 I
4A-4

-------
Total diet study food list with gram quantities for specified age-sex groups (continued)
food
151 lasagne, homemade
152 potpie, frozen, commercial, chicken,
oven heated
153 pork chow mein, homemade
154 frozen dinner, fried chicken, mashed
heated
155 chicken noodle soup, canned,
reconstituted with water
156 tomato soup, canned, reconstituted with
whole milk
157 vegetable beef soup, canned,
158 beef bouillon, canned, reconstituted with
water
159 gravy, brown, from mix
160 white sauce, medium, homemade
161 pickles, dill, bottled
162 margarine made with partially
hydrogenated vegetable oil, stick type
163 salad dressing, Italian, bottled
164 butter, stick type
165 vegetable oil, corn, bottled
166 mayonnaise, bottled
167 cream, half and half, fluid
168 cream substitute, powdered
169 sugar, white, granulated
170 syrup, pancake, bottled
171 jelly, grape, bottled
172 honey, bottled
173 catsup, bottled
174 ice cream, chocolate
175 pudding, chocolate, instant, made with
whole milk
176 ice cream sandwich
177 ice milk, vanilla
1 ?fl fKi"u' iia* Jfi* (••kat usifiH fk-n-pt*l*t» i**iMt
(itady4»«ia*r<»eft»


ICrO CGfflMCeWft, (fUM^t&^eKffTOsfeHlf
181 doughnuts, cake type, plain,
(ready-to-eat/frozen)
182 (Danish pastry/sweet roll),
(ready-to-eat/frozen)
183 cookies, chocolate chip

185 apple pie, frozen, heated
186 pumpkin pie, frozen, heated
187 candy, plain milk chocolate
188 candy, caramels
189 chocolate powder, sweetened, to mix
with hot or cold milk
190 gelatin dessert, prepared, strawberry
191 carbonated soda, sweetened, cola type,
canned
192 carbonated soda, sweetened, lemon-lime,
canned


194 carbonated soda, low calorie, cola,

'
6-11
mo.
0.261
0.142

0.321
0.065

6.835

3.052

5.792
0.274
1.091
0.346
0.004
0.88*

0.055
0.367
0.167
0.04*
0.015
0.280
0.913
0.5*1
0.55*
0.334
0.102
1.490
1.280

0.054
0.11*
0170
* 1 * V
0 414
v*
0.919
6.*49

5.172

6.07*
1 9S9
J»»*T
8.357
3.239
5.788-
2.433
1.120

6.03*
165.605

92.710

6925*
W*41wNr
15.941

25-30 YT.
f
4m\ -
3.044
1.344

3.624
1.399

7.092

10.854

5.870
• 013
3.692
0.979
1.309
3.691

3.5*8
2.445
1.3*2
1.314
1.763
0.962
5.44*
2.213
2.211
0.489
1.662
9.359
2.672

0.6*1
1.899
3 342
e»-»«*^*4»
SIM
•«••**
0.744
3.691

2.030

3.751
1 64*
* iB^^tJI
4.89*
3.303
2.797
1.633
0.96*

3.343
132.839

41.622

32.490

54.19*

M

5.143
2.088

4.513
1.223

7.511

11.112

15.099
3.102
7.941
1.710
1.72*
4.647

5.745
3.276
1.509
2.729
2.920
1.416
8.3*3
3.571
3.40*
0.873
4.263
13.499
3.843

0.55*
2.047
5 913
9*9 P *
6 979
9*7* *
1.39*
5.6*3

3.650

4.723
2 711
*.» 1 9
7.359
5.59*
4.090
4.059
0.73*

3.44*
214.529

77.529

19.2*1

23.1*2

6C-6S
F

1.523
1.494

2.920
f.262

4.275

11.707

21.639
3.754
4.141
0.734
1.00*
4.575

3.434
2.297
0.900
1.111
• 3.045
1.641
5.18*
1.488
2.703
0.665
0.794
11.303
4,700

0.369
2.922
1.9*4
6 A9J
w< V«
1.199
2)307

2.521

3.0*4
1 QSi
t >w^«i
t.282
4.807
1.995
1.124
0.358

7.201
31.040

19.247

4.702

27.549
continued
rr.
w

1.947
1.392

2.904
1.946

14.202

11.620

20.536
4.178
6.S36
1.637
1.102
5.681

3.84*
3.290
1.214
1.555
4.284
2.322
7.46S
3.099
4.209
0.024
0.615
17.755
5.183

0.584
2.456
3 469
<*<^^»
• 154
**4WV
1.490
4.727

3.159

4.576
J 432
e»"~rf*
6.715
5.200
1.900
1.185
0.849

8.427
48.220

22.092

14.215

15.123

4A-5

-------
                   APPENDIX 5
DEGRADATION OF PESTICIDE RESIDUES DURING STORAGE

-------
Total diet study food list with
food

195 coffee beverage, from instant
196 coffee beverage, from instant,
decaffeinated
197 tea beverage, hot, made with tea bag
198 beer, canned
199 wine, table, 12.2% alcohol
200 whisky, 80-proof
201 water
202 milk-based infant formula with iron,
canned, ready-to-serve
203 milk-based infant formula without iron,
204 infant mixed cereal, prepared from dry
with whole milk
XS beef. 
213 vegetables with (bacon/ham), (str./jr.)
214 chicken and noodles, (str./jr.)
215 tomatoes, beef and macaroni, (str./jr.)
216 turkey and rice, (str./jr.)
217 oatmeal with applesauce and bananas,
(str./jr.)
218 carrots, (str./jr.)
219 green beans, (str./jr.)
220 (mixed vegetables/garden vegetables),
(str./jr.)
221 (sweet potatoes/yellow squash), (str./jr.)
222 corn, creamed, (str./jr.)
223 peas, (str./jr.)
224 spinach, creamed. p4*4ppl.dwTv/apPle grape) juice,
••ramed
231 (orange/orange-pineapple) juice, strained
232 (pudding/custard), any flavor, (str./jr.)
233 fruit dessert with tapioca, any fruit, (str./jr.)
234 (Dutch apple/apple betty), (str./jr.)
TOTAL DAILY INTAKE
gram quantities for specified age-sex groups (continued)
*•" 2 f*-;6yr. 25-30 yr. 60-65^
"* • I* f w ;

0.274 2.162 24.432 26.237 243.427
0.030 0.340 3.194 1.635 17.254
4.690 31.637 74.957 46.474 163.333
0.024 0.057 0.071 16.119 34.591
0.000 0.000 0.335 0.721 11.925
0.000 0.000 0.084 0.304 4.936
150.000 321.000 432.000 544.000 399.000
47.724 0.642
41931 1.900
51.842 0.147
6.294 0.049
1.409 0.030
5.124 0.000
6.200 0.225
9.23* 0.207
3.303 0.000
12.717 0.147
5.412 0.034
4.112 0.114
9.4S4 0.000
3.954 0.000
4.200 0.000
16.077 0.316
7.101 0.000
6.124 0.202
5.532 0.060
4.28* 0.124
2.659 0.000
3.212 0.024
1.544 0.000
21-.957 0.240
19.434 0.147
M.293 0.172
14.303 0.237
4.479 0.35*
14.324 0.940
4.220 0.604
4.32* 0.015
4.392 0.402
4.041 0.060
1,2ft3 1,503 t.954 2.677 2.173
M f M
334204 372.058 476.429
20.943 75.943 62.862
143.786 153837 97,523
299.202 9.291 94.479
11.289 5.164 4.623
4.514 3.894 4.575
512.000 346.000 581.000
3.073 2.239 2.690
*For food items in parentheses, the first item is selected if available.  If it is not available,
-------
     Preface
     Appendix 5, DEGRADATION OF PESTICIDE RESIDUES DURING
STORAGE, is longer than most Appendices in these guidelines.
This is not meant to imply that more importance is attributed to
this material, but the added length was necessary to incorporate
the desired information into these guidelines.  The information
included in this Appendix is not meant as an all-inclusive
description of the steps necessary to design a degradation study.
Rather, this information provides background and a starting point
from which to begin to examine these issues.  The Agency strongly
encourages submission of study protocols prior to initiation of
these studies if they are to be submitted to the Agency for
registration or tolerance purposes.
                               5-i

-------
5.1  Introduction

     The purpose of this appendix is to provide guidance in
designing studies for estimating degradation rates of pesticide
residues in foods during storage and in performing statistical
analysis of data resulting from such studies.  Consider a
specific pesticide and a specific commodity  (crop) to which the
pesticide is applied.  To determine what residue levels will be
present in/on the commodity after it undergoes storage requires
three basic ingredients:
     •  information on initial (i.e., post-harvest and
     pre-storage) residue levels, for example, an average level
     or perhaps an entire distribution of such residue values;

     •  a function or set of functions (mathematical models) that
     describe how the pesticide residue level will change with
     time while the commodity is in storage; and
     •  information on how long the commodity is stored (e.g.,  a
     distribution of storage times).

The focus of this appendix is on the second item, which requires
experiments to be designed and carried out that allow us to
estimate the function(s).
     We begin, in Section 5.2, by describing how the above three
types of information are used to determine post-storage residue
levels.  We then discuss degradation models and their estimation
in Section 5.3.  Issues relating to the design of pesticide
degradation experiments are addressed in Section 5.4.
                               5-1

-------
5.2  Use of Information from Degradation Studies
     In general terms, the purpose of a pesticide residue
degradation study is to provide information from which we can
calculate an average degradation factor, G.  Such a factor is
used to multiply an initial pesticide residue value to produce an
average post-storage residue level.  The initial residue levels
are independently determined and are not addressed in this
appendix.  Clearly, if a single initial residue level, say X0,  is
multiplied by G, then a single post-storage value results; on the
other hand, if a distribution of initial residue concentrations
is used, then a distribution of post-storage residue levels is
produced.  The post-storage value(s) might then be further
multiplied by factors that account, on average, for processing
and preparation steps, as well as the percentage of crop treated.
If these additional factors are jointly denoted by G*, then

          Residue at time of consumption = X_0 \times G \times G^*.

The ultimate result produced by this calculation is then an
average anticipated residue level for the commodity of interest
that is intended to reflect an average residue level at the time
of consumption.  We focus here on the storage component — that
is, on the experiments, data, and calculations needed to arrive
at a value (or set of values) for the factor G.
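
The multiplication described above can be sketched in a few lines of Python.  This is only an illustration: the function name, the three initial residue levels, and the values used for G and for the combined later factor G* are hypothetical and are not Agency values.

```python
def residue_at_consumption(initial_residues, storage_factor, other_factors=1.0):
    """Multiply each initial residue X0 by the storage factor G and by the
    combined processing/preparation/percent-treated factor G*."""
    return [x0 * storage_factor * other_factors for x0 in initial_residues]

# Hypothetical inputs: three initial residue levels (ppm), G = 0.31, G* = 0.5.
post = residue_at_consumption([1.0, 2.0, 4.0], storage_factor=0.31, other_factors=0.5)
print(post)
```

Passing a single value gives a single anticipated residue; passing a whole set of initial values gives the corresponding set of residues at the time of consumption.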
     To make a determination of this storage factor G, we must
have information of two types:

(1)  a distribution of storage times, and
(2)  an estimated degradation model or function that indicates
     how much reduction in the initial residue level will have
     taken place by time t.1

     Assume for the moment that we know f(t), the (probability
density of the) storage-time distribution, excluding that portion
of the commodity that is exported.  Assume also that we know
g(t), the degradation function that describes how the pesticide
residue level in/on the stored crop behaves with time.  Then,
assuming an initial residue level of X_0, the level at time t is

          X_t = X_0 \times g(t).

Note that g(0)=1.  Since the distribution of storage times
indicates how long various proportions of the commodity remain in
storage, we can then compute the average residue level upon
leaving storage, as follows:

          Average residue = \int_0^\infty X_t f(t) \, dt
                          = X_0 \int_0^\infty g(t) f(t) \, dt
                          = X_0 E_f\{ g(t) \}
                          = X_0 G,

where E_f{ } denotes "expected value of" with respect to the
storage-time distribution f.  Thus G is a storage-time-weighted
average of the g(t) function.
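
As a rough numerical check of this expectation, the sketch below approximates G by the midpoint rule.  Everything specific in it is an assumption made for illustration: an exponential g(t) with rate 0.015 per unit time, a uniform storage-time density on [0, 250], and the helper name storage_factor.  For that particular pair the integral also has a closed form, which is used for comparison.

```python
import math

def storage_factor(g, f, t_max, n=100_000):
    """Approximate G = integral of g(t) f(t) dt over [0, t_max] (midpoint rule)."""
    h = t_max / n
    return sum(g((j + 0.5) * h) * f((j + 0.5) * h) for j in range(n)) * h

beta = 0.015    # assumed decay rate per unit time
t_max = 250.0   # assumed longest storage time

def g(t):
    return math.exp(-beta * t)   # degradation function, g(0) = 1

def f(t):
    return 1.0 / t_max           # assumed uniform storage-time density

G_numeric = storage_factor(g, f, t_max)
G_exact = (1.0 - math.exp(-beta * t_max)) / (beta * t_max)
print(G_numeric, G_exact)
```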
     We also note that the above calculation (i.e., combining of
the degradation function with the storage-time distribution) can
apply equally well to toxic metabolites, except that the
"degradation" function g(t) does not then have to be a
nonincreasing function bounded above by g(0)=1, and X_0 would then
represent the initial concentration level of the metabolite
rather than the concentration of the parent compound.

     1 If toxic metabolites occur, then a model reflecting their
changes (including the possibility of increases) over time will
also be needed.  See subsection 5.3.2.

     In practice, neither  f(t) nor g(t) is known.  Rather we have
an estimate of f(t), usually in the form of a histogram or
frequency table of storage times like those presented in Appendix
2.  We attempt to obtain an estimate of g(t) by conducting a
degradation study and performing a statistical analysis of the
data.  Suppose we have a histogram of domestic-use storage times
available.  Suppose there are K intervals with midpoints t_k,
ranges R_k, and relative frequencies f_k, for k=1,2,...,K.  Exhibit
5-1 shows an example.  Then after we conduct the degradation
study, we can estimate G as follows:

 (5.1)     \hat{G} = \sum_{k=1}^{K} \hat{g}(t_k) f_k

where ĝ(t) is the estimate of g(t).  An alternative way, which
may be slightly more accurate, involves using the average of the
ĝ(t) function over each of the intervals rather than simply using
the value of ĝ(t) at the midpoint of the intervals.  That is, we
would use ḡ_k in place of ĝ(t_k) in formula (5.1), where

 (5.2)     \bar{g}_k = \frac{1}{R_k} \int_{a_k}^{b_k} \hat{g}(t) \, dt ,

the integral being taken over the kth interval, [a_k, b_k].
Exhibit 5-1 illustrates how formula (5.1) is applied; both
versions of the calculation are demonstrated.  In this example,
we chose ĝ(t) = e^{-β̂t}, which is an exponential decay model with an
                               5-4

-------
EXHIBIT 5-1.
ILLUSTRATIVE CALCULATION OF AVERAGE STORAGE
DEGRADATION FACTOR, G, BY TWO METHODS

   k    a_k   b_k   t_k   f_k x 100%    ĝ(t_k)     ḡ_k      ĝ(t_k) f_k   ḡ_k f_k
   1      0    10     5      5.366      0.92774   0.92861    0.04978     0.04983
   2     10    20    15      5.122      0.79852   0.79927    0.04090     0.04094
   3     20    30    25      4.878      0.68729   0.68793    0.03353     0.03356
   4     30    40    35      4.390      0.59156   0.59211    0.02597     0.02600
   5     40    60    50      7.317      0.47237   0.47414    0.03456     0.03469
   6     60    80    70      7.927      0.34994   0.35125    0.02774     0.02784
   7     80   100    90     12.195      0.25924   0.26021    0.03161     0.03173
   8    100   120   110     13.415      0.19205   0.19277    0.02576     0.02586
   9    120   140   130     12.439      0.14227   0.14281    0.01770     0.01776
  10    140   160   150     10.976      0.10540   0.10579    0.01157     0.01161
  11    160   180   170      8.537      0.07808   0.07837    0.00667     0.00669
  12    180   200   190      4.390      0.05784   0.05806    0.00254     0.00255
  13    200   250   225      3.049      0.03422   0.03503    0.00104     0.00107
  Total                    100.000                           0.30937     0.31013
                               5-5

-------
estimated rate parameter β̂, with β̂ = 0.015.  If the kth histogram
interval is denoted as [a_k, b_k], then (for this exponential decay
model)

  (5.3)    \bar{g}_k = \frac{e^{-\hat{\beta} a_k} - e^{-\hat{\beta} b_k}}{\hat{\beta} (b_k - a_k)} .

The estimated values of G obtained by the two methods are 0.309
and 0.310, respectively, and are obtained by adding the
cross-product terms in each of the last two columns of the
exhibit.
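
The two calculations in Exhibit 5-1 can be reproduced with the short sketch below.  The interval endpoints, relative frequencies, and rate parameter are taken from the exhibit; the variable and function names are our own, and the interval average uses the closed form that holds for the exponential decay model only.

```python
import math

beta = 0.015  # estimated rate parameter used in Exhibit 5-1

# (a_k, b_k, relative frequency in percent) for the 13 histogram intervals
intervals = [
    (0, 10, 5.366), (10, 20, 5.122), (20, 30, 4.878), (30, 40, 4.390),
    (40, 60, 7.317), (60, 80, 7.927), (80, 100, 12.195), (100, 120, 13.415),
    (120, 140, 12.439), (140, 160, 10.976), (160, 180, 8.537),
    (180, 200, 4.390), (200, 250, 3.049),
]

def g_hat(t):
    """Estimated degradation function evaluated at time t."""
    return math.exp(-beta * t)

def g_bar(a, b):
    """Average of g_hat over the interval [a, b] (exponential model only)."""
    return (math.exp(-beta * a) - math.exp(-beta * b)) / (beta * (b - a))

# Method 1: g_hat at the interval midpoint; Method 2: interval average.
G_midpoint = sum(g_hat((a + b) / 2) * pct / 100 for a, b, pct in intervals)
G_average = sum(g_bar(a, b) * pct / 100 for a, b, pct in intervals)
print(round(G_midpoint, 3), round(G_average, 3))
```

The small gap between the two results reflects the curvature of the decay function within each interval; it grows as the intervals become wider.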
     Clearly, care must be exercised in performing such a
calculation.  For example, rather than calculating a single
overall G factor, we may need to calculate separate G factors for
each of several crop varieties and/or growing regions and/or
different types of storage conditions.  (Some guidance is
provided in Section 5.4 for making such decisions.)  Similarly,
for subsequent stages in the calculation of an anticipated
residue level (i.e., food processing and preparation steps),
separate G* factors may need to be developed for each food form.2
Storage factors, for instance, may apply to apples, but separate
processing factors would apply for apple juice and apple sauce.
When the distribution of storage times differs according to the
subsequent food form — e.g., different distributions for apples
depending on whether they are to be marketed as fresh apples or
to be processed into juice, sauce, etc.— then separately
calculated storage factors, too, may be appropriate for different
forms of the same commodity.
     2 The term food form used here conforms to the terminology used
in the Dietary Risk Evaluation System (DRES).  Food form refers to
a farm-gate commodity (e.g., apples) combined with a form in which
it is eaten (e.g., cooked or canned) that generally indicates what
processing and preparation has taken place.
                               5-6

-------
     The above demonstrates the calculation appropriate to a
pesticide with chronic effects.  It would also be applicable to a
pesticide with acute effects if the commodity of interest is
naturally composited prior to consumption (e.g., grains, sauces,
or juices).  For acute-effect pesticides applied to other types
of commodities (i.e., those not naturally composited), a specific
time point (say t*) representing a worst-case situation (i.e.,
maximum of g) could be chosen and g(t*), rather than G, could be
multiplied by the initial residue level.  For a parent compound,
such a maximum would occur at time t*=0.  For a toxic metabolite,
the maximum of g might occur at some later time.  It may be
necessary in some cases to determine the range of dietary
residues expected so that approximate percentages of the total
population exposed to various levels can be estimated.
     As previously noted, the calculation of a single G factor
(from a single national distribution of storage times and a
single g(t) function) would imply that the degradation function
g(t)  does not depend on varietal differences, growing regions,
differences in storage conditions, etc.  If such differences are
suspected, then the degradation studies should allow such
differences to be discerned and the separate g(t) functions to be
estimated.  In the remaining sections of this appendix, we focus
on the design and analysis of such degradation studies — a
series of longitudinal experiments aimed at estimating the g(t)
functions.  It is important to recognize, however, that an
estimate of f(t)  is vital for estimating G.   Moreover, the f(t)
and g(t) estimates must be compatible in terms of the subsets of
commodity to which they apply.  For instance, we may design a
degradation study to allow separate estimates of g(t) for golden
and red delicious apples, but this information may be of limited
value if we do not also have separate storage-time distributions
for these two varieties  (and separate varietal production or
consumption figures).
                               5-7

-------
5.3  Approaches for Defining and Estimating Degradation Models
     Estimation of g(t) requires a series of longitudinal
experiments involving a number (say J) of batches of the
commodity of interest.  Each batch is assumed to have been
treated as a single unit prior to storage (e.g., a single plot
within a field trial).  Each such batch is exposed to "typical"
storage conditions.  Periodically, samples are selected from each
batch, and one or more pesticide concentration measurements are
made.  Statistical analysis of these data is used to estimate
g(t); to do so, however, we must have some idea regarding
possible model forms for g(t).  Subsection 5.3.1 describes a
number of possible models, starting with a simple model and
progressing to more complex ones.  Then, in subsection 5.3.2.,
some possible models for characterizing toxic metabolites are
indicated.  The models described in these two subsections are
deterministic; statistical models, based upon these deterministic
models, are then presented in subsection 5.3.3.  Techniques for
estimating the model parameters are discussed in subsection
5.3.4.
     5.3.1  Deterministic Decay Models for Pesticide Residues
          The simplest model of pesticide residue degradation is
based upon the notion that first-order kinetics hold, at least
approximately.  That is, the rate of residue decay at any given
time is proportional to the residue concentration at that time.
In mathematical terms this deterministic relationship (for a
given batch of commodity) can be expressed as
  (5.4)    C_t = C_0 e^{-\beta t}

where C_t denotes the concentration of the residue at time t and
where β is a positive number that indicates how quickly the
residue will degrade with time.  In this appendix, t=0 is taken

                               5-8

-------
to be the time at which storage of the commodity of interest
begins.  C_0 in eq. (5.4) thus denotes the initial concentration
(at time 0).  Figure 5-1 illustrates how the β parameter can be
interpreted.  Figure 5-1a, for instance, shows its relation to
the residue half-life, namely,

     Half-life = \ln(2) / \beta = 0.6931 / \beta .

On the logarithmic scale (see Figure 5-1b), we can write (5.4) as
ln(C_t) = ln(C_0) - βt.  Hence, -β represents the slope of the line
relating ln(C_t) to time t.
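
The half-life relation and the log-scale form of (5.4) can be checked with a brief sketch.  The rate 0.015 matches the value used in this appendix's illustrations; the initial concentration of 400 and the function names are assumptions made only for the example, and the half-life is expressed in whatever units t is measured in.

```python
import math

def half_life(beta):
    """Time for the residue to fall to half its initial level: ln(2) / beta."""
    return math.log(2) / beta

def log_concentration(c0, beta, t):
    """ln(C_t) under first-order decay: a line in t with slope -beta."""
    return math.log(c0) - beta * t

beta = 0.015
t_half = half_life(beta)
c_at_half = math.exp(log_concentration(400.0, beta, t_half))
print(t_half, c_at_half)
```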
     Generalizations  of  (5.4) in  several  ways are apparent.
Since different batches  will  not  have the same initial  (and hence
later) concentrations, the  model  can  be rewritten for the ith
batch as

  (5.5)    C_t(i) = C_0(i) e^{-\beta t}

This model yields  a family  of exponential degradation curves,  one
curve per batch of commodity, for which differences between the
batches occur solely  due to differences in their initial levels.
That is, all of the curves have a common rate parameter β
regardless of varietal or storage condition differences.   Such a
situation is depicted in Figure 5-2.  Note that the lines on the
logarithmic scale  are parallel.
     Since different  values of i might  correspond to different
varieties and/or growing conditions and/or storage  conditions, we
might anticipate that the rate constant β should also depend on
i; hence a further generalization of the model (5.5) is given by

  (5.6)    C_t(i) = C_0(i) e^{-\beta(i) t} .

Note that if two batches (say i and i') are exposed to two
different conditions that affect the rate of residue degradation,
then this can be reflected by different values for β(i) and
                               5-9

-------
FIGURE 5-1.  ILLUSTRATION OF EXPONENTIAL DEGRADATION
[Figure not reproduced.  Panel (a), Concentration Scale, plots the
exponential decay curve with rate 0.015; panel (b), Ln(Concentration)
Scale, plots the line Ln(C_t) = Ln(C_0) - 0.015t.  Both panels are
plotted against time.]

-------
FTOURE 5-2.
                              120    130     110     210    240    270
Cb)
             Scato
               40      90     220     130    ISO    210    24O
        30
                                                                270
                           5-11
                                                                 773

-------
S(i').  Such a between-batch difference  is thus embedded  in the
model.  Figure 5-3 portrays model  (5.6).  Note that the lines  in
Figure 5-3b are not'necessarily parallel.  It is clear, however,
that .model (5.5)  is a special case of  (5.6)  in which  all  of the
fi(i) are equal.
     In the above-mentioned models, the rate parameters
associated with a given batch are assumed to remain constant  over
time.  This may be reasonable so long  as storage conditions
remain constant.  On the other hand, if  conditions change during
the storage of a single batch, then this, too, can be anticipated
to affect the rate.  For instance, suppose for batch  i that the
temperature is maintained at one level until time T(i) and is
then changed to another level.  Mathematically,  we have
 (5.7)     C_t(i) = C_0(i) e^{-\beta_1(i) t} ,                                  0 \le t \le T(i)
           C_t(i) = C_0(i) e^{-\beta_1(i) T(i)} e^{-\beta_2(i) [t - T(i)]} ,    t > T(i)

This illustrates, in the most simplistic way, that the rate
parameter β(i) may not be a constant.  It may be a function of a
number of time-dependent storage-condition factors (e.g.,
temperature, humidity).  Figure 5-4 illustrates this situation.
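
A rate change at a single time point can be sketched as below.  The function name and the numeric values are hypothetical, and the sketch assumes that the concentration is continuous at the change point T(i), with one rate before it and another rate after it.

```python
import math

def piecewise_concentration(c0, beta_before, beta_after, t_change, t):
    """Concentration when the rate parameter switches at t_change.

    Assumes exponential decay at beta_before up to t_change and at
    beta_after afterwards, continuing from the level reached at t_change.
    """
    if t <= t_change:
        return c0 * math.exp(-beta_before * t)
    level_at_change = c0 * math.exp(-beta_before * t_change)
    return level_at_change * math.exp(-beta_after * (t - t_change))

# Hypothetical batch: one storage temperature for 60 days, then another.
c0, beta_before, beta_after, t_change = 400.0, 0.010, 0.025, 60.0
for t in (0, 30, 60, 90, 120):
    c = piecewise_concentration(c0, beta_before, beta_after, t_change, t)
    print(f"t = {t:3d} days   C_t = {c:7.2f}   ln(C_t) = {math.log(c):.3f}")
```

On the logarithmic scale the output traces two straight-line segments that meet at the change point.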
     Let h(t) symbolically represent such storage conditions at
time t.  Thus a generalized version  of  (5.6) and  (5.7)  is
 (5.8)     C_t(i) = C_0(i) e^{-\beta(i, h(t)) t}
-------
[Figure 5-3.  (a) Concentration scale; (b) ln(concentration) scale.]

[Figure 5-4.  Illustration of exponential degradation for a commodity with rate parameters that change at a point in time.  (a) Concentration scale.]
-------
complex; it might include second- or higher-order effects, for
instance.  (The rate parameters are unknown and must be
estimated, but the form of the function g must be known, or a
reasonable approximating form must be found.)  In (5.9), the rate
parameters are allowed to vary among batches i and within batches
(because of the influence of h(t) variations).  In order to
maintain the convention that C0(i) denotes the initial
concentration, we require that g be equal to 1 at time zero,
i.e., g[β(i,h(0)),0] = 1.  Table 5-1 summarizes the degradation
models described above.
     5.3.2  Deterministic Models for Toxic Metabolites

          Let Ct denote the concentration of a pesticide at time
t and let Mt denote the concentration of a toxic metabolite.  For
illustrative purposes, we assume that the pesticide on/in the
stored commodity can degrade via a mixture of two exponential
processes, one which produces the toxic metabolite and one which
does not.  The first process occurs at rate β1 and the second, at
rate β2.  The metabolite itself is assumed to degrade at rate β3;
schematically, we have:

                Pesticide  --->  Toxic Metabolite

     Assuming exponential decay for each of the processes, we
have the following deterministic models:

-------
  (5.10a)   C_t = C_0 e^{-(\beta_1 + \beta_2) t}
  (5.10b)
         A =
                  P2 -
where M0 denotes the concentration of the metabolite initially
(time 0).  In contrast with eq. (5.4) and (5.10a), note that
(5.10b) implies that the metabolite concentration can increase
with time, over some time interval.  Figure 5-5 illustrates how
metabolites might behave over time in accordance with model
(5.10).  The solid curve depicts the degradation in concentration
of the parent compound (eq. (5.10a)); the other two curves
represent two possible scenarios for metabolite behavior (i.e.,
two different choices of parameter values in (5.10b)).  Given
model (5.10), the maximum of the Mt curve occurs at time
f =
          ln[-
                   A,
                   AM0
In Figure 5-5, for example, the dashed-line curve showing the
increase in the metabolite level over time (followed by a decline
in level) was generated using C0=400 ppm, M0=150 ppm, β1=0.015,
β2=0.025, and β3=0.020.  Thus the metabolite level increases up
to t*=21.5 days, and then begins to decline.  In the same way
that the basic exponential degradation model of subsection 5.3.1
was generalized to allow batch-to-batch differences in both
initial levels and rate parameters, we can expand eq. (5.10) as

-------
TABLE 5.1.  SUMMARY OF DEGRADATION MODELS

         Form of Model on                 Form of Model on
Model    Concentration Scale              Logarithmic Scale*          Remarks
-----    -----------------------------    ------------------------    ----------------------------------------
5.5      C_0(i) e^{-\beta t}              A(i) - \beta t              Exponential decay; β does not depend on
                                                                      i, so lines on log scale are parallel;
                                                                      special case of model (5.6).

5.6      C_0(i) e^{-\beta(i) t}           A(i) - \beta(i) t           Exponential decay; β may depend on batch
                                                                      i, but remains constant over time;
                                                                      straight lines on log scale may not be
                                                                      parallel; special case of model (5.8).

5.8      C_0(i) e^{-\beta(i, h(t)) t}     A(i) - \beta(i, h(t)) t     Exponential decay; β may depend on i and
                                                                      on time-dependent storage conditions,
                                                                      h(t).  Form of h(t) must be known.
                                                                      Special case of model (5.9).

5.9      C_0(i) g                         A(i) + \ln(g), where
                                          g = g[\beta(i, h(t)), t]
-------
[Figure 5-5.  Illustration of pesticide and metabolite behavior.  (a) Concentration scale.  Concentration (C or M) versus time t; the parent compound is shown as a solid line and metabolites 1 and 2 as dashed lines.]

-------
 (5.11a)   C_t(i) = C_0(i) e^{-[\beta_1(i) + \beta_2(i)] t}
-------
     e(i,j,t) = a random deviation (i.e., departure from the
     deterministic component) for the jth sample at time t from
     the ith batch.

If the model is to be useful, the deviations e(i,j,t) must have,
at least approximately, a common variance regardless of
concentration level.  If the data support this assumption, then
we can entertain several forms for the deterministic component of
the model (e.g., models (5.5), (5.6), or (5.11b)), estimate such
models, and perform statistical tests to decide  among them.
Graphical and statistical procedures for checking this variance
homogeneity assumption are described below.
     We anticipate that the assumption of error  variance
homogeneity on the concentration scale will often be clearly
invalid.  In this case, we should seek a transformation of the
data (e.g., taking logarithms or square roots) that will yield
errors with homogeneous variance.  Suppose, for  example, that
measurements are obtained for several samples of batch i at each
of several time points during storage.  We first calculate a mean
and a standard deviation for each such batch and time, and plot
the logarithms of the standard deviations versus logarithms of
the corresponding means.  If the resultant scatterplot exhibits a
linear relationship, then a logarithmic transformation of the
concentration data is indicated.  This will, we  believe, often
provide a reasonable approximation.  If so, the  following model
is indicated:
 (5.13)    Y_t(i,j) = \ln[C_t(i)] + e(i,j,t)
             or
           Y_t(i,j) = \ln[M_t(i)] + e(i,j,t)

where Ct(i) and Mt(i) are as previously defined, and

     Yt(i,j)   =  the natural logarithm of the observed
                  concentration for the jth sample at time t from
                  the ith batch, and
     e(i,j,t)  =  a random deviation for the jth sample at time t
                  from the ith batch.

     Some important special cases  of model  (5.13)  are  the
following:
  (5.14)     Y_t(i,j) = A(i) - \beta t + e(i,j,t)
  (5.15)     Y_t(i,j) = A(i) - \beta(i) t + e(i,j,t)
  (5.16)     Y_t(i,j) = A(i) - \beta(i) t + \gamma(i) t^2 + e(i,j,t)

In each case, the A(i) term is equal to ln[C0(i)] or ln[M0(i)].
Equation (5.14) is the stochastic version of model (5.5), on the
logarithmic scale (see Table 5-1 and Figure 5-2(b)), and results
in parallel lines on that scale since there is a common rate
parameter β assumed for all commodity batches.  Model (5.15)
generalizes (5.14) by allowing the rate parameter to vary among
batches (see Figure 5-3(b)).  It corresponds to model (5.6).
Model (5.16) is a further generalization: it permits the
ln(concentrations) for each batch to be represented by a
quadratic model.  This is equivalent to allowing the rate
parameter to change linearly with time (i.e., rate parameter =
β(i) - γ(i)t), rather than being a constant for each batch, and
is thus one example of a stochastic version of model (5.8).
     An example of data for which  such models would apply  is
depicted in Figure 5-6, which shows a scatterplot of some
hypothetical concentration data generated  for various  time points
(namely, at 0, 4, 8, 16, 32,  64, 128, and  256 days).   The  data
were generated by assuming a  given model  (namely,  model (5.15))
and by allowing deviations from the model  to  be normally
distributed with a 15 percent coefficient  of  variation (CV).
Three different commodity batches having different initial
residue concentrations and different rate parameters were
assumed.  Three "measured" values per batch were generated for
each time point.  Part (a) of the figure shows the results on the
concentration scale, and part (b),  on the logarithmic scale.
Note that in part (a), the dispersion (i.e., variance)  of the
concentration measurements for a given batch is larger when the
concentration levels are higher, while in part (b),  it tends to
be stable across the range of concentrations for each batch.   As
described above, a plot of the ln(standard deviations) versus
ln(means) was generated.  This plot, presented in Figure 5-7,
indicates that a logarithmic transformation of the data would be
suitable, since a straight-line relationship between the
ln(standard deviations) and the ln(means) seems to hold.
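
The plot described above can also be summarized numerically.  The sketch below is an illustration rather than part of the recommended procedure: it takes the cell means and standard deviations listed in Exhibit 5-2(a) and fits a straight line to ln(standard deviation) versus ln(mean) by ordinary least squares.  The helper name is hypothetical.  A slope near 1 means that the standard deviation is roughly proportional to the mean, which is the pattern that a logarithmic transformation is meant to stabilize.

```python
import math

# Cell means and standard deviations by batch (Exhibit 5-2(a)),
# ordered by time point: 0, 4, 8, 16, 32, 64, 128, 256 days.
means = {
    1: [47.375, 50.223, 47.981, 36.539, 33.931, 21.121, 7.294, 1.069],
    2: [132.579, 154.288, 145.798, 107.234, 94.585, 41.484, 12.021, 1.049],
    3: [396.455, 378.243, 359.934, 247.831, 165.956, 82.875, 17.644, 0.715],
}
std_devs = {
    1: [5.210, 3.806, 5.663, 7.696, 3.487, 3.479, 0.634, 0.312],
    2: [21.253, 21.285, 18.407, 10.815, 18.914, 6.272, 3.125, 0.137],
    3: [41.421, 25.399, 37.743, 18.037, 36.257, 7.505, 1.241, 0.055],
}

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Pool all batch-by-time cells, keeping means and std. devs. aligned.
batches = sorted(means)
log_means = [math.log(m) for b in batches for m in means[b]]
log_sds = [math.log(s) for b in batches for s in std_devs[b]]

intercept, slope = fit_line(log_means, log_sds)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Plotting the same pairs, with a separate symbol for each batch, gives a plot of the kind described for Figure 5-7.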
     The random deviations in the stochastic models result from
both sampling errors and measurement or analytical errors (and
also possibly from lack of fit of the chosen model).  Hence, even
on the logarithmic scale, we might anticipate that the deviations
will tend to be larger (in absolute value) for some batches than
for others because of variability in the uniformity with which
the pesticide was applied, variability in the uniformity with
which rainfall or irrigation occurred, etc.  It is for this
reason that we should identify the plotted data points by batch
(e.g., as in Figure 5-6).  Having variance homogeneity within a
batch across time is important, because the model estimation
methodology, namely least squares, is meaningful only if
errors tend to have the same dispersion regardless of time or
concentration level.  If variances are stable within each batch,
but vary from batch to batch, then the model can still be
estimated.  However, we will typically be interested in testing
hypotheses about differences in degradation rates among the
batches, and the validity of standard statistical tests relies on
the assumption that the within-batch variances are also stable
from batch to batch.

-------
[Figure 5-6.  Illustrative residue data from three batches.  (a) Concentration scale; (b) ln(concentration) scale.  Horizontal axis: t, 0 to 270; data points are identified by batch (1, 2, 3).]
-------
[Figure 5-7.  Plot of ln(standard deviation) versus ln(mean), with points identified by batch.]
-------
     The recommended approach for testing for variance
homogeneity both within and among batches is as follows  (See
Levene's test in Snedecor and Cochran, Seventh Edition,  1980).
The test makes use of data from those batches and time points for
which residue measurements on three or more samples have been
obtained.

     1.   For each such batch (i) and time point  (t), calculate
          the mean concentration, mt(i).
     2.   Compute the absolute deviations of the data points from
          their respective means:
                   zt(i,j)  = jxt(i, j)  - mt(i) |.
     3.   Perform an analysis of variance (ANOVA) on the zt(i,j),
          using the following sources of variation:
          •    Batches
          •    Times Within Batches
          •    Residual (i.e., within times and batches).
          Test the first two sources of variation against the
          residual using standard ANOVA F tests.
     4.   If at Step 3 the test for "Times Within Batches" is
          statistically significant, then a transformation of the
          data should be considered.  (The data should also be
          examined for outliers.)  Steps 1 through 3 should be
          repeated using the transformed data.
     5.   If at Step 3 the test for "Batches" is statistically
          significant, then variance heterogeneity from batch to
          batch is indicated.  Such heterogeneity implies that
          care must be exercised in subsequent testing for
          degradation-rate differences among batches (see next
          subsection).
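
Steps 1 through 3 can be sketched in Python as follows.  This is a minimal illustration under stated assumptions, not a prescribed implementation: the function name, the simulated batch parameters, and the random seed are all hypothetical, and the data are generated in the spirit of Figure 5-6 (model (5.15) with a 15 percent coefficient of variation and three samples per batch and time point).

```python
import math
import random

def levene_nested_anova(data):
    """Steps 1-3: nested ANOVA on absolute deviations from cell means.

    data maps (batch, time) -> list of measurements for that cell.
    """
    # Steps 1 and 2: cell means and absolute deviations from them.
    z = {}
    for cell, values in data.items():
        cell_mean = sum(values) / len(values)
        z[cell] = [abs(v - cell_mean) for v in values]

    all_z = [v for vals in z.values() for v in vals]
    grand_mean = sum(all_z) / len(all_z)

    # Step 3: sums of squares for Batches, Times Within Batches, Residual.
    ss_batches = ss_times = ss_resid = 0.0
    df_times = 0
    batches = sorted({batch for batch, _ in z})
    for batch in batches:
        cells = [vals for (b, _), vals in z.items() if b == batch]
        batch_vals = [v for vals in cells for v in vals]
        batch_mean = sum(batch_vals) / len(batch_vals)
        ss_batches += len(batch_vals) * (batch_mean - grand_mean) ** 2
        df_times += len(cells) - 1
        for vals in cells:
            m = sum(vals) / len(vals)
            ss_times += len(vals) * (m - batch_mean) ** 2
            ss_resid += sum((v - m) ** 2 for v in vals)

    df_batches = len(batches) - 1
    df_resid = len(all_z) - len(z)
    ms_resid = ss_resid / df_resid
    return {
        "df": (df_batches, df_times, df_resid),
        "ss": (ss_batches, ss_times, ss_resid),
        "ss_total": sum((v - grand_mean) ** 2 for v in all_z),
        "F_batches": (ss_batches / df_batches) / ms_resid,
        "F_times": (ss_times / df_times) / ms_resid,
    }

# Simulated data with assumed parameters: batch -> (C0, beta).
random.seed(1)
params = {1: (50.0, 0.015), 2: (150.0, 0.020), 3: (400.0, 0.025)}
times = [0, 4, 8, 16, 32, 64, 128, 256]
raw = {}
for batch, (c0, beta) in params.items():
    for t in times:
        mean = c0 * math.exp(-beta * t)
        raw[(batch, t)] = [mean * (1.0 + 0.15 * random.gauss(0.0, 1.0))
                           for _ in range(3)]
logged = {cell: [math.log(v) for v in vals] for cell, vals in raw.items()}

for label, data in (("concentration", raw), ("ln(concentration)", logged)):
    result = levene_nested_anova(data)
    print(label, result["df"],
          f"F(batches) = {result['F_batches']:.2f}",
          f"F(times within batches) = {result['F_times']:.2f}")
```

Each F value would then be referred to the F distribution with the corresponding degrees of freedom, as in Step 3; Steps 4 and 5 are decisions based on those comparisons.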

     The above procedure is illustrated in Exhibit 5-2,  utilizing
the data shown in Figure 5-6.  The top portion shows the results
for the original, concentration-scale data, and the lower
portion, for the log-transformed data.  Each portion gives the
means and standard deviations, by cell (i.e., batch and time
point), and the ANOVA results.  In the top portion, note that
both the "Times Within Batches" and the "Batches" components are
highly significant, suggesting the need for a transformation.
     5.3.4  Estimation of the Degradation Model

          The overall recommended strategy for estimating the
degradation model is portrayed in Figure 5-8.  Step 1 involves
establishing an appropriate measurement scale for subsequent
statistical analysis, as determined by the plotting and testing
strategies described in the previous subsection.  If such
approaches are inconclusive, use of a logarithmic scale may be
considered.  For demonstration purposes, we assume in the
remainder of this subsection that this scale will be employed.
(This assumption is also implicit in the steps indicated in
Figure 5-8.)
     Step 2 involves fitting model (5.16) and testing it for lack
of fit.  Standard regression software is available to accomplish
this.  With certain software, such as the SAS GLM (General Linear
Model) procedure, the fitting of the model can be done
simultaneously for all batches, but this is not necessary (e.g.,
the SAS regression procedure, REG, can be applied separately for
each batch).4  Part (a) of Exhibit 5-3 shows the estimates of
the model parameters for model (5.16) when it is applied to the
data of Figure 5-6(b).  Since separate regressions were performed
for each batch, the standard errors shown in the exhibit are
based upon the separate residual variances.  Standard t tests can
be used to determine if the γ(i) parameters in model (5.16) (i.e.,
the coefficients on t^2) differ from zero:

     4 SAS is the registered trademark of SAS Institute, Inc., Cary,
North Carolina, 27511.

-------
   EXHIBIT 5.2.  SUMMARY OF ILLUSTRATIVE DATA AND EXAMINATION OF
                 VARIANCE HOMOGENEITY

    (a)  Summary of Concentration Data

                  Means, By Batch                Std. Devs., By Batch
t (days)       1         2         3           1         2         3
--------    -------   -------   -------     -------   -------   -------
     0       47.375   132.579   396.455       5.210    21.253    41.421
     4       50.223   154.288   378.243       3.806    21.285    25.399
     8       47.981   145.798   359.934       5.663    18.407    37.743
    16       36.539   107.234   247.831       7.696    10.815    18.037
    32       33.931    94.585   165.956       3.487    18.914    36.257
    64       21.121    41.484    82.875       3.479     6.272     7.505
   128        7.294    12.021    17.644       0.634     3.125     1.241
   256        1.069     1.049     0.715       0.312     0.137     0.055

ANOVA for Absolute Deviations from Cell Means:

Source of Variation      Degrees of Freedom    Mean Squares    F Values
--------------------     ------------------    ------------    --------
Batches                           2               946.37        13.99*
Times Within Batches             21               183.70         2.72*
Within Cells                     48                67.65

-------
EXHIBIT 5.2.  SUMMARY OF ILLUSTRATIVE DATA AND EXAMINATION OF VARIANCE
              HOMOGENEITY (CONTINUED)

     (b)  Summary of Ln(Concentration) Data

                  Means, By Batch                Std. Devs., By Batch
t (days)       1         2         3           1         2         3
--------    -------   -------   -------     -------   -------   -------
     0        3.854     4.879     5.979       0.112     0.154     0.105
     4        3.915     5.032     5.934       0.076     0.142     0.066
     8        3.866     4.977     5.882       0.119     0.125     0.108
    16        3.583     4.672     5.511       0.218     0.100     0.072
    32        3.522     4.536     5.096       0.105     0.203     0.217
    64        3.041     3.717     4.415       0.163     0.159     0.093
   128        1.985     2.462     2.869       0.087     0.278     0.071
   256        0.034     0.041    -0.337       0.326     0.136     0.079

ANOVA for Absolute Deviations from Cell Means:

Source of Variation      Degrees of Freedom    Mean Squares    F Values
--------------------     ------------------    ------------    --------
Batches                           2              0.01410         3.13
Times Within Batches             21              0.00714         1.58
Within Cells                     48              0.00451

 *  Statistically significant at 0.01 level of significance.
 Three observations per batch per time point.

-------
     T = [estimate of γ(i)] / [standard error of estimate of γ(i)].

To determine statistical significance at the 0.01 level (say), we
compare the absolute value of these calculated values to the
upper 99.5 percentage point of the t distribution with n(i)-3
degrees of freedom, where n(i) is the number of samples of batch
i for which we have residue concentration measurements and upon
which the regression is based.  For the data illustrated here,
the T values are not significant, since they are all less than
the tabulated value of 2.831.
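
The T ratios for the illustrative data can be recomputed from the γ(i) estimates and standard errors printed in Exhibit 5-3.  The sketch below assumes that those printed values (including their exponents) are read correctly and uses the tabulated critical value of 2.831 quoted above; the variable names are hypothetical.

```python
# gamma(i) estimates and standard errors for model (5.16), by batch,
# as printed in Exhibit 5-3 (assumed to be read correctly).
gamma_hat = {1: 3.6e-6, 2: 3.0e-6, 3: 2.3e-8}
gamma_se = {1: 5.8e-6, 2: 6.4e-6, 3: 4.2e-6}

# Upper 99.5 percentage point of t with n(i) - 3 = 24 - 3 = 21 degrees
# of freedom, as quoted in the text.
T_CRITICAL = 2.831

t_ratios = {i: gamma_hat[i] / gamma_se[i] for i in gamma_hat}
significant = {i: abs(t) > T_CRITICAL for i, t in t_ratios.items()}

for i in sorted(t_ratios):
    verdict = "significant" if significant[i] else "not significant"
    print(f"batch {i}: T = {t_ratios[i]:.3f} ({verdict})")
```

Because only the absolute value of T is compared with the critical value, the sign of each estimate does not affect the conclusion.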
     To determine if model (5.16) (or some submodel such as
(5.15) or (5.14)) provides an adequate representation of the
data, a lack of fit test can be performed, so long as more than
one sample measurement is used to furnish data at each of several
time points.  This is because a "pure" error variance
(independent of a mathematical model) can then be calculated,
against which the residual variation from the model can be
compared.  The "pure" error sums of squares (SS) for the ith
batch can be obtained as

     Error SS(i) = Σ [(nt(i) - 1) st(i)^2] ,

where the summation is over time points, and where

     nt(i)  =  number of observations for batch i, time t; and
     st(i)  =  standard deviation of ln(concentrations) for batch
               i, time t (see Exhibit 5-2(b)).5

     To test for lack of fit, we first compute

F = [RSS(i) - Error SS(i)] × DFE(i) / {[DFR(i) - DFE(i)] × Error SS(i)},

     5 Error SS(i) can also be obtained directly as the "residual or
error" sum of squares that results from treating time as a
categorical variable in an analysis of variance of the
ln(concentrations) for batch i.

-------
            FIGURE 5-8.  ESTIMATION STRATEGY

     Calculate means and standard deviations of log(residue conc.)
     for each batch and time point.
          |
          v
     Fit model (5.16) by least squares and test for lack of fit.
          |
          +-- Model (5.16) adequate:    Test hypotheses to determine
          |                             if model (5.15) or (5.14) will
          |                             suffice.
          |
          +-- Model (5.16) inadequate:  Utilize successive difference
                                        approach to develop piecewise
                                        linear approximation.

-------
    EXHIBIT 5-3.   ESTIMATED MODEL PARAMETERS FOR ILLUSTRATIVE DATA*

                              Estimates of                   Est. Std. Errors for Estimates of
Model**   Batch (i)     A(i)       β(i)       γ(i)            A(i)       β(i)       γ(i)
-------   ---------    ------    -------    ---------        ------     ------    ---------
5.16          1        3.9226    -0.0143    3.6x10^-6        0.0503     0.0015    5.8x10^-6
              2        5.0501    -0.0204    3.0x10^-6        0.0547     0.0016    6.4x10^-6
              3        5.9845    -0.0247    2.3x10^-8        0.0360     0.0011    4.2x10^-6

5.15          1        3.9396    -0.0152    0                0.0412     0.0004    -
              2        5.0356    -0.0196    0                0.0447     0.0004    -
              3        5.9844    -0.0247    0                0.0293     0.0003    -

5.14          1        4.2339               0                0.0809               -
              2        5.0448    -0.0198    0                0.0809     0.0005    -
              3        5.6773               0                0.0809               -

 *   Data are shown in Figure 5-6(b) and are summarized in Exhibit 5-2(b).
 **  Models relate ln(concentrations) to time t:
          A(i)  =  intercept
          β(i)  =  coefficient of t
          γ(i)  =  coefficient of t^2
Note:     All of the A(i) and β(i) are statistically significant.  The
          γ(i) are not.

-------
where
     DFE(i) =  Σ(nt(i)-1), the degrees of freedom for the error
               component;
     DFR(i) =  the degrees of freedom for the residual component;
               and
     RSS(i) =  residual sum of squares from fitting the model.

This calculated F value is then compared to the upper 99th
percentage point of the F distribution with [DFR(i)-DFE(i)] and
DFE(i) degrees of freedom  (assuming a significance level of 0.01
is to be used).6  Part (a) of Exhibit 5-4 shows these
calculations for testing whether model (5.16) is adequate.  In the
exhibit, the differences in the sums of squares and in the
degrees of freedom are referred to, respectively, as the
lack-of-fit (LOF) SS and degrees of freedom (DF).  None of the
calculated F values are statistically significant, since they do
not exceed the tabulated F value of 4.44.
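
The lack-of-fit calculation for model (5.16) can be reproduced from the standard deviations in Exhibit 5-2(b) and the residual sums of squares in Exhibit 5-4(a).  The following sketch is illustrative only; the function names are hypothetical, and the residual sums of squares are taken as given rather than refitted.

```python
# Standard deviations of ln(concentrations) by batch and time point
# (Exhibit 5-2(b)); three observations per cell.
log_sds = {
    1: [0.112, 0.076, 0.119, 0.218, 0.105, 0.163, 0.087, 0.326],
    2: [0.154, 0.142, 0.125, 0.100, 0.203, 0.159, 0.278, 0.136],
    3: [0.105, 0.066, 0.108, 0.072, 0.217, 0.093, 0.071, 0.079],
}
N_PER_CELL = 3

# Residual SS and residual DF from fitting model (5.16) (Exhibit 5-4(a)).
rss = {1: 0.5564, 2: 0.6589, 3: 0.2857}
dfr = {1: 21, 2: 21, 3: 21}

def pure_error(sds, n_per_cell):
    """Return (Error SS, DFE) pooled over the time points of one batch."""
    error_ss = sum((n_per_cell - 1) * s ** 2 for s in sds)
    dfe = len(sds) * (n_per_cell - 1)
    return error_ss, dfe

def lack_of_fit_f(rss_i, dfr_i, error_ss, dfe):
    """F = [RSS - Error SS] x DFE / [(DFR - DFE) x Error SS]."""
    return (rss_i - error_ss) * dfe / ((dfr_i - dfe) * error_ss)

f_values = {}
for batch in sorted(log_sds):
    error_ss, dfe = pure_error(log_sds[batch], N_PER_CELL)
    f_values[batch] = lack_of_fit_f(rss[batch], dfr[batch], error_ss, dfe)
    print(f"batch {batch}: Error SS = {error_ss:.4f}, DFE = {dfe}, "
          f"F = {f_values[batch]:.2f}")
```

Because the standard deviations are printed to three decimals, the recomputed values can differ from the exhibit in the last digit.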
     The above results indicate that model (5.15) or  (5.14),
which are special cases of (5.16), might suffice.  Hence,  as
indicated in Figure 5-8, the next step is to fit these models and
evaluate them for their adequacy.  The results are summarized in
Parts (b) and (c) of Exhibits 5-3 and 5-4.  The latter exhibit
indicates that model (5.15) is adequate, but that model  (5.14) is
not.  Part (c) of Exhibit 5-4 also includes a test of the
hypothesis that all of the rate parameters are equal.7
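
As a sketch of how the model (5.15) estimates arise, the fragment below fits a straight line by ordinary least squares to the batch 1 mean ln(concentrations) from Exhibit 5-2(b).  Because every time point has the same number of samples, fitting the cell means gives the same intercept and slope as fitting the individual observations, although not the same standard errors.  The function name is hypothetical.

```python
def least_squares_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Time points (days) and batch 1 mean ln(concentrations), Exhibit 5-2(b).
times = [0, 4, 8, 16, 32, 64, 128, 256]
batch1_log_means = [3.854, 3.915, 3.866, 3.583, 3.522, 3.041, 1.985, 0.034]

intercept, slope = least_squares_line(times, batch1_log_means)
print(f"A(1) = {intercept:.4f}, coefficient of t = {slope:.4f}")
```

The result can be compared with the batch 1 entries for model 5.15 in Exhibit 5-3.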
     This test employs the test statistic
     6 A significance level of 0.01 is used for illustrative
purposes; some other prespecified level could also be used.
     7 Depending upon the design of the degradation study, more
sophisticated statistical analyses may also be possible that allow
testing degradation rates among certain groups of batches (e.g.,
varieties or storage conditions) rather than simply testing for
equal rates among all batches.  Similarly, such analyses can test
for higher order trends and also whether these trends vary among
groups of batches.  Appendix 5A furnishes an example.

-------
     F' = [RSSc - Σ RSSb(i)] × Σ DFRb(i) / {[DFRc - Σ DFRb(i)] × Σ RSSb(i)},

where the summation is over batches (i), and

     RSSc     =  residual sum of squares from Part c (i.e., model
                 5.14);

     RSSb(i)  =  residual sum of squares from Part b, batch i;

     DFRc     =  residual degrees of freedom from Part c; and

     DFRb(i)  =  residual degrees of freedom from Part b, batch i.
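
The statistic defined above can be checked against the figures in Exhibit 5-4.  The sketch below is a hedged illustration using the residual sums of squares and degrees of freedom printed in Parts (b) and (c) of that exhibit; the function name is hypothetical.

```python
# Residual sums of squares and degrees of freedom from Exhibit 5-4.
rss_c, dfr_c = 8.9418, 68                      # Part (c), model (5.14)
rss_b = {1: 0.5662, 2: 0.6661, 3: 0.2857}      # Part (b), model (5.15)
dfr_b = {1: 22, 2: 22, 3: 22}

def equal_rates_f(rss_c, dfr_c, rss_b, dfr_b):
    """F' for testing that all batches share one rate parameter."""
    pooled_rss = sum(rss_b.values())
    pooled_df = sum(dfr_b.values())
    numerator_df = dfr_c - pooled_df
    f_prime = ((rss_c - pooled_rss) / numerator_df) / (pooled_rss / pooled_df)
    return f_prime, numerator_df, pooled_df

f_prime, df_num, df_den = equal_rates_f(rss_c, dfr_c, rss_b, dfr_b)
print(f"F' = {f_prime:.2f} with {df_num} and {df_den} degrees of freedom")
```

The denominator pools the batch-specific residual sums of squares, which is why the test presumes similar error variances across batches.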

The F' value is compared against tabulated F percentage points,
with [DFRc - Σ DFRb(i)] and Σ DFRb(i) degrees of freedom (2 and 66
in Exhibit 5-4).  In the present
example, since F' > 4.99, we conclude, at the 0.01 level of
significance, that the rate parameters for the different batches
are different.  Note that the test statistic involves pooling of
the residual sums of squares across batches; as such, the test
procedure assumes that the underlying sampling and analytical
variances are essentially the same for all batches (see
subsection 5.3.3 and Exhibit 5.2).
     If, at Step 2 of the
estimation strategy depicted in Figure 5-8, we were to conclude
that model (5.16) was not adequate, and if outliers have been
ruled out as a cause, then use of a more complex model would be
indicated.  If we have a theoretical basis for choosing one, then
we can obviously attempt to fit that model.  A prime example is
the case of toxic metabolites, for which the following nonlinear
statistical model is a likely candidate:

 (5.17)    Y_t(i,j) = \ln[M_t(i)] + e(i,j,t)

-------
EXHIBIT 5-4.  ILLUSTRATION OF PROCEDURES FOR TESTING MODEL ADEQUACY

(a)  Model 5.16:  Yt(i,j) = A(i) + β(i)t + γ(i)t^2 + e(i,j,t)

  (1)        (2)       (3)        (4)      (5)      (6)       (7)     (8)
 Batch      Error    Residual     LOF     Error   Residual    LOF     F for Testing
  (i)        SS        SS         SS       DF       DF        DF      Lack of Fit
 -----     ------    --------    ------   -----   --------   -----    -------------
   1       0.4629     0.5564     0.0935     16       21         5       0.65 NS
   2       0.4639     0.6589     0.1950     16       21         5       1.35 NS
   3       0.1978     0.2857     0.0879     16       21         5       1.42 NS
 Total     1.1246     1.5010     0.3764     48       63        15       1.07 NS

(b)  Model 5.15:  Yt(i,j) = A(i) + β(i)t + e(i,j,t)

  (1)        (2)       (3)        (4)      (5)      (6)       (7)     (8)
 Batch      Error    Residual     LOF     Error   Residual    LOF     F for Testing
  (i)        SS        SS         SS       DF       DF        DF      Lack of Fit
 -----     ------    --------    ------   -----   --------   -----    -------------
   1       0.4629     0.5662     0.1033     16       22         6       0.60 NS
   2       0.4639     0.6661     0.2022     16       22         6       1.16 NS
   3       0.1978     0.2857     0.0879     16       22         6       1.19 NS
 Total     1.1246     1.5180     0.3934     48       66        18       0.93 NS

(c)  Model 5.14:  Yt(i,j) = A(i) + βt + e(i,j,t)

                                     (2)       (3)        (4)      (5)      (6)       (7)     (8)
                                    Error    Residual     LOF     Error   Residual    LOF     F for Testing
                                     SS        SS         SS       DF       DF        DF      Lack of Fit
                                   ------    --------    ------   -----   --------   -----    -------------
 Model 5.14                        1.1246     8.9418     7.8172     48       68        20       16.68**
 Difference
 [Model 5.14 - Model 5.15]                    7.4238                          2

 Testing β(1) = β(2) = β(3):  F = (7.4238/2) / (1.5180/66) = 161.39**
 (Compared to F with 2 and 66 degrees of freedom)

Note:     Col (4) = Col (3) - Col (2)
          Col (7) = Col (6) - Col (5)
          Col (8) = [Col (4) x Col (5)] / [Col (2) x Col (7)]
Lack of fit is indicated if Col (8) exceeds tabulated F with Col (7) and
Col (5) degrees of freedom.  NS denotes "Not Statistically Significant".
"**" denotes "significance at the 0.01 level."

-------
where Mt(i) is given by eq. (5.11b).  This model is, in fact, the
stochastic version of that model, on the logarithmic scale.
Because Ct(i) is involved in the model, measurements on the
parent compound must be available as well.  One approach for
handling this situation is the following:8

          Determine an adequate model for the parent compound
          using the methods described above.  Suppose that we
          conclude that it can be modeled via model (5.15), and
          that we obtain the estimated rate parameters β(i) and
          intercepts A(i).  From this estimation of the
          parent-compound model, we can substitute the
          appropriate estimated values into model (5.17), i.e.,
          substitute exp[A(i)] for C0(i), β(i) for β1(i)+β2(i),
          and exp[A(i)-β(i)t] for Ct(i).  Model (5.17) then will
          involve three parameters per batch (i.e., M0(i), β2(i),
          and β3(i)), which can be estimated using nonlinear
          least squares.9  Note that the β2(i) parameters in
          (5.17) should be restricted to be less than or equal to
          the corresponding β(i) in (5.15).

     If the above-described parametric approaches for modeling
the residue degradation appear inadequate, then an alternative,
successive-differences approach is suggested.  This approach,
described by Schwertman and Heilbrun (1986), approximates the
degradation of ln(concentrations) over time for a batch with a
series of piecewise linear functions over a fixed set of time
intervals.  The data points do not necessarily have to coincide
with the ends of the intervals; for instance, we might simply
choose the intervals to coincide with those available for the
storage-time frequency distribution.  The slopes of the line
segments are estimated (using weighted least squares) by modeling
differences between successive ln(concentrations) as a function
of time increments.

     8 A more sophisticated approach would simultaneously model the
parent compound (e.g., using (5.15)) and the metabolite (e.g.,
using (5.17)).
     9 Standard statistical packages, such as the SAS procedure
NLIN, are available for performing such analyses.
     Regardless of the particular method of model estimation
(ordinary least squares or nonlinear least squares applied to a
parametric model, or weighted least squares applied in the
successive-differences approach),  the result of the statistical
analysis of the degradation study data is an estimated g(t)
function, or a set of such functions.  Assuming that a set of
compatible storage-time distributions is available, the methods
described in Section 5.2 and illustrated in Exhibit 5-1 would
then be employed to produce the average storage degradation
factor(s) G.

5.4  Design of Degradation Studies

     As previously indicated, a degradation study's primary goal
is to estimate, via a statistical model, how pesticide residues
(or metabolites) in/on foods change over time while the commodity
is in storage.  A generic structure for a degradation study
involves the following:
          •    J "representative" commodity batches known to have
               been treated with the pesticide of interest are
               selected.
          •    The batches are stored under "typical" commercial
               storage conditions for a specified period of time, S.
          •    At each of m chosen time points, an "appropriate"
               plan for selecting samples of the stored commodity is
               implemented; at time point t, this involves selecting
               nt(i) samples (perhaps composites of several samples)
               from the ith batch, selecting one or more subsamples
               (to obtain specimens having suitable mass for
               analysis), and analyzing these one or more times to
               determine a residue level for each sample.

The above items demonstrate that design of a degradation study
will require consideration of a number of statistical issues.
For example:  What values should be used for J, S, m, and n_t(i)?
What is meant by "representative," "typical," and "appropriate?"
What sort of spacing of the m time points should be utilized?
These and related issues are addressed in the following
subsections.  Subsection 5.4.1 discusses issues related to
choosing batches for inclusion in the study.  Subsection 5.4.2
discusses plans for commodity sampling, subsampling, and
compositing — for a given time point.  Subsection 5.4.3
considers the number and spacing of the time points.

     5.4.1  Choosing Batches for a Degradation Study

          The first step in designing a meaningful degradation
study is to explicitly define its objectives.  This means that we
must define all commodity subsets and conditions for which we
want to obtain a separate degradation curve estimate.  (Recall
that compatible storage-time distribution data need to be
available as well, if we want to construct subset-specific
estimates of G, the average degradation factor.)
     To define the subsets, it is first useful to list the major
factors (i.e., controllable variables) that might have an impact
on the pesticide's degradation.  Such factors fall into several
classes, including:

     •    variations in where the crop is grown (climatic/growing
          regions),
     •    variations in pesticide application practices,
     •    variations in growing, harvesting, or post-harvest
          practices,
     •    variations related to crop differences (e.g.,
          varieties),
     •    variations in storage conditions.

     Clearly, such factors may not be independent of one another.
For example, storage conditions typically employed in certain
areas of the country may differ from those used in another part
of the country, or certain varieties will be confined to certain
regions.  Nevertheless, it is helpful to list each potential
factor separately, and to identify the relevant levels of each
one.  If there are more than two or three factors, it will
probably be necessary to prioritize them and to develop a design
directed at only the two or three of highest priority.
Similarly, if a variable has a large number of levels  (e.g., a
large number of varieties), then we should attempt to combine
some of the levels in some meaningful way.  Varieties might be
grouped according to major uses — for instance, apple varieties
used primarily for processing (e.g., York, Rome Beauty) might
form one group, and those generally marketed fresh might form
another group (or set of groups).  Such groupings are meaningful
not only because pesticide degradation rates may differ but also
because storage-time distributions are likely to be different for
the different groups.
     Having identified the major factors of interest, the next
step is to form a matrix showing all possible combinations of the
factor levels.  For instance, we might decide that there are four
major growing regions, three major varieties (or groups of
varieties), and two primary storage condition scenarios (perhaps
dependent on how the commodity will be processed subsequently).
In the case of apples, for example, scenario "a" might designate
"controlled atmospheric storage," and scenario "b" might
designate "standard refrigeration."  (Whereas the former allows
storage for up to 12 months, the latter allows storage for only
two to four months.)  This initial matrix might then appear as
follows:

                                 Growing Region
     Group   Condition      1      2      3      4
       A         a
                 b
       B         a
                 b
       C         a
                 b

The purpose of forming such a matrix is to ensure that we have
not omitted any important combinations of the factors.
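
The matrix-building step can be sketched in a few lines of Python. This is only an illustration: the level labels are those of the apple example above, and the function name `full_matrix` is a hypothetical helper, not part of any Agency procedure.

```python
from itertools import product

# Factor levels from the illustration in the text (assumed labels).
groups = ["A", "B", "C"]        # variety groups
conditions = ["a", "b"]         # storage condition scenarios
regions = [1, 2, 3, 4]          # growing regions


def full_matrix(groups, conditions, regions):
    """Return every (group, condition, region) combination as a list of cells."""
    return list(product(groups, conditions, regions))


cells = full_matrix(groups, conditions, regions)
print(len(cells))  # 3 groups x 2 conditions x 4 regions = 24 cells
```

Listing the cells mechanically in this way guards against silently omitting a combination before any of them are judged irrelevant.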
     The next step is to eliminate any of the combinations that
do not occur in practice or are otherwise deemed irrelevant.  For
example, the cells marked with an "X" below might be considered
irrelevant to our study:
                                 Growing Region
     Group   Condition      1      2      3      4
       A         a                 X      X      X
                 b                 X             X
       B         a          X      X      X
                 b          X
       C         a                 X      X      X
                 b                               X

Note that variety A is grown in regions 1 and 3, while varieties
B and C are grown in all regions except 1 and 4, respectively.
Similarly, storage condition "b" occurs in all regions, while "a"
is confined to regions 1 and 4.  We are thus left with 11
pertinent cells.  We might therefore define the objective of the
study to be:  "estimate degradation curves for each of these 11
cells."  In order that the experiment have the capacity to
demonstrate reproducibility, it may be desirable to include two
or three batches of the same variety from the same growing
region.
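
As a check on the count of pertinent cells, the elimination step can be sketched as follows. The two dictionaries simply encode the statements made in the illustration (where each variety group is grown and where each storage condition is used); the names `grown_in`, `stored_in`, and `pertinent_cells` are hypothetical.

```python
from itertools import product

# Regions in which each variety group is grown (from the illustration).
grown_in = {"A": {1, 3}, "B": {2, 3, 4}, "C": {1, 2, 3}}
# Regions in which each storage condition is used (from the illustration).
stored_in = {"a": {1, 4}, "b": {1, 2, 3, 4}}


def pertinent_cells(grown_in, stored_in):
    """Keep only the cells whose region both grows the group and uses the condition."""
    regions = sorted(set().union(*grown_in.values()))
    return [
        (group, cond, region)
        for group, cond, region in product(sorted(grown_in), sorted(stored_in), regions)
        if region in grown_in[group] and region in stored_in[cond]
    ]


kept = pertinent_cells(grown_in, stored_in)
print(len(kept))  # 11 pertinent cells, so 24 - 11 = 13 cells are marked "X"
```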
     Alternatively, we might believe that the "growing region"
factor will have little or no impact on the degradation rate
(i.e., on the shape of the curve).  If so, our reason for
inclusion of that factor in the matrix would be our desire to tie
in the degradation study with a planned or ongoing field trial
that attempts to ensure that there is some measure of geographic
representativeness.  In that case, the study objective would be
to estimate six curves — namely, one for each combination of
variety and storage condition.
     A degradation study may be thought of as an extension of a
residue field trial study.  Except for factors related to storage
conditions and possibly some other post-harvest activities (e.g.,
drying methods), the relevant factors are common to both types of
studies.  Ideally, therefore, the two studies would be designed
simultaneously.  Thus we can consider the pertinent factors to be
of four possible types:
     1.   Pre-harvest "treatment" factors associated with plots.
          These factors, which would include variety and mode of
          pesticide application, are common to both field trial
          studies and degradation studies.  These factors, in
          contrast to "blocking" factors, are anticipated to have
          a possible effect on degradation rates.
     2.   Pre-harvest "blocking" factors associated with plots.
          Such factors may affect initial residue levels but they
          are presumed to have no influence on degradation rates.
          Hence such factors, which might include growing region,
          are not of special interest in the degradation study,
          but their inclusion can ensure that some diversity in
          plots is present in the study.
     3.   Post-harvest "treatment" factors associated with
          batches.  These factors, such as storage conditions,
          are not a part of field trials.  For those plots for
          which more than one level of the factor is possible
          (like variety A or C in growing region 1 or variety B
          in growing region 4 in the prior illustration),  we
          should randomly assign batches to the levels.
     4.   Post-harvest "blocking" factors associated with
          batches.  These factors, which might allow differences
          in pre-storage handling of the crop to be accounted for
          in the study, would not be anticipated to affect
          degradation rates.

Since factors of types 3 and 4 are related to activities that
occur after harvest, they would not be pertinent to a residue
field trial study.  Factors of types 1 and 2 may be relevant to
both types of studies.
     One major way in which field trials and degradation studies
differ is in terms of what they represent.  The field study
should represent the population of potential plots upon which the
crop might be grown.  For example, all major growing regions
should be covered and a larger portion of the field trials should
generally be conducted in those regions with the largest
production.   Field trial commodity samples from a plot should be
selected in such a way as to be representative of the entire
plot.  A degradation study, on the other hand, does not need to
be representative of all available plots and samples used in a
degradation study do not need to be representative of the entire
plot.  In fact, we recommend that a single large batch of the
commodity from each plot be selected.  The rationale for this
recommendation is twofold: (1) representativeness of the entire
plot is not needed, because it is the change in the pesticide
residue level over time, rather than the level itself, that is of
interest, and (2) there appears to be little need for multiple
batches from the same plot, since the temporal degradation
behavior of a pesticide or metabolite from two such batches is
likely to be quite similar.  Having batches from different plots
would be more useful.
     If the commodity is typically shipped or stored in
individual containers (e.g., bushel boxes or 20-bushel pallet
boxes), then a batch can be conveniently defined in terms of the
contents of these containers.  If a single container can readily
provide several times the amount of material needed for all the
chemical analyses (at time 0 and later times), then a single such
container's contents can be regarded as the batch (see
subsections 5.4.2 and 5.4.3).  If natural storage containers are
not used or are much too large for handling experimentally —
e.g., as in grain storage — then a batch can be defined as some
smaller amount of commodity; we should take care that this choice
of a smaller unit will not appreciably affect the manner in which
the commodity would normally be stored, however.
     Ideally, all of the commodity batches that are to be stored
under the same conditions would be transported to a common
storage facility.  In the above illustration regarding apples,
for example, two such facilities would be needed — one for
standard refrigeration and one for controlled atmospheric
storage.  Such control of the storage conditions should enhance
our ability in the statistical analysis to discern degradation
rate effects due to storage conditions, as well as other
batch-to-batch differences (such as those associated with
varieties, for example).

     5.4.2  Sampling Plans for Commodity Sampling

          The following general strategy may be used for sampling
of the commodity from a batch.  Upon arrival of the batch at the
storage facility, the commodity should be placed into 20 to 30
individual storage containers.  Each container should be at least
of sufficient size to accommodate all analyses of its contents to
be performed at any given time point (as discussed in the next
subsection).  In doing so, we should attempt to fill the
containers more or less simultaneously if there is reason to
believe that systematic heterogeneity in the pesticide levels
exists within the batch.  Also, to the extent possible, we should
seek to avoid excessive handling of the commodity.  Unique
identifying numbers should be assigned to, and recorded on, the
containers.  Batch identification and the date and time storage
began should also be readily available.  The containers should
then be placed together into the storage facility and stored
under the designated conditions.
     A single container would be chosen for each time point that
chemical analyses are due.  A random ordering of the containers
is used to determine which container is withdrawn for each time
point.
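
A minimal sketch of the random withdrawal scheme is given below, assuming that the container identifiers and the list of time points are already recorded. The function name `assign_containers` and the optional `seed` argument (used only to make the schedule reproducible) are illustrative choices, not requirements stated in these guidelines.

```python
import random


def assign_containers(container_ids, time_points, seed=None):
    """Randomly pick one distinct container for each time point.

    container_ids: list of unique identifiers recorded on the containers.
    time_points:   list of times (e.g., days in storage) at which analyses are due.
    """
    if len(container_ids) < len(time_points):
        raise ValueError("need at least one container per time point")
    rng = random.Random(seed)
    # Sampling without replacement gives a random ordering of distinct containers.
    chosen = rng.sample(container_ids, k=len(time_points))
    return dict(zip(time_points, chosen))


# Hypothetical example: 25 containers, five sampling times (days).
schedule = assign_containers(list(range(1, 26)), [0, 7, 14, 30, 60], seed=1)
```

Fixing the schedule before storage begins avoids any temptation to choose the most accessible container when a time point arrives.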
     For a given time point at which analyses are to be
performed10 on a  given container's contents, the question of
how to sample and composite the commodity must be addressed.
Since our objective is to estimate a degradation curve, the goal
at a given design point should be to provide an accurate and
precise estimate of the true residue level (or logarithm thereof)
for that commodity batch at that time.  To ensure accuracy,
suitable and adequate quality assurance and quality control
procedures must be employed.  To ensure precision, "adequate"
numbers of the "appropriate" types of commodity samples must be
subjected to chemical analysis.  To define "adequate" and
"appropriate," we need to have information on the magnitudes of
both the sampling variability and the analytical variability that
we can anticipate encountering in the degradation study.  This
information can be used, as described below, to investigate how
precision will be affected by compositing and by the number of
analyses.
     10Such time points are hereafter referred to as "design
points."
     We must first define the minimum unit of commodity that is
conveniently represented by a chemical analysis.  We will refer
to this as the analytical unit.  Examples of such units would be
an individual apple or potato, or a pint of wheat.  Suppose we
were to perform an analysis on such an analytical unit; then we
might employ the following model to describe the structure of the
resulting observation:
  (5.18)    $x = z + \varepsilon$,

where x = observed residue level, z = true residue level, and
$\varepsilon$ = deviation due to measurement error.11  We now define the
following:
     $\sigma_z$ = standard deviation of true residue levels among
          analytical units in the batch or batch subset (e.g.,
          reflecting variability of residue levels among apples
          in the batch that would be observed in the absence of
          measurement errors).
     $\sigma_e$ = measurement-error standard deviation of residue levels
          (i.e., reflecting variability among replicate analyses
          on the same unit).
     11The term "measurement errors" is not here restricted to mean
instrumental errors; rather it may include errors arising from many
components, such as sample homogenization, preparation and
extraction.
Given the above structural model, the total variance associated
with observed residue levels x can be represented as

  (5.19)    $\sigma_x^2 = \sigma_z^2 + \sigma_e^2$

Note that if estimates for the total variability and for the
measurement-error component are available, then an estimate for
the true, among-item component can be obtained by difference.
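
The "by difference" calculation can be sketched as follows, assuming the additive variance structure implied by model (5.18), in which the total variance is the sum of the among-unit and measurement-error variances. The function name is hypothetical, and setting a negative difference to zero is an assumption of this sketch (a common convention when sample estimates are noisy), not a rule stated in the text.

```python
import math


def among_unit_sd(total_sd, measurement_sd):
    """Estimate the among-unit standard deviation by difference of variances."""
    diff = total_sd ** 2 - measurement_sd ** 2
    if diff < 0:
        # Assumption: a negative estimate is treated as zero among-unit variability.
        return 0.0
    return math.sqrt(diff)


print(among_unit_sd(5.0, 3.0))  # 4.0, since 25 - 9 = 16
```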
     At the given time point, we consider the following general
strategy for obtaining the estimated residue value:

     1.   Choose k random samples of size p  (e.g., p apples per
     random sample) from the randomly chosen container.
     2.   Composite the p units of each sample, so as to obtain k
     composited samples.   (We assume that p  is not so large that
     homogenization is a practical problem;  in fact, we assume
     that no additional measurement error is introduced through
     the compositing process.  We also assume the p units are
     approximately the same size.)
     3.   Perform r replicate analyses per composited unit.
     (Hence there are rk analyses in total.)
     4.   Using the concentration scale, calculate the means of
     the replicate analyses, and average these over the k
     composited samples.   (Let L denote this overall average).

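     With respect to z, compositing is essentially like
mathematical averaging.  Thus the variance of L is,
approximately,

  (5.20)    $\mathrm{Var}(L) \approx \frac{1}{k}\left(\frac{\sigma_z^2}{p} + \frac{\sigma_e^2}{r}\right)$

where $\sigma_z$ and $\sigma_e$ are the among-unit and measurement-error
standard deviations defined above.  Note that if p=1, then no
compositing actually takes place.  Similarly, if r=1, then there
are no replicate analyses on the same (analytical or composited)
unit.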
      Eq.  (5.20) indicates how precision of an average
concentration at a design point is affected by our choice of
values for p, k, and r.  Note first that increasing k will tend
to be more beneficial than increasing r, since both variance
components are divided by k, while only the measurement error
component is affected by r.  Disregarding costs, it would
therefore be better to maximize k and choose r=1 (i.e., use k
samples and one analysis per sample), unless information for
estimating  the individual variance components is regarded as an
important secondary objective of the degradation study.  (This
may very well be the case, especially during the early phase of
the study since that information could help us refine the initial
sample size estimates.)
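
The preference for large k over large r can be illustrated numerically. The sketch below assumes the variance expression of eq. (5.20) in the form Var(L) = (sigma_z^2/p + sigma_e^2/r)/k, which is the form consistent with eq. (5.22) and Tables 5-2 and 5-3; the function name and the unit standard deviations are hypothetical.

```python
def variance_of_average(sigma_z, sigma_e, p, k, r):
    """Approximate variance of the overall average L.

    sigma_z: among-unit (sampling) standard deviation
    sigma_e: measurement-error standard deviation
    p: units per composite, k: composited samples, r: replicate analyses per sample
    """
    return (sigma_z ** 2 / p + sigma_e ** 2 / r) / k


# Four analyses in total, spent two different ways (no compositing, p = 1).
many_samples = variance_of_average(1.0, 1.0, p=1, k=4, r=1)     # 0.5
many_replicates = variance_of_average(1.0, 1.0, p=1, k=1, r=4)  # 1.25
print(many_samples, many_replicates)
```

With the same number of analyses, spreading them over four samples gives less than half the variance obtained from four replicates of a single sample in this example.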
     If we divide eq. (5.20) by $L^2$, then the variances can be
expressed on a relative basis, and we can obtain:

  (5.21)    $\gamma_L = \gamma_e U / \sqrt{k}$

 where
          $\gamma_L = \sigma_L / L$ = the overall relative standard error for the L
                    estimate,

          $\gamma_e = \sigma_e / L$ = relative standard deviation associated with
                    measurement errors, and

  (5.22)    $U = \sqrt{\frac{R^2}{p} + \frac{1}{r}}$

          where
          $R = \sigma_z / \sigma_e$ = ratio of sampling to measurement
                    error standard deviations.
Table 5-2 shows values of U, as defined by eq. (5.22), for
selected values of R, r, and p.  Table 5-3 shows values of U
divided by the square root of k (i.e., values of $\gamma_L$, apart from
the factor $\gamma_e$), for selected values of R, r, and k (rows of
table) and of p (columns of table).  The rows are sorted by R and
by k x r, which is the total number of analyses to be performed
for the given batch and time point.  If we have an idea of the
magnitude of R, then we can use the tabular results to see what
relative gains and losses in precision can be achieved by varying
p, r, and/or k.  Several aspects of these results are noteworthy.
First, for given p, R, and k x r, the aforementioned preference
for large k and small r should be apparent.  Also note that
little benefit is derived from compositing if the ratio of
sampling variability to measurement-error variability is small;
for instance, for R=0.5, forming a composite sample of 20 units
(e.g., apples) will reduce the standard deviation of the overall
average by only about 10 percent relative to that which would be
obtained without compositing (i.e., p=1).  But for large R,
reductions of well over 50 percent can be achieved even by
compositing as few as 5 units.  Finally, we note the diminishing
benefits on the precision due to increasing k; that is, much more
improvement in precision is achieved by increasing k from 1 to 2
than by increasing it from 4 to 5.  This is obvious also from
eqs. (5.20) and (5.21), which show that the overall standard error
varies inversely with the square root of k.
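
The tabulated factors can be regenerated directly, which is convenient when values of R, p, r, or k other than those tabulated are of interest. The sketch below assumes U = sqrt(R^2/p + 1/r) as in eq. (5.22) and takes the Table 5-3 entry to be U divided by the square root of k; the function names are hypothetical.

```python
import math


def u_factor(R, p, r):
    """U of eq. (5.22): depends on the SD ratio R, composite size p, replicates r."""
    return math.sqrt(R ** 2 / p + 1.0 / r)


def precision_factor(R, p, k, r):
    """Table 5-3 entry: U divided by the square root of the number of samples k."""
    return u_factor(R, p, r) / math.sqrt(k)


# Effect of compositing 20 units when R = 0.5 (one sample, one analysis).
small_R_gain = 1.0 - u_factor(0.5, 20, 1) / u_factor(0.5, 1, 1)
# Effect of compositing 5 units when R = 5.0.
large_R_gain = 1.0 - u_factor(5.0, 5, 1) / u_factor(5.0, 1, 1)
print(round(small_R_gain, 2), round(large_R_gain, 2))  # 0.1 0.52
```

The two printed reductions correspond to the "about 10 percent" and "well over 50 percent" cases discussed in the text.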
     In addition to precision, cost considerations are also
important.   The incremental cost of performing the k x r analyses
is comprised of two major components:  k x r x  (average per-unit
analysis cost) + k x  (average per-unit sampling cost).  Analysis
cost includes the average cost of selecting a subsample,
preparing it for analysis, and the actual chemical analysis.  The
sampling cost component consists of the average per-unit costs
associated with collecting, handling, and preserving a sample.
If p>1, then the sampling cost is also assumed to include the
cost of compositing.
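
The two-component cost expression can be written as a small function so that candidate (k, r) combinations can be costed side by side. The function name and the dollar figures are hypothetical and serve only to illustrate the arithmetic.

```python
def incremental_cost(k, r, analysis_cost, sampling_cost):
    """Cost at one design point: k*r analyses plus k collected (composited) samples.

    analysis_cost: average per-unit analysis cost
    sampling_cost: average per-unit sampling cost (including compositing if p > 1)
    """
    return k * r * analysis_cost + k * sampling_cost


# Hypothetical costs: 100 per analysis, 20 per sample; six analyses either way.
cost_k6_r1 = incremental_cost(k=6, r=1, analysis_cost=100, sampling_cost=20)  # 720
cost_k3_r2 = incremental_cost(k=3, r=2, analysis_cost=100, sampling_cost=20)  # 660
print(cost_k6_r1, cost_k3_r2)
```

For a fixed number of analyses, the design with more samples costs more only through the sampling component, which can then be weighed against its gain in precision.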

                               5-47

-------
     5.4.3
Choosing Time Points
          After choosing batches using the ideas expressed in
subsection 5.4.1 and deciding on how the sampling of batches
should be carried out (subsection 5.4.2), we must next decide
when we are going to sample the batches and make the
concentration measurements.  This involves two major steps:

     1.   Determining  S, the overall time interval that the
          degradation  study is to cover.
     2.   Deciding how to allocate the design points to the
          interval (i.e., at how many time points within S should
          we sample the stored commodity, how should the time
          points be spaced over S, and how many samples should be
          analyzed at each chosen point?).

We address each of these in turn.
     In general, the time interval of the study should encompass
the entire range of time over which the commodity is stored.  If
there is prior knowledge concerning the rate of degradation that
suggests that concentration levels will approach zero (or limits
of detection) over a shorter interval, then S can be tentatively
set to be equal to that shorter time interval.  We should
nevertheless be prepared to modify S to be longer if the data
from the degradation study so indicate.
     The choice of numbers of design points, their spacing, and
the allocation of samples to each point are intimately related
and must be considered jointly.  The optimal choice of design
points depends upon our objectives and upon the form of our
assumed model.  Unfortunately, there will almost always be (a)
more than one objective, and (b) some uncertainty concerning the
form of the model.  The objectives could include:

TABLE 5-2.  VALUES OF U AS A FUNCTION OF THE RATIO OF RELATIVE
            STANDARD DEVIATIONS (R), NUMBER OF UNITS
            COMPOSITED (p), AND NUMBER OF REPLICATE ANALYSES (r)

Ratio of Relative
Std. Deviations (R)     r       p=1       p=5       p=10      p=20
        0.5             1     1.1180    1.0247    1.0124    1.0062
        0.5             2     0.8660    0.7416    0.7246    0.7159
        0.5             3     0.7638    0.6191    0.5986    0.5881
        0.5             4     0.7071    0.5477    0.5244    0.5123
        1.0             1     1.4142    1.0954    1.0488    1.0247
        1.0             2     1.2247    0.8367    0.7746    0.7416
        1.0             3     1.1547    0.7303    0.6583    0.6191
        1.0             4     1.1180    0.6708    0.5916    0.5477
        1.5             1     1.8028    1.2042    1.1068    1.0548
        1.5             2     1.6583    0.9747    0.8515    0.7826
        1.5             3     1.6073    0.8851    0.7472    0.6677
        1.5             4     1.5811    0.8367    0.6892    0.6021
        2.0             1     2.2361    1.3416    1.1832    1.0954
        2.0             2     2.1213    1.1402    0.9487    0.8367
        2.0             3     2.0817    1.0646    0.8563    0.7303
        2.0             4     2.0616    1.0247    0.8062    0.6708
        5.0             1     5.0990    2.4495    1.8708    1.5000
        5.0             2     5.0498    2.3452    1.7321    1.3229
        5.0             3     5.0332    2.3094    1.6833    1.2583
        5.0             4     5.0249    2.2913    1.6583    1.2247

R =  (relative) sampling-error standard deviation/(relative)
     measurement-error standard deviation

TABLE 5-3.  FACTORS FOR ASSESSING RELATIVE PRECISION OF
            CONCENTRATION AVERAGES FOR A GIVEN COMMODITY BATCH
            AT A GIVEN TIME, AS A FUNCTION OF NUMBER OF UNITS
            COMPOSITED (p), NUMBER OF SAMPLES ANALYZED (k),
            AND NUMBER OF REPLICATE ANALYSES PER (COMPOSITED)
            SAMPLE (r)

Ratio of (Relative)   No. of Analyses
Std. Deviations (R)       (k x r)       k    r      p=1       p=5       p=10      p=20
        0.5                  1          1    1    1.1180    1.0247    1.0124    1.0062
        0.5                  2          2    1    0.7906    0.7246    0.7159    0.7115
        0.5                  2          1    2    0.8660    0.7416    0.7246    0.7159
        0.5                  3          3    1    0.6455    0.5916    0.5845    0.5809
        0.5                  3          1    3    0.7638    0.6191    0.5986    0.5881
        0.5                  4          4    1    0.5590    0.5123    0.5062    0.5031
        0.5                  4          2    2    0.6124    0.5244    0.5123    0.5062
        0.5                  4          1    4    0.7071    0.5477    0.5244    0.5123
        0.5                  5          5    1    0.5000    0.4583    0.4528    0.4500
        0.5                  6          3    2    0.5000    0.4282    0.4183    0.4133
        0.5                  6          2    3    0.5401    0.4378    0.4233    0.4158
        0.5                  8          4    2    0.4330    0.3708    0.3623    0.3579
        0.5                  8          2    4    0.5000    0.3873    0.3708    0.3623
        0.5                  9          3    3    0.4410    0.3575    0.3456    0.3395
        0.5                 10          5    2    0.3873    0.3317    0.3240    0.3202
        0.5                 12          4    3    0.3819    0.3096    0.2993    0.2940
        0.5                 12          3    4    0.4082    0.3162    0.3028    0.2958
        0.5                 15          5    3    0.3416    0.2769    0.2677    0.2630
        0.5                 16          4    4    0.3536    0.2739    0.2622    0.2562
        0.5                 20          5    4    0.3162    0.2449    0.2345    0.2291
        1.0                  1          1    1    1.4142    1.0954    1.0488    1.0247
        1.0                  2          2    1    1.0000    0.7746    0.7416    0.7246
        1.0                  2          1    2    1.2247    0.8367    0.7746    0.7416
        1.0                  3          3    1    0.8165    0.6325    0.6055    0.5916
        1.0                  3          1    3    1.1547    0.7303    0.6583    0.6191

TABLE 5-3.  (CONTINUED)

Ratio of (Relative)   No. of Analyses
Std. Deviations (R)       (k x r)       k    r      p=1       p=5       p=10      p=20
        1.0                  4          4    1    0.7071    0.5477    0.5244    0.5123
        1.0                  4          2    2    0.8660    0.5916    0.5477    0.5244
        1.0                  4          1    4    1.1180    0.6708    0.5916    0.5477
        1.0                  5          5    1    0.6325    0.4899    0.4690    0.4583
        1.0                  6          3    2    0.7071    0.4830    0.4472    0.4282
        1.0                  6          2    3    0.8165    0.5164    0.4655    0.4378
        1.0                  8          4    2    0.6124    0.4183    0.3873    0.3708
        1.0                  8          2    4    0.7906    0.4743    0.4183    0.3873
        1.0                  9          3    3    0.6667    0.4216    0.3801    0.3575
        1.0                 10          5    2    0.5477    0.3742    0.3464    0.3317
        1.0                 12          4    3    0.5774    0.3651    0.3291    0.3096
        1.0                 12          3    4    0.6455    0.3873    0.3416    0.3162
        1.0                 15          5    3    0.5164    0.3266    0.2944    0.2769
        1.0                 16          4    4    0.5590    0.3354    0.2958    0.2739
        1.0                 20          5    4    0.5000    0.3000    0.2646    0.2449
        1.5                  1          1    1    1.8028    1.2042    1.1068    1.0548
        1.5                  2          2    1    1.2748    0.8515    0.7826    0.7458
        1.5                  2          1    2    1.6583    0.9747    0.8515    0.7826
        1.5                  3          3    1    1.0408    0.6952    0.6390    0.6090
        1.5                  3          1    3    1.6073    0.8851    0.7472    0.6677
        1.5                  4          4    1    0.9014    0.6021    0.5534    0.5274
        1.5                  4          2    2    1.1726    0.6892    0.6021    0.5534
        1.5                  4          1    4    1.5811    0.8367    0.6892    0.6021
        1.5                  5          5    1    0.8062    0.5385    0.4950    0.4717
        1.5                  6          3    2    0.9574    0.5627    0.4916    0.4518
        1.5                  6          2    3    1.1365    0.6258    0.5284    0.4721

TABLE 5-3.  (CONTINUED)

Ratio of (Relative)   No. of Analyses
Std. Deviations (R)       (k x r)       k    r      p=1       p=5       p=10      p=20
        1.5                  8          4    2    0.8292    0.4873    0.4257    0.3913
        1.5                  8          2    4    1.1180    0.5916    0.4873    0.4257
        1.5                  9          3    3    0.9280    0.5110    0.4314    0.3855
        1.5                 10          5    2    0.7416    0.4359    0.3808    0.3500
        1.5                 12          4    3    0.8036    0.4425    0.3736    0.3339
        1.5                 12          3    4    0.9129    0.4830    0.3979    0.3476
        1.5                 15          5    3    0.7188    0.3958    0.3342    0.2986
        1.5                 16          4    4    0.7906    0.4183    0.3446    0.3010
        1.5                 20          5    4    0.7071    0.3742    0.3082    0.2693
        2.0                  1          1    1    2.2361    1.3416    1.1832    1.0954
        2.0                  2          2    1    1.5811    0.9487    0.8367    0.7746
        2.0                  2          1    2    2.1213    1.1402    0.9487    0.8367
        2.0                  3          3    1    1.2910    0.7746    0.6831    0.6325
        2.0                  3          1    3    2.0817    1.0646    0.8563    0.7303
        2.0                  4          4    1    1.1180    0.6708    0.5916    0.5477
        2.0                  4          2    2    1.5000    0.8062    0.6708    0.5916
        2.0                  4          1    4    2.0616    1.0247    0.8062    0.6708
        2.0                  5          5    1    1.0000    0.6000    0.5292    0.4899
        2.0                  6          3    2    1.2247    0.6583    0.5477    0.4830
        2.0                  6          2    3    1.4720    0.7528    0.6055    0.5164
        2.0                  8          4    2    1.0607    0.5701    0.4743    0.4183
        2.0                  8          2    4    1.4577    0.7246    0.5701    0.4743
        2.0                  9          3    3    1.2019    0.6146    0.4944    0.4216
        2.0                 10          5    2    0.9487    0.5099    0.4243    0.3742

TABLE 5-3.  (CONTINUED)

Ratio of (Relative)   No. of Analyses
Std. Deviations (R)       (k x r)       k    r      p=1       p=5       p=10      p=20
        2.0                 12          4    3    1.0408    0.5323    0.4282    0.3651
        2.0                 12          3    4    1.1902    0.5916    0.4655    0.3873
        2.0                 15          5    3    0.9309    0.4761    0.3830    0.3266
        2.0                 16          4    4    1.0308    0.5123    0.4031    0.3354
        2.0                 20          5    4    0.9220    0.4583    0.3606    0.3000
        5.0                  1          1    1    5.0990    2.4495    1.8708    1.5000
        5.0                  2          2    1    3.6056    1.7321    1.3229    1.0607
        5.0                  2          1    2    5.0498    2.3452    1.7321    1.3229
        5.0                  3          3    1    2.9439    1.4142    1.0801    0.8660
        5.0                  3          1    3    5.0332    2.3094    1.6833    1.2583
        5.0                  4          4    1    2.5495    1.2247    0.9354    0.7500
        5.0                  4          2    2    3.5707    1.6583    1.2247    0.9354
        5.0                  4          1    4    5.0249    2.2913    1.6583    1.2247
        5.0                  5          5    1    2.2804    1.0954    0.8367    0.6708
        5.0                  6          3    2    2.9155    1.3540    1.0000    0.7638
        5.0                  6          2    3    3.5590    1.6330    1.1902    0.8898
        5.0                  8          4    2    2.5249    1.1726    0.8660    0.6614
        5.0                  8          2    4    3.5532    1.6202    1.1726    0.8660
        5.0                  9          3    3    2.9059    1.3333    0.9718    0.7265
        5.0                 10          5    2    2.2583    1.0488    0.7746    0.5916
        5.0                 12          4    3    2.5166    1.1547    0.8416    0.6292
        5.0                 12          3    4    2.9011    1.3229    0.9574    0.7071
        5.0                 15          5    3    2.2509    1.0328    0.7528    0.5627
        5.0                 16          4    4    2.5125    1.1456    0.8292    0.6124
        5.0                 20          5    4    2.2472    1.0247    0.7416    0.5477

      (a)  achieve precise estimates of model parameters for the
          chosen model form;
      (b)  test model parameters  (e.g., equality of degradation
          rates for different batches or groups of batches) with
          adequate statistical power;12
      (c)  explore the structure  of the errors and how they change
          with concentration level;
      (d)  test for lack of fit of the model  (and fit alternative,
          more complex models if necessary);
      (e)  estimate components of variance associated with
          sampling and with analysis.

The first objective can be regarded as the primary one; the
second is a complementary objective, in the sense that design
modifications aimed at improving precision for the parameter
estimates will also enhance the power of tests for comparing the
parameters.
     The remaining three objectives are frequently secondary, but
are still important, since they pertain to the model's capability
to represent the data and thus relate to how well we will be able
to defend our choice of model and data transformation.  Their
importance in any particular study will be largely determined by
prior knowledge.  There may be some cases, for instance, where we
have prior information on the measurement error variability and
typical sampling variations and there will be little need to
incorporate design features aimed at objective (e).  On the other
hand, if that information were important to obtain within the
framework of the degradation study (e.g. in the case of acute
toxicants), then replicate analyses (r>1) on the same analytical
unit would be called for early in the study, and more design
points should perhaps be allocated to the first few days of the
study (so as to refine our initially selected values of p, r, and
k at later time points).
     12Power is the probability of detecting a difference when a
true difference of some magnitude actually exists.
     Unfortunately, designs developed to achieve one of the above
objectives may be highly inefficient or even useless in terms of
meeting another objective.  Similarly, designs optimal for one
assumed model will not be optimal if another model proves to be
correct.  For example, suppose at the design stage that we assume
model (5.15), namely, that ln(concentrations) are represented as
straight lines over time.  Then the optimal design for maximizing
precision of the $\beta(i)$ parameter estimates (i.e., achieving
objective (a)) is to use two time points, at time zero and time
S, with half of the total observations at each point.  Such a
design is useless, however, if another model applies.  For
example, three or more design points are necessary to estimate
the parameters in model  (5.16).  Also, for the same reason, such
a two-point design would not allow the lack of fit of model
(5.15) to be evaluated (in the manner illustrated in Exhibit 5-4,
for instance).
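
The trade-off can be made concrete with a small calculation. For a straight line fitted by ordinary least squares, the variance of the slope estimate is inversely proportional to the sum of squared deviations of the design times about their mean, so a larger sum means a more precise slope. The sketch below compares a two-point design with an equally spaced four-point design having the same total number of observations; the function name, the 90-day span, and the allocation of 12 observations are hypothetical.

```python
def rescaled_sum_of_squares(times, counts, span):
    """Sum of n_u * (t'_u - mean t')^2 for times rescaled to the interval [0, 1]."""
    scaled = [t / span for t in times]
    n = sum(counts)
    mean = sum(c * t for c, t in zip(counts, scaled)) / n
    return sum(c * (t - mean) ** 2 for c, t in zip(counts, scaled))


# Twelve observations over a hypothetical 90-day study.
two_point = rescaled_sum_of_squares([0, 90], [6, 6], span=90)                  # 3.0
four_point = rescaled_sum_of_squares([0, 30, 60, 90], [3, 3, 3, 3], span=90)  # about 1.667

# Ratio of slope standard deviations (four-point relative to two-point).
sd_ratio = (two_point / four_point) ** 0.5
print(two_point, round(four_point, 3), round(sd_ratio, 3))  # 3.0 1.667 1.342
```

In this example the four-point design gives a slope standard deviation about 34 percent larger than the two-point design, which is the price paid for being able to check the straight-line assumption.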
     Thus the objectives, especially objectives  (a) and  (d), are
somewhat antagonistic, and compromises must be made.  The
resultant designs are then not optimal for meeting any single
objective, but can be regarded as useful with respect to several
objectives and robust against misspecifications  of the
degradation model.
     To define a design over the time span S, we need the
following notation.  Consider a given commodity batch.  Let m
denote the number of design points, and let $n_u$ denote the number
of observations (i.e., samples or composited samples) to be taken
at the uth design point (for batch i).13  Denote the total number
of observations as $n = \sum_u n_u$.  Let $t_u$ denote the uth of the
time points, $\{t_1, t_2, t_3, \ldots, t_m\}$, with $t_1 = 0$ and $t_m = S$.
Also define the set of rescaled times $t'_u = t_u / S$.  Note that
$t'_1 = 0$ and $t'_m = 1$.
     Even though the straight-line  model (5.15)  may not provide a
totally adequate fit, it  is convenient  to  utilize that model for
design purposes  and to  consider the impacts  of  alternative
designs on the   precision of the parameters  in  that model.  This
is because the linear component (on the logarithmic scale) is
likely to be a dominant component.   This is  especially true for
parent compounds, but it  is also likely to hold for toxic
metabolites  (see Figure 5-5b,  for instance).   Let Y(i)  denote the
mean of the  In(concentrations)  over all n  observations for batch
i.  Let t denote the average of the n associated times (expressed
in days, for instance).   Fitting model  (5.15)  to the data via
ordinary least squares  will result  in the  following estimated
degradation rate parameter for the  ith  batch:

  (5.23)    \hat{\beta}(i) = - Q_{ty} / Q_{tt}

where Q_{ty} is the sum of cross-products of the times and the
observed ln(concentrations), i.e.,

  (5.24)    Q_{ty} = \sum_{u=1}^{m} \sum_{j=1}^{n_u} (t_u - \bar{t}) (Y_u(i,j) - \bar{Y}(i)) ;

and where Q_{tt} is the sum of squares of the times for the batch i
observations, i.e.,

  (5.25)    Q_{tt} = \sum_{u=1}^{m} \sum_{j=1}^{n_u} (t_u - \bar{t})^2 = \sum_{u=1}^{m} n_u (t_u - \bar{t})^2 .

Note that (5.25) can be re-expressed in terms of the rescaled
times:

            Q_{tt} = S^2 \sum_{u=1}^{m} n_u (t'_u - \bar{t}')^2 = S^2 Q'_{tt} ,

where t'_u = t_u/S and \bar{t}' = \bar{t}/S is the mean of the rescaled times.
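
     As an illustration only, the least-squares calculation described
above (the estimated rate is the negative ratio of the sum of
cross-products to the sum of squares of the times) can be sketched in
Python.  The function name, the variable names, and the six data values
are hypothetical and are not taken from any study.  The sketch assumes
that model (5.15) is a straight line in time on the ln scale, and it
also checks numerically that the sum of squares of the times equals S^2
times the sum of squares of the rescaled times.

```python
def degradation_rate(times, ln_conc):
    """Least-squares degradation rate for one batch.

    times   : sampling time of each observation (e.g., in days)
    ln_conc : ln(concentration) of each observation
    Returns (beta_hat, q_ty, q_tt).
    """
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(ln_conc) / n
    # Sum of cross-products of times and ln(concentrations).
    q_ty = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, ln_conc))
    # Sum of squares of the times.
    q_tt = sum((t - t_bar) ** 2 for t in times)
    return -q_ty / q_tt, q_ty, q_tt


# Hypothetical batch: m = 4 design points over S = 100 days, n = 6
# observations, generated without error from ln(conc) = 2.0 - 0.01 * t.
S = 100
times = [0, 0, 25, 50, 100, 100]
ln_conc = [2.0 - 0.01 * t for t in times]

beta_hat, q_ty, q_tt = degradation_rate(times, ln_conc)

# The same sum of squares computed from the rescaled times t' = t / S.
rescaled = [t / S for t in times]
mean_rescaled = sum(rescaled) / len(rescaled)
q_tt_rescaled = sum((t - mean_rescaled) ** 2 for t in rescaled)

print(beta_hat)                      # close to 0.01
print(q_tt, S ** 2 * q_tt_rescaled)  # the two values agree up to rounding
```

     Because the hypothetical data lie exactly on a line, the estimate
recovers the slope used to generate them; with real data the residual
scatter would determine its precision.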

     Assuming model (5.15), the standard deviation of \hat{\beta}(i) is
equal to

  (5.26)    std. dev. [\hat{\beta}(i)] = \sigma_{Y|t} / \sqrt{Q_{tt}}

where \sigma_{Y|t} is the square root of the variance of Y (within times).
If model (5.15) is an adequate model, then the square root of the
residual variance from fitting the model will furnish an estimate
of \sigma_{Y|t}.  Also, at the design stage, if we have approximate values
for \gamma_e and R (see Section 5.4.2), then we can approximate \sigma_{Y|t} by
\gamma_e U, where U is given by eq. (5.22).  Thus we can approximate the
standard deviation of \hat{\beta}(i) as

  (5.27)    std. dev. [\hat{\beta}(i)] \approx \gamma_e U / (S \sqrt{Q'_{tt}})

     This equation makes  it clear how we can enhance  the
precision of the estimated degradation rate parameters:

(1)  Decrease \gamma_e, the relative measurement-error standard
     deviation (for example, by use of a more sensitive
     analytical protocol);
(2)  Decrease U:
     a)   by reducing R, i.e., achieving smaller variation
          among samples;14
     b)   by increasing p, the number of units composited; and/or
     c)   by increasing r, the number of replicate analyses on
          the same (composited) unit;
(3)  Increase S, the time span of the degradation study; and/or
(4)  Increase Q'_{tt}, the sum of squares of the rescaled design
     points.

     14 This was one reason for recommending use of a single
commodity batch, as contrasted, for example, with commodity
representative of an entire plot.

The items subject to most control are items 2b, 2c, and 4.  Items
2b and 2c are concerned with what constitutes an observation  at a
given time point and were considered in the prior  subsection,
with some illustrative results presented  in Table  5-2.
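
     The effect of each item on precision can be checked numerically.
The following is a minimal sketch, assuming the approximation
std. dev. = \gamma_e U / (S \sqrt{Q'_{tt}}) implied by the discussion
above.  The function name and the baseline inputs (\gamma_e = 0.15,
U = 1.0488, S = 100 days, six equally spaced design points with four
observations each) are illustrative choices only.

```python
import math


def slope_std_dev(gamma_e, u_factor, span, q_tt_rescaled):
    """Approximate std. dev. of the estimated degradation rate."""
    return gamma_e * u_factor / (span * math.sqrt(q_tt_rescaled))


# Baseline design: rescaled times 0, 0.2, ..., 1 (mean 0.5), four
# observations at each, so Q'_tt = 4 * sum((t' - 0.5)^2).
t_rescaled = [u / 5 for u in range(6)]
q_base = 4 * sum((t - 0.5) ** 2 for t in t_rescaled)

base = slope_std_dev(0.15, 1.0488, 100, q_base)
half_error = slope_std_dev(0.075, 1.0488, 100, q_base)   # item 1: halve gamma_e
double_span = slope_std_dev(0.15, 1.0488, 200, q_base)   # item 3: double S
double_q = slope_std_dev(0.15, 1.0488, 100, 2 * q_base)  # item 4: double Q'_tt

print(base, half_error / base, double_span / base, double_q / base)
```

     Halving \gamma_e or doubling S halves the standard deviation,
whereas doubling Q'_{tt} reduces it only by the factor 1/\sqrt{2}.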
     The remainder of this subsection  focuses on item  4.  Given a
fixed p and r, our other mechanism  for influencing the precision
with which we can estimate the rate parameters in model (5.15) is
to change the value of Q'_{tt} by manipulating the design parameters,
that is, by varying the values for m, n, {t_1, t_2, ..., t_m}, and
{n_1, n_2, ..., n_m}.
     Some prior comments have already been made  concerning  the
choice of m.  We noted, for  instance, that while m=2  is  optimal
for one objective, it has major drawbacks with respect to others.
Hence, we recommend choosing m to be at  least 4  for short
duration studies  (say, S<2 months), and  somewhat larger  (say,
m=6, 7, or 8) for longer studies.  As a  general  rule, we should
use a smaller m when we are  confident that linear  relationships
between In(concentrations) and time hold (e.g.,  based on evidence
from prior degradation studies on a similar  commodity) and  should
use a larger m when we are less certain  that that  is  the case.
     There are obviously many possible choices for the spacing
scheme of the design points, that is, the choice of times {t_1,
t_2, ..., t_m} over the interval S to be used as design points.  For
simplicity, we consider here only two possible schemes.  Recall
that t_u = S t'_u, for u=1,2,...,m.
     1.   Equal spacing over [0,S]:   t'_u = (u-1)/(m-1).
     2.   Base 2 spacing over [0,S]:  t'_1 = 0, and t'_u = 2^u/2^m for
          u \ge 2.
Table 5-4 gives the actual values of the t'_u for these spacing
schemes, for m ranging from 2 to 8.  Note that the two schemes
are identical for m=2 and for m=3.
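
     The two spacing schemes can be generated as follows.  This is a
sketch only: it assumes equal spacing places the uth rescaled point at
(u-1)/(m-1) and base 2 spacing places it at 0 for u=1 and at 2^u/2^m
for u of 2 or more, which is what the values in Table 5-4 indicate.
The function names are hypothetical, and exact fractions are used so
that the output can be compared with the table directly.

```python
from fractions import Fraction


def equal_spacing(m):
    """Rescaled design points under equal spacing, for m >= 2."""
    return [Fraction(u - 1, m - 1) for u in range(1, m + 1)]


def base2_spacing(m):
    """Rescaled design points under base 2 spacing, for m >= 2."""
    return [Fraction(0)] + [Fraction(2 ** u, 2 ** m) for u in range(2, m + 1)]


def actual_times(rescaled, span):
    """Convert rescaled points t' to study times t = S * t'."""
    return [span * t for t in rescaled]


for m in range(2, 9):
    print(m, [str(t) for t in equal_spacing(m)],
          [str(t) for t in base2_spacing(m)])

# Example: base 2 spacing with m = 5 over a 120-day study.
days = actual_times(base2_spacing(5), 120)
print([str(d) for d in days])
```

     For the 120-day example the sampling days are 0, 15, 30, 60, and
120, which shows how base 2 spacing concentrates design points early in
the study.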
     There are also many possible allocation schemes, that is,
many possible choices for {n_1, n_2, ..., n_m}.  We consider here the
following three:

     Allocation scheme 1:  multiples of {1,1,1,...,1}
     Allocation scheme 2:
          m even: multiples of {m/2,...,3,2,1,1,2,3,...,m/2}
          m odd:  multiples of {(m+1)/2,...,3,2,1,2,3,...,(m+1)/2}
     Allocation scheme 3:
          m even: multiples of {2^{(m-2)/2},...,4,2,1,1,2,4,...,2^{(m-2)/2}}
          m odd:  multiples of {2^{(m-1)/2},...,4,2,1,2,4,...,2^{(m-1)/2}}

     The effect of a design on the precision of the slope estimate
can be summarized by the factor

  (5.28)    W = 1 / \sqrt{Q'_{tt}}

which appears in eq. (5.27).  Such W factors are shown in the
last two columns of Table 5-5 (labeled as "slope standard
deviation factors").  Rows of the table are organized by m and n,
which appear in the first two columns.  The allocation scheme is
identified and defined in the next set of columns.  The results
indicate that use of either spacing scheme will produce
essentially the same precision for the slope estimates.
Allocation scheme, however, is a more important determinant of
precision of the rate parameter estimates of model (5.15).  As an
example, consider m=6 and n=24.  Allocation scheme 2, namely
{6,4,2,2,4,6}, produces W \approx 0.51, while W \approx 0.59 for allocation
scheme 1 with four samples per time point.  This indicates that,
for fixed values of U (see Table 5-2), \gamma_e, and S, scheme 1 with
these m and n will produce estimates of the \hat{\beta}(i) having standard
deviations that are about 16 percent larger than those that would
be achieved by allocation scheme 2.
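
     The comparison above can be reproduced with a short calculation.
This sketch assumes that the slope standard deviation factor is
W = 1/\sqrt{Q'_{tt}}, where Q'_{tt} is the sum of squares of the
rescaled design points weighted by the n_u; that assumption is
consistent with the tabulated factors.  All function names are
hypothetical, and the allocation generator encodes our reading of the
three schemes (weights increasing outward from the center of the time
span).

```python
import math


def rescaled_times(m, spacing):
    """Rescaled design points t'_u for 'equal' or 'base2' spacing."""
    if spacing == "equal":
        return [(u - 1) / (m - 1) for u in range(1, m + 1)]
    if spacing == "base2":
        return [0.0] + [2.0 ** (u - m) for u in range(2, m + 1)]
    raise ValueError("spacing must be 'equal' or 'base2'")


def base_allocation(m, scheme):
    """Smallest allocation {n_1, ..., n_m} for allocation scheme 1, 2, or 3."""
    if scheme == 1:
        return [1] * m
    k = (m + 1) // 2  # number of weights, counted from the center outward
    if scheme == 2:
        outward = list(range(1, k + 1))
    elif scheme == 3:
        outward = [2 ** j for j in range(k)]
    else:
        raise ValueError("scheme must be 1, 2, or 3")
    inward = outward[::-1]
    # Even m repeats the central weight; odd m has a single central point.
    return inward + (outward if m % 2 == 0 else outward[1:])


def w_factor(t_rescaled, allocation):
    """Slope standard deviation factor W = 1 / sqrt(Q'_tt)."""
    n = sum(allocation)
    t_bar = sum(nu * t for nu, t in zip(allocation, t_rescaled)) / n
    q = sum(nu * (t - t_bar) ** 2 for nu, t in zip(allocation, t_rescaled))
    return 1.0 / math.sqrt(q)


# m = 6, n = 24: scheme 2 doubled versus scheme 1 with four per time point.
scheme2 = [2 * x for x in base_allocation(6, 2)]   # [6, 4, 2, 2, 4, 6]
scheme1 = [4 * x for x in base_allocation(6, 1)]   # [4, 4, 4, 4, 4, 4]

for spacing in ("equal", "base2"):
    t = rescaled_times(6, spacing)
    w1, w2 = w_factor(t, scheme1), w_factor(t, scheme2)
    print(spacing, round(w1, 4), round(w2, 4), round(w1 / w2, 3))
```

     Under either spacing scheme the ratio of the two factors is about
1.16, so the choice of allocation matters far more here than the choice
of spacing.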
     The table, along with eq. (5.22) and (5.27), can also be
used to find designs that will achieve a specified level of
precision in a \hat{\beta}(i) estimate.  Suppose we want the standard
deviation of such an estimate to be less than 0.001 (in
ln(concentration) units per day).  If R=1, r=1, and p=10 are
assumed, then eq. (5.22) or Table 5-2 shows that U=1.0488.  Thus
if S=100 days and \gamma_e=0.15, and we use (5.27) and (5.28), we can
obtain:
      (0.15)(1.0488)W/100 < 0.001,
or
     W < 0.636.

Table 5-5 can then be examined to find designs meeting this
requirement.  If we want m=6, for example, then n=14 with
allocation scheme 3 comes close (W of about 0.64 to 0.65, just above
the bound), and n=24 with allocation scheme 1 or 2 meets the
requirement (i.e., a total of 24 composited samples for each batch
for which we want an estimated degradation curve).
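
     The same screening can be written as a short calculation.  This is
a sketch under the assumption that the standard deviation is
approximated by \gamma_e U W / S; the function name is hypothetical, and
the W values are copied from the m=6, equal-spacing rows of Table 5-5
rather than recomputed.

```python
def max_w_factor(target_sd, gamma_e, u_factor, span):
    """Largest W for which gamma_e * U * W / S stays below target_sd."""
    return target_sd * span / (gamma_e * u_factor)


w_max = max_w_factor(target_sd=0.001, gamma_e=0.15, u_factor=1.0488, span=100)

# (n, allocation scheme) -> W under equal spacing, from the m = 6 rows
# of Table 5-5.
table_m6 = {
    (12, 2): 0.7293,
    (14, 3): 0.6482,
    (18, 1): 0.6901,
    (24, 1): 0.5976,
    (24, 2): 0.5157,
    (28, 3): 0.4583,
}

qualifying = sorted(design for design, w in table_m6.items() if w < w_max)
print(round(w_max, 3), qualifying)
```

     Raising S or lowering \gamma_e relaxes the bound on W in direct
proportion, so a longer study can meet the same precision target with
fewer samples per batch.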
                               5-60

-------
TABLE 5-4.  DEFINITION OF SPACING SCHEMES FOR DESIGN POINTS

              Values of t'_u Under Equal Spacing
  m    t'_1   t'_2   t'_3   t'_4   t'_5   t'_6   t'_7   t'_8
  2     0      1
  3     0     1/2     1
  4     0     1/3    2/3     1
  5     0     1/4    1/2    3/4     1
  6     0     1/5    2/5    3/5    4/5     1
  7     0     1/6    1/3    1/2    2/3    5/6     1
  8     0     1/7    2/7    3/7    4/7    5/7    6/7     1

              Values of t'_u Under Base-2 Spacing
  m    t'_1   t'_2   t'_3   t'_4   t'_5   t'_6   t'_7   t'_8
  2     0      1
  3     0     1/2     1
  4     0     1/4    1/2     1
  5     0     1/8    1/4    1/2     1
  6     0     1/16   1/8    1/4    1/2     1
  7     0     1/32   1/16   1/8    1/4    1/2     1
  8     0     1/64   1/32   1/16   1/8    1/4    1/2     1
                               5-61

-------
     Based upon the results of Table 5-5,  the following should be
considered:
     1.   Use base-2 spacing (spacing scheme 2), at least
          initially.  From a precision standpoint,  there is
          likely to be little difference in the two spacing
          schemes.  Base-2 spacing generates more early data than
          scheme 1, and adjustments in subsequent design point
          selection can be made more quickly.  For instance, if
          the degradation is slow relative to the overall time
          span of the design, and assuming data are obtained and
          analyzed on an ongoing basis throughout the experiment,
          then it would be appropriate to add more design points
          in the last half of the time span.  If the degradation
          is rapid, then more usable data will be obtained by
          this scheme.
     2.   If we are confident that the straight-line model is
          appropriate, allocation scheme 2 or 3 should be
          considered.  If not,  as would be likely when a toxic
          metabolite was of interest, then the equal allocation
          scheme should be considered since this should allow
          other models such as (5.16) or (5.17)  to be fit with
          adequate precision.
                               5-62

-------
Table 5-5.  Factors for Assessing Precision of Slope Estimates in Model (5.15), for Selected Designs

 No. of   Total                                                   Slope Std. Dev. Factor:
 Time     Sample  Allocation                                      Equal       Base 2
 Points   Size    Scheme                                          Spacing     Spacing
 (m)      (n)                   n1  n2  n3  n4  n5  n6  n7  n8
 2           2        1          1   1                           1.4142
             4        1          2   2                           1.0000
             6        1          3   3                           0.8165
             8        1          4   4                           0.7071
            10        1          5   5                           0.6325
            12        1          6   6                           0.5774
 3           3        1          1   1   1                       1.4142
             5        2          2   1   2                       1.0000
             6        1          2   2   2                       1.0000
             9        1          3   3   3                       0.8165
            10        2          4   2   4                       0.7071
            12        1          4   4   4                       0.7071
            15        1          5   5   5                       0.6325
            15        2          6   3   6                       0.5774
            18        1          6   6   6                       0.5774
            20        2          8   4   8                       0.5000
            25        2         10   5  10                       0.4472
            30        2         12   6  12                       0.4082
 4           4        1          1   1   1   1                   1.3416      1.3522
             6        2          2   1   1   2                   0.9733      0.9749
             8        1          2   2   2   2                   0.9487      0.9562
            12        1          3   3   3   3                   0.7746      0.7807
            12        2          4   2   2   4                   0.6882      0.6894
            16        1          4   4   4   4                   0.6708      0.6761
            18        2          6   3   3   6                   0.5620      0.5629
            20        1          5   5   5   5                   0.6000      0.6047
            24        1          6   6   6   6                   0.5477      0.5521
            24        2          8   4   4   8                   0.4867      0.4875
            30        2         10   5   5  10                   0.4353      0.4360
            36        2         12   6   6  12                   0.3974      0.3980
 5           5        1          1   1   1   1   1               1.2649      1.2649
                                                         5-63

-------
Table 5-5.  (CONTINUED)

 No. of   Total                                                   Slope Std. Dev. Factor:
 Time     Sample  Allocation                                      Equal       Base 2
 Points   Size    Scheme                                          Spacing     Spacing
 (m)      (n)                   n1  n2  n3  n4  n5  n6  n7  n8
 5 (cont.)  10        1          2   2   2   2   2               0.8944      0.8944
            11        2          3   2   1   2   3               0.7559      0.7553
            13        3          4   2   1   2   4               0.6667      0.6642
            15        1          3   3   3   3   3               0.7303      0.7303
            20        1          4   4   4   4   4               0.6325      0.6325
            22        2          6   4   2   4   6               0.5345      0.5341
            25        1          5   5   5   5   5               0.5657      0.5657
            26        3          8   4   2   4   8               0.4714      0.4697
            30        1          6   6   6   6   6               0.5164      0.5164
            33        2          9   6   3   6   9               0.4364      0.4361
            39        3         12   6   3   6  12               0.3849      0.3835
            44        2         12   8   4   8  12               0.3780      0.3777
            52        3         16   8   4   8  16               0.3333      0.3321
            65        2         15  10   5  10  15               0.3381      0.3378
 6           6        1          1   1   1   1   1   1           1.1952      1.1898
            12        1          2   2   2   2   2   2           0.8452      0.8413
            12        2          3   2   1   1   2   3           0.7293      0.7258
            14        3          4   2   1   1   2   4           0.6482      0.6421
            18        1          3   3   3   3   3   3           0.6901      0.6869
            24        1          4   4   4   4   4   4           0.5976      0.5949
            24        2          6   4   2   2   4   6           0.5157      0.5132
            28        3          8   4   2   2   4   8           0.4583      0.4541
            30        1          5   5   5   5   5   5           0.5345      0.5321
            36        1          6   6   6   6   6   6           0.4880      0.4857
            36        2          9   6   3   3   6   9           0.4211      0.4190
            42        3         12   6   3   3   6  12           0.3742      0.3707
            48        2         12   8   4   4   8  12           0.3647      0.3629
            56        3         16   8   4   4   8  16           0.3241      0.3211
            60        2         15  10   5   5  10  15           0.3262      0.3246
 7           7        1          1   1   1   1   1   1   1       1.1339      1.1328
            14        1          2   2   2   2   2   2   2       0.8018      0.8010
                                                    5-64

-------
Table 5-5.  (CONTINUED)
No. of Time Points (m)
-------
5.5  References

     Box, GEP and HL Lucas (1959).  "Design of Experiments in Non-Linear
     Situations," Biometrika, Vol. 46, p.77-90.

     SAS Institute, Inc. (1985).  SAS User's Guide: Statistics, Version 5
     Edition.  Cary, NC:  SAS Institute, Inc.

     Schwertman, Neil C and Lance K Heilbrun (1986).  "A Successive
     Differences Method for Growth Curves With Missing Data and Random
     Observation Times," Journal of the American Statistical
     Association, Vol. 81, No. 396, p.912-916.

     Snedecor, George W and William G Cochran (1980).  Statistical
     Methods, 7th Edition.  Ames, Iowa:  Iowa State University Press.
                                    5-66

-------