FINAL REPORT
on
EVALUATION OF A COMPUTER METHOD TO PREDICT
OCTANOL WATER PARTITION COEFFICIENTS
TECHNICAL DIRECTIVE 12
by
ALBERT J. LEO
SUBCONTRACT NO. T6415(7197)-029
CONTRACT NO. 68-01-5043
R.G. Wilhelm, Project Officer
R. Lipnick, Task Officer
Office of Pesticides and Toxic Substances
U.S. Environmental Protection Agency
September 1982
-------
This document has been reviewed and approved for publication
by the Office of Toxic Substances, Office of Pesticides and
Toxic Substances, U. S. Environmental Protection Agency.
Approval does not signify that the contents necessarily reflect
the views and policies of the Environmental Protection Agency,
nor does the mention of trade names or commercial products
constitute endorsement or recommendation for use.
-------
TABLE of CONTENTS
Executive Summary
I. Shake Flask Measurements
II. HPLC as Alternative Measurement Method
III. Improvements in Manual Calculation
    Rule Change
    Rule Simplifications
    Electronic Effects
    Alkyl-Aryl Effects
    Intramolecular H-Bonding
    Negative Ortho Effect
IV. Use of CLOGP-I
V. Analysis of CLOGP-I Test Results
Bibliography
TABLES
1. Compounds Measured by Shake Flask
2. Sigma-Rho Constants for Electronic Effect
3. Negative Ortho Corrections
4. Example Calculations Assisting CLOGP-I
5. Solutes with Highest Deviations—Reasons
6. WLN Test Structures Ordered by Formula
7. WLN Test Structures Ordered by FCON Number
FIGURES
1. Halogen-H-Polar Interaction in Aliphatic Systems
2. Statistics for CLOGP-I, Mode (1)
3. Statistics for CLOGP-I, Mode (2)
4. Statistics for CLOGP-I, Mode (3)
5. Plot of Residuals vs. CLOGP, Mode (3)
SUPPLEMENTS
1. Output at 2N Level of Test Structures
2. Magnetic Tape of CLOGP With Inconsistencies Corrected
3. Complete Paper on Calculating Hydrophobicity in Aromatic Rings
-------
EXECUTIVE SUMMARY
Octanol-water partition coefficients of over 160 solutes were measured
by shake-flask procedure. These were used to confirm old Fragment and
Factor values and to establish new ones. A number of important pesticides
were included as 'benchmarks'.
A minimum amount of experimental effort was expended on HPLC procedures
with the objective of developing one which would provide a more rapid but
just as reliable method for determining the hydrophobic parameter. The
difficulties of using water as the mobile phase led us to undertake a more
thorough literature search, which in turn led us to work in progress at the
U. of Illinois (Champaign-Urbana), and to the conclusion that any
successful solution to the problem was likely to be found in the use of
very expensive, computer-driven, multi-pump HPLC units, which were beyond
the scope of this contract. (See body of report for evaluation.)
Several very important improvements were made in the rules for log P
calculation as a result of this contract effort. A new method for
calculating the proximity effect of polar fragments embedded in aromatic
rings was devised and tested. A new, simpler system of accounting for the
electronic effect in aliphatic systems between halogens and H-polar
fragments was implemented. Finally, a comprehensive method of calculating
the interaction of multiple substituents on aromatic rings was perfected.
It will be essential to include all of these improvements in the CLOGP
algorithm if it is to perform the tasks EPA expects of it.
The task of installing a working version of CLOGP (initial version by
Chou & Jurs, Pennsylvania State Univ.) at EPA in Washington, D.C. for
practical application work was, frankly, disappointing. The programming
strategy initially chosen for speed on a Modcomp 16-bit minicomputer
baffled our attempts to convert it from a research tool into a working tool
to be handed over to someone with minimal experience in log P calculation.
In spite of considerable effort to eliminate the problem during the period
of this contract, the output of CLOGP depended on the ORDER in which the
structural information was input. Although this is rarely a problem if
only a dozen structures are input at a 'sitting', it often occurs when
computing from a file of hundreds or thousands. Post-contract effort has
(apparently) eliminated this problem. However, the original programming
strategy also set a practical upper limit of 300 fragments and factors, and
thus new ones cannot be added as they are determined. Finally, our efforts to
install CLOGP on the DEC 20 at EPA in Washington, D.C. were unsuccessful,
but it is operational at ERL-Duluth.
For these reasons, and after consulting other programmers familiar with
CLOGP who also were unable to circumvent these problems, some of the later
programming effort by Pomona personnel under this contract was directed
towards devising a table-driven second version of CLOGP. But until such
time as a 'stand-alone' program becomes available, the instructions
provided in this report will enable the operator to use CLOGP-I in a manner
which equals or exceeds the original objectives set forth for the contract.
-------
I. Shake-Flask Measurements
Since any measurements used to establish Fragments or Factors used in
computer calculations tend to be used without frequent review, it is
incumbent on us to set the highest standards on the procedures for
determining the octanol/water partition coefficients. Experimental details
can be found in Ref. 1, p. 587. Lots of MC/B octanol (m.p. -16 to -15) were
tested for absence of absorption down to 220 nm. A single lot was used for the
entire set of determinations without need for further purification. Stock
solutions of octanol saturated with distilled water (or buffer) and water
(or buffer) saturated with octanol were maintained in an air-conditioned
laboratory. Temperature records were kept, and a variance of five degrees
(C) necessitated a re-saturation of the solvent phases followed by a 12 hr.
settling period.
A minimum of three determinations over at least a three-fold
concentration range was required to establish a partition coefficient for
this work. A ten-fold concentration range was often employed, and
concentration dependence was looked for as an indication of possible
association in either phase (usually the water). It was not unusual for 12
or more runs to be made to establish one log P. (The current charge by
Pomona Medchem for each partition coefficient measurement meeting these
criteria is $250.00.)
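The acceptance criteria just described (at least three determinations spanning at least a three-fold concentration range) can be written as a simple check. The following is only a minimal sketch under stated assumptions: the function name, the argument names, and the representation of each run as a (concentration, log P) pair are hypothetical and are not taken from the laboratory procedure itself.

```python
from statistics import mean, stdev


def summarize_log_p(runs, min_runs=3, min_conc_ratio=3.0):
    """Summarize shake-flask runs given as (concentration, log_p) pairs.

    Returns (mean_log_p, std_dev, meets_criteria). The criteria are the two
    stated in the text: at least `min_runs` determinations, covering at
    least a `min_conc_ratio`-fold range of solute concentration.
    """
    if not runs:
        raise ValueError("no runs supplied")
    concentrations = [c for c, _ in runs]
    values = [v for _, v in runs]
    # Ratio of highest to lowest concentration used in the set of runs.
    conc_ratio = max(concentrations) / min(concentrations)
    meets = len(runs) >= min_runs and conc_ratio >= min_conc_ratio
    # A standard deviation needs at least two values.
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread, meets
```

A noticeable trend of the individual log P values with concentration, rather than random scatter, would be the signal of association mentioned above; this sketch reports only the overall spread and does not test for such a trend.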
Unless there was a poor material balance, it was the usual practice to
analyse only one phase (usually the water). Standard analytical procedures
were followed. Samples were weighed on a Cahn 25 Electrobalance. U.v.
determinations were made on a Cary 15 spectrophotometer. Gas
chromatography was performed on a Varian 2700 using a Vidar 6300 digital
integrator. HPLC (used for analysis, not partitioning) was done on a
Waters unit with LKB Uvicord 2138 detector operating at 208 nm.
It is readily apparent from the compounds listed in Table 1 that the
choice of structures for measurement was not always based on the simplest
one with the feature in question. Considerations of cost, availability and
purity were also important. Furthermore, because of the association of the
Pomona Medchem Project with other research efforts, it was often possible
to use a set of chemicals on which other important determinations were
being made. For example, there were X-ray coordinates determined for the
set of hippurates in Table 1, and conformational analyses made to
rationalize other unusual properties of these substituents were expected to
help predict the hydrophobic Factors needed for these groups in similar
circumstances. A similar opportunity arose for the N-methoxy-ureas, a
functional group fairly common to compounds of environmental concern. The
availability of a solute set expected to show some conformational anomalies
was most fortunate.
Table 1 lists the compounds whose partition coefficients were measured
under this contract. They are sorted by structure via WLN.
-------
II. HPLC Procedures as an Alternate to Shake-Flask
There are numerous reports of work that attempt to relate the relative
retention time in high-performance liquid chromatography columns to the
shake-flask partition coefficient between octanol and water.2 3 4 5 6 From
a theoretical viewpoint, it would seem that the closest relationship ought
to exist where the stationary phase was 'bound' octanol and the mobile
phase was water saturated with octanol.2 7 Given conditions of column
length and pumping rate so that an equilibrium was attained, log (T-To)
should be proportional to log P. Although a few laboratories still report
notable success with this method, mechanical details have plagued the
procedure, and even frequent runs of reference standards do not always
eliminate errors. The columns are not available pre-packed from the
manufacturer, and while column preparation procedures have been simplified,
one can expect considerable variation between laboratories, especially with
the shorter columns where channeling can introduce large errors. Most HPLC
pumps do not operate well on a mobile phase as viscous as water, and the
check valves are much more prone to malfunction with it.
Most of the mechanical problems can be overcome by using a commercially
available C-18 column and a less viscous mobile phase than water. The long
alkyl chains are thought to mimic the C-8 chain of octanol, while the
residual polar silanol sites take the place of the polar -OH group.
Mixtures of water and an organic solvent can be used as the mobile phase.
While acetonitrile has been proposed as the organic component, we can see
no reason for not employing as many -OH groups as possible, and therefore
we think methanol is to be preferred, except for very lipophilic solutes
where i-propanol may be advantageous.
While our work was still in the preliminary stage of determining
whether there was a linear relationship between % methanol and log(T-To)
for various structures, we became aware of the work of Dr. John Garst at
the University of Illinois (Champaign). He had available a Hewlett-Packard
model 1084B HPLC which could be programmed to run and plot (T-To) for a
series of solvent percentages. In return for providing him with
'preferred' octanol/water shake-flask values from our data base, he kindly
kept us abreast of his research, a report of which has recently been
submitted for publication.8 9 His investigations closely followed what
we had planned, but his advanced equipment allowed him to accomplish much
more than we would have in the same time. In addition, he developed some
techniques to prolong column life and reproducibility which could be
crucial to any procedure which EPA may depend upon for a regulatory
function.
Both Veith6 and Garst8 have appreciated that their HPLC procedures can
extend a measured hydrophobicity beyond the practical limits for the shake-
flask method. It may well be that the model will not prove to be identical
to the equilibrium of a solute between two liquid phases, but with
compounds whose log P values would be above 6, the concentrations being
dealt with in the aqueous phase are so low that adequacy is all that should
be hoped for in a model. We think it very likely that HPLC parameters in
this high range will be found to correlate certain biological data as soon
as there are enough appropriate ones available.
There are a few aspects of Garst's work with solutes in the 'ordinary'
log P range which should raise a note of caution and indicate that further
work should be undertaken before his method can be considered as a reliable
replacement of the shake-flask for hydrophobic parameter measurement. The
first objection is not a serious one for EPA's purposes: Because of the
small difference (T-To), log P values below 0.4 cannot be accurately
determined. The second is more important: Sometimes the relationship
between % methanol and log(T-To) is curvilinear. For example, in the case
of benzene, the curvature is concave downward, and the intercept at 100%
water is lower than a tangential projection at appreciable methanol
percentages. The log P calculated from the actual intercept is low by over
0.5 log unit. In the range of 60% methanol down to roughly 35% methanol,
the curvature is slight and can be treated as linear within a confidence
limit of 96.9%. Extrapolation of this linear portion gives a log(T-To)
intercept which results in a calculated log P of 2.00. This compares more
favorably with the accepted shake-flask determination of 2.13.
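The extrapolation described for benzene, fitting only the nearly linear region between roughly 35% and 60% methanol and reading off the intercept at 0% methanol (100% water), can be sketched as an ordinary least-squares fit. This is an illustrative sketch, not Garst's procedure: the function name and the (percent methanol, log(T-To)) data layout are assumptions, and the conversion of the intercept into a log P requires a calibration that is not given here.

```python
def extrapolate_to_water(points, lo=35.0, hi=60.0):
    """Fit a straight line to (percent_methanol, log_t_minus_t0) points.

    Only points with lo <= percent_methanol <= hi are used, so curved
    regions outside that window do not influence the fit.
    Returns (slope, intercept), the intercept being the extrapolated
    log(T-To) at 0% methanol.
    """
    selected = [(x, y) for x, y in points if lo <= x <= hi]
    if len(selected) < 2:
        raise ValueError("need at least two points inside the fitting window")
    n = len(selected)
    mean_x = sum(x for x, _ in selected) / n
    mean_y = sum(y for _, y in selected) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in selected)
    if sxx == 0:
        raise ValueError("all selected points share one methanol percentage")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in selected)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Restricting the window is the design choice that matters: for a concave-downward plot, points measured at low methanol content lie below the extended line, so including them would pull the intercept, and hence the calculated log P, downward.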
Garst provides only one example of a plot of log(T-To) vs. methanol %
where the curvature is concave upward: that for methyl-i-butyl xanthine.
Although the partition coefficient for this solute has not been measured by
shake-flask, it can be calculated from the dimethyl analog with some
confidence. It appears that an extrapolation from high methanol content,
where the curvature is least, gives an intercept which yields a calculated
log P which is much higher than what is expected by shake-flask. The
intercept from low or zero % methanol would make the discrepancy even
greater.
Garst suggests that the curvature of these plots results from failure to
reach equilibrium with the columns and flow rates investigated so far, but
that extrapolation from the 'linear' portions can circumvent this
difficulty. But he also suggests that many of the shake-flask values are
not correct for the given structures because of dimer and other multiple-
component formation, and the values from HPLC may be the 'true' parameters.
In this regard, his argument seems to us to have little merit in view of
the fact that, in the case of benzene, there is no concentration dependence
over a 10+ concentration range, and there is agreement between HPLC and
shake-flask for naphthalene and anthracene. The fact that there IS good
agreement for such a wide variety of structures is important, and if the
classes of structures where disagreement is great are carefully studied and
defined, it may well be that HPLC can lead us to a new parameter with
significance for biological QSAR.
Caffeine, like the xanthine derivative discussed above, is found by HPLC
to have a much higher log P value than determined by shake-flask (0.63 vs.
-0.07). Garst attributes this to the possibility of hydrate formation.
Hydrate formation has been shown to be a factor in the partition
coefficient in solvent pairs where the organic phase is as non-polar as
benzene.10 Evidence for its influence in alcohol/water partitioning is
lacking. As the HPLC/shake-flask differences become more clearly defined it
may be found that the former are better parameters for certain biological
activity. If not, then the differences must be analyzed for their
predictability if HPLC is to serve as a 'plug-in' replacement for the more
costly shake-flask method.
-------
III. Improvements in Calculation Methods
The earliest method to be proposed for calculating log P (oct/water)
from structure was based on the 'pi' system of replacing the hydrogen on an
aromatic ring with a specified substituent group.11 This system was widely
used for over a decade, but was not well suited for reduction to a computer
algorithm. Rekker12 proposed an additive (as opposed to a replacement)
system in which a hydrogen was also assigned a hydrophobic constant. In
accounting for constitutive Factors, Rekker made use of a 'Magic Constant'
which was multiplied by an integer to yield the desired interaction
correction. To date no method has been published which allows these
integers to be predicted from structure, and so it has appeared impossible
to reduce Rekker's method to a computerized form. (Another difficulty
arose from the fact that Rekker never defined a Fragment; if one was found
in his table of values, you knew it qualified as such.)
The log P calculation method of Hansch and Leo13 also uses the system of
adding Fragments and interaction Factors, but it was designed from the
outset with a computer algorithm in mind. It was apparent in the early
stages of this effort that the partition coefficient was an extremely
complex parameter, influenced by apparently subtle changes in the
arrangement of the component parts of the solute structure. The most
effective computational strategy, therefore, depended on the best
compromise between two conflicting demands: If the Fragments were defined
as very small units (i.e. the constituent atoms), the Factors necessary to
allow for the many possibilities of interconnection would be unmanageable;
if the Fragments were defined too large (thus containing many of the
troublesome Factors within a measured value) many more measurements would
be required to begin a workable system, and in fact it would be
indistinguishable from the 'pi' system. It was hoped that a satisfactory
compromise would result if Fragments were defined by means of 'Isolating
Carbon Atoms' (ICs): An Isolating Carbon atom is one not multiply-bonded
to a hetero atom; all atoms or groups of atoms whose remaining bonds are to
ICs are fundamental Fragments. This definition was used quite successfully
for a great number of manual calculations, but the only way it was possible
to clearly appreciate any improvements which might be called for was to
greatly extend this range through calculation via computer—a task made
possible largely through the contract which supported this work.
Rule Change
The only desirable rule change indicated from this study involves
Fragments fused in hetero-aromatic rings. The original rule (Ref. 13, p. 34)
produced unnecessarily large Fragments for the common purine analogs, for
example. Furthermore, each aromatic system seemed to act as a unit as far
as proximity effect is concerned, and so the positive Factor was more
closely related to how many Fragments were present, not to how closely they
were spaced. Based on these observations the revised rule states: "All
carbon atoms in an aromatic ring are isolating unless they are doubly
bonded to a hetero atom outside the ring."
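The difference between the original Isolating Carbon definition and the revised rule for aromatic rings can be illustrated with a toy representation. This is a hedged sketch only: describing each carbon by a list of (neighbor element, bond order, neighbor in same ring) tuples, using a Kekule-style bond order of 2 for ring double bonds, and the function names themselves are all assumptions made for illustration and do not reflect how CLOGP stores structures.

```python
NON_HETERO = {"C", "H"}  # every other element is treated as a hetero atom


def is_isolating_original(bonds):
    """Original rule: a carbon is isolating unless it is multiply bonded
    to a hetero atom (anywhere)."""
    return not any(
        order > 1 and element not in NON_HETERO
        for element, order, _in_ring in bonds
    )


def is_isolating_aromatic_revised(bonds):
    """Revised rule for a carbon in an aromatic ring: isolating unless it
    is doubly bonded to a hetero atom that lies outside the ring."""
    return not any(
        order == 2 and element not in NON_HETERO and not in_ring
        for element, order, in_ring in bonds
    )


# Hypothetical examples of single ring carbons.
pyridine_c2 = [("N", 2, True), ("C", 1, True), ("H", 1, False)]
ring_carbonyl_c = [("O", 2, False), ("N", 1, True), ("C", 1, True)]
benzene_c = [("C", 2, True), ("C", 1, True), ("H", 1, False)]
```

Under these assumptions the carbon next to the ring nitrogen of pyridine is non-isolating by the original rule but isolating by the revised one, which is what keeps fused-in Fragments small; a ring carbon carrying an exocyclic carbonyl oxygen remains non-isolating under both.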
Using the revised definition, the Factor for multiple occurrence of
hydrophilic Fragments (called 'proximity effect' in aliphatic systems) is
proportional to the number of Fragments present and to the sum of their
values:
    F = 0.22 \sum (f_1 + f_2)
    F = 0.32 \sum (f_1 + f_2 + f_3)
    F = 0.42 \sum (\ldots)
The new aromatic Fragment definition and multiple-Factor have been
implemented on the operating version of CLOGP, except for rings in which
'aromaticity' is difficult to define. For the purpose of calculating log
P, it is expedient to define an aromatic ring as any not containing a
saturated carbon atom, but we have not devised a system to allow a carbonyl
group to be accepted into an aromatic ring without disturbing other parts
of the logic of CLOGP-I. The present version recognizes a di-vinyl
attachment in the case of quinone and a styryl/aromatic attachment for
naphthoquinone. Adequate agreement with observed values is obtained if an
aromatic Fragment value is taken in the first instance and the average of
aromatic and di-aromatic taken in the second.
Rule Simplifications
When work on this contract began, calculation of the Factor in aliphatic
systems arising from the electronic effect of halogens near an H-polar
group was not well-defined. See Ref. 13, pp. 27-28. Brandstrom14 recently
published a method for calculating this effect in the Hammett fashion as a
product of (sigma x rho). This approach appears to treat halogen-halogen
interactions on a more rational basis than the geminal and vicinal
corrections incorporated in CLOGP, but it is no more accurate and much more
difficult to program. In Brandstrom's examples of halogen-H-polar
interactions, the (sigma x rho) product works well, but in the wider
selection employed in this work it does not, as will be apparent below.
For this reason we are continuing with a pragmatic approach in CLOGP.
When two isolating carbons separate a halogen from a Fragment capable of
H-bonding, the correction Factor is +0.55. Normally this Factor should
only be applied once for each H-polar Fragment even if a beta-halogen
appears on more than one of its valencies. The early version of CLOGP
introduced a Factor for EACH such occurrence.
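The difference between the two ways of applying the beta-halogen Factor can be made concrete with a short sketch. It assumes, purely for illustration, that each H-polar Fragment is summarized by the number of its valencies carrying a beta-halogen; the function and parameter names are hypothetical and are not CLOGP identifiers.

```python
BETA_HALOGEN_FACTOR = 0.55  # correction quoted in the text


def beta_halogen_correction(beta_counts, per_occurrence=False):
    """Total beta-halogen correction for a solute.

    beta_counts: one integer per H-polar Fragment, giving the number of
        its valencies on which a beta-halogen appears.
    per_occurrence: False applies the Factor once per Fragment that has
        any beta-halogen (the rule stated in the text); True applies it
        for every occurrence, as the early program version did.
    """
    if per_occurrence:
        return BETA_HALOGEN_FACTOR * sum(beta_counts)
    return BETA_HALOGEN_FACTOR * sum(1 for n in beta_counts if n > 0)
```

For a single H-polar Fragment with beta-halogens on two of its valencies, the stated rule gives +0.55, whereas the per-occurrence behaviour gives +1.10.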
As expected, the electronic effect of a halogen on an H-polar Fragment
is greater when only one isolating carbon intervenes. The Factors for each
type of occurrence which had been measured by 1979 appeared in Table IV-2
in Ref. 13 (p. 28). All but two of these were handled by the early version
of CLOGP and all are now operational on the current Pomona version.
To widen the scope of computer calculation of alpha-halogen effects, the
blanks in Table IV-2 (Ref. 13) were filled in, either by measurement (e.g.
the chloroacetic acids found in Table 1) or by interpolation. It was seen
that a simplification was possible whereby the Factor does not depend on
halogen type and there is need to distinguish only three types of H-polar
fragments. The overall effect can best be appreciated from Fig. 1 which
plots the correction Factor (always positive) against the number of alpha-
halogens. It can be seen that sulfonyl-containing Fragments remain as
sensitive to the electronic enhancement with the addition of the second and
third alpha-halogen, while the increment is less for other Fragment types.
This data does NOT fit the Brandstrom (sigma x rho) scheme. It is hoped
that as additional partitioning data is acquired for this structural
feature, three Fragment types will continue to suffice. Implementation of
this system could not be included in CLOGP-I but is planned for CLOGP-II.
One of the earliest constitutive factors recognized as necessary to
calculate log P (oct/water) from structure was the electronic effect of
aromatic substituents. The first attempt by this author to deal with
this very commonly-occurring and often sizeable effect was to assign
'exalted' values to certain susceptible Fragments when they appeared on the
same aromatic ring as one which was strongly electron-attracting. See
Table IV-1, p. 23, Ref. 13. In implementing this in the earliest version of
CLOGP the sigma level for a substituent to qualify as an 'Inducer' was set
too low, and the unnecessary corrections almost offset the desired ones.
This has been changed in the present operating version of CLOGP-I, but this
feature still is in need of improvement.
Both Fujita16 17 and Brandstrom14 have proposed a Hammett-like treatment
of this correction Factor. Both proposals are unsuitable for our present
computer needs, because of lack of suitable rho and sigma values, because
of the complications of sorting out appropriate meta and para distinctions
(especially for fused ring systems), and because ortho substitution could
not be treated. Much of the Principal Investigator's time under this
contract* was devoted to working out a practical compromise between the
'quantum level' approach (tried and found wanting) and the too-
sophisticated approach of Fujita-Brandstrom.
The octanol/water partition coefficient of aromatic solutes can deviate
from the sum of Fragment values due to: electronic interaction, to
intramolecular hydrogen bonding, to alkyl substitution, and due to a
special ortho effect. It was shown that in 400 examples of aromatic
solutes, these four Factors could reduce the deviation between calculated
and observed log P values by over three-fold.
*The work on this phase was not complete at the expiration of sub-contract
T-6415(7197)-029. It was completed under ERL-Duluth Grant CR 809295-01-0
and submitted for publication in the Journal of the Chemical Society,
Perkin II. The journal article was too lengthy to be included in this
report in its entirety, but is submitted as supplementary material. The
summary included in this section should suffice for most purposes.
-------
Electronic Factor
It was found that the electronic effect could be dealt with using the
Hammett (sigma x rho) product15 with the following simplifications: 1) a
single sigma parameter could be used for ortho, meta and para interactions;
2) most substituents could be assigned to either an 'Inducer' or a
'Responder' class, greatly limiting the need for considering a 'bi-
directional' interaction; 3) all halogens could be assigned the same sigma
parameter; and 4) generalized substituent structures could be used, each
class receiving the same sigma or rho value. Rather than use the classical
1S
sigma constants derived from ionization equilibria, it was decided to
derive a set which might be more appropriate to partitioning equilibria. A
program for successive approximations was written in APL and applied to a
model set of 90 di-substituted benzenes chosen to eliminate or minimize bi-
directional interactions. An average of sigma (meta+para) values was
introduced as the first approximation from which the first level rho values
were calculated. These were used to re-calculate the second level sigmas,
and the dialectic process continued until the change in both sets was less
than 0.01 unit. The entire process was repeated using sigma inductive
constants instead of the meta/para average, and for a third time using
values for Field effect. In every case the hydrophobicity-oriented
sigma/rho set turned out the same within 0.01 units. The greatest
differences from accepted Hammett sigma constants were found for the nitro,
sulfonyl and carboxaldehyde groups. This set, enlarged with values for bi-
directional substituents (Inducer/Responders) appears in Table 2.
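The successive-approximation scheme described above (start from trial sigmas, compute rhos, use those to recompute sigmas, and repeat until both sets change by less than 0.01) can be sketched as an alternating least-squares fit. The original APL program is not reproduced in this report, so everything here is an assumption for illustration: the model that each electronic deviation equals sigma(Inducer) x rho(Responder), the function name, and the data layout.

```python
def fit_sigma_rho(deviations, sigma_start, tol=0.01, max_iter=100):
    """Alternately refit rho and sigma so that sigma[i] * rho[r]
    approximates each deviation in a least-squares sense.

    deviations: {(inducer, responder): observed minus additive log P}
    sigma_start: {inducer: first-approximation sigma}; it must contain
        every inducer that appears in `deviations`.
    Stops when no sigma or rho changes by `tol` or more in one cycle.
    """
    sigma = dict(sigma_start)
    rho = {r: 0.0 for (_, r) in deviations}
    for _ in range(max_iter):
        # Step 1: best rho for each responder, holding the sigmas fixed.
        new_rho = {}
        for r in rho:
            terms = [(sigma[i], d) for (i, rr), d in deviations.items() if rr == r]
            denom = sum(s * s for s, _ in terms)
            new_rho[r] = sum(s * d for s, d in terms) / denom if denom else rho[r]
        # Step 2: best sigma for each inducer, holding the new rhos fixed.
        new_sigma = {}
        for i in sigma:
            terms = [(new_rho[r], d) for (ii, r), d in deviations.items() if ii == i]
            denom = sum(p * p for p, _ in terms)
            new_sigma[i] = sum(p * d for p, d in terms) / denom if denom else sigma[i]
        change = max(
            max(abs(new_rho[r] - rho[r]) for r in rho),
            max(abs(new_sigma[i] - sigma[i]) for i in sigma),
        )
        sigma, rho = new_sigma, new_rho
        if change < tol:
            break
    return sigma, rho
```

One caveat of this simplified model: only the products sigma x rho are determined by the data, since multiplying every sigma by a constant and dividing every rho by it leaves the fit unchanged. The overall scale in the sketch therefore follows from the starting sigmas; how the original program fixed that scale is not stated here.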
Ideally, the effects of Hammett parameters are additive; i.e., two
chlorines have twice the effect of one. This did not apply to their effect
on hydrophobicity, but it was possible to apply the same diminishing power
series to all those sigma values in Table 2 which were studied in
multiples. For two groups, the sum of sigmas was multiplied by 0.75; for
three, 0.60; for four or more, 0.35. When the Inducer is any group except
-N=, the rho values of the Responders are averaged; with -N= as the
Inducer, rho values of Responders are added until the product reaches a
maximum of 2.80. See Table 4 for example calculations.
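The diminishing multipliers for multiple Inducers and the averaging of Responder rho values can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, a single group is taken to have a multiplier of 1.0, and the special handling of -N= as Inducer (rho values added up to a maximum product of 2.80) is deliberately left out.

```python
# Multipliers quoted in the text for the sum of sigmas.
SIGMA_SCALE = {1: 1.0, 2: 0.75, 3: 0.60}  # 1.0 for one group is an assumption
FOUR_OR_MORE_SCALE = 0.35


def effective_sigma(sigmas):
    """Sum of Inducer sigmas, reduced by the multiplier for their count."""
    if not sigmas:
        return 0.0
    scale = SIGMA_SCALE.get(len(sigmas), FOUR_OR_MORE_SCALE)
    return scale * sum(sigmas)


def electronic_factor(inducer_sigmas, responder_rhos):
    """Electronic correction for Inducers other than -N=:
    effective sigma multiplied by the average Responder rho."""
    if not inducer_sigmas or not responder_rhos:
        return 0.0
    mean_rho = sum(responder_rhos) / len(responder_rhos)
    return effective_sigma(inducer_sigmas) * mean_rho
```

With two Inducers of sigma 0.30 each (hypothetical values), the effective sigma is 0.75 x 0.60 = 0.45 rather than the additive 0.60.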
Alkyl Substitution
A comparison has previously been made between aliphatic and purely
aromatic solutes which showed a different relationship between log P and
molar volume.20 It might be expected, therefore, that a correction Factor,
albeit a small one, would be required when alkyl chains are attached to
aromatic rings. Because the simplest example, toluene, did not appear to
require any such correction, the effect was overlooked for some time. In
the 400 solutes used in this study, over 60 exhibited this feature, and its
existence was established beyond reasonable doubt.
To evaluate the alkyl-aryl effect, an indicator variable was added to
the regression equation relating the observed log P to the 'simple additive
log P'; i.e. that which would result from addition of Fragment values
without correction Factors. This indicator variable was given the value of
the number of alkyl groups on the ring system on which there was already
present some other substituent. Thus the variable would be 0 for toluene,
1 for p-cresol, 2 for xylene, and 3 for 3,5-dimethyl phenol (i.e., 1 for
-OH/3-CH3; 1 for -OH/5-CH3; and 1 for 3-CH3/5-CH3). For 69 examples in the
training set, the alkyl-aryl Factor was evaluated as -0.17.
-------
Ortho Effects
The effect which ortho substitution has upon most reaction rates and
equilibria is so complex that only a few authors have dealt with it in the
classical Hammett fashion.21 In terms of its hydrophobic effect, both
Fujita and Brandstrom elected not to deal with it in their first papers
on electronic effects. Ogino and Fujita22 did develop an equation for Δlog
P in 2- and 2,6-disubstituted guanamines which relates to the problem.
They showed that the sigma para value had to be corrected by a Field Effect
term for use in the ortho position, and that a steric parameter was also
necessary. As will be seen below, this information proved valuable in
filling out the chart of Ortho Effects (Table 3) by interpolation from the
rather sparse data presently available. Very strong ortho effects show up
in environmentally important compounds from pesticides to PCBs, and in
estimating log P values the first question to be asked is whether an
intramolecular H-bond can be formed.
Intramolecular H-Bonding
The partition coefficients of solutes with substituents in ortho
position are generally lower than if they were in the meta or para
position, UNLESS they have the capability of forming an intramolecular
H-bond to which the octanol/water solvent pair is sensitive. In that event
a large positive correction is required.
The question of the 'sensitivity' of the octanol/water solvent pair
needs to be addressed further. The position of equilibrium in the
partitioning process depends on the free energy of solvation/de-solvation
as the solute passes from one phase to the other. Free energy is, of
course, the difference between the enthalpy and entropy terms for the
process, and if these differ in the same direction and amount between the
free and H-bonded forms of the solute, then no difference in log P will
result.
These enthalpic-entropic differences are NOT the same for all solvent
pairs, and the H-bond Factor must differ accordingly. Looking at the
problem from a slightly different perspective, we see that in the
heptane/water pair, heptane solvation is changed little whether or not the
nitro group in nitrophenol can H-bond with the hydroxyl; the water phase,
on the other hand, can lose two potential solvation forces (H-accepting and
H-donating) when this occurs. In the octanol/water pair, both phases lose
solvating power when this occurs, with octanol losing more, perhaps,
because of greater ordering necessary to orient the hydroxyl group attached
to a long alkyl chain. The observed difference in log P between o- and m-
nitrophenol in the heptane/water system is +3 log units, but in
octanol/water it is -0.21.
In evaluating the H-bond Factor, the appropriate (sigma x rho) product
was applied but the F-HB was allowed to account for all other interactions.
It was applied as an indicator variable in a regression equation of the
form: OLP = a(ALP) + b(rho x sigma) + c(F-HB) + d, where F-HB can take the
value of 1 or 0. In 15 solutes where a carboxyl group was ortho to either
an -OH or -NH- group, F-HB was found to have the value of +0.63. The
qualifying pairs of substituents are seen in the square (sub-matrix) in
Table 3.
Negative Ortho Effect
There are several plausible reasons for a reduction in log P when two
appropriate substituents are in close proximity on an aromatic ring. There
is reason to expect that separated charges will have a positive effect on
log P if the distance between them exceeds a certain minimum, but will have
a negative effect if they are closer. And if one or both of the
substituents is a polar group attached by a hetero atom with lone pair
electrons, then the other member of the pair, if bulky enough, can prevent
the first from attaining true planarity with the ring. This would inhibit
delocalization and make the group more like one which was aliphatic-
attached; i.e. one with a lower Fragment value.
Evaluated through regression analysis, with the negative ortho Factor
taking integral values from 1 to 5, 59 solutes yielded a value of -0.28.
Table 3, in matrix format, shows the number of times this Factor must be
applied for each of many substituent pairs. Keeping in mind that the
effect arises from both field (electronic) and steric forces, Table 3 has
been enlarged with interpolated values which appear in italics. An example
of a calculation using this Factor appears in Table 4.
Summary of Aromatic Substituent Interactions
The log P of aromatic solutes can be closely approximated if, to the sum
of the appropriate Fragment values, one adds four correction Factors:
(1) sigma x rho (+ second sigma x rho if both substituents I/R).
(2) -0.17 x (number of alkyl groups on already substituted ring).
(3) +0.63 x (number of ortho groups which can H-bond).
(4) -0.28 x (integer in Table 3 for appropriate ortho pairing).
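As a rough illustration only, the four-term summary above can be written
as a short Python function. The function name, the argument names and the
sample numbers in the usage lines are hypothetical and are not part of
CLOGP-I; only the coefficients -0.17, +0.63 and -0.28 come from the text.

```python
def aromatic_log_p(fragment_sum, sigma_rho_terms=(), n_alkyl=0,
                   n_hbond_ortho=0, ortho_level=0):
    """Fragment sum plus the four aromatic correction Factors.

    sigma_rho_terms: iterable of (sigma, rho) pairs, one pair per
                     interaction being counted (values from Table 2).
    n_alkyl:         alkyl groups on an already substituted ring.
    n_hbond_ortho:   ortho groups able to form an intramolecular H-bond.
    ortho_level:     integer read from Table 3 for the ortho pairing.
    """
    electronic = sum(sigma * rho for sigma, rho in sigma_rho_terms)
    alkyl_aryl = -0.17 * n_alkyl
    h_bond = 0.63 * n_hbond_ortho
    negative_ortho = -0.28 * ortho_level
    return fragment_sum + electronic + alkyl_aryl + h_bond + negative_ortho


# Made-up input: a fragment sum of 2.00, one sigma x rho interaction
# (halogen sigma 0.28 acting on a responder with rho 1.06), ortho level 1.
example = aromatic_log_p(2.00, [(0.28, 1.06)], ortho_level=1)
print(round(example, 2))
```

Keeping each Factor as a separate named term makes it easy to see which
correction is responsible when a calculated value disagrees with a
measured one.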
-------
IV. Use of CLOGP-I Program
A. Documentation
Two booklets, one describing overall strategy with examples and a
second alphabetically listing the subroutines, have been prepared by
Pennsylvania State University and appear as a supplement to this report.
The program changes made at Pomona, which ensure consistency of calculation
regardless of the order in which structures are added, are included in the
complete program (on tape, 1600 BPI) submitted as a supplement to this
report.*
*It should be a matter of record that these were major changes and not a
minor debugging. It was largely due to this extra effort that the
original timetable had to be abandoned. The problem was first appreciated
when an early version of CLOGP would fail to recognize a simple halogen
fragment after it had processed several multi-halogenated structures.
Correcting a re-initializing failure seemed to solve the problem. However,
it cropped up again (with hydrogen-bonding Factors, ether oxygen Fragments,
etc.) when an attempt was made to develop performance statistics using the
Selected File of Log P values from the Pomona Data Base. We were aware
that a similar version of CLOGP was being used by a commercial firm which
also was attempting calculations in long runs on large files. Their output
was also judged seriously flawed, apparently from the same inconsistency
problem. Serious efforts by this firm to correct the program problem were
not successful and were abandoned in favor of work on a new approach.
Efforts at Pomona were not successful either, in the time period of the
contract or its extension. Nevertheless, work on it was continued, and the
program errors were found and remedied by September 1982.
B. Program Options
1. The usual one of output to 'printer' or 'terminal'.
2. Degree of detail in output for calculation: Penn State version gives
three: No debug (NDO); Calculation debug (CDO); and all debug (ADO).
Since operator inspection of CLOGP-I output is ALWAYS recommended, the
first option is dropped in the operational version provided by Pomona, and
the others are called by (2N) and (1N) respectively.
3. Structural Input
a. Cursor-equipped CRT (Penn State): Atom and bond entry creates
structural diagram. (Program is hardware-dependent.)
b. Wiswesser Line Notation (Pomona CLOGP-I; requires PL/I compiler)
(i.) Entered individually (following program-prompt)
(ii.) Entered from structural file using command 'Fragval'.*
c. Entry by SMILES (SLOGP, courtesy Dr. David Weininger, Duluth ERL)
SMILES is a rapid, easily-learned system of converting a two-dimensional
structural diagram into a linear array of conventional atomic symbols and
bonds. This array is processed by computer to yield an ADAPT connection
table (needed to drive CLOGP) and also to return, to a suitable CRT, the
diagram so that encoding accuracy can be verified. Other features are the
same as entry via WLN. This entry system is operational at ERL-Duluth and
can be implemented on any EPA installation of CLOGP-I.
*If a measured log P is stored in file, program will give clogp, obsv. log
P and deviation as output. SAS (statistical analysis systems) treatment of
this data is seen in the Results section. It would be desirable to
determine the number of times each Fragment and Factor type is called when
calculating the 'select' set and the average deviation associated with each
of them. It is not practical to do this with CLOGP-I, but it is an
important objective for version II. These statistics can be expected to
vary, of course, depending upon file orientation (i.e., drug-pesticide vs.
general organic raw materials, etc.).
C. Testing the Program
With a program as complex as CLOGP it is advisable to periodically
verify that it processes all the Fragments and Factors as it did when first
put into operation. In the event any changes are made, it is essential to
verify that other portions of the program were not disturbed. It is a
simple task to perform if the structures which appear in Table 6, encoded
in WLN, are kept as a file in permanent storage. They are the simplest
structures which will call each Fragment and Factor. They are sorted on
Fragment formula. They are repeated in Table 7, sorted on FCON number.
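The periodic check described in this section amounts to comparing a fresh
run over the stored verification file against values saved when the
program was first put into operation. A minimal sketch of that comparison
follows; the function name, the tolerance and the numbers are hypothetical
placeholders, and nothing here reproduces actual CLOGP-I code or output.

```python
def find_changed_structures(baseline, current, tolerance=0.005):
    """Compare stored reference values with a fresh run, keyed by WLN.

    Returns {wln: (stored_value, new_value)}; new_value is None when the
    structure no longer produces a value at all.
    """
    changed = {}
    for wln, stored in baseline.items():
        new = current.get(wln)
        if new is None or abs(new - stored) > tolerance:
            changed[wln] = (stored, new)
    return changed


# Placeholder numbers for illustration only; they are not CLOGP-I output.
baseline = {"QR": 1.46, "ZR": 0.90, "1V1": -0.24}
after_edit = {"QR": 1.46, "ZR": 1.10}
print(find_changed_structures(baseline, after_edit))
```

An empty result would suggest that a program change left the tested
Fragments and Factors undisturbed; any entry returned points to the
simplest structure that exercises the affected value.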
D. Need for Human Intervention
1. Program Warnings or Suggestions: (values provided by program)
(a) Correction for ionic form of carboxyl group.
(b) Correction for zwitterion form of amino acid.
(c) Correction for tertiary amine chains.
2. Known Corrections Not Yet Implemented (No warning)
(a) Electronic effect on aromatic rings EXCEPT when R = -OH or -NH- and
I = -NO2, -CN, -CF3, or -SO2(X); CLOGP-I averages all these as +0.77; for
all other combinations see text and Table 2.
(b) Alkyl-aryl effect = n(-0.17); see text.
(c) Negative ortho effect = n(-0.28); see text and Table 3.
(d) Lactone = -0.9.
(e) Alicyclic clusters = -0.45 (e.g. bornyl and adamantyl derivatives);
steroids (fused four-ring system) = -1.1; these also need special
corrections for substitution in the 11 & 17 positions.
(f) N-oxide fragment value increased by 0.63 if adjacent to ring fusion
as in quinoline or acridine.
(g) If three adjacent Isolating Carbons in a chain have -OH and/or -NH-
fragments attached, the proximity correction should be increased by 0.45;
e.g. in chloramphenicol analogs.
3. Anomalous Calculations to be Expected.
A. Folded Conformations:
N,N-disubstituted phenoxyacetamides provide a good example of this anomaly.
Log P values go through a minimum at the diethyl analog and then rise
normally at least to the dibutyl. Conformation analysis by CAMSEQ 21+
indicates that the diethyl is optimal to fold over the benzene ring,
eliminating two hydrophobic surfaces from aqueous solvation. Lengthening
one or both alkyl chains necessarily exposes the extension to the solvent
once again. N,N-dialkylamide substituents on erythromycins also are over-
predicted. On the other hand, two or more linked hydrophilic rings, such as
are found in clindamycin analogs, are underestimated. Some sesquiterpene
lactones also appear to behave in this fashion. Half of them can be
calculated very well, but half are underpredicted by over one log unit.
Conformations which bring a carbonyl and hydroxyl group into close
proximity (even though their separation by 'skeletal' route is great) are a
possible explanation. Folding or 'screening' may be the reason aliphatic
rings or chains of eight or more carbons are likely to be calculated higher
than observed, as are N,N-disubstituted amides with total chain length of
eight or more.*
*Of course experimental difficulties are greater with these classes of
solutes, and one can never be sure whether the deviation is in the
measurement or in the calculation.
B. Tautomerism: Sometimes the observed log P lies between the values
calculated for the two tautomeric structures, but this need not be the
case.
C. Peri Substituents: In quinoline analogs, H-polar groups on the
8-position are generally more hydrophobic than usual. There is not enough
data to generalize about other substituent types, including the
1,8-disubstituted naphthalenes.
D. Hydrophobic Groups Between Aromatic Rings: 2,2'-disubstituted
biphenyls have been mentioned above. They are less hydrophobic than
calculated (up to one log unit or more). The same applies to halogens on a
methyl or ethyl group between phenyl rings (as in DDT analogs).
E. N-nitroso-ureas are a type of fragment which CLOGP-I sees as a
combination. Proximity effects, such as the halogen-H-polar interaction
ICF-17, are counted twice.
F. The zwitterion correction (-2.4) is too great for an amino acid
moiety connected to a very polar aromatic ring.
G. When a benzene ring is totally substituted with large halogens and
polar groups, the presently known interaction effects fail.
H. Solute structures containing a ring completely surrounded by other
rings (e.g. strychnine) may be calculated too low.
I. Oxy-N-heterocycles are poorly predicted.
V. Analysis of Test Results
A. Data Set: 3517 'Selected' Log P values from the Pomona Medchem Parameter
Data Base.
These values were selected on the basis of: a) reliability (measurement
error, if known; or agreement with other measurements); b) small or no
correction needed to obtain value for uncharged species, except for a few
values for completely ionized solutes.
It should be noted that the solutes in the Medchem Data Base are biased
toward bioactive organics, i.e., chemicals with pharmacophores or
toxiphoric moieties. Undoubtedly this necessitates dealing with a wider
variety of Fragments and Factors than would be present in the same number
of ordinary industrial organic chemicals. For this reason this test may
well provide 'worst case' statistics.
Before any statistical evaluation could be made, it was necessary to
verify program consistency. To accomplish this the entire 'selected' set
was calculated and the values stored with corresponding WLNs; the set was
then recalculated five times, each time after a random reshuffling of input
order. The program then saved the WLNs for which different values were
recorded. This procedure had to be repeated several times before all the
programming errors could be located and corrected. (All versions of CLOGP
with dates earlier than Sept. 1982 are liable to produce inconsistent
results.)
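The consistency procedure just described (calculate once, reshuffle the
input order, recalculate, and save the WLNs whose values differ) can be
sketched as follows. This is an illustrative outline rather than the code
used at Pomona: `find_order_dependent`, `leaky_batch` and the placeholder
values are all hypothetical, and the toy calculator merely imitates a
program that carries state from one structure to the next. WLN strings
are assumed to be unique within the file.

```python
import random


def find_order_dependent(wlns, calculate_batch, runs=5, seed=1982):
    """Return the WLNs whose value changes when the input order is shuffled.

    calculate_batch takes a list of WLN strings and returns a list of
    values in the same order (a stand-in for one long run over a file).
    """
    reference = dict(zip(wlns, calculate_batch(list(wlns))))
    rng = random.Random(seed)
    inconsistent = set()
    for _ in range(runs):
        shuffled = list(wlns)
        rng.shuffle(shuffled)
        for wln, value in zip(shuffled, calculate_batch(shuffled)):
            if value != reference[wln]:
                inconsistent.add(wln)
    return sorted(inconsistent)


def leaky_batch(wlns):
    """Toy calculator that simulates state carried between structures."""
    seen_multi_halogen = False
    values = []
    for wln in wlns:
        value = float(len(wln))           # placeholder, not a real log P
        if seen_multi_halogen and wln == "G1":
            value += 1.0                  # simulated stale-state error
        if wln.count("G") >= 2:
            seen_multi_halogen = True
        values.append(value)
    return values


structures = ["G1", "GYGG", "QR", "ZR"]
print(find_order_dependent(structures, leaky_batch))
```

A calculator with no carried-over state returns an empty list for any
number of reshuffles, which is the condition the corrected program had to
meet before statistics were gathered.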
B. Range of Values: Measured values of log P in the 'Selected Set' ranged
from -3.31 and -3.21 (pentaglycine and glycine) on the low side to 7.54 and
6.36 (hexachlorophene and DDT) on the high.
C. Percent Fully Perceived: 775 of the 3517 selected solutes contained
'excluded fragments' and elicited the message "unaccounted atoms." This
amounts to 22% of the structures and is very close to the original
objective of the program's level of effectiveness. It should be noted that
with operator intervention this failure rate can be cut approximately in
half, if a higher probable error can be accepted. The three unprogrammed
approximations which make this possible are: a) the difference between
aliphatic and aromatic attachment at any fragment valence bond is about one
log unit (aromatic is higher), and few of the multi-valent fragments in
CLOGP-I have values for all possible 'environments'; b) enlarging an
already hydrophilic Fragment by fusing on another hydrophilic segment
reduces the original value much less than addition of the two Fragment
values and can be approximated as -0.3; e.g., adding -NH2 to -CONH- changes
the aliphatic Fragment value from -2.71 to -2.50 if added to the right side
and to -2.18 if added on the left; c) many 'missing' Fragments differ from
known ones by a hydrogen atom; a hydrogen atom on a Fragment usually has a
higher value than when attached to an I.C. but can be approximated as
0.40.*
*At the outset of this work, some consideration was given to incorporating
these features as options in the CLOGP program, but the time required to
correct the inconsistency problem precluded this. They are planned for
version II.
D. Precision of Calculation: This can be examined for each of three modes
of operation: (1) When the only output examined is the final calculated
log P; i.e., the warnings and corrections integral to the output are
ignored. (2) When the corrections provided by the program are applied (see
IV.D.1 above), and the anomalies listed in IV.D.3 above are removed. (3)
When, in addition to the steps in (2), the corrections listed in IV.D.2 are
applied. Because of the aforesaid bias of the Pomona Medchem Data Base
(e.g. steroids and antibiotics heavily represented), the listed anomalies
reduce the 'effective perception' of CLOGP-I to a greater extent than would
be the case if tested on a file of general industrial organics. In mode
(3) it still calculates 70% of the structures, and perhaps half of the
remainder could be satisfactorily estimated (within 0.8 log unit with 90%
confidence) by operator assistance with procedures provided.
Mode (1): The statistical analysis of the results of CLOGP-I operating
in a 'blind mode' is seen in Fig. 2. The standard deviation of over 0.8
log units is about twice that of the original objective. The mean of 0.076
shows that the net corrections which remain to be applied should be
positive, especially when one considers that a number of zwitterion
corrections (-2.4, in program prompt) have NOT been applied. The residual
frequency chart shows a fairly normal distribution with well over 85%
chance of calculation within one log unit.
Mode (2): The statistical analysis of the results when the operator
follows computer prompts and eliminates readily recognized anomalies
appears in Fig. 3. It cannot be stressed too strongly that simple operator
assistance can reduce the standard deviation to half the level achieved
when the program 'runs blind'. The chances of the calculation falling
within one log unit of the measured value are now greater than 98%.
However, since aromatic solutes needing large corrections for electronic
effects are now included in the 'anomalous' category (even though the
corrections are known), the calculable percentage drops to 66%.
Mode (3): With a very small investment of time and effort the operator
of CLOGP-I can be instructed in the additional correction procedures given
in Section III. Use of these constitutes Mode (3), and the statistical
analysis of these results is given in Fig. 4. The standard deviation has
been reduced almost to 0.35, which is respectable considering the diversity
of structures in the Medchem Data Base (ranging from phenylglucosides to
hexachlorophene). The residual frequency chart indicates that one can
expect 92% of the calculations to be within 0.6 log units of measured
values.
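The figures quoted for each mode (mean residual, standard deviation, and
the share of calculations falling within a given window) can be obtained
from paired observed and calculated values along the following lines.
This is only a sketch in Python, not the SAS procedure used for Figs. 2
to 4; the function name is hypothetical, the residual is taken as
observed minus calculated (the convention of the Dev. column in Table 4),
and the four sample pairs are rows 1, 3, 4 and 5 of Table 4.

```python
import statistics


def residual_summary(observed, calculated, window=1.0):
    """Mean, sample standard deviation and hit rate of the residuals."""
    residuals = [o - c for o, c in zip(observed, calculated)]
    within = sum(1 for r in residuals if abs(r) <= window)
    return {
        "n": len(residuals),
        "mean": statistics.mean(residuals),
        "std_dev": statistics.stdev(residuals),
        "fraction_within": within / len(residuals),
    }


observed = [2.78, 3.08, 3.44, 3.22]      # OLP, Table 4 rows 1, 3, 4, 5
calculated = [2.77, 3.05, 3.33, 3.34]    # Calc., same rows
summary = residual_summary(observed, calculated, window=0.1)
print(summary["n"], round(summary["mean"], 4), summary["fraction_within"])
```

The same function with `window=1.0` or `window=0.6` gives the kind of
"within one log unit" and "within 0.6 log units" percentages cited for
the three modes.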
A plot of residuals vs. clogp for Mode (3) is seen in Fig. 5. A
regression of this data shows that the apparent downward slope of the
points is not very significant (squared correlation coefficient = 0.11).
Nevertheless, it seems to indicate that very hydrophilic solutes tend to be
calculated low and very hydrophobic ones calculated high. This may reflect
very real physical limitations on this parameter on both ends of the scale.
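The regression of residuals on clogp mentioned above is an ordinary
one-variable least-squares fit, and the squared correlation coefficient
measures how much of the scatter the slope explains. A minimal sketch in
plain Python follows; the function name and the sample points are
hypothetical and are not the Fig. 5 data.

```python
def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept, plus r squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up points: calculated log P on x, residual (obsv. - calc.) on y.
clogp = [0.0, 1.0, 2.0, 3.0]
resid = [1.0, -1.0, 1.0, -1.0]
slope, intercept, r_squared = linear_fit(clogp, resid)
print(slope < 0, round(r_squared, 2))
```

A negative slope with a small r squared, as reported for Mode (3),
describes a weak downward trend: most of the residual scatter is not
explained by the magnitude of the calculated value.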
Table 5 contains the CLOGP output for the 5 solutes with the highest
deviation as calculated by Mode (3), together with an analysis of the
reasons why they are predicted poorly.
Summary of Results: With assistance from an operator who has been
given a reasonable amount of instruction, CLOGP-I can meet or exceed the
original objectives set forth in this contract; i.e., when operated in Mode
(3) described above. To reduce or eliminate this need for operator
assistance, and to enable the program to be expanded as new measurements
are made and new interaction effects perceived, an entirely new algorithm
must be employed. In spite of the failure of CLOGP-I to 'stand alone', it
has proved invaluable in charting out a course for further development and
will continue to be used in practical applications until an improved
version becomes available.
-------
BIBLIOGRAPHY
1. Leo, A., Hansch, C. and Elkins, D., Chem. Rev., 71, 525 (1971).
2. Mirrlees, M.S., et al., J. Med. Chem., 19, 615 (1976).
3. Henry, D., et al., ibid., 19, 619 (1976).
4. Yamana, T., et al., J. Pharm. Sci., 66, 747 (1977).
5. Unger, S. H., et al., ibid., 67, 1364 (1978).
6. Veith, G., et al., Water Res., 13, 43 (1979).
7. Bradshaw, J. and Latter, D., Glaxo Research, private communication
(1982).
8. Garst, J. E. and Wilson, W. C., (I.) J. Chromatog.,
submitted, (1982).
9. Garst, J. E., (II.) ibid.
10. Van Duyne, R., et al., J. Phys. Chem., 71, 3427 (1967).
11. Fujita, T., Iwasa, J. and Hansch, C., J. Am. Chem. Soc., 86, 5175
(1964).
12. Rekker, R., "The Hydrophobic Fragmental Constant", Elsevier,
Amsterdam, (1977).
13. Hansch, C. and Leo, A., "Substituent Constants for Correlation
Analysis in Chemistry & Biology", Ch. IV, Wiley Interscience,
N.Y., 1979 (appears as Appendix A of this report).
14. Brandstrom, A., Acta Pharm. Suec., 19, 175 (198?).
15. Hammett, L. P., "Physical Organic Chemistry", 2nd Ed., McGraw-Hill,
N.Y., 1970.
16. Fujita, T., in "Prog. Phys. Org. Chem.", A. Streitwieser and R. Taft,
Eds., Wiley Interscience, in press.
17. Fujita, T., J. Pharm. Sci., in press.
18. Parameter Data Base, Pomona College Medchem Project, issue of
July 1982.
19. Written by Steve Burns, Pomona College.
20. Leo, A., Hansch, C. and Jow, P., J. Med. Chem., 19, 611 (1976).
21. Charton, M., in "Prog. Phys. Org. Chem.", A. Streitwieser & R.
Taft, Eds., Wiley Interscience, Vol. 8, p. 235 (1971).
22. Ogino, A., Matsumura, S. and Fujita, T., J. Med. Chem., 23,
437 (1980).
23. Magee, P., Chevron Chemical and Peacock, S., Molecular Design,
priv
-------
Table 1.
Compounds Measured by Shake-Flask
No.
UL N
K1U1
E2E
FR CMVN 1o,01
FR DOV1MVR
FXFFR C01MVR
F4
GR BVU1
GR BV1
GR COVIhVR
GYGU1
G1UiG -C
G1U1G - 1
L C666 I:V IV DQ GQ k
M2M20 Nrf2M2Q
L C666 BV TVJ DZ
L C666 BV IVJ
L6TJ AOV1 CMVNNOS.2G
L6VTJ BM1 BR BG
L66 BV EVJ GO
I.66J BMYZUS
NCR DUKV?,RU)2
NCR DUV1MVK
VINYL BROMIDE
DIBROMOETHANE
1(3'-FLUOROPHENYL)-3-MET
HOXY-3-METHYLUREA
P-FLUORDPHENYLHIPPURATE
M-TRIFLUOROMETHYLPHENYL_
HCPPURATF
FLUOROBUTANF
0-CHLOROBENZOTC ACID,
hETHYL ESTER
0-CHL OROAf.ETOPHFNONF
ri -CHLOkOPHEHYLHJ PPLIRATC
V1NYLID1NE UHLUR1DE
1,2-DrCHLOROEFHYLENE
1 ,2-DICHLOROETHYLENE
ANTHRACENLDTME
1 -AM J NOAt>ITHRAQUl NONE
ANTHROQUINONE
1(2'CLETHYL)-1-NO-3-(3'_
CARBOMETHOXYCYCLOHEXYL )
UREA
KETAMINE
5-HYDF
-------
PACE 2
NC1U1
NC20R
QR BV1
QR CG Dfr
QR DMV1
QR DNU
QVR BOR
QVR BM1
QVR L
-------
PAGE 3
RVM1VOK
RVR
RV1R
i'HR BVQ
SH1YZVQ
T C5 C6556/C-P/JP C-
_3ACJ P CX EY JXOV 0
UTJ BV« EUI FQ M NQ
T G5 D6 K666 CV HO M
0 POTMTfcJ 1YU1 S01
T01
T307J D
T3DTJ BR
15-10- HOVY liU LUTJ_
DU1 l-l L
T'5N Ci'J BZ
T5NJ A
T5NMVTJ AR
TSNVTJ A1U1
75NTJ AVMRA H E
T50 COTJ PR BOVM1
750 COTJ
T50J BVH
T501J
15SUTJ
T5VMVJ
TS« BM DN FMVMVJ
FS6 IU1 DN FVri ]NJ HZ
TCJ6 Hh DHJ CZ
PHENYLHTPPURATC
BENZOPHENONE
DEOXYBENZOIN
THIOSALICYL1C ACID
CYSTEINE
GIBBERELIC ACID
ROTENONE
PROPYLENE OXIDE
STYRENE OXIDF
COSTUNOLIDE
2-AMINOTHTAZOLC
N-METHYLPYRROLE
1-PHENYL-3-PYRAZOLIDJN_
ONE
N-VINYL-2-PYRROI.IDINONE
CISANILIDE
DIOXACARB
1,3~DIOXOLANE
2-FURALDEHYDE
TETRAHYDROFURAN
TETRAMETHYLENESULFONE
MALEIHIDE
XANTHINE
GUANTHE (AT I'H=13)
2-AMINOBENZJhTDAZOLr
4
4
4
4
6
0
4
4
4
2
4
7
4
4
4
4
4
6
5
4
6
10
4
4
2.31
3.12
3.19
2.39
-1.87
0.24
4.10
0.03
1 .61
2.09
0.33
1 .21
0.89
0.37
2.33
0.67
-0.37
0.41
0.46
-0.77
-0.29
-0.73
-0.91
0.91
.03
.02
.04
.03
.03
.10
.05
.02
.03
.01 8
.01
-.04
.02
.01
.02
.01
.03
.02
*
.04
.02
.or.
.01
.04
-------
/* PNJ
DO CHJ K01VN1
Y&&1Y
T56 EO DO CHJ G1U1V-
_AT6NTJ
T56 DO DO CHJ G1U1VZ
T56 PO DV CHJ C C IQ
T56 UCU&J
T56 BOr&J C C IQ
T56 Pi1 HNJ
TAN CNJ B DZ ElhVNNO
&2G
TANJ HOR
TANJ B I) F
TANJ PI 2U
TANJ CC,'
ToNJ DNU
TANJ DVH
TANNVNJ AO D1R
TANTJ A- ALATfJ AR
TANTJ AV1U1R
TAVMrlVJ
TAG CO LOTJ EOF
TA,-', UNNNVJ DiS'l1
TAA PUPO EHJ C.V COi
TAA CNJ CO
VHR Bf,
1 -METHYLINDOLE
N,N-DI-l-BUTYL-3,4-DIOXY
HETHYLENECINNAMAMIDE
N(3,4-METHYLENEDIOXY_
CINNAMOYDPIPER1DINE
a^-DIOXYMETHYLrNE.
CINNAMAMIDE
3-KETOFURAN PHENOL
2,3-DIHYDROBENZOFLIRANt
CARBOFURANE PHENOL
A-AZATHJANAPHTHbNE
ACNU
2-PHENOXYPYUTD1NE
COLLIDTNf
1 ,2-DKALPHA-PYftlDYL)
ETHYLENE
3-HYDROXYPYRID1NE
4-NITROPYR1DINE
4-PYRID1NE CARBOXALDEHYDE
4-PENZYL-1,2,4-TR1AZ1NE-
3-ONE-1-OXIDE
PHENCYCLIDJNE
N-CINNAMOYL PIPERJDINE
MALEIC HYDRAZIDE:
PARALDEHYDE
&01 ftZlNI-'HOi1 METHYL.
S'ALITHION
J-QUTNOL1NC-N--OXIDE
O-CHLOROBIINZnLDIIIIYDF
F'AGE
4
4
4
8
4
5
5
4
3
3
4
4
4
3
3
3
2
4
6
5
3
4
4
3
4
2.72
4.37
2.82
1.40
1.87
2.14
2.08
1.74
0.94
2.39
1 .88
2.11
0.48
0.33
0.43
0.19
-0.01
2.74
-0.84
0.52
-.75
2.67
0.25
2.33
.01
.03
.02
.01
.01
.03
.02
.03
.01
.005
.04
.03
.01
.004
.02
.03
.05
.01
.02
.01
.03
.OS
.01
.04
-------
PAGE 5
VHR CXFFF
VH1U1
VH2
UNR B CNU ENl-J
UNR BM1
UNR BNU
UNR BC)1
UNR HQ CNU ENU
UNR t
-------
ZK COV1MVR
7.R CVQ
ZR DI
ZR DSWMV01
ZSUR COV1MVR
ZSUR CSZU
ZVH
ZVR B01
ZVR BVZ
ZVR BZ
ZVR C01MVR
ZVYQ
ZV1U1
ZV1U1R
ZV1VZ
Z1R DNU
1MV10R
1MV1U1R
1MY1R
1 NR&R
1NRJ.V10R
1N1iV10k
1N1&V1U1R
1ON&1&VMR
H-AMTNOPHLNYLHIPPURATi:
M-AMINOBENZ01C ACID
4-IODOANILINE
ASULAM
M-SULFONAMTDOHHENYL_
HIPPURATE
1,3-BENZENEDISULFONAMIDE
FORMAMIDE
ANISAMIDE
0-PHTHALAM1DE
ANTHKANtLAMTDE
M-CARfiOXAMlDOPHENYL_
HIPPURATE
LACTAriTDE
ACRYLAMIIiE
CINNAMAHIDE
MALONAMIDE
H-N1TROBENZYLAMINE
N-METHYLPHENOXYAr,ETAMIDE
N-hETHYLCINNAMAMTDE
N-METHYLAMPHETAMINE (AT_
PH=13)
N-METHYLDIPHLNYLAMINE
N-METHYLPHENOXYACET_
ANILIDE
N,N-DIMETHYLPHENOXY_
ACETAHiDE
f^N-DIMETHYLCINNAMAMIDE
1-PHENYL-3-METHOXY-3-
METHYLUKEA
PAGE
4
7.
3
8
9
4
A
4
4
4
3
5
4
4
5
3
4
3
3
4
3
4
4
A
6
1 .30
0.07
2.34
-0.27
0.84
-0.55
-1 .51
.084
-1 .73
0.35
1 .20
-1 .39
-0.67
1 .43
-2.01
1 .06
1 .02
1 .81
2.07
3.90
2.26
0.80
1 .73
1 .29
.0!J
.04
.02
v
.04
.04
.02
.01
.03
.03
.04
.06
.02
.01
.06
.01
.02
.01
.03
.05
.04
.2
.01
.02
10N1,%VMR D02R I)
-(2' (4'-MF.THOXYPHENYL)
3.81
-------
PAGE 7
lONI&Vrtk DU40R
1QR D
iOR C
10R CQ E01
10R D
1I3R DOV1MVR
10VR B
1 02 OR
1VMK COV1MVR
U'OllJi
1Y&OP04.S1R&OY
20P5&2&SR
2GVM1
20VR BV02
20V1U1
202
4N4&V10K
CTIIOXY) PHFNYL) 3-ME7I-IOXY
-3'-METHYL UREA
N-1-MEO-N-1-METHYL-N'-3-
(4-(4-PHENOXYBUTOXY)
PHENYDUREA
0-HETHYLAN1SOLE
M-METHYLANISOLE
3,5-DIMETHOXYPHENOL
P-METHYLANISOLE
P-METHOXYPHENYLHTPPURATE
0-TOl.UIC ACID. METHYL
ESTER
2-METHOXYETHOXYBENZENE
M-ACETAhIDOrHENYL_
H1PPURATE
VINYL ACETATE
IBP
FONOFO.V
N-HETHYLCARBAMTC ACID.
ETHYL. ESTER
DIETHYLPHTHALATn
ETHYLACRYLATE
D1CTHYL ETHER
N,N-DIBUTYLPHENOXY_
ACCTANILIDE
3.57
.03
3
3
4
4
4
4
4
4
4
2
4
4
4
4
4
4
2.74
2.66
1 .64
2.81
2.28
2.75
1 .73
1.70
.073
3.47
3.94
0.34
2.47
1 .32
1 .00
3.23
*>
.01
.02
.06
.04
.04
.04
.04
.01
.01
.07
.03
.OS
.06
.03
.07
* Concentration dependent; extrapolated to zero conc.
** Measured at pH 7.4; other values obtained at pH 5.4, 6.5, 9.0 and 14.0
-------
TABLE 2.
Sigma and Rho Constants
No.
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
Sigma Rho Generalized Structure
0.84 0.00
0.71 0.00
0.65* 0.00
0.65 0.00
0.60 0.00
0.49 0.00
0.28 0.00
0.58 0.44*
0.51 0.27
0.32 0.35*
0.32 0.72
0.17* 0.50?*
0.50° 0.88C
0.00d 0.50d
0.00 0.61
0.00 1.06
0.00 1.08
-N-
-S02F
-so2-x
-CN
-NO
~CF3
Halogens
-CHO
-C(=0)-X
-CONH-X
-0-X
-S02NH-X
-S-X
-OH
-NH-X
Examples
a
pyridine, quinoline
X = alk, N(Me)2
F, Cl, Br, I
X = alk, OCH-, C.H., N(Me)0
j o j 2
X = H, NH2, CgH5, alk
X = alk, CONHCH , CON (Me) ,
mCn V PnTO— allr^
_I>U_H, ru^u— aj.Kj_
£* /n &
X = H, C-H..
b y
X = H, alk
-N(Me)2, -N=NN(Me)2
X = COMe, CON(Me)2, CHO, alk,
* Not determined by successive approximation program.
a. Effect cut in half for Responders on non-hetero ring.
b. With original training set of 90 solutes, 0.51 was obtained. With the set
enlarged with bi-directional solutes, 0.50 gave coefficients for the
F(sigma) term closer to unity.
c. Acts either as 'I' or 'R' but not both at the same time; i.e. it is not
truly bi-directional; exception is solute #208 in Table 5.
d. Not well characterized; should be considered tentative.
-------
TABLE 3.
Ortho Factor Levels
I ^^
CM CO
O t*4
Mt-l
CM 0 0 CM J3
O ^ K O O
en
H CM
N N
U* O tS
M •• 3
CM CO PL. ijpq
UK I . I
a o •«* v.-&
sf
NO r»«
co
d -H^CM"
CM CMJ-CM
110131
(1)
..
W = OMe, Me, N(Me)2
X = H, NH2
Y = CONHHe, COMe,Me,CON(Me)2
OCH2C02H
Zj= CONH2
Z2= COMe
*This level becomes 5 if Y = CCHC
O 3
( )= borderline effect
Within submatrix,1Hydrogen Bonds', F
o
t s anomalous; see text
Italicized numbers are interpolated.
0
0
0
1
1
1
1
2 0 (0) 1
2
2
2 3
1
0 0
,Intra-Mol. , 0
'HYDROGEN » 0
0, BONDS ) 1
• . 2
(1)
1
o
0(0) (0)
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
-------
TABLE 4.
Multiple Electronic Effects
No.  Solute                           OLP    ALP    F(sigma)                                    Calc.  Dev.
1.   2,3-dichloroaniline              2.78   2.32   .75(.28 + .28)(1.08)                        2.77   +.01
2.   3,4-dichloroaniline              2.78   2.32   .75(.28 + .28)(1.08)                        2.77   +.01
3.   2,4-dichlorophenol               3.08   2.88   .75(2)(.28)(1.06) - .28*                    3.05   +.03
4.   3,5-dichlorophenol               3.44   2.88   .75(2)(.28)(1.06)                           3.33   +.11
5.   2,4-dibromophenol                3.22   3.18   .75(2)(.28)(1.06) - .28*                    3.34   -.12
6.   3,5-dinitrobenzamide             0.83   0.12   .75(2)(.6)(.72)                             0.77   +.06
7.   2-aminopyrimidine               -0.22  -1.63   .75(2)(.84)(1.08)                          -0.27   +.05
8.   2-aminopyrazine                 -0.07  -1.45   .75(2)(.84)(1.08)                          -0.09   +.02
9.   2,6-dinitro-4-CF3-aniline        2.29   1.26   .6(.60 + .60 + .49)(1.08)                   2.35   -.06
10.  3-iodo-4-aminobenzoic acid       1.65   1.99   .75(.28 + .32)(1.08 + .35)/2 - 2(.28)*      1.75   -.10
11.  3-bromo-4-aminobenzoic acid      1.49   1.73   (as #10.)                                   1.49   0.0
12.  3-chloro-4-aminobenzoic acid     1.33   1.58   (as #10.)                                   1.34   -.01
13.  3-fluoro-4-aminobenzoic acid     1.29   1.01   .75(.28 + .32)(1.08 + .35)/2 - .28*         1.05   -.24
14.  2,3,4,6-tetrachlorophenol        4.10   4.30   .35(4)(.28)(1.06) - 2(.28)*                 4.16   -.06
* Negative ortho correction; see text and Table 3.
Example calculation:
ALP = log P(pyrimidine) + 2 f(NH2) + f(...)
    = -0.40 + 2(-1.23) + 2.01 = -0.85
F(sigma) = (n=2) coef. x (sum of sigma) x (sum of rho)
    = (0.75) x (.84 + .84) x (1.08 + 1.08) = 2.72
obsv. = 1.58   calcd. = 1.87
(C-1)
-------
Table 5.
Solutes with Highest Deviations — Reasons
1. FXFF1NNO&1XFFF
Structure: CF3-CH2-N(NO)-CH2-CF3
Frag. Sum = -2.94
Factors: 9(ICF-3) = 9(-.12) = -1.08
         6(ICF-5) = 6(.53)  =  3.18
         ICF-17             =  0.55
Calc. = -0.29; Obsv. = 2.15
Reasons: CLOGP finds only one Fp (Halogen-H-Polar); should find
two cases of B-Factor for three fluorines as in Fig. 1.
Correct ICF (replacing #17) = 2(1.20) = 2.40
Corrected Calc. = 1.56.
Conclusions: -NNO fragment is very sensitive to sigma; may be in class
like -SO2-X.
2. GR CG B1UYV02&V02
Calc. = 4.897; Obsv. = 2.69
Reasons: From parent and 2,4-Cl2 analog it can be determined that a
negative ortho effect operates on 2,2-disubstituted styryls. For the
first ortho Cl, F = -0.80; for the second, F = -1.20.
Corrected Calc. = 2.90
3. Q1X1Q1QMVMR
Structure: (HOCH2)3C-NHCONH-C6H5
Frag. Sum = -2.405
Factors: 7(ICF-3)             = -.84
         2(ICF-4)             = -.44
         3(ICF-12) = 3(.835)  = 2.505
Calc. = -1.18; Obsv. = 0.43
Reasons: When H-polar fragments are on three adjacent ICs, the present
proximity correction is understated, as pointed out in section
2(g), p. 17. It occurs twice in this solute; additional
correction = 2(.45) = 0.90.
Corrected Calc. = -0.28
4. G1U1X2&OVZ1UU1
Structure: Cl-CH=CH-C(CH2-CH3)(O-CONH2)-C≡CH
Calc. = 0.33; Obsv. = 1.71
Reasons: A new positive Factor is needed when a strong ethynyl
pi-electron cloud is forced close to a carbonyl oxygen.
5. T B6566 B6/CO 4ABBC R BX FV HO PN GHT&&TTJ CQ JQ P2U1 (Naloxone)
Calc. = 0.682; Obsv. = 2.09
Reasons: Since morphine analogs without the alicyclic -OH are
well-calculated, it is possible that this group is not freely solvated.
It should also be noted that the measured log P for naltrexone, with
a cyclopropylmethylene replacing the allyl on the nitrogen atom, is
0.17 units lower when it is expected to be 0.5 units higher than the
above. This might indicate either unusual conformation or incorrect
measurement for naloxone.
No conclusion possible with present data.
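The totals quoted for entries 1 and 3 of Table 5 follow from adding the
Fragment sum to each Factor multiplied by the number of times it is
applied. The short sketch below repeats that bookkeeping. The function
name is hypothetical, and the per-occurrence value of -0.22 used for
ICF-4 is inferred from the printed product 2(ICF-4) = -.44 rather than
stated in the table.

```python
def clogp_total(fragment_sum, factors):
    """Fragment sum plus the sum of (count x value) over all Factors."""
    return fragment_sum + sum(count * value for count, value in factors)


# Entry 1 as printed by CLOGP-I: 9 x ICF-3, 6 x ICF-5, 1 x ICF-17.
entry1 = clogp_total(-2.94, [(9, -0.12), (6, 0.53), (1, 0.55)])
# Entry 1 with ICF-17 replaced by two Factors of 1.20.
entry1_corrected = clogp_total(-2.94, [(9, -0.12), (6, 0.53), (2, 1.20)])
# Entry 3 with the two additional proximity corrections of 0.45.
entry3_corrected = clogp_total(
    -2.405, [(7, -0.12), (2, -0.22), (3, 0.835), (2, 0.45)]
)
print(round(entry1, 2), round(entry1_corrected, 2), round(entry3_corrected, 2))
```

Listing Factors as (count, value) pairs mirrors the layout of the CLOGP
printout, so a corrected Factor can be swapped in without touching the
rest of the calculation.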
-------
Table 6.
FRAGMENT and FACTOR VERIFICATION LIST
Simplest Appropriate Structures in WLN
Ordered by Fragment Formula
E1tM
G1**2
GR**32
FR**33
WSFR**34
A. FFFFFSR**227
I1**4
IR**35
1N1&R**36
1NR&R**67
C. 1N1&1R$*84
TANJ**103
A. T5NJ A*k140
A. T5NJ AR**164
ONRt*37
T6NJ AOT>)*107
UN1*#6
UNR$*38
UNO 1*fc1 98
*U51f,NRi,SUl**no INFINITE LOOP (i.e., logic error in program)
T6NNJ**141
ONN1f.U**28
ONN1&R**200
UNR DNUNN1&1**218
A. T3NTJ A- 3/P.V/*fc206
2D2**7
UNR D02**93
T50J**106
101**B
OPR&R&Rt*201
E. T5SJ A0$»109
10PO«.RA01$*202
10Pi'8,01&OR*>*208
US1&01
USR&ORM-213
10PO&0 14.01 4*1 46
m P n A. i ii ,x
-------
1S1**1 1
1SR$*44
RSR**71
RSSR**226
1M1$*13
1MR$*46
R MR * « 72
WNR DM1**94
T5MJ**105
USR&M1**209
WS1&MR**79
T5MNJ$«143
T5MNNJ$«142
QR**48
Q1R*#85
SHR*#49
Z1$*17
ZR*#50
ZR DNU$«96
C. Z1R**86
1SPZO&01*#205
ZSWR$*51
ZSUR DNU**97
RMMR$*228
RMM1R**229
ZMR$*#9
ZSUMR$#214
ZMSW1*#215
1X$«18
1XR*»52
L66J$*111
T66 BNJ**1 12
GYGUNR**220
NCR$«53
NC1R*#87
1VNU1$»20
1NR&VR*#74
10VN1'&1$*190
1N1&VOR**191
A. 1VN1&NUNR**178 (recognizes 1VNR&NUN1 instead)
1V1$«22
iVR*#56
RVR**75
1V1R$*89
WNR DV1$*9Q
E. LAV DVJ$«115
T6DYTJ BU.S$«192
E. OV1$«24 (gives 'ionic correction fro neutral fragment)
E.
-------
WNR DV01$*99
RVOR**76
T66 BOVJ**114
10V1R«*90
R**113
MUYR&R**221
1NU1R$*144
RNU1R**145
VHNU1$-**15
1 MVRt*59
•1VMR$«81
RVMR**77
T6MVJ$«116
T5NOJ***6
10VM1$M157
1VMOR*#J*18
10VMR$»82
T6MYTJ BU54-K186
5UY1&MR*»187
NCMR$»197
1N1&VKR$*161
ONN1&VMR**164
VH1$-»27
VHR$«-62
VH01**193
VHOR*»194
GjV1$#28
QVR$*63
GJV1R**.91
: ZVR$w64
A. ZV1R$*92
ZVR DNW$»100
QNU2$*26
ZVOR**18B
ZV01R**189
QMVR$»«3
2UNM1**148
1MVMR**65
RMVMR$#78
T5MVMJ**165
ZVN1&R$«151
ZVNRiR**152
ZVN1&NO*«167
ZVMR*#66
ZMVR*»179
SUYZM1**168
SU_YZMR$*169
-------
I V V M» w | 7 :>
RVVR$*196
1VMV1$*181
RVMVR$*182
E. T5VMVJ$*183
VHMV1$»184
RVMNU1R**180
ZVMV1$*170
ZVMV1R**171
ZVMNU2**172
ZVMNU1R**173'
SUYZMNU1R$*174
ZMVMNU1R**175
1VMVMV1*#I76
3H**117
G. 1U1R*#122
R1U1R$*154
R1UU1R**154
1UUi**125
1Y$tt126(ICF-4)
QY$*127
A. L66 B6 A B- C 1B ITJ$«128
GYG*#129(ICF-5)
GYGG$«130
GXGGG*#131
G2G$«i32(ICF-6)
1 OPO&0 1 &OR*# 1 55 < ICF-30 )
Qia2**133(ICF-30)
10101**133(ICF-7)
10201$#134(ICF-12)
T60 COTJ$*135(ICF-8)
T60 DSTJ*#136
T6M DOTJ$*153
QVR BZ*»139
QVR BQ$«156
1UU1R$K«21
ZV2G*#«25(ICF-17)
T60TJ BQ**#26(ICF-9)
T60TJ B01**#26
T60TJ
T60TJ C01$##27
H. F
FXFFVMR*#*#*8
FXFFOR*####9
FXFFSR*«###10
FXFFV01***#*13
6
GYGVMR*#««#17
Q1
01
-------
E1 VMR*****23
Q2E*****24
EYEVMR*****25
FXFFSWR*****27
G1 V1$w«**28
E1 VR*****29
QV1F*****3©
QV1G*****31
QVXGGG*****33
E1V01*****34
QV1E*****35
T60TJ BQ$(ICF-9>
T60 COTJ BQ*(ICF-1©>
T60TJ BQ CQ$(ICF-16)
Z1VQ*U
A. Value assigned; fragment not recognized.
B. Value correct; differs from text (appears only on CLOGP RESULTS which follows.)
C. Value assigned; fragment recognized but bonding assignment error.
D. Preferred value as shown (appears only on CLOGP RESULTS which follows.)
E. Value assigned; fragment not recognized from WISCT assignment of aromatic
or ionic bonds.
F. May need warning for alicyclic correction.
G. Conjugated and non-conjugated values reversed.
H. Could replace warning with a value.
$ ends WLN; the number of asterisks which follow identifies FCON list;
i.e., $****1 is the first value in FCON-4.
(ICF-#) refers to identification of Factor as seen in CLOGP printout.
-------
Table 7.
FRAGMENTS in Simplest WLN
Ordered by (FCON-1) Number
E1**1
G1$*2
F1**3
I1**4
WN1$*A
202*»7
Unused:
OS1&1*#9
cannot be retrieved
dormant
SH1*#16
NCS1$w2i
1V1$«22
1V01*#23
OVi$«24
QNU2**26
VH1**27
ZV1**29
ZVM1**30
ER**31
USFR**34
IR*M35
1N1&R*#36
ONR$-»37
UNR«M38
10PO&01&OR**43
QR$»48
45: cannot be retrieved
47: dormant
ZR*»5G
ZSWR$«!>1
1 XR$«52
NCR**53
NCSR**55
i VR*#56
1 OVR$*57
OVR$*58
1 MVR**S9
1MVOR$*61
VHR$*62
-------
1NR&R$*67
ROR$«68
2.
USR&R**70
RSR**7t
RMR**72
WSR&MR**73
1NR&VR**74
RVR**75
RVOR$*76
RVMR**77
RMVMR**78
1VN1&R$*80
1VMR**81
10VMR$*82
E1R$*83
NC1R$«87
NCS1R**88.
1V1R$«89
10V1R$*9G
QV1R$«91
ZV1R**92
UNR D02**93
WNR DM1**94
ZR DNU*»96
ZSUR DNU**97
WNR DVi$«9a
UNR DV01**99
ZVR DNW**100
T6NJ**103
T5NJ AR**1©4
T5MJ**'1'05
T50J**106
T6NJ A0**107
T5SJ**108
T5SJ A0**109
1R**110
L66J$«111
T66 BNJ*#112
R**113
T66 BOVJ**114
L6V DVJ**115
95:
101:
102:
can be retrieved by: WNR CQ;
is obsolete.
can be retrieved by: WNR CNW; Obsol
11: WNR CO!; "
3H$*117(ICF-3)
L3TJ**118(ICF-1)
1U1*#121 (ICF-2)
1U1R«*122
1UU1**125
1Y*#126(ICF-4)
QY**127
L66 B6 A B- C 1B
GYG$wl29(ICF-5)
GYGG**130
123: dormant
124:
ITJ**128
G2G*#132(ICF-6)
Qi02$»133(ICF-36)
10101*wl33(ICF-7)
10201$«134(ICF-12)
T60 COTJ*»135(ICF-B)
-------
T5NJ A$«140
T6NNJ$«141
T5MNNJ$-M42
T5MNJ**143
1NU1R$4f144
RNU1R$*143
10PO&01&01**146
WNR DOPO&01&01**147
2UNM1*«148
138: obsolete.
13 >
3.
ZVNR&R$*152
T6M DOTJ$»153
-------
WS1&NR&SW1$*210 INFINITE LOOP
WSR&OR$*213
1N1&NUNR$*217
WNR DNUNN1&1$*218
&YCtUNR$*220
MUYR&R$*221
QYR&UNQ$*222
HUYZM1$*223
RSSR$*226
FFFFFSR$*227
RMMR$*228
RMM1R$*229
(FCON-2)
WS1&O1$**1
VHMR$**2
QMVR$**3
QMR$**4
1VOR$**5
T5NOJ$**6
ZMR$**9
VHN1&1$**15
RNUNR$**16
ZVO1$**17
1VMOR$**18
ZYUS$**19
ZYR&US$**20
1UU1R$**21
ZV2G$**25
T6OTJ BQ$**26
T6OTJ CQ$**27
ONN1&1$**28
224: dormant
7: =-1.05; dormant?
8: =-1.83; can't retrieve: SUYZN1&1
10-14: empty
22, 23: can't retrieve with: 1V1UU1 or 1O1UU1
24: empty
29: obsolete
-------
Fig. 1. Halogen-H-Polar Interaction in Aliphatics
-------
Fig. 2. CLOGP-I Mode (1)
STATISTICAL ANALYSIS SYSTEM
THURSDAY, SEPTEMBER 9, 1982

VARIABLE     N        MEAN     STD DEV          SUM      MINIMUM     MAXIMUM
OLOGP     2742  1.74209336  1.38126617  4776.820000  -4.08000000  7.54000000
CLOGP     2742  1.66663239  1.64775191  4569.906000  -4.71500000  ?.10000000
RESID     2742  0.07546098  0.81211378   206.914000  -4.50500000  5.19700000

CORRELATION COEFFICIENT BETWEEN OLOGP AND CLOGP = .87071
R-SQUARED = .75813

FREQUENCY BAR CHART

MIDPOINT
RESID     FREQ   CUM. FREQ   PERCENT   CUM. PERCENT
-1.25      101         101      3.68           3.68
-1.15       14         115      0.51           4.19
-1.05       10         125      0.36           4.56
-0.95       25         150      0.91           5.47
-0.85       33         183      1.20           6.67
-0.75       37         220      1.35           8.02
-0.65       51         271      1.86           9.88
-0.55       70         341      2.55          12.44
-0.45       82         423      2.99          15.43
-0.35      111         534      4.05          19.47
-0.25      162         696      5.91          25.38
-0.15      242         938      8.83          34.21
-0.05      443        1381     16.16          50.36
 0.05      294        1675     10.72          61.09
 0.15      222        1897      8.10          69.18
 0.25      199        2096      7.26          76.44
 0.35      121        2217      4.41          80.85
 0.45       89        2306      3.25          84.10
 0.55       87        2393      3.17          87.27
 0.65       55        2448      2.01          89.28
 0.75       37        2485      1.35          90.63
 0.85       34        2519      1.24          91.87
 0.95       25        2544      0.91          92.78
 1.05       22        2566      0.80          93.58
 1.15       18        2584      0.66          94.24
 1.25      158        2742      5.76         100.00
-------
Fig. 3. CLOGP-I Mode (2)
STATISTICAL ANALYSIS SYSTEM
16:36 THURSDAY, SEPTEMBER 9, 1982

VARIABLE     N        MEAN     STD DEV          SUM      MINIMUM     MAXIMUM
OLOGP     2314  1.78780899  1.33427642  4136.990000  -3.69000000  7.54000000
CLOGP     2314  1.76274201  1.42505265  4078.985000  -4.29000000  7.31000000
RESID     2314  0.02506698  0.39138656    58.005000  -2.20700000  2.62300000

CORRELATION COEFFICIENT BETWEEN OLOGP AND CLOGP = .96189
R-SQUARED = .92523

FREQUENCY BAR CHART

MIDPOINT
RESID     FREQ   CUM. FREQ   PERCENT   CUM. PERCENT
-1.25        6           6      0.26           0.26
-1.15        3           9      0.13           0.39
-1.05        6          15      0.26           0.65
-0.95       13          28      0.56           1.21
-0.85       18          46      0.78           1.99
-0.75       24          70      1.04           3.03
-0.65       44         114      1.90           4.93
-0.55       66         180      2.85           7.78
-0.45       74         254      3.20          10.98
-0.35      106         360      4.58          15.56
-0.25      156         516      6.74          22.30
-0.15      237         753     10.24          32.54
-0.05      448        1201     19.36          51.90
 0.05      290        1491     12.53          64.43
 0.15      214        1705      9.25          73.68
 0.25      192        1897      8.30          81.98
 0.35      112        2009      4.84          86.82
 0.45       84        2093      3.63          90.45
 0.55       66        2159      2.85          93.30
 0.65       45        2204      1.94          95.25
 0.75       27        2231      1.17          96.41
 0.85       27        2258      1.17          97.58
 0.95       15        2273      0.65          98.23
 1.05       14        2287      0.61          98.83
 1.15        8        2295      0.35          99.18
 1.25       19        2314      0.82         100.00
-------
Fig. 4. CLOGP-I Mode (3)
STATISTICAL ANALYSIS SYSTEM
12:50 FRIDAY, SEPTEMBER 10, 1982

VARIABLE     N        MEAN     STD DEV          SUM      MINIMUM     MAXIMUM
OLOGP     2454  1.76568460  1.311672?0  4332.990000  -3.69000000  7.54000000
CLOGP     2454  1.74323513  1.38963354  4277.899000  -4.29000000  7.31000000
RESID     2454  0.02244947  0.36280510    55.091000  -2.20700000  2.53000000

CORRELATION COEFFICIENT BETWEEN OLOGP AND CLOGP = .96556
R-SQUARED = .93231

FREQUENCY BAR CHART

MIDPOINT
RESID     FREQ   CUM. FREQ   PERCENT   CUM. PERCENT
-1.25        3           3      0.12           0.12
-1.15        3           6      0.12           0.24
-1.05        5          11      0.20           0.45
-0.95        9          20      0.37           0.81
-0.85       14          34      0.57           1.39
-0.75       20          54      0.81           2.20
-0.65       46         100      1.87           4.07
-0.55       63         163      2.57           6.64
-0.45       78         241      3.18           9.82
-0.35      123         364      5.01          14.83
-0.25      178         542      7.25          22.09
-0.15      251         793     10.23          32.31
-0.05      489        1282     19.93          52.24
 0.05      320        1602     13.04          65.28
 0.15      239        1841      9.74          75.02
 0.25      199        2040      8.11          83.13
 0.35      117        2157      4.77          87.90
 0.45       82        2239      3.34          91.24
 0.55       63        2302      2.57          93.81
 0.65       45        2347      1.83          95.64
 0.75       29        2376      1.18          96.82
 0.85       30        2406      1.22          98.04
 0.95       17        2423      0.69          98.74
 1.05       11        2434      0.45          99.19
 1.15        8        2442      0.33          99.51
 1.25       12        2454      0.49         100.00
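The columns of the frequency chart are tied together by simple arithmetic, which is useful when checking a poorly reproduced printout. The sketch below is an illustration only: the function name `frequency_table` is hypothetical, and the frequency list is transcribed from the Mode (3) chart with illegible entries inferred so that the running total reaches N = 2454 (an assumption, not a value stated in the text). The same check applies to the reported R-squared, which should be the square of the correlation coefficient.

```python
import math


def frequency_table(freqs):
    """Return (freq, cum_freq, percent, cum_percent) rows for a list of bin counts."""
    total = sum(freqs)
    rows = []
    running = 0
    for count in freqs:
        running += count
        rows.append((
            count,
            running,
            round(100.0 * count / total, 2),
            round(100.0 * running / total, 2),
        ))
    return rows


# Residual bin counts for midpoints -1.25 ... 1.25, as read from the Mode (3)
# chart; entries that are illegible in the scan are inferred (assumption).
MODE3_FREQS = [3, 3, 5, 9, 14, 20, 46, 63, 78, 123, 178, 251, 489,
               320, 239, 199, 117, 82, 63, 45, 29, 30, 17, 11, 8, 12]

rows = frequency_table(MODE3_FREQS)
total_n = rows[-1][1]                 # last cumulative frequency equals N
r_from_r2 = math.sqrt(0.93231)        # R-SQUARED as printed for Mode (3)

if __name__ == "__main__":
    print("N =", total_n)
    print("row for midpoint -0.05:", rows[12])
    print("r implied by R-squared:", round(r_from_r2, 5))
```

A row whose cumulative frequency does not equal the previous cumulative value plus its own frequency points to a misread digit.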
-------
Fig. 5. Plot of Residuals vs. CLOGP, Mode (3)
STATISTICAL ANALYSIS SYSTEM
WEDNESDAY, SEPTEMBER 8, 1982
PLOT OF RESID*CLOGP    LEGEND: A = 1 OBS, B = 2 OBS, ETC.
Vertical axis: RESID, -3.0 to 3.0 in steps of 0.5.
Horizontal axis: CLOGP, tick marks at -5, -2, 1, 4, 7.
NOTE: 103 OBS HIDDEN
-------