F distributions are not normally distributed. Since a variance is an average of squared deviations it cannot be negative, so the ratio of two variances cannot be less than zero. F distributions start at zero and extend upward. The shape changes as the size of the samples on which the variances are based gets larger. Because of this there are many F distributions. Each one depends on the degrees of freedom (df) in both the numerator and the denominator. F tables start on page 529 in the text. This table only gives two values for each distribution, the critical values for p < .05 and p < .01. The probability of getting a number larger than the critical value if H0 is true is .05 or .01 for each F value listed.
If the samples are all truly random
samples from the same normal population, the between variance and the within
variance are two independent estimates of the population variance.
They should differ from each other only due to random variability. F
is a statistic computed as the ratio of two variance estimates of the same
population, where the estimates are independent of one another and differ
only due to randomness. In analysis of variance the ratio is usually
constructed with the between variance in the numerator and the within
variance in the denominator, so that $F = MS_{between} / MS_{within}$.
There are many F distributions,
based on the degrees of freedom, i.e., the number of independent measures,
in both the numerator and the denominator. The F table in the text, p. 529,
lists the critical values for an α of .05 and .01 for many of these
distributions. If H0 is true, the probability that a computed F larger than
the critical value of the appropriate distribution will occur simply due to
random variation is less than .05 or .01, respectively.
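The claim that F exceeds the tabled critical value only about 5% of the time when H0 is true can be checked by simulation. The sketch below is not from the text. It assumes three groups of five scores drawn from one normal population (df = 2 and 12), and it takes 3.89 as the tabled .05 critical value for that distribution; confirm that value against the F table before relying on it. The function names are made up for illustration.

```python
import random
from statistics import mean, variance


def f_ratio(groups):
    """Between-groups variance estimate over within-groups estimate (equal n)."""
    n = len(groups[0])
    group_means = [mean(g) for g in groups]
    ms_between = n * variance(group_means)          # estimate from the means
    ms_within = mean(variance(g) for g in groups)   # pooled estimate within groups
    return ms_between / ms_within


def simulate_f(num_groups=3, n=5, trials=10000, seed=1):
    """Draw every group from the same normal population, so H0 is true."""
    rng = random.Random(seed)
    return [
        f_ratio([[rng.gauss(0, 1) for _ in range(n)] for _ in range(num_groups)])
        for _ in range(trials)
    ]


f_values = simulate_f()
critical_05 = 3.89  # assumed tabled value for df = 2 and 12 at p < .05
share_above = sum(f > critical_05 for f in f_values) / len(f_values)
print("smallest F:", min(f_values))
print("share of F values above the critical value:", round(share_above, 3))
```

No simulated F is negative, and the share above the critical value should come out close to .05, which is what the table promises when all samples come from the same population.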
One Way Analysis of Variance (ANOVA)
The logic of ANOVA is: The samples
are treated differently. The researcher has some idea that the different
treatments have different effects on the dependent measure, so that the
different samples no longer all come from the same population. If this
alternative hypothesis is true the means should be more different from one
another, i.e., more spread out, than if the means all come from the same
population. If that is the case the variance based on these means should be
larger than if they all come from the same population.
Computing ANOVA using multiple summation signs -- if you understand it, it is easier than the text method. Computational formulae for one-way ANOVA are in Box 14.3 on page 340.
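For readers who want to see the arithmetic, here is a minimal sketch of a one-way ANOVA in Python. It uses the definitional sums of squares rather than the computational formulae in Box 14.3, so it is not necessarily the text's method, though the two should give the same F. The function name and the three small groups of scores are made up for illustration.

```python
def one_way_anova(groups):
    """Return sums of squares, df, mean squares, and F for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between: how far each group mean is from the grand mean, weighted by n.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within: how far each score is from its own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical data: three treatments, three scores each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({result['df_between']}, {result['df_within']}) = {result['F']:.2f}")
# prints "F(2, 6) = 13.00"
```

The computed F would then be compared with the tabled critical value for 2 and 6 degrees of freedom.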
Once ANOVA is computed it is tested
for significance using the F table, p. 529. If F is larger than the critical
value, reject H0
and conclude that at least one mean is from a population different from
the others. If the F is significant, one can get a quick measure of
how much of the total variance is due to the difference between means. This
is eta squared:
$$\eta^2 = \frac{SS_{between}}{SS_{total}}$$
Eta squared is analogous to r squared.
The text discusses eta squared on p. 345 and calls it R squared.
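A short sketch of the eta squared computation follows. It assumes the usual definition, the between sum of squares divided by the total sum of squares, which matches the description of it as the share of total variance due to differences between means. The function name and the data are made up for illustration.

```python
def eta_squared(groups):
    """Proportion of total variability accounted for by differences between means."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Hypothetical data: three treatments, three scores each.
eta_sq = eta_squared([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"eta squared = {eta_sq:.4f}")
# prints "eta squared = 0.8125"
```

Here about 81% of the total variability in the made-up scores is associated with the difference between the group means.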
A worked example of a simple
ANOVA including eta squared and Newman-Keuls, done in Excel
Planned or post hoc comparisons
Afterwards there are ways, such as
t tests, Newman-Keuls tests, or Tukey HSD tests, to try to locate which means
are significantly different from one another. The simplest way to compare
the means is by t tests. One cannot simply run t tests between all of the
means in the analysis because the probability of rejecting a true H0 becomes
prohibitively high. However, I feel that if H0 is rejected it is legitimate
to run t tests between means that are ordinally adjacent to each other.
Interestingly, these t tests are computationally equivalent to the Newman-Keuls
for the first column, number of steps = 2.
The Newman-Keuls and Tukey tests can be done between any pair of means. They
require computing Q, a statistic that is computationally similar to the t
test:
$$Q = \frac{M_i - M_j}{\sqrt{MS_{within} / n}}$$
Here M_i and M_j are the two means being compared. The variance in the
denominator is the within mean square and n is
the number of scores used to compute each mean. Significance is affected by
the size of Q, the df, and, for the Newman-Keuls, the number of means
separating the two means being compared when the means are put in numerical
order. Tukey's HSD is more conservative and less powerful. It uses the number
of steps spanning all of the means in the comparison set for every comparison.
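The sketch below computes Q for every pair of means along with the number of steps between them. It assumes Q is the difference between the two means divided by the square root of the within mean square over n, as the description of the denominator suggests, and it counts adjacent means as 2 steps apart. The function name and the numbers are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pairwise_q(means, ms_within, n):
    """Return (smaller mean, larger mean, steps, Q) for every pair of means."""
    ordered = sorted(means)                 # put the means in numerical order
    standard_error = sqrt(ms_within / n)    # same denominator for every pair
    results = []
    for i, j in combinations(range(len(ordered)), 2):
        q = (ordered[j] - ordered[i]) / standard_error
        steps = j - i + 1                   # adjacent means are 2 steps apart
        results.append((ordered[i], ordered[j], steps, q))
    return results


# Hypothetical ANOVA results: three means, MS within = 1.0, n = 3 per group.
comparisons = pairwise_q([6.0, 2.0, 3.0], ms_within=1.0, n=3)
for low, high, steps, q in comparisons:
    print(f"{low} vs {high}: steps = {steps}, Q = {q:.3f}")
```

For the Newman-Keuls each Q is compared with the tabled critical Q for its own number of steps; for Tukey's HSD every Q is compared with the critical Q for the largest number of steps (3 in this example).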
To compute the Newman-Keuls (or the Tukey HSD) you can enter the values
from the ANOVA directly into the formula for Q and solve for Q. Or you can
find the relevant critical value of the difference between means required for
significance. Look up the critical values of Q in the Q table (Table A.5
in Hurlburt) and solve for the critical value of the difference between means
according to this formula:
$$(M_i - M_j)_{crit} = Q_{crit} \sqrt{MS_{within} / n}$$
For Tukey HSD any difference between means greater than this
critical value is significant. For Newman-Keuls this value has to be recomputed
for each number of steps between means.
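The critical-difference approach can be sketched as follows. This assumes the critical difference is the tabled Q times the square root of the within mean square over n. The critical Q values in the dictionary (3.46 for 2 steps, 4.34 for 3 steps) are what I take to be the .05 values for 6 df; treat them as assumptions and check them against Table A.5. The names and data are made up, and the sketch does not enforce the Newman-Keuls convention of testing the widest range first.

```python
from math import sqrt


def critical_difference(q_crit, ms_within, n):
    """Smallest difference between two means that reaches significance."""
    return q_crit * sqrt(ms_within / n)


# Hypothetical ANOVA results: MS within = 1.0, n = 3 per group, df within = 6.
ms_within, n = 1.0, 3
ordered_means = [2.0, 3.0, 6.0]
q_table = {2: 3.46, 3: 4.34}  # assumed .05 critical Q by number of steps

# Tukey HSD: one critical difference, based on all of the means in the set.
tukey_cd = critical_difference(q_table[len(ordered_means)], ms_within, n)

decisions = []
for i in range(len(ordered_means)):
    for j in range(i + 1, len(ordered_means)):
        diff = ordered_means[j] - ordered_means[i]
        steps = j - i + 1
        # Newman-Keuls: recompute the critical difference for this many steps.
        nk_cd = critical_difference(q_table[steps], ms_within, n)
        decisions.append((diff, steps, diff > nk_cd, diff > tukey_cd))
        print(f"diff = {diff}, steps = {steps}, "
              f"NK critical = {nk_cd:.3f}, Tukey critical = {tukey_cd:.3f}")
```

With these made-up numbers the difference of 1.0 between the two lowest means falls short of both critical values, while the differences of 3.0 and 4.0 exceed them.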