Psychology 207
Erwin Segal
Hypothesis Testing and Parameter Estimation
Problems to ponder
- How do you know what the
properties of the population are if all you have is a sample? (E.g., you
flipped a coin 20 times, or you gave an IQ test to 50 people, or you measured
the height of thirty 10-year-olds.)
- Whatever data you may
collect, they may have come from any of a large number of sampling distributions.
E.g., if your sample has a mean I.Q. of 105, it could be a random sample from
a population with a mean of 95, 100, 105, or even 115. How do you know?
Hypothesis testing is quantifying our lack of knowledge
- You don’t know what population
the sample is from!
- However, you can state
the probabilities or evaluate the likelihood that it did or did not come
from certain sampling distributions.
- These probabilities can
be quantified. Hypothesis testing and parameter estimation are based on them.
Confidence intervals 1
- If we have a random sample,
we know that it is a random sample from some population. In other words, the
sample is a sample of some true state of affairs, which is called the population.
We can use the properties of the sample to inform us about the properties
of the population from which it comes. The sample we have obtained is one
of a large number of random samples which could have been selected from the
population.
- A statistic based on
a sample may be thought of as a point in a sampling distribution of that
statistic based on many samples taken from the population.
- We can
estimate the parameters of the population from which the sample is drawn.
- Assume that our sample
of 50 people had a mean I.Q. of 105.
- What is the
mean of the population our sample comes from?
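One way to answer the question above is a 95% confidence interval around the sample mean. The sketch below is only an illustration: it assumes the population standard deviation is 15 (the conventional I.Q. scaling, which these notes do not state) and that the sampling distribution of the mean is normal. The function name is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """Normal-theory confidence interval for a population mean (sd assumed known)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * sd / sqrt(n)                   # z times the standard error
    return sample_mean - half_width, sample_mean + half_width


# sd = 15 is an assumption, not a value given in the notes.
low, high = mean_confidence_interval(sample_mean=105, sd=15, n=50)
print(f"95% confidence interval: {low:.1f} to {high:.1f}")
```

Under these assumptions the interval runs from about 100.8 to 109.2, so population means of 95 or 115 look implausible while 105 does not.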
Hypothesis testing 1
- The previous analysis, based on
confidence limits, can be reversed.
- We can take a population with
some parameter value that is critical for a particular reason, use it as the
basis for generating a sampling distribution, and then test whether our sample
has a reasonable chance of coming from that population.
- If our sample statistic is too
far from the parameter, we conclude that the sample did not come from that
population.
- The assumption that the sample comes
from the critical population is called the null hypothesis
and is represented by H0.
- We do an experiment and get
a sample. Then we test whether it is reasonable for us to consider the sample
a random sample from the H0 population.
Examples
- Who is likely to win the
next election, Giuliani or Clinton?
- The null hypothesis,
H0, states that exactly half the potential voters are in favor
of each candidate. But H0 may be in error. It may be that H1,
the hypothesis that one of the candidates is ahead, is true.
- A statistically equivalent
H0 is that a coin is unbiased.
In the long run it will come up heads half the time.
H1 may be that this particular coin is unbalanced, and in the long run
would come up heads at some unknown proportion other than .5. Heads may appear
10, 60, 85 or even 90 percent of the time.
- How can we test whether
H0 is in error?
- If we can identify a
critical sampling distribution, we can see if the actual results of an experiment
are reasonably likely.
- Standard procedures accept
H0 as reasonable if a deviation from the mean of the sampling
distribution as large as that of our experiment is likely to occur more than
once in 20 experiments (i.e., more than a 5% chance).
- The technical terminology
is that we set α at the .05 level.
- α is defined as the probability of rejecting a true null hypothesis.
- If the sampling distribution
is normal, 95% of all samples are within 1.96 standard deviations of the mean.
- If we plan on sampling
1000 people (or flipping a coin 1000 times), H0 would be that
p=.5 is the probability that any single person would be for Giuliani and
q=.5 is the probability that the person is not for Giuliani (i.e. for Clinton).
If the random variable, X, is the proportion of people favoring Giuliani,
we would expect X for our sample to be close to .5.
- Using σ = √(pq/n) = √(.5 × .5 / 1000) = .0158,
we see that 95% of all 1000-person samples would have a proportion favoring
Giuliani within 1.96 × .0158 units of .5, or between .47 and .53. We would
thus reject H0 for any proportion outside this range.
- Any proportion in our
sample less than .47 would predict a Giuliani loss and any proportion greater
than .53 would predict a Giuliani win. Between .47 and .53 would be too close
to call.
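The arithmetic in this example can be checked with a short sketch. It uses the normal approximation to the binomial, as the notes do; the function names are made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def acceptance_region(p0, n, alpha=0.05):
    """Range of sample proportions for which H0: p = p0 is not rejected."""
    se = sqrt(p0 * (1 - p0) / n)              # standard error under H0
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = .05
    return p0 - z * se, p0 + z * se


def call_election(sample_proportion, n, alpha=0.05):
    """Hypothetical helper: map a poll result onto the three verdicts."""
    low, high = acceptance_region(0.5, n, alpha)
    if sample_proportion < low:
        return "Giuliani loss"
    if sample_proportion > high:
        return "Giuliani win"
    return "too close to call"


low, high = acceptance_region(0.5, 1000)
print(f"Do not reject H0 between {low:.2f} and {high:.2f}")
print(call_election(0.55, 1000))
print(call_election(0.51, 1000))
```

A poll of 1000 showing 55% for Giuliani falls outside the range and predicts a win; one showing 51% is too close to call.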
Power: Size of n
- Comparing the binomial
with 100 Bernoulli trials to this one with 1000 trials, we see important differences.
- With n=1000 there is
much more power in the binomial than with n=100. The sampling distribution
of the proportion is tighter, and a greater percentage of the sample proportions
will be near the mean of their distribution. Thus it is less likely that our
sample will be far from its mean.
- 95% of all samples are
within 1.96 standard deviations of the mean.
- In the Giuliani-Clinton
example, if n=1000, 95% of the sample proportions are within 1.96 × .0158
of .5, or between .47 and .53. We would reject H0 for any proportion
outside this range. If n=100, the standard error is √(.5 × .5 / 100) = .05,
and we would not reject H0 unless Giuliani's proportion were
below .40 or above .60. Any proportion in the .40's or .50's would be too close
to call.
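To make the effect of n on power concrete, the sketch below estimates the chance of rejecting H0 when the true proportion is .55. That value is a hypothetical chosen for illustration; the notes do not specify an alternative. The calculation uses the normal approximation throughout.

```python
from math import sqrt
from statistics import NormalDist


def proportion_test_power(p_true, n, p0=0.5, alpha=0.05):
    """Approximate power of the two-sided test of H0: p = p0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se0 = sqrt(p0 * (1 - p0) / n)
    low, high = p0 - z * se0, p0 + z * se0   # range where H0 is not rejected
    # Sampling distribution of the sample proportion if p_true is correct.
    sampling = NormalDist(p_true, sqrt(p_true * (1 - p_true) / n))
    # Power is the probability of landing outside the non-rejection range.
    return sampling.cdf(low) + (1 - sampling.cdf(high))


for n in (100, 1000):
    print(f"n = {n}: power = {proportion_test_power(0.55, n):.2f}")
```

With the same α and the same true proportion, the larger sample rejects a false H0 far more often.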
Hypothesis testing 2d
Possible truths
- When we do research we collect data and we can compute
a mean.
- There are only two possibilities: either
- H0 is true (sample comes from a particular
sampling distribution, with μ0 and σ0), or
- H0 is false (sample comes from some other
sampling distribution)
- Let us assume that if H0 is false, the sample comes
from a distribution we will call H1 (with
μ1 and σ1); in this case μ1 ≠ μ0.
Type 1 and Type 2 errors
- Type 1 error is rejecting H0 when it is
true
- Type 2 error is accepting H0 when it is
false
- α is the probability
of making a type 1 error when H0 is true
- β is the probability
of making a type 2 error when H0 is false.
- The aim of statistical decisions is to jointly
minimize these errors
- 1-α is the probability of accepting
a true H0
- 1-β is the power of a test.
It is the probability of rejecting H0 when it is false
- All of these are conditional probabilities. They are
conditional upon H0 being true for the α
conditions and upon H0 being false for the
β conditions.
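The two error rates can be illustrated by simulation. The sketch below repeats an experiment many times, once with H0 true and once with H0 false, and counts how often H0 is rejected. All the numbers (a null mean of 100, an alternative mean of 108, a standard deviation of 15, n = 25) are assumptions chosen for illustration, and the function name is made up.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mean, mu0=100.0, sigma=15.0, n=25, alpha=0.05,
                   trials=4000, seed=0):
    """Fraction of simulated experiments in which a two-sided z test rejects H0."""
    rng = random.Random(seed)                 # fixed seed for repeatable runs
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


type1_rate = rejection_rate(true_mean=100.0)        # H0 true: rejecting is an error
power = rejection_rate(true_mean=108.0, seed=1)     # H0 false: rejecting is correct
type2_rate = 1 - power                              # H0 false: accepting is an error
print(f"type 1 rate = {type1_rate:.3f}, type 2 rate = {type2_rate:.3f}")
```

The first rate should land close to the chosen significance level, because it is conditional on H0 being true. The second depends on how far the assumed alternative mean is from the null mean.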
Null Hypothesis H0 and alternative hypotheses H1
- When testing hypotheses there is
always more than one sampling distribution to consider
- H0 is the sampling distribution that we
test statistically because we can specify the parameters very precisely
- H1 refers to an alternative sampling distribution
in which the sample we are testing might be a point. We never know for certain
precisely what the parameters of H1 are.
- We assume that H0 is false if the probability
of getting a sample statistic at least as extreme as ours is α or less.
α is usually .05, but sometimes .01
- The α level is set by the
experimenter and is not affected by sample size or variability.
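The decision rule can be sketched for the I.Q. sample of 50 people with a mean of 105. The null mean of 100 and the standard deviation of 15 are assumptions made for this illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Probability, under H0, of a sample mean at least this far from mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# mu0 = 100 and sigma = 15 are assumed values, not given in the notes.
p = two_sided_p_value(sample_mean=105, mu0=100, sigma=15, n=50)
for alpha in (0.05, 0.01):
    decision = "reject H0" if p <= alpha else "do not reject H0"
    print(f"alpha = {alpha}: p = {p:.3f}, {decision}")
```

Under these assumptions the same sample leads to rejecting H0 at the .05 level but not at the .01 level, which shows why the level has to be set by the experimenter in advance.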
More hypothesis testing and power of a test
- If H0 is true, the probability of rejecting
it is α regardless of the other conditions. The
probability of accepting a true H0 is 1-α.
This is a correct decision.
- If H0 is false, however, the probability
of rejecting it depends on many factors, such as sample size,
α level, variability, and the size of the difference between the two
sampling distributions, i.e., between μ0 and μ1.
- The probability of accepting a false H0
is β. The probability of rejecting a false H0,
called the power of a test, is 1-β.
Note that α and 1-α are conditional probabilities.
They depend on the condition of H0
being true. Likewise, β and 1-β depend on H0 being false.
- If α is set at .05, the probability
of making a type 1 error in an experiment is not necessarily .05. If H0
is false, and it often is, the probability of a type 1 error is zero.
For any experiment, p(α error) ≤ α.
- Only under those few conditions when H0
is true can one make a Type 1 error.
- If H0 is false, rejecting H0
is a correct decision, not an error!
- If H0 is false, only by accepting H0
can one make an error, that being a Type 2 error.
- If H0 is false one wants to reject it. The
greater the power, the more likely H0 is going to be rejected.
- If H0 is true, the probability of rejecting
it is a. If H0 is false, the probability
of rejecting it depends on the many factors which determine power; but rejecting
H0 is what must be done to avoid making an error.
- Determining the actual value of power (1-β),
the probability of rejecting a false H0, requires knowing
the actual difference between μ0 and μ1, which is not known
for any experiment.
- β and 1-β are computed for possible μ1-μ0 differences
which seem to the experimenter to be important.
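Since the true difference is never known, power is usually tabulated over a few differences the experimenter cares about. The sketch below does this for a two-sided z test. The standard deviation of 15, the sample size of 50, and the list of differences are all assumed for illustration, and the function name is made up.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(diff, sigma, n, alpha=0.05):
    """Power of a two-sided z test when the true mean differs from mu0 by diff."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = diff / (sigma / sqrt(n))   # difference in standard-error units
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Candidate differences are hypothetical values an experimenter might care about.
for diff in (0, 2, 5, 10):
    power = z_test_power(diff, sigma=15, n=50)
    print(f"difference = {diff:>2}: power = {power:.3f}, beta = {1 - power:.3f}")
```

With a difference of zero, H0 is true and the rejection probability equals the significance level. As the difference grows, power rises toward 1 and the type 2 error probability shrinks.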