Lesson 1: Comparing two proportions

# Hypothesis test comparing population proportions

Once again, Sal continues the discussion of election results to run a hypothesis test comparing population proportions. Created by Sal Khan.

## Want to join the conversation?

- What is the difference between this video and the last one? In the last video, we wanted to know how much more likely men are to vote for the candidate than women, and we figured the probability out. Wasn't the null hypothesis already rejected in the previous video?(14 votes)
- The result should indeed be the same.

The difference is that here, he is solving the problem by using the 'hypothesis' test. To solve a problem you can use several methods. Some methods can't be used for every problem (some are only good for one sided or one-tailed tests, etc).

It's just a different way of working, and depends on how you interpret the problem. Hope that helps a bit.(23 votes)

- Great video. But why didn't we calculate the standard deviation using the traditional method i.e. estimate each standard deviation with the help of their respective sample standard deviations divided by the square root of 1000 and then add them to get the sampling distributions variance?(11 votes)
- Sal used what is known as a "pooled" sample technique to estimate what the sample proportion's mean would be if there is no meaningful difference between how men and women vote in the example (i.e. under the null hypothesis). The approach differs because with proportions each observation is binary (yes or no) rather than a measured value, so a group's standard deviation is determined by its proportion. Thus, when you assume that both binary samples come from populations with the same proportion, it is appropriate to "pool" the two samples together when estimating that shared proportion (p).(7 votes)

- Why did Sal use the p1=p2 assumption while calculating the standard deviation? In the data, p1=0.642 and p2=0.591, so why did he assume p1=p2 to calculate the standard deviation? Also, if p1=p2 then p1-p2 must be equal to 0 and not 0.051. Also, the standard deviation calculated in this video (0.02187) is different from the standard deviation in the last video (0.022). Please explain.(7 votes)
- Sal was working under the null hypothesis, which is that there is no difference between male and female voting preference, meaning that the two population proportions are the same. The observed difference of 0.051 comes from the samples; the test asks how likely a sample difference that large would be if the true difference were 0.

As for the difference in the standard deviations, it comes from the fact that Sal was doing a hypothesis test, which assumes both proportions are the same, so he used a single pooled proportion in the formula instead of the two separate sample proportions.

Please forgive me if this is not correct.(6 votes)

- How do you work this if the two samples are different sizes, say 200 males and 250 females, at a 5% significance level?(4 votes)
- Would calculating the proportions change now? Because the males would take up a larger proportion, and if you assume the null hypothesis, would that sample affect the average p more?(4 votes)

- why is standard deviation of sampling distribution for null hypothesis calculated differently in this video and video "hypothesis test for difference of means"?(3 votes)
- Because in the *Hypothesis test for difference of means*, we were comparing two means against each other, whereas in this video we are comparing two proportions.

However, the two expressions for the standard error in these two videos are actually equivalent, they're just expressed in different ways, because we're dealing with different types of data (numerical vs. categorical). Because of this, we use different statistics (sample mean vs. sample proportion), and hence the formulas look a little bit different.

If we coded the successes as 1 and failures as 0, then the sample proportion P would be equal to the sample mean, and if we calculated the standard deviation, it would be equal to the expressions that Sal uses (after we extend to two groups, as this case presents).(8 votes)
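
The 1/0 coding described in this reply can be checked numerically. The following is only a minimal sketch, assuming the men's sample from the video (642 "yes" answers out of 1,000); the variable names are made up for illustration. It shows that the mean of the coded data is the sample proportion, and that its variance (dividing by n) equals p(1-p).

```python
import math

# Assumed data: the men's sample from the video, 642 "yes" out of 1,000.
yes_count, n = 642, 1000
responses = [1] * yes_count + [0] * (n - yes_count)  # 1 = yes, 0 = no

sample_proportion = yes_count / n
sample_mean = sum(responses) / n  # mean of the 0/1 data

# Variance of the 0/1 data, dividing by n (not n - 1)
variance = sum((x - sample_mean) ** 2 for x in responses) / n

print(sample_mean, sample_proportion)
print(variance, sample_proportion * (1 - sample_proportion))
```

Dividing that variance by n and taking the square root gives the familiar sqrt(p(1-p)/n) for one group, which is the term that appears once per group in the two-sample formula.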

- When calculating the SD with p1=p2, why didn't he change the n from 1000 to 2000?(1 vote)
- I wish I could draw this... Remember that to get the standard deviation of the difference, we had sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)? I used n1 and n2 instead of 1000. You can use this with different sample sizes, and I find it a little confusing that he uses 1000 for both. For this H0, we assumed p1 = p2, so we POOLED the data together and got a new p = 0.6165. That's the new p we are using rather than p1 and p2, but we change nothing else. Now I can simplify the standard deviation to:

sqrt(p*(1-p)*(1/n1 + 1/n2))

1/n1 + 1/n2 simplifies to 2/1000 in THIS problem, but it won't be so neat when you have different sample sizes.(10 votes)
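
The pooled formula in this reply can be written as a small function that accepts different sample sizes. This is a rough sketch, not anything from the video: the function name `pooled_standard_error` is made up, and the "yes" counts for the 200/250 case are hypothetical numbers chosen only to show unequal sizes.

```python
import math

def pooled_standard_error(yes1, n1, yes2, n2):
    """Standard deviation of (sample p1 - sample p2), assuming H0: p1 == p2."""
    # Pool both samples to estimate the single shared proportion p
    p_pooled = (yes1 + yes2) / (n1 + n2)
    return math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

# Numbers from the video: 642 of 1,000 men and 591 of 1,000 women
se_video = pooled_standard_error(642, 1000, 591, 1000)

# Hypothetical counts for unequal samples (200 men, 250 women)
se_unequal = pooled_standard_error(128, 200, 148, 250)

print(se_video, se_unequal)
```

With unequal sizes the larger sample automatically gets more weight in the pooled p, because the pooling adds raw counts rather than averaging the two proportions.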

- Why did Sal use a two tailed distribution? If the difference from the mean is more than 1.96 less wouldn't that area also mean that there is a difference between how men and women vote, so shouldn't it just be the upper area of the standard deviation used to calculate the confidence level?(3 votes)
- A one-tailed test would be appropriate if the alternative hypothesis were directional. If the alternative hypothesis is that the two populations are different (but not in a particular direction), then it's more appropriate to use a two-tailed test.(5 votes)
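
To make the one-tailed versus two-tailed distinction concrete, here is a small sketch that computes both tail probabilities for the Z-score from the video. It assumes a Z-score of roughly 2.345 (0.051 divided by the pooled standard deviation) and uses a hand-written `normal_cdf` helper built on `math.erfc`; both names are illustrative.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

z = 2.345  # approximate Z-score from the video

# Directional alternative (p1 > p2): only the upper tail counts as "extreme"
p_upper_tail = 1 - normal_cdf(z)

# Non-directional alternative (p1 != p2): both tails count, so double it
p_two_tailed = 2 * p_upper_tail

print(p_upper_tail, p_two_tailed)

# At a 5% significance level, a one-tailed test puts all 5% in one tail
# (cutoff near 1.645), while a two-tailed test puts 2.5% in each tail
# (cutoff near 1.96).
print(normal_cdf(1.645), normal_cdf(1.96))
```

In this example both versions fall below 5%, so the conclusion is the same either way, but the two-tailed test is the stricter one for a given Z-score.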

- Is assuming the null hypothesis to be true, the same thing as saying the claim is being made in the alternative hypothesis?(3 votes)
- I'm not sure I follow what you mean, exactly. The null hypothesis means that you assume there is no relationship, no phenomenon, no connection. In other words, the "null hypothesis" just means assuming the data are due to chance.

Another way of putting it is that if there is no verifiable reason to assume there is some cause/effect relationship, you assume there is none.

All of modern science is based on this principle. It is the basis of parsimony.(3 votes)

- Why wasn't the std. dev (=0.0217) computed as Sqrt( 0.6165*(1-0.6165) / 2000) = 0.0108 ?(4 votes)
- no, the two formulas are different (although they seem similar at the final stage)

#one for proportion of voting of all population (sum of men and women)

#the other for standard deviation of difference (between men and women)

1. proportion_all

= (votes_men + votes_women) / (n_men + n_women)

#votes_men=642, votes_women=591, n_men=n_women=1000

= (642+591) / (1000+1000)

= 0.6165

2. std_difference

= sqrt[var_men + var_women]

#var_men=p_men(1-p_men)/n_men (same for women)

= sqrt[p_men(1-p_men)/n_men + p_women(1-p_women)/n_women]

#p_men=p_women=p_all by null hypothesis

= sqrt[p_all(1-p_all)/n_men + p_all(1-p_all)/n_women]

#p_all=0.6165, n_men=n_women=1000

= sqrt[0.6165(1-0.6165)/1000 + 0.6165(1-0.6165)/1000]

#first and second term in [] are the same

= sqrt[2*0.6165(1-0.6165) / 1000]

~ 0.0217

#be careful not to multiply 2 with denominator too, or it will shrink std

In short, we treated the two samples as one sample of double the size only to estimate the overall proportion. For the standard deviation of the difference we did not: we handled each group separately and then summed their variances. It just happens that n_men=n_women=1000, which I believe is the cause of the confusion. The good news is that the formulas above work for any sample sizes (say n_men=1000, n_women=10000).(1 vote)
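
The two quantities being mixed up in this thread can be computed side by side. This is a quick sketch using the numbers from the video; the variable names are made up for illustration.

```python
import math

n = 1000                                 # size of each sample in the video
p_all = (642 + 591) / (n + n)            # pooled proportion, 0.6165

# Standard deviation of the pooled proportion itself (one sample of 2,000)
se_pooled_proportion = math.sqrt(p_all * (1 - p_all) / (2 * n))

# Standard deviation of the difference between the two sample proportions
se_difference = math.sqrt(p_all * (1 - p_all) / n + p_all * (1 - p_all) / n)

print(se_pooled_proportion, se_difference)
print(se_difference / se_pooled_proportion)
```

With equal sample sizes the second value is exactly twice the first, since sqrt(2pq/n) divided by sqrt(pq/(2n)) is sqrt(4). The hypothesis test needs the second one, because the statistic being tested is the difference.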

- To be numerically correct should not the calculated sample variance be multiplied by 2000/1999 to make it an unbiased estimator of the population variance? i.e. by (n/n-1)(3 votes)

## Video transcript

In the last couple of videos we
were trying to figure out whether there was a meaningful
difference between the proportion of men likely to vote
for a candidate and the proportion of women. And in the last video, we
actually estimated that using a 95% confidence interval for
the difference between the proportion of men and the
proportion of women. What I want to do in this
video is just to ask the question more directly. Or just do a straight up
hypothesis test to see is there a difference? So we're going to make
our null hypothesis. No difference. No difference between how the
men and the women will vote. Or another way of viewing it is
that the proportion of men who will vote for the candidate is
going to be the same as the proportion of women
who are going to vote for the candidate. Or another way you could say
that, is that the difference P1 minus P2, the true proportion
of men voting for the candidate minus the true
population proportion of women voting for the candidate
is going to be 0. That's our null
hypothesis. Our alternative hypothesis is
that there is a difference. Or that P1 does not equal P2. Or that P1 minus P2, the
proportion of men voting minus the proportion of women voting,
the true population proportions, does not equal 0. And we're going to do the
hypothesis test with a significance level of 5%. And all that means, and we've
done this multiple times, is we're going to assume
the null hypothesis. And then assuming the null
hypothesis is true, we're going to figure out the
probability of getting the actual difference of our
sample proportions. So we're going to figure out
the probability of actually getting our actual difference
between our male sample proportion and our female
sample proportion. Given the assumption that our
null hypothesis is correct. And if this probability is
less than 5%, if this probability is less than
our significance level. So if the odds of getting these
two samples and the difference between those two
samples, is less than 5%, then we're going to
reject the null hypothesis. So how are we going
to do this? So if we assume the null
hypothesis, what does the sampling distribution of this
statistic start to look like? Well, if we assume that the true
population proportions are actually the same between
men and women. If P1 and P2 are actually the
same, then this right here is going to be 0. So what we can do is, we can
look at what we got when we took the proportion of men and
we subtracted from that the proportion of women-- So this
is our sample proportion of men who, at least in our
poll, said they would vote for the candidate. This is a proportion of women
who said they would vote for the candidate. The difference between
the two was 0.051. So what we can do is figure out
what's the probability? Assuming that the true
proportions are equal, that the mean of the sampling
distribution of this statistic is actually 0, what's the
probability that we get a difference of 0.051? So what's the likelihood that we
get something that extreme? And what we're going to do
here is just figure out a Z-score for this. Essentially figure out how many
standard deviations away from the mean this is. That would be our Z-score. And then figure out, is the
likelihood of getting a standard deviation, or that
extreme of a result, or that many standard deviations away
from the mean, is that likelihood more or
less than 5%? If it is less than 5%, we're
going to reject the null hypothesis. So let's first of all figure
out our Z-score. So we're assuming the null
hypothesis, P1 is equal to P2. Our Z-score, the number of
standard deviations that our actual result is away from the
mean, the actual difference that we sampled in the last few
videos between the men and the women was 0.051. And from that we're going
to subtract the assumed mean. Remember, we're assuming that
these two things are equal. So the mean of this sampling
distribution right here is 0. So we're just going
to subtract 0. And then we have to divide this
by the standard deviation of the sampling distribution of
the statistic right here. P1 minus P2. Now, what's the standard
deviation of the distribution going to be? In the last video, we figured
out that we could represent it by this formula over here. But with our null hypothesis,
we're assuming that P1 and P2 are the same value. Let me rewrite it. So in our last video, and I
don't want to confuse the issue, because in the last
video, I made this approximation over here. So let me write the clean
version down here. We know that the standard
deviation of our sampling distribution of this statistic
of the sample mean of P1 minus the sample proportion, or sample
mean of P2, is equal to the square root of P1 times 1
minus P1 over 1,000, plus P2 times 1 minus P2 over 1,000. We've seen this in
several videos. But in the null hypothesis,
we are assuming that P1 is equal to P2. That's what we do. We assume the null hypothesis
and see the probability of this occuring. So if P1 is equal to P2, we
can just represent them as just some true population
proportion. So we could write it like this,
the square root of-- we can literally just factor out
1/1,000 times P times 1 minus P, plus P times 1 minus P. Because they're going to
be the same value. That's what we're assuming
in the null hypothesis. And so this is just two
of these over here. So this is going to be equal to
2P times 1 minus P, all of that over 1,000. And we're going to take the
square root of that. Now this is the standard
deviation, once again, of the distribution of this statistic
right over here. The sample proportion for
the men minus the sample proportion of the women. Now, we still don't know this. We still don't know the
true proportion. But we can estimate it
using our samples. And since we're assuming that
the men and women, that there's no difference between
them, we can actually view it as a sample size of 2,000
to figure out that true proportion. So we can actually substitute
this with a sample proportion. And we can pretend like our
survey of the men and women is just one huge survey. So you have your sample
proportion, we're surveying a total of 2,000 people. 1,000 men and 1,000 women. But we're assuming that
they're no different. That's what our null hypothesis
is all about, assuming there's no difference
between men and women. And we got 642 yeses
amongst the men and 591 amongst the women. So we got a total
of 642 plus 591. If you viewed it as just one
huge sample of 2,000 people, we got 642 plus 591 is equal
to 1,233 divided by 2,000 gives us 0.6165. And this is our best estimate of
this consistent population proportion that is true
of both men and women. Because we are assuming that
they are no different. So we can substitute this value
in for P to estimate the standard deviation of the
sampling distribution of this statistic right over here. Assuming that the proportion of
men and women are the same. Or the proportion that will
vote for the candidate. So let's do that. It's going to be the square root
of 2 times P, which is 0.6165, times 1 minus
P, 1 minus 0.6165, divided by 1,000. Let me make sure I got it. 2 times 0.6165, that's
that P right there. Times 1 minus P divided
by 1,000. We're taking the square root
of the whole thing. And so we get a standard
deviation of 0.0217. Let me write this over here. So this thing right over
here is 0.0217. So if we want to figure out
our Z-score, if we want to figure out how many standard
deviations the actual sample that we got of this statistic
right over here. If we want to figure out how
many standard deviations that is away from our assumed mean,
that there's no difference, then we just divide 0.051
by this standard deviation right over here. So let's do that. So we have 0.051 divided by this
standard deviation, and that was our answer up here. So I'll just do divided
by our answer. And we are 2.35 standard
deviations away. So our Z-score is
equal to 2.35. So just to review what we're
doing, we're assuming the null hypothesis, there's
no difference. If we assume there's no
difference, then the sampling distribution of this statistic
right here is going to have a mean of 0. And the result that we actually
got for the statistic has a Z-score of 2.35. Or this is equivalent to being
2.35 standard deviations away from this mean of 0. So, in order to reject the null
hypothesis, that has to be less probable than our
significance level. And to see that, let's see what
the minimum Z-score we need to reject our hypothesis. So let's think about
that a second. I'll go back to my Z-table. We want to have a significance
level of 5%. Which means the entire area of
our rejection, in which we would reject the null
hypothesis is 5%. This is a two-tailed test. An
extreme event either far above the mean or far below
the mean will allow us to reject the hypothesis. So we care about the
area over here. And over here we would put
2.5% and over here we would have 2.5%. And we would have 95%
in the middle. So we need to find
this critical Z-score, critical Z-value. And if our Z-value is greater
than the positive version of this critical Z-value, then the
odds of getting something so extreme is less than 5%,
assuming the null hypothesis is correct. So then we can reject
the null hypothesis. So let's see what this
critical Z-value is. So essentially we want a Z-value
where the entire percentage below it is
going to be 97.5%. Because then you're going
to have 2.5% over here. And we've actually already
figured that out. This whole cumulative has to be
97.5%, we did that in the last video. If you look for that, you
get 0.975 right there. It's a Z-score of 1.96. I even wrote it over there. So this critical Z-value
is 1.96. So what that tells you is
there is a 5% chance of sampling a Z-statistic more than
1.96 away from 0 in either direction, assuming the null hypothesis is correct. Now, we just sampled a
Z-statistic of 2.35, assuming the null hypothesis is correct. So the probability of sampling
this, given the null hypothesis is correct, is going
to be less than 5%. It is more extreme than this
critical Z-value. It's going to be out
here some place. And because of that, we can
reject the null hypothesis. I'm sorry for jumping around
so much in this video. I had already written a lot. So I just kind of leveraged what
I had already written. But since the odds of getting
that, assuming the null hypothesis, are less than 5%,
and that was our significance level, we can reject the null
hypothesis and say that there is a difference. We don't know for sure
that there is. But statistically, we are in
favor of the idea that there is a difference between the
proportion of men and the proportion of women
who are going to vote for the candidate.
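
The whole test walked through in the transcript can be sketched in a few lines of Python. This is only an illustrative sketch, not something shown in the video: the function name `two_proportion_z_test` and the keys of the returned dictionary are made up, and it uses `statistics.NormalDist` from the Python standard library in place of the Z-table.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(yes1, n1, yes2, n2, alpha=0.05):
    """Two-tailed z-test of H0: p1 == p2 using a pooled proportion."""
    p1_hat, p2_hat = yes1 / n1, yes2 / n2

    # Under H0 both groups share one proportion, so pool the samples
    p_pooled = (yes1 + yes2) / (n1 + n2)

    # Standard deviation of (p1_hat - p2_hat), assuming H0 is true
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

    # Observed difference minus the assumed mean of 0, in standard deviations
    z = (p1_hat - p2_hat - 0) / se

    # Two-tailed: alpha / 2 in each tail (97.5% below the cutoff for alpha = 5%)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    return {
        "p_pooled": p_pooled,
        "standard_error": se,
        "z": z,
        "z_critical": z_critical,
        "p_value": p_value,
        "reject_null": abs(z) > z_critical,
    }

# Poll numbers from the video: 642 of 1,000 men and 591 of 1,000 women
result = two_proportion_z_test(642, 1000, 591, 1000)
for name, value in result.items():
    print(name, value)
```

Comparing the Z-score with the critical value and comparing the p-value with the significance level are two views of the same decision, so the sketch reports both.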