
# Clarification of confidence interval of difference of means

Clarification of Confidence Interval of Difference of Means. Created by Sal Khan.

## Want to join the conversation?

- I am still confused about the word "confident". When he says he is confident there is a 95% chance, it sounds like "it is very probable that there is a 95% probability". That confuses me, so my question is: Is there a 95% probability, or is there not? If it is just very probable (we are confident, but not 100% sure) that there is a 95% chance, how probable is it that our confidence is justified?

Is there one probability involved here, or two?

Maybe I am reading too much into the word "confident" and its use. But it would be simpler if he just said "there is a 95% chance", if that actually is the case.(10 votes)
- Great explanation, the difference between a probability and knowing something has already happened really helped to drive the point home!(1 vote)

- Sal uses the given variance for the two samples, 4.67 and 4.04. Why doesn't he calculate a pooled variance from the separate variances and use that to calculate his confidence interval estimate?(2 votes)
- Say that the difference of two distributions were either less than 0.7 or greater than 3.12. Would observing either of those two scenarios indicate that something went wrong in the replication of the experiment? That perhaps one is dealing with new distributions accidentally?(2 votes)
- Those cases are part of the other 5% outside the 95% confidence interval. That doesn't mean the calculation was wrong. In fact, those cases were taken into account in the calculation by saying that we are 95% confident, not 100% confident, that the actual difference is between 0.7 and 3.12.(2 votes)

- Can I say that --> We are 95% confident that the true mean difference of weight loss lies between 0.7 lbs and 3.12 lbs(2 votes)
- Yes, that's what it's essentially saying.

i.e. If we repeated this experiment many times and built an interval the same way each time, about 95% of those intervals would contain the true difference between the two population means. For this experiment, the interval is 0.7 to 3.12 in favor of x1.

It's good to reiterate that this is not just a probability estimate, but an estimate in the strictest sense, in that we assume the population standard deviation from a single sample. If you really did do 1000 more tests, you would actually get a better approximation by pooling all of the samples into one combined sample of 100,000 and deriving the standard deviation from that instead.(1 vote)
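
A small simulation may make the "95% confident" wording more concrete. This is only a sketch: the two populations below (their means, standard deviations, and the group size of 100) are made-up values, not the data from the video. It repeats the experiment many times, builds a 95% interval each time, and counts how often the interval contains the true difference.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical populations, chosen only for illustration
MU1, SIGMA1 = 9.0, 4.5
MU2, SIGMA2 = 7.0, 4.0
TRUE_DIFF = MU1 - MU2
N = 100        # people per group in each simulated experiment (assumed)
TRIALS = 2000  # number of simulated experiments


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    # Sample variance with the n - 1 denominator
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


covered = 0
for _ in range(TRIALS):
    g1 = [rng.gauss(MU1, SIGMA1) for _ in range(N)]
    g2 = [rng.gauss(MU2, SIGMA2) for _ in range(N)]
    diff = mean(g1) - mean(g2)
    # Standard error of the difference of the two sample means
    se = math.sqrt(sample_variance(g1) / N + sample_variance(g2) / N)
    if diff - 1.96 * se <= TRUE_DIFF <= diff + 1.96 * se:
        covered += 1

coverage = covered / TRIALS
print(f"{coverage:.1%} of the intervals contain the true difference")
```

The fraction printed should land close to 95%. The remaining intervals miss the true difference even though nothing went wrong in the calculation.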

- Hi. In the presentations (e.g. Bernoulli / Margin of error) and in the course, the difference of std dev uses the diff of sample std dev (as done in this video). However, when using the std dev of the sample, the denominator n-1 (degrees of freedom) is used for accuracy. In this video and the next, the diff of std dev uses the std dev of the sample std dev, but the denominator is n, not n-1. Though there is a high level of confidence (99%) that it is my comprehension which fails somewhere, I would be grateful if I could understand why the denominator n-1 is not used. Thanks a lot for your great videos, and my best wishes for 2016. Kindest regards. Laurent.(2 votes)
- There seem to be several things mixed together here.

1. n-1 (degrees of freedom) vs. n-1 (denominator)

1) n-1 as degrees of freedom is different from n-1 as a denominator (though their values are the same). They are used in two different contexts.

2) n-1 as degrees of freedom is for looking up the t-table in a hypothesis test (it is not involved in the calculation itself).

3) n-1 as a denominator is for calculating the standard deviation of a sample drawn from a population.

Note: we're doing neither 2) nor 3) here.

2. n (denominator) vs. n-1 (denominator)

1) Why do we use n rather than n-1 as the denominator here?

Because our final concern is the mean of a population (rather than of a sample). Dividing a variance by n here is not part of a standard deviation formula; it gives the variance of the sampling distribution of the sample mean.

2) Equations

std_pop = sqrt( whatever_data / n ), where whatever_data is the sum of squared deviations from the mean

Note: we don't know the values of whatever_data for the whole population, so we estimate std_pop from a sample.

std_samples = std_pop / sqrt(n) = sqrt( variance_pop / n ), with variance_pop estimated by the sample variance

Note: let's ignore summing the variances of the two different populations for now.

3) std_sample = sqrt( whatever_data / (n-1) ), as we learnt

Note: std_sample isn't std_samples. The former is the standard deviation of one sample (and its data points), while the latter is the standard deviation across many samples (and their means).

3. Why n-1?

1) n-1 is used rather than n for std_sample because the sample is smaller than the population, and its deviations are measured from the sample mean rather than from the true population mean.

2) As a result, the numerator (whatever_data above) tends to be smaller than the one computed around the population mean.

3) Thus we divide by a slightly smaller number (n-1) to keep std_sample from being systematically too small.

By the way, thanks for asking this. It may help clear up others' confusion about these concepts too.(1 vote)
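
To see the effect of the two denominators, here is a minimal sketch using Python's standard `statistics` module. The population (normal, standard deviation 2.0) and the sample size of 5 are assumptions picked for illustration only. `statistics.pvariance` divides by n and `statistics.variance` divides by n-1.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

POP_SIGMA = 2.0                 # hypothetical population standard deviation
TRUE_VARIANCE = POP_SIGMA ** 2  # 4.0
N = 5                           # small samples make the bias easy to see
TRIALS = 20000

avg_div_n = 0.0
avg_div_n_minus_1 = 0.0
for _ in range(TRIALS):
    sample = [rng.gauss(0.0, POP_SIGMA) for _ in range(N)]
    avg_div_n += statistics.pvariance(sample) / TRIALS         # divides by n
    avg_div_n_minus_1 += statistics.variance(sample) / TRIALS  # divides by n - 1

print(f"true variance:          {TRUE_VARIANCE:.2f}")
print(f"average using n:        {avg_div_n:.2f}")
print(f"average using n - 1:    {avg_div_n_minus_1:.2f}")
```

Averaged over many samples, the n version should come out too small by roughly a factor of (n-1)/n, while the n-1 version should sit near the true variance.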

- What is an Interval?(0 votes)
- Hi Marilyn,

If you are not familiar with the notion of intervals, sampling, or estimation as such, then I'd say that this video would be a pretty good way to start off: http://www.khanacademy.org/video/central-limit-theorem?playlist=Statistics

If even this feels strange to you then you'd better begin from the beginning of the Statistics playlist.

Cheers(14 votes)

- I don't really understand why our 0.7 to 3.12 interval tells us that we are 95% confident that we will lose weight, rather than that we won't. Why did Sal pick the first?(1 vote)
- because of the worst case: x1 is still 0.7 more weight loss than x2

he's saying if you ran the test again, there's a 95% chance that x1's WORST case is going to be some amount higher, and x2's BEST case is going to be some amount lower. Even when x1 is at its worst, and x2 is at its best (within 2 approximated std deviations at least), then x1 is still better(1 vote)

- I just don't know why he said that there is a 95% chance that 1.91 is within 1.96 sigma of the mu.... It has been bothering me since yesterday.. someone please explain..(1 vote)
- just watch previous video. 1.91 is the sample mean, μ is the population mean and he uses the z-table to get 1.96 (standard deviations) interval for 95% confidence.(1 vote)
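
For anyone without a z-table at hand, the 1.96 can be checked with Python's standard library. This is just a sketch of the lookup, assuming a two-sided 95% interval on a standard normal distribution:

```python
from statistics import NormalDist

confidence = 0.95
tail = (1 - confidence) / 2  # 2.5% left over in each tail

# z-value that leaves 2.5% in the upper tail of a standard normal
z_star = NormalDist().inv_cdf(1 - tail)

# Probability of landing within z_star standard deviations of the mean
central_area = NormalDist().cdf(z_star) - NormalDist().cdf(-z_star)

print(round(z_star, 2), round(central_area, 2))
```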

- Sal please please please prepare a script and say every sentence just once. It's very annoying to hear it repeatedly till you finish writing.(1 vote)
- That's cause he is thinking out loud and to be honest that is why I like his videos! This way you can follow his whole thought process while he is solving something.

scripted == book -> read a book(1 vote)

## Video transcript

Near the end of the last video, I wasn't as articulate as I would like to be. Mainly because I think 15 minutes into a video my brain starts to really warm up too much. But what I want to do is restate what I was trying to say.

We got this confidence interval. I'll rewrite it here. I'll just restate the confidence interval. So there's the 95% confidence interval for the mean of this distribution. So, the mean of that distribution, we got as being 1.91 plus or minus 1.21. And near the end of the video I tried to explain why that is neat. Because here we have this confidence interval for this weird mean of the difference between the sampling means. So it seems kind of confusing.

But I just want to restate what we saw in previous videos. This thing right over here, the mean of the difference of the sampling means, we saw two or three videos ago. It's the same thing as the mean of the difference of the means of the sampling distributions. And we know that the mean of each of the sampling distributions is actually the same as the mean of the population distributions. So this is the same thing as the mean of Population One minus the mean of Population Two.

And this was the neat result about the last video. This isn't just a 95% confidence interval for this parameter right here. It's actually a 95% confidence interval for this parameter right here. And this is the parameter that we really care about. The true difference in weight loss between going on the low-fat diet and not going on the low-fat diet.

And we have a 95% confidence interval that that difference is between 0.7 and 3.12 pounds. Which tells us that we have a 95% confidence interval that you're definitely going to lose some weight. We're not 100% sure. We're confident that there's a 95% probability of that.

Anyway, hopefully that clarifies it a little bit. I didn't want to confuse you too much with that bungled language that I had at the end of the last video.
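
The arithmetic behind the interval can be sketched as follows. The 1.91 and the 1.96 come from this page. Two things are assumptions, because this section does not state them: that 4.67 and 4.04 (quoted in the comments above) are the two sample standard deviations, and that each group has 100 people.

```python
import math

# Values quoted on this page
diff_of_sample_means = 1.91  # difference of the two sample means, in pounds
z_star = 1.96                # critical z-value for 95% confidence

# Assumptions (not stated in this section): 4.67 and 4.04 are the sample
# standard deviations, and each group has 100 people
s1, s2 = 4.67, 4.04
n1 = n2 = 100

# Standard deviation of the sampling distribution of the difference of means
standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
margin_of_error = z_star * standard_error

lower = diff_of_sample_means - margin_of_error
upper = diff_of_sample_means + margin_of_error
print(f"{diff_of_sample_means} +/- {margin_of_error:.2f} -> ({lower:.2f}, {upper:.2f})")
```

Under those assumptions the margin of error rounds to 1.21, which matches the 1.91 plus or minus 1.21 in the transcript and gives the 0.7 to 3.12 interval.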