
Difference of sample means distribution

Sal walks through the difference of sample means distribution. Created by Sal Khan.

Want to join the conversation?

  • jwsecret136
    At "We saw this in the last video".
    where is the last video?
    I can't find the lecture content in last videos.
    So I don't understand lecture content at .
    Plese help me. . .
    (24 votes)
  • Simon James Benjamin Mitchell
    A general statistical question - so much of the emphasis is placed on the mean as being representative of a given population, but what is the use of the population mean, and indeed the sampling distribution of the sample mean, when our population is not normally distributed?
    (6 votes)
  • Brad Callum Carruthers
    I am struggling to differentiate between when a variance is divided by n and when the sum of squared deviations is divided by (n - 1). I know that dividing by (n - 1) gives a better estimator of the population variance, and is actually unbiased, but why would Sal be using sigma squared divided by just "n" in this video? Am I missing a crucial point? (a numerical sketch follows this comment)
    (5 votes)
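    To see the distinction in the question above: dividing the sum of squared deviations by (n - 1) gives an unbiased estimate of the population variance, whereas sigma squared divided by n, which Sal uses here, is the variance of the sample mean, a different quantity. A minimal Python sketch, with made-up population parameters:

        import random
        import statistics

        random.seed(0)
        population = [random.gauss(50, 10) for _ in range(50_000)]
        print(statistics.pvariance(population))  # ~100: the true sigma squared

        n = 5
        biased, unbiased, sample_means = [], [], []
        for _ in range(20_000):
            sample = random.sample(population, n)
            m = sum(sample) / n
            ss = sum((x - m) ** 2 for x in sample)
            biased.append(ss / n)          # divide by n: biased low
            unbiased.append(ss / (n - 1))  # divide by n - 1: unbiased
            sample_means.append(m)

        print(statistics.mean(biased))             # ~80: underestimates sigma squared
        print(statistics.mean(unbiased))           # ~100: matches sigma squared
        print(statistics.pvariance(sample_means))  # ~20 = sigma squared / n, what Sal uses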
  • Barry Dowrick
    Thanks for the fab videos,
    I notice that they have been reordered since being recorded. In this one Sal often refers to "the last video", but he is not referring to the one before it in this sequence of videos. It would be great if you had a pointer to the video Sal is referring to.
    Kind regards and many thanks,
    Barry
    (6 votes)
  • Tom Slade
    Saying Z = x̄ - ȳ after many videos using the Z distribution was very confusing lol
    (4 votes)
  • Lauren Heerts
    Is there a video somewhere about paired differences? I would love to see those worked out!
    Thank you for your videos, I love them!
    (4 votes)
  • hammondevan
    I have a question about a question I am doing for homework. You don't have to answer the question itself, but could someone clarify: "If samples are taken from a normal population, will the distribution of the sample means also be normal?" What does "distribution of the sample means" mean? I realize this is probably obvious, but...
    (3 votes)
    • deka
      for someone having a hard time digesting what on earth "a distribution of sample means" means, here's a slightly unusual derivation

      1. get a population distribution
      1) say you have 13 cats
      2) they have 13 weights
      3) you plot them on a graph
      > this is a population distribution (of their weights)

      2. get a sample (not a sampling distribution!)
      1) you pick 3 cats out of the 13 at random
      2) plot their weights
      3) you've got 1 sample distribution of n=3 of your cats from the population distribution above

      3. get a sampling distribution
      1) you do step 2 above ten times with the same n=3
      2) plot their means on a graph (only the 10 means of 3 cats each, not the real weights of each cat!)
      3) now you have your sampling distribution

      when we talk about the "distribution of the sample means", we mean 3-3), not 2-3). it literally means how the 10 means of your 10 samples (each of size 3, from a population of 13 cats) are distributed! (a simulation of these steps follows this answer)

      hope this sweeps away the fog in your head (as it did for mine)
      (2 votes)
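      A minimal Python sketch of the three steps above (the cat weights here are made up for illustration):

        import random

        random.seed(1)

        # 1. population distribution: the weights (kg) of all 13 cats
        population = [3.1, 4.0, 4.4, 3.8, 5.2, 4.7, 3.5, 4.9, 4.2, 3.9, 5.0, 4.5, 3.6]

        # 2. one sample: pick 3 cats at random and record their weights
        sample = random.sample(population, 3)
        print("one sample:", sample)

        # 3. sampling distribution: repeat step 2 ten times, keeping only the means
        sample_means = [sum(random.sample(population, 3)) / 3 for _ in range(10)]
        print("ten sample means:", sample_means)  # plot these to see the sampling distribution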
  • Raymond Greenwood
    If the means of X and Y are sufficiently far apart, could the distribution diagram have two peaks?
    (3 votes)
    • deka
      if you mean the distribution of the difference (mean_X - mean_Y), no, there's 1 peak

      in fact the positions of their distributions on the same line don't matter for the graph of their difference. only the means and standard deviations matter (see the sketch below)
      (1 vote)
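      A quick Python check of this claim, with two made-up populations whose means are far apart:

        import random
        import statistics

        random.seed(2)

        def mean_of_sample(mu, sigma, size):
            """Draw one sample of the given size and return its mean."""
            return sum(random.gauss(mu, sigma) for _ in range(size)) / size

        # differences of sample means from two populations with widely separated means
        diffs = [mean_of_sample(100, 5, 20) - mean_of_sample(10, 3, 20)
                 for _ in range(20_000)]

        print(statistics.mean(diffs))  # ~90 = 100 - 10: a single peak at the difference of means
        # A histogram of diffs shows one normal-shaped peak, never two,
        # no matter how far apart the two population means are.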
  • Matt K
    Do you recommend ck12 as the best resource for practicing problems related to the last third of the Inferential Statistics videos? The last exercise is z-scores 3 and I really would like to practice the later concepts. Thanks!
    (3 votes)
  • EllaLippen
    Are there any practice questions?
    (2 votes)

Video transcript

I want to build on what we did in the last video a little bit. Let's say we have two random variables. So I have random variable x, and let me draw its probability distribution. It doesn't actually have to be normal, but I'll just draw it as a normal distribution. So this is the distribution of random variable x. This is the mean, the population mean, of random variable x, and then it has some standard deviation. Actually, let me just focus on the variance. So it has some variance, right here, for random variable x. This is the distribution for x.

Let's say we have another random variable, y, and let's do the same thing for it. Let's draw its distribution and the parameters for that distribution. So it has some true mean, some population mean, for the random variable y, and it has some variance right over here. I've drawn it roughly normal, but once again, we don't have to assume that it's normal, because we're going to assume, when we go to the next level, that we're taking enough samples that the central limit theorem will apply.

With that said, let's think about the sampling distributions of each of these random variables. First, the sampling distribution of the sample mean of x. Let's say the sample size here is equal to n. What is that going to look like? Well, it's going to be some distribution, and since we're assuming that n is a fairly large number, it's going to be a normal distribution, or at least it can be approximated with one. Let me shift it over a little bit and draw it a little bit narrower. Let me draw the mean. The mean of the sampling distribution is denoted with this x bar; it tells us the mean of the distribution of the sample means when the sample size is n. And we know that this is going to be the same thing as the population mean for that random variable. We also know, from the central limit theorem, that the variance of the sampling distribution is going to be equal to the population variance divided by this n right over here. If you wanted the standard deviation of this, often called the standard error of the mean, you would just take the square root of both sides.

Let's do the same thing for random variable y. Let's take the sampling distribution of its sample mean, and let's just say it has a different sample size. It doesn't have to be different, but this shows you that it doesn't have to be the same. So it has a sample size of m. Let me draw its distribution right over here. Once again, it will be a narrower distribution than the population distribution, and it will be approximately normal, assuming that we have a large enough sample size. The mean of this sampling distribution of the sample mean is going to be the same thing as the population mean; we've seen that multiple times. And the variance of the sample means is the exact same idea: the variance of the population divided by our sample size, m. Be careful not to call this variance the standard error of the mean; the standard error of the mean is the square root of this, the standard deviation of the sampling distribution. Everything we've done so far is complete review.
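As a quick numerical check of this review, here is a minimal Python simulation (the population parameters are made up for illustration): with a population standard deviation of 5 and n = 50, the variance of the sample means should come out near 25 / 50 = 0.5.

    import random
    import statistics

    random.seed(3)
    mu, sigma, n = 20.0, 5.0, 50  # population mean, population sd, sample size

    # Draw many samples of size n and record each sample mean.
    means = []
    for _ in range(10_000):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(sample) / n)

    print(statistics.mean(means))       # ~20: mean of sampling distribution = population mean
    print(statistics.pvariance(means))  # ~0.5 = sigma**2 / n
    print(statistics.pstdev(means))     # ~0.707: the standard error of the mean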
Now it's a little different, because I'm actually doing it with two different random variables, and I'm doing that for a reason: I'm going to define a new random variable. We could just call it z, where z is equal to the difference of our sample means: the sample mean of x minus the sample mean of y.

So what does that really mean? Well, to get a sample mean, at least for this first distribution, you're taking n samples from this population over here. Maybe n is 10: you're taking 10 samples and finding their mean. That sample mean is itself a random variable. Let's say you take 10 samples from here and get 9.2 when you find their mean; that 9.2 can be viewed as a sample from this sampling distribution right over here. Same thing for y: if m is 12, you're taking 12 samples and taking their mean, and that sample mean, maybe it's 15.2, can be viewed as a sample from this sampling distribution. So z is a random variable where you take n samples from this population distribution up here and take their mean, then take m samples from this other population distribution and take their mean, and then find the difference between that mean and that mean. It's another random variable.

But what is the distribution of z? Let's draw it. Well, there are a couple of things we immediately know about z, and we came up with them in the last video. Instead of writing z, I'm just going to write x bar, a sample from the sampling distribution of x, minus y bar, the sample mean of y. We saw this in the last video; in fact, I still have the work right up here. The mean of the difference is the same thing as the difference of the means. So the mean of this new distribution is going to be the mean of our sample mean of x minus the mean of our sample mean of y.

This might seem a little abstract in this video; in the next video we're actually going to do this with concrete numbers, and hopefully it will make a little more sense. And just so you know where we're going with this: the whole point is so that we can eventually do some inferential statistics about differences of means. How likely is it that an observed difference between two sample means is due to random chance? Or, what is a confidence interval for the difference of means? That's what this is all building up to.

So anyway, we know the mean of this distribution. What's its variance? We came up with that result in the last video: if we take the difference of two random variables, the variance of the difference is the sum of the variances of those two random variables. The whole point of that video was to show that it's not the difference of the variances; it's the sum of the variances. So the variance of this new distribution (which I haven't drawn yet), the variance of x bar minus y bar, is going to be equal to the sum of the variances of each of these sampling distributions: the variance of x bar plus the variance of y bar. Actually, let me just draw this here, so we can visualize another distribution, although all I'm going to draw is another normal distribution. Let me scroll down a little bit.
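Here is a short simulation of this new random variable z (all parameters are made up for illustration): the mean of the simulated differences should land near mu_x - mu_y, and the variance near sigma_x**2 / n + sigma_y**2 / m.

    import random
    import statistics

    random.seed(4)
    mu_x, sigma_x, n = 9.0, 2.0, 10    # population x and its sample size
    mu_y, sigma_y, m = 15.0, 3.0, 12   # population y and its sample size

    def sample_mean(mu, sigma, size):
        """Draw one sample of the given size and return its mean."""
        return sum(random.gauss(mu, sigma) for _ in range(size)) / size

    # z = (sample mean of x) - (sample mean of y), repeated many times
    z = [sample_mean(mu_x, sigma_x, n) - sample_mean(mu_y, sigma_y, m)
         for _ in range(50_000)]

    print(statistics.mean(z))       # ~ -6.0 = mu_x - mu_y
    print(statistics.pvariance(z))  # ~ 1.15 = sigma_x**2/n + sigma_y**2/m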
So the mean over here, the mean of x bar minus y bar, is going to be equal to the difference of these means over here; I don't have to rewrite it. Let me draw the curve. Notice, I'm drawing a fatter curve than either one. Why am I doing that? Because the variance here is the sum of the variances up there, so this distribution is going to have a bigger variance, a bigger standard deviation, than either of these. So we have some variance here, the variance of x bar minus y bar.

Now what are these, in terms of the original population distributions? We came up with those results right over here. We know that this thing is the same thing as the variance of the population distribution divided by n; we've done this multiple, multiple times. And what is this right here equal to? It's the variance of the population distribution of x, divided by n. The x just means this is for random variable x, and there's no bar on top, so this is the actual population distribution, not the sampling distribution of the sample mean. And then if we want the variance of the sampling distribution for y, let me do that in a different color; I'll use blue, because that's what we were using for the random variable y. That's going to be equal to this thing over here, by the same exact logic: the variance of the population distribution for y divided by m. And so, once again, this is the variance of the differences of the sample means.

Now if you want the standard deviation of the differences of the sample means, you just have to take the square root of both sides of this. You take the square root, and you get that the standard deviation of the difference of the sample means is equal to the square root of the variance of the population distribution of x divided by n, plus the variance of the population distribution of y divided by m. And this is just neat, because it looks a little bit like a distance formula. I'll throw that out there as we get more sophisticated with our statistics and try to visualize what all of this stuff means in more advanced topics. But the whole point of this is that now we can make inferences about a difference of means. If we have two samples, and we take the means of both of those samples and find some difference, we can draw some conclusions about how likely that difference was to arise just by chance. And we're going to do that in the next video.
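And a worked instance of this final formula, reusing the made-up parameters from the sketch above:

    import math

    sigma_x, n = 2.0, 10   # population sd of x, sample size n
    sigma_y, m = 3.0, 12   # population sd of y, sample size m

    # Standard deviation of the difference of sample means:
    # sqrt(sigma_x**2 / n + sigma_y**2 / m) -- the "distance formula" shape Sal mentions.
    se_diff = math.sqrt(sigma_x**2 / n + sigma_y**2 / m)
    print(se_diff)  # ~1.07 = sqrt(0.4 + 0.75)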