

Lesson 2: The central limit theorem

Sampling distribution of the sample mean (part 2)

More on the Central Limit Theorem and the Sampling Distribution of the Sample Mean. Created by Sal Khan.


• What's the point of the central limit theorem if it doesn't provide you with the actual population distribution? For example, in this video the population distribution was in reality totally different from a normal distribution. So what's the importance of this concept?
• Each separate sample we take from the population will be different - they will have different scores and different sample means. So how do we tell which sample gives us the best description of the population? Can we even predict how well a sample describes the population it is drawn from?

By using the distribution of sample means we have the ability to predict the characteristics of the sample. And one of the basic reasons behind taking a sample is to use the sample data to answer questions about the larger population.

The Central Limit Theorem helps us to describe the distribution of sample means by identifying the basic characteristics of the samples - shape, central tendency and variability. So the distribution of sample means helps us to find the probability associated with each specific sample.

And because there's always some discrepancy or error between a sample statistic and the corresponding population parameter, the CLT enables us to calculate exactly how much error to expect.
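As a quick illustration of these points, here is a minimal Python sketch (the exponential population, sample size, and repetition count are all hypothetical choices): even though the population is heavily skewed, the sample means cluster approximately normally around the population mean, with spread close to σ/√n.

```python
import random
import statistics

random.seed(42)

# Hypothetical setup: a heavily skewed exponential population with
# mean 5 (for an exponential distribution, the SD is also 5).
pop_mean = 5.0
n = 40             # sample size
num_samples = 2000 # how many samples we take

# Repeatedly draw samples of size n and record each sample mean.
sample_means = [
    statistics.mean(random.expovariate(1 / pop_mean) for _ in range(n))
    for _ in range(num_samples)
]

center = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)

# CLT prediction: center near 5, spread near 5 / sqrt(40) ≈ 0.79,
# even though the population itself is far from normal.
print(f"mean of sample means: {center:.2f}")
print(f"SD of sample means:   {spread:.2f}")
```

That predicted spread, σ/√n, is exactly the "how much error to expect" quantity mentioned above (the standard error of the mean).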
• Could you please give a practical example of the utility of the Sampling Distribution of the Sample Mean (SDSM) when used with a NON NORMAL distribution? It would seem that a non-normal distribution generated by some process would mean that the process was "out of control", multiple processes going on, or some such. If that's the case, of what utility is the SDSM when it does not describe the scattered output of said process? Thanks much.
• Why is the mean of the sampling distribution of sample means always equal to the population mean?
• In formulas:
``E[X] = µ``
``E[x̄] = E[(1/n) Σ xᵢ] = (1/n) E[Σ xᵢ] = (1/n) Σ E[xᵢ] = (1/n)(n · µ) = µ``

Logically, it makes sense this should be the case. If some variable has mean µ, that means we expect a given value to be µ. There'll be some variation around that, but that's what we expect, on average. So, we're expecting the average to be µ. Then, if we get a lot of such sample means (that is: the sampling distribution), we're getting a whole lot of values which we expect to all be µ. The average of a lot of things that are all µ or very close to it, should also be µ.
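The algebra above is easy to check numerically. A minimal sketch, assuming a fair six-sided die as the population (so µ = 3.5):

```python
import random
import statistics

random.seed(0)

# Hypothetical population: a fair six-sided die, so µ = 3.5.
die = [1, 2, 3, 4, 5, 6]

# Take many small samples (n = 4) and average each one.
sample_means = [statistics.mean(random.choices(die, k=4))
                for _ in range(10000)]

# The mean of the sample means should land right on µ = 3.5.
print(round(statistics.mean(sample_means), 2))
```

Individual sample means vary quite a bit (a sample of 4 rolls can easily average 2 or 5), but their overall average sits on µ.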
• Is there a difference between taking 1000 samples of 10 (as in the video) and taking 10 samples of 1000?
What about taking 1 time taking a sample of 10000?
...Or 10000 times taking a sample of 1?
• Yes, there is a difference. The bigger the sample size, the narrower the normal distribution you will get.

For example,

Sample size = 25, 5 iterations || Sample size = 5, 25 iterations
mean: 13.74 || 14.32
median: 13.00 || 15.00
SD: 1.19 || 2.99
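The exact numbers above depend on the particular simulation, but the pattern is easy to reproduce. A minimal Python sketch with a hypothetical population (uniform on the integers 4–24, mean 14, SD ≈ 6) shows that the sampling distribution for n = 25 is much narrower than for n = 5:

```python
import random
import statistics

random.seed(1)

def sd_of_sample_means(sample_size, num_samples, pop):
    """SD of the simulated sampling distribution of the sample mean."""
    means = [statistics.mean(random.choices(pop, k=sample_size))
             for _ in range(num_samples)]
    return statistics.stdev(means)

# Hypothetical population: uniform on the integers 4..24 (mean 14, SD ≈ 6).
population = list(range(4, 25))

narrow = sd_of_sample_means(25, 1000, population)  # larger samples
wide = sd_of_sample_means(5, 1000, population)     # smaller samples

# Larger samples give a narrower sampling distribution (SD ≈ σ/√n).
print(f"n=25: SD {narrow:.2f}   n=5: SD {wide:.2f}")
```

With σ ≈ 6, the theory predicts SDs of about 6/√25 ≈ 1.2 and 6/√5 ≈ 2.7, which matches what the simulation produces.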
• Actually, if you pulled a 9 and a 6, you would get 7 and a half.
• You're right! They should have one of those little correction pop-ups.
• Technically, Sal says that the bigger the sample size, the more "normal" the distribution is. But if n = the population size, then you would get a vertical line at x = mean, with y tending to infinity, right?
• Exactly correct! But we almost always assume that the size of the sample is much smaller than the population. If you could really "sample" the ENTIRE POPULATION, then by definition, that's not a "sample", so of course all this theory sort of becomes nonsense. The whole point of this is to try to answer the question, "what can I say about the population if I only know the values for a sample of the population?"
• Sal says we are never going to get 7.5 when n=2. But if 6 and 9 are randomly selected, then 7.5 would be the average, so I'm not quite understanding his reasoning here?
• You're totally correct, Mike G! Your logic is solid; Sal made a mistake.
• What happens when you take the sampling distribution of the sample mean of the sampling distribution of the sample mean of a set of observations? And what happens when you repeat that again and again on the same set? And how would the results relate to one another?
• Let's say that X is my data, xbar1 is the sample mean, and xbar2 is the sample mean of a collection of sample means (basically, a second-level sample mean)

For sake of argument, let's assume that X is Normally distributed with mean μ and SD σ. Then for a sample of size n, xbar1 is Normally distributed with mean μ and SD σ/√n. If we thought of xbar1 as our random variable, and took samples of size m, we'd apply the same logic, and get that xbar2 is Normally distributed with mean μ and SD σ/√(nm). And so on.
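That σ/√(nm) claim can be checked by simulation. A minimal sketch, with hypothetical values σ = 10, n = 4, m = 9:

```python
import random
import statistics

random.seed(7)

# Hypothetical values: population SD sigma, first-level sample size n,
# second-level sample size m.
sigma, n, m = 10.0, 4, 9

def xbar1():
    """One first-level sample mean: average of n normal draws."""
    return statistics.mean(random.gauss(0, sigma) for _ in range(n))

# Each xbar2 is the mean of m first-level sample means.
xbar2_values = [statistics.mean(xbar1() for _ in range(m))
                for _ in range(3000)]

predicted = sigma / (n * m) ** 0.5   # σ/√(nm) = 10/6 ≈ 1.67
observed = statistics.stdev(xbar2_values)
print(f"predicted SD {predicted:.2f}, observed SD {observed:.2f}")
```

Note this is the same SD you would get from a single level of sampling with samples of size n·m, which is one way to see why the second level adds nothing new.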

However, I'm not sure if you're quite thinking along the right lines. The purpose of a sampling distribution is to understand how a statistic varies from sample to sample. Generally speaking, we will have a sample of data, and so only the "first level" sampling distribution would be relevant. While it's possible to think of a "second level" sampling distribution (as you have: the sampling distribution of the sample mean of the sampling distribution), by and large we just won't have any use for it.