- Differences of sample means — Probability examples
Practice using shape, center (mean), and variability (standard deviation) to calculate probabilities of various results when we're dealing with sampling distributions for the differences of sample means.
Intro and review
In this article, we'll practice applying what we've learned about sampling distributions for the differences in sample means to calculate probabilities of various sample results.
Skip ahead if you want to go straight to some examples.
Here's a review of how we can think about the shape, center, and variability in the sampling distribution of the difference between two sample means, $\bar{x}_1 - \bar{x}_2$:
The shape of the sampling distribution of $\bar{x}_1 - \bar{x}_2$ depends on our sample sizes and the shape of each population distribution from which we sample.
- If both populations are normal, then the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is exactly normal regardless of sample sizes.
- If one or both populations are not normal (or their shapes are unknown), then the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal as long as the sample from each not-normal population has at least $30$ observations.
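The two shape rules above can be written as a short check. This is only a sketch: the function name and its parameters are hypothetical, and the threshold of 30 is the common rule of thumb, passed in as an adjustable assumption.

```python
def difference_is_roughly_normal(pop1_normal, n1, pop2_normal, n2, min_n=30):
    """Return True if a normal model is reasonable for the difference of sample means.

    pop1_normal / pop2_normal: True only if that population is known to be normal.
    n1 / n2: the two sample sizes.
    min_n: rule-of-thumb minimum sample size for a not-normal (or unknown) population.
    """
    # A sample passes if its population is normal, or if the sample is large enough.
    sample1_ok = pop1_normal or n1 >= min_n
    sample2_ok = pop2_normal or n2 >= min_n
    # Both samples must pass for the difference to be (approximately) normal.
    return sample1_ok and sample2_ok
```

For instance, two small samples from normal populations pass the check, while a small sample from a skewed population fails it no matter how large the other sample is.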
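The mean of the difference is the difference between the population means:
$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$$
The standard deviation of the difference is:
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$$
(where $n_1$ and $n_2$ are the sizes of each sample).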
This standard deviation formula is exactly correct as long as we have:
- Independent observations between the two samples.
- Independent observations within each sample*.
*If we're sampling without replacement, this formula will actually overestimate the standard deviation, but it's extremely close to correct as long as each sample is less than $10\%$ of its population.
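As a rough illustration of how the center and variability rules turn into a probability, here is a minimal Python sketch. The function name is hypothetical, and the input values are made up for illustration; they are not taken from the examples in this article. It assumes the shape and independence conditions above are met, so a normal model applies.

```python
from math import sqrt
from statistics import NormalDist


def difference_of_means_distribution(mu1, sigma1, n1, mu2, sigma2, n2):
    """Normal model for the sampling distribution of (sample mean 1 - sample mean 2)."""
    # Center: the difference between the population means.
    mean_diff = mu1 - mu2
    # Variability: variances of the two sample means add, then take the square root.
    sd_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return NormalDist(mean_diff, sd_diff)


# Hypothetical inputs, chosen only to illustrate the calculation.
dist = difference_of_means_distribution(
    mu1=10, sigma1=3, n1=36,
    mu2=8, sigma2=4, n2=64,
)

# Probability that the difference in sample means exceeds 3.
p_greater_than_3 = 1 - dist.cdf(3)

print(f"mean = {dist.mean}, sd = {dist.stdev:.4f}")
print(f"P(difference > 3) = {p_greater_than_3:.4f}")
```

Returning a `NormalDist` object keeps the mean and standard deviation together and makes tail probabilities a single `cdf` call, but the same numbers could just as well be standardized into a z-score and looked up in a table.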
Let's try applying these ideas to a few examples and see if we can use them to calculate some probabilities.
Every day, thousands of people at an airport pass through security on one of two levels: level A or level B. Suppose that, on average, it takes people minutes to pass through security on level A with a standard deviation of minutes. On level B, the mean and standard deviation are and minutes, respectively.
Each day, the airport looks at separate random samples of people from each level. They calculate the mean time for each sample, then look at the difference between the sample means.
What are the mean and standard deviation (in minutes) of the sampling distribution of the difference between the sample means?
A large university has over students and over professors. Suppose that the ages of students are strongly skewed to the right with a mean and standard deviation of years and years, respectively. The ages of professors are also right-skewed, and their mean and standard deviation are years and years, respectively.
A student conducting a study plans on taking separate random samples of students and professors. They'll look at the difference between the mean ages of the two samples.
The student wonders how likely it is that the difference between the two sample means is greater than years.
Why is it inappropriate to use a normal distribution to calculate this probability?