AP®︎/College Statistics
© 2023 Khan Academy
Differences of sample means — Probability examples
Practice using shape, center (mean), and variability (standard deviation) to calculate probabilities of various results when we're dealing with sampling distributions for the differences of sample means.
Intro and review
In this article, we'll practice applying what we've learned about sampling distributions for the differences in sample means to calculate probabilities of various sample results.
Skip ahead if you want to go straight to some examples.
Here's a review of how we can think about the shape, center, and variability in the sampling distribution of the difference between two sample means, $\bar{x}_1 - \bar{x}_2$:
Shape
The shape of a sampling distribution of $\bar{x}_1 - \bar{x}_2$ depends on our sample sizes and the shape of each population distribution from which we sample.
- If both populations are normal, then the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is exactly normal regardless of sample sizes.
- If one or both populations are not normal (or their shapes are unknown), then the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal as long as the sample from each non-normal (or unknown-shape) population has at least 30 observations.
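The two shape rules above can be sketched as a small check. This is only an illustration: the function name and its parameters are hypothetical, and the cutoff of 30 is the rule of thumb stated in the list, not a strict boundary.

```python
def difference_is_approximately_normal(pop1_normal, n1, pop2_normal, n2, min_n=30):
    """Apply the shape rule of thumb to the difference of two sample means.

    pop1_normal / pop2_normal: True only if that population is known to be normal.
    n1 / n2: the two sample sizes.
    """
    # A sample passes if its population is normal, or if the sample is large enough.
    sample1_ok = pop1_normal or n1 >= min_n
    sample2_ok = pop2_normal or n2 >= min_n
    return sample1_ok and sample2_ok


# Both populations normal: small samples are fine.
print(difference_is_approximately_normal(True, 5, True, 8))
# Neither population normal and one sample has fewer than 30 observations.
print(difference_is_approximately_normal(False, 40, False, 12))
```

The check requires every sample drawn from a non-normal population to be large; one large sample does not compensate for a small one.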
Center
The mean of the difference is the difference between the population means:
$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$$
Variability
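The standard deviation of the difference is:
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$$
(where $\sigma_1$ and $\sigma_2$ are the population standard deviations, and $n_1$ and $n_2$ are the sizes of each sample).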
This standard deviation formula is exactly correct as long as we have:
- Independent observations between the two samples.
- Independent observations within each sample*.
*If we're sampling without replacement, this formula will actually overestimate the standard deviation, but it's extremely close to correct as long as each sample is less than 10% of its population.
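As a rough sketch, the center and variability results and the 10% condition could be written as two small helpers. The function names and the numbers in the usage lines are made up for illustration.

```python
import math


def difference_mean_and_sd(mu1, sigma1, n1, mu2, sigma2, n2):
    """Return (mean, standard deviation) of the sampling distribution of xbar1 - xbar2."""
    mean_diff = mu1 - mu2
    # The variances of the two sample means add, even though the means are subtracted.
    sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return mean_diff, sd_diff


def within_ten_percent(n, population_size):
    """True if a sample of size n is less than 10% of its population."""
    return n < 0.10 * population_size


# Hypothetical populations: means 10 and 4, standard deviations 3 and 4,
# with sample sizes 9 and 16.
mean_diff, sd_diff = difference_mean_and_sd(10, 3, 9, 4, 4, 16)
print(mean_diff, sd_diff)
print(within_ten_percent(9, 500))
```

Note that the standard deviations are squared before being combined, so the standard deviation of the difference is not simply the sum or difference of the two individual standard errors.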
Let's try applying these ideas to a few examples and see if we can use them to calculate some probabilities.
Example 1
Every day, thousands of people at an airport pass through security on one of two levels: level A or level B. Suppose that, on average, it takes people 26 minutes to pass through security on level A with a standard deviation of 7.5 minutes. On level B, the mean and standard deviation are 24 and 4 minutes, respectively.
Each day, the airport looks at separate random samples of 100 people from each level. They calculate the mean time for each sample, then look at the difference between the sample means $(\bar{x}_A - \bar{x}_B)$.
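Here is a minimal sketch of how the numbers in this example fit the review above. Both samples have 100 people, so the difference is approximately normal whatever the population shapes are. The final probability (that the level A sample mean comes out below the level B sample mean) is an illustrative question chosen for this sketch, and it assumes each sample is under 10% of the people using that level.

```python
import math
from statistics import NormalDist

# Population values given in the example (minutes).
mu_a, sigma_a, n_a = 26, 7.5, 100
mu_b, sigma_b, n_b = 24, 4, 100

# Center and variability of the sampling distribution of xbar_A - xbar_B.
mean_diff = mu_a - mu_b
sd_diff = math.sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)

# Normal model for the difference (justified because both samples have n >= 30).
sampling_dist = NormalDist(mu=mean_diff, sigma=sd_diff)

# Illustrative question: how often is the level A sample mean below level B's?
p_negative = sampling_dist.cdf(0)

print(f"mean of difference: {mean_diff} minutes")
print(f"standard deviation of difference: {sd_diff:.2f} minutes")
print(f"P(xbar_A - xbar_B < 0): {p_negative:.4f}")
```

The mean of the difference works out to 2 minutes with a standard deviation of about 0.85 minutes, so a negative difference sits more than two standard deviations below the mean and is unusual.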
Example 2
A large university has over 30,000 students and over 1,500 professors. Suppose that the ages of students are strongly skewed to the right with a mean and standard deviation of 21 years and 3 years, respectively. The ages of professors are also right-skewed, and their mean and standard deviation are 50 years and 5 years, respectively.
A student conducting a study plans on taking separate random samples of 100 students and 20 professors. They'll look at the difference between the mean ages of the two samples $(\bar{x}_P - \bar{x}_S)$.
The student wonders how likely it is that the difference between the two sample means is greater than 35 years.
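One way to approach the student's question is to check the conditions from the review before reaching for a normal calculation. The sketch below does that with the numbers from this example; the variable names are illustrative, and the population sizes are treated as the stated lower bounds (30,000 and 1,500).

```python
import math

# Population values given in the example (years). Both populations are right-skewed.
mu_p, sigma_p, n_p, pop_p = 50, 5, 20, 1500     # professors
mu_s, sigma_s, n_s, pop_s = 21, 3, 100, 30000   # students

# Center and variability of the sampling distribution of xbar_P - xbar_S.
mean_diff = mu_p - mu_s
sd_diff = math.sqrt(sigma_p**2 / n_p + sigma_s**2 / n_s)

# 10% condition: each sample should be less than 10% of its population.
ten_percent_ok = n_p < 0.10 * pop_p and n_s < 0.10 * pop_s

# Shape condition: neither population is normal, so both samples need n >= 30.
normal_ok = n_p >= 30 and n_s >= 30

print(f"mean of difference: {mean_diff} years")
print(f"standard deviation of difference: {sd_diff:.2f} years")
print(f"10% condition met: {ten_percent_ok}")
print(f"normal shape condition met: {normal_ok}")
if not normal_ok:
    print("A normal-based probability for a difference above 35 is not justified here.")
```

The mean and standard deviation can still be computed, but because only 20 professors are sampled from a skewed population, the shape of the sampling distribution is not known to be approximately normal, so a normal-curve probability for a difference above 35 years would not be reliable.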