© 2024 Khan Academy

# Combining normal random variables

When we add or subtract independent random variables that each follow a normal distribution, the result is also normally distributed. This lets us use the normal distribution to answer probability questions about the sum or difference.
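
For reference, the rule behind this statement can be written compactly. The symbols $X$, $Y$, $\mu$, and $\sigma$ below are placeholder names chosen for illustration (they are not used elsewhere in this article), and the two variables are assumed to be independent:

$$
X \sim N(\mu_X, \sigma_X^2),\quad Y \sim N(\mu_Y, \sigma_Y^2)
\;\Longrightarrow\;
X \pm Y \sim N\!\left(\mu_X \pm \mu_Y,\ \sigma_X^2 + \sigma_Y^2\right)
$$

Note that the variances add in both cases, even when the variables are subtracted.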

## Example 1: Total amount of candy

Each bag of candy is filled at a factory by $4$ machines. The first machine fills the bag with blue candies, the second with green candies, the third with red candies, and the fourth with yellow candies. The amount of candy each machine dispenses is normally distributed with a mean of $50\,\text{g}$ and a standard deviation of $5\,\text{g}$. Also, assume that the amount dispensed by any given machine is independent of the other machines.

Let $T$ be the total weight of candy in a randomly selected bag.

**Find the probability that a randomly selected bag contains less than $178\,\text{g}$ of candy.**

Let's solve this problem by breaking it into smaller pieces.
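
One way to work through those pieces numerically is sketched below in Python, using only the standard library. The variable names are illustrative, and the sketch relies on the independence assumption stated in the problem: means add, variances add, and the total $T$ is itself normal.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement: 4 independent machines,
# each dispensing an amount that is normal with mean 50 g and sd 5 g.
n_machines = 4
machine_mean = 50.0
machine_sd = 5.0

# Means add. For independent variables the variances add
# (the standard deviations do not).
total_mean = n_machines * machine_mean      # 200 g
total_var = n_machines * machine_sd ** 2    # 100 g^2
total_sd = sqrt(total_var)                  # 10 g

# T is normal, so P(T < 178) is the normal CDF evaluated at 178.
z = (178 - total_mean) / total_sd
p_less_than_178 = NormalDist(total_mean, total_sd).cdf(178)

print(f"z = {z:.2f}, P(T < 178) = {p_less_than_178:.4f}")
```

The z-score comes out to $-2.2$, and the probability is roughly $0.0139$, which agrees with the standard normal table value for $z = -2.2$.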

## Example 2: Difference in bowling scores

Adam and Mike go bowling every week. Adam's scores are normally distributed with a mean of $175$ pins and a standard deviation of $30$ pins. Mike's scores are normally distributed with a mean of $150$ pins and a standard deviation of $40$ pins. Assume that their scores in any given game are independent.

Let $A$ be Adam's score in a random game, $M$ be Mike's score in a random game, and $D$ be the difference between Adam's and Mike's scores, where $D=A-M$.

**Find the probability that Mike scores higher than Adam in a randomly selected game.**

Let's solve this problem by breaking it into smaller pieces.
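
A minimal Python sketch of those pieces follows, again using only the standard library and illustrative variable names. It assumes, as the problem states, that the two scores are independent, so the variance of $D = A - M$ is the sum of the two variances. Mike scores higher than Adam exactly when $D < 0$.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement.
adam_mean, adam_sd = 175.0, 30.0
mike_mean, mike_sd = 150.0, 40.0

# D = A - M: the means subtract, but the variances still add
# because the scores are assumed independent.
diff_mean = adam_mean - mike_mean               # 25 pins
diff_sd = sqrt(adam_sd ** 2 + mike_sd ** 2)     # sqrt(900 + 1600) = 50 pins

# Mike beats Adam when D < 0, so evaluate the normal CDF of D at 0.
z = (0 - diff_mean) / diff_sd
p_mike_higher = NormalDist(diff_mean, diff_sd).cdf(0)

print(f"z = {z:.2f}, P(D < 0) = {p_mike_higher:.4f}")
```

Here the z-score is $-0.5$ and the probability is roughly $0.3085$, so Mike would be expected to outscore Adam in a little under a third of games.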

## Want to join the conversation?

- In Example 2, the hint says P(D < 0). Why does the difference between the two scores have to be less than 0? (6 votes)
- We have D = A - M, so D < 0 happens exactly when M > A, which means Mike scores higher than Adam.

P(D < 0) is the probability of the event that Mike scores higher than Adam.

Hope that helps.(31 votes)

- In Example 2 the number of pins is discrete. How could you represent that using a density curve? (7 votes)
- Very good question! It turns out that, if Mike and Adam play a large number of games, the distribution of their scores will be very well approximated by a normal distribution (even if their scores are discrete variables!). This is a consequence of something called the "Central Limit Theorem". Here is a video of Sal talking about it from the AP/College Statistics series: https://www.khanacademy.org/math/ap-statistics/sampling-distribution-ap/what-is-sampling-distribution/v/central-limit-theorem (3 votes)

- The practice quiz keeps asking absolute-value probability questions. How does one go about solving them? For instance, in one example Sam's time to wash cars has a mean of 20 minutes and a standard deviation of 6.4 minutes. Taylor's time to wash the interior of cars has a mean of 18 minutes and a standard deviation of 4.8 minutes.

Then it says to find the probability that randomly selected times for Sam and Taylor fall within 10 minutes of each other, and gives the expression P(|D| < 10). So I set D = S - T and get a mean of 2 minutes and a standard deviation of 8 minutes for D. So far so good, but after that I always go wrong somehow. When I click on the explanation, it says to compute two z-scores, one for -10 and one for 10, and then find the probability between them. But why -10 and 10? It says within ten minutes of each other, so wouldn't you go ten above and ten below the mean of D? (6 votes)
- The mean and standard deviation describe the shape and location of the curve and tell you what percentage lies above or below a given point. However, the question asks whether they finish within 10 minutes of each other. Since Taylor is on average 2 minutes quicker than Sam, the curve for D is centered at 2, not at 0. The point where both have the same time is D = S - T = 0. Go 10 minutes below and 10 minutes above 0, the point where their times are equal, to find the probability that they finish within 10 minutes of each other. (1 vote)
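
To make the distinction in this thread concrete, here is a small Python sketch using the numbers quoted in the question (D = S - T with mean 2 and standard deviation 8). The variable names are illustrative. It computes the probability the quiz asks for, P(-10 < D < 10), alongside the interval centered on the mean of D that the question proposes, so the two can be compared.

```python
from statistics import NormalDist

# D = S - T, using the mean and standard deviation quoted in the question.
d = NormalDist(mu=2.0, sigma=8.0)

# "Within 10 minutes of each other" means |D| < 10, i.e. -10 < D < 10.
# The interval is centered on 0 (equal times), not on the mean of D.
z_low = (-10 - d.mean) / d.stdev     # -1.5
z_high = (10 - d.mean) / d.stdev     # 1.0
p_within_10 = d.cdf(10) - d.cdf(-10)

# For comparison only: 10 above and 10 below the mean of D,
# i.e. -8 < D < 12. This answers a different question.
p_around_mean = d.cdf(12) - d.cdf(-8)

print(f"z-scores: {z_low:.2f} and {z_high:.2f}")
print(f"P(-10 < D < 10) = {p_within_10:.4f}")
print(f"P(-8 < D < 12)  = {p_around_mean:.4f}")
```

The two z-scores are not symmetric ($-1.5$ and $1.0$) precisely because the interval is centered on 0 while the distribution of D is centered on 2.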

- I'm confused about how you do the last part (problem D) for both examples. (1 vote)
- In problem D of both examples, you're finding the probability of a specific event based on the normal distribution. This involves calculating the z-score corresponding to the given value (e.g., less than 178 grams of candy, or a difference of 0 for Mike scoring higher than Adam in bowling) and then using a standard normal distribution table or calculator to find the corresponding probability. (1 vote)

- In Example 1, you give the probability for a z-score of -2.2 as 0.0139, but on every other z-table I look at, it's 0.41294. What's going on there? (1 vote)
- On a standard normal table, P(Z < −2.2) is about 0.0139. The value 0.41294 is what the table lists for z = −0.22, so it's worth double-checking that you are reading the row for −2.2 and not −0.22. (1 vote)
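
A quick numerical check of the two values mentioned in this thread, as a sketch using Python's standard library (the variable names are illustrative):

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist()

# Area to the left of z = -2.2 (the z-score from Example 1).
p_at_minus_2_2 = std_normal.cdf(-2.2)

# Area to the left of z = -0.22 (a nearby row that is easy to misread).
p_at_minus_0_22 = std_normal.cdf(-0.22)

print(f"P(Z < -2.2)  = {p_at_minus_2_2:.4f}")
print(f"P(Z < -0.22) = {p_at_minus_0_22:.4f}")
```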

- Hiya Sal and everyone at Khan, thank you for all your hard work. It would be really nice if we could get a worked example of a probability of an absolute value as that is something that comes up in the practice questions but wasn't covered in the videos leading up to it. Like P(X |5|) or something like that.(0 votes)
- P(X = |5|) = P(X = 5)

I think you meant something different, since you can always just replace |5| with 5.

If you meant to ask something like

P(|X| > 5)

this would be (I think)

P(|X| > 5) = P(X < -5) + P(X > 5)(2 votes)

- For Example 2, I am confused about why we are finding P(D < 0). Since Adam and Mike are bowling, the difference of the two scores must be a whole number, so D can't take any value between -1 and 0. Also, 0 isn't part of the solution space. So shouldn't we be trying to find P(D <= -1)?

<= means less than or equal to. (1 vote)
- The random variable is D = A - M, so Mike has more pins than Adam exactly when D is negative.

Therefore, for our question, D should be a negative number. That's why the hint uses D < 0. (0 votes)

- What happens when the probability is for a value greater than some number, for example P(X > 14)? (0 votes)
- P(X > 14) is the probability that the random variable takes a value greater than 14. For a normal variable, find the z-score for 14 and take the area to the right of it, which is P(X > 14) = 1 − P(X ≤ 14). (2 votes)

- Again, it would make more sense to define D = M - A, since the probability asked for is that of Mike scoring more than Adam. (0 votes)
- In the context of Example 2, with D = M − A, Mike scoring higher than Adam corresponds to D > 0. With the problem's definition, D = A − M, the same event corresponds to D < 0. Both formulations are valid and give the same probability, but the problem explicitly defines the order of subtraction. (1 vote)