### Course: Statistics and probability>Unit 9

Lesson 8: More on expected value

# Getting data from expected value

Expected value refers to the average outcome you would expect from repeating an experiment over and over. It is calculated by multiplying each possible outcome by its probability of occurring, and summing those products together. If we know the expected value, we can go backwards and solve for frequency. Created by Sal Khan.

## Want to join the conversation?

• I got a bit lost and frustrated when he divides 67.4 by 20. I can't get the intuition behind the steps he makes. I could memorize it for the sake of memorizing, but I really want to understand it so that I can solve problems on my own.
• Suppose that the average number of children per person is 1.5. This is the expected value. In a group of 100 people you would expect to find 1.5 * 100 = 150 children. This is just the expected sum of children among 100 people. If you want to go back to the expected value, you need to divide the expected sum (150) by the number of observations (# of people) you are considering: 150/100 = 1.5.

Sal is doing the same thing: dividing the expected sum (67.4) by the number of observations (20) to get the expected value (3.37).
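A minimal Python sketch of the reply above, using exact fractions to avoid rounding noise. The function names `expected_sum` and `expected_value_from_sum` are illustrative and not from the lesson.

```python
from fractions import Fraction

def expected_sum(expected_value, n):
    """Expected sum of n observations: n times the expected value of one."""
    return expected_value * n

def expected_value_from_sum(total, n):
    """Go backwards: divide the expected sum by the number of observations."""
    return total / n

# Children example from the reply: 1.5 children per person, 100 people.
group_total = expected_sum(Fraction(3, 2), 100)

# Sal's step: expected sum of 20 rolls is 67.4, so one roll is 67.4 / 20.
per_roll = expected_value_from_sum(Fraction("67.4"), 20)

print(group_total, float(per_roll))  # 150 3.37
```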
• Why do you multiply by the die value?
• So for the dice problem you can think of it like getting an average of the results.
So if you rolled say 2, 5, 5 then your average value would be (2 + 5 + 5)/3 = 4
So you need to include the face value of the dice roll.

As for your comment where for the die face "there was a color", that's not really going to work the same way. In the dice problem, rolling a 5 and a 1 is the same as rolling a 3 and a 3. But if your dice were colored and you rolled yellow then blue, how would you add or average those values? To make it work for colors, you'd probably need to assign the colors number values or do something else to make the problem make sense.
• I don't really get why do we multiply by the die value? It doesn't make any sense to me... Can anyone explain that as simple as possible? I'm really confused! :(
• Expected value of a dice roll assumes that the face of the dice is the value.
So notice you can't really do this with a coin. Unless you assign a number value to Heads and Tails a coin has no number value.

So for example,
If you roll 2 dice and get 2 and 4 then the average value is 3.
If you roll 3 and 4 then the average is 3.5.

So you could experimentally roll a hundred dice or a thousand or whatever and calculate the average and that would be an estimate for the Expected Value.

That's exactly the same as the problem in this video. By summing up all of the probabilities * values he finds the average or the estimate for the Expected Value.

Hope that helps.
• Why couldn't he just add all the known frequency numbers (110, 95, et cetera), subtract it from the total (500), and divide it by 2? That is what I did when he asked us to pause the video at the beginning, and I found the same answer that he did in the end.
• Because it was a coincidence that both values were 75. The first part of your reasoning is correct: if you sum the known frequencies and subtract from 500, you get a number that must be divided between the 2 missing frequencies. But you don't know how the 150 is distributed without the rest of the procedure.

For example, with exactly the same data, but changing the expected value of the sum of 20 rolls to be 52.4, you would find that the first number is 150 and the second number is 0.
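To see why the split of the 150 matters, the sketch below recomputes the expected sum of 20 rolls for a few candidate splits. The known counts come from Jamie's table. The helper name `expected_sum_of_20` and the third split (100, 50) are assumptions added for illustration.

```python
from fractions import Fraction

# Known counts from Jamie's table (faces 2 through 5).
KNOWN = {2: 110, 3: 95, 4: 70, 5: 75}

def expected_sum_of_20(ones, sixes):
    """Expected sum of 20 rolls implied by a full frequency table."""
    counts = {1: ones, **KNOWN, 6: sixes}
    total = sum(counts.values())
    per_roll = Fraction(sum(face * n for face, n in counts.items()), total)
    return 20 * per_roll

# Every split below has ones + sixes = 150, but the expected sums differ.
for ones, sixes in [(75, 75), (150, 0), (100, 50)]:
    print(ones, sixes, float(expected_sum_of_20(ones, sixes)))
```

Only the 75/75 split reproduces 67.4, which is why the expected value is needed in addition to the total of 500.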
• I don't understand the difference between the expected value (E(x)) and the mean value (μ). Thank you.
• The mean of a set of values is simply the average of those values. For instance, if you had the set of numbers {1, 4, 7}, then the mean would be (1+4+7)/3 = 4.

Expected value can be very similar to (and sometimes even equal to) the mean. However, it requires an extra step: you must take into account that different outcomes may have different probabilities. Here's an example:

Let's say you had a machine that gave you a random amount of money each day, and that on any day it had a 20% chance of giving you \$50, a 50% chance of giving you \$100, and a 30% chance of giving you \$300. How much money could you expect to make after many many days? (Say, 1,000 days for example).

Well, based on the numbers above, we could say that we would probably make \$50 on about 200 of our days (because 20% of 1000 is 200), \$100 on about 500 of our days, and \$300 on about 300 of our days. Therefore, our total profits would be \$50 x 200 + \$100 x 500 + \$300 x 300 = \$150,000. So, based on this, how much money did we make each day on average? Well, \$150,000 / 1000days = \$150/day. This is the expected value.

The formula above can be simplified if we use 1 day instead of 1000 days. Essentially, this is done the same way, except that you'd substitute in the number 1 wherever you see the number 1000, and you'd get 20% x 1 day x \$50 + 50% x 1 day x \$100 + 30% x 1 day x \$300 = \$150/day.
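The money-machine arithmetic can be checked with a short Python sketch. The dictionary name `payouts` and the function `expected_value` are illustrative names, not part of the original comment.

```python
from fractions import Fraction

# Payout in dollars -> probability, as given in the example.
payouts = {50: Fraction(20, 100), 100: Fraction(50, 100), 300: Fraction(30, 100)}

def expected_value(dist):
    """Sum of each outcome times its probability."""
    assert sum(dist.values()) == 1  # probabilities must add up to 1
    return sum(value * p for value, p in dist.items())

days = 1000
expected_total = sum(value * p * days for value, p in payouts.items())

print(expected_total, expected_total / days, expected_value(payouts))  # 150000 150 150
```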

In some cases, expected value can actually equal the mean. This happens when every possible outcome has the same probability of occurring. For example, rolling a die.

If you roll a die, each outcome has a 1/6 chance of occurring (assuming it's a fair die). Therefore, the expected value would be 1/6 x 1 + 1/6 x 2 + 1/6 x 3 + 1/6 x 4 + 1/6 x 5 + 1/6 x 6. This can be simplified to 1/6 + 2/6 + 3/6 + 4/6 + 5/6 + 6/6, which can then be simplified to (1+2+3+4+5+6)/6. This is exactly the same as the mean.
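A quick check of the fair-die case, as a sketch: with equal probabilities of 1/6, the probability-weighted sum matches the plain average of the faces.

```python
from fractions import Fraction
from statistics import mean

faces = [1, 2, 3, 4, 5, 6]

# Each face weighted by its probability of 1/6 (fair die assumed).
weighted = sum(Fraction(1, 6) * face for face in faces)

print(weighted, mean(faces))  # 7/2 3.5
```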

Anyways, great question! And I hope my comment helped clear things up a bit.
• You mention that you are taking a weighted sum of the values. Isn't there an equal chance of any number between 1-6 coming up when you roll a die? If so, why should any number be weighted more than another?
• The numbers 1-6 have an equal theoretical probability. That is, we know that the six sides are equally likely, and so we expect 1/6 of the rolls to come up as each number. For example, in 600 rolls we'd expect 100 of each number.

In practice, the observed results will vary slightly from the theoretical probability. The weighted sum Sal is calculating is the observed expected value (that is, the sample mean), because he's using the sample data. So in 600 rolls there might be only 95 1's and 105 2's, and those determine the weights for the sample mean.

It's somewhat odd to use "expected value" for observed results instead of theoretical results, but that's what Sal is doing.
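The difference between theoretical and observed weights can be sketched in Python. Only the 95 ones and 105 twos come from the reply above. The other four counts are hypothetical, chosen so the total is 600.

```python
from fractions import Fraction

# Hypothetical 600-roll sample: the counts for faces 3 to 6 are made up
# for illustration; only the 95 and 105 appear in the reply.
observed = {1: 95, 2: 105, 3: 100, 4: 100, 5: 100, 6: 100}

def sample_mean(counts):
    """Weight each face by its observed relative frequency."""
    total = sum(counts.values())
    return sum(face * Fraction(n, total) for face, n in counts.items())

# Theoretical expected value: every face weighted by 1/6.
theoretical = sum(face * Fraction(1, 6) for face in range(1, 7))

print(float(theoretical))             # 3.5
print(float(sample_mean(observed)))   # slightly above 3.5 for this sample
```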
• Why does an expected value of the sum of 20 rolls being 67.4 mean that the expected value of one roll is 67.4/20?
• I am quite confused. First, she rolled the die 500 times, but only took into account 20 outcomes from rolling the die to get 67.4 as the expected value. If she went to the trouble of rolling the die 500 times, why not calculate the expected value using all the values she got? How does this affect the expected value? I mean, if she had taken into account all the rolls, could the expected value be different? If so, doesn't using this "reduced" value to find the missing absolute frequencies make the solution inaccurate? Also, it is not very clear how she calculated the expected value. I mean, did she use the same formula as Sal did when trying to find the missing values (die_face_number x absolute_frequency / total_rolls)?
• A bit of confusion here.

First, she calculated the expected value per roll based on 500 rolls. In other words, she rolled the die 500 times, summed up the results, and then divided by 500 to find expected value per roll. Then, she wanted to know what value she could expect to get for 20 rolls; so she simply multiplied by 20.

The expected value for 500 rolls and the expected value for 20 rolls are both based on the expected value per 1 roll. Thus, since they are based on the same thing, you can solve for one based on the other, which is essentially what Sal did to solve for the missing values.

Hopefully that clears some things up for you.
• Why isn't expected value of 1 roll 3.5? Was it from experiment? I thought you could calculate expected value before experiment, just from probability of each event happening.
• In this case, Sal is calculating the expected value according to the data that Jamie collected.

If the die is in fact a 'fair' die then you would be exactly right that the expected value of a given roll is 3.5. In this case however, Sal is calculating the expected value according to the data collected because that will provide information to figure out what the two washed away values were.
• What is the conclusion: is the die fair or unfair? Is the answer given by Sal here correct in either case?
• Based on the solution provided, the die is not fair. The calculated frequencies for each outcome are not equal, indicating that some outcomes are more likely than others. Therefore, the die is biased or unfair.

## Video transcript

Voiceover: Jamie's dad gave her a die for her birthday. She wanted to make sure it was fair, so she took her die to school and rolled it 500 times and kept track of how many times the die rolled each number. Afterwards, she calculated the expected value of the sum of 20 rolls to be 67.4. On her way home from school, it was raining, and 2 values were washed away from her data table. Find the 2 missing absolute frequencies from Jamie's data table.

So you see here, she rolled her die 500 times, and she wrote down how many times she got each number. She got a 2 110 times, a 3 95 times, a 4 70 times, a 5 75 times, and then she had written down how many times she got a 1 and a 6, but then it got washed away, so we need to figure out how many times she got a 1 and a 6, given the information on this table right over here and given the information that the expected value of the sum of 20 rolls is 67.4. I encourage you to pause this video and think about it on your own before I give a go at it.

So first, let's think about what this expected value, the sum of 20 rolls being 67.4, tells us. The expected value of the sum of 20 rolls is just 20 times the expected value of 1 roll. So the expected value of a roll is going to be equal to 67.4 divided by 20. We can get our calculator out. Let's see. So we have 67.4 divided by 20 is 3.37. So this is equal to 3.37.

So how does that help us? We know how to calculate an expected value given this frequency table right over here. If we say that this number right over here, let's say that's capital A and let's say that this number here is capital B, if we were to try to calculate the expected value of a roll, what we really want to do is take the weighted sum of these values, weighting each one by its frequency.
So, for example, if we got a 1 A out of 500 times, it would be A out of 500 times 1, plus, I'll do this in different colors, 110 out of 500 times 2. Notice, this is the frequency with which they got a 2, times 2. We're taking a weighted sum of these values. And then plus 95 out of 500 times 3, plus, I think you see where this is going, 70 over 500 times 4, almost there, plus, let's see, I haven't used this brown color, 75 over 500 times 5. Finally, plus B over 500 times 6. This is going to give us our expected value of a roll, which is going to be equal to 3.37. So all of this is equal to 3.37.

One thing that we can do, since we have all these 500s in the denominator right over here, let's multiply both sides of this equation by 500. If we do that, the left-hand side becomes, well, 500 times A over 500 is just going to be A, plus 110 times 2, so that's going to be 220. Plus 95 times 3, that's going to be 15 less than 300, so it's going to be plus 285, and then 70 times 4 is 280, so plus 280. 75 times 5 is going to be 350 plus 25, 375, so plus 375, plus 6B. Let me make sure I'm not skipping any steps here. Plus 6B is going to be equal to this times 500, and 3.37 times 500 is equal to 1,685.

All I did to go from this step right over here, where I set up the expected value of one roll, which we already know to be 3.37, is I just multiplied both sides of this equation by 500. I just did this times 500, and I did this times 500, and this 500 obviously cancels with all of these, and then 500 times 3.37 is 1,685, and so I got this right over here. Now, I got 1, 2, 3, 4, 5, 6. Yup, I have the right number of terms. I just want to make sure I'm not making a careless mistake.
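Written out, the equation described in this part of the transcript and the step of multiplying both sides by 500 look as follows, with A and B standing for the missing counts of 1s and 6s as in the video:

$$
\frac{A}{500}\cdot 1 + \frac{110}{500}\cdot 2 + \frac{95}{500}\cdot 3 + \frac{70}{500}\cdot 4 + \frac{75}{500}\cdot 5 + \frac{B}{500}\cdot 6 = 3.37
$$

$$
A + 220 + 285 + 280 + 375 + 6B = 1685
$$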
If we want to simplify this, we can subtract 220, 285, 280, and 375 from both sides. If we did that, on the left-hand side we're just going to get A plus 6B. And on the right-hand side, we are going to get, let's get our calculator out, 1,685 minus 220 minus 285 minus 280 minus 375 gets us to 525. So we get A plus 6B is equal to 525.

You say, "OK, you did all that work, but we still have one equation with 2 unknowns. How do we figure out what A and B actually are?" We know something else, and this is actually much easier to figure out: we know that the sum of this whole table right over here, A plus 110 plus 95 plus 70 plus 75 plus B, is equal to 500. Let me write that down. So we know that A plus 110 plus 95 plus 70 plus 75 plus B needs to be equal to 500. Or we could subtract 110 plus 95 plus 70 plus 75 from both sides and get, if you subtract it from the left-hand side, you're just left with A plus B, and on the right-hand side, if we start with 500, so 500 minus 110 minus 95 minus 70 minus 75 gets us to 150. So A plus B must be equal to 150.

Now we have a system of 2 equations and 2 unknowns, and so we know how to solve those. We could do it by substitution or we could subtract the second equation from the first, so let's do that. Let's subtract the left-hand side of this equation from that or, essentially, we could multiply this one times a negative 1 and then add these 2 equations. The As are going to cancel out, and we are going to be left with 6B minus B is 5B, is equal to 375. Did I do that right? If I add 125 to this, I get to 500, then another 25, I get to 525. So 5B is equal to 375, or if we divide both sides by 5, we get B is equal to 75. This right over here is equal to 75.

If B is equal to 75, what is A? We know that A plus B is equal to 150.
We figured that out a little while ago, before we multiplied both sides of this by negative 1. B is now 75, so we could say A plus 75 is equal to 150. Subtract 75 from both sides, and you get that A is also equal to 75. And we are done.
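The whole procedure from the transcript can be sketched as a small Python function. The name `solve_missing_counts` and its parameters are illustrative. It assumes the two missing faces are 1 and 6, as in Jamie's table, and takes the expected sum as a string so that `Fraction` reads 67.4 exactly.

```python
from fractions import Fraction

def solve_missing_counts(known, total_rolls, expected_sum, n_rolls=20):
    """Recover the counts A (face 1) and B (face 6) by elimination."""
    per_roll = Fraction(expected_sum) / n_rolls      # 67.4 / 20 = 3.37
    weighted_total = per_roll * total_rolls          # 3.37 * 500 = 1685
    # Subtract the known face * count terms to get A + 6B.
    weighted_rest = weighted_total - sum(face * n for face, n in known.items())
    # Subtract the known counts from the total to get A + B.
    count_rest = total_rolls - sum(known.values())
    b = (weighted_rest - count_rest) / 5             # (A + 6B) - (A + B) = 5B
    a = count_rest - b
    return a, b

known = {2: 110, 3: 95, 4: 70, 5: 75}
a, b = solve_missing_counts(known, 500, "67.4")
print(a, b)  # 75 75
```

Passing the expected sum as a string is a design choice in this sketch: `Fraction(67.4)` would capture the binary floating-point approximation rather than the decimal value.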