Probability distributions from empirical data

We can approximate a probability distribution by using empirical or observed data. Created by Sal Khan.

Video transcript

- [Instructor] We're told that Jada owns a restaurant where customers can make their orders using an app. She decides to offer a discount on appetizers to attract more customers, and she's curious about the probability that a customer orders a large number of appetizers. Jada tracked how many appetizers were in each of the past 500 orders. All right, so this is the number of appetizers: 40 out of the 500 ordered zero appetizers, and, for example, 120 out of the 500 ordered three appetizers, and so on and so forth. Let X represent the number of appetizers in a random order. Based on these results, construct an approximate probability distribution of X. Pause this video and see if you can have a go at this before we do this together.

All right, so they're asking us for an approximate probability distribution, because we don't know the actual probability. We can't get into people's minds and figure out the probability that the neurons fire in exactly the right way to order appetizers. But what we can do is look at past results, the empirical data right over here, to approximate the distribution. So what we can do is look at the last 500 orders, and for each of the outcomes think about what fraction of the last 500 had that outcome. And that will be our approximation.

And so the outcomes here: we could have zero appetizers, one, two, three, four, five, or six. Now, the approximate probability of zero appetizers is going to be 40 over 500, which is the same thing as four over 50, which is the same thing as two over 25. So I'll write two twenty-fifths right over there. The probability of one appetizer, well, that's going to be 90 over 500, which is the same thing as nine over 50. And I think that's already in lowest terms. Then, for two appetizers, 160 over 500 is the same thing as 16 over 50, which is the same thing as eight over 25. And we just keep going. For three, 120 out of 500 is the same thing as 12 out of 50, or six out of 25. Then, for four, 50 out of 500. Well, that's one out of every 10. So I'll just write it like that. For five, 30 out of 500 is the same thing as three out of 50. I'll just write it like that. And last but not least, for six, 10 out of 500 is the same thing as one in 50.

And we're done. We have just constructed an approximate probability distribution for our random variable X.
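The arithmetic in the transcript can be checked with a short Python sketch. It is only an illustration, not part of the video: the names `counts`, `empirical_distribution`, and `dist` are made up for this example, and the counts are the ones read out above (40, 90, 160, 120, 50, 30, and 10 orders for zero through six appetizers). Each approximate probability is the count divided by the 500 orders, reduced to lowest terms.

```python
from fractions import Fraction

# Observed data from the transcript: number of appetizers -> number of orders.
# (The variable and function names here are hypothetical, chosen for illustration.)
counts = {0: 40, 1: 90, 2: 160, 3: 120, 4: 50, 5: 30, 6: 10}


def empirical_distribution(counts):
    """Approximate P(X = x) by the fraction of observed orders with outcome x."""
    total = sum(counts.values())  # 500 orders in this example
    # Fraction reduces each ratio to lowest terms automatically.
    return {x: Fraction(n, total) for x, n in counts.items()}


dist = empirical_distribution(counts)

for x, p in dist.items():
    print(f"P(X = {x}) is approximately {p} ({float(p):.2f})")

# A probability distribution must sum to 1.
assert sum(dist.values()) == 1
```

Running this should give 2/25, 9/50, 8/25, 6/25, 1/10, 3/50, and 1/50, matching the fractions worked out by hand, and the final check confirms that they add up to 1.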