© 2024 Khan Academy

# Variance and standard deviation of a discrete random variable

This lesson shows how to calculate the variance and standard deviation of a discrete random variable. It recaps the random variable and its expected value (mean), then measures spread with the variance and standard deviation, working through a practical example.

## Want to join the conversation?

- Why didn't we divide by n or (n - 1)?(18 votes)
- That would be the standard deviation of a **dataset**. Dividing by (n - 1) is done to correct bias when estimating from a sample. Sal explains the reason in this video: https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/modal/v/another-simulation-giving-evidence-that-n-1-gives-us-an-unbiased-estimate-of-variance

Moreover, the standard deviation represents the typical dispersion of a set of values. Thus, if relative frequencies (probabilities) are used as weights to calculate the expected value, the same weights are also used to calculate the standard deviation.(14 votes)
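To make the dataset-versus-distribution point concrete, here is a minimal sketch. It assumes a hypothetical dataset of 20 weeks (the `weeks` list is invented for illustration) whose relative frequencies match the workout distribution from the video. Dividing by n on that dataset gives the same number as the probability-weighted formula, while dividing by n - 1 gives a slightly larger sample estimate.

```python
from statistics import pvariance, variance

# Hypothetical dataset of 20 weeks (an assumption for illustration) whose
# relative frequencies match the video's distribution: 0.1, 0.15, 0.4, 0.25, 0.1.
weeks = [0] * 2 + [1] * 3 + [2] * 8 + [3] * 5 + [4] * 2

# Probability-weighted variance of the random variable itself.
distribution = {0: 0.10, 1: 0.15, 2: 0.40, 3: 0.25, 4: 0.10}
mu = sum(x * p for x, p in distribution.items())
weighted_var = sum((x - mu) ** 2 * p for x, p in distribution.items())

population_var = pvariance(weeks)  # divides by n
sample_var = variance(weeks)       # divides by n - 1

print(round(weighted_var, 4), round(population_var, 4), round(sample_var, 4))
```

The first two values agree because each probability already plays the role of "count divided by n".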

- I thought the mean was the sum divided by the number of values. How come the mean of a discrete random variable is only the sum of x·p(x), instead of that sum divided by the number of values of x? I hope that makes sense.(6 votes)
- The expected value is Σ x·p(x) by definition. To see what this implies, suppose there are three numbers, say 1, 5, and 10, and all three are equally likely to occur:

then the expected value is (1+5+10)/3 = 16/3 = 5.33...

If the probabilities of the values differ, then you have to use Σ x·p(x). Now say 1 occurs with probability 0.5, 5 with probability 0.3, and 10 with probability 0.2. Then the expected value is 0.5(1) + 0.3(5) + 0.2(10) = 4.0.

Note that mean and expected value are the same thing. We are just extending the concept of the mean (in other words, the expected value) to probability mass functions where the outcomes do not all have the same chance of occurring.(16 votes)
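A short sketch that recomputes both cases from this answer. The helper name `expected_value` is made up for illustration. With equal probabilities the weighted sum reduces to the ordinary average, 16/3, and with probabilities 0.5, 0.3, and 0.2 the sum 0.5(1) + 0.3(5) + 0.2(10) comes to 4.0.

```python
def expected_value(outcomes):
    """Return the probability-weighted sum of (value, probability) pairs."""
    return sum(x * p for x, p in outcomes)

# Equally likely outcomes: each value has probability 1/3.
equal = [(1, 1 / 3), (5, 1 / 3), (10, 1 / 3)]

# Unequal probabilities taken from the answer above.
weighted = [(1, 0.5), (5, 0.3), (10, 0.2)]

print(round(expected_value(equal), 2))     # ordinary average, 16/3
print(round(expected_value(weighted), 2))  # weighted mean
```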

- What is the difference between variance and standard deviation? Why is standard deviation the square root of the variance? Thanks!(6 votes)
- The variance is an indicator of the dispersion, but it is hard to interpret directly because it is measured in squared units (for instance, how would you interpret a variance of 1.19 for one random variable in comparison with a variance of 2.34 for another r.v.?).

The standard deviation puts the dispersion back on the same scale as the values themselves. For a large number of samples from an approximately normal distribution, if your std is 1.09 and your mean is 2.1, you can say that about 68% of your values are expected to be between 2.1 - 1.09 and 2.1 + 1.09 (mean ± 1 std), for instance.

Basically (and quite naively), std is a way to standardize the value given by the variance.(11 votes)

- At 6:22, how is the standard deviation supposed to be intuitive? Given the way it is calculated, I think the standard deviation is similar to the mean of the deviations. But how is it intuitively reasonable?(10 votes)
- Just a quick question: why do we have to multiply by P(X) again when calculating the standard deviation? This was already done when calculating E(X), so the mean 2.1 should be weighted already, no?(5 votes)
- I think a good way to understand 'weight' is with the concept of frequency. With the random variable X, 0 occurs 10% of the time. So, if there were 100 data points/experiments, we would expect about 10 of them to be zero. Of course, we do not have the number of data points, but we do have the frequency with which 0 will probably occur.

Long story short, we cannot ignore how often a data point shows up, otherwise we are ignoring a big portion of the data.

Forgive me if that was confusing. If you need clarification, let me know.(2 votes)
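One way to write out the frequency argument, as a sketch: imagine N repetitions (N is a hypothetical count, not something given in the video) in which each value $x_i$ shows up about $N \cdot P(x_i)$ times. Averaging the squared deviations over those N repetitions gives

$$
\frac{1}{N}\sum_i \bigl(N \cdot P(x_i)\bigr)\,(x_i - \mu)^2 \;=\; \sum_i (x_i - \mu)^2 \cdot P(x_i)
$$

so the N cancels. The weighting inside μ only locates the center. Each squared deviation still has to be weighted by how often its outcome occurs.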

- So basically, the formula for finding the variance of a discrete random variable is as follows.

X = random variable

P(X) = probability of the random variable

Σ = sum

σ^2 = variance

µ = mean

$\sigma^2 = \sum (X - \mu)^2 \cdot P(X)$

*The variance is equal to the **sum of the squared differences between each X and the mean** (µ), where each squared difference is multiplied by P(X), that X's probability.*

To find the standard deviation (σ), we simply take the square root of both sides (usually after finding the variance):

$\sqrt{\sigma^2} = \sqrt{\sum (X - \mu)^2 \cdot P(X)}$

*Feel free to correct me if I have made any mistake here, since I'm just another learner as well and can't be 100% right.*(4 votes)
- What I love about Sal is that he explains the concept behind every equation or method so we don't have to just memorize it.

But he didn't do so this time with the VAR equation. Can anyone explain it?(3 votes)
- Can we apply z-scores to discrete random variables?(2 votes)
- In general, when calculating the variance and standard deviation, we divide by N or n - 1, respectively. Why does this not apply in the case of discrete random variables?(2 votes)
- Can we apply the concept of z-score to discrete random variables?

For example, if X=4, the z-score is (4-2.1)/1.09 ≈ 1.74

So X=4 is about 1.74 standard deviations away from the expected value.

Is my understanding correct?(1 vote)
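A quick check of the z-score arithmetic, as a sketch. A z-score divides by the standard deviation (√1.19 ≈ 1.09 in the video), not by the variance (1.19). The function name `z_score` is chosen here for illustration.

```python
import math

mu = 2.1                     # expected value from the video
variance = 1.19              # variance from the video
sigma = math.sqrt(variance)  # standard deviation, about 1.09

def z_score(x, mean, std):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std

z = z_score(4, mu, sigma)
print(round(z, 2))
```

The z-score itself is well defined for a discrete random variable. What does not carry over automatically is reading normal-curve percentages from it.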

## Video transcript

- [Instructor] In a previous video, we defined this random variable x. It's a discrete random variable. It can only take on a finite number of values, and I defined it as the number of workouts I might do in a week. And we calculated the expected value of our random variable x, which we could also denote as the mean of x, and we use the Greek letter mu, which we use for population mean. And all we did is, it's the probability-weighted sum of the various outcomes. And we got for this random variable with this probability distribution, we got an expected value or a mean of 2.1.

What we're gonna do now is extend this idea to measuring spread. And so we're going to think about what is the variance of this random variable, and then we could take the square root of that to find what is the standard deviation. The way we are going to do this has parallels with the way that we've calculated variance in the past.

So the variance of our random variable x, what we're going to do is take the difference between each outcome and the mean, square that difference, and then we're gonna multiply it by the probability of that outcome. So for example for this first data point, you're going to have zero minus 2.1 squared times the probability of getting zero, times 0.1. Then you're going to get plus one minus 2.1 squared times the probability that you get one, times 0.15. Then you're going to get plus two minus 2.1 squared times the probability that you get a two, times 0.4. Then you have plus three minus 2.1 squared times 0.25. And then last but not least you have plus four minus 2.1 squared times 0.1. So once again, the difference between each outcome and the mean, we square it and we multiply times the probability of that outcome.

So this is going to be negative 2.1 squared, which is just 2.1 squared, so I'll just write this as 2.1 squared, times 0.1. That's the first term. And then we're going to have plus, one minus 2.1 is negative 1.1, and then we're going to square that, so that's just going to be the same thing as 1.1 squared, which is 1.21 but I'll just write it out, 1.1 squared times 0.15. And then this is going to be, two minus 2.1 is negative 0.1. When you square it, it is going to be equal to 0.01: if you have negative 0.1 times negative 0.1, it's 0.01. So plus 0.01 times 0.4. And then plus, this is going to be 0.9 squared, so that is 0.81 times 0.25. And then we're almost there. This is going to be plus 1.9 squared times 0.1. And we get 1.19. So this is all going to be equal to 1.19.

And if we wanna get the standard deviation for this random variable, we would denote that with the Greek letter sigma. The standard deviation for the random variable x is going to be equal to the square root of the variance. Square root of 1.19, which is equal to, just get the calculator back here, so we are just going to take the square root of what we just, let's type it again, 1.19. And that gives us, so it's approximately 1.09.

So let's see if this makes sense. Let me put this all on a number line right over here. So you have the outcomes zero, one, two, three, and four. So you have a 10% chance of getting a zero. So I will draw that like this, let's just say this is a height of 10%. You have a 15% chance of getting one, so that would be 1 1/2 times higher. So it would look something like this. You have a 40% chance of getting a two. That's going to be like this. You have a 25% chance of getting a three. Like this. And then you have a 10% chance of getting a four. So like that. So this is a visualization of this discrete probability distribution where I didn't draw the vertical axis here, but this would be 0.1, this would be 0.15, this would be 0.25, and that is 0.4.

And then we see that the mean is at 2.1, which makes sense. Even though this random variable only takes on integer values, you can have a mean that takes on a non-integer value. And then the standard deviation is 1.09. So 1.09 above the mean is going to get us close to 3.2, and 1.09 below the mean is gonna get us close to one. And so this all at least intuitively feels reasonable. This mean does seem to be indicative of the central tendency of this distribution. And the standard deviation does seem to be a decent measure of the spread.
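The calculation in the transcript can be reproduced with a few lines of Python. This is only a sketch: the dictionary layout and the variable names are choices made here for illustration, while the outcomes and probabilities are the ones read out in the video.

```python
import math

# Workout distribution from the transcript: outcome -> probability.
distribution = {0: 0.10, 1: 0.15, 2: 0.40, 3: 0.25, 4: 0.10}

# A valid discrete distribution has probabilities that sum to 1.
assert math.isclose(sum(distribution.values()), 1.0)

# Mean: probability-weighted sum of the outcomes.
mean = sum(x * p for x, p in distribution.items())

# Variance: probability-weighted sum of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(f"mean={mean:.2f}, variance={variance:.2f}, std={std_dev:.2f}")
# prints: mean=2.10, variance=1.19, std=1.09

# One standard deviation on either side of the mean, as on the number line.
print(f"{mean - std_dev:.2f} to {mean + std_dev:.2f}")
# prints: 1.01 to 3.19
```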