Variance of a binomial variable

We can derive the variance of a binomial variable to be np(1-p), where p(1-p) is the variance of a single Bernoulli trial, and the standard deviation is the square root of the variance.

Want to join the conversation?

  • ShenHong
    I think 'X = sum Y = nY' is misleading,
    since Y is a random variable:
    the sum of n copies of Y can't be equal to n*Y (n is a constant).

    To me it reads like Var(X) = Var(sum Y) = Var(nY), which is incorrect:
    'Var(sum Y) = sum Var(Y)' and 'Var(nY) = n^2*Var(Y)', so they can't be equal.

    It would be better to make clear that 'X = sum Y' and 'Var(X) = Var(sum Y)'.
    thanks and sorry for my bad English
    (13 votes)
    • Jerry Nilsson
      Yes, Sal's terminology is a bit sloppy...

      It would be clearer that 𝑋 is the sum of independent instances of 𝑌 if he had said
      𝑋 = 𝑌(1) + 𝑌(2) + 𝑌(3) + ... + 𝑌(𝑛) = ∑𝑌(𝑖)

      Then,
      𝐸(𝑋) = 𝐸(∑(𝑌(𝑖))) = ∑𝐸(𝑌(𝑖)) = 𝑛 ∙ 𝐸(𝑌)

      Similarly,
      Var(𝑋) = 𝑛 ∙ Var(𝑌)
      (18 votes)
  • Lauren Ruth S.
    What does the variance of a binomial variable describe? Why is the variance of a binomial variable important?
    (4 votes)
    • daniella
      The variance of a binomial variable describes the spread or variability of the distribution around the mean (expected value). It gives us an idea of how dispersed the outcomes are from the expected number of successes. In practical terms, it helps in understanding the reliability or predictability of the outcomes. For example, a higher variance indicates more unpredictability in the number of successes across trials.
      (1 vote)
  • haisrilatha
    The variance of aY is a^2 Var(Y). How is that consistent with the derivation?
    (1 vote)
  • alwaysOCD
    Couldn't this be simplified by saying the Variance of a binomial variable is the variance of a Bernoulli distribution multiplied by n trials? The reason being the variance addition property.

    When Sal says "indeed the variance for a binomial variable" I think he means the variance of a Bernoulli distribution?
    (3 votes)
    • daniella
      Yes, that simplification captures the essence of the derivation. Since a binomial variable X can be thought of as the sum of n independent Bernoulli trials (each with variance p(1 − p)), the variance of X is n times the variance of a single Bernoulli trial, thanks to the property that the variance of the sum of independent variables is the sum of their variances. And, yes, at that point in the video it appears there was a confusion in terminology; what's described is indeed the variance of a single Bernoulli trial.
      (1 vote)
  • seohyeonlee2020
    Shouldn't var(nx) be n^2*var(x)?
    (1 vote)
    • daniella
      For the variance of nX, where X is a random variable and n is a constant, it's important to distinguish between multiplying the variable itself by n (affecting the variance by n^2) and summing n independent instances of a variable. For nX, Var(nX) = n^2Var(X) because scaling a variable scales its variance by the square of that factor. However, when you sum n independent copies of X (as in a binomial distribution being the sum of n Bernoulli trials), the variance is n × Var(X), not n^2 × Var(X), because the operation is additive for independent variables, not multiplicative.
      (1 vote)
  • Hawz
    From the last video in this video I am so confused. I don't even know where to start with my questions.
    (1 vote)
  • joe.wadakethalakal
    Would this derivation of the variance = p(1-p) work if Sal started by using p(0-p)^2 + (1-p)(1-p)^2? I can't seem to derive the same result if I try calculate it this way. I guess my question is why did Sal use p(1-p)^2 as the first term and not p(0-p)^2? Shouldn't we arrive at the same result?
    (1 vote)
    • daniella
      The two versions should not give the same result, because only one of them matches the definition of variance. Each squared distance from the mean has to be weighted by the probability of the outcome it belongs to. For a Bernoulli variable Y, where Y = 1 with probability p and Y = 0 with probability 1 − p, the mean is p, so the outcome 1 contributes p(1 − p)^2 and the outcome 0 contributes (1 − p)(0 − p)^2. Their sum, p(1 − p)^2 + (1 − p)p^2, factors to p(1 − p). The expression p(0 − p)^2 + (1 − p)(1 − p)^2 pairs each outcome with the other outcome's probability, so it does not compute the variance. You can also check the result with E[Y^2] − (E[Y])^2 = p − p^2 = p(1 − p).
      (1 vote)
  • xl3197
    X and Y actually are two sets of data
    Therefore Var(X)=Var(Y+Y+Y...) = nVar(Y)

    Proof:
    Var(X+Y) = Var(X)+Var(Y)+2Cov(X,Y)
    If X and Y are independent of each other, then Cov(X,Y) = 0
    (0 votes)
    • daniella
      Your statement is correct regarding the variance of sums of independent variables. When X and Y are independent, Cov(X,Y) = 0, and the variance of a sum X + Y equals Var(X) + Var(Y). Extending this to n independent, identically distributed copies of Y, the variance of their sum Y(1) + ... + Y(n) (which is not the same as nY) indeed equals n × Var(Y).
      (1 vote)
  • Howard Young
    Isn't it the case that when X = nY, μx = nμy and σx = nσy, according to the effect of scaling a random variable? Then why did Sal infer Var(X) = nVar(Y) from E(X) = nE(Y)?
    (1 vote)
  • Rabia AL
    I learned that to find the variance of a function or a random variable I can use characteristic function of it but I couldn't find the characteristic function of X^2 to solve the problem below.
    If X ∼ bin(n, p) and Y = X^2
    Find the variance of Y by using characteristic function of X.
    (0 votes)
    • daniella
      To find the variance of Y = X^2 for a binomially distributed X, start from the characteristic function of X ∼ Bin(n, p), which is ϕX(t) = (pe^(it) + (1 − p))^n, where i is the imaginary unit. The characteristic function of Y isn't obtained directly from that of X, but you don't need it: since Var(Y) = E[Y^2] − (E[Y])^2 = E[X^4] − (E[X^2])^2, you only need the second and fourth moments of X, and those come from derivatives of ϕX at zero, E[X^k] = ϕX^(k)(0) / i^k. The differentiation is tedious by hand, so this approach might not be the most efficient for this particular problem; computing E[X^2] and E[X^4] directly from the binomial probabilities gives the same result.
      (1 vote)
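
Several of the comments above ask why the derivation gives n × Var(Y) rather than n^2 × Var(Y). The sketch below is one way to check the difference numerically. It is an illustration, not part of the video: the helper names (`variance`, `pmf_sum_of_trials`, `pmf_scaled_trial`) and the values n = 5, p = 3/10 are assumptions chosen for the example. It builds the exact distribution of a sum of n independent Bernoulli trials and of a single trial scaled by n, then compares their variances using exact fractions.

```python
from fractions import Fraction
from itertools import product


def variance(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(v * p for v, p in pmf.items())
    return sum(p * (v - mean) ** 2 for v, p in pmf.items())


def pmf_sum_of_trials(n, p):
    """Exact pmf of Y(1) + ... + Y(n) for n independent Bernoulli(p) trials."""
    pmf = {}
    # Enumerate every sequence of n outcomes (0 = failure, 1 = success).
    for outcome in product((0, 1), repeat=n):
        successes = sum(outcome)
        prob = p ** successes * (1 - p) ** (n - successes)
        pmf[successes] = pmf.get(successes, 0) + prob
    return pmf


def pmf_scaled_trial(n, p):
    """Exact pmf of n * Y: one Bernoulli(p) trial whose result is scaled by n."""
    return {0: 1 - p, n: p}


# Hypothetical example values, not taken from the video.
n, p = 5, Fraction(3, 10)
var_y = p * (1 - p)

var_sum = variance(pmf_sum_of_trials(n, p))    # sum of n independent trials
var_scaled = variance(pmf_scaled_trial(n, p))  # a single trial times n

print(var_sum, n * var_y)          # both are n * p * (1 - p)
print(var_scaled, n ** 2 * var_y)  # both are n^2 * p * (1 - p)
```

With these values the sum has variance 21/20 while the scaled single trial has variance 21/4, which is why writing X = nY is misleading: X is the sum of n separate trials, not one trial multiplied by n.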

Video transcript

- [Instructor] What we're going to do in this video is continue our journey trying to understand what the expected value and the variance of a binomial variable are going to be, or what the expected value and the variance of a binomial distribution are going to be, which is just the distribution of a binomial variable. And so, like in the last video, I have this binomial variable X that's defined in a very general sense. It's the number of successes from N trials, so it's a finite number of trials, where the probability of success is equal to P, so the probability is constant across the trials, and these are independent trials, so the probability of success in one trial is not dependent on what happened in the other trials. And in that previous video, where we talked about the expected value of this binomial variable, we said, hey, this binomial variable can be viewed as the sum of N of what you could really consider to be Bernoulli variables. So, this random variable Y, the probability that it's equal to one, which you could view as a success, is equal to P. The probability that it's a failure, that Y is equal to zero, is one minus P. So whether Y is one or zero is really whether we had a success or not in each of these trials, so if you add up N independent Ys, one per trial, then you are going to get X, and we used that information to figure out what the expected value of X is going to be, because the expected value of Y is pretty straightforward to directly compute. The expected value of Y is just the probability-weighted outcomes. So, it's P times one plus one minus P times zero. 
This whole term's gonna be zero, and so the expected value of Y is really just P. And so, if you wanted the expected value of X, well, let me just write it over here, this is all review, we know from our expected value properties that it's going to be equal to the sum of the expected values of these N Ys, or you could say it is N times the expected value of Y. The expected value of Y is P, so this is going to be equal to N times P. Now, we're gonna use the same idea to figure out what the variance of X is going to be equal to, because we know from our variance properties, and you can't do this with standard deviation but you can do it with variance, and then once you figure out the variance, you just take the square root for the standard deviation, that since these trials are independent, the variance of X is similarly going to be the sum of the variances of these N Ys. So, it's gonna be N times the variance of Y. So, this all boils down to: what is the variance of Y going to be equal to? So, let me scroll over a little bit, get a little bit more real estate, and I will figure that out right over here. Alright, so we wanna figure out the variance of Y, so the variance of Y is going to be equal to what? Well, here it's going to be the probability-weighted squared distances from the expected value. So, we have a probability of P where, what is going to be our squared distance from the expected value? 
Well, we're going to get a one with a probability of P, so in that case our distance from the mean, or from the expected value, we're at one, the expected value we already know is equal to P, so that distance is one minus P, and we square it. So that's that for that possible outcome, the squared distance times its probability weight. And then we have, actually let me scroll over, well, I'll just do it right over here, plus we have a probability of one minus P for the other possible outcome, so in that outcome we are at zero, and the difference between zero and our expected value? Well, that's just going to be zero minus P, and once again we are going to square that quantity. And so, this is the expression for the variance of Y and we can simplify it a little bit. So, this is all going to be equal to P times one minus P squared, and then zero minus P squared is just going to be P squared, so plus P squared times one minus P. And let's see, we can factor out a P times one minus P, so what are we going to be left with? So, if we factor out a P times one minus P here, we're just going to be left with a one minus P, and if we factor out a P times one minus P here, we're just going to have a plus P. The P and the minus P cancel out, so this whole thing is just a one. So, you're left with P times one minus P, which is indeed the variance for a Bernoulli variable. We actually proved that in other videos. I guess it doesn't hurt to see it again, but there you have it. We know what the variance of Y is. It is P times one minus P, and the variance of X is just N times the variance of Y, so there we go, we deserve a little bit of a drum roll, the variance of X is equal to N times P times one minus P. 
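
Written out, the spoken derivation can be followed step by step as below. This is a reconstruction from the transcript, not what appears on screen in the video; lowercase n and p stand for the N and P spoken above, and Y is a single Bernoulli trial with expected value p.

```latex
\begin{aligned}
\operatorname{Var}(Y) &= p\,(1-p)^2 + (1-p)\,(0-p)^2 \\
                      &= p\,(1-p)^2 + p^2\,(1-p) \\
                      &= p\,(1-p)\,\bigl[(1-p) + p\bigr] \\
                      &= p\,(1-p), \\
\operatorname{Var}(X) &= n\,\operatorname{Var}(Y) = n\,p\,(1-p).
\end{aligned}
```

The last line relies on X being the sum of n independent copies of Y, so the variances add.
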
So, if we were to take the concrete example of the last video, where I take 10 free throws, so each trial is a shot, is a free throw, and my probability of success is 0.3, I have a 30% free throw percentage, then if X is the number of free throws I make after these 10 shots, my variance will be 10 times 0.3 times one minus 0.3, so 0.7. And so, that would be what? This would be equal to 10 times 0.3 times 0.7, which is 10 times 0.21, so my variance in this situation is going to be equal to 2.1. And if I wanted to figure out the standard deviation of this right over here, I would just take the square root of this expression right over here.
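
The free throw example can be reproduced with a few lines of Python. This is a minimal sketch: the function name `binomial_variance` and the variable names are assumptions made for illustration, and the only inputs taken from the transcript are the 10 shots and the 0.3 probability of success.

```python
import math


def binomial_variance(n, p):
    """Variance of a binomial variable: n trials, success probability p."""
    return n * p * (1 - p)


n_shots = 10   # number of free throws (trials), from the transcript
p_make = 0.3   # probability of making each shot, from the transcript

var_x = binomial_variance(n_shots, p_make)
sd_x = math.sqrt(var_x)  # standard deviation is the square root of the variance

print(round(var_x, 4), round(sd_x, 4))  # prints: 2.1 1.4491
```

So the standard deviation left as "the square root of this expression" works out to roughly 1.45 made free throws.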