

Lesson 6: Binomial mean and standard deviation formulas

Variance of a binomial variable

We can derive the variance of a binomial variable to be np(1 − p), where n is the number of trials and p is the probability of success, and the standard deviation is the square root of the variance.
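A quick numerical sanity check of this formula (a Python sketch; the parameters n = 10, p = 0.3 and the repetition count are illustrative choices, not from the video):

```python
import random

random.seed(0)

n, p = 10, 0.3   # number of trials and success probability (illustrative)
reps = 100_000   # number of simulated binomial draws

# Each draw: count the successes in n Bernoulli(p) trials
draws = [sum(1 for _ in range(n) if random.random() < p) for _ in range(reps)]

mean = sum(draws) / reps
var = sum((x - mean) ** 2 for x in draws) / reps

print(mean)  # close to n*p       = 3.0
print(var)   # close to n*p*(1-p) = 2.1
```

With enough repetitions the sample variance settles near np(1 − p), matching the formula derived in the video.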


• I think 'X = sum Y = nY' misleads people. Since Y is a random variable, the sum of Y's can't equal n·Y (n is a constant).

To me it reads like Var(X) = Var(sum Y) = Var(nY), which is an incorrect idea: Var(sum Y) = sum Var(Y) while Var(nY) = n^2·Var(Y), so they can't be equal.

It would be clearer to write 'X = sum Y' and 'Var(X) = Var(sum Y)'.
Thanks, and sorry for my bad English.
• Yes, Sal's terminology is a bit sloppy...

It would be clearer that 𝑋 is the sum of independent instances of 𝑌 if he had said
𝑋 = 𝑌(1) + 𝑌(2) + 𝑌(3) + ... + 𝑌(𝑛) = ∑𝑌(𝑖)

Then,
𝐸(𝑋) = 𝐸(∑(𝑌(𝑖))) = ∑𝐸(𝑌(𝑖)) = 𝑛 ∙ 𝐸(𝑌)

Similarly,
Var(𝑋) = 𝑛 ∙ Var(𝑌)
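This linearity can be checked exactly (no simulation) by enumerating every outcome of n independent Bernoulli trials. A small Python sketch, with n = 4 and p = 0.3 chosen purely for illustration:

```python
from itertools import product

n, p = 4, 0.3  # illustrative values

ex = 0.0   # accumulates E(X)
ex2 = 0.0  # accumulates E(X^2)

# Enumerate all 2^n outcomes of n independent Bernoulli(p) trials
for outcome in product([0, 1], repeat=n):
    prob = 1.0
    for y in outcome:
        prob *= p if y == 1 else 1 - p
    x = sum(outcome)          # X = Y(1) + ... + Y(n)
    ex += prob * x
    ex2 += prob * x * x

var = ex2 - ex ** 2
print(ex)   # equals n * E(Y)   = n*p       = 1.2
print(var)  # equals n * Var(Y) = n*p*(1-p) = 0.84
```

The exact enumeration reproduces E(X) = n·E(Y) and Var(X) = n·Var(Y) with no sampling noise.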
• What does the variance of a binomial variable describe? Why is the variance of a binomial variable important?
• The variance of a binomial variable describes the spread or variability of the distribution around the mean (expected value). It gives us an idea of how dispersed the outcomes are from the expected number of successes. In practical terms, it helps in understanding the reliability or predictability of the outcomes. For example, a higher variance indicates more unpredictability in the number of successes across trials.
• The variance of aY is a^2·Var(Y). How does that square with the derivation?
(1 vote)
• This was exactly my question!... if X = nY, then Var(X) = Var(nY) = n^2Var(Y).... so Sal needs to explain his steps...
• Couldn't this be simplified by saying the Variance of a binomial variable is the variance of a Bernoulli distribution multiplied by n trials? The reason being the variance addition property.

At the point in the video where Sal says "indeed the variance for a binomial variable," I think he means the variance of a Bernoulli distribution?
• Yes, that simplification captures the essence of the derivation. Since a binomial variable X can be thought of as the sum of n independent Bernoulli trials (each with variance p(1 − p)), the variance of X is n times the variance of a single Bernoulli trial, thanks to the property that the variance of a sum of independent variables is the sum of their variances. And yes, at that point in the video it appears there was a slip in terminology; what's being described is indeed the variance of a single Bernoulli trial.
(1 vote)
• Between the last video and this one I am so confused. I don't even know where to start with my questions.
• Shouldn't var(nx) be n^2*var(x)?
(1 vote)
• For the variance of nX, where X is a random variable and n is a constant, it's important to distinguish between multiplying the variable itself by n (affecting the variance by n^2) and summing n independent instances of a variable. For nX, Var(nX) = n^2Var(X) because scaling a variable scales its variance by the square of that factor. However, when you sum n independent copies of X (as in a binomial distribution being the sum of n Bernoulli trials), the variance is n × Var(X), not n^2 × Var(X), because the operation is additive for independent variables, not multiplicative.
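The distinction is easy to see numerically. Below is a Python sketch (all parameters are illustrative) that estimates both variances by simulation: the sum of n independent Bernoulli draws versus n times a single draw:

```python
import random

random.seed(1)
n, p = 5, 0.4      # illustrative values
reps = 100_000

def bernoulli():
    """One Bernoulli(p) draw: 1 with probability p, else 0."""
    return 1 if random.random() < p else 0

# Sum of n *independent* Bernoulli draws (the binomial case)
sums = [sum(bernoulli() for _ in range(n)) for _ in range(reps)]

# n times a *single* Bernoulli draw (the scaled case)
scaled = [n * bernoulli() for _ in range(reps)]

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(var(sums))    # close to n   * p*(1-p) = 1.2
print(var(scaled))  # close to n^2 * p*(1-p) = 6.0
```

The two estimates differ by a factor of n, which is exactly the gap between n·Var(Y) and n²·Var(Y).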
• Would this derivation of variance = p(1−p) work if Sal started with p(0−p)^2 + (1−p)(1−p)^2? I can't seem to derive the same result if I try to calculate it this way. I guess my question is: why did Sal use p(1−p)^2 as the first term and not p(0−p)^2? Shouldn't we arrive at the same result?
(1 vote)
• The issue is which probability goes with which outcome. For a Bernoulli variable Y where Y = 1 with probability p and Y = 0 with probability 1 − p, the definition of variance weights each squared deviation by the probability of that outcome: Var(Y) = (1 − p)(0 − p)^2 + p(1 − p)^2, which expands to p^2(1 − p) + p(1 − p)^2 = p(1 − p). Writing p(0 − p)^2 + (1 − p)(1 − p)^2 attaches p to the outcome 0 and 1 − p to the outcome 1, swapping the weights, so it does not give the same result.
(1 vote)
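For concreteness, here is the definition-based calculation with the weights attached to the correct outcomes (p = 0.3 is an arbitrary illustrative value):

```python
# Variance of a Bernoulli(p) variable straight from the definition:
# Var(Y) = sum over outcomes y of P(Y = y) * (y - mu)^2
p = 0.3
mu = p  # E(Y) = 0*(1-p) + 1*p = p

# Outcome 0 has probability (1-p); outcome 1 has probability p
var_from_definition = (1 - p) * (0 - mu) ** 2 + p * (1 - mu) ** 2
var_closed_form = p * (1 - p)

print(var_from_definition)  # both equal p*(1-p) = 0.21
print(var_closed_form)
```

Swapping the two probability weights in the first expression breaks this agreement, which is why the rearranged version in the question doesn't reduce to p(1 − p).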
• X and Y are actually two different random variables, with X built from independent copies of Y.
Therefore Var(X) = Var(Y + Y + ... + Y) = n·Var(Y), where the sum has n independent terms.

Proof:
Var(X+Y) = Var(X)+Var(Y)+2Cov(X,Y)
If X and Y are independent of each other, then Cov(X,Y) = 0
• Your statement is correct regarding the variance of sums of independent variables. When X and Y are independent, Cov(X,Y) = 0, and the variance of the sum X + Y equals Var(X) + Var(Y). Extending this to the sum of n independent, identically distributed copies of Y, the variance of that sum indeed equals n × Var(Y).
(1 vote)
• Isn't it the case that when X = nY, μx = nμy and σx = nσy, according to the effect of scaling a random variable? Then why did Sal infer from E(X) = nE(Y) the conclusion Var(X) = nVar(Y)?
(1 vote)
• The key is that X is not a scaled copy of Y; it is the sum of n independent copies of Y. Scaling a single variable by n would indeed give Var(nY) = n^2·Var(Y), and hence σ = n·σy. But for a sum of n independent copies, the variances add, so Var(X) = n·Var(Y) and σx = √n·σy. Sal's 'X = nY' is shorthand for that sum, not for multiplying one copy of Y by n.