### Course: Statistics and probability > Unit 3

Lesson 5: Variance and standard deviation of a sample

# Sample standard deviation and bias

Sal shows an example of calculating standard deviation and bias. Created by Sal Khan.

## Want to join the conversation?

• Are there any other ways to obtain an unbiased standard deviation from our sample population, instead of just accepting the fact that the sample variance gives you a biased standard deviation?
• The short answer is "no"--there is no unbiased estimator of the population standard deviation (even though the sample variance is unbiased). However, for certain distributions there are correction factors that, when multiplied by the sample standard deviation, give you an unbiased estimator. Nevertheless, all of this is definitely beyond the scope of the video and, frankly, not that important in the grand scheme of things (i.e. unless you're a technical mathematician, don't worry about it). But it was a good question!
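A quick simulation makes this concrete (a sketch, not from the video; the population standard deviation of 10 and sample size of 5 are chosen for illustration). Repeatedly drawing small samples from a normal population shows that the Bessel-corrected sample variance averages out to the true variance, while its square root averages out to less than the true standard deviation:

```python
import numpy as np

# Simulate repeated sampling from a normal population: the sample variance
# (dividing by n-1) is unbiased for sigma^2, but its square root
# systematically underestimates sigma.
rng = np.random.default_rng(0)
pop_sigma = 10.0            # true population standard deviation
n, trials = 5, 200_000      # small samples, many repetitions

samples = rng.normal(0.0, pop_sigma, size=(trials, n))
sample_vars = samples.var(axis=1, ddof=1)   # Bessel-corrected variance
sample_stds = np.sqrt(sample_vars)

print(sample_vars.mean())   # close to 100 = sigma^2  (unbiased)
print(sample_stds.mean())   # noticeably below 10 = sigma  (biased low)
```

For a normal population with n = 5, the expected sample standard deviation is about 0.94·σ, which is where the correction factors mentioned above come from.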
• Sal says here "hopefully we're convinced now why we divide by n-1," but the previous video left off with "next time I'll show you further why we divide by n-1." Is there a video in between that I should be watching, or some other information? I can't help feeling quite confused, and this is not the first time in this course I've felt Sal mentioned something that wasn't explained previously.
• At , what does 'nonlinear' mean?
• Here is a function y = f(x). When you give it an input value x, you get an output value y through some operation. If this function is linear, it means that when you change x by Δx, the change in y (Δy) has a fixed ratio to Δx.
Graphically, if you plot values from the function y = f(x), they line up in a straight line.

Nonlinear functions are those for which, if you change x by Δx, Δy divided by Δx is not a fixed value. Consequently, if you plot values from such a function, you won't get a straight line. You may get a curve.
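A tiny sketch of this definition (the example functions are just illustrations): compute Δy/Δx at several points and see that it is constant only for the linear function.

```python
# Delta-y / Delta-x at a point x, for a fixed step dx
def slope(f, x, dx=1.0):
    return (f(x + dx) - f(x)) / dx

linear = lambda x: 3 * x + 2    # linear: ratio is always 3
square = lambda x: x ** 2       # nonlinear: ratio depends on x

print([slope(linear, x) for x in (0, 1, 2)])  # [3.0, 3.0, 3.0]
print([slope(square, x) for x in (0, 1, 2)])  # [1.0, 3.0, 5.0]
```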
• I didn't get it. How does the square root of the unbiased sample variance lead to a biased standard deviation? Kindly explain.
• I'm not an expert in statistics, but here's my crack at it.

An unbiased process that outputs some value means that the expected value of the process will match some actual value. Basically, as you perform the unbiased process on more and more samples the average value will approach the actual value.

But if you have a set of values whose average is some number, and you then perform a non-linear operation on them (like a square root), their new average value is NOT going to match the old average with the same non-linear operation applied to it.

For example, take the following numbers:
2, 2, 2, 2, 12
Their average is 4.
Here are their square roots:
1.41, 1.41, 1.41, 1.41, 3.46
The average of those square roots is 1.82.
But the square root of the old average value is 2.
They don't match! We've introduced some bias by performing a non-linear operation.

I imagine it's impossible to remove this bias because the magnitude and direction of the bias probably heavily depends on the population data.
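The worked numbers above can be checked in a couple of lines (same values as in the answer):

```python
import math

# The average of the square roots is not the square root of the average,
# because sqrt is a non-linear operation.
values = [2, 2, 2, 2, 12]
mean = sum(values) / len(values)                               # 4.0
mean_of_roots = sum(math.sqrt(v) for v in values) / len(values)

print(round(mean_of_roots, 2))  # 1.82
print(math.sqrt(mean))          # 2.0  -- they don't match
```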
• @ , why do we divide by 7 instead of 8? I know he says it is the unbiased sample variance, but what exactly does that mean?
• That means that he is using a better approximation for the variance of the population, given a normal distribution.
• My boy started to glitch @
• Sal.exe has stopped working
• How would I know to divide by n-1 or n? I know this question has been asked before, but I don't really see the reason. Could someone please give me a simple answer?
• The reason is explained on Wikipedia: https://en.wikipedia.org/wiki/Variance#Sample_variance.
The n-1 correction is called Bessel's correction. Even though I couldn't follow the proof, I did understand that you divide by n-1 instead of n when you have a sample and are estimating the variance of the whole population.
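The rule of thumb can be sketched with NumPy's `ddof` parameter (the data set here is just an illustration, not from the video):

```python
import numpy as np

# ddof=0 divides by n   -> use when the data IS the whole population
# ddof=1 divides by n-1 -> use when the data is a sample and you are
#                          estimating the population variance (Bessel)
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

pop_var = data.var(ddof=0)      # 32 / 8 = 4.0
sample_var = data.var(ddof=1)   # 32 / 7 ≈ 4.57

print(pop_var)
print(sample_var)
```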
• I don't think "unbiased sample variance" has been explained. In the previous video, Sal promised to explain why the result is more accurate when we subtract 1 from the denominator, but here he just assumes everybody knows why.
• Instead of squaring the difference from the mean and taking the square root of the sum, isn't it more reasonable to take the mean of the absolute value of the difference from the mean? This way we won't require squares and square roots.
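The measure suggested here does exist: it is the mean absolute deviation (MAD). It is a legitimate measure of spread, but it is not equal to the standard deviation, and the standard deviation keeps some convenient algebraic properties (for example, variances of independent variables add). A small sketch comparing the two on an illustrative data set:

```python
# Mean absolute deviation vs. (population) standard deviation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)    # 5.0

mad = sum(abs(x - mean) for x in data) / len(data)
std = (sum((x - mean) ** 2 for x in data) / len(data)) ** 0.5

print(mad)   # 1.5
print(std)   # 2.0  -- the two measures generally differ
```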
• In statistics

- Biased: the mean from samples is different from / far away from the population mean. In other words, the mean from samples cannot represent the population mean.

- Unbiased: the mean from samples is close to / the same as the population mean. In other words, the mean from samples can represent the population mean.

Is this correct? If so, then wouldn't bias be a measurement error?