© 2023 Khan Academy
Population standard deviation
The population standard deviation is a measure of how much variation there is among individual data points in a population. It's a way of quantifying how spread out the data is from its mean. A small standard deviation means that the data points are generally close to the mean, while a large standard deviation means that the data is more dispersed. Created by Sal Khan.
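For reference, the description can be written in symbols. This is a sketch using the conventional notation for a population of N values x_1, ..., x_N (the symbols N and x_i are assumed here; the video itself uses mu for the mean and sigma for the standard deviation):

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}
```

A small sigma means the x_i sit close to mu; a large sigma means they are more dispersed.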
Want to join the conversation?
- Isn't the dividing part wrong?
I learned it should not be 5 in this case, but it should be 4 which is n-1.(4 votes)
- You divide by "n-1" when dealing with the sample standard deviation. In this video Sal is calculating the standard deviation of the population, which is why he is dividing by "N".(10 votes)
- How could the concept of variance be useful in real life?(7 votes)
- So in this example the standard deviation is 0.562 meters, does that mean that the 5.5 meters of the original data set is a bit of an outlier since it's not within the standard deviation of the mean?(6 votes)
- What does Population standard deviation mean??(2 votes)
- The standard deviation of the population. Most if not all the values that we quantify in the field of Statistics - things like the mean (average), or median, or standard deviation, etc - can be thought of in two ways:
1. What is the value of the quantity considering only our sample data? This is what we call a "statistic".
2. What would be the value of the quantity if we were able to get data on the whole population, meaning every possible data point? This is what we call a "parameter".
So there are statistics and parameters. We use a statistic to estimate (make an educated guess at) the value of a parameter. The population standard deviation is simply referencing the population parameter, rather than the sample statistic. Sometimes (often) the value of the parameter is unknown or even unknowable, but we can still think of it in theory.(6 votes)
- Sal's question makes sense: why don't we take the absolute value instead of raising it to the second power?(3 votes)
- Is 'var' the short form of variance?(2 votes)
- It depends on the context, I've seen it used for both. If it seems to be representing a single number or a function, then it's probably variance. If it seems to reference several characteristics (e.g. height, weight, eye color, etc), then it probably means variables.(4 votes)
- So both variance and standard deviation are used to measure the level of dispersion. What's the difference when you need to pick one to solve a real-world problem?(3 votes)
- It's true that they both are used to measure the level of dispersion, but the difference is that the SD is expressed in the same units as the data, while the variance is in squared units. So the SD reads as a typical distance from the mean and is usually easier to interpret. Variance is the step just before you get the SD.(3 votes)
- With n = 13 and p = 0.5, find P(at least 10).(3 votes)
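- Here p = 0.5 is the probability of success on a single trial, not the answer. Assuming this is a binomial setting with 13 trials, P(at least 10) = (286 + 78 + 13 + 1) / 8192 = 378/8192, which is about 0.046.(1 vote)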
- At 1:35, when I used my calculator, I got a different answer: it said 18.6 instead of 4.6. I am quite puzzled because I have repeated the calculation and I still get the same wrong answer, 18.6.(1 vote)
- Are you sure you've input the decimals properly? The answer is 4.6. The numbers are 4.0, 4.2, 5.0, 4.3 and 5.5.(4 votes)
- Hello, I'm wondering how you can do this when the middle number is a repeating decimal. Mine is 233/6, which = 38.8333333333333333333333333333333 and keeps going. How do I solve this?(1 vote)
- You can convert it into a fraction, which is 38 5/6.(4 votes)
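Two of the questions above can be checked numerically. The sketch below uses Python's standard `statistics` module to compare dividing by N (population) with dividing by n-1 (sample) on the five car lengths from the video. It also shows one plausible explanation for the 18.6 reading: this is an assumption about how the calculator was used, namely that only the last number was divided by 5 because the sum was not completed (or parenthesized) first. The variable names are illustrative.

```python
import statistics

# The five car lengths (in meters) used in the video.
lengths = [4.0, 4.2, 5.0, 4.3, 5.5]

# Population standard deviation: squared distances are averaged over N = 5.
population_sd = statistics.pstdev(lengths)

# Sample standard deviation: the same sum is divided by n - 1 = 4 instead.
sample_sd = statistics.stdev(lengths)

print(f"population SD (divide by N):   {population_sd:.3f}")  # 0.562
print(f"sample SD (divide by n - 1):   {sample_sd:.3f}")      # 0.628

# Mean with the sum completed before dividing.
correct_mean = (4 + 4.2 + 5 + 4.3 + 5.5) / 5

# Without parentheses, only 5.5 is divided by 5 (assumed cause of "18.6").
slipped_mean = 4 + 4.2 + 5 + 4.3 + 5.5 / 5

print(f"sum first, then divide:        {correct_mean:.1f}")   # 4.6
print(f"dividing only the last term:   {slipped_mean:.1f}")   # 18.6
```

On a calculator, pressing equals after the sum (or wrapping the sum in parentheses) before dividing by 5 avoids the second result.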
Video transcript
Let's say that you're curious about studying the dimensions of the cars that happen to sit in the parking lot. And so you measure their lengths. Let's just make the computation simple. Let's say that there are five cars in the parking lot. The entire size of the population that we care about is 5. And you go and measure their lengths: one car is 4 meters long, another car is 4.2 meters long, another car is 5 meters long, the fourth car is 4.3 meters long, and then, let's say the fifth car is 5.5 meters long.

So let's come up with some parameters for this population. So the first one that you might want to figure out is a measure of central tendency. And probably the most popular one is the arithmetic mean. So let's calculate that first. So we're going to do that for the population. So we're going to use mu. So what is the arithmetic mean here? Well, we just have to add all of these data points up and divide by 5. And I'll just get the calculator out just so it's a little bit quicker. This is going to be 4 plus 4.2 plus 5 plus 4.3 plus 5.5. And then, I'm going to take that sum and then divide by 5. And I get an arithmetic mean for my population of 4.6. So that's fine. And if we want to put some units there, it's 4.6 meters.

Now, that's the central tendency, or measure of central tendency. We also might be curious about how dispersed the data is, especially from that central tendency. So what would we use? Well, we already have a tool at our disposal: the population variance. And the population variance is one of many ways of measuring dispersion. It has some very neat properties the way we've defined it, as the mean of the squared distances from the mean. It tends to be a useful way of doing it.

So let's actually calculate the population variance for this population right over here. Well, all we need to do is find the distance from each of these points to our mean right over here. And then, square them. And then, take the mean of those squared distances. So let's do that. So it's going to be (4 minus 4.6) squared plus (4.2 minus 4.6) squared plus (5 minus 4.6) squared plus (4.3 minus 4.6) squared. And then, finally, I'm running out of space, plus (5.5 minus 4.6) squared. And then, we're going to divide all of that by 5 to get our population variance.

And so what's that going to give us? Let's get our calculator out. (4 minus 4.6) squared. That's negative 0.6 squared. Negative 0.6 squared is going to be the exact same thing as 0.6 squared. So let me write that as 0.6 squared. Plus, 4.2 minus 4.6 is negative 0.4. But when we square it, the negative's going to disappear. So I'll just write plus 0.4 squared. And then, we have 5 minus 4.6. That's 0.4, so plus 0.4 squared. 4.3 minus 4.6. It's negative 0.3. The negative goes away when you square it. It's going to be plus 0.3 squared. And then, finally, 5.5 minus 4.6 is going to be 0.9. So plus 0.9 squared. Then, we will divide by the number of data points we have. And we get 0.316. Or if we want to write it, this is going to be 0.316.

Now, let me ask you what is a mildly interesting question: what would be the units for this population variance? Since we happen to care about units in this video. Well, up here, this is 4 meters minus 4.6 meters. 4.2 meters minus 4.6 meters. So these are all meters. These are measurements in meters. We saw it up here. So these are all measurements in meters. When you subtract them, you'll get meters. But then when you square them, you get meters squared plus meters squared plus meters squared plus meters squared plus meters squared. And then, you're just dividing that by a unitless count of the number of data points you have. So the units here are going to be square meters.

And so you might say, hey, that's kind of a weird unit if we're trying to visualize or think about how dispersed we are from the mean. When I visualize it, I visualize dispersion, or how varied they are, in terms of meters, not meters squared. So what could we do? And a big hint: this comes out of just even the notation for variance. And it's this sigma symbol squared. So why don't we just take the square root of our variance? Which we will denote with just a sigma. It makes a lot of sense. And in this case, what's it going to be? It's going to be the square root of 0.316. And then, what are the units going to be? It's going to be just meters. And we end up with, so let me take the square root of 0.316. And I get 0.56, I'll just round to the nearest thousandth, 0.562. So this is approximately 0.562 meters.

So you might be saying, Sal, what do we call this thing that we just did? The square root of the variance. And here we're dealing with the population. We haven't thought about sampling yet. The square root of the population variance, what do we call this thing right over here? And this is a very familiar term. Oftentimes, when you take an exam, this is calculated for the scores on the exam. This is our population, let me do this in a new color. I'm using that yellow a little bit too much. This is the population standard deviation. It is a measure of how much the data is varying from the mean. In general, the larger this value, the more varied the data is from the population mean. The smaller, the less varied.

And these are all somewhat arbitrary definitions of how we've defined variance. We could have taken things to the fourth power. We could have done other things. Instead of taking them to a power, we could have taken the absolute value here. The reason why we do it this way is it has neat statistical properties as we try to build on it. But that's the population standard deviation, which gives us nice units: meters. In the next video, we'll think about the sample standard deviation.
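The calculation in the transcript can be retraced in a few lines of Python. This is a minimal sketch that follows the same three steps (mean, mean of squared distances, square root). The function names are made up for illustration and are not from the video.

```python
import math

# Car lengths in meters, as measured in the transcript.
lengths = [4.0, 4.2, 5.0, 4.3, 5.5]


def population_mean(values):
    """Add up all the data points and divide by how many there are."""
    return sum(values) / len(values)


def population_variance(values):
    """Mean of the squared distances from the population mean."""
    mu = population_mean(values)
    squared_distances = [(x - mu) ** 2 for x in values]
    return sum(squared_distances) / len(values)


def population_std_dev(values):
    """Square root of the population variance, back in the data's units."""
    return math.sqrt(population_variance(values))


mu = population_mean(lengths)
variance = population_variance(lengths)
sigma = population_std_dev(lengths)

print(f"mean     = {mu:.1f} m")          # mean     = 4.6 m
print(f"variance = {variance:.3f} m^2")  # variance = 0.316 m^2
print(f"std dev  = {sigma:.3f} m")       # std dev  = 0.562 m
```

Dividing by `len(values)` is what makes these population quantities. The sample versions discussed in the next video divide by one less than that.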