
# Means and medians of different distributions

Sal compares the mean and median based on a few different distributions. Created by Sal Khan.

## Want to join the conversation?

- I don't understand: in the following exercise, "Interpreting and comparing data distributions", the questions are about standard deviation, but Sal hasn't mentioned it yet.(16 votes)
- Why does the exercise right after this video have questions about interquartile range and greatest deviations? I don't even know what these notions mean.(11 votes)
- They misplaced the exercise, it happens from time to time... Next sections cover those topics.(5 votes)

- If the population is college graduates, Michael Jordan shouldn't have been included (he turned pro before graduating)(5 votes)
- The question is specifically asking about the median income of "geology majors" not "graduates of UNC with a degree in geology".(6 votes)

- Doesn't an outlier affect the median somewhat, because it is one of the numbers used to locate the median? The way this is worded, it seems like outliers don't affect the median, only the mean. Am I understanding this correctly?(3 votes)
- Outliers do affect the median, for the reason you say. However, there need to be *many* (relative to the size of the dataset) extreme values before the median changes.

Say we have a dataset: 1, 2, 3, 4, 5

The median is 3. If we changed the 5 to 500, the median would still be 3 (but the mean would not be!). If we changed the 4 and 5 to 400 and 500, the median would still be 3.

So for the median, it's not so much the size of the outlier, but rather how many there are. For the mean, even a single outlier can have a big effect if it is extreme enough.(7 votes)
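The example in this reply can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative only; the three datasets are the ones described above.

```python
from statistics import mean, median

original = [1, 2, 3, 4, 5]
one_outlier = [1, 2, 3, 4, 500]      # the 5 replaced by 500
two_outliers = [1, 2, 3, 400, 500]   # the 4 and 5 replaced by 400 and 500

for data in (original, one_outlier, two_outliers):
    # The median stays at the middle value; the mean follows the extreme values.
    print(data, "median:", median(data), "mean:", mean(data))
```

In all three cases the median is 3, while the mean moves from 3 to 102 to 181.2.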

- (scratches head) where did he get the data?(4 votes)
- You recorded the time in seconds it took for 8 participants to solve a puzzle. These times appear below. However, when the data was entered into the statistical program, the score that was supposed to be 22.1 was entered as 21.2. You had calculated the following measures of central tendency: the mean, the median, and the mean trimmed 25%. Which of these measures of central tendency will change when you correct the recording error?(3 votes)
- It depends on the position of the data value. If it is in the "middle", then the median will change, but otherwise it would remain the same. Likewise, the mean trimmed 25% would only change if the incorrectly entered value was not a truncated / trimmed one. The mean would change.(2 votes)
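A rough sketch of this reasoning follows. The question does not show the eight recorded times, so the data below is hypothetical, and `trimmed_mean` is a helper written for this illustration. It assumes "trimmed 25%" means dropping 25% of the sorted values from each end, which is one common convention but not the only one.

```python
from statistics import mean, median

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the sorted values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each end
    return mean(ordered[k:len(ordered) - k])

# Hypothetical puzzle times in seconds (the real data is not given).
entered = [15.2, 18.8, 19.3, 19.6, 20.3, 21.2, 23.5, 29.0]
# Correct the recording error: 21.2 should have been 22.1.
corrected = [22.1 if t == 21.2 else t for t in entered]

measures = [
    ("mean", mean),
    ("median", median),
    ("25% trimmed mean", lambda v: trimmed_mean(v, 0.25)),
]
for name, stat in measures:
    print(name, stat(entered), "->", stat(corrected))
```

With this made-up data the corrected value is neither one of the two middle values nor among the trimmed ones, so the mean and the trimmed mean change while the median does not. A different position for the corrected value could give a different outcome, as the reply says.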

- What's mean and what's median?(2 votes)
- The mean is the average of all the numbers [sum of all numbers / the amount of numbers in the set] while the median is the middle number in a set that is arranged from smallest to largest. For example, you have

2, 3, 5, 6, 7, 8, 9

The mean would be (2 + 3 + 5 + 6 + 7 + 8 + 9) / 7 = 40 / 7 ≈ **5.71**, while the median would be **6** because it is the middle number.

If the amount of numbers in the sequence is even, take the mean of the middle two numbers. For example, you have

2, 5, 6, 7, 8, 9

The median would be (6 + 7) / 2 = **6.5**

Hope this helps!😀(3 votes)
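As a quick check of the two example lists, here is a small sketch using Python's standard `statistics` module (the variable names are illustrative). Note that the whole sum is divided by the count; dividing only the last term by 7 would give about 32.3, which is not the mean.

```python
from statistics import mean, median

odd_count = [2, 3, 5, 6, 7, 8, 9]
even_count = [2, 5, 6, 7, 8, 9]

print(sum(odd_count) / len(odd_count))  # (2 + 3 + 5 + 6 + 7 + 8 + 9) / 7
print(mean(odd_count))                  # same value via the standard library
print(median(odd_count))                # middle value of the sorted list
print(median(even_count))               # mean of the two middle values, 6 and 7
```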

- I need a quick refresher, what is a median again?(2 votes)
- If you organise all of the data in a data set in ascending or descending order, you will have two cases:

Odd number of data points: the middle value will be your median

Even number of data points: You have to take the average of your two middle terms to find the median(2 votes)
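The two cases can be written out by hand as a minimal sketch. `median_by_hand` is a name chosen for this illustration, the sample numbers are made up, and the function assumes a non-empty list.

```python
def median_by_hand(values):
    """Median of a non-empty list, following the odd/even rule."""
    ordered = sorted(values)          # ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_by_hand([7, 2, 9]))     # odd number of data points
print(median_by_hand([4, 1, 3, 2]))  # even number of data points
```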

- what is a standard deviation?(2 votes)
- Standard deviation is a measure of how far a typical value in the set is from the average. The smaller the standard deviation, the more closely grouped the data points are. The standard deviation of {1,2,3} would be less than the standard deviation of {0,4,7,10}.(2 votes)
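The comparison of the two sets can be checked with a short sketch. It uses `statistics.pstdev`, the population standard deviation, which is an assumption about which variant is meant; the sample version (`statistics.stdev`) gives different numbers but the same ordering for these two sets.

```python
from statistics import pstdev

tight = [1, 2, 3]        # values close to their average of 2
spread = [0, 4, 7, 10]   # values further from their average of 5.25

print(pstdev(tight))
print(pstdev(spread))
```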

- I ran an experiment, and I think the third answer is not necessarily true.

Data:0,0,0,52,52,56,56,62,63,64,65,66,66,67,67,67,67,68,74,75,77,78,79,79,82,82,83,87,88,91,92,92,95,95,99

Median: 68

Mean: 68.17142857

Am I missing something?(2 votes)
- I don't think you are. Unless you count mode and range.(2 votes)
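The figures in this comment can be reproduced with a short sketch using Python's standard `statistics` module. The list is copied from the comment; the name `scores` is only illustrative.

```python
from statistics import mean, median

scores = [0, 0, 0, 52, 52, 56, 56, 62, 63, 64, 65, 66, 66, 67, 67, 67, 67,
          68, 74, 75, 77, 78, 79, 79, 82, 82, 83, 87, 88, 91, 92, 92, 95, 95, 99]

print(len(scores))     # number of data points
print(median(scores))  # middle value of the sorted data
print(mean(scores))    # sum of the data divided by the count
```

For this dataset the mean comes out slightly above the median even though three values sit at 0.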

## Video transcript

Voiceover: "For a senior project, Richard is researching how much money a college graduate can expect to earn based on his or her major. He finds the following interesting facts: Basketball superstar Michael Jordan was a geology major at the University of North Carolina. There were only three civil engineering majors from the University of Montana. They all took the exact same job at the same company, earning the same salary. Of the 35 finance majors from Wesleyan University, 32 got high-paying consulting jobs, and the other 3 were unemployed. For geology majors from the University of North Carolina the median income will likely be ..." and we have some options here, "less than, equal to or greater than the mean."

And then we have to answer the same questions ... "For civil engineering majors from Montana, the median income ..." Well actually these are both about median. "The median income will be ..." and we compare it against the mean. And then, "For finance majors from Wesleyan ..." We're going to compare the median income to the mean.

So to visualize this a little bit more, I've copied and pasted this exact same problem onto my scratchpad, so here it is, I can now write on this. So, let's think about each of these.

"For geology majors from UNC, the median income will likely be ..." How will that compare to the mean? Well, what do they tell us about UNC? They tell us that Michael
Jordan was a geology major at the University of North Carolina. So what will the distribution of salaries probably look like? So if we're thinking about the University of North Carolina, probably will look something like this. And I'm going to do a very rough distribution right over here and let's say that this would be a salary of 0, and let's say that is a salary of, I don't know. Let me put a salary of 50K here, I'll do this in thousands. Let's say this is 100 thousand right over here. And then you have Michael Jordan who is, actually I'll do a little gap here because he's so far up, I don't know exactly what he was making but it was definitely in the tens of millions of dollars a year, so Michael Jordan is way, way, way, way up here.

So if you were to make a histogram or a plot of all of the salaries we could say, okay, well, you know, maybe we have, if you put all of the folks from geology majors at University of North Carolina... well there's probably, especially right when they graduated, there's probably, you know, 1, 2, 3, I could keep doing it. A bunch of people maybe making 50K, maybe some people making a little bit more, maybe some people up here, maybe some people there. Some people there, some people there. Right there, maybe someone's making 100K. Maybe it's a couple of people up there, maybe someone isn't making anything, maybe they weren't able to find a job. And then of course you have Michael Jordan up here making, you know, 10 million dollars or 20 million dollars or something like that.

So when you have a situation like this where you have this outlier of Michael Jordan it's going to put, one way I think about it, it kind of tugs on the mean. It wouldn't affect the median because remember the median is the middle value, so it doesn't matter how high this number is, you can make this a trillion dollars, it's not going to change what the middle value is. The middle value is still going to be the same middle value, you can move this anywhere around in this range, it's not going to change the median. But the mean will change, if this becomes really, really astronomically high it will distort the actual mean here, actually could distort it a good bit.

So for geology majors from UNC the median income is going to be lower than the mean because Michael Jordan is pulling the mean up. So let me fill that in. "So for geology majors
from UNC, the median will be less than the mean."

Now let's think about the other ones, "For civil engineering majors from Montana, the median income will be" blank "the mean." Well, they tell us there were only three civil engineering majors from the University of Montana. They all took the exact same job at the same company earning the same salary. So let's say all 3 of them earn 50 thousand dollars. Let's say that's their salary, so you have, if you were to calculate the mean it would be 50 + 50 + 50 over 3, which of course is 50. That would be the mean. If you wanted the median you list the salaries in order and then you take the middle one, well the middle one is 50, so in this case the median is equal to the mean. So let's fill that in, median is equal to the mean.

And then finally, wait let me go back to my scratchpad. Whoops, let me go back to my scratchpad here. "For finance majors from Wesleyan, the median income will be," blank "the mean." So let's think about
this distribution here. So here we have 35, out of the 35, 32 got high-paying consulting jobs, so let's say that they were making six figures, so the distribution might look something like this: this is 0 and this is, let's say this is 50K, and let's say that this right over here is 100 thousand dollars a year. So 32 got high-paying consulting jobs, so you might have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32. So the distribution for the people who got the jobs might look something like that, but there were 3 people who are unemployed, so let's say they got no income, so you have 1, 2, 3.

So this is now, you have 3 outliers like the Michael Jordan situation, but instead of them being very high they are very low, so they're going to pull the mean lower. They're not going to, if these were 0 or these were 50, or these were over here, they're not going to affect the median. The middle number is still going to be the same. But they are going to pull down the mean. So here I would say that the median income will likely be higher, will likely be greater than the mean. Because the mean is going to get pulled down by these outliers, these three people not making anything.

So let's fill that out. "For finance majors from Wesleyan, the median income will likely be greater than the mean." Now let's check our answer, and we got it right.
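The three comparisons from the video can be sketched with made-up numbers. Only the civil engineering salary of 50 thousand and the 32/3 split for finance come from the transcript; the individual geology salaries, the 20 million figure for the outlier, and the helper name `compare_median_to_mean` are assumptions for illustration.

```python
from statistics import mean, median

def compare_median_to_mean(salaries):
    """Return 'less than', 'equal to', or 'greater than' for median vs. mean."""
    med, avg = median(salaries), mean(salaries)
    if med < avg:
        return "less than"
    if med > avg:
        return "greater than"
    return "equal to"

# Illustrative salaries in thousands of dollars per year.
geology = [0, 40, 45, 50, 50, 50, 55, 60, 100, 20_000]  # one extreme high earner
civil_engineering = [50, 50, 50]                        # three identical salaries
finance = [100] * 32 + [0] * 3                          # 32 employed, 3 unemployed

for major, salaries in [("geology", geology),
                        ("civil engineering", civil_engineering),
                        ("finance", finance)]:
    print(f"{major}: the median is {compare_median_to_mean(salaries)} the mean")
```

With these numbers the high outlier pulls the geology mean above the median, the identical civil engineering salaries make the two equal, and the three zero incomes pull the finance mean below the median.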