Means and medians of different distributions
Sal compares the mean and median based on a few different distributions. Created by Sal Khan.
Want to join the conversation?
- I don't understand: in the following exercises, "Interpreting and comparing data distributions", the questions are about standard deviation, but Sal hasn't mentioned it yet.(16 votes)
- Why are there in the exercise right after this video questions about interquartile range and greatest deviations? I don't even know what these notions mean(10 votes)
- They misplaced the exercise, it happens from time to time... Next sections cover those topics.(5 votes)
- If the population is college graduates, Michael Jordan shouldn't have been included (he turned pro before graduating)(5 votes)
- The question is specifically asking about the median income of "geology majors" not "graduates of UNC with a degree in geology".(6 votes)
- Doesn't an outlier affect the median somewhat, since it is one of the numbers used to locate the median? The way this is worded, it seems like outliers don't affect the median, only the mean. Am I understanding this correctly?(3 votes)
- Outliers do affect the median, for the reason you say. However, there need to be many extreme values (relative to the size of the dataset) before the median changes.
Say we have a dataset: 1, 2, 3, 4, 5
The median is 3. If we changed the 5 to 500, the median would still be 3 (but the mean would not be!). If we changed the 4 and 5 to 400 and 500, the median would still be 3.
So for the median, it's not so much the size of the outlier, but rather how many there are. For the mean, even a single outlier can have a big effect if it is extreme enough.(7 votes)
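The arithmetic in this answer can be checked with a short Python sketch using the standard `statistics` module. The first three lists come from the answer above; `three_outliers` is an added, hypothetical case showing the point at which the median finally moves.

```python
from statistics import mean, median

# Datasets from the answer above, plus one extra case (three_outliers)
# added here for illustration.
original = [1, 2, 3, 4, 5]
one_outlier = [1, 2, 3, 4, 500]
two_outliers = [1, 2, 3, 400, 500]
three_outliers = [1, 2, 300, 400, 500]  # hypothetical extension

for data in (original, one_outlier, two_outliers, three_outliers):
    # The median only depends on the middle value; the mean uses every value.
    print(data, "median =", median(data), "mean =", mean(data))
```

With five values, the median stays at 3 until the middle value itself is replaced, while the mean moves as soon as a single value changes.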
- You recorded the time in seconds it took for 8 participants to solve a puzzle. These times appear below. However, when the data was entered into the statistical program, the score that was supposed to be 22.1 was entered as 21.2. You had calculated the following measures of central tendency: the mean, the median, and the mean trimmed 25%. Which of these measures of central tendency will change when you correct the recording error?(3 votes)
- It depends on the position of the data value. If it is in the "middle", then the median will change, but otherwise it would remain the same. Likewise, the mean trimmed 25% would only change if the incorrectly entered value was not a truncated / trimmed one. The mean would change.(2 votes)
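A rough sketch of this reasoning in Python follows. The puzzle times are hypothetical, because the original data set is not shown in the question, and "trimmed 25%" is assumed here to mean dropping 25% of the sorted values from each end. The helper names `trimmed_mean` and `changed_measures` are made up for this illustration.

```python
from statistics import mean, median

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the sorted values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each tail
    return mean(ordered[k:len(ordered) - k])

def changed_measures(times, wrong, right):
    """Names of the measures that change when `wrong` is corrected to `right`."""
    corrected = list(times)
    corrected[corrected.index(wrong)] = right
    measures = {
        "mean": mean,
        "median": median,
        "trimmed mean": lambda v: trimmed_mean(v, 0.25),
    }
    return {name for name, f in measures.items() if f(times) != f(corrected)}

# Hypothetical times in seconds for 8 participants (not the real data).
error_in_middle = [14.3, 18.0, 19.5, 21.2, 23.4, 25.0, 27.8, 40.1]
error_at_edge = [21.2, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0]

print(changed_measures(error_in_middle, 21.2, 22.1))
print(changed_measures(error_at_edge, 21.2, 22.1))
```

In the first list the mistyped value is one of the two middle values, so all three measures change. In the second it is the smallest value, so it is trimmed away and only the mean changes.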
- (scratches head) where did he get the data?(3 votes)
- I did some experiments and I think the third answer is not a must.
Am I missing something?(2 votes)
- I don't think you are. Unless you count mode and range.(2 votes)
- I watched almost all the videos and I still don't understand.(2 votes)
- It is okay, just try to wrap your mind around it. It will come to you, so don't worry and don't stress about it.(1 vote)
- Is there a video on ogives?(2 votes)
- Is there an easier way to present this, so that it is shorter and a little bit clearer?(2 votes)
Voiceover: "For a senior project, Richard is researching how much money a college graduate can expect to earn based on his or her major. He finds the following interesting facts: Basketball superstar Michael Jordan was a geology major at the University of North Carolina. There were only three civil engineering majors from the University of Montana. They all took the exact same job at the same company, earning the same salary. Of the 35 finance majors from Wesleyan University, 32 got high-paying consulting jobs, and the other 3 were unemployed. For geology majors from the University of North Carolina, the median income will likely be ..." and we have some options here, "less than, equal to or greater than the mean."
And then we have to answer the same questions ... "For civil engineering majors from Montana, the median income ..." Well, actually these are both about the median. "The median income will be ..." and we compare it against the mean. And then, "For finance majors from Wesleyan ..." We're going to compare the median income to the mean. So to visualize this a little bit more, I've copied and pasted this exact same problem onto my scratchpad, so here it is, I can now write on this.
So, let's think about each of these. "For geology majors from UNC, the median income will likely be ..." How will that compare to the mean? Well, what do they tell us about UNC? They tell us that Michael Jordan was a geology major at the University of North Carolina. So what will the distribution of salaries probably look like? If we're thinking about the University of North Carolina, it will probably look something like this. I'm going to do a very rough distribution right over here, and let's say that this would be a salary of 0, and let's say that is a salary of, I don't know, let me put a salary of 50K here, I'll do this in thousands. Let's say this is 100 thousand right over here.
And then you have Michael Jordan who is, actually I'll do a little gap here because he's so far up. I don't know exactly what he was making, but it was definitely in the tens of millions of dollars a year, so Michael Jordan is way, way, way, way up here. So if you were to make a histogram or a plot of all of the salaries, we could say, okay, if you put all of the geology majors at the University of North Carolina ... well, especially right when they graduated, there's probably, you know, 1, 2, 3, I could keep doing it. A bunch of people maybe making 50K, maybe some people making a little bit more, maybe some people up here, maybe some people there. Some people there, some people there. Right there, maybe someone's making 100K. Maybe it's a couple of people up there. Maybe someone isn't making anything, maybe they weren't able to find a job. And then of course you have Michael Jordan up here making, you know, 10 million dollars or 20 million dollars or something like that.
So when you have a situation like this, where you have this outlier of Michael Jordan, one way I think about it is that it kind of tugs on the mean. It wouldn't affect the median, because remember, the median is the middle value, so it doesn't matter how high this number is. You can make this a trillion dollars and it's not going to change what the middle value is. The middle value is still going to be the same middle value. You can move this anywhere around in this range and it's not going to change the median. But the mean will change. If this becomes really, really astronomically high, it will distort the actual mean here, and it could distort it a good bit. So for geology majors from UNC, the median income is going to be lower than the mean, because Michael Jordan is pulling the mean up. So let me fill that in. "So for geology majors from UNC, the median will be less than the mean."
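To make the "tugging" concrete, here is a minimal Python sketch. The classmates' salaries are hypothetical numbers in thousands of dollars, loosely following the rough sketch in the video, and `summarize` is a helper name invented for this example.

```python
from statistics import mean, median

# Hypothetical salaries (in thousands of dollars) for the other geology majors.
classmates = [0, 45, 50, 50, 55, 60, 100]

def summarize(outlier):
    """Return (median, mean) of the class once the outlier salary is included."""
    salaries = classmates + [outlier]
    return median(salaries), mean(salaries)

# Try the outlier at 10 million, 20 million, and a trillion dollars.
for outlier in (10_000, 20_000, 1_000_000_000):
    med, avg = summarize(outlier)
    print(f"outlier={outlier}: median={med}, mean={avg}")
```

However large the outlier becomes, the median stays at 52.5 while the mean keeps climbing, so the median is less than the mean.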
Now let's think about the other ones. "For civil engineering majors from Montana, the median income will be" blank "the mean." Well, they tell us there were only three civil engineering majors from the University of Montana. They all took the exact same job at the same company, earning the same salary. So let's say all 3 of them earn 50 thousand dollars. Let's say that's their salary. If you were to calculate the mean, it would be 50 + 50 + 50 over 3, which of course is 50. That would be the mean. If you wanted the median, you list the salaries in order and then you take the middle one. Well, the middle one is 50, so in this case the median is equal to the mean. So let's fill that in: the median is equal to the mean.
And then finally, let me go back to my scratchpad here. "For finance majors from Wesleyan, the median income will be" blank "the mean." So let's think about this distribution here. Out of the 35, 32 got high-paying consulting jobs, so let's say that they were making six figures. The distribution might look something like this: this is 0, and let's say this is 50K, and let's say that this right over here is 100 thousand dollars a year. So 32 got high-paying consulting jobs, so you might have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32. So the distribution for the people who got the jobs might look something like that, but there were 3 people who were unemployed, so let's say they got no income, so you have 1, 2, 3. So now you have 3 outliers, like the Michael Jordan situation, but instead of being very high they are very low, so they're going to pull the mean lower. Whether these were 0 or 50 or somewhere over here, they're not going to affect the median. The middle number is still going to be the same. But they are going to pull down the mean.
So here I would say that the median income will likely be greater than the mean, because the mean is going to get pulled down by these outliers, these three people not making anything. So let's fill that out. "For finance majors from Wesleyan, the median income will likely be greater than the mean." Now let's check our answer, and we got it right.
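The civil engineering and finance cases can be checked the same way. In this sketch the 50 thousand salary comes from the video, the 100 thousand consulting salary is an assumed stand-in for "six figures", and `median_vs_mean` is a hypothetical helper.

```python
from statistics import mean, median

def median_vs_mean(incomes):
    """Describe how the median compares to the mean for a list of incomes."""
    med, avg = median(incomes), mean(incomes)
    if med < avg:
        return "less than"
    if med > avg:
        return "greater than"
    return "equal to"

# Incomes in thousands of dollars.
civil_engineering = [50, 50, 50]   # three identical salaries
finance = [100] * 32 + [0] * 3     # 32 consultants (assumed 100K) and 3 unemployed

print("Civil engineering: median is", median_vs_mean(civil_engineering), "the mean")
print("Finance: median is", median_vs_mean(finance), "the mean")
```

For the finance majors the median is 100 while the mean is 3200 / 35, roughly 91.4, so the three zero incomes pull the mean below the median.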