Outliers in scatter plots
Learn what an outlier is and how to find one!
What are outliers in scatter plots?
Scatter plots often have a pattern. We call a data point an outlier if it doesn't fit the pattern.
A scatter plot shows backpack weight in kilograms on the y-axis versus student weight in kilograms on the x-axis. 5 points rise diagonally in a narrow pattern between (40, 4) and (76, 12.5). A point labeled Sharon is above the pattern. A point labeled Brad is below the pattern. All points are estimated.
Consider the scatter plot above, which shows data for students on a backpacking trip. (Each point represents a student.)
Notice how two of the points don't fit the pattern very well. These points have been labeled Brad and Sharon, which are the names of the students they represent.
Sharon could be considered an outlier because she is carrying a much heavier backpack than the pattern predicts.
Brad could be considered an outlier because he is carrying a much lighter backpack than the pattern predicts.
Key idea: There is no special rule that tells us whether or not a point is an outlier in a scatter plot. When doing more advanced statistics, it may become helpful to invent a precise definition of "outlier", but we don't need that yet.
To fully wrap our minds around why certain data points might be considered outliers, let's try a couple of practice problems.
Problem 1: Computer shopping
A scatterplot plots Quality rating on the y-axis, versus Price in dollars on the x-axis. 12 points rise diagonally in a relatively narrow pattern of points between (125, 60) and (175, 80). A point labeled A is at the beginning of the pattern. A point labeled B is above the pattern. A point labeled C is at the end of the pattern. A point labeled D is below the pattern to the right. All points are estimated.
Michelle was researching different computers to buy for college. She looked up the prices and quality ratings for a sample of computers. Her data is shown in the scatter plot to the right, where each point is a computer.
Michelle wants to buy a computer whose quality rating is far higher than the pattern would predict based on its price.
Which of the labeled points represents a computer that Michelle wants to buy?
Problem 2: Test scores
Some high school students in the U.S. take a test called the SAT before applying to colleges. The scatter plot to the right shows what percent of each state's college-bound graduates took the SAT in a given year, along with that state's average score on the math section.
The three labeled points could be considered outliers.
Why might these points be considered outliers?
Want to join the conversation?
- Can there be more than one outlier?(39 votes)
- Of course! However, if they are next to each other you might have to question if they really are outliers.(67 votes)
- Are there any mathematical tests to determine whether or not a data point is an outlier? For instance, in a paper I am writing I am graphing market capitalization against population growth. How would I mathematically (with a formula) determine that a data point (e.g. Market Capitalization, Annual Population Growth (perhaps it is 4336.2, 2110)) is an outlier?(23 votes)
- Yes. If you have the first and third quartiles (Q1 and Q3), or can calculate them, find the interquartile range (IQR, which is Q3 minus Q1) and multiply it by 1.5. Subtract that product from Q1 and add it to Q3. Anything below the lower value or above the upper value is commonly treated as an outlier. Note that this rule checks one variable at a time.(14 votes)
- Why? Why does this have to be a thing?(16 votes)
- It's just part of math! The only way we can find parts of data, etc.(9 votes)
- I still don't get it. What if there are many outliers? How am I supposed to plot them?(7 votes)
- You plot all of the outliers the same way no matter how many there are.(10 votes)
- Do scatter plots need to have an outlier?(5 votes)
- A scatter plot shows points that usually don't fall exactly on a line but are scattered around it. It can have exceptions, or outliers, where a point is quite far from the general trend. But no, it does not need to have an outlier to be a scatter plot.(7 votes)
- Is there an easier way to do this?(5 votes)
- It is not complicated at all. You can tell a point is an outlier because it is much higher or much lower than the pattern of the other points would predict.(8 votes)
- How do you find the outlier in a set of data?(5 votes)
- You just look for points that are far away from most of the other points.(8 votes)
- Can there be a scatter plot where there is no trend line but it still means something?(4 votes)
- Yes. If the points are so jumbled that the plot shows no association between the two variables, a trend line usually isn't appropriate. The plot still tells you something: the two variables don't appear to be related.
Have a Great Day!(3 votes)
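One of the answers above mentions the 1.5 × IQR rule: values more than 1.5 times the interquartile range below the first quartile or above the third quartile get flagged. Here is a minimal Python sketch of that rule, not an official definition of "outlier". The function names `iqr_fences` and `find_outliers` and the backpack weights are made up for illustration, and the quartiles come from the standard library's `statistics.quantiles` with the "inclusive" method (other quartile conventions can give slightly different cutoffs). The rule looks at one variable at a time, so it does not by itself measure how far a point sits from the pattern in a scatter plot.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) cutoffs Q1 - k*IQR and Q3 + k*IQR."""
    # quantiles(..., n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values that fall outside the cutoffs."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]


# Hypothetical backpack weights in kilograms (made-up numbers).
weights = [4, 5, 6, 7, 8, 9, 10, 30]
print(iqr_fences(weights))     # (0.5, 14.5)
print(find_outliers(weights))  # [30]
```

The multiplier `k` is left as a parameter because 1.5 is a common convention rather than a law. As the article says, no special rule decides what counts as an outlier.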