# Describing scatterplots (form, direction, strength, outliers)

When we look at a scatterplot, we should be able to describe the association we see between the variables.
A quick description of the association in a scatterplot should always include a description of the form, direction, and strength of the association, along with the presence of any outliers.
Form: Is the association linear or nonlinear?
Direction: Is the association positive or negative?
Strength: Does the association appear to be strong, moderately strong, or weak?
Outliers: Do there appear to be any data points that are unusually far away from the general pattern?
It's also important to include the context of the two variables in the description of these features. Here's an example.

## Example

Let's describe this scatterplot, which shows the relationship between the age of drivers and the number of car accidents per $100$ drivers in the year $2009$.
Here's a possible description that mentions the form, direction, strength, and the presence of outliers—and mentions the context of the two variables:
"This scatterplot shows a strong, negative, linear association between age of drivers and number of accidents. There don't appear to be any outliers in the data."
Notice that the description mentions the form (linear), the direction (negative), the strength (strong), and the lack of outliers. It also mentions the context of the two variables in question (age of drivers and number of accidents).
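Direction and strength can also be checked numerically. The sketch below is one possible approach and is not part of this lesson: it computes the Pearson correlation coefficient $r$, whose sign gives the direction and whose size (closer to $1$ in absolute value) suggests the strength of a linear association. The data points are made up for illustration and are not the values from the scatterplot above, the helper names `pearson_r` and `describe` are hypothetical, and the cutoffs of $0.8$ and $0.5$ are assumed rules of thumb rather than official thresholds. Form and outliers still have to be judged by looking at the plot, since $r$ only measures linear association.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe(xs, ys, strong=0.8, moderate=0.5):
    """Return (r, direction, strength) using assumed rule-of-thumb cutoffs."""
    r = pearson_r(xs, ys)
    direction = "positive" if r > 0 else "negative"
    if abs(r) >= strong:
        strength = "strong"
    elif abs(r) >= moderate:
        strength = "moderately strong"
    else:
        strength = "weak"
    return r, direction, strength


# Made-up illustrative data (NOT the data behind the scatterplot above):
# driver age and accidents per 100 drivers.
ages = [20, 30, 40, 50, 60, 70]
accidents = [16, 14, 11, 9, 7, 4]

r, direction, strength = describe(ages, accidents)
print(f"r = {r:.3f}: a {strength}, {direction} association")
```

For this made-up data, the result agrees with the kind of description given in the example: a strong, negative association between age of drivers and number of accidents.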

## Practice

Problem 1
Choose the scatterplot that best fits this description:
"There is a strong, positive, linear association between the two variables."

Problem 2
Choose the scatterplot that best fits this description:
"There is a moderately strong, negative, linear association between the two variables with a few potential outliers."

Problem 3
Choose the scatterplot that best fits this description:
"There is a strong, negative, nonlinear association between the two variables."

## Want to join the conversation?

• In Problem #3, illustrations A and B, you show something we see in economics quite a bit. In economics, we're always interested in identifying "effects" that take place between variables. However, sometimes one effect drops off and then a new effect takes over. I call this phenomenon a "split" effect.

For example, in the Laffer curve, we at first see the government raise more tax revenue as tax rates increase because they collect more money from citizens. Simple enough. However, after a certain tax rate is reached, we start to see a new effect take place wherein the tax revenue drops off as the tax rate is increased further. This is because at very high rates of taxation, people either lose interest in working, or they start to seek ways of hiding their income from the government. Thus, we often see two or more different effects express themselves through a full range of data.

While I have always used the term "split" effect to describe this phenomenon, I have not been able to find it acknowledged or identified (by any particular term) among economists or mathematicians. Mathematicians seem to simply call these scenarios "non-linear" or "curvilinear" relationships, without seeming to notice that there are invariably two distinct relationships being identified by the data.

Am I mistaken? Do mathematicians acknowledge split effects? If so, what term do mathematicians use to describe this type of phenomenon?
(41 votes)
• Mathematicians probably include your "split effect" in the category of nonlinear association.
(4 votes)
• Aren't there too many outliers in Problem 2?
(6 votes)
• What do you mean? If you mean in general, there aren't a lot of outliers. There are only two, and both are in answer C. Was that a statement or a question?
(4 votes)
• How is it possible to tell whether the correlation is strong or moderately strong?
(5 votes)
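• A strong association means the points follow the overall pattern closely. In simple words, the dots on the graph stay close to the line or curve they form; the more scattered they are around it, the weaker the association.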
(3 votes)
• How many points have to be off course for a graph to be "moderately" negative or positive?
(4 votes)
• It's not about the number of points but the direction: if the points trend downward, the association is negative, and if they trend upward, it is positive.
(3 votes)
• No questions, I understand.
(3 votes)
• I just really needed work for the corona break.
(3 votes)
• Why has this world lost its mind?
(3 votes)
• connections between proportional relationships, lines, and linear equations
(2 votes)
• A proportional relationship is of the form 𝑦 = 𝑘𝑥,
which is also the equation of a straight line that goes through the origin.

An example of a proportional relationship would be the expected total from rolling a six-sided die 𝑥 times.
Given that the die is fair, the average outcome per roll is 𝑘 = 3.5.
Thus the expected total after 𝑥 rolls is 𝑦 = 3.5𝑥.

If we ran a bunch of simulations where 𝑥 varies from, say, 1 to 20
and we plotted the results, we would most likely get something that resembles the line 𝑦 = 3.5𝑥.
(3 votes)
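The simulation described in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `simulate_total` is hypothetical, each value of 𝑥 is simulated once, and the fixed seed is there only to make the run repeatable. Individual totals will scatter around 3.5𝑥 rather than land on it exactly.

```python
import random


def simulate_total(num_rolls, rng):
    """Sum of num_rolls rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(num_rolls))


# Fixed seed (an arbitrary choice) so the run is repeatable.
rng = random.Random(0)

# One simulated total for each x from 1 to 20.
results = [(x, simulate_total(x, rng)) for x in range(1, 21)]

# Compare each simulated total with the expected value 3.5 * x.
for x, total in results:
    print(f"x = {x:2d}  simulated total = {total:3d}  expected = {3.5 * x:5.1f}")
```

Plotting the simulated totals against 𝑥 should give a scatterplot with a positive, roughly linear form centered on the line 𝑦 = 3.5𝑥.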
• why is it strong negative
(2 votes)
• Negative does not necessarily mean that the points are spread out. Negative just means that the trend line/points are going downwards. Positive is upwards. Positive/negative is the direction of the line, not the strength.
Hope this helps ;)
(2 votes)
• Would there be an outlier on Problem 3?
(2 votes)
• In plot C, which I would say shows a weak, positive, linear association, the point in the top-left corner is a potential outlier.

Plots A and B don't appear to have any outliers.
(2 votes)