### Course: 8th grade>Unit 7

Lesson 1: Introduction to scatter plots

# Constructing a scatter plot

Does the time an exam happens affect the scores? Aubrey gathered info. Then she made a scatter plot with the time of day on the bottom (on the horizontal axis) and the scores on the side (on the vertical axis). She didn't find a clear pattern. Created by Sal Khan.

## Want to join the conversation?

• Why do you have to put the score on the y-axis? Couldn't you put it on x? Could someone explain it for me?
(18 votes)
• By convention, the x-axis shows the independent variable, a quantity that is not affected by what is on the y-axis. The y-axis shows the dependent variable, which is a result of the independent variable.

Here is a link to a Khan Academy video:
https://www.khanacademy.org/math/algebra/introduction-to-algebra/alg1-dependent-independent/v/dependent-and-independent-variables-exercise-example-1

And here is a link to practice:
https://www.khanacademy.org/math/algebra/introduction-to-algebra/alg1-dependent-independent/e/dependent-and-independent-variables

I also have some of my own examples and explanations below. I know that it is long, but I hope it helps! : )

Here is an example:
You are driving a car. You want to see how the number of miles that you drive affects the gas in the tank. The number of miles that you drive would be the independent variable, and the gas in the tank would be the dependent variable. You have not driven x miles because you lost gas. You lost gas because you drove x miles.

I hope that my explanation made sense to you. If it didn't, here are some clues to help you find the variables:
The independent variable is often whole numbers, such as 1, 2, 3, 4, 5, 6, 7 . . .
The dependent variable can jump around, like 9.2, 7, 5.3, 6.5 . . .
These are only rules of thumb; what really matters is which variable depends on the other.

You can also think of it as a number machine game. One where you input 1 and get an output of 2, you input 2 and get 4, you input 3 and get 6, and so on. The independent variable is the input. The dependent variable is the output.

The independent variable can be whatever you like and the dependent variable is a result that depends on the independent variable.
(56 votes)
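
The driving example above can be sketched as a small function: miles driven is the input (independent variable) and gas left is the output (dependent variable). The 12-gallon tank and the 30 miles per gallon are made-up numbers for illustration only, and `gas_left` is a hypothetical name.

```python
# Hypothetical numbers, chosen only for illustration.
TANK_GALLONS = 12.0      # assumed size of a full tank
MILES_PER_GALLON = 30.0  # assumed fuel economy


def gas_left(miles_driven):
    """Return gallons left (dependent) after driving miles_driven (independent)."""
    used = miles_driven / MILES_PER_GALLON
    # The tank cannot hold less than zero gallons.
    return max(0.0, TANK_GALLONS - used)


# Each pair is (x, y): x is chosen freely, y is the result of x.
for miles in [0, 30, 60, 90]:
    print(miles, gas_left(miles))
```

You pick the miles; the function tells you the gas. That direction of dependence is why miles would go on the x-axis and gas on the y-axis.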
• What do you do when you have 3 different things to put on a graph, like this?
|------------height
|
|------------hours
|
|------------days
(12 votes)
• You create a 3D graph that has an x-axis, a y-axis, and a z-axis.
(12 votes)
• If you have a bunch of random dots everywhere and then some clusters in some random places, what would you call that?
(3 votes)
• I don't know if there's a name for it, but the clusters suggest that some 𝑥-values are more frequent than others, and for those 𝑥-values some 𝑦-values are more frequent than others.
These clusters have a greater impact on the regression than the surrounding dots, but since you say the clusters are also randomly strewn, we should still see only a weak linear correlation.

Comparing people's heights to the number of shoes they own could potentially produce a pattern like this, with one cluster forming around the intersection of the average female height and the average number of shoes per female, and another cluster around the intersection of the average male height and the average number of shoes per male.
(15 votes)
• Are scatterplots just like graphs?
(2 votes)
• Yes, scatterplots show raw data, and the directions and flow in them allow us to see trends and make predictions.
(7 votes)
• I know how to construct a scatter plot, but I have no clue how to "make appropriate scatter plots." I keep getting it wrong. I'm not sure how to do that.
(3 votes)
• In a good scatterplot, the points make good use of the space on the coordinate grid (for example, the points are not all “bunched up” in a small portion of the grid). Also, the independent variable should be on the horizontal axis, and the dependent variable should be on the vertical axis.

Have a blessed, wonderful day!
(5 votes)
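
One way to make the points "use the space" is to pick axis limits that hug the data with a little padding. This is only a sketch: `axis_limits` is a hypothetical helper, and the 10% margin is an arbitrary choice. The scores are the ones read out in the video.

```python
def axis_limits(values, margin=0.1):
    """Return (low, high) axis limits that fit the data with some padding."""
    low, high = min(values), max(values)
    spread = high - low
    # If every value is the same, fall back to a padding of 1 unit.
    pad = spread * margin if spread > 0 else 1.0
    return low - pad, high + pad


scores = [93, 87, 70, 62, 86, 73, 73, 80, 96]
print(axis_limits(scores))
```

With limits like these, the vertical axis runs from just below the lowest score to just above the highest one, instead of starting at 0 and squeezing all the points into the top of the grid.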
• How do you find the equation for the line of best fit?
(1 vote)
• Try to eyeball a line that goes through the "middle of all the points", drawing it on the graph. Once you've done that, pick two points on that line and find the slope using the rise and run between them. Locate the y-intercept as well. Finally, plug the slope (m) and the y-intercept (b) into y = mx + b form.

Hope this helps!😊
(8 votes)
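
The steps above can be sketched in a few lines, assuming you have already read two points off your eyeballed line. The points (2, 3) and (6, 5) are made up for illustration, and `line_through` is a hypothetical name.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no slope")
    m = (y2 - y1) / (x2 - x1)  # slope = rise / run
    b = y1 - m * x1            # solve y1 = m*x1 + b for b
    return m, b


# Two made-up points read off an eyeballed line.
m, b = line_through((2, 3), (6, 5))
print(f"y = {m}x + {b}")  # prints "y = 0.5x + 2.0"
```

This gives the equation of the line you drew, so it is only as good as your eyeballing.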
• I think her experiment was bad because some classes are harder than others.

I would have done worse in chemistry than math no matter what period it was.

If she checked all the first periods, all the second periods, all the third periods, etc., it would have been accurate, though.
(3 votes)
• How do you do a 2-way plot?
(3 votes)

## Video transcript

Aubrey wanted to see if there's a connection between the time a given exam takes place and the average score of this exam. She collected data about exams from the previous year. Plot the data in a scatter plot.

And let's see, they give us a couple of rows here. This is the class. Then they give us the period of the day that the class happened. And then they give us the average score on an exam. And we have to be a little careful with the study-- maybe there's some correlation depending on what subject is taught during what period. But let's just use her data, at least, just based on her data, see if-- well, definitely do what they're asking us, plot a scatter plot, and then see if there is any connection.

So let's see. On the horizontal axis, we have Period. And on this investigation, this exploration she's doing, she's trying to see, well, does the period of the day somehow drive average score? So that's why Period is on the horizontal axis. And the thing that's driving is on the horizontal, the thing that's being driven is on the vertical.

So let's plot each of these points. Period 1, average score 93-- right over there. Period 6, 87. Oh, that's not the right place, and then we can move it if we want-- 87, right over there. Period 2, 70. Period 4, 62-- right over there. Period 4 and 86, that's right over there. Period 1, 73. Period 3, average score of 73 as well. Period 1, 80, average score of 80. And then Period 3, average score of 96.

So there we go. And it doesn't really seem like there's any obvious trend over here. So let's make sure that we got this right. And we did.
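
The points read out above can be written as (period, score) pairs, with the period as x and the average score as y. The sketch below also computes a correlation coefficient as one common way to check the "no obvious trend" observation. That check is an addition for illustration, not something done in the video, and `pearson_r` is a hypothetical helper name.

```python
import math

# Data as read out in the video: period of the day and average exam score.
periods = [1, 6, 2, 4, 4, 1, 3, 1, 3]
scores = [93, 87, 70, 62, 86, 73, 73, 80, 96]

# Each scatter plot point is (x, y) = (period, score).
points = list(zip(periods, scores))


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(periods, scores)
print(points)
print(r)  # a value near 0 means little or no linear trend
```

For this data the coefficient comes out close to 0, which agrees with the points looking scattered with no clear pattern.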