© 2023 Khan Academy
Constructing a scatter plot
AP.STATS: UNC‑1 (EU), UNC‑1.S (LO), UNC‑1.S.1 (EK), UNC‑1.S.2 (EK)
Sal shows how to construct a scatter plot. Created by Sal Khan.
Want to join the conversation?
 Why do you have to put the score on the y-axis? Couldn't you put it on the x-axis? Could someone explain it for me? (12 votes)
 By convention, the x-axis shows the independent variable, a quantity that is unaffected by what is on the y-axis. The y-axis shows the dependent variable, which is a result of the independent variable.
Here is a link to a Khan Academy video:
https://www.khanacademy.org/math/algebra/introductiontoalgebra/alg1dependentindependent/v/dependentandindependentvariablesexerciseexample1
And here is a link to practice:
https://www.khanacademy.org/math/algebra/introductiontoalgebra/alg1dependentindependent/e/dependentandindependentvariables
I also have some of my own examples and explanations below. I know that it is long, but I hope it helps! : )
Here is an example:
You are driving a car. You want to see how the number of miles that you drive affects the gas in the tank. The number of miles that you drive would be the independent variable: you have not driven x miles because you lost gas. You lost gas because you drove x miles.
I hope that my explanation made sense to you. If it didn't, here are some clues to help you find the variables (these are rules of thumb, not strict rules):
In classroom examples, the independent variable is often evenly spaced whole numbers, such as 1, 2, 3, 4, 5, 6, 7 . . .
The dependent variable can jump around, like 9.2, 7, 5.3, 6.5 . . .
You can also think of it as a number machine game, one where you input 1 and get an output of 2, you input 2 and get 4, you input 3 and get 9, and so on. The independent variable is the input. The dependent variable is the output.
The independent variable can be whatever you like, and the dependent variable is a result that depends on the independent variable. (30 votes)
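To make the car example concrete, here is a minimal Python sketch. The tank size and fuel economy are made-up numbers chosen only for illustration, and `gas_left` is a hypothetical helper name. It shows the input/output idea: you pick the miles freely, and the gas left follows from it.

```python
# Hypothetical numbers for illustration only: a 12-gallon tank, 30 miles per gallon.
TANK_GALLONS = 12.0
MILES_PER_GALLON = 30.0

def gas_left(miles_driven: float) -> float:
    """Dependent variable (gallons left) computed from the independent variable (miles driven)."""
    used = miles_driven / MILES_PER_GALLON
    return max(TANK_GALLONS - used, 0.0)  # the tank cannot go below empty

# Each pair is (x, y): x is chosen freely, y is the result.
points = [(miles, gas_left(miles)) for miles in (0, 60, 120, 180)]
print(points)
```

Each pair would be plotted with miles on the x-axis and gallons left on the y-axis.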
 What do you do when you have 3 different things to put on a graph, like height, hours, and days? (10 votes)
 You create a 3D graph that has an x-axis, a y-axis, and a z-axis. (9 votes)
 If you have a bunch of random dots everywhere and then some clusters in some random places, what would you call that? (2 votes)
 I don't know if there's a name for it, but the clusters suggest that some 𝑥-values are more frequent than others, and for those 𝑥-values some 𝑦-values are more frequent than others.
These clusters have a greater impact on the regression than the surrounding dots, but since you say the clusters are also randomly placed, we should still expect only a weak linear relationship.
Comparing people's heights to the number of shoes they own could potentially produce a pattern like this, with one cluster forming around the intersection of the average female height and the average number of shoes per female, and another cluster around the intersection of the average male height and the average number of shoes per male. (9 votes)
 I know how to construct a scatter plot, but I have no clue how to "make appropriate scatter plots." I keep getting it wrong, and I'm not sure how to do that. (3 votes)
 In a good scatterplot, the points make good use of the space on the coordinate grid (for example, the points are not all “bunched up” in a small portion of the grid). Also, the independent variable should be on the horizontal axis, and the dependent variable should be on the vertical axis.
Have a blessed, wonderful day! (4 votes)
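One way to check the "good use of space" advice is to pick axis ranges from the data itself. The sketch below is only an illustration: `axis_limits` and the 10% margin are assumptions, not anything the exercise prescribes. The values are the periods and average scores read out in the video.

```python
def axis_limits(values, pad_fraction=0.1):
    """Return (low, high) limits that cover the data plus a small margin on each side."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:  # all values identical: fall back to a span of 1 unit
        span = 1.0
    pad = span * pad_fraction
    return low - pad, high + pad

periods = [1, 6, 2, 4, 4, 1, 3, 1, 3]            # independent variable, horizontal axis
scores = [93, 87, 70, 62, 86, 73, 73, 80, 96]    # dependent variable, vertical axis

print("x-axis range:", axis_limits(periods))
print("y-axis range:", axis_limits(scores))
```

With limits like these, the points spread across most of the grid instead of bunching up in one corner, which is what a vertical axis running from 0 to 1000 would cause.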
 How do you find the equation for the line of best fit? (1 vote)
 Try to eyeball a line that goes through the "middle of all the points," drawing it on the graph. Once you've done that, pick two points on that line and find the slope from the rise and run between them. Locate the y-intercept as well. Finally, write the line in y = mx + b form.
Hope this helps!😊 (6 votes)
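The rise-over-run step can be sketched in a few lines of Python. The two points below are hypothetical values read off a hand-drawn line, and `line_through` is just an illustrative name. This gives the equation of the eyeballed line, not a least-squares fit.

```python
def line_through(p1, p2):
    """Return slope m and y-intercept b of the line through (x1, y1) and (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    m = (y2 - y1) / (x2 - x1)  # rise over run
    b = y1 - m * x1            # solve y1 = m*x1 + b for b
    return m, b

# Two hypothetical points that lie on the line you drew
m, b = line_through((1, 3), (3, 7))
print(f"y = {m}x + {b}")
```

If the y-intercept is hard to read off the graph, computing b from the slope and one point, as done here, avoids guessing.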
 I think her experiment was flawed because some classes are harder than others. I would have done worse in chemistry than in math no matter what period it was. If she had compared all the first-period classes, all the second-period classes, all the third-period classes, and so on, it would have been more accurate, though. (3 votes)
 Are scatterplots just like graphs? (1 vote)
 Yes, a scatterplot is a type of graph. It shows the raw data points, and the direction and overall pattern of those points let us see trends and make predictions. (4 votes)
 How do you do a two-way plot? (3 votes)
 Which (in your opinion) is the best graph to generally use? (3 votes)
 That depends on what your scenario is, and what you want to show and find out. For example, if you wish to show individual results on a class's math test, use the scatterplot. If you want to predict profit for your company, use a line graph. The pie chart would work nicely for showing how much of your sales were in a product or group. (1 vote)
 I know this is a bit irrelevant, but does anyone know how to get a black hole badge? (3 votes)
 We're not supposed to ask that. (1 vote)
Video transcript
Aubrey wanted to see if there's a connection between the time a given exam takes place and the average score of this exam. She collected data about exams from the previous year. Plot the data in a scatter plot.
And let's see, they give us a couple of rows here. This is the class. Then they give us the period of the day that the class happened. And then they give us the average score on an exam. And we have to be a little careful with the study: maybe there's some correlation depending on what subject is taught during what period. But let's just use her data and, at least based on her data, definitely do what they're asking us: plot a scatter plot, and then see if there is any connection.
So let's see. On the horizontal axis, we have Period. And in this investigation, this exploration she's doing, she's trying to see, well, does the period of the day somehow drive average score? So that's why Period is on the horizontal axis. The thing that's driving is on the horizontal, and the thing that's being driven is on the vertical.
So let's plot each of these points. Period 1, average score 93, right over there. Period 6, 87. Oh, that's not the right place, and then we can move it if we want: 87, right over there. Period 2, 70. Period 4, 62, right over there. Period 4 and 86, that's right over there. Period 1, 73. Period 3, average score of 73 as well. Period 1, 80, average score of 80. And then Period 3, average score of 96.
So there we go. And it doesn't really seem like there's any obvious trend over here. So let's make sure that we got this right. And we did.