# Fitting a line to data

Sal creates a scatter plot and then fits a line to data on the median California family income. Created by Sal Khan.

• How do we do this manually and write an equation using point-slope form?
• Find the coordinates of two points on the line (not the data points, since they don't necessarily lie on the line).
The slope is then the change in y over the change in x, and you can plug the slope and either point into point-slope form, y - y1 = m(x - x1).
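The two-point method in the answer above can be sketched in a few lines of Python. This is only an illustration: the helper name `line_through` and the two points are made up (chosen so the slope comes out to the 1882 mentioned elsewhere in this thread), and they are not values read from Sal's chart.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    # Slope is the change in y over the change in x.
    slope = (y2 - y1) / (x2 - x1)
    # Point-slope form is y - y1 = slope * (x - x1).
    # Setting x = 0 gives the y-intercept.
    intercept = y1 - slope * x1
    return slope, intercept


# Hypothetical points picked off a hand-drawn trend line (not real data):
# x = years since the first year in the table, y = income in dollars.
slope, intercept = line_through((0, 35000), (10, 53820))
print(f"y = {slope:.0f}x + {intercept:.0f}")
```

Any two points on the same line give the same slope and intercept, so it helps to pick points that are far apart and easy to read off the grid.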
• Are there more videos on how to use Excel?
• What does "extrapolation" mean? What is linear regression?
• "Extrapolation" is using the pattern in known data to predict values outside the range of that data. A linear regression draws a line assuming that the scatter plot points follow a roughly straight-line trend. He doesn't talk about it in this video, but there are other types of regression, like an "exponential regression," which works with something that grows exponentially, like population, a bank account, or GDP.
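For anyone who wants to see both ideas without a spreadsheet, here is a minimal sketch that fits a line by ordinary least squares and then extrapolates past the data. Least squares is the usual way a regression line is computed, but the video does not spell out the method, so treat that as an assumption. The function name `fit_line` and the toy numbers are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal; slope is undefined")
    slope = sxy / sxx
    # The least-squares line always passes through (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Made-up toy data (not the income table from the video).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 2, 5, 4]
m, b = fit_line(xs, ys)

# Extrapolation: x = 6 lies outside the observed range 0..4.
prediction = m * 6 + b
print(f"fit: y = {m:.2f}x + {b:.2f}, predicted y at x = 6: {prediction:.2f}")
```

The same sums can be worked out by hand in a table with columns for x, y, their deviations from the means, and the products, which is one way to do the fit with paper and pencil.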
• Hey, so I'm kind of confused about how the slope is 1882. Such a huge slope would give a really steep line, wouldn't it?
Thanks!
• Great question! The "steepness" of a line on a graph is determined by two things. The first is the slope, as you said. The second is the scale used on each axis. Because Sal chose different scales for the x- and y-axes, the line appears less steep than the number 1882 suggests.
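To make the point about axis scales concrete, here is a small sketch that computes the angle a line makes on screen for a given slope and axis scaling. The function `on_screen_angle` and the pixel scales are hypothetical, picked only to show the effect; they are not taken from Sal's chart.

```python
import math


def on_screen_angle(slope, x_pixels_per_unit, y_pixels_per_unit):
    """Angle in degrees that a line with this slope makes on the drawn graph."""
    # Moving one unit right on the x-axis covers this many pixels...
    run_px = x_pixels_per_unit
    # ...and the line rises `slope` y-units, drawn at the y-axis scale.
    rise_px = slope * y_pixels_per_unit
    return math.degrees(math.atan2(rise_px, run_px))


# Same slope, two different (made-up) axis scalings.
equal_scales = on_screen_angle(1882, 1, 1)     # 1 pixel per year, 1 pixel per dollar
stretched = on_screen_angle(1882, 50, 0.01)    # 50 pixels per year, 0.01 pixel per dollar
print(f"equal scales: {equal_scales:.2f} degrees")
print(f"different scales: {stretched:.2f} degrees")
```

With equal scales the line is almost vertical, while with the second scaling it sits at a gentle angle even though the slope in the equation is still 1882.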
Hope that helps!
• How do you find it out, paper-pencil style?
• Well, you can ignore the fact that the data isn't perfectly linear and use the y = mx + b formula. You still get an estimate, which is all they ask for, since you can't predict the future exactly in the real world.
• So the meaning of the y-intercept is basically the year at which the study started, or at least where the table started, right?

But what is the meaning of the slope? I would guess it to be the median income per year, but that doesn't mean much to me. I need a little help :) Thanks!
• The slope represents the "approximate rate" at which the median income is increasing. Per year, the median income increases x amount of dollars. I say approximate rate, because the rate is not constant, but the line of best fit represents the trend in the data.
• Fun fact: median income for a California family in 2010 was actually \$57,708!
• I am confused about the word model.
Is the line which is fitted in the data called the model?
• Yes, the line that is fitted to the data (and the equation of this line) is an example of a model.

Have a blessed, wonderful day!
• How could I have found the equation of the line without using Excel?