# Introduction to residuals and least squares regression

## Want to join the conversation?

• I am so confused. Which one is the actual y value and which one is the predicted y value? Why is 100 the actual value? And also, at , how did he get that point? He literally just said the predicted value was right there, but he did not even explain how he got it...
• 100 is the actual weight because he measured someone who was 60" tall and that person weighed 100 pounds. He plotted that on the graph at (60, 100). He created a line by "eyeballing" the data points for what looked like a best fit for the data. He used that diagonal line to predict a person's weight from their given height. Using the line, a person who is 60" tall is predicted to weigh 150 pounds. You can find that by drawing a line straight up from the x-axis at 60 and seeing where it meets the diagonal line. Draw a horizontal line from that point to the y-axis and you can read the y value, which is the weight predicted by the line.
• Is a residual the same as variance in machine learning?
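• Not quite. A residual is the difference between an actual value and the value the model predicts for a single data point. A loss function, such as the sum of squared residuals, is usually computed from the residuals, and variance is a separate quantity that measures how spread out values are around their mean.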
• Since the sum of squared residuals is more sensitive to outliers (squaring assigns a greater proportion of the sum to the outlier), why is the sum of absolute residuals used less in regression?
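• The sum of squared residuals is used more often than the sum of absolute residuals mainly for mathematical convenience, not because outlier sensitivity is an advantage. The squared sum is differentiable everywhere, so the best-fit slope and intercept have a closed-form solution, and the result coincides with the maximum-likelihood estimate when the errors are normally distributed. The trade-off is the one the question points out: squaring gives outliers more weight, so a least squares line is less robust to extreme data points than a line that minimizes the sum of absolute residuals.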
• this confused me even more.
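The outlier effect discussed in this thread can be made concrete with a small Python sketch. The residual values below are made up purely for illustration (they are not from the video); the last one plays the role of the outlier. The sketch compares how much of the total each criterion attributes to that one point.

```python
# Made-up residuals for illustration only; the last value is the outlier.
residuals = [1.0, -2.0, 1.0, 10.0]

# Contribution of each point under the two criteria.
abs_terms = [abs(r) for r in residuals]   # |r|
sq_terms = [r * r for r in residuals]     # r squared

# Share of the total that comes from the outlier alone.
abs_share = abs_terms[-1] / sum(abs_terms)   # 10 / 14
sq_share = sq_terms[-1] / sum(sq_terms)      # 100 / 106

print(round(abs_share, 3), round(sq_share, 3))  # 0.714 0.943
```

With absolute values the outlier accounts for about 71% of the total, while with squares it accounts for about 94%, which is why a line fitted by least squares gets pulled harder toward extreme points.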
• So we have to find the predicted value and then we use the actual value to get our residual?
• Yes, to calculate the residual for a data point, you first find the predicted value using the regression line equation (y = mx + b), substituting the corresponding value of x. Then, subtract the predicted value from the actual observed value of y to obtain the residual (residual = actual − predicted). If the actual value is above the line, the residual is positive; if it's below the line, the residual is negative.
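As a rough sketch of that calculation in Python, using the numbers quoted earlier in this thread (a person 60" tall who actually weighs 100 pounds, with 150 pounds predicted by the line): the slope and intercept below are hypothetical, chosen only so that the line predicts 150 at x = 60. They are not the values from the video.

```python
# Hypothetical line, chosen only so that it predicts 150 lb at 60 in.
# The slope and intercept of the line in the video are not given here.
SLOPE = 5.0         # pounds per inch (assumed)
INTERCEPT = -150.0  # pounds (assumed)

def predict(height_in):
    """Predicted weight read off the line y = m*x + b."""
    return SLOPE * height_in + INTERCEPT

def residual(actual, predicted):
    """Residual = actual value minus predicted value."""
    return actual - predicted

actual_weight = 100.0             # the measured person at 60 inches
predicted_weight = predict(60.0)  # value on the assumed line at x = 60
r = residual(actual_weight, predicted_weight)

print(predicted_weight, r)  # 150.0 -50.0
```

The residual is negative here because the actual point (60, 100) sits below the line.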
• So what is the easiest way of doing this and understanding it? The way my math teacher explained it, it's hard.
• The easiest way to understand linear regression is to think of it as fitting a line to a scatterplot so that the line summarizes the relationship between two variables. Understanding the slope-intercept form of a linear equation (y = mx + b), where m is the slope and b is the y-intercept, is crucial. From there, the key idea is that the least squares line is the one that minimizes the sum of the squared differences (residuals) between the actual data points and the values predicted by the line.
• Is this pretty much finding the slope and intercept in y = mx + b?
• Yes, linear regression involves finding the slope (m) and y-intercept (b) of the line in the equation y = mx + b, where y represents the dependent variable, x represents the independent variable, m is the slope, and b is the y-intercept.
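For readers who want to see how m and b can actually be computed, here is a minimal sketch using the standard closed-form least squares formulas for a single predictor. The function name `fit_line` and the height/weight data are made up for illustration and are not from the video.

```python
def fit_line(xs, ys):
    """Return (m, b) for the line y = m*x + b that minimizes
    the sum of squared residuals over the given points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx            # slope
    b = y_mean - m * x_mean  # intercept: line passes through the means
    return m, b

# Made-up data: heights in inches, weights in pounds.
heights = [60.0, 62.0, 64.0, 66.0, 68.0]
weights = [110.0, 125.0, 130.0, 150.0, 155.0]

m, b = fit_line(heights, weights)
print(m, b)  # 5.75 -234.0
```

The formulas assume the x values are not all identical, since otherwise the denominator is zero and no unique line exists.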
• That was kind of confusing. All I needed to understand was how to solve it when the actual number is above the line.
• When the actual value is above the line, the residual is positive, meaning the actual value is higher than the value predicted by the regression line.
• What is the purpose of these residuals?
• The purpose of residuals in linear regression is to measure the discrepancy between the observed values of the dependent variable and the values predicted by the regression model. Residuals help assess how well the model fits the data points and identify any patterns or trends that the model might not capture effectively.
• I just wanted to ask this question - Can't we use least squares approximation from linear algebra to find the line of best fit in this case?
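One way to see the connection this question raises: fitting y = mx + b by least squares is the same as finding the least squares solution of the overdetermined system A[m, b]ᵀ ≈ y, where each row of A is [xᵢ, 1]. The sketch below solves the resulting 2×2 normal equations AᵀA[m, b]ᵀ = Aᵀy with Cramer's rule in plain Python. The function name `lstsq_line` and the data are made up for illustration.

```python
def lstsq_line(xs, ys):
    """Solve the normal equations (A^T A) [m, b]^T = A^T y,
    where each row of A is [x_i, 1]."""
    n = len(xs)
    # Entries of A^T A = [[sxx, sx], [sx, n]].
    sxx = sum(x * x for x in xs)
    sx = sum(xs)
    # Entries of A^T y = [sxy, sy].
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)
    # Cramer's rule for the 2x2 system (assumes the x values
    # are not all equal, so the determinant is nonzero).
    det = sxx * n - sx * sx
    m = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return m, b

# Made-up data points for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 4.0, 8.0]

m, b = lstsq_line(xs, ys)
print(m, b)  # 2.2 0.7
```

This gives the same line as the usual regression formulas for slope and intercept; the linear algebra view is just a different way of writing the same minimization.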