Interpreting computer output for regression
Desiree is interested to see if students who consume more caffeine tend to study more as well. She randomly selects 20 students at her school and records their caffeine intake (mg) and the number of hours spent studying. A scatterplot of the data showed a linear relationship.
This is computer output from a least-squares regression analysis on the data:
Predictor | Coef | SE Coef | T | P
---|---|---|---|---
Constant | 2.544 | 0.134 | 18.955 | 0.000
Caffeine (mg) | 0.164 | 0.057 | 2.862 | 0.005
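One way to read the output is as an equation: the Coef value in the Constant row is the intercept and the Coef value in the Caffeine (mg) row is the slope, so the fitted line is ŷ = 2.544 + 0.164x. The sketch below is a minimal illustration, assuming the T column is Coef divided by SE Coef. The function names `predict_hours` and `t_statistic` and the 10 mg input are hypothetical, chosen only for this example.

```python
# Coefficients copied from the computer output above.
INTERCEPT = 2.544   # "Constant" row, Coef column
SLOPE = 0.164       # "Caffeine (mg)" row, Coef column
SE_SLOPE = 0.057    # "Caffeine (mg)" row, SE Coef column


def predict_hours(caffeine_mg: float) -> float:
    """Predicted hours studied from the fitted line y-hat = a + b*x."""
    return INTERCEPT + SLOPE * caffeine_mg


def t_statistic(coef: float, se: float) -> float:
    """Assumed relationship between the columns: T = Coef / SE Coef."""
    return coef / se


if __name__ == "__main__":
    # Hypothetical input: a student who consumes 10 mg of caffeine.
    print(round(predict_hours(10), 3))
    # Ratio of the rounded slope to its rounded standard error.
    print(round(t_statistic(SLOPE, SE_SLOPE), 3))
```

The recomputed ratio comes out near 2.877 rather than the printed 2.862, presumably because the table shows rounded coefficients while the software used unrounded ones.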
- In the earlier video, "R-squared or coefficient of determination", you mentioned the SEline, as in, the sum of errors between the line and the points. Would S (the standard deviation of the residuals) be SEline/n?(11 votes)
- Why doesn't "bx" come first in ŷ=a+bx, whereas "mx" comes first in y=mx+b?(3 votes)
- I don't think the order matters as long as you have the correct value for the constant and slope.(5 votes)
- I was under the impression that a P-value below 0.05 implies there is a relationship between the independent variable and the dependent variable. If the relationship is also positive, at what point can we confidently say that the model is a good fit and that the increase is caused by the independent variable? Is there a percent threshold for R-squared or adjusted R-squared?(4 votes)
- Can anybody please explain why the constant coefficient 2.544 is the y-intercept and the caffeine coefficient 0.164 is the slope in question 1? I can't seem to get my head around this. Please help!(1 vote)
- In this kind of output, the y-intercept is displayed in the row labeled Constant, and the slope is displayed in the row labeled with the predictor, here Caffeine (mg). (Unfortunately, I don't know the reasoning behind that layout, sorry! Generally, I've found that the slope, y-intercept, s, and r^2 are the most useful pieces of information in these data charts.)(5 votes)
- I don't understand which are the x and y values on the charts.(0 votes)
- y is the number of hours spent studying and x is the number of milligrams of caffeine consumed. The constant is the y-intercept of the line, not y itself.(3 votes)
- What does regression mean?(1 vote)
- Correlation quantifies the strength of the linear relationship between a pair of variables, whereas regression expresses the relationship in the form of an equation.(1 vote)
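The distinction in the reply above can be sketched in a few lines of Python: the correlation r is a single number describing the strength of the linear relationship, while regression produces an intercept and slope that define an equation. The data below are hypothetical (not Desiree's sample), and the function name `correlation_and_line` is made up for this illustration.

```python
from math import sqrt


def correlation_and_line(xs, ys):
    """Return (r, intercept, slope) for paired data using least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)            # correlation: strength of linear relationship
    slope = sxy / sxx                    # regression: b in y-hat = a + b*x
    intercept = mean_y - slope * mean_x  # regression: a in y-hat = a + b*x
    return r, intercept, slope


if __name__ == "__main__":
    # Hypothetical paired observations, for illustration only.
    xs = [1, 2, 3, 4, 5]
    ys = [2, 4, 5, 4, 5]
    r, a, b = correlation_and_line(xs, ys)
    print(f"r = {r:.3f}")
    print(f"y-hat = {a:.1f} + {b:.1f}x")
```

Both results come from the same sums, which is why r and the slope always share a sign.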
- Why is the format different? We don't use y=mx+b but y=b+mx. What's the difference?(1 vote)
- There is no difference; sometimes it's just written differently. By the commutative property of addition, you can reorder the terms and the result won't change.(1 vote)
- more caffeine = more awake more willingness to study(1 vote)
- Hello everyone!(0 votes)