# Comparing models to fit data example

Sal determines if a quadratic or exponential model fits the data better, then uses the model to make a prediction.

## Want to join the conversation?

- Is there another, more "mathematical" way to prove which function is a better fit? Sal's method of counting and eyeballing seems a little unreliable to me. (6 votes)
- Measuring the summed (or averaged) distance of the predictions from the real data points applies not just to a linear model.

It works for a model of any shape, including the two types in the video.

One of the simplest ways is to sum (predicted - real)^2 over all data points, compare this value for each model, and pick the model with the smallest sum, because it "fits" the real values best. (1 vote)
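
The comparison described in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data points and the exponential model below are made up, since the page does not list Christine's actual data or the formula for Function A. Only the quadratic is the Function B read out in the video transcript.

```python
def sum_squared_error(model, xs, ys):
    """Sum of (predicted - real)^2 over all data points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys))


def exponential_model(x):
    # Hypothetical stand-in for Function A; the real formula is not given here.
    return 12.0 * 0.6 ** x


def quadratic_model(x):
    # Function B as stated in the video: y = 0.5x^2 - 5x + 13.
    return 0.5 * x ** 2 - 5 * x + 13


# Made-up (years since release, price in dollars) pairs, for illustration only.
years = [0.5, 1.0, 1.9, 2.5, 3.0, 3.75, 4.5]
prices = [10.8, 8.4, 6.0, 3.8, 2.4, 0.9, 0.7]

errors = {
    "exponential": sum_squared_error(exponential_model, years, prices),
    "quadratic": sum_squared_error(quadratic_model, years, prices),
}

# The model with the smallest summed squared error fits these points best.
best = min(errors, key=errors.get)
print(errors, best)
```

Because the numbers are invented, the winner here says nothing about the real exercise; the point is only that the same measure works for a model of any shape.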

- If the years scale (x) was a tad longer, would the exponential function fit better then? (since the quadratic function starts rising up but the prices don't) (5 votes)
- We don't know what the curves look like after 5 years. However, based on the data given, the quadratic curve fits the data points far better than the exponential curve. A possible real-life explanation for the rise could be that as a movie gets older, it becomes rarer and more difficult to find. It would be reasonable to expect to pay a premium to watch it. (0 votes)

- I keep trying to find what R^2 and R are, but I can't find anything. What are they?(1 vote)
- R is the linear correlation coefficient. It measures the strength and direction of the linear relationship between the two variables. The closer R is to 1, the stronger the positive relationship. The closer R is to -1, the stronger the negative relationship.

Now, R^2 is the coefficient of determination, which is the proportion of the variation in Y that is explained by the regression model on X. The bigger R^2 is, the more of the variation in Y the model accounts for.

Hope this helps!(4 votes)
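
For anyone who wants to see these two numbers computed, here is a minimal Python sketch. The function name `pearson_r` and the sample data are made up for illustration. Squaring R to get R^2 is valid for a simple least-squares line with one X variable, which is the case assumed here.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient R between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Variation of each variable around its own mean.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 78]

r = pearson_r(hours, scores)
r_squared = r ** 2  # share of the variation in scores explained by the line
print(r, r_squared)
```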

- I have two sets of data (both experimental) which are non-linear, and I'm looking for some ways to tell how fit they are with each other. Is there a similar coefficient (like R-squared) which can be used for this?

Thank you! (1 vote)
- after 2017 ppl just didn't comment this video (1 vote)
- How would you do this using an F-Test?(1 vote)
- I still don't get it. Can someone explain in a more understanding way because I am so confused right now?(1 vote)
- Can someone tell me how to re-express this data to make it linear?

Weight: 2 5 8 10 20 40 60 80 100 120

Food: 1/3 2/3 2 9/8 2 13/4 13/3 5.5 6.5 22/3 (1 vote)
- @3:57 What is Perseus one? Is this a new feature coming soon? (1 vote)
- No. Sal can see the metadata on the page and can edit it to alter the data and change the questions. I imagine Perseus one is just an extension of this ability. (0 votes)

- In some of the practice examples, we are given a set of 4 graphs and asked to identify the best-fitting function by selecting the graph that looks most linear. How do I do that?(0 votes)

## Video transcript

Christine works in a movie store in her hometown. Using the store's total selection, she documented the price of each movie title and how many years it has been since it was featured in movie theatres. She plotted the points below.

So let's see what's going on below here. Looks like there's two curves that she tries to fit. I'm assuming we're going to read about it in a second. But these blue points are the data points. So, for example, this data point right over here shows a movie that the title costs six dollars, and it has been released for almost two years, a little under two years. This data point right over here, this is a movie that has been released for almost four years, looks like maybe three and three quarters years. And they're selling that, looks like for a dollar or even a little bit less than a dollar. So those are her data points.

So once again, she documented the price of each movie title as a function of how many years it's been since it was featured in movie theatres. She is looking for a function that models her data. Since the trend of the data is decreasing and convex, and you see it here, it's definitely decreasing, and convex, it's opening upwards, if you imagine a curve, it looks like it's opening upwards a little bit like that, so decreasing and convex, she found a decreasing convex exponential model and a decreasing convex quadratic model.

Which of the following functions better fits the data? Function A, this is an exponential. This is the one in green right over here. And Function B, this one right over here is a quadratic. And you can see this one in purple. And so, which one of those better fits the data?

If we look at what's going on here, the green function, the exponential one, most of the data points for any given duration, for how long the title's been out, it looks like it's consistently underestimating. That it's always, the model's guess, or what the model would say the price is, is always, essentially except for only one data point right over here, for all of these other data points it's underestimating what the price would be. The purple model or the purple function right over here, it has more of a balance between overestimating, right over here, it's overestimating by a little bit, and underestimating. And its underestimates are closer, and its overestimates are closer than this green model. So I would say that Function B is definitely a better model.

Use the function of best fit, so we're going to say Function B, to predict the price of a movie that was featured in theatres 5.5 years ago. Round your answer to the nearest cent. So 5.5 years ago, that's going to be right over here. We're going to go to Function B, which is this purple one. So it's going to be under a dollar. But we want to get something to the nearest cent, so let's actually use the actual definition of the function.

So this is price as a function of how long the movie has been released. Where x is how long it's been released, and y is its price. If x is 5.5, let's figure out what y is going to be. So y is going to be equal to 0.5 times x squared. So x is 5.5 squared. So then we have minus five times x again. So minus five times 5.5. And then we have plus 13. And what does that get us? That gets us 62 1/2 cents. If we were to round our answer to the nearest cent, that's going to be 63 cents. And we got it right.
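
The arithmetic at the end of the transcript can be checked with a short Python sketch. The name `function_b` is made up here; the coefficients are the ones read out in the video. `Decimal` with half-up rounding is used on purpose: Python's built-in `round` sends exact ties to the even digit, so `round(0.625, 2)` gives 0.62 rather than the 63 cents expected by the exercise.

```python
from decimal import Decimal, ROUND_HALF_UP


def function_b(x):
    """Function B from the video: y = 0.5x^2 - 5x + 13, with x in years."""
    return Decimal("0.5") * x ** 2 - 5 * x + 13


# Price predicted for a movie featured in theatres 5.5 years ago.
price = function_b(Decimal("5.5"))

# Round to the nearest cent, with ties going up (62.5 cents becomes 63 cents).
rounded = price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(price, rounded)  # 0.625 0.63
```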