y = 30000+1.95x

Which is reasonably fair. My only concern is that, in practice, the point (0,0) should be included in the model.

Is there any math help I can get, please?

- Thread starter ojaswita
- Start date

- Joined
- Apr 12, 2005

- Messages
- 9,979

1) It has a range of applicability. It's a good representation of the data only in that range.

2) It's only a line. It can't accommodate much variation. Is the line truly representative of your data?

If you insist that the line pass through the origin, you must design your regression to do that. You then have only one parameter: the slope.

For a least-squares solution, solve the normal equation \(m\cdot\sum x^{2} = \sum xy\), where \(m\) is the slope in \(y = mx\). That is, \(m = \sum xy \,/ \sum x^{2}\).

Note: Your software may have an option for this, something like "set y-intercept to zero (0)".
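For anyone who wants to see the normal equation above worked out by hand, here is a minimal sketch in plain Python. The function name `slope_through_origin` and the sample numbers are made up for illustration; they are not the original poster's data.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope m for y = m*x, from m * sum(x^2) = sum(x*y)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero; the slope is undefined")
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical data, for illustration only (not from the thread).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
m = slope_through_origin(xs, ys)
print(f"y = {m:.3f} x")
```

Note that forcing the intercept to zero changes the slope as well, so the result will generally differ from the slope of the ordinary fit with an intercept.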

First, he is fully correct that a regression of even excellent fit may not be a good approximation outside the range of the data used to create the regression equation.

Second, a regression is not a statement of truth, but an approximation. It likely approximates a truth if the relative error terms are all small and uncorrelated and if those errors can reasonably be attributed to errors in the data or to other contributing but ignored factors of very low importance.

Third, you may have reason to know that the true relationship must be such that f(0) = 0, but the regression gives a linear equation where f(0) is nowhere close to zero. Then you know that either f(x) is not linear over the entire range of possible values of x or that your data are not typical. What to do? If the relative errors are small and uncorrelated, you can decide to use your regression equation as a good approximation in the range of your data and slightly outside it. If the relative errors are large or correlated, you should consider using a non-linear or a multi-variable model rather than a single-variable linear model. A linear equation in one variable may not have any relationship to reality.
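One rough way to look at whether the relative errors are "small and uncorrelated" is sketched below in plain Python. The helper names (`fit_line`, `relative_residuals`, `lag1_correlation`) are hypothetical, the lag-1 correlation of residuals ordered by x is just one simple check I am assuming here, not a method anyone in the thread prescribed, and the data are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = b + m*x; returns (b, m)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return mean_y - m * mean_x, m


def relative_residuals(xs, ys, b, m):
    """(observed - fitted) / observed; assumes no observed y is zero."""
    return [(y - (b + m * x)) / y for x, y in zip(xs, ys)]


def lag1_correlation(res):
    """Lag-1 autocorrelation of residuals (data assumed sorted by x)."""
    mean = sum(res) / len(res)
    den = sum((r - mean) ** 2 for r in res)
    if den == 0:
        return 0.0
    num = sum((res[i] - mean) * (res[i + 1] - mean)
              for i in range(len(res) - 1))
    return num / den


# Hypothetical data, sorted by x, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]  # curved on purpose
b, m = fit_line(xs, ys)
rel = relative_residuals(xs, ys, b, m)
print("largest relative error:", max(abs(r) for r in rel))
print("lag-1 correlation:", lag1_correlation(rel))
```

Large relative errors, or residuals that keep the same sign over long runs of x, would both hint that a single straight line is not the right model.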

- Joined
- Aug 27, 2012

- Messages
- 387

Good point. I once fit y = mx + b to data from a model that required y = mx. I thought it was a reasonable way to see how the model measured up to reality (no pun intended), but my lab professor gave me a lecture for it.

-Dan

- Joined
- Aug 27, 2012

- Messages
- 387

That's a pretty good-sized intercept if it's supposed to be 0! What is the scale of your x's?

-Dan