Fitting non-linear Models: Transformations and Polynomials

Consider the Electricity Usage dataset. Both the scatterplot and the residuals vs. fits plot of the linear model indicate that a straight line does not fit these data well.

So what can we do?

Transformations

One possibility is the same one we already used to fix problems with the normality assumption: transform the data. In fact, the same transformations may work here as well:

Square Root Transform

Here we do a square root transform of the predictor x:
Model               Transformation    Fit
y = b0 + b1·√x      SQRT(x)           y vs. SQRT(x)
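As a rough sketch of the same idea outside MINITAB, the Python below creates the new variable √x and fits an ordinary least-squares line of y against it. The data values and the helper name `fit_line` are made up for illustration (they are not the Electricity Usage data); the points are generated exactly from y = 5 + 2·√x, so the fit should recover those two coefficients.

```python
import math

def fit_line(u, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*u."""
    n = len(u)
    u_mean = sum(u) / n
    y_mean = sum(y) / n
    sxy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y))
    sxx = sum((ui - u_mean) ** 2 for ui in u)
    b1 = sxy / sxx
    b0 = y_mean - b1 * u_mean
    return b0, b1

# Hypothetical data, generated exactly from y = 5 + 2*sqrt(x).
x = [1.0, 4.0, 9.0, 16.0, 25.0]
y = [5.0 + 2.0 * math.sqrt(xi) for xi in x]

# Make the new variable SQRT(x), then fit y vs. SQRT(x).
sqrt_x = [math.sqrt(xi) for xi in x]
b0, b1 = fit_line(sqrt_x, y)
print(f"y = {b0:.3f} + {b1:.3f} * sqrt(x)")
```

The model is still linear in its coefficients; only the predictor column has changed, which is why an ordinary regression routine is enough.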

Exponential Model

This means a log transform of the response y:
Model                Transformation    Fit              Coef.
y = a·10^(b1·x)      LOGT(y)           LOGT(y) vs. x    a = 10^b0

Power Model

This means log transforms of both the predictor x and the response y:
Model           Transformation      Fit                    Coef.
y = a·x^b1      LOGT(x), LOGT(y)    LOGT(y) vs. LOGT(x)    a = 10^b0
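The two log-transform fits can be sketched in Python as follows. This reads the exponential model as y = a·10^(b1·x) and the power model as y = a·x^b1, which matches the logten fits in the Electricity Usage example. The function names and the data are hypothetical; the data are generated exactly from known curves so the recovered coefficients can be checked. Both fits need strictly positive y, and the power model also needs strictly positive x.

```python
import math

def fit_line(u, v):
    """Least-squares intercept b0 and slope b1 for v = b0 + b1*u."""
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    b1 = (sum((ui - u_mean) * (vi - v_mean) for ui, vi in zip(u, v))
          / sum((ui - u_mean) ** 2 for ui in u))
    return v_mean - b1 * u_mean, b1

def fit_exponential(x, y):
    """Fit logten(y) = b0 + b1*x; return (a, b1) for y = a * 10**(b1*x)."""
    b0, b1 = fit_line(x, [math.log10(yi) for yi in y])
    return 10 ** b0, b1

def fit_power(x, y):
    """Fit logten(y) = b0 + b1*logten(x); return (a, b1) for y = a * x**b1."""
    b0, b1 = fit_line([math.log10(xi) for xi in x],
                      [math.log10(yi) for yi in y])
    return 10 ** b0, b1

# Hypothetical data generated from known curves.
x = [10.0, 20.0, 40.0, 80.0]
y_exp = [200.0 * 10 ** (-0.01 * xi) for xi in x]   # y = 200 * 10**(-0.01 x)
y_pow = [50.0 * xi ** -1.5 for xi in x]            # y = 50 * x**(-1.5)

a_exp, b_exp = fit_exponential(x, y_exp)
a_pow, b_pow = fit_power(x, y_pow)
print(f"exponential: y = {a_exp:.2f} * 10**({b_exp:.4f} * x)")
print(f"power:       y = {a_pow:.2f} * x**({b_pow:.4f})")
```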

Polynomial Models

In addition to the transformation models we have another option: polynomial models. In mathematics, an expression of the form
y = a0 + a1·x + a2·x^2 + a3·x^3 + ... + an·x^n
is called a polynomial of degree n. Here are some special cases:

Quadratic Model

Model                       New Predictor    Fit
y = b0 + b1·x + b2·x^2      x**2             y vs. x and x**2

Cubic Model

Model                                New Predictors    Fit
y = b0 + b1·x + b2·x^2 + b3·x^3      x**2, x**3        y vs. x, x**2 and x**3
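The "new predictor" idea for polynomial models can be sketched with NumPy: build one column per power of x and run a multiple regression of y on those columns. The function name `fit_polynomial` and the data are assumptions made for this illustration; the data are generated exactly from a known cubic, so the cubic fit should reproduce its coefficients.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares fit of y = b0 + b1*x + ... + bn*x**n.

    Returns the coefficients in the order [b0, b1, ..., bn].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Columns 1, x, x**2, ..., x**degree: the powers are the new predictors.
    design = np.column_stack([x ** k for k in range(degree + 1)])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Hypothetical data generated exactly from y = 1 - 2x + 0.5x**2 + 0.1x**3.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 - 2.0 * x + 0.5 * x**2 + 0.1 * x**3

quad = fit_polynomial(x, y, 2)    # quadratic: y vs. x and x**2
cubic = fit_polynomial(x, y, 3)   # cubic: y vs. x, x**2 and x**3
print("quadratic:", np.round(quad, 3))
print("cubic:    ", np.round(cubic, 3))
```

The same function handles any higher degree by changing `degree`, which mirrors adding further x**k columns by hand.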

Mathematical Features of these Models

What "shapes" can we fit with these models?

• Square root, exponential and power models are monotone; that is, they either go up or go down but never turn around.

• Polynomial models can turn around: a quadratic model turns once, a cubic model up to twice, and in general a polynomial of degree n at most n-1 times. Sometimes this is not apparent because we only see the part of the graph before the turnaround happens.
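To see where a fitted polynomial turns around, one can solve for the points where its derivative is zero. The sketch below does this for quadratic and cubic models; the function name `turning_points` is hypothetical, and the coefficients are the quadratic and cubic fits reported for the Electricity Usage example later in this section.

```python
import math

def turning_points(coeffs):
    """Return the x-values where a quadratic or cubic turns around.

    coeffs is [b0, b1, b2] or [b0, b1, b2, b3] for
    y = b0 + b1*x + b2*x**2 (+ b3*x**3); the leading coefficient
    is assumed to be nonzero.
    """
    if len(coeffs) == 3:
        _, b1, b2 = coeffs
        return [-b1 / (2 * b2)]           # vertex of the parabola
    _, b1, b2, b3 = coeffs
    # Derivative of the cubic: 3*b3*x**2 + 2*b2*x + b1 = 0
    a, b, c = 3 * b3, 2 * b2, b1
    disc = b * b - 4 * a * c
    if disc <= 0:
        return []                         # no turnaround: the cubic is monotone
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

# Fitted coefficients from the Electricity Usage example.
quadratic = [196.7, -4.640, 0.03073]
cubic = [213.0, -5.689, 0.05198, -0.000136]

quad_turns = turning_points(quadratic)
cubic_turns = turning_points(cubic)
print("quadratic turns at Temperature =", [round(t, 1) for t in quad_turns])
print("cubic turns at Temperature =", [round(t, 1) for t in cubic_turns])
```

Whether a turnaround shows up in the plot depends on whether these temperatures fall inside the observed range of the data.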

How to find these models

The only way to fit the square root model is as described above: make a new variable SQRT(x) and fit y vs. SQRT(x). The exponential, power, quadratic and cubic models can be fit (and checked) using the Fitted Line Plot routine. Higher-order polynomial models again have to be fit by hand: create the new predictors x**2, x**3, ... and fit y against all of them.

Example: Electricity Usage data:

• Square root model:
Calc > Calculator, Store in: SQRT(Temp) Expression: SQRT('Temperature')
Stat > Regression > Regression, Response=Usage, Predictors=SQRT(Temp)
Result: Usage = 187 - 19.8·√Temperature

• Exponential model:
Graph > Regression > Fitted Line Plot, Response=Usage, Predictor= Temperature, Options > check logten of Y
Model from MINITAB fit: logten(Usage) = 2.317 - 0.01384 Temperature
so b0 = 2.317, hence a = 10^2.317 = 207.5, and the exponential model is:
Usage = 207.5·10^(-0.0138·Temperature)

• Power model:
Graph > Regression > Fitted Line Plot, Response=Usage, Predictor= Temperature, Options > check logten of Y and logten of X
Model from MINITAB fit: logten(Usage) = 4.308 - 1.599 logten(Temperature)
so b0 = 4.308, hence a = 10^4.308 = 20320, and the power model is:
Usage = 20320·Temperature^(-1.6)

• Quadratic model:
Graph > Regression > Fitted Line Plot, Response=Usage, Predictor= Temperature, check quadratic
Usage = 196.7 - 4.640·Temperature + 0.03073·Temperature^2

• Cubic model:
Graph > Regression > Fitted Line Plot, Response=Usage, Predictor= Temperature, check cubic
Usage = 213.0 - 5.689·Temperature + 0.05198·Temperature^2 - 0.000136·Temperature^3

• Quartic (power 4) model:
Calc > Calculator, Store in: Temp**2, Expression: 'Temperature'**2
Calc > Calculator, Store in: Temp**3, Expression: 'Temperature'**3
Calc > Calculator, Store in: Temp**4, Expression: 'Temperature'**4
Stat > Regression > Regression, Response=Usage, Predictors=Temperature Temp**2 Temp**3 Temp**4
Usage = 202 - 4.7·Temperature + 0.020·Temperature^2 + 0.00030·Temperature^3 - 0.000002·Temperature^4
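The back-transformations in the exponential and power model examples above can be checked with a few lines of Python. The coefficients are the ones reported by the MINITAB fits; the function names and the temperature of 50 are assumptions chosen only to illustrate a prediction.

```python
# Intercepts and slopes reported by the MINITAB fits on the logten scale.
exp_b0, exp_b1 = 2.317, -0.01384   # logten(Usage) vs. Temperature
pow_b0, pow_b1 = 4.308, -1.599     # logten(Usage) vs. logten(Temperature)

# Undo the logten transform of the intercept: a = 10**b0.
a_exp = 10 ** exp_b0
a_pow = 10 ** pow_b0

def usage_exponential(temp):
    """Exponential model: Usage = a * 10**(b1 * Temperature)."""
    return a_exp * 10 ** (exp_b1 * temp)

def usage_power(temp):
    """Power model: Usage = a * Temperature**b1 (Temperature > 0)."""
    return a_pow * temp ** pow_b1

temp = 50.0  # hypothetical temperature, for illustration only
print(f"a (exponential) = {a_exp:.1f}, a (power) = {a_pow:.0f}")
print(f"exponential model at {temp}: {usage_exponential(temp):.1f}")
print(f"power model at {temp}:       {usage_power(temp):.1f}")
```

Predicting on the original scale this way gives the same value as computing b0 + b1·x on the logten scale and then raising 10 to that power.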