ANOVA - Interaction

In the multiple regression problems with discrete predictors we had to worry about interaction, that is, whether or not a parallel-lines model was adequate. If it was not, we needed to include a multiplicative term (recall Yrs Sev*SexCode). Here we have the same issue:

Say we have a multi-way design problem, that is, there is a continuous response variable and several discrete factors. Again we need to worry about interaction. In the regression case we could include the multiplicative term and run the best subset regression command to decide whether it was needed. Here we can check the interaction plot.

An interaction plot looks as follows:

Here the line segments are all parallel. This implies that, for any value of factor A, going from one value of B to the next adds the same amount to the response. So if we go from B=1 to B=2 both lines move up by 1.0, and if we go from B=2 to B=3 both lines move up by 0.3.

Because of this we call such a model additive.

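Now consider the following interaction plot:

Here, as we go from B=2 to B=3, the line goes up by 0.3 if A=0 but goes down by 0.6 if A=1. The line segments are no longer parallel, so the effect of B depends on the value of A: there is interaction.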
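The parallel-lines idea can also be checked numerically. The following is a minimal sketch in Python, not part of the MINITAB session. The cell means are hypothetical, chosen only so that the steps match the numbers quoted above (up 1.0 and up 0.3 in the additive case, down 0.6 for A=1 in the second case).

```python
# Hypothetical cell means (assumed for illustration): keys are levels of A,
# list entries are the mean response at B = 1, 2, 3.
additive = {0: [2.0, 3.0, 3.3], 1: [2.5, 3.5, 3.8]}
interacting = {0: [2.0, 3.0, 3.3], 1: [2.5, 3.5, 2.9]}

def steps(means):
    """Change in the response between successive levels of B, for each level of A."""
    return {a: [round(y1 - y0, 10) for y0, y1 in zip(row, row[1:])]
            for a, row in means.items()}

def is_additive(means, tol=1e-9):
    """True if every level of A shows the same steps, i.e. the lines are parallel."""
    all_steps = list(steps(means).values())
    first = all_steps[0]
    return all(abs(s - f) <= tol for row in all_steps for s, f in zip(row, first))

print(steps(additive), is_additive(additive))
print(steps(interacting), is_additive(interacting))
```

With real data the cell means are estimates, so the steps will never agree exactly. The check above only illustrates the definition of an additive model.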

Here is another way of understanding the difference: Say you are told that you have an additive model and the following information:

                     Factor 1 = "low"    Factor 1 = "high"
Factor 2 = "in"            2.3                 2.7
Factor 2 = "out"           3.1                  ?

Can we make a guess for the response if Factor 1 = "high" and Factor 2 = "out"? We see that if Factor 2 = "in", going from "low" to "high" raises the response by 0.4 (=2.7-2.3). In an additive model the response should go up by the same amount for Factor 2 = "out", that is, it should go to 3.5 (=3.1+0.4).

But if there were interaction, this information alone would not let us make any guess at all!
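The guess for the missing cell can be written as a one-line calculation. This is a small Python sketch of the arithmetic above; the function name additive_fill is made up for the illustration.

```python
def additive_fill(in_low, in_high, out_low):
    """Predict the missing cell under an additive model.

    The step from "low" to "high" is assumed to be the same
    for Factor 2 = "in" and Factor 2 = "out".
    """
    return out_low + (in_high - in_low)

guess = additive_fill(in_low=2.3, in_high=2.7, out_low=3.1)
print(round(guess, 1))  # 3.5
```

The calculation relies entirely on the additivity assumption. With interaction, the "low" to "high" step for "out" could be anything, so no such formula exists.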

Deciding from the graph whether or not there is interaction is not always easy:

The problem is made worse by the fact that in ANOVA problems we often have very small data sets, so there is a great amount of variation in these graphs from sample to sample. As an illustration consider the following simulation:

open MINITAB project interaction.mpj
run MACRO %K:\3102\int 'A' 'B' 0.5

This generates data from a purely additive model, yet the interaction plot sometimes still appears to show interaction. How often this happens (if at all) depends on the standard deviation of the residuals, here 0.5.
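For readers without access to the macro, the same effect can be seen with a rough Python sketch. This is not a reproduction of the MINITAB macro, whose internals are not shown here. The factor effects and the baseline 2.0 are assumptions made for the illustration; only the residual standard deviation 0.5 comes from the text.

```python
import random

# Hypothetical effects, chosen only for illustration (not taken from the macro).
A_EFFECT = {0: 0.0, 1: 0.5}
B_EFFECT = {1: 0.0, 2: 1.0, 3: 1.3}

def simulate_cells(sd, seed=None):
    """One observation per (A, B) cell from an additive model plus normal noise."""
    rng = random.Random(seed)
    return {a: [2.0 + A_EFFECT[a] + B_EFFECT[b] + rng.gauss(0.0, sd)
                for b in sorted(B_EFFECT)]
            for a in sorted(A_EFFECT)}

def largest_gap(cells):
    """Largest difference between the two lines' steps from one B level to the next."""
    rows = [[y1 - y0 for y0, y1 in zip(row, row[1:])] for row in cells.values()]
    return max(abs(s0 - s1) for s0, s1 in zip(rows[0], rows[1]))

# A gap of 0 means perfectly parallel lines; repeat with a few seeds to see the spread.
for seed in range(5):
    print(seed, round(largest_gap(simulate_cells(0.5, seed)), 2))
```

Setting sd to 0 gives perfectly parallel lines. Any gap seen with sd = 0.5 is due to noise alone, since the model generating the data has no interaction.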

So it is very useful to be able to test for interaction formally, but that requires repeated measurements, that is, more than one observation for each combination of factor levels.
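To see why repeated measurements are needed, here is a sketch in Python of the usual F statistic for interaction in a balanced two-way layout. The function name interaction_f and the data are hypothetical. The error sum of squares is built from the variation within each cell, so with a single observation per cell there is nothing left to estimate it from.

```python
from statistics import mean

def interaction_f(data):
    """F statistic for interaction in a balanced two-way design.

    data maps (a_level, b_level) to a list of replicate observations.
    Returns (F, df_interaction, df_error).
    """
    a_levels = sorted({a for a, _ in data})
    b_levels = sorted({b for _, b in data})
    n = len(next(iter(data.values())))
    if n < 2 or any(len(v) != n for v in data.values()):
        raise ValueError("need a balanced design with at least 2 replicates per cell")
    cell = {k: mean(v) for k, v in data.items()}
    a_mean = {a: mean(cell[a, b] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell[a, b] for a in a_levels) for b in b_levels}
    grand = mean(cell.values())
    # Interaction: what is left of each cell mean after removing both main effects.
    ss_int = n * sum((cell[a, b] - a_mean[a] - b_mean[b] + grand) ** 2
                     for a in a_levels for b in b_levels)
    # Error: variation of the replicates around their own cell mean.
    ss_err = sum((y - cell[k]) ** 2 for k, v in data.items() for y in v)
    df_int = (len(a_levels) - 1) * (len(b_levels) - 1)
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    return (ss_int / df_int) / (ss_err / df_err), df_int, df_err

# Hypothetical data: two replicates in each cell of a 2x2 design.
example = {(0, 1): [1.9, 2.1], (0, 2): [2.9, 3.1],
           (1, 1): [2.4, 2.6], (1, 2): [1.4, 1.6]}
print(interaction_f(example))
```

The statistic would then be compared with an F distribution with the two returned degrees of freedom. A large value suggests that the non-parallel lines are more than sampling noise.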