ANOVA - Modeling

Oneway ANOVA or Completely Randomized Design:

Here we have just one Factor.

We have the following notation:
Yij = μ + αj + eij, i=1,..,nj, j=1,..,k

where:
k is the number of groups
nj is the number of observations in the jth group
Yij is the ith observation in the jth group
μ is the overall mean.
αj is the deviation of the jth group mean from the overall mean
eij is the residual of the ith observation in the jth group

Example (Mothers and Cocaine Use):
k =3
n1=39, n2=19, n3=36
Y11=44.3, Y12=45.1, Y21=45.3, and so on

How can we interpret this model?
Say you are asked to use the data set to estimate the mean length of a newborn baby without knowing anything about the mother. Then your best guess is the sample mean of all the babies' lengths: 49.5 cm is the estimate of μ.

Now somebody tells you that the baby's mother actually belongs to the first trimester group. How does this change your estimate? The mean length of babies in this group is 49.3 cm, so you would change your guess by 49.3-49.5 = -0.2 cm. This is an estimate of α2. So α2 is the adjustment to the overall mean if you know the observation belongs to group 2.
Similarly we find an estimate of α1 to be 51.1-49.5 = 1.6 cm and an estimate of α3 to be 48.0-49.5 = -1.5 cm.
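
The arithmetic above can be checked with a short Python sketch. It uses only the group sizes and the (rounded) group means quoted in the example, and relies on the fact that the overall sample mean is the size-weighted average of the group means. The variable names are chosen here for illustration, and because the inputs are rounded, the results agree with the text only to one decimal place.

```python
# Group sizes and (rounded) group means as quoted in the example.
sizes = [39, 19, 36]
group_means = [51.1, 49.3, 48.0]

# The overall sample mean is the size-weighted average of the group means.
n_total = sum(sizes)
mu_hat = sum(n * m for n, m in zip(sizes, group_means)) / n_total

# Estimated group effects: group mean minus overall mean.
alpha_hat = [m - mu_hat for m in group_means]

print(f"estimate of mu: {mu_hat:.1f} cm")
for j, a in enumerate(alpha_hat, start=1):
    print(f"estimate of alpha{j}: {a:+.1f} cm")

# The size-weighted effects add up to zero (up to floating point error),
# which is what "deviation from the overall mean" implies.
weighted_sum = sum(n * a for n, a in zip(sizes, alpha_hat))
```
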
So the α's incorporate the change to the overall mean if we know which group an observation belongs to. But if there is no difference between the group means, then we would not need any adjustments, and so the null hypothesis of no difference between the group means becomes:

H0: α1=α2=...=αk=0 (means are the same for all groups)
Ha: αj≠0 for some j

Previously we have written the hypotheses as follows:

H0: μ1 = .. = μk
Ha: μi ≠ μj for some i≠j

These two are equivalent: the mean of group j is μj = μ + αj, so the group means are all equal exactly when every α is 0. The first form is easier to generalize to more than one factor, so we will use it.

Twoway ANOVA

Here we have two factors. Twoway ANOVA problems come in two flavours:

a) Completely randomized block design: one factor is the focus of the study; the other is included because it might affect the outcome.
b) Factorial design: both factors are of equal interest.
For our purposes the difference does not matter.

Twoway Additive Model:

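Yijk = μ + αi + βj + eijk, i=1,..,I, j=1,..,J
where αi is the effect of the ith level of the first factor, βj is the effect of the jth level of the second factor, and k indexes the observations within each combination of levels.
This leads to two hypothesis tests:
H0: α1=α2=...=αI=0 (no difference of the means for first factor)
H0: β1=β2=...=βJ=0 (no difference of the means for second factor)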
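
To see what the additive model does with actual numbers, here is a minimal Python sketch. The data are hypothetical (made up for illustration, not from any data set in these notes), with one observation for each combination of levels. For such a balanced layout the usual estimates are: overall mean for μ, row mean minus overall mean for αi, and column mean minus overall mean for βj.

```python
from statistics import mean

# Hypothetical data, one observation per cell: rows are the levels of the
# first factor (I = 2), columns are the levels of the second factor (J = 3).
y = [
    [10.0, 12.0, 14.0],
    [13.0, 17.0, 18.0],
]
I, J = len(y), len(y[0])

# Estimates for a balanced layout.
mu_hat = mean(v for row in y for v in row)
alpha_hat = [mean(row) - mu_hat for row in y]
beta_hat = [mean(y[i][j] for i in range(I)) - mu_hat for j in range(J)]

# Fitted values mu + alpha_i + beta_j, and residuals e_ij = y_ij - fitted_ij.
fitted = [[mu_hat + alpha_hat[i] + beta_hat[j] for j in range(J)] for i in range(I)]
resid = [[y[i][j] - fitted[i][j] for j in range(J)] for i in range(I)]

print("mu:", mu_hat)
print("alpha:", alpha_hat)
print("beta:", beta_hat)
print("residuals:", resid)
```

In this toy example the residuals are not all zero, so the additive model does not reproduce the cells exactly; whatever it misses ends up in the residuals.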

Twoway Model with Interaction:


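Yijk = μ + αi + βj + gij + eijk

where gij is the interaction effect of level i of the first factor combined with level j of the second factor. This leads to three hypothesis tests:
H0: α1=α2=...=αI=0 (no difference of the means for first factor)
H0: β1=β2=...=βJ=0 (no difference of the means for second factor)
H0: g11=g12=...=gIJ=0 (no interaction)
where the third test is for the presence of interaction. Note: this third test requires repeated measurements, that is, more than one observation for each combination of factor levels!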
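
The need for repeated measurements can be illustrated with a small Python sketch. The data are hypothetical and balanced (the same number of observations in every cell), and the function name fit_interaction is made up for this illustration. The interaction estimate for a cell is its cell mean minus what the additive part μ + αi + βj predicts; the residuals are the deviations of the observations from their own cell mean.

```python
from statistics import mean

def fit_interaction(y):
    """Estimate the two-way model with interaction for balanced data.

    y[i][j] is the list of repeated measurements in cell (i, j).
    """
    I, J = len(y), len(y[0])
    cell = [[mean(y[i][j]) for j in range(J)] for i in range(I)]
    mu = mean(cell[i][j] for i in range(I) for j in range(J))
    alpha = [mean(cell[i]) - mu for i in range(I)]
    beta = [mean(cell[i][j] for i in range(I)) - mu for j in range(J)]
    # Interaction: what is left of the cell mean after the additive part.
    g = [[cell[i][j] - mu - alpha[i] - beta[j] for j in range(J)] for i in range(I)]
    # Error sum of squares and its degrees of freedom.
    sse = sum((v - cell[i][j]) ** 2
              for i in range(I) for j in range(J) for v in y[i][j])
    df_error = sum(len(y[i][j]) - 1 for i in range(I) for j in range(J))
    return {"mu": mu, "alpha": alpha, "beta": beta, "g": g,
            "sse": sse, "df_error": df_error}

# Hypothetical data: 2 x 2 layout with two measurements per cell.
y = [
    [[10.0, 12.0], [14.0, 16.0]],
    [[13.0, 15.0], [25.0, 27.0]],
]
full = fit_interaction(y)

# Same layout, but keeping only the first measurement in each cell.
single = fit_interaction([[[c[0]] for c in row] for row in y])

print("with replicates:   ", full["g"], full["sse"], full["df_error"])
print("without replicates:", single["g"], single["sse"], single["df_error"])
```

With only one observation per cell the model fits every observation exactly, so there are no degrees of freedom left to estimate the error, and the interaction cannot be tested against anything.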

Note: Stat>ANOVA>Twoway by default fits a model with interaction, but Stat>ANOVA>General Linear Model fits an additive model.