Analysis of Capacity of Wells

Step 1: Graphs
Graph > Boxplot > with Groups, Graph variable=Capacity, Categorical variable=Rocks

The boxplot shows a problem with the normality assumption. We can try to fix this with a square root transform:
Calc > Calculator, Store in Sqrt(Capacity), Expression SQRT('Capacity')
Graph > Boxplot > with Groups, Graph variable=Sqrt(Capacity), Categorical variable=Rocks

Better but still not so good. How about a log transform? (LOGT is the base 10 logarithm.)
Calc > Calculator, Store in log(Capacity), Expression LOGT('Capacity')
Graph > Boxplot > with Groups, Graph variable=log(Capacity), Categorical variable=Rocks

This looks much better, so we continue with log(Capacity).
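The effect of the two transformations can also be checked numerically. The following is a minimal sketch in Python, not part of the Minitab session: the sample is hypothetical (it is not the well data), and the moment-based skewness is just one possible stand-in for what the boxplots show visually.

```python
import math
import statistics


def skewness(values):
    """Moment-based sample skewness: near 0 for symmetric data, > 0 for a long right tail."""
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m3 = sum((v - mean) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5


# HYPOTHETICAL right-skewed sample, used only for illustration (not the well data)
capacity = [0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 2.0, 4.0, 9.0, 25.0]

skew_raw = skewness(capacity)
skew_sqrt = skewness([math.sqrt(v) for v in capacity])
skew_log = skewness([math.log10(v) for v in capacity])  # log10 corresponds to LOGT

print(f"raw: {skew_raw:.2f}  sqrt: {skew_sqrt:.2f}  log10: {skew_log:.2f}")
```

For a sample like this the skewness shrinks from the raw data to the square root to the log, which mirrors the order in which the boxplots improve.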

Step 2: Summary Statistics
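Because we needed a transformation, the untransformed data are not normal, so we summarize them with the median and IQR/1.35 instead of the mean and standard deviation. (For normal data IQR/1.35 estimates the standard deviation.)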
Stat > Basic Statistics > Display Descriptive Statistics, Variable=Capacity, By variables=Rocks, Statistics > check IQR
Note: this uses Capacity, not log(Capacity)!
Capacity by Rocks
Groups        n    Median   IQR/1.35
Dolomite      50   1.72     6.92
Limestone     50   0.45     1.45
Metamorphic   50   0.296    0.785
Siliclastic   50   0.461    0.96
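As a rough illustration of how the median and IQR/1.35 are computed, here is a small Python sketch. The sample is hypothetical (not the well data), and the helper name robust_summary is made up for this example. Python's default quartile rule (method="exclusive", position p(n+1)) is assumed here to agree with the rule Minitab uses; check against Minitab's output if exact agreement matters.

```python
import statistics


def robust_summary(values):
    """Return (median, IQR/1.35) using the p*(n+1) quartile positions."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # default method="exclusive"
    return q2, (q3 - q1) / 1.35


# HYPOTHETICAL sample with one large outlier, used only for illustration
sample = [0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 2.0, 4.0, 9.0, 25.0]

median, robust_sd = robust_summary(sample)
ordinary_sd = statistics.stdev(sample)

print(f"median = {median:.3f}, IQR/1.35 = {robust_sd:.3f}, std = {ordinary_sd:.3f}")
```

The ordinary standard deviation is pulled up by the single large value much more than IQR/1.35 is, which is why the robust pair is preferred for the untransformed data.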

Note that the estimates of the variation differ by quite a lot (from 0.785 to 6.92). This is because we have many outliers in the dataset. An alternative table could be done based on the transformed data and mean/std:
Stat > Basic Statistics > Display Descriptive Statistics, Variable=log(Capacity), By variables=Rocks
log(Capacity) by Rocks
Groups        n    Mean      Std
Dolomite      50   0.177     1.111
Limestone     50   -0.299    1.025
Metamorphic   50   -0.388    0.765
Siliclastic   50   -0.3295   0.6112

Step 3: Hypothesis Test
Stat > ANOVA > Oneway, Response=log(Capacity), Factor=Rocks, Graphs > Residual vs. Fits Plot and Normal Plot


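Both plots look OK, so the assumptions of the ANOVA (normal residuals, equal variance) are reasonable for the transformed data.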

Test for Rocks:
1) α=0.05
2) H0: α1 = ... = α4 = 0 (no difference in the mean Capacity for different Rocks)
3) Ha: αi≠0 for some i (some differences in the mean Capacity for different Rocks)
4) p-value=0.007 < α
5) We reject H0, there are some differences in the mean Capacity for different Rocks
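Because all groups have the same n, the F statistic of the oneway ANOVA can be reproduced from the mean/std table of log(Capacity) alone. The following Python sketch does this; the function name anova_from_summary is made up for illustration, and the inputs are the rounded table values, so the result can differ slightly from Minitab's.

```python
# (n, mean, std) of log(Capacity) by Rocks, copied from the summary table
groups = {
    "Dolomite": (50, 0.177, 1.111),
    "Limestone": (50, -0.299, 1.025),
    "Metamorphic": (50, -0.388, 0.765),
    "Siliclastic": (50, -0.3295, 0.6112),
}


def anova_from_summary(groups):
    """Oneway ANOVA F statistic from per-group n, mean and standard deviation."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
    ss_within = sum((n - 1) * std ** 2 for n, _, std in groups.values())
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = anova_from_summary(groups)
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

This gives F of about 4.19 on (3, 196) degrees of freedom. Turning F into a p-value needs the F distribution, which is not in the Python standard library; one option is scipy.stats.f.sf(f_stat, df_between, df_within).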

Step 4: Multiple Comparison
Stat > ANOVA > Oneway, Response=log(Capacity), Factor=Rocks, Comparisons > check Tukey
The order of the rocks by the means of log(Capacity) is Metamorphic - Siliclastic - Limestone - Dolomite.
We find (rocks joined by the underline are not statistically significantly different):

Metamorphic Siliclastic Limestone Dolomite
________________________________

Interpretation: Dolomite has a statistically significantly larger capacity than the other rocks. The other differences are not statistically significant, at least not at these sample sizes.

Warning: If we had not done a transformation the results would have been quite different. For example, Rocks would not have been statistically significant (p-value=0.06).
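The Tukey comparison can be checked approximately from the mean/std table of log(Capacity). This Python sketch is only a rough check, not Minitab's computation: the critical value 3.685 is an assumption taken from a studentized range table (α=0.05, 4 groups, 120 degrees of freedom), which is slightly conservative for the 196 degrees of freedom here, and the inputs are the rounded table values.

```python
import math
from itertools import combinations

# Means and standard deviations of log(Capacity) by Rocks, from the summary table
means = {"Dolomite": 0.177, "Limestone": -0.299, "Metamorphic": -0.388, "Siliclastic": -0.3295}
stds = {"Dolomite": 1.111, "Limestone": 1.025, "Metamorphic": 0.765, "Siliclastic": 0.6112}
n = 50  # observations per group

# With equal group sizes the pooled variance (MSE) is the average of the group variances
mse = sum(s ** 2 for s in stds.values()) / len(stds)

# ASSUMPTION: tabulated studentized range value q(0.05; 4 groups, 120 df)
Q_CRIT = 3.685

# Two means differ significantly if they are further apart than this
hsd = Q_CRIT * math.sqrt(mse / n)

significant = sorted(
    tuple(sorted(pair))
    for pair in combinations(means, 2)
    if abs(means[pair[0]] - means[pair[1]]) > hsd
)

print(f"HSD = {hsd:.3f}")
for a, b in significant:
    print(f"{a} vs {b}: difference = {abs(means[a] - means[b]):.3f}")
```

Under these assumptions only the three pairs involving Dolomite exceed the threshold, in line with the interpretation above. The Dolomite vs Limestone difference (0.476) is only just above the threshold (about 0.469), so that conclusion is sensitive to rounding and to the critical value used.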