Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.


ANOVA - How to perform it

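When performing an ANOVA, the following basic assumptions have to be met: the data in each sample are normally distributed, the variances of the samples are equal, and the observations are independent.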

If these assumptions are not met, the analysis of variance can still be performed, but using different test procedures which are beyond the scope of this text.

Assuming that the data follow a normal distribution, we first have to test for equal variances (1). Depending on the samples, several tests are available:
 
Hartley's test: Requirement: the samples have to be of equal size. The test statistic is defined as the ratio of the largest to the smallest variance of all samples.
Cochran's test: This test should be used when the variance of one sample is considerably larger than all others. The test statistic is calculated as the ratio of the largest variance to the sum of all variances.
Bartlett's test: Requirement: the data have to be normally distributed, because this test is sensitive to departures from normality. A significant result can therefore indicate non-normality as well as unequal variances.
Brown-Forsythe test and Levene test: The Levene test and a derived form of it, the Brown-Forsythe test, have two benefits: the data need not be normally distributed, and the tests may be applied to data sets with unequal group sizes. The Levene test is suitable for multi-factorial investigations. The two tests differ in the measure of location they use: the Levene test works with the group means, the Brown-Forsythe test with the group medians.
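The test statistics described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a complete test procedure: the function names and the example data are made up for this sketch, and only the test statistics are computed. The Levene statistic is taken here in its usual form as a one-way ANOVA F ratio on the absolute deviations from the group centre. Critical values or p-values would still have to be taken from tables or from a statistics package.

```python
from statistics import mean, median, variance

def hartley_f_max(samples):
    """Hartley: ratio of the largest to the smallest sample variance."""
    if len({len(s) for s in samples}) != 1:
        raise ValueError("Hartley's test requires samples of equal size")
    variances = [variance(s) for s in samples]
    return max(variances) / min(variances)

def cochran_c(samples):
    """Cochran: ratio of the largest variance to the sum of all variances."""
    variances = [variance(s) for s in samples]
    return max(variances) / sum(variances)

def levene_w(samples, center=mean):
    """Levene statistic: F ratio computed on absolute deviations.

    center=mean gives the Levene test, center=median the Brown-Forsythe test.
    """
    deviations = [[abs(x - center(s)) for x in s] for s in samples]
    k = len(deviations)                      # number of groups
    n = sum(len(d) for d in deviations)      # total number of observations
    group_means = [mean(d) for d in deviations]
    grand_mean = sum(sum(d) for d in deviations) / n
    ss_between = sum(len(d) * (m - grand_mean) ** 2
                     for d, m in zip(deviations, group_means))
    ss_within = sum((z - m) ** 2
                    for d, m in zip(deviations, group_means) for z in d)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical example data: three samples of equal size
samples = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [3, 3, 3, 4, 7],   # skewed sample: mean 4, median 3
]

f_max = hartley_f_max(samples)
c_stat = cochran_c(samples)
w_mean = levene_w(samples, center=mean)      # Levene
w_median = levene_w(samples, center=median)  # Brown-Forsythe

print(f"Hartley F_max = {f_max:.3f}")
print(f"Cochran C     = {c_stat:.3f}")
print(f"Levene W      = {w_mean:.3f}")
print(f"Brown-Forsythe W = {w_median:.3f}")
```

The third sample is deliberately skewed so that its mean and median differ, which is why the Levene and Brown-Forsythe statistics come out differently for the same data.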

In the next step the analysis of variance is performed (remember that the goal is to compare means, not variances). The null hypothesis for the ANOVA is that all sample means are the same. To carry out the ANOVA we have to calculate the mean square (2) within the samples, $MS_w$, and the mean square between the samples, $MS_b$. The mean square (MS) is defined as the sum of squares divided by the degrees of freedom. The test statistic F, which is defined as the ratio of $MS_b$ to $MS_w$, follows an F distribution if the null hypothesis is true. A value higher than the critical value $F_{k-1;\,n-k}$, where k is the number of samples and n the total number of observations, indicates that the null hypothesis has to be rejected at the chosen level of significance.
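The calculation of the F statistic can be illustrated with a short Python sketch. It uses only the standard library; the function name `one_way_anova` and the example data are assumptions made for this illustration. The critical value is not computed here. It would have to be looked up in an F table or obtained from a statistics package (for instance `scipy.stats.f.ppf(1 - alpha, k - 1, n - k)` if SciPy is available).

```python
from statistics import mean

def one_way_anova(samples):
    """Return (MS_between, MS_within, F, df_between, df_within)."""
    k = len(samples)                         # number of samples
    n = sum(len(s) for s in samples)         # total number of observations
    sample_means = [mean(s) for s in samples]
    grand_mean = sum(sum(s) for s in samples) / n

    # Sum of squares between the samples (sample means vs. grand mean)
    ss_between = sum(len(s) * (m - grand_mean) ** 2
                     for s, m in zip(samples, sample_means))
    # Sum of squares within the samples (observations vs. own sample mean)
    ss_within = sum((x - m) ** 2
                    for s, m in zip(samples, sample_means) for x in s)

    df_between = k - 1
    df_within = n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within, df_between, df_within

# Hypothetical example data: three samples with three observations each
groups = [
    [4, 5, 6],
    [6, 7, 8],
    [8, 9, 10],
]

ms_b, ms_w, f_value, df_b, df_w = one_way_anova(groups)
print(f"MSb = {ms_b}, MSw = {ms_w}, F = {f_value}")
print(f"degrees of freedom: {df_b} and {df_w}")
```

For this data the sum of squares between the samples is 24 with 2 degrees of freedom and the sum of squares within the samples is 6 with 6 degrees of freedom, so MSb = 12, MSw = 1 and F = 12. This value is then compared with the critical value of the F distribution with 2 and 6 degrees of freedom.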



(1) When checking the variances it is a good idea to set the level of significance clearly above the level used for the ANOVA itself. This keeps the type II error (= the non-rejection of the null hypothesis even though the variances are different) as low as possible.
(2) The mean square, MS, is defined as the sum of the squared differences from the mean, divided by the number of degrees of freedom.
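
Written out for k samples with a total of n observations, the definitions above take the following form. The symbols $n_i$ (size of sample i), $x_{ij}$ (observation j in sample i), $\bar{x}_i$ (mean of sample i) and $\bar{x}$ (mean of all observations) are notation assumed for this sketch and do not appear in the text itself.

```latex
\begin{align*}
MS_b &= \frac{1}{k-1} \sum_{i=1}^{k} n_i \,(\bar{x}_i - \bar{x})^2 \\
MS_w &= \frac{1}{n-k} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \\
F &= \frac{MS_b}{MS_w}
\end{align*}
```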