F-Test

F-ratio tests

In psychological research, hypothesis tests conventionally compare sample means. However, two samples or groups can have identical means while their variances differ drastically. The statistical test used to compare variances is called the $F$-ratio test (or the variance ratio test); it compares two variances in order to test whether they come from the same population.

The F-statistic

The variance ratio formula is as follows:

\begin{equation} F = \dfrac{\text{Larger Variance Estimate}}{\text{Smaller Variance Estimate}} \end{equation}

Critical values are read from a table of the $F$-distribution. As with the $t$-distribution, the table is organised by degrees of freedom, here one value for each of the two variance estimates. The $F$-ratio test is a one-tailed test, as it determines whether the numerator is bigger than the denominator.

The Method
  • First we form our hypotheses:

\begin{align} H_0 &: \text{The two sample variances are equal.}\\ H_1 &: \text{There is a significant difference between the two sample variances.}\\ \end{align}

  • Then, we calculate the variance estimates/sample variances for each group.
  • We then calculate the $F$-statistic using the variance ratio formula above.
  • Next we compare our $F$-statistic with a critical value from a significance table for the $F$-distribution. We compare it with the intersection of the column for the degrees of freedom for the larger variance estimate and the row for the degrees of freedom for the smaller variance estimate. The degrees of freedom for each estimate are $n - 1$, where $n$ is the size of the group.
  • If our $F$-statistic is greater than the critical value then we conclude that there is sufficient evidence to suggest that the two samples do not come from the same population of scores. Therefore, we accept the hypothesis that the sample variances are significantly different.
  • If our $F$-statistic is less than the critical value, then we instead accept the null hypothesis.
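The steps above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name `f_ratio_test` and its arguments are hypothetical, and the critical value is assumed to have been looked up in an $F$-table by the reader beforehand.

```python
from statistics import variance


def f_ratio_test(group_a, group_b, critical_value):
    """Variance ratio test following the steps listed above.

    critical_value is assumed to come from an F-table, read at the
    degrees of freedom returned by this function.
    """
    var_a = variance(group_a)  # sample variance (divides by n - 1)
    var_b = variance(group_b)

    # Put the larger variance estimate on top, so that F >= 1.
    if var_a >= var_b:
        f_stat = var_a / var_b
        df = (len(group_a) - 1, len(group_b) - 1)
    else:
        f_stat = var_b / var_a
        df = (len(group_b) - 1, len(group_a) - 1)

    # Significant only if F exceeds the tabulated critical value.
    significant = f_stat > critical_value
    return f_stat, df, significant


# Made-up scores; 6.39 is used here as an illustrative 5% critical value
# for df = (4, 4) and should be checked against an F-table.
f_stat, df, significant = f_ratio_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], 6.39)
print(f_stat, df, significant)
```

The function returns the degrees of freedom in the order (larger variance estimate, smaller variance estimate), which is the order needed to find the right column and row in the table.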

Worked Example

Worked Example - F-test

There is a clinical trial taking place to see how effective current treatments are at reducing anxiety levels in sufferers of Obsessive Compulsive Disorder (OCD). Two groups of $8$ patients with OCD are given two different treatments. One group is assigned to the treatment Cognitive Behavioural Therapy (CBT), whereas the other group receives a course of anti-depressants, selective serotonin re-uptake inhibitors (SSRIs). Before treatment, the patients' Hamilton Anxiety Rating Scale (HAM-A) scores were recorded; these are shown in the table below. After eight weeks of treatment, their HAM-A scores are recorded again, and then the difference between pre-treatment and post-treatment scores is calculated, to see how effective the treatments are. One of the clinical statisticians from the clinical trial team is concerned that the two different groups might have different variances in HAM-A scores to begin with and thinks this could affect the conclusions of the trial.

Perform a test to see whether or not the clinical statistician should be worried.

\[
\begin{array}{cc}
\text{CBT Group} & \text{SSRI Group}\\ \hline
16 & 23\\
22 & 19\\
29 & 25\\
28 & 26\\
17 & 25\\
19 & 24\\
20 & 18\\
19 & 23
\end{array}
\]

Solution

Our hypotheses are:

\begin{align} H_0:& \text{The sample variances are not significantly different.}\\ H_1:& \text{The sample variances are significantly different.}\\ \end{align}

First we need to calculate the variances of the two groups.

Recall that the formula for the sample variance is: \[\mathrm{Var}(X) = \dfrac{1}{n-1}\sum\limits_{i = 1}^{n}{(x_i - \bar{x})^2}\]

Rather than doing this by hand, we can use a calculator to find the variances for the two groups.

\begin{align} {s_C}^2 &= 23.357\\ {s_S}^2 &= 8.411\\ \end{align}

Note that the variance for the CBT group is larger than the variance for the SSRI group. Also, the mean for the CBT group is $21.25$ and for the SSRI group $22.875$.
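As a check on the calculator values, the same means and variances can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative and the scores are those from the table of HAM-A scores in this example.

```python
from statistics import mean, variance

# Pre-treatment HAM-A scores from the worked example
cbt = [16, 22, 29, 28, 17, 19, 20, 19]
ssri = [23, 19, 25, 26, 25, 24, 18, 23]

# statistics.variance is the sample variance (divides by n - 1)
cbt_mean, ssri_mean = mean(cbt), mean(ssri)
cbt_var, ssri_var = variance(cbt), variance(ssri)

print(f"CBT:  mean = {cbt_mean}, variance = {cbt_var:.3f}")
print(f"SSRI: mean = {ssri_mean}, variance = {ssri_var:.3f}")
# CBT:  mean = 21.25, variance = 23.357
# SSRI: mean = 22.875, variance = 8.411
```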

Finally, we can calculate the $F$-statistic.

\begin{align} F &= \dfrac{{s_C}^2}{{s_S}^2},\\ &= \dfrac{23.357}{8.411},\\ &= 2.777 \text{ (3 d.p.)} \end{align}

We need to compare this with a value from an $F$-table. There are $8$ patients in each group, so both variance estimates have $8 - 1 = 7$ degrees of freedom, and we compare our value with the entries at the intersection of column $7$ and row $7$.

\[
\begin{array}{cc}
\text{Significance Level} & F\text{-Value}\\ \hline
90\%~(0.1) & 2.78\\
95\%~(0.05) & 3.79\\
97.5\%~(0.025) & 4.99\\
99\%~(0.01) & 6.99\\
99.5\%~(0.005) & 8.89\\
99.9\%~(0.001) & 15.00
\end{array}
\]

Our $F$-value of $2.777$ is less than every critical value in the table above; this means that the sample variances are not significantly different from each other. (Although we do note that it is approaching significance at the $10\%$ level.) We accept the null hypothesis.
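The comparison can also be written out as a short Python sketch. The critical values are simply copied from the table for $7$ and $7$ degrees of freedom in this example rather than computed, and the variable names are illustrative.

```python
from statistics import variance

# Pre-treatment HAM-A scores from the worked example
cbt = [16, 22, 29, 28, 17, 19, 20, 19]
ssri = [23, 19, 25, 26, 25, 24, 18, 23]

# Larger variance estimate over smaller variance estimate
variances = [variance(cbt), variance(ssri)]
f_stat = max(variances) / min(variances)

# Critical values for df = (7, 7), copied from the table in this example
critical_values = {
    0.1: 2.78,
    0.05: 3.79,
    0.025: 4.99,
    0.01: 6.99,
    0.005: 8.89,
    0.001: 15.00,
}

for alpha, critical in critical_values.items():
    verdict = "significant" if f_stat > critical else "not significant"
    print(f"alpha = {alpha}: F = {f_stat:.3f}, critical = {critical:.2f} -> {verdict}")
```

Every line of output reports "not significant", although at the $0.1$ level the statistic falls only just below the critical value of $2.78$.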

In psychological research, we would report our findings in the following way:

'The mean score for the SSRI group $(\bar{X}=22.875)$ is slightly higher than for the CBT group $(\bar{X}=21.25)$. However, we have found that the variances of the two groups are not significantly different $(F=2.777, \text{df} = 7, 7, p > 0.05,$ ns$)$.'

Note: df stands for degrees of freedom and ns means not significant.

Test Yourself

Try our Numbas tests on parametric hypothesis tests and two-sample tests.

See Also

For some more worked examples, see the Business page.

See also non-parametric hypothesis testing and tests on frequencies for information on other types of hypothesis testing.