The Fisher’s Exact Test
A non-parametric test for investigating associations between two categorical, binary variables.
There are many occasions when we want to investigate whether there is an association between two categorical, binary variables, such as looking at whether gender (male/female) and voting (yes/no) are related. In this article we will explore the theory, assumptions and interpretation of one such test that is particularly useful for small sample sizes, Fisher’s exact test, and take you through a worked example.
What is Fisher's exact test?
Fisher’s exact test is a statistical hypothesis test used to assess the association between two binary variables in a contingency table, and it is particularly useful for small samples. It is a non-parametric test, meaning it makes no assumption about the distribution of the data, and it is analogous to the Chi-squared test for independence.
The hypotheses of Fisher’s exact test are as follows:
- The null hypothesis (H0) is that there is no association between the two variables.
- The alternative hypothesis (H1) is that there is an association between the two variables in any direction.
What is a contingency table?
A 2 x 2 contingency table, sometimes referred to as a cross-tabulation or a two-way table, displays in its cells the frequencies (counts) of each combination of two categorical variables, with row and column totals included. Contingency tables are powerful tools that help us to understand the relationship between two variables in a sample of data. You can find an example of a contingency table and how to interpret it in the final section of this article.
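As an illustration of how such a table is populated, here is a minimal Python sketch that cross-tabulates two binary variables and adds the row and column totals. The function name `contingency_table` and the small data set are hypothetical and are not taken from any study in this article.

```python
from collections import Counter

def contingency_table(exposure, outcome):
    """Cross-tabulate two binary variables given as parallel lists of booleans."""
    counts = Counter(zip(exposure, outcome))
    a = counts[(True, True)]    # exposure category 1, outcome category 1
    b = counts[(True, False)]   # exposure category 1, outcome category 2
    c = counts[(False, True)]   # exposure category 2, outcome category 1
    d = counts[(False, False)]  # exposure category 2, outcome category 2
    return {
        "cells": [[a, b], [c, d]],
        "row_totals": [a + b, c + d],
        "col_totals": [a + c, b + d],
        "n": a + b + c + d,
    }

# Hypothetical observations for seven subjects (illustrative only).
exposure = [True, True, True, False, False, False, False]
outcome = [True, False, False, True, False, False, False]

table = contingency_table(exposure, outcome)
print(table["cells"], table["row_totals"], table["col_totals"], table["n"])
```

Each subject contributes to exactly one cell, which is why the four cell counts always add up to the sample size n.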
Fisher’s exact test makes use of contingency tables to calculate the probability of observing the data as it is, considering all other possible arrangements of the observed data while maintaining the row and column totals fixed.
Assumptions of Fisher’s exact test
Assumptions for Fisher’s exact test are as follows:
- Both variables should be categorical and binary, meaning they can take one of two values, so that a 2 x 2 contingency table can be populated.
- Data should be randomly selected from independent samples; groups should have no relationship to each other, and observations cannot fall into more than one category simultaneously.
- One or more cell counts in the contingency table are small (less than 5). Where all counts are 5 or more, a Chi-squared test is usually performed instead. While Fisher’s exact test is theoretically valid when samples are large, it is computationally intensive and so is usually only used for small samples.
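The small-count rule of thumb above can be expressed as a short check. This is only a sketch: the function name `suggest_test` and the `threshold` parameter are assumptions made for illustration, and a count of exactly 5 is treated here as not small.

```python
def suggest_test(table, threshold=5):
    """Apply the small-count rule of thumb to a 2 x 2 table [[a, b], [c, d]]."""
    cells = [count for row in table for count in row]
    # Any observed count below the threshold points to the exact test.
    if any(count < threshold for count in cells):
        return "Fisher's exact test"
    return "Chi-squared test"

print(suggest_test([[3, 11], [1, 15]]))    # two cells are below 5
print(suggest_test([[20, 30], [25, 25]]))  # every cell is 5 or more
```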
When to use Fisher's exact test and Fisher test alternatives
When comparing proportions of categorical variables, the comparison is usually made using a Chi-squared test, which assumes that the distribution of the counts can be approximated by a Normal distribution.
However, this approximation is only valid when the sample size is large. When the sample size is small, we can instead assess the association by evaluating all possible arrangements of the data and computing exact probabilities (exact because no approximation is involved) and an exact p-value. When we are interested in two binary variables, this is where Fisher’s exact test comes in.
If data samples are paired (for example, two measurements come from the same group of participants) and the number of pairs is small, then an exact McNemar’s test can be used as an alternative to Fisher’s exact test.
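For orientation, an exact McNemar’s test can be sketched with a binomial calculation on the discordant pairs (pairs whose two measurements disagree). The function name `exact_mcnemar` and the argument names `b` and `c` for the two discordant counts are assumptions made for this illustration, not notation used elsewhere in this article.

```python
from math import comb

def exact_mcnemar(b, c):
    """Two-sided exact McNemar p-value from the two discordant pair counts.

    Under the null hypothesis, each discordant pair is equally likely to
    fall in either direction, so the smaller count follows a
    Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence either way
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # double the tail and cap at 1

# Hypothetical counts: 1 pair changed one way, 6 pairs changed the other way.
print(exact_mcnemar(1, 6))
```

Only the discordant pairs enter the calculation; pairs with identical measurements carry no information about a change.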
Interpreting Fisher’s test results
The Fisher’s exact test is performed by calculating the probability of the data that is observed if the null hypothesis (no association) is true, by using all possible 2 x 2 tables that hypothetically could have been observed, for the same row and column totals as those that are observed in the data (these are sometimes referred to as the marginal totals).
In other words, we are assessing how extreme our table of frequencies is in relation to all possible versions of it that could have occurred under the marginal totals and from this, making an inference about the association between the two variables.
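To make "all possible versions" concrete, the sketch below lists every 2 x 2 table that shares a given set of marginal totals. The function name `tables_with_margins` and the example totals are hypothetical. Once the margins are fixed, choosing the top-left cell determines the other three cells.

```python
def tables_with_margins(row1, row2, col1):
    """Yield every 2 x 2 table [[a, b], [c, d]] with the given marginal totals.

    row1 and row2 are the row totals and col1 is the first column total;
    the second column total is implied by the other three.
    """
    lowest = max(0, col1 - row2)   # a cannot leave c larger than row2
    highest = min(row1, col1)      # a cannot exceed its row or column total
    for a in range(lowest, highest + 1):
        b = row1 - a
        c = col1 - a
        d = row2 - c
        yield [[a, b], [c, d]]

# Hypothetical margins: row totals 3 and 4, first column total 2.
for table in tables_with_margins(3, 4, 2):
    print(table)
```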
We can generate a 2 x 2 contingency table as follows (Table 1):
Table 1: Example of a 2 x 2 contingency table.
| | Outcome category 1 | Outcome category 2 | Total |
|---|---|---|---|
| Exposure category 1 | a | b | a + b |
| Exposure category 2 | c | d | c + d |
| Total | a + c | b + d | a + b + c + d = n |
The formula for the exact probability (P) of the observed table is then:
$$P = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$
where the symbol ! denotes a factorial, meaning we multiply together all integers from 1 up to the value preceding the symbol. For example, 4! = 1 x 2 x 3 x 4 = 24.
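A minimal Python sketch of this factorial calculation follows, using the cell labels a, b, c and d from Table 1. The function name `table_probability` and the example counts are hypothetical.

```python
from math import factorial

def table_probability(a, b, c, d):
    """Exact probability of the table [[a, b], [c, d]] with its margins fixed:
    (a+b)! (c+d)! (a+c)! (b+d)! / (a! b! c! d! n!)
    """
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(a) * factorial(b) * factorial(c)
                   * factorial(d) * factorial(n))
    return numerator / denominator

# Hypothetical table with cells 1, 2, 1, 3.
print(table_probability(1, 2, 1, 3))
```

Python integers do not overflow, so the factorials are computed exactly before the final division. For very large tables, working with logarithms of factorials would be a more economical choice.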
The exact probability of the observed table is calculated by considering the different ways the cell frequencies could be arranged under the marginal totals, where one of these arrangements corresponds to our observed cell frequencies. The one-sided p-value is the probability of obtaining our observed table or a more extreme one in the “tail” of the distribution (the tables that lie further in the direction of the observed difference). It follows that the one-sided p-value reflects only one direction of effect.
Usually more of interest is the two-sided p-value as it gives both directions of the effect and better represents the hypotheses of our test, in that it is the probability of observing data as extreme or more extreme than the observed results, assuming the null hypothesis is true.
There are two common ways the two-sided p-value can be calculated:
- Twice the one-sided p-value: the probability of the observed table plus the probabilities in the “tail” of the distribution (the more extreme tables in the direction of the observed difference), multiplied by two
- The probability of the observed table plus the probabilities of all other tables whose probability is less than or equal to that of the observed table
The two approaches can give different, but equally valid, results. In practice, we rely on statistical software to perform these calculations.
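The two approaches can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the function name `two_sided_p_values` is hypothetical, the direction of the observed difference is decided by comparing the top-left cell with its expected count, and probabilities are written with binomial coefficients, which is algebraically the same as the factorial form.

```python
from math import comb

def two_sided_p_values(a, b, c, d):
    """Return (one_sided, doubled, by_probability) for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Probability of the table whose top-left cell is k, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    p_observed = prob(a)

    # Assumed convention: the tail lies on the side of the expected count
    # where the observed top-left cell falls.
    expected = row1 * col1 / n
    if a >= expected:
        one_sided = sum(prob(k) for k in support if k >= a)
    else:
        one_sided = sum(prob(k) for k in support if k <= a)

    # Approach 1: double the one-sided p-value, capped at 1.
    doubled = min(1.0, 2 * one_sided)

    # Approach 2: add up every table that is no more probable than the
    # observed one (a small tolerance lets floating-point ties count as equal).
    cutoff = p_observed * (1 + 1e-9)
    by_probability = min(1.0, sum(prob(k) for k in support if prob(k) <= cutoff))

    return one_sided, doubled, by_probability

# Hypothetical table with cells 4, 1, 2, 8.
one_sided, doubled, by_probability = two_sided_p_values(4, 1, 2, 8)
print(round(one_sided, 4), round(doubled, 4), round(by_probability, 4))
```

The two results coincide when the distribution of possible tables is symmetric and generally differ otherwise, which is why it helps to report the approach that was used.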
A Fisher exact test example
Let’s take an example where we are interested in the association between coffee consumption (binary exposure, whether the people in a study drink coffee, yes/no) and cancer (binary outcome variable, whether people in the study developed cancer, yes/no).
Step one is to present the null and alternative hypotheses
The null hypothesis (H0) is that there is no association between coffee consumption and cancer in the study. The alternative hypothesis (H1) is that there is an association between coffee consumption and cancer.
Step two is to populate the contingency table and calculate the exact probability
Our contingency table might look like this (Table 2):
Table 2: Contingency table showing coffee consumption and cancer incidence in study subjects.
| | Cancer | No cancer | Total |
|---|---|---|---|
| Coffee drinkers | 3 | 11 | 14 |
| Non-coffee drinkers | 1 | 15 | 16 |
| Total | 4 | 26 | 30 |
Next, we calculate the exact probability in the observed contingency table:
$$P = \frac{14!\,16!\,4!\,26!}{3!\,11!\,1!\,15!\,30!} \approx 0.212$$
Step three is to obtain p-values using the exact probabilities of other possible tables
The other possible contingency tables under the marginal totals and their exact probabilities are as follows (Table 3):
Table 3: All possible tables that may exist under the marginal totals.
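| | Cancer | No cancer | Total |
|---|---|---|---|
| Coffee drinkers | 0 | 14 | 14 |
| Non-coffee drinkers | 4 | 12 | 16 |
| Total | 4 | 26 | 30 |

P = 0.066

| | Cancer | No cancer | Total |
|---|---|---|---|
| Coffee drinkers | 1 | 13 | 14 |
| Non-coffee drinkers | 3 | 13 | 16 |
| Total | 4 | 26 | 30 |

P = 0.286

| | Cancer | No cancer | Total |
|---|---|---|---|
| Coffee drinkers | 2 | 12 | 14 |
| Non-coffee drinkers | 2 | 14 | 16 |
| Total | 4 | 26 | 30 |

P = 0.398

| | Cancer | No cancer | Total |
|---|---|---|---|
| Coffee drinkers | 4 | 10 | 14 |
| Non-coffee drinkers | 0 | 16 | 16 |
| Total | 4 | 26 | 30 |

P = 0.036

Using the first approach: two-sided p-value = 2 x (0.212 + 0.036) = 0.496
Using the second approach: two-sided p-value = 0.212 + (0.066 + 0.036) = 0.314. The tables with one and two cancer cases among coffee drinkers are excluded because their probabilities (0.286 and 0.398) are greater than that of the observed table (0.212).
These sums use the probabilities as shown to three decimal places; unrounded arithmetic gives 0.498 and 0.315 respectively.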
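The arithmetic for the coffee example can be checked with exact fractions, which avoids any rounding in the intermediate sums. This is a sketch only: the variable names are hypothetical, and the marginal totals and observed count are taken from Table 2.

```python
from fractions import Fraction
from math import comb

# Marginal totals and observed top-left cell from Table 2.
coffee, no_coffee, cancer = 14, 16, 4
observed = 3
n = coffee + no_coffee

# Exact probability of each possible table, indexed by the number of
# coffee drinkers with cancer (the top-left cell).
probs = {
    k: Fraction(comb(coffee, k) * comb(no_coffee, cancer - k), comb(n, cancer))
    for k in range(max(0, cancer - no_coffee), min(coffee, cancer) + 1)
}
assert sum(probs.values()) == 1  # the possible tables cover every outcome

# Approach 1: double the tail that starts at the observed table.
one_sided = sum(p for k, p in probs.items() if k >= observed)
doubled = min(Fraction(1), 2 * one_sided)

# Approach 2: add up every table no more probable than the observed one.
by_probability = sum(p for p in probs.values() if p <= probs[observed])

for k, p in probs.items():
    print(k, p, round(float(p), 4))
print("doubled one-sided:", round(float(doubled), 4))
print("sum of small probabilities:", round(float(by_probability), 4))
```

Because `Fraction` keeps every value as a ratio of integers, the comparison `p <= probs[observed]` is exact and needs no floating-point tolerance.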
Step four is to interpret the p-values
While we obtain different two-sided p-values for each approach we used, they are both valid and both lead us to the same conclusion. Assuming a significance level of 0.05, we can see that both p-values are far greater than this. We can conclude that there is insufficient evidence to reject the null hypothesis, and that there is no evidence of an association between coffee drinking and cancer in this study.
Further reading:
- Freeman JV, Julious SA. The analysis of categorical data. Scope. 2007;16(1):18-21.
- Bland M. An Introduction to Medical Statistics (4th ed.). Oxford. Oxford University Press; 2015. ISBN:9780199589920
- Frost J. Fishers exact test: Using and interpreting. Statistics By Jim. https://statisticsbyjim.com/hypothesis-testing/fishers-exact-test/, Accessed April 2, 2024.