Background
Fisher's Exact tests have been used to test association in a paper in this issue of the Journal, namely that by Akintomide et al.1 These notes are intended to provide some supplementary explanation of this method (see Box 1 for a glossary of terms used in this article).
Box 1 Glossary of statistical terms used in this article
Association | Relationship between two variables. For categorical variables, the data can be envisaged as a cross-tabulation of counts of respondents with each combination of values for the two variables. ‘An association’ means that the occurrence, for an individual, of a particular value of one variable is associated with (more likely to occur in conjunction with) a particular value of the other variable. ‘No association’ means the distribution of values across rows will be approximately the same in each column, or vice versa. See also Independence, Ordinal association. |
Asymptotic test (large sample approximation) | A test where the p value is obtained by approximation, but it is known that as the sample size becomes very large, the calculated p value approaches the true value very closely. |
Binary variable | Has only two possible values (e.g. oral contraceptive user or not, female or not). |
Categorical variable | Such a variable has a limited set of distinct values (categories), and these values can be nominal (i.e. simply descriptive, such as blood group) or ordered (such as degree of impact, duration of professional experience). |
Cell frequency (count) | See Cross-tabulation. |
Chi-square (χ2) test | Test applied to cross-tabulated data for two categorical variables, to assess association between them. It is designed for nominal data (no inherent ordering) so for any table of counts the same χ2 would be obtained whatever the ordering of rows and/or columns. It compares observed cell counts against what would be expected under the null hypothesis of no association, and the greater the discrepancy, the stronger the suggestion of association. |
Confidence interval (95%) | This defines a range of values within which we are 95% confident the true population effect (in this article, rho) lies. |
Conservative | Tending to give a p value that is larger than it might truly be (i.e. less likely to lead to rejection of the null hypothesis). |
Cross-classified | See Cross-tabulation. |
Cross-tabulation | A way of setting out data for individuals cross-classified by two categorical variables. The size of the table is specified R×C, where R=number of distinct categories in row classification, and C=number of column categories. Each cell of the table gives the frequency count of the number of individuals with that combination of row and column values [e.g. a 3×4 table has 3 rows and 4 columns, comprising 12 distinct cells (combinations)]. |
Effect size | Used loosely here to mean the strength of the association. In 2×2 tables the ‘effect’ could be summarised in a number of ways including, say, the difference between two study groups in the percentage with some characteristic of interest (e.g. subgroup percentages of 34% vs 46% having some condition X give an effect size of ‘a difference of 12 percentage points in prevalence of X’). |
Expected (cell counts) | Calculated from the total n, and the marginal totals for each of the two classification variables. |
Fixed marginal totals | Preset by the design. This is sometimes the case for one variable, say group sizes for a trial, but not often for both variables in a cross-tabulation. |
Frequency count | See Cross-tabulation. |
Hypothesis test | See Significance test. |
Independence (between two variables) | There is independence between two random variables R and C, if the probability that a participant has any specified value of R is unchanged by knowledge of that person's value for variable C. Or, independence is the same as ‘no association’ between values for R and values for C. See also Association. |
Marginal totals | The totals for each row and column in a cross-tabulation. |
Monte Carlo method | While exact results are preferred because they are reliable, the calculations required are sometimes too unwieldy. The Monte Carlo method is a general iterative method of obtaining an unbiased estimate of the exact value sought, by repeatedly sampling subsets of the entire ‘problem’, obtaining a calculated value for each subset, and then ‘averaging’ these (subset) values across all repeated iterations. For the Fisher Exact test, the difficulty is usually that there are too many possible tables for which probabilities need to be calculated. In this case, Monte Carlo calculations of Fisher p values for a large enough number of subsets of tables provide an unbiased estimate of the exact p value sought. |
Nominal variable | Has a set of distinct ‘naming’ values, such as type of contraception (sterilisation, barrier, hormonal) or recruitment method (mailshot, general practitioner, Internet). |
Null hypothesis (NH) | A statement, prior to testing, of no effect (e.g. ‘no association’ between row and column classifications). See also Significance probability. |
Ordinal association | This is association between two ordinal variables such that, when a person has an ‘ordinally higher’ response (relative to group) on one of the two variables, he/she tends generally to give a response on the other variable that is also ordinally higher (direct or positive association) or tends generally to give a response that is ordinally ‘lower’ (inverse or negative association). |
Ordinal variable | Specifically this is a special subset of categorical variables where the values are conceptually ordered (e.g. degree of pain: none, mild, moderate, severe, etc.). In terms of statistical analysis, count variables (e.g. parity: 0, 1, 2, etc.) and continuous variables (e.g. weight: 62, 74, 91, etc.) are also conceptually ordinal, but might have too many possible values to be amenable to categorical methods of analysis. See Ordinal association. |
p value | See Significance probability. |
Power | This term is used here, loosely, as the probability of rejecting the stated NH on the basis of the study data, when that is in fact the correct decision. |
Significance probability (p value) | The probability, if the NH is true, of obtaining the observed data (combinations of responses on the two variables) or something more ‘extreme’ (i.e. further from the NH). The smaller p is, the less likely these data would be under the NH, and so the greater our doubts that the NH is indeed true. |
Significance test (or hypothesis test) | The process of testing aims to enable a binary decision to be made about the NH: reject NH or not. This decision is based on the significance probability (p value) obtained via the test. If the p value is low enough we will decide the data are so inconsistent with the NH that the NH of ‘no association’ should be rejected as untenable, hence, in this application, concluding there must be some association. However, the p value reflects the size (power) of the study as well as the strength of the association, so a more extreme p value does not necessarily mean a ‘stronger’ association. |
Spearman rho (non-parametric correlation coefficient) | Spearman rho is an index of the strength of association between values for two ordinal variables measured on the same individuals. Rho takes values between −1 and 1, with zero indicating no correlation, a positive value indicating a direct or positive correlation, and a negative value an inverse or negative correlation. Values 1 and –1 indicate perfect (direct or inverse) correlation. See also Ordinal association. |
Valid test | Used here loosely to mean a test that is suited to the research question and data variable(s) to be analysed, in the sense that the data to be analysed satisfy any data assumptions required for the test to perform adequately. |
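As a rough illustration of three of the glossary entries above (Expected cell counts, Chi-square test and Monte Carlo method), the following Python sketch may help. It is not taken from Akintomide et al. or from any particular statistical package: the function names, the example 3×3 table and the default of 2000 random tables are all assumptions made here for illustration. The Monte Carlo step uses one common approach, namely shuffling the column labels across individuals so that both sets of marginal totals are preserved, and counting how often the shuffled table is no more probable than the observed one. The code assumes every row and column total is greater than zero.

```python
import math
import random


def margins(table):
    """Return (row totals, column totals) for a table given as a list of rows."""
    return [sum(row) for row in table], [sum(col) for col in zip(*table)]


def expected_counts(table):
    """Expected counts under 'no association': row total * column total / n."""
    rows, cols = margins(table)
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]


def chi_square_statistic(table):
    """Sum over cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))


def log_table_probability(table):
    """Log probability of a table, given its marginal totals, under the NH."""
    rows, cols = margins(table)

    def log_fact(k):
        return math.lgamma(k + 1)

    return (sum(log_fact(r) for r in rows) + sum(log_fact(c) for c in cols)
            - log_fact(sum(rows))
            - sum(log_fact(x) for row in table for x in row))


def monte_carlo_fisher_p(table, n_samples=2000, seed=0):
    """Estimate the Fisher exact p value from randomly sampled tables."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    rows, cols = margins(table)
    # One row label and one column label per individual in the table.
    row_labels = [i for i, r in enumerate(rows) for _ in range(r)]
    col_labels = [j for j, c in enumerate(cols) for _ in range(c)]
    observed = log_table_probability(table)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(col_labels)  # breaks any association, keeps the margins
        sampled = [[0] * len(cols) for _ in rows]
        for i, j in zip(row_labels, col_labels):
            sampled[i][j] += 1
        # Count tables that are no more probable than the observed table.
        if log_table_probability(sampled) <= observed + 1e-9:
            hits += 1
    return hits / n_samples


# Hypothetical 3x3 cross-tabulation (not data from the linked article).
example = [[12, 8, 5], [9, 15, 11], [4, 10, 26]]
print(expected_counts(example))
print(chi_square_statistic(example))
print(monte_carlo_fisher_p(example))
```

Increasing `n_samples` narrows the sampling error of the estimated p value at the cost of more computation; the tolerance of `1e-9` only guards against rounding differences between tables that are equally probable.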
What is Fisher's Exact test?
Undoubtedly the most widely known test of association between two binary variables is the 2×2 Chi-square (χ2) test.2–5 However, many readers will also have learned at some point, most likely in a basic statistics course, that Fisher's Exact test is the advised, or in fact the obligatory, alternative to the 2×2 χ2 test when ‘the sample size is small’.2–5 It might seem surprising, then, that Fisher's Exact tests have been used for all analyses of association in the article by Akintomide et al., even though the n available for analysis is >100 in all analyses reported, and even though the cross-tabulations are not 2×2 but 3×3 or, in one case, 3×4.1
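For the familiar 2×2 case, the exact calculation can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by Akintomide et al.: the function name `fisher_exact_2x2` and the example counts are assumptions, and the two-sided p value is taken, as one common convention, to be the total probability of all tables with the same marginal totals that are no more probable than the observed table.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def ways(x):
        # Number of ways to obtain x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x)

    # Range of top-left cell values compatible with the marginal totals.
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = ways(a)
    # Add up every table that is no more probable than the observed one.
    extreme = sum(ways(x) for x in range(lowest, highest + 1)
                  if ways(x) <= observed)
    return extreme / comb(row1 + row2, col1)


# Hypothetical small table: 3 and 1 in the first row, 1 and 3 in the second.
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))  # prints 0.4857
```

Working with integer counts of arrangements, and dividing only once at the end, avoids rounding problems when deciding which tables are ‘as extreme’ as the observed one.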
The fact is, Fisher's Exact test of association between two categorical (classification) variables is much more widely applicable than basic statistics courses have led learners to believe. There is an historical reason why it has been so ‘overlooked’: the torturous arithmetic calculations that are required to carry out the Fisher Exact test for a cross-tabulation with large overall n, even more so to complete tests analogous to 2×2 Fisher …
Footnotes
Competing interests None.
Provenance and peer review Commissioned; internally peer reviewed.