The Wilcoxon rank-sum test, also known as the Mann-Whitney U test and often called simply the Wilcoxon test, is a non-parametric statistical test used to compare two unpaired, independent groups of data. Non-parametric tests do not assume that the population from which the samples are drawn follows a particular parametric distribution, such as the normal. This makes the Wilcoxon test a valuable tool when dealing with non-normally distributed data or ordinal data. The test does still assume that the observations are independent and that the values can be ranked.
Here is a general overview of how the Wilcoxon test works:
1. The two sets of independent data are combined and ranked together from smallest to largest, irrespective of the group from which each data point came. Tied values each receive the average of the ranks they would otherwise occupy.
2. The rank-sum for each group is then calculated. The rank-sum is the sum of the ranks assigned to each data point in the group.
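3. The test statistic, typically denoted as W, is calculated. Conventions differ: some references take W to be the rank sum of one designated group, others the smaller of the two rank sums. The equivalent Mann-Whitney statistic is U = W - n(n + 1)/2, where n is the size of the group whose rank sum is W.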
4. The null hypothesis of the Wilcoxon test is that the distributions of the two populations are identical, so that a randomly chosen value from one population is equally likely to be larger or smaller than a randomly chosen value from the other.
5. If the W statistic is sufficiently far from the value expected under the null hypothesis (judged against a table of critical values, an exact computation, or a normal approximation for larger samples), the null hypothesis is rejected.
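The steps above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not a reference implementation: the function names `midranks` and `rank_sum_test` and the sample data are made up for this example, tied values are given their average rank, and the p-value comes from a normal approximation without tie or continuity correction, which is only rough for samples this small.

```python
import math


def midranks(values):
    """Return 1-based ranks for `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Rank-sum statistics plus a two-sided normal-approximation p-value."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))   # step 1: rank the pooled data
    w1 = sum(ranks[:n1])                  # step 2: rank sum of each group
    w2 = sum(ranks[n1:])
    u1 = w1 - n1 * (n1 + 1) / 2           # step 3: Mann-Whitney U for each group
    u2 = w2 - n2 * (n2 + 1) / 2
    mean_u = n1 * n2 / 2                  # expected U under the null hypothesis
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u              # step 5: distance from the null value
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area of the normal
    return {"W1": w1, "W2": w2, "U1": u1, "U2": u2, "z": z, "p": p}


# Hypothetical sample data, chosen only to illustrate the calculation.
group_a = [1.1, 2.3, 3.0, 4.8]
group_b = [2.9, 5.2, 6.1, 7.4, 8.0]
result = rank_sum_test(group_a, group_b)
print(result["W1"], result["W2"], result["U1"], result["U2"])
```

For real analyses, an established routine such as `scipy.stats.mannwhitneyu` is usually a better choice, since it handles ties and can compute exact p-values for small samples.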
The Wilcoxon signed-rank test is a different, but related, test. This test is used for paired or matched samples to test the hypothesis that the differences between pairs of observations come from a symmetric distribution centered around zero.
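To make the contrast with the rank-sum test concrete, here is a small Python sketch of how the signed-rank statistic might be computed for paired data. The function name `signed_rank_statistic` and the before/after values are hypothetical. The sketch follows the common convention of dropping zero differences and giving tied absolute differences their average rank, and it stops at the statistic without computing a p-value.

```python
def signed_rank_statistic(before, after):
    """Return (W+, W-): rank sums of the positive and negative paired differences."""
    # Paired differences, discarding pairs with no change (a common convention).
    diffs = [a - b for b, a in zip(before, after) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def average_rank(value):
        # 1-based rank of `value` among the absolute differences,
        # averaged over ties.
        first = abs_sorted.index(value) + 1
        count = abs_sorted.count(value)
        return first + (count - 1) / 2

    w_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus


# Hypothetical paired measurements (integers, to avoid float equality issues).
before = [10, 12, 9, 14, 11, 13]
after = [12, 11, 13, 14, 16, 10]
w_plus, w_minus = signed_rank_statistic(before, after)
print(w_plus, w_minus)
```

Under the null hypothesis W+ and W- should be roughly equal, so a large imbalance between them is evidence against it. SciPy provides this test as `scipy.stats.wilcoxon`.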
It’s important to remember that the Wilcoxon test, like all statistical tests, provides a measure of the evidence against the null hypothesis, but it does not “prove” that one distribution is different from another, and its p-value alone says nothing about the nature or magnitude of any such difference. Additionally, non-parametric tests like the Wilcoxon tests are generally somewhat less powerful (i.e., less likely to reject the null hypothesis when it is false) than their parametric counterparts when the assumptions of the parametric tests are met. The loss is usually small: for normally distributed data, the rank-sum test has an asymptotic relative efficiency of about 0.955 (3/π) relative to the t-test.