The Wilcoxon rank-sum test, also known as the Mann-Whitney U test, is a nonparametric test used to determine if there is a significant difference between the distributions of two independent samples. It is designed specifically for two independent populations, and its hypotheses and assumptions are as follows.
The null hypothesis is that the two populations have the same distribution, and hence the same median; the alternative hypothesis is that the distributions of the two populations are different. The test does not assume that the samples are normally distributed. When two samples drawn from identical populations are ranked as one single pool of data points, high and low ranks should be spread roughly evenly between the samples. If the smaller or larger ranks fall predominantly into one of the samples, the medians of the two populations are likely to differ.
To perform the Wilcoxon rank-sum test, data from both samples are combined into a single ranked list, where each value is assigned a rank from smallest to largest. If any values are tied, they are given the average of the ranks for those positions. The ranks are then separated back into their respective groups, and the sum of ranks is calculated for each group.
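The ranking procedure described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `average_ranks` and `rank_sums` and the sample values are hypothetical and chosen only to show how a tie is handled.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sums(sample_a, sample_b):
    """Pool both samples, rank them together, and sum the ranks per sample."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n_a = len(sample_a)
    return sum(ranks[:n_a]), sum(ranks[n_a:])


# Hypothetical data, not taken from any study; the two 3.1 values are tied.
sample_a = [2.4, 3.1, 4.0]
sample_b = [3.1, 5.2, 6.3, 7.5]
r_a, r_b = rank_sums(sample_a, sample_b)
print(r_a, r_b)
```

A useful sanity check is that the two rank sums always add up to N(N+1)/2, where N is the total number of pooled observations.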
The test statistic (denoted U in the Mann-Whitney U version) is derived from these rank sums, and its significance is evaluated to determine if there is a difference between the two samples. A significant result suggests that one sample has systematically higher or lower ranks, indicating a difference in the underlying distributions of the two populations. For small sample sizes (typically n<20), U is compared with critical values from a table, while for larger samples a z-score approximation is applied, because the statistic is approximately normally distributed under the null hypothesis. This test is particularly useful when the assumptions for a two-sample t-test are not met, such as with ordinal or non-normally distributed data, offering a robust alternative to assess differences between two independent groups.
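As a rough sketch of the large-sample calculation, the following Python function converts a rank sum into U and a z-score. The function name `u_and_z` and the input numbers are hypothetical; the code uses one common convention for U (the rank sum minus its minimum possible value) and assumes few or no ties, so no tie correction is applied to the standard deviation.

```python
import math


def u_and_z(rank_sum_1, n1, n2):
    """Return (U1, U2, z, two-sided p) from the rank sum of sample 1."""
    # U1 is the rank sum of sample 1 minus its smallest possible value.
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    # Mean and standard deviation of U under the null hypothesis.
    mean_u = n1 * n2 / 2
    # Assumption: no tie correction, so this is only accurate with few ties.
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u1, u2, z, p_two_sided


# Hypothetical inputs: sample sizes of 20 and 25, rank sum of 320 for sample 1.
u1, u2, z, p = u_and_z(320, 20, 25)
print(u1, u2, round(z, 3), p)
```

For small samples, the exact critical values from a table should be preferred over this approximation.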
When the sample size is sufficiently large, this test is nearly as efficient as its parametric counterpart even when the data are normally distributed (an asymptotic relative efficiency of about 0.955), and it can be more efficient when the distributions are heavy-tailed, which often makes it the preferred choice for data analysis. Its results are usually reliable even when the data contain outliers. However, the test is prone to an inflated type I error rate when the samples are biased, when the data are heteroscedastic (having different variances), or when the two distributions differ markedly in shape.
The Wilcoxon rank-sum test, or the Mann-Whitney U test, is a nonparametric test used to detect a difference between two populations by comparing their medians.
It strictly applies to two independent simple random samples with equal or unequal sample sizes.
Consider the prey capture response time in two different spider species.
Here, the null hypothesis states that the median response time of the two species is the same. The alternative hypothesis states otherwise.
The values in these two samples are ranked as a single pool of data points. The rank sum is then calculated separately for each sample.
When high or low ranks fall predominantly in one sample rather than the other, the two samples may have different medians.
The test statistic z is calculated using the following equations to test the hypothesis.
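The equations themselves are not reproduced in this text. As an assumption about what is intended, a commonly used large-sample form is given below, where R is the sum of ranks for the first sample and n_1 and n_2 are the two sample sizes. It does not include a correction for ties.

```latex
\mu_R = \frac{n_1 (n_1 + n_2 + 1)}{2}, \qquad
\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}, \qquad
z = \frac{R - \mu_R}{\sigma_R}
```

Here \mu_R and \sigma_R are the mean and standard deviation of R expected when the null hypothesis is true.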
In this example, the Wilcoxon rank-sum test is two-tailed. So, the test statistic should be compared with the positive and negative critical values, typically at the 5% significance level (z = ±1.96).
Since, in the present example, the test statistic lies outside the range bounded by these critical values, the null hypothesis is rejected, and the medians of the two samples are significantly different.