The Wald-Wolfowitz test, also known as the runs test, is a nonparametric statistical test used to assess the randomness of a sequence of two different types of elements (e.g., positive/negative values, successes/failures). It examines whether the order of the elements in a sequence is random or whether a pattern or trend is present. Because it is nonparametric, the test applies to any ordered data regardless of the distribution of the population or sample, including when a large sample is available.
The test works by analyzing "runs" in the data. A run is a series of consecutive identical symbols (e.g., a run of positive values or a run of negative values). The Wald-Wolfowitz test compares the observed number of runs to the number of runs expected under randomness. Consider the following example of runs in a sequence:
Dataset-1:
0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1
In this dataset, [0, 0]; [1, 1, 1]; [0, 0, 0]; [1]; [0]; [1]; [0, 0, 0]; [1, 1]; [0, 0]; [1, 1] are the runs, for a total of 10 runs. Because 0 and 1 carry different information (e.g., absence and presence), a run cannot mix them. This means that a pair such as [0, 1] cannot be considered a run.
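The run count above can be reproduced with a few lines of Python. This is a minimal sketch; the helper name find_runs is chosen here for illustration and is not part of any library.

```python
from itertools import groupby

def find_runs(sequence):
    """Split a sequence into runs of consecutive identical elements."""
    # groupby starts a new group every time the value changes,
    # which is exactly the definition of a run.
    return [list(group) for _, group in groupby(sequence)]

dataset_1 = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1]
runs = find_runs(dataset_1)
print(runs)
print(len(runs))  # 10
```

Because groupby only compares neighbouring values for equality, the same helper should also work for sequences of symbols other than 0 and 1.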
The basic principle of the Wald-Wolfowitz runs (WWR) test is: reject the randomness of the data when the number of runs is extremely low or extremely high. The test gives a quantitative decision about randomness at a chosen level of significance, for instance, 0.05. The WWR test alone, however, does not indicate how random a given dataset is. The degree of randomness remains qualitative and needs to be interpreted based on the nature of the data (i.e., binary, categorical, or numerical).
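One common way to decide whether a run count is "extremely low or extremely high" is the large-sample normal approximation, which the text above does not spell out and which is assumed here. With n1 and n2 elements of each type and n = n1 + n2, the expected number of runs under randomness is 2*n1*n2/n + 1 and its variance is 2*n1*n2*(2*n1*n2 - n) / (n^2 * (n - 1)). The sketch below uses these formulas; the function name runs_test is illustrative, and for short sequences exact tables would be more appropriate than this approximation.

```python
import math
from itertools import groupby

def runs_test(sequence):
    """Two-sided runs test using the normal approximation (an assumption).

    Returns (observed_runs, expected_runs, z, p_value).
    """
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct symbols")
    n1 = sum(1 for x in sequence if x == symbols[0])
    n2 = len(sequence) - n1
    n = n1 + n2
    # Observed number of runs: one per group of consecutive equal values.
    observed = sum(1 for _ in groupby(sequence))
    # Mean and variance of the run count under the randomness hypothesis.
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if variance == 0:
        raise ValueError("sequence is too short for the normal approximation")
    z = (observed - expected) / math.sqrt(variance)
    # Two-sided p-value: P(|Z| >= |z|) for a standard normal Z.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return observed, expected, z, p_value

dataset_1 = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1]
observed, expected, z, p_value = runs_test(dataset_1)
print(observed, expected, z, p_value)
```

For Dataset-1 the observed 10 runs are close to the roughly 10.9 runs expected, so the p-value is well above 0.05 and randomness is not rejected at that level.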
The Wald-Wolfowitz runs test examines the randomness of ordered or sequential data. It counts the runs in the data and rejects randomness when the number of runs is too low or too high.
A run is a sequence of consecutive identical elements that is bounded on each side by a different element or by the start or end of the data, so no two runs overlap.
The runs can be computed for binary, categorical, or numerical data.
For example, the sequence of wins and losses in a series of tennis matches is binary data. Notice that the numbers of runs for dataset-1 and dataset-2 are extreme, making them less random than dataset-3.
A DNA sequence is a typical example of categorical data. Here, the numbers of runs for sequence-1 and sequence-2 are extreme, making them less random than sequence-3.
Computing runs for numerical data, such as the sizes of successive leaf pieces cut by a leafcutter bee, requires a reference value: the mean or the median of the data. Assign a + sign to every value higher than the mean or median and a - sign to every value lower, then count the runs in the resulting binary sequence of signs.
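The conversion from numerical values to signs can be sketched as follows. The leaf sizes are made-up numbers used only for illustration, the helper name sign_sequence is hypothetical, and dropping values exactly equal to the reference is an assumption, since the text does not say how ties are handled.

```python
from itertools import groupby
from statistics import median

def sign_sequence(values, reference=None):
    """Map numbers to '+' (above reference) or '-' (below reference).

    The reference defaults to the median. Values equal to the
    reference are dropped (an assumption; the text leaves ties open).
    """
    if reference is None:
        reference = median(values)
    return ["+" if v > reference else "-" for v in values if v != reference]

# Hypothetical leaf sizes in order of cutting (illustrative data only).
leaf_sizes = [12.1, 13.4, 11.8, 14.2, 15.0, 10.9, 11.2, 14.8, 13.9, 10.5]
signs = sign_sequence(leaf_sizes)
n_runs = sum(1 for _ in groupby(signs))
print("".join(signs))  # -+-++--++-
print(n_runs)          # 7
```

Passing the mean of the data as reference instead of relying on the default would give the mean-based variant described above.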