| The '''Kendall tau rank distance''' is a [[Metric (mathematics)|metric]] that counts the number of pairwise disagreements between two ranking lists. The larger the distance, the more dissimilar the two lists are. Kendall tau distance is also called '''bubble-sort distance''', since it equals the number of adjacent swaps that the [[bubble sort]] algorithm would make to place one list in the same order as the other list. The Kendall tau distance was created by [[Maurice Kendall]].
| | |
| ==Definition==
| |
| The Kendall tau ranking distance between two lists <math>L1</math> and <math>L2</math> is
| |
| | |
| : <math>K(\tau_1,\tau_2) = |\{(i,j): i < j, ( \tau_1(i) < \tau_1(j) \wedge \tau_2(i) > \tau_2(j) ) \vee ( \tau_1(i) > \tau_1(j) \wedge \tau_2(i) < \tau_2(j) )\}|.</math>
| |
| | |
| where
| |
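| * <math>\tau_1</math> and <math>\tau_2</math> are the rankings of the elements in <math>L1</math> and <math>L2</math>, respectively, so that <math>\tau_1(i)</math> is the rank of element <math>i</math> in <math>L1</math>.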
| |
| | |
| <math>K(\tau_1,\tau_2)</math> will be equal to 0 if the two lists are identical and <math>n(n-1)/2</math> (where <math>n</math> is the list size) if one list is the reverse of the other. Often Kendall tau distance is normalized by dividing by <math>n(n-1)/2</math> so a value of 1 indicates maximum disagreement. The normalized Kendall tau distance therefore lies in the interval [0,1].
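The definition can be illustrated with a minimal Python sketch. The function names <code>kendall_tau_distance</code> and <code>normalized_kendall_tau_distance</code> are chosen here for illustration only, and the rankings are assumed to be given as equal-length sequences in which position ''i'' holds the rank of element ''i''. The loop checks every pair, so it takes quadratic time.

```python
from itertools import combinations


def kendall_tau_distance(tau1, tau2):
    """Count the pairs (i, j), i < j, that tau1 and tau2 order oppositely."""
    if len(tau1) != len(tau2):
        raise ValueError("rankings must have the same length")
    count = 0
    for i, j in combinations(range(len(tau1)), 2):
        # The pair is discordant when the two rank differences have opposite signs.
        if (tau1[i] - tau1[j]) * (tau2[i] - tau2[j]) < 0:
            count += 1
    return count


def normalized_kendall_tau_distance(tau1, tau2):
    """Divide the distance by n(n-1)/2, the largest possible value."""
    n = len(tau1)
    if n < 2:
        return 0.0  # assumption: with no pairs to compare, report zero distance
    return kendall_tau_distance(tau1, tau2) / (n * (n - 1) / 2)
```

For two identical rankings the sketch returns 0, and for a ranking of four elements compared with its reverse it returns 6, which normalizes to 1.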
| |
| | |
| Kendall tau distance may also be defined as
| |
| | |
| : <math>K(\tau_1,\tau_2) = \begin{matrix} \sum_{\{i,j\}\in P} \bar{K}_{i,j}(\tau_1,\tau_2) \end{matrix}</math>
| |
| | |
| where
| |
| * ''P'' is the set of unordered pairs of distinct elements in <math>\tau_1</math> and <math>\tau_2</math>
| |
| * <math>\bar{K}_{i,j}(\tau_1,\tau_2)</math> = 0 if ''i'' and ''j'' are in the same order in <math>\tau_1</math> and <math>\tau_2</math>
| |
| * <math>\bar{K}_{i,j}(\tau_1,\tau_2)</math> = 1 if ''i'' and ''j'' are in the opposite order in <math>\tau_1</math> and <math>\tau_2.</math>
| |
| | |
| Kendall tau distance can also be defined as the total number of [[discordant pairs]].
| |
| | |
| In the context of rankings, a permutation (or ranking) is an array of ''N'' integers in which each integer between 0 and ''N''−1 appears exactly once.
| |
| The Kendall tau distance between two rankings is the number of pairs that are in different order in the two rankings. For example, the Kendall tau distance between 0 3 1 6 2 5 4 and 1 0 3 6 4 2 5 is four, because the pairs 0-1, 3-1, 2-4, and 5-4 are in different order in the two rankings, while all other pairs are in the same order.<ref>http://algs4.cs.princeton.edu/25applications/</ref>
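For permutations written this way, the distance can be computed faster than by checking every pair. The sketch below is one possible approach, not necessarily the one used in the cited source: it relabels the first permutation by each item's position in the second and then counts inversions with a merge sort, which takes ''O''(''N'' log ''N'') time. The helper names <code>count_inversions</code> and <code>kendall_tau_permutations</code> are illustrative.

```python
def count_inversions(seq):
    """Return (sorted copy of seq, number of inversions) using merge sort."""
    if len(seq) <= 1:
        return list(seq), 0
    mid = len(seq) // 2
    left, inv_left = count_inversions(seq[:mid])
    right, inv_right = count_inversions(seq[mid:])
    merged = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] precedes every remaining item of left: one inversion each.
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


def kendall_tau_permutations(a, b):
    """Kendall tau distance between two permutations of 0..N-1."""
    position_in_b = [0] * len(b)
    for index, item in enumerate(b):
        position_in_b[item] = index
    # Relabel a so that b becomes the identity; discordant pairs become inversions.
    relabelled = [position_in_b[item] for item in a]
    return count_inversions(relabelled)[1]


a = [0, 3, 1, 6, 2, 5, 4]
b = [1, 0, 3, 6, 4, 2, 5]
print(kendall_tau_permutations(a, b))  # 4
```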
| |
| | |
| If the Kendall tau function is applied directly to the lists, as <math>K(L1,L2)</math>, instead of to their rankings, as <math>K(\tau_1,\tau_2)</math> (where <math>\tau_1</math> and <math>\tau_2</math> are the rankings of the elements of <math>L1</math> and <math>L2</math> respectively), then the triangle inequality is not guaranteed. It can fail when the lists contain repeated values, in which case the function is no longer a metric.
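A small counterexample makes the failure concrete. The sketch below applies the strict-inequality definition directly to list values, so a tied pair counts as neither concordant nor discordant. The three lists and the helper name <code>discordant_pairs</code> are chosen here purely for illustration.

```python
from itertools import combinations


def discordant_pairs(x, y):
    """Count index pairs that x and y order in strictly opposite ways."""
    return sum(
        1
        for i, j in combinations(range(len(x)), 2)
        if (x[i] - x[j]) * (y[i] - y[j]) < 0
    )


L1 = [1, 2]
L2 = [1, 1]  # contains a repeated value
L3 = [2, 1]

d12 = discordant_pairs(L1, L2)  # 0: the tie in L2 is never discordant
d23 = discordant_pairs(L2, L3)  # 0: same reason
d13 = discordant_pairs(L1, L3)  # 1: the single pair is reversed

print(d13 <= d12 + d23)  # False, so the triangle inequality fails
```

The same lists also show that two different lists can be at distance 0 from each other, which a metric does not allow.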
| |
| | |
| ==Example==
| |
| Suppose we rank a group of five people by height and by weight:
| |
| | |
| {| border="1" cellpadding="2"
| |
| |-
| |
| ! Person !! A !! B !! C !! D !! E
| |
| |-
| |
| ! Rank by Height
| |
| | 1 || 2 || 3 || 4 || 5
| |
| |-
| |
| ! Rank by Weight
| |
| | 3 || 4 || 1 || 2 || 5
| |
| |}
| |
| | |
| Here person A is tallest and third-heaviest, and so on.
| | |
| In order to calculate the Kendall tau distance, pair each person with every other person and count the number of times the values in list 1 are in the opposite order of the values in list 2.
| |
| | |
| {| border="1" cellpadding="2"
| |
| |-
| |
| ! Pair !! Height !! Weight !! Count
| |
| |-
| |
| ! (A,B)
| |
| | 1 < 2 || 3 < 4 ||
| |
| |-
| |
| ! (A,C)
| |
| | 1 < 3 || 3 > 1 || '''X'''
| |
| |-
| |
| ! (A,D)
| |
| | 1 < 4 || 3 > 2 || '''X'''
| |
| |-
| |
| ! (A,E)
| |
| | 1 < 5 || 3 < 5 ||
| |
| |-
| |
| ! (B,C)
| |
| | 2 < 3 || 4 > 1 || '''X'''
| |
| |-
| |
| ! (B,D)
| |
| | 2 < 4 || 4 > 2 || '''X'''
| |
| |-
| |
| ! (B,E)
| |
| | 2 < 5 || 4 < 5 ||
| |
| |-
| |
| ! (C,D)
| |
| | 3 < 4 || 1 < 2 ||
| |
| |-
| |
| ! (C,E)
| |
| | 3 < 5 || 1 < 5 ||
| |
| |-
| |
| ! (D,E)
| |
| | 4 < 5 || 2 < 5 ||
| |
| |}
| |
| | |
| Since there are 4 pairs whose values are in opposite order, the Kendall tau distance is 4. The normalized Kendall tau distance is
| |
| | |
| : <math>\frac{4}{5(5 - 1)/2} = 0.4.</math>
| |
| | |
| A value of 0.4 indicates that 40% of the pairs are ordered differently in the two rankings.
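The pair-by-pair count in this example can be reproduced with a short Python sketch. The dictionaries <code>height</code> and <code>weight</code> simply restate the ranks from the table, and the variable names are illustrative.

```python
from itertools import combinations

# Ranks copied from the table of five people.
height = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
weight = {"A": 3, "B": 4, "C": 1, "D": 2, "E": 5}

# A pair is discordant when height and weight order it in opposite ways.
discordant = [
    (p, q)
    for p, q in combinations(sorted(height), 2)
    if (height[p] - height[q]) * (weight[p] - weight[q]) < 0
]

distance = len(discordant)
n = len(height)
normalized = distance / (n * (n - 1) / 2)

print(discordant)            # [('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D')]
print(distance, normalized)  # 4 0.4
```

The four pairs printed are the rows marked '''X''' in the table, and the normalized value is 4 divided by the 10 possible pairs.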
| |
| | |
| ==See also==
| |
| * [[Kendall tau rank correlation coefficient]]
| |
| * [[Spearman's rank correlation coefficient]]
| |
| * [[Kemeny-Young method|Kemeny-Young ("maximum likelihood") voting rule]]
| |
| | |
| ==References==
| |
| {{reflist}}
| |
| * {{cite journal | author = Fagin, R., Kumar, R., and Sivakumar, D. | year = 2003 | title = Comparing top k lists | journal = [[SIAM Journal on Discrete Mathematics]] | pages = 134–160 | url = http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.86.3234&rep=rep1&type=pdf | issue = 1 | doi = 10.1137/S0895480102412856 | volume = 17}}
| |
| * Kendall, M. (1948) ''Rank Correlation Methods'', Charles Griffin & Company Limited
| |
| * Kendall, M. (1938) "A New Measure of Rank Correlation", [[Biometrika]], 30, 81-89.
| |
| | |
| ==External links==
| |
| * [http://www.rsscse-edu.org.uk/tsj/bts/noether/text.html Why Kendall tau?]
| |
| * [http://www.wessa.net/rwasp_kendall.wasp Online software: computes Kendall's tau rank correlation]
| |
| | |
| [[Category:Covariance and correlation]]
| |
| [[Category:Statistical distance measures]]
| |
| [[Category:Comparison of assessments]]
| |