A Linear Relationship between Correlation and Cohen's Kappa for Binary Data and Simulating Multivariate Nominal and Ordinal Data with Specified Kappa Matrix
Soumya Sahu, Hakan Demirtas

TL;DR
This paper establishes a linear relationship between correlation and Cohen's kappa for binary data, develops bounds for kappa, and introduces a method to generate multivariate data with specified kappa matrices.
Contribution
It reveals a linear correlation-kappa relationship, provides algorithms for exact bounds, and proposes a novel data simulation method for multivariate nominal and ordinal data.
Findings
Linear relationship between correlation and kappa for binary data
Exact bounds for kappa based on marginals
Method to generate multivariate data with specified kappa matrix
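The first finding can be checked numerically from the standard definitions of Cohen's kappa and the phi (Pearson) correlation for a 2×2 table. The sketch below is an illustration, not the authors' code: the function names `kappa_and_phi` and `slope` are hypothetical, and the slope expression is derived here from those textbook definitions rather than quoted from the paper. For fixed marginals p1 = P(X=1) and p2 = P(Y=1), it verifies that kappa equals a constant multiple of phi.

```python
import math

def kappa_and_phi(p11, p1, p2):
    """Cohen's kappa and phi for a 2x2 table.

    p11 = P(X=1, Y=1), p1 = P(X=1), p2 = P(Y=1).
    (Hypothetical helper; names are illustrative.)
    """
    p00 = 1.0 - p1 - p2 + p11                  # P(X=0, Y=0) implied by the marginals
    p_obs = p11 + p00                          # observed agreement
    p_exp = p1 * p2 + (1 - p1) * (1 - p2)      # agreement expected under independence
    kappa = (p_obs - p_exp) / (1 - p_exp)
    phi = (p11 - p1 * p2) / math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    return kappa, phi

def slope(p1, p2):
    """Ratio kappa / phi implied by the definitions above; depends only on the marginals."""
    return 2 * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2)) / (p1 + p2 - 2 * p1 * p2)

p1, p2 = 0.3, 0.6
# p11 must lie between max(0, p1 + p2 - 1) and min(p1, p2) to be a valid table.
for p11 in (0.0, 0.1, 0.18, 0.25, 0.3):
    kappa, phi = kappa_and_phi(p11, p1, p2)
    assert math.isclose(kappa, slope(p1, p2) * phi, abs_tol=1e-12)
```

Under these definitions the slope equals 1 when the two marginals are equal, so kappa and phi coincide in that case.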
Abstract
Cohen's kappa is a useful measure of agreement between judges, inter-rater reliability, and goodness of fit in classification problems. For binary nominal and ordinal data, kappa and correlation are equally applicable. We have found a linear relationship between correlation and kappa for binary data. Exact bounds of kappa are much more important as kappa can be only 0.5 even if there is very strong agreement. The exact upper bound was developed by Cohen (1960) but the exact lower bound is also important if the range of kappa is small for some marginals. We have developed an algorithm to find the exact lower bound given marginal proportions. Our final contribution is a method to generate multivariate nominal and ordinal data with a specified kappa matrix based on the rearrangement of independently generated marginal data to a multidimensional contingency table, where cell counts…
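To make the point about bounds concrete, here is a minimal sketch for the binary case only. It is not the paper's lower-bound algorithm, which handles general marginals and is not described in this summary. It assumes that, for two binary raters, the joint probability P(X=1, Y=1) is constrained to the interval [max(0, p1 + p2 - 1), min(p1, p2)] (the Fréchet bounds), and that kappa increases with that probability when the marginals are fixed. The function name `binary_kappa_bounds` is hypothetical.

```python
def binary_kappa_bounds(p1, p2):
    """Smallest and largest attainable kappa for two binary raters.

    p1 = P(X=1), p2 = P(Y=1). Binary case only; illustrative helper,
    not the general algorithm from the paper.
    """
    p_exp = p1 * p2 + (1 - p1) * (1 - p2)      # chance agreement

    def kappa(p11):
        p_obs = p11 + (1 - p1 - p2 + p11)      # P(1,1) + P(0,0)
        return (p_obs - p_exp) / (1 - p_exp)

    p11_min = max(0.0, p1 + p2 - 1)            # Frechet lower bound on P(1,1)
    p11_max = min(p1, p2)                      # Frechet upper bound on P(1,1)
    return kappa(p11_min), kappa(p11_max)

lower, upper = binary_kappa_bounds(0.3, 0.6)
print(f"kappa range for marginals (0.3, 0.6): [{lower:.3f}, {upper:.3f}]")
```

With marginals 0.3 and 0.6 the largest attainable kappa is 4/9, below 0.5, even when the two raters agree as often as the marginals permit.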
Taxonomy
Topics: Statistical Methods and Applications · Statistical and Computational Modeling · Advanced Statistical Methods and Models
