On the Analysis of Correlation Between Nominal Data and Numerical Data
Zenon Gniazdowski

TL;DR
This paper examines how to measure linear correlation between nominal and numerical data. It analyzes correlation coefficients under real and complex coding of the nominal variable and proposes data correction methods for cases where complex correlation measures are unsuitable.
Contribution
The paper introduces a method for assessing linear correlation when one variable is nominal, discusses the limitations of complex correlation coefficients as a measure of linear correlation, and proposes data correction techniques.
Findings
Real coding yields unambiguous linear correlation measures.
Under complex coding, the correlation coefficients change when the phases of the complex codes are permuted among classes of equal cardinality.
Data correction can facilitate correlation analysis with nominal data.
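The phase-permutation finding can be illustrated with a small numerical sketch. It is not the paper's procedure: it assumes, purely for illustration, that the four classes are coded as unit-modulus complex numbers with equally spaced phases, and that the complex coefficient is the usual Pearson formula with a conjugated inner product. The data, the helper name `complex_corr`, and the choice of four classes are all hypothetical.

```python
import itertools
import numpy as np


def complex_corr(z, x):
    """Pearson-type coefficient between a complex-coded variable z and a real x.

    Assumed definition (not taken from the paper): centered conjugate inner
    product divided by the product of the Euclidean norms.
    """
    zc = z - z.mean()
    xc = x - x.mean()
    # np.vdot conjugates its first argument.
    return np.vdot(zc, xc) / (np.linalg.norm(zc) * np.linalg.norm(xc))


# Hypothetical data: four nominal classes, two observations each
# (equal cardinalities), and one numerical variable x.
labels = np.repeat(np.arange(4), 2)
x = np.array([0.0, 2.0, 2.0, 4.0, 4.0, 6.0, 6.0, 8.0])

# Assumed coding: class k -> exp(i * phase), phases equally spaced on the circle.
phases = 2 * np.pi * np.arange(4) / 4

results = {}
for perm in itertools.permutations(range(4)):
    codes = np.exp(1j * phases[list(perm)])  # reassign phases to classes
    z = codes[labels]                        # complex-coded nominal variable
    results[perm] = complex_corr(z, x)

# Distinct moduli of the coefficient over all 24 phase assignments.
moduli = sorted({round(abs(r), 6) for r in results.values()})
print(moduli)
```

With this toy data the modulus of the coefficient takes more than one value across the permutations, so the same nominal variable yields different "strengths" of correlation depending only on which phase is attached to which class. Four classes are used deliberately: with three equally spaced phases, every reassignment is a rotation or reflection of the codes, which changes the argument of the coefficient but not its modulus.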
Abstract
The article investigates the possibility of measuring the strength of a linear correlation relationship between nominal data and numerical data. Correlation coefficients for variables coded with real numbers as well as for variables coded with complex numbers were studied. For variables coded with real numbers, unambiguous measures of real linear correlation were obtained. In the case of complex coding, it has been observed that the obtained complex correlation coefficients change with the permutation of the phases in the complex numbers used to code classes of elements with equal cardinalities. It was found that a necessary condition for linear correlation is the possibility of linear ordering of a set with data. Since linear order is not possible in the set of complex numbers, complex correlation coefficients cannot be used as a measure of linear correlation. In the event of such a…
Taxonomy
Topics: Advanced Data Processing Techniques · Cybersecurity and Information Systems
