Association measures for interval variables
M. Rosário Oliveira, Margarida Azeitona, António Pacheco, Rui Valadas

TL;DR
This paper investigates how different definitions of symbolic covariance matrices in Symbolic Data Analysis may misrepresent correlations in interval data, emphasizing the importance of understanding micro-data structures for accurate analysis.
Contribution
The paper introduces a population-level model linking micro-data and macro-data for interval-valued variables. The model exposes the micro-data assumptions behind existing symbolic covariance definitions and guides the choice among them based on what is known about the micro-data.
Findings
Existing covariance definitions can misrepresent correlations in macro-data.
Micro-data assumptions are crucial for selecting appropriate covariance measures.
There can be significant divergence between different covariance definitions in real datasets.
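To make the divergence concrete, here is a minimal sketch in Python (NumPy) that compares two ways of computing a covariance matrix from interval data. It is an illustration under stated assumptions, not the paper's model or its exact definitions: the function names are hypothetical, the first measure uses interval centers only, and the second assumes that the micro-data inside each interval are uniformly distributed and independent across variables, which adds the mean of range²/12 to each variance by the law of total variance.

```python
import numpy as np

def centers_and_ranges(lower, upper):
    """Return interval centers and ranges for (n_units x p) bound arrays."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return (lower + upper) / 2.0, upper - lower

def cov_centers(lower, upper):
    """Covariance of interval centers only (ignores within-interval spread)."""
    centers, _ = centers_and_ranges(lower, upper)
    return np.cov(centers, rowvar=False, bias=True)

def cov_uniform_micro(lower, upper):
    """Covariance under an ASSUMED micro-data model: uniform within each
    interval, independent across variables within a unit.
    Off-diagonal terms equal the covariance of centers; each variance gains
    the average within-interval variance, mean(range**2) / 12."""
    centers, ranges = centers_and_ranges(lower, upper)
    cov = np.cov(centers, rowvar=False, bias=True)
    within = np.mean(ranges ** 2, axis=0) / 12.0
    return cov + np.diag(within)

def to_correlation(cov):
    """Convert a covariance matrix into a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Toy macro-data: three units, two interval variables with identical bounds.
lower = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]])
upper = np.array([[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]])

corr_centers = to_correlation(cov_centers(lower, upper))[0, 1]
corr_uniform = to_correlation(cov_uniform_micro(lower, upper))[0, 1]

print("correlation from centers only:     ", corr_centers)
print("correlation under uniform micro-data:", corr_uniform)
```

On this toy example the centers are perfectly aligned, so the centers-only correlation is 1, while the uniform micro-data assumption lowers it to 8/9 (about 0.889) because the within-interval variance inflates both variances but not the covariance. Which value is appropriate depends on what is actually known about the micro-data.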
Abstract
Symbolic Data Analysis (SDA) is a relatively new field of statistics that extends conventional data analysis by taking into account intrinsic data variability and structure. Unlike conventional data analysis, in SDA the features characterizing the data can be multi-valued, such as intervals or histograms. SDA has been mainly approached from a sampling perspective. In this work, we propose a model that links the micro-data and macro-data of interval-valued symbolic variables, which takes a populational perspective. Using this model, we derive the micro-data assumptions underlying the various definitions of symbolic covariance matrices proposed in the literature, and show that these assumptions can be too restrictive, raising applicability concerns. We analyze the various definitions using worked examples and four datasets. Our results show that the existence/absence of correlations in…
Taxonomy
Topics: Neural Networks and Applications · Sensory Analysis and Statistical Methods · Complex Systems and Time Series Analysis
