On the Theory and Algorithm for rigorous discretization in applications of Information Theory
Venkateshan Kannan, Jesper Tegnér

TL;DR
This paper investigates fundamental theoretical issues with discretization in estimating information-theoretic quantities, illustrating their impact with biological network reconstruction and proposing an algorithm to correct biases.
Contribution
It identifies core problems in discretization methods, explains their origins, and introduces a shared information metric algorithm to improve estimation accuracy.
Findings
Discretization issues significantly bias information estimates.
The proposed algorithm reduces bias and improves biological network reconstruction.
The results highlight the importance of proper discretization in information-theoretic analyses.
Abstract
We identify fundamental issues with discretization when estimating information-theoretic quantities in the analysis of data. These difficulties are theoretical in nature and arise with discrete datasets, carrying significant implications for the corresponding claims and results. Here we describe the origins of the methodological problems and provide a clear illustration of their impact with the example of biological network reconstruction. We propose an algorithm (shared information metric) that corrects for the biases, and the resulting improved performance of the algorithm demonstrates the need to take due consideration of this issue in different contexts.
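As a minimal illustration of how the choice of discretization alone can shift an information estimate, the sketch below computes a plug-in (histogram) mutual-information estimate for two independent variables at several bin counts. This is a generic demonstration of a well-known estimator bias, not the paper's shared information metric algorithm, and it may not correspond to the specific theoretical issues the paper analyzes. The function names, sample size, seed, and bin counts are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def plugin_mutual_information(xs, ys, bins):
    """Plug-in MI estimate (in nats) after equal-width binning of values in [0, 1)."""
    n = len(xs)
    # Map each value to a bin index; clamp so a value of exactly 1.0 stays in range.
    bx = [min(int(x * bins), bins - 1) for x in xs]
    by = [min(int(y * bins), bins - 1) for y in ys]
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        # p(i,j) * log( p(i,j) / (p(i) * p(j)) ), written in terms of raw counts.
        mi += (count / n) * math.log(count * n / (px[i] * py[j]))
    return mi


def demo(n=500, seed=0):
    """Estimate MI between two independent samples for several bin counts."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [rng.random() for _ in range(n)]  # independent of xs, so the true MI is 0
    return {b: plugin_mutual_information(xs, ys, b) for b in (2, 4, 8, 16)}


if __name__ == "__main__":
    for bins, mi in demo().items():
        print(f"bins={bins:2d}  estimated MI={mi:.4f} nats")
```

Although the true mutual information here is zero, the estimate is non-negative and cannot decrease as the bins are refined, since each coarser binning is a function of the finer one. The reported value therefore depends on the discretization rather than on the data alone.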
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Neural Networks and Applications · Computability, Logic, AI Algorithms
