Learning Continuous Decomposable Models Using Mutual Information and Statistical Copulas
Luiz Desuó Neto, Henrique de Oliveira Caetano, Matheus de Souza Sant’Anna Fogliatto, Carlos Dias Maciel

TL;DR
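This paper introduces a method for learning decomposable dependence structures in continuous data, built on a mutual information identity over copula entropies. It reports better edge recovery on synthetic chordal benchmarks and interpretable dependence summaries on a gene expression dataset.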
Contribution
A novel mutual information identity and nonparametric estimation pipeline for decomposable graphical models.
Findings
The proposed method improves edge recovery accuracy on synthetic chordal benchmarks.
It produces interpretable dependence summaries on a real gene expression dataset.
A practical nonparametric copula entropy estimation pipeline is developed.
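As an illustration of what a nonparametric copula entropy estimate can look like, the sketch below rank-transforms the data to pseudo-observations and applies a Kozachenko-Leonenko k-nearest-neighbour entropy estimator. This is an assumed stand-in, not the paper's pipeline: the estimator choice, the function names (`pseudo_observations`, `knn_entropy`), and the settings `k=5` and `n=500` are all hypothetical.

```python
import math
import numpy as np

def pseudo_observations(x):
    """Rank-transform each column to (0, 1), removing the marginals."""
    n = x.shape[0]
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1  # ranks 1..n per column
    return ranks / (n + 1.0)

def digamma_int(m):
    """Digamma at a positive integer m, via the harmonic-number formula."""
    return -np.euler_gamma + np.sum(1.0 / np.arange(1, m))

def knn_entropy(u, k=5):
    """Kozachenko-Leonenko entropy estimate (in nats) with Euclidean distances."""
    n, d = u.shape
    diff = u[:, None, :] - u[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)           # exclude each point from its own neighbours
    eps = np.sort(dist, axis=1)[:, k - 1]    # distance to the k-th nearest neighbour
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * np.mean(np.log(eps))

# Toy data (assumed): an independent pair and a strongly correlated Gaussian pair.
rng = np.random.default_rng(0)
n = 500
independent = rng.normal(size=(n, 2))
z = rng.normal(size=(n, 2))
dependent = np.column_stack([z[:, 0], 0.9 * z[:, 0] + math.sqrt(1 - 0.9 ** 2) * z[:, 1]])

h_ind = knn_entropy(pseudo_observations(independent))
h_dep = knn_entropy(pseudo_observations(dependent))
print(f"copula entropy, independent pair: {h_ind:.3f}")
print(f"copula entropy, dependent pair:   {h_dep:.3f}")
```

Because the rank transform discards the marginals, the estimate reflects dependence only: it should sit near zero for the independent pair and be clearly negative for the correlated pair. The dense pairwise distance matrix keeps the sketch dependency-free but limits it to small samples.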
Abstract
Learning dependence graphs from multivariate continuous data is challenging when marginal distributions are heterogeneous, since likelihood-based nonparametric scores can be sensitive to smoothing choices and can confound marginal irregularities, including non-identifiability, with dependence. This work studies structure learning in the class of decomposable (chordal) Markov random fields, where junction tree factorizations enable tractable inference and local score updates. Our first contribution is a theoretical result showing that, under decomposability, mutual information can be expressed as a difference of clique/separator copula entropies, yielding a dependence-only decomposition aligned with the clique/separator structure. Building on this identity, we define an information-theoretic objective for decomposable graphs with a complexity penalty that preserves clique/separator…
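The paper's exact statement of the identity is not reproduced on this page. One form consistent with the abstract is that, for a density that factorizes over a junction tree, the total correlation (multivariate mutual information) equals the sum of separator copula entropies minus the sum of clique copula entropies. The sketch below checks this numerically in the Gaussian case, where the copula entropy of a variable subset A has the closed form 0.5 * log det R_AA for correlation matrix R. The precision matrix `K`, the graph, and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_copula_entropy(R, idx):
    """Copula entropy of the Gaussian copula on variables idx: 0.5 * log det R[idx, idx]."""
    idx = list(idx)
    sub = R[np.ix_(idx, idx)]
    return 0.5 * np.linalg.slogdet(sub)[1]

def total_correlation(R):
    """Total correlation of a Gaussian vector with correlation matrix R."""
    return -0.5 * np.linalg.slogdet(R)[1]

# Assumed precision matrix with K[0, 3] = 0, so X0 and X3 are conditionally
# independent given (X1, X2). The graph is chordal with cliques {0,1,2} and
# {1,2,3} and separator {1,2}. Diagonal dominance makes K positive definite.
K = np.array([[2.0, 0.6, 0.4, 0.0],
              [0.6, 2.0, 0.5, 0.7],
              [0.4, 0.5, 2.0, 0.3],
              [0.0, 0.7, 0.3, 2.0]])

cov = np.linalg.inv(K)
scale = np.sqrt(np.diag(cov))
R = cov / np.outer(scale, scale)   # correlation matrix

cliques = [(0, 1, 2), (1, 2, 3)]
separators = [(1, 2)]

lhs = total_correlation(R)
rhs = (sum(gaussian_copula_entropy(R, s) for s in separators)
       - sum(gaussian_copula_entropy(R, c) for c in cliques))
print(f"total correlation:                   {lhs:.6f}")
print(f"separator minus clique copula terms: {rhs:.6f}")
```

The two printed values agree up to floating-point error because the marginal entropy terms cancel: each variable appears in exactly one more clique than separator, so only dependence terms remain.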
Taxonomy
Topics: Bioinformatics and Genomic Networks · Single-cell and spatial transcriptomics · Statistical Methods and Inference
