Flexible sampling of discrete data correlations without the marginal distributions
Alfredo Kalaitzis, Ricardo Silva

TL;DR
This paper introduces an efficient Hamiltonian MCMC algorithm for modeling dependencies in discrete data using copulas, bypassing marginal distribution learning and improving sampling efficiency.
Contribution
It proposes a novel constrained Hamiltonian Monte Carlo method that reduces computational complexity in copula-based dependence modeling for discrete data.
Findings
Sampling is significantly faster than with traditional Gibbs methods.
Joint dependence is modeled effectively without estimating the marginal distributions.
The method is applicable to high-dimensional discrete data analysis.
Abstract
Learning the joint dependence of discrete variables is a fundamental problem in machine learning, with many applications including prediction, clustering and dimensionality reduction. More recently, the framework of copula modeling has gained popularity due to its modular parametrization of joint distributions. Among other properties, copulas provide a recipe for combining flexible models for univariate marginal distributions with parametric families suitable for potentially high dimensional dependence structures. More radically, the extended rank likelihood approach of Hoff (2007) bypasses learning marginal models completely when such information is ancillary to the learning task at hand as in, e.g., standard dimensionality reduction problems or copula parameter estimation. The main idea is to represent data by their observable rank statistics, ignoring any other information from the…
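To make the rank-statistic idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the usual reading of the extended rank likelihood: each observed column y has a latent continuous column z, and the only information kept from the data is that z[i] < z[j] whenever y[i] < y[j]. The function names `satisfies_rank_constraints` and `rank_bounds` are hypothetical and chosen for illustration only.

```python
import numpy as np

def satisfies_rank_constraints(y, z):
    """Check that z[i] < z[j] holds for every pair with y[i] < y[j].

    Assumption: this is the ordering constraint implied by the observed
    ranks of one discrete column; ties in y impose no constraint.
    """
    y = np.asarray(y)
    z = np.asarray(z, dtype=float)
    strictly_less = y[:, None] < y[None, :]            # pairs (i, j) with y[i] < y[j]
    violated = strictly_less & (z[:, None] >= z[None, :])
    return not violated.any()

def rank_bounds(y, z, i):
    """Interval (lower, upper) in which z[i] may move without changing
    the ordering implied by y, holding all other latent values fixed.
    """
    y = np.asarray(y)
    z = np.asarray(z, dtype=float)
    below = z[y < y[i]]   # latent values that must stay under z[i]
    above = z[y > y[i]]   # latent values that must stay over z[i]
    lower = below.max() if below.size else -np.inf
    upper = above.min() if above.size else np.inf
    return lower, upper

# Hypothetical toy column with a tie at level 1.
y = [0, 1, 1, 2]
z = [-1.0, 0.2, 0.5, 1.3]
print(satisfies_rank_constraints(y, z))
print(rank_bounds(y, z, 1))
```

A one-variable-at-a-time sampler would draw each latent value from its conditional distribution truncated to an interval like the one `rank_bounds` returns. With many tied observations these intervals become narrow, which is one plausible reason a sampler that moves all latent values jointly under the same constraints could mix faster.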
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods
