Bayes optimal learning in high-dimensional linear regression with network side information
Sagnik Nandy, Subhabrata Sen

TL;DR
This paper studies the theoretical limits of, and practical algorithms for, high-dimensional linear regression when network side information is available. It shows that an AMP-based algorithm attains Bayes optimal performance and quantifies the information gained from the network.
Contribution
It introduces the Reg-Graph model, proposes an algorithm based on Approximate Message Passing (AMP), and characterizes the limiting mutual information, advancing the understanding of network side information in high-dimensional regression.
Findings
The AMP algorithm is provably Bayes optimal under general conditions.
The mutual information quantifies the statistical benefit of network side information.
Numerical experiments show excellent finite-sample performance.
Abstract
Supervised learning problems with side information in the form of a network arise frequently in applications in genomics, proteomics and neuroscience. For example, in genetic applications, the network side information can accurately capture background biological information on the intricate relations among the relevant genes. In this paper, we initiate a study of Bayes optimal learning in high-dimensional linear regression with network side information. To this end, we first introduce a simple generative model (called the Reg-Graph model) which posits a joint distribution for the supervised data and the observed network through a common set of latent parameters. Next, we introduce an iterative algorithm based on Approximate Message Passing (AMP) which is provably Bayes optimal under very general conditions. In addition, we characterize the limiting mutual information between the latent…
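To make the idea of a shared-latent generative model concrete, here is a minimal simulation sketch in Python. It is not the paper's exact specification: the coupling between the regression coefficients and the node labels, the rank-one-plus-Gaussian-noise form of the network, and every name and parameter (`simulate_reg_graph`, `snr`, `noise_sd`) are assumptions chosen for illustration only.

```python
import numpy as np

def simulate_reg_graph(n=200, p=100, snr=2.0, noise_sd=1.0, seed=0):
    """Draw (X, y, A) from a toy shared-latent model (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Latent node labels sigma_i in {-1, +1}; one label per feature/gene.
    sigma = rng.choice([-1.0, 1.0], size=p)

    # Hypothetical coupling: the sign of each coefficient follows its label.
    beta = sigma * np.abs(rng.normal(size=p))

    # Supervised data: Gaussian design and a linear response with additive noise.
    X = rng.normal(size=(n, p)) / np.sqrt(n)
    y = X @ beta + noise_sd * rng.normal(size=n)

    # Network side information: symmetric rank-one signal plus symmetric Gaussian noise.
    G = rng.normal(size=(p, p))
    W = (G + G.T) / np.sqrt(2 * p)
    A = (snr / p) * np.outer(sigma, sigma) + W

    return X, y, A, beta, sigma

X, y, A, beta, sigma = simulate_reg_graph()
print(X.shape, y.shape, A.shape)
```

In this sketch both `y` and `A` depend on the same latent vector `sigma`, which is the feature the abstract highlights: the network carries information about the regression coefficients through shared latent parameters.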
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Bayesian Methods and Mixture Models
Methods: Linear Regression
