Supervised Feature Selection in Graphs with Path Coding Penalties and Network Flows
Julien Mairal, Bin Yu

TL;DR
This paper introduces a computationally feasible method for selecting a sparse, well-connected subset of features that sit on a directed acyclic graph (DAG), using path coding penalties to improve interpretability and prediction in supervised learning tasks.
Contribution
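The paper proposes "path coding" penalties, which are structured sparsity penalties defined over paths on a DAG and are optimized efficiently with network flow techniques. This addresses the algorithmic challenges of earlier penalties, which typically require solving a combinatorially hard selection problem among all connected subgraphs.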
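To make the idea of a penalty defined over paths concrete, here is a minimal sketch of one plausible reading: the penalty of a set of selected features is the cost of the cheapest collection of DAG paths that covers it. This is a brute-force illustration for tiny graphs only, not the network flow algorithm the paper relies on. The function names, the example graph, and the cost model (a fixed `start_cost` per path plus `node_cost` per node) are hypothetical choices made for illustration.

```python
from itertools import combinations


def all_paths(dag):
    """Enumerate every directed path (as a tuple of nodes) in a small DAG.

    `dag` maps each node to a list of its successors; every node,
    including sinks, must appear as a key.
    """
    paths = []

    def extend(path):
        paths.append(tuple(path))
        for nxt in dag[path[-1]]:
            extend(path + [nxt])

    for node in dag:
        extend([node])
    return paths


def path_cost(path, start_cost=1.0, node_cost=0.5):
    """Hypothetical cost model: a fixed price to open a path plus a per-node price."""
    return start_cost + node_cost * len(path)


def cheapest_path_cover(dag, support):
    """Brute-force search for the cheapest set of paths whose union contains `support`.

    Exponential in the number of paths, so only usable on toy graphs.
    """
    support = set(support)
    if not support:
        return 0.0, ()
    paths = all_paths(dag)
    best_cost, best_cover = float("inf"), ()
    # With positive path costs, an optimal cover never needs more paths
    # than there are nodes to cover.
    for k in range(1, len(support) + 1):
        for cover in combinations(paths, k):
            if support <= set().union(*cover):
                cost = sum(path_cost(p) for p in cover)
                if cost < best_cost:
                    best_cost, best_cover = cost, cover
    return best_cost, best_cover


# Toy diamond-shaped DAG: 1 -> 2 -> 4 and 1 -> 3 -> 4.
dag = {1: [2, 3], 2: [4], 3: [4], 4: []}

connected_cost, _ = cheapest_path_cover(dag, {1, 2})     # one path (1, 2): cost 2.0
disconnected_cost, _ = cheapest_path_cover(dag, {2, 3})  # two paths (2,), (3,): cost 3.0
```

Under this assumed cost model, two features lying on a common path are cheaper to select than two features that need separate paths, which is the sense in which such a penalty favors connected supports.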
Findings
The approach is scalable and is demonstrated on synthetic, image, and genomic data.
It selects more connected subgraphs than existing regularization functions.
It improves interpretability and prediction performance in graph-based supervised learning.
Abstract
We consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network. In this context, it is of much interest to automatically select a subgraph with few connected components; by exploiting prior knowledge, one can indeed improve the prediction performance or obtain results that are easier to interpret. Regularization or penalty functions for selecting features in graphs have recently been proposed, but they raise new algorithmic challenges. For example, they typically require solving a combinatorially hard selection problem among all connected subgraphs. In this paper, we propose computationally feasible strategies to select a sparse and well-connected subset of features sitting on a directed acyclic graph (DAG). We introduce structured sparsity penalties over paths on a DAG called "path coding" penalties. Unlike existing…
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Sparse and Compressive Sensing Techniques
