Fast tree inference with weighted fusion penalties
Julien Chiquet, Pierre Gutierrez, Guillem Rigaill

TL;DR
This paper introduces a fast inference method for multidimensional fusion penalties whose solution path is a tree. The tree makes the result easier to interpret and cheaper to compute when the data set contains many conditions, and the method is supported by theoretical guarantees and by simulation results.
Contribution
It develops a homotopy algorithm for weighted fusion penalties, in particular for the $\ell_1$ and $\ell_\infty$ norms, and shows that it recovers the structure of the data efficiently and with statistical guarantees.
Findings
The path of solutions forms a tree structure for uniform weights.
The homotopy algorithm exactly recovers the whole tree structure for the $\ell_1$ and $\ell_\infty$ norms, for which the path is piecewise linear.
The method outperforms competitors in speed and accuracy in simulations.
Abstract
Given a data set with many features observed in a large number of conditions, it is desirable to fuse and aggregate conditions which are similar to ease the interpretation and extract the main characteristics of the data. This paper presents a multidimensional fusion penalty framework to address this question when the number of conditions is large. If the fusion penalty is encoded by an $\ell_q$-norm, we prove for uniform weights that the path of solutions is a tree which is suitable for interpretability. For the $\ell_1$ and $\ell_\infty$-norms, the path is piecewise linear and we derive a homotopy algorithm to recover exactly the whole tree structure. For weighted $\ell_1$-fusion penalties, we demonstrate that distance-decreasing weights lead to balanced tree structures. For a subclass of these weights that we call "exponentially adaptive", we derive an …
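To make the tree-shaped path concrete, here is a minimal sketch for the simplest setting only: a single feature, an $\ell_1$ fusion penalty and uniform weights. It is not the authors' implementation. It assumes the objective $\frac{1}{2}\sum_k (y_k - \beta_k)^2 + \lambda \sum_{k<l} |\beta_k - \beta_l|$, which the summary above does not spell out. Under that assumption, a cluster $C$ with mean $\bar{y}_C$ moves along $\beta_C(\lambda) = \bar{y}_C + \lambda (N_{above} - N_{below})$, where $N_{above}$ and $N_{below}$ count the observations in clusters above and below it, so two adjacent clusters $C < D$ meet at $\lambda = (\bar{y}_D - \bar{y}_C)/(n_C + n_D)$. The function name `fusion_path_1d` is hypothetical, and the scan over adjacent pairs is deliberately naive (quadratic time) to keep the sketch short.

```python
from typing import List, Tuple


def fusion_path_1d(y: List[float]) -> List[Tuple[float, List[float]]]:
    """Fusion events (lambda, members of the merged cluster), in path order.

    Assumes a single feature, an l1 fusion penalty and unit weights, and
    relies on the no-split (tree) property stated for uniform weights.
    """
    # Each cluster is [sum of its observations, sorted list of observations];
    # clusters are kept in increasing order of their mean.
    clusters = [[v, [v]] for v in sorted(y)]
    events = []
    while len(clusters) > 1:
        best_i, best_lam = 0, float("inf")
        # Only adjacent clusters can fuse; find the pair that meets first.
        for i in range(len(clusters) - 1):
            (s_lo, m_lo), (s_hi, m_hi) = clusters[i], clusters[i + 1]
            gap = s_hi / len(m_hi) - s_lo / len(m_lo)  # difference of means
            lam = gap / (len(m_lo) + len(m_hi))        # meeting value of lambda
            if lam < best_lam:
                best_i, best_lam = i, lam
        # Merge the chosen pair into one cluster; this adds one tree node.
        s_lo, m_lo = clusters[best_i]
        s_hi, m_hi = clusters.pop(best_i + 1)
        clusters[best_i] = [s_lo + s_hi, m_lo + m_hi]
        events.append((best_lam, m_lo + m_hi))
    return events
```

Each recorded event is one internal node of the tree: reading the list in order gives the agglomeration sequence, and the last event is the value of $\lambda$ at which every condition is fused into a single group. For example, `fusion_path_1d([3.0, 0.0, 1.0])` first fuses 0 and 1 at $\lambda = 0.5$ and then fuses all three observations.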
Taxonomy
Topics: Statistical Methods and Inference · Gene expression and cancer classification · Bioinformatics and Genomic Networks
