Log-ratio Lasso: Scalable, Sparse Estimation for Log-ratio Models
Stephen Bates, Robert Tibshirani

TL;DR
This paper introduces a scalable method for sparse estimation of log-ratio models, enabling efficient feature selection in high-dimensional biological data, with improved predictive accuracy and interpretability.
Contribution
It presents a novel low-dimensional embedding of the log-ratio space and a two-step penalized fitting procedure for highly sparse, interpretable models.
Findings
Achieves highly sparse models with biologically relevant features
Improves predictive accuracy over less interpretable methods
Demonstrates effectiveness on cancer proteomics data
Abstract
Positive-valued signal data are common in many biological and medical applications, where the data are often generated from imaging techniques such as mass spectrometry. In such a setting, the relative intensities of the raw features are often the scientifically meaningful quantities, so it is of interest to identify relevant features that take the form of log-ratios of the raw inputs. When the log-ratios of all pairs of predictors are included, the dimensionality of this predictor space becomes large, so computationally efficient statistical procedures are required. We introduce an embedding of the log-ratio parameter space into a space of much lower dimension and develop an efficient penalized fitting procedure using this more tractable representation. This procedure serves as the foundation for a two-step fitting procedure that combines a convex filtering step with a second non-convex…
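The abstract does not spell out the embedding, so the following is only an illustrative sketch of why a lower-dimensional representation is plausible, not the authors' implementation. It relies on a standard identity: since log(x_j / x_k) = log x_j - log x_k, any linear model in the p(p-1)/2 pairwise log-ratios can be rewritten as a linear model in the p log-features with coefficients that sum to zero. The function names `pairwise_log_ratios` and `embed_coefficients`, and the simulated data, are hypothetical and chosen for illustration.

```python
import itertools
import numpy as np

def pairwise_log_ratios(X):
    """Build the n x p(p-1)/2 matrix of log(x_j / x_k) for all pairs j < k."""
    logX = np.log(X)
    pairs = list(itertools.combinations(range(X.shape[1]), 2))
    Z = np.column_stack([logX[:, j] - logX[:, k] for j, k in pairs])
    return Z, pairs

def embed_coefficients(theta, pairs, p):
    """Map log-ratio coefficients (one per pair) to p coefficients on log(X).

    Each pair (j, k) contributes +theta to feature j and -theta to feature k,
    so the resulting coefficient vector always sums to zero.
    """
    beta = np.zeros(p)
    for t, (j, k) in zip(theta, pairs):
        beta[j] += t
        beta[k] -= t
    return beta

# Simulated positive-valued data (assumption: purely illustrative values).
rng = np.random.default_rng(0)
n, p = 5, 4
X = rng.uniform(0.5, 2.0, size=(n, p))

Z, pairs = pairwise_log_ratios(X)
theta = rng.normal(size=len(pairs))        # arbitrary log-ratio coefficients
beta = embed_coefficients(theta, pairs, p)

# The two parameterizations give identical linear predictors.
same_predictor = np.allclose(Z @ theta, np.log(X) @ beta)
sums_to_zero = abs(beta.sum()) < 1e-12
```

In this sketch the pairwise design matrix has p(p-1)/2 columns while the equivalent representation has only p, which is the kind of reduction that makes fitting tractable when p is large. How the paper's penalized and two-step procedures operate on the reduced representation is not shown here.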
