FINDER: Feature Inference on Noisy Datasets using Eigenspace Residuals
Trajan Murphy, Akshunna S. Dogra, Hanfeng Gu, Caleb Meredith, Mark Kon, Julio Enrique Castrillion-Candas

TL;DR
FINDER introduces a novel eigenspace residuals framework for feature inference in noisy datasets, leveraging stochastic analysis and eigen-decomposition to improve classification in low signal-to-noise scenarios.
Contribution
The paper presents a new theoretical framework and algorithms for classifying noisy datasets using stochastic features and eigenspace analysis, with demonstrated success in scientific applications.
Findings
Achieved state-of-the-art results in Alzheimer's disease classification.
Improved remote sensing detection of deforestation.
Validated the effectiveness of eigenspace residuals in noisy data environments.
Abstract
"Noisy" datasets (regimes with low signal-to-noise ratios, small sample sizes, faulty data collection, etc.) remain a key research frontier for classification methods with both theoretical and practical implications. We introduce FINDER, a rigorous framework for analyzing generic classification problems, with tailored algorithms for noisy datasets. FINDER incorporates fundamental stochastic analysis ideas into the feature learning and inference stages to optimally account for the randomness inherent to all empirical datasets. We construct "stochastic features" by first viewing empirical datasets as realizations from an underlying random field (without assumptions on its exact distribution) and then mapping them to appropriate Hilbert spaces. The Kosambi-Karhunen-Loève expansion (KLE) breaks these stochastic features into computable irreducible components, which allow classification…
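As a rough illustration of the eigenspace-residual idea in the simplest finite-dimensional setting (the Hilbert space taken to be R^D), the sketch below fits a truncated eigenbasis to one class and scores new samples by the norm of what that basis fails to explain. It is not the authors' algorithm: the function names, the synthetic data, the truncation level k, and the midpoint threshold are all assumptions made for this example. In R^D a truncated KLE reduces to keeping the leading eigenvectors of the empirical covariance matrix, which is what the code does.

```python
import numpy as np


def fit_class_eigenspace(X_a, k):
    """Return the sample mean and top-k covariance eigenvectors of one class.

    Hypothetical helper: a finite-dimensional stand-in for a truncated KLE.
    """
    mean = X_a.mean(axis=0)
    centered = X_a - mean
    cov = centered.T @ centered / (X_a.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest
    return mean, eigvecs[:, order]


def residual_norm(X, mean, basis):
    """Norm of each sample's component outside the span of `basis`."""
    centered = X - mean
    projected = centered @ basis @ basis.T
    return np.linalg.norm(centered - projected, axis=1)


# Synthetic data (an assumption for illustration, not from the paper):
# class A concentrates near a k-dimensional subspace, class B is isotropic.
rng = np.random.default_rng(0)
D, k, n = 20, 3, 200
basis_true = np.linalg.qr(rng.normal(size=(D, k)))[0]
X_a = (3.0 * rng.normal(size=(n, k))) @ basis_true.T + 0.1 * rng.normal(size=(n, D))
X_b = rng.normal(size=(n, D))

# Fit on half of class A, then score held-out A samples and all B samples.
mean_a, basis_a = fit_class_eigenspace(X_a[:100], k)
res_a = residual_norm(X_a[100:], mean_a, basis_a)
res_b = residual_norm(X_b, mean_a, basis_a)

# Illustrative decision rule: a small residual means "looks like class A".
threshold = 0.5 * (res_a.mean() + res_b.mean())
accuracy = 0.5 * ((res_a < threshold).mean() + (res_b >= threshold).mean())
print(f"mean residual A: {res_a.mean():.3f}, B: {res_b.mean():.3f}, "
      f"balanced accuracy: {accuracy:.3f}")
```

The design choice worth noting is that only one class's eigenspace is fitted; the other class is detected by how poorly that eigenspace reconstructs it, which is the general pattern the reviews refer to as a residual subspace method.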
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The focus on learning in noisy and data-deficient regimes is important and relevant. 2. The paper is written with care regarding functional analytic details, correctly invoking concepts like Bochner integrals, covariance operators, and Hilbert-Schmidt isomorphisms. 3. The attempt to demonstrate FINDER on both biomedical and remote-sensing tasks shows some versatility.
1. The current presentation lacks true methodological novelty; most of the theoretical developments are restatements of known results: - The "generalized" KLE theorem (Thm. 2.1) is essentially the standard Hilbert-space KLE without the separability assumption, a minor technical relaxation known in the stochastic analysis literature (e.g., Schwab & Todor 2006). - The link to classification through eigen-decomposition of the covariance operator is well-known (PCA, kernel PCA, functional data analys
* The paper is well-organized and clearly written. The paper has clean definitions of the embedding and sufficient experimental data. * The experiments were carried out on real datasets.
Major comments: * The significance of the contribution made by defining such a pipeline is unclear to me. It seems to be the usual PCA, generalized to Hilbert space (which is also not novel in kernel clustering). To make this compelling, the authors need to articulate what FINDER achieves beyond (a) standard FPCA/FDA pipelines that learn class-specific subspaces and (b) PCA/kPCA followed by a linear classifier, e.g., is the ACA residual step provably better for imbalanced or low-SNR settings, or does it
The paper addresses the relevant problem of classification under noise, combining theoretical rigor with applications to two scientifically meaningful case studies. The framework is mathematically sound and versatile, grounded in functional data analysis. The empirical results are convincing and strengthen the authors’ claims. The authors also acknowledge several current limitations of their method and provide a transparent discussion of them.
1. The presentation is mathematically heavy and not sufficiently intuitive, which reduces accessibility for a broader ML audience that could benefit from this approach. In particular, the functional analysis formalism could be complemented by a schematic pipeline in a simpler setting (e.g. $\mathcal{H} = \mathbb{R}^D$), clarifying how the theoretical constructs translate to the experiments. 2. The novelty of the proposed framework should be more explicitly highlighted. For instance, the connect
The paper presents a mathematical framework that is elegant and general. It allows for a unified treatment of residual subspace methods that applies to both finite- and infinite-dimensional settings, and allows for quite general distribution-agnostic bounds. The algorithms presented in the paper appear simple to implement and achieve high performance on real datasets (though the lack of references made it difficult to verify state of the art performance).
**Related work**. The paper includes some background information in the introduction that is relevant to the main exposition, but has almost no discussion of recent related work. The experimental results are difficult to evaluate given the lack of references to related work and/or public leaderboards. **Clarity**. Writing is confusing at times. Basic ideas are often obscured by abstract formalism and idiosyncratic terminology. As far as I understand, the paper essentially constructs a general
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Bayesian Modeling and Causal Inference · Gaussian Processes and Bayesian Inference
