TopoPrune: Robust Data Pruning via Unified Latent Space Topology
Arjun Roy, Prajna G. Malettira, Manish Nagaraj, Kaushik Roy

TL;DR
TopoPrune is a dual-scale topological approach to data pruning that captures the intrinsic structure of the data, making pruning more robust to latent feature noise and more transferable across network architectures.
Contribution
Introduces TopoPrune, a novel topology-based framework that stabilizes data pruning by leveraging intrinsic data structure via persistent homology and manifold approximation.
Findings
Achieves high accuracy at a 90% dataset pruning rate.
Demonstrates robustness to latent feature noise perturbations.
Shows superior transferability across different neural network architectures.
Abstract
Geometric data pruning methods, while practical for leveraging pretrained models, are fundamentally unstable. Their reliance on extrinsic geometry renders them highly sensitive to latent space perturbations, causing performance to degrade during cross-architecture transfer or in the presence of feature noise. We introduce TopoPrune, a framework that resolves this challenge by leveraging topology to capture the stable, intrinsic structure of data. TopoPrune operates at two scales: (1) it uses a topology-aware manifold approximation to establish a global low-dimensional embedding of the dataset, and (2) it then employs differentiable persistent homology to perform a local topological optimization on the manifold embeddings, ranking samples by their structural complexity. We demonstrate that our unified dual-scale topological approach ensures high accuracy and precision,…
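To make the persistent homology ingredient more concrete, here is a minimal sketch of 0-dimensional persistence on a small point cloud. It is not the authors' pipeline: it is neither differentiable nor combined with a manifold embedding, and the function name `h0_persistence` and the toy points are assumptions made for illustration. It relies only on the standard fact that the death times of connected components in a Vietoris-Rips filtration equal the edge lengths of a minimum spanning tree.

```python
import numpy as np

def h0_persistence(points):
    """Death times of 0-dimensional features in a Vietoris-Rips filtration.

    Every point is born at scale 0; a component dies when an edge merges it
    into another one. These death times equal the edge lengths of a minimum
    spanning tree, found here with Kruskal's algorithm and union-find.
    (Hypothetical helper, not taken from the paper.)
    """
    n = len(points)
    # Pairwise Euclidean distances.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # All edges (i < j), sorted by length.
    i_idx, j_idx = np.triu_indices(n, k=1)
    order = np.argsort(dist[i_idx, j_idx], kind="stable")

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for e in order:
        a, b = find(int(i_idx[e])), find(int(j_idx[e]))
        if a != b:
            parent[a] = b  # merge two components: one of them dies here
            deaths.append(float(dist[i_idx[e], j_idx[e]]))
    # n - 1 finite death times; the last surviving component never dies.
    return np.array(deaths)

# Toy data (assumed): a cluster of three points and a distant pair.
toy_points = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 0.0], [11.0, 0.0]]
)
toy_deaths = h0_persistence(toy_points)
print(toy_deaths)
```

In this toy example the largest death time corresponds to the gap between the two clusters, which is the kind of scale-independent structural signal that persistence is designed to expose. How TopoPrune turns such signals into per-sample scores is not specified in the text above.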
