LiveVal: Time-aware Data Valuation via Adaptive Reference Points

Jie Xu, Zihan Wu, Cong Wang, and Xiaohua Jia

arXiv:2502.10489·cs.LG·February 18, 2025

Open Access

TL;DR

LiveVal is a novel, efficient, and adaptive data valuation method that integrates with training processes to detect harmful samples early, improving training efficiency and robustness across various models and data types.

Contribution

LiveVal introduces a real-time, reference-based valuation approach with adaptive reference points, addressing the limitations of earlier methods that rely on retraining or static assumptions.

Findings

01. Achieves a 180x speedup over traditional data valuation methods.

02. Maintains robust detection of harmful samples across modalities.

03. Provides theoretical guarantees for stability and alignment with training progress.

Abstract

Time-aware data valuation enhances training efficiency and model robustness, as early detection of harmful samples could prevent months of wasted computation. However, existing methods rely on model retraining or convergence assumptions, or fail to capture long-term training dynamics. We propose LiveVal, an efficient time-aware data valuation method with three key designs: 1) seamless integration with SGD training for efficient data contribution monitoring; 2) reference-based valuation with normalization for reliable benchmark establishment; and 3) adaptive reference point selection for real-time updating with optimized memory usage. We establish theoretical guarantees for LiveVal's stability and prove that its valuations are bounded and directionally aligned with optimization progress. Extensive experiments demonstrate that LiveVal provides efficient data valuation across…
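
The abstract gives no formulas, so the following is only a toy sketch of what a normalized, reference-based valuation computed inside an SGD loop might look like. It is not the paper's method. The score used here (the signed fraction of each update's length that moves the weights toward a reference point) is an assumption chosen because the triangle inequality bounds it in [-1, 1] and its sign tracks progress toward the reference. The names `normalized_progress` and `sgd_step`, the toy data, and the fixed reference point `ref` are all hypothetical; the adaptive reference point selection described in the abstract is not implemented.

```python
import math


def normalized_progress(w_before, w_after, ref):
    """Signed fraction of an update's length that moved the weights toward `ref`.

    Assumed scoring rule (not taken from the paper). The triangle inequality
    keeps the result in [-1, 1].
    """
    step = math.dist(w_before, w_after)
    if step == 0.0:
        return 0.0  # the sample did not move the weights
    return (math.dist(w_before, ref) - math.dist(w_after, ref)) / step


def sgd_step(w, x, y, lr):
    """One SGD step on squared error for a linear model y ~ w . x."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [wi - lr * 2.0 * err * xi for wi, xi in zip(w, x)]


# Toy data generated by y = 2*x0 - 1*x1; the last sample has a corrupted label.
data = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([1.0, 1.0], -9.0),  # corrupted: the clean label would be 1.0
]

# Hypothetical fixed reference point (here, the true weights of the toy model).
ref = [2.0, -1.0]

w = [0.0, 0.0]
values = [0.0] * len(data)  # running per-sample score, summed over epochs
for epoch in range(5):
    for i, (x, y) in enumerate(data):
        w_next = sgd_step(w, x, y, lr=0.1)
        values[i] += normalized_progress(w, w_next, ref)
        w = w_next

for i, v in enumerate(values):
    print(f"sample {i}: accumulated value {v:+.3f}")
```

In this toy run the three clean samples accumulate positive scores and the corrupted sample accumulates a negative one, because its updates consistently push the weights away from the reference. Scoring each update as it happens needs no retraining, which is the general idea the abstract attributes to integrating valuation with SGD.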

Taxonomy

Topics: Advanced Database Systems and Queries · Time Series Analysis and Forecasting · Data Management and Algorithms

Methods: Stochastic Gradient Descent