An Efficient Metric for Data Quality Measurement in Imitation Learning

Noushad Sojib; Momotaz Begum

arXiv:2605.01544 · cs.RO · May 5, 2026

TL;DR

This paper introduces a fast, fully automated metric for ranking demonstrations by quality in imitation learning, based on the power spectral density (PSD) of demonstration trajectories. Curating data with this metric improves policy performance without any environment interaction.

Contribution

The proposed PSD-based metric enables scalable, in-field data curation for imitation learning without requiring policy rollouts or expert labels.

Findings

01 PSD-curated data improves task success rates.

02 PSD ranking yields smoother trajectories.

03 The metric is effective on both benchmark and real-world datasets.

Abstract

Imitation learning (IL) has seen remarkable progress, yet field deployment of IL-powered robots remains hindered by the challenge of out-of-distribution (OOD) scenarios. Fine-tuning pre-trained policies with end-user demonstrations collected in deployment environments is a promising strategy to address this challenge. However, end-user demonstrations are frequently of poor quality, characterized by excessive corrective motions, oscillations, and abrupt adjustments that degrade both learned and fine-tuned policy performance. Existing automated approaches for curating demonstration data require policy rollouts in the environment, making them computationally expensive and impractical for real-world deployment. In this paper, we propose a fast, efficient, and fully automated demonstration ranking metric based on the power spectral density (PSD) of demonstration trajectories. The PSD metric…
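
The abstract is cut off before the metric is defined, so the following is only an illustrative sketch of how a PSD-based ranking could work, not the authors' formulation. It assumes that lower-quality demonstrations (oscillations, abrupt corrections) carry a larger share of their spectral power at high frequencies, and scores each trajectory by that share. The function names, the sampling rate `fs`, and the threshold `cutoff_hz` are all hypothetical choices made for this example.

```python
import numpy as np


def high_freq_power_ratio(traj, fs=20.0, cutoff_hz=2.0):
    """Fraction of spectral power above cutoff_hz, averaged over dimensions.

    traj: array of shape (T,) or (T, D), one row per time step.
    fs and cutoff_hz are hypothetical values, not taken from the paper.
    """
    traj = np.asarray(traj, dtype=float)
    if traj.ndim == 1:
        traj = traj[:, None]
    # Remove the per-dimension mean so the DC offset does not dominate.
    centered = traj - traj.mean(axis=0, keepdims=True)
    # Simple periodogram estimate of the PSD along the time axis.
    power = np.abs(np.fft.rfft(centered, axis=0)) ** 2
    freqs = np.fft.rfftfreq(traj.shape[0], d=1.0 / fs)
    total = power.sum(axis=0)
    high = power[freqs > cutoff_hz].sum(axis=0)
    # Guard against constant trajectories, which have zero total power.
    ratio = np.where(total > 0, high / np.maximum(total, 1e-12), 0.0)
    return float(ratio.mean())


def rank_demonstrations(demos, fs=20.0, cutoff_hz=2.0):
    """Return demo indices ordered from smoothest to least smooth, plus scores."""
    scores = [high_freq_power_ratio(d, fs, cutoff_hz) for d in demos]
    order = sorted(range(len(demos)), key=lambda i: scores[i])
    return order, scores


# Synthetic example: a smooth 0.5 Hz motion and the same motion with 5 Hz jitter.
t = np.arange(200) / 20.0
smooth = np.sin(2 * np.pi * 0.5 * t)
jittery = smooth + 0.5 * np.sin(2 * np.pi * 5.0 * t)
order, scores = rank_demonstrations([jittery, smooth])
print(order, scores)
```

A score of this kind needs only the recorded trajectories, which is consistent with the abstract's point that no policy rollouts are required. A curation step could then keep the top-ranked fraction of demonstrations before fine-tuning; how many to keep is a separate design choice not specified here.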
