Diversity You Can Actually Measure: A Fast, Model-Free Diversity Metric for Robotics Datasets
Sreevardhan Sirigiri, Nathan Samuel de Lara, Christopher Agia, Florian Shkurti, Fabio Ramos

TL;DR
This paper introduces a fast, model-free diversity metric for robotics datasets, based on signature-transform entropy, and shows that diversity-aware data curation built on this metric improves imitation learning success rates.
Contribution
The authors develop a novel, computationally efficient diversity metric for robot datasets and propose FAKTUAL, a data curation method that enhances imitation learning by maximizing dataset entropy.
Findings
Diversity-aware data curation improves success rates across multiple benchmarks.
FAKTUAL requires no access to policies or rollouts, adding minimal overhead.
The entropy metric effectively measures dataset diversity and correlates with generalization performance.
Abstract
Robotics datasets for imitation learning typically consist of long-horizon trajectories of different lengths over states, actions, and high-dimensional observations (e.g., RGB video), making it non-trivial to quantify diversity in a way that respects the underlying trajectory structure and geometry. We extend Shannon and von Neumann entropy to this setting by defining signature transform-based entropy on the Gram matrix of a signature kernel over demonstrations, yielding entropy and diversity metrics that operate directly on the demonstration dataset. Building on these metrics, we study how dataset diversity affects generalization performance in robot imitation learning and propose a simple, model-free way to curate diverse demonstrations. We introduce FAKTUAL (FAst trajectory Kernel enTropy cUration for imitation Learning), a data curation algorithm that selects a subset of…
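To make the idea of an entropy defined on a kernel Gram matrix concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a depth-2 truncated signature with a plain inner product as a stand-in for a full signature kernel, and it assumes the Gram matrix is trace-normalized before taking the von Neumann entropy of its eigenvalues. The function names `signature_depth2`, `von_neumann_entropy`, and `dataset_entropy` are hypothetical and chosen only for illustration.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path of shape (T, d).

    Illustrative stand-in for a full signature kernel feature map.
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)                    # (T-1, d) segment increments
    level1 = increments.sum(axis=0)                       # total displacement, shape (d,)
    # Displacement accumulated before each segment starts.
    before = np.cumsum(increments, axis=0) - increments   # (T-1, d)
    # Chen's identity for linear segments:
    # S2 = sum_k [ before_k (outer) delta_k + 0.5 * delta_k (outer) delta_k ]
    level2 = before.T @ increments + 0.5 * (increments.T @ increments)
    return np.concatenate([[1.0], level1, level2.ravel()])


def von_neumann_entropy(gram):
    """Entropy -sum(lam * log(lam)) of the eigenvalues of gram / trace(gram)."""
    gram = np.asarray(gram, dtype=float)
    rho = gram / np.trace(gram)          # assumption: trace-normalize to unit trace
    eigvals = np.linalg.eigvalsh(rho)    # symmetric eigensolver, real eigenvalues
    eigvals = eigvals[eigvals > 1e-12]   # drop numerically zero eigenvalues
    return float(-np.sum(eigvals * np.log(eigvals)))


def dataset_entropy(trajectories):
    """Entropy of a list of trajectories, each of shape (T_i, d); T_i may differ."""
    feats = np.stack([signature_depth2(t) for t in trajectories])
    gram = feats @ feats.T               # linear kernel on truncated signatures
    return von_neumann_entropy(gram)


# Usage: trajectories of different lengths in a 2-D state space (synthetic data).
rng = np.random.default_rng(0)
demos = [np.cumsum(rng.normal(size=(length, 2)), axis=0) for length in (20, 35, 50, 28)]
entropy = dataset_entropy(demos)
```

Under these assumptions the entropy is close to zero when all demonstrations are identical (the Gram matrix has rank one) and is bounded above by the logarithm of the number of demonstrations. Because the signature is built from path increments, trajectories of different lengths map to feature vectors of the same size, which is what lets a single Gram matrix cover the whole dataset.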
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Social Robot Interaction and HRI
