STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization

Hao Li; Qi Lv; Rui Shao; Xiang Deng; Yinchuan Li; Jianye Hao; Liqiang Nie

arXiv:2506.03863 · cs.RO · April 8, 2026

TL;DR

STAR is a framework for learning diverse robot skill abstractions. It addresses codebook collapse and models dependencies between skills, which improves performance on manipulation tasks.

Contribution

The paper proposes rotation-augmented residual skill quantization (RaRSQ) to keep the learned skill codebook diverse, and a causal skill transformer to model dependencies between skills in robotic manipulation.

Findings

01 Achieves roughly a 12% improvement over baselines on the LIBERO benchmark.

02 Effectively prevents codebook collapse with rotation-augmented residual skill quantization.

03 Explicitly models skill dependencies through a causal skill transformer.
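Codebook collapse means that only a few codes end up being selected, so most of the codebook goes unused. The sketch below shows one common way to quantify this, using the fraction of codes used and the perplexity of the code distribution. The function name `codebook_usage` and the toy assignments are hypothetical, and this summary does not say which diagnostic the paper itself reports.

```python
import math
from collections import Counter


def codebook_usage(assignments, codebook_size):
    """Return (fraction of codes used, perplexity) for a list of code indices.

    Perplexity is exp(entropy) of the empirical code distribution; it equals
    codebook_size when all codes are used equally and approaches 1 when a
    single code dominates.
    """
    counts = Counter(assignments)
    total = len(assignments)
    used_fraction = len(counts) / codebook_size
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return used_fraction, math.exp(entropy)


# Toy data (illustrative only): a 16-entry codebook.
healthy = [i % 16 for i in range(64)]   # every code selected equally often
collapsed = [0] * 90 + [1] * 10         # almost everything maps to code 0

healthy_stats = codebook_usage(healthy, 16)
collapsed_stats = codebook_usage(collapsed, 16)
print("healthy:", healthy_stats)
print("collapsed:", collapsed_stats)
```

A low used fraction together with a perplexity far below the codebook size is the usual signature of collapse.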

Abstract

Transforming complex actions into discrete skill abstractions has demonstrated strong potential for robotic manipulation. Existing approaches mainly leverage latent variable models, e.g., VQ-VAE, to learn skill abstractions through learned vectors (codebooks), but they suffer from codebook collapse and struggle to model the causal relationships between learned skills. To address these limitations, we present Skill Training with Augmented Rotation (STAR), a framework that advances both skill learning and composition to complete complex behaviors. Specifically, to prevent codebook collapse, we devise rotation-augmented residual skill quantization (RaRSQ). It encodes relative angles between encoder outputs into the gradient flow through a rotation-based gradient mechanism. Points within the same skill code are forced to be either pushed apart or pulled closer…
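The two ingredients named in the abstract, residual quantization and a rotation that relates an encoder output to its code, can be illustrated with a small forward-pass sketch. It assumes nearest-neighbour code selection at each residual level and a Householder-style rotation that maps the encoder output exactly onto the quantized vector. The truncated abstract does not give STAR's exact formulation or say how the two pieces are combined, so the helper names (`residual_quantize`, `rotate_to_code`) and the toy codebooks are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def nearest_code(vec, codebook):
    """Index of the codebook entry closest to vec in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((v - c) ** 2 for v, c in zip(vec, codebook[k])),
    )


def residual_quantize(z, codebooks):
    """Quantize z level by level; each level encodes what the previous ones missed."""
    residual = list(z)
    quantized = [0.0] * len(z)
    indices = []
    for codebook in codebooks:
        k = nearest_code(residual, codebook)
        code = codebook[k]
        indices.append(k)
        quantized = [q + c for q, c in zip(quantized, code)]
        residual = [r - c for r, c in zip(residual, code)]
    return indices, quantized


def rotate_to_code(e, q):
    """Apply (|q|/|e|) * R to e, with R = I - 2 r r^T + 2 q_hat e_hat^T.

    R rotates the unit vector e_hat onto q_hat, so the result equals q.
    Assumes e and q are non-zero and not exactly opposite in direction.
    In a training setup, R and the scale would be treated as constants so
    that gradients reaching e depend on the angle between e and q.
    """
    e_norm, q_norm = norm(e), norm(q)
    e_hat = [x / e_norm for x in e]
    q_hat = [x / q_norm for x in q]
    s = [a + b for a, b in zip(e_hat, q_hat)]
    s_norm = norm(s)
    r = [x / s_norm for x in s]
    r_dot_e = dot(r, e)
    e_hat_dot_e = dot(e_hat, e)
    rotated = [
        x - 2.0 * ri * r_dot_e + 2.0 * qi * e_hat_dot_e
        for x, ri, qi in zip(e, r, q_hat)
    ]
    scale = q_norm / e_norm
    return [scale * x for x in rotated]


# Toy example (illustrative values): two residual levels in 2-D.
codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],      # coarse level
    [[0.25, 0.0], [0.0, 0.25]],    # fine level
]
z = [0.9, 0.3]
indices, quantized = residual_quantize(z, codebooks)
rotated = rotate_to_code(z, quantized)
print(indices, quantized, rotated)
```

In this toy case the coarse level selects code 0 and the fine level selects code 1, and rotating and rescaling `z` reproduces the quantized vector up to floating-point error.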
