The Temporal Trap: Entanglement in Pre-Trained Visual Representations for Visuomotor Policy Learning

Nikolaos Tsagkas; Andreas Sochopoulos; Duolikun Danier; Chris Xiaoxuan Lu; Oisin Mac Aodha

arXiv:2502.03270 · cs.RO · November 17, 2025

Open Access

TL;DR

This paper investigates the challenge of temporal entanglement in pre-trained visual representations used for visuomotor policy learning, proposing a disentanglement baseline to improve temporal understanding and policy success.

Contribution

The paper identifies temporal entanglement as a key issue, quantifies its impact on policy success, and introduces a simple disentanglement baseline that improves how temporal cues are represented in visuomotor tasks.

Findings

01 A policy's success rate correlates strongly with how well its latent space captures task-progression cues.

02 Traditional temporal enrichment methods are insufficient.

03 Disentanglement improves temporal cue representation.

Abstract

The integration of pre-trained visual representations (PVRs) has significantly advanced visuomotor policy learning. However, effectively leveraging these models remains a challenge. We identify temporal entanglement as a critical, inherent issue when using these time-invariant models in sequential decision-making tasks. This entanglement arises because PVRs, optimised for static image understanding, struggle to represent the temporal dependencies crucial for visuomotor control. In this work, we quantify the impact of temporal entanglement, demonstrating a strong correlation between a policy's success rate and the ability of its latent space to capture task-progression cues. Based on these insights, we propose a simple, yet effective disentanglement baseline designed to mitigate temporal entanglement. Our empirical results show that traditional methods aimed at enriching features with…
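As a rough illustration of what "the ability of a latent space to capture task-progression cues" could mean in practice, the sketch below fits a linear probe that predicts normalised task progress from per-frame features and reports held-out R². This is not the paper's protocol: the function name `progress_probe_r2`, the ridge probe, the even/odd split, and the synthetic features are all assumptions made purely for illustration.

```python
import numpy as np


def progress_probe_r2(features, progress, l2=1e-3):
    """Fit a ridge-regression probe from features to task progress.

    Hypothetical helper: trains on even-indexed frames and returns R^2
    on odd-indexed frames. Higher R^2 means progress is easier to read
    out linearly from the features.
    """
    n = features.shape[0]
    train = np.arange(n) % 2 == 0
    test = ~train
    # Append a constant column so the probe can learn an offset.
    X = np.hstack([features, np.ones((n, 1))])
    A = X[train].T @ X[train] + l2 * np.eye(X.shape[1])
    w = np.linalg.solve(A, X[train].T @ progress[train])
    pred = X[test] @ w
    ss_res = np.sum((progress[test] - pred) ** 2)
    ss_tot = np.sum((progress[test] - progress[test].mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
T = 200                                  # frames in one synthetic episode
progress = np.linspace(0.0, 1.0, T)      # normalised task progress per frame

# Synthetic features that encode progress linearly, plus a little noise.
W = rng.normal(size=(1, 16))
informative = progress[:, None] @ W + 0.01 * rng.normal(size=(T, 16))

# Synthetic features that carry no temporal information at all.
uninformative = rng.normal(size=(T, 16))

r2_informative = progress_probe_r2(informative, progress)
r2_uninformative = progress_probe_r2(uninformative, progress)
print(f"informative R^2:   {r2_informative:.3f}")
print(f"uninformative R^2: {r2_uninformative:.3f}")
```

In this toy setting, features that encode progress yield a much higher probe score than features that do not. With real PVR features, a score like this could be compared against policy success rates across encoders, though the metric and evaluation actually used by the authors may differ.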

Taxonomy

Topics: Robot Manipulation and Learning