ERMV: Editing 4D Robotic Multi-view images to enhance embodied agents

Chang Nie; Guangming Wang; Zhe Lie; Hesheng Wang

arXiv:2507.17462·cs.CV·July 24, 2025

Open Access

TL;DR

ERMV is a data augmentation framework that edits an entire 4D multi-view robotic image sequence from a single edited frame and robot state conditions. It targets consistency across views and time, a wider working window at low computational cost, and semantic integrity.

Contribution

The paper introduces ERMV, a new method for editing 4D robotic multi-view data that addresses key challenges in consistency, efficiency, and semantic preservation.

Findings

01. ERMV significantly improves model robustness in simulated environments.

02. ERMV enhances generalization of embodied intelligence policies in real-world tests.

03. The proposed framework reduces computational costs while maintaining high editing quality.

Abstract

Robot imitation learning relies on 4D multi-view sequential images. However, the high cost of data collection and the scarcity of high-quality data severely constrain the generalization and application of embodied intelligence policies like Vision-Language-Action (VLA) models. Data augmentation is a powerful strategy to overcome data scarcity, but methods for editing 4D multi-view sequential images for manipulation tasks are currently lacking. Thus, we propose ERMV (Editing Robotic Multi-View 4D data), a novel data augmentation framework that efficiently edits an entire multi-view sequence based on single-frame editing and robot state conditions. This task presents three core challenges: (1) maintaining geometric and appearance consistency across dynamic views and long time horizons; (2) expanding the working window with low computational costs; and (3) ensuring the semantic integrity…

Taxonomy

Topics: Modular Robots and Swarm Intelligence · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms