Domain-Invariant Per-Frame Feature Extraction for Cross-Domain Imitation Learning with Visual Observations

Minung Kim; Kawon Lee; Jungmo Kim; Sungho Choi; Seungyul Han

arXiv:2502.02867 · cs.CV · February 17, 2025

TL;DR

This paper introduces DIFF-IL, a novel imitation learning method that extracts domain-invariant features from individual frames and uses temporal labeling to improve cross-domain visual imitation tasks.

Contribution

The paper proposes a new IL approach that isolates domain-invariant features per frame and employs frame-wise time labeling for better behavior segmentation and reward assignment.

Findings

01 DIFF-IL outperforms existing methods in diverse visual environments.

02 DIFF-IL is effective in handling high-dimensional, noisy, and incomplete visual observations.

03 DIFF-IL improves imitation learning performance across different visual domains.

Abstract

Imitation learning (IL) enables agents to mimic expert behavior without reward signals but faces challenges in cross-domain scenarios with high-dimensional, noisy, and incomplete visual observations. To address this, we propose Domain-Invariant Per-Frame Feature Extraction for Imitation Learning (DIFF-IL), a novel IL method that extracts domain-invariant features from individual frames and adapts them into sequences to isolate and replicate expert behaviors. We also introduce a frame-wise time labeling technique to segment expert behaviors by timesteps and assign rewards aligned with temporal contexts, enhancing task performance. Experiments across diverse visual environments demonstrate the effectiveness of DIFF-IL in addressing complex visual tasks.
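
The frame-wise time labeling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `frame_time_labels` and `time_weighted_rewards`, the linear normalization of timesteps to [0, 1], and the multiplicative reward are all assumptions chosen for illustration. The abstract states only that expert behavior is segmented by timestep and that rewards are aligned with temporal context.

```python
from typing import List, Sequence


def frame_time_labels(num_frames: int) -> List[float]:
    """Assign each frame of one expert trajectory a time label in [0, 1].

    Assumption: labels grow linearly with the frame index. The paper may
    define its labels differently.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if num_frames == 1:
        return [1.0]
    return [t / (num_frames - 1) for t in range(num_frames)]


def time_weighted_rewards(
    expert_scores: Sequence[float],
    predicted_times: Sequence[float],
) -> List[float]:
    """Combine a per-frame expert-likeness score with a predicted time label.

    Assumption: the reward is the product of the two, so frames that look
    expert-like AND resemble later stages of the task earn more reward.
    Both inputs are hypothetical outputs of learned per-frame models.
    """
    if len(expert_scores) != len(predicted_times):
        raise ValueError("inputs must have the same length")
    return [s * t for s, t in zip(expert_scores, predicted_times)]


# Labels for a five-frame expert trajectory.
labels = frame_time_labels(5)

# Rewards for two learner frames, given hypothetical model outputs.
rewards = time_weighted_rewards([1.0, 0.5], [0.5, 1.0])
```

Under these assumptions, a learner frame is rewarded both for resembling expert frames and for resembling frames from later in the task, which is one way a reward could be aligned with temporal context.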

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Vision and Imaging