Gaussian See, Gaussian Do: Semantic 3D Motion Transfer from Multiview Video

Yarin Bekor; Gal Michael Harari; Or Perel; Or Litany

arXiv:2511.14848·cs.CV·November 20, 2025

TL;DR

This paper introduces a method for semantic 3D motion transfer from multiview video. It transfers motion across object categories without a rig, with high fidelity and consistency, and is supported by a robust 4D reconstruction pipeline and a new benchmark.

Contribution

It presents Gaussian See, Gaussian Do, an approach that combines implicit motion transfer, anchor-based view-aware motion embeddings, and 4D reconstruction with dynamic 3D Gaussian Splatting for semantic 3D motion transfer.

Findings

01 Achieves superior motion fidelity and structural consistency compared to adapted baselines.

02 The anchor-based view-aware motion embedding ensures cross-view consistency and accelerates convergence.

03 Establishes the first benchmark for semantic 3D motion transfer.

Abstract

We present Gaussian See, Gaussian Do, a novel approach for semantic 3D motion transfer from multiview video. Our method enables rig-free, cross-category motion transfer between objects with semantically meaningful correspondence. Building on implicit motion transfer techniques, we extract motion embeddings from source videos via condition inversion, apply them to rendered frames of static target shapes, and use the resulting videos to supervise dynamic 3D Gaussian Splatting reconstruction. Our approach introduces an anchor-based view-aware motion embedding mechanism, ensuring cross-view consistency and accelerating convergence, along with a robust 4D reconstruction pipeline that consolidates noisy supervision videos. We establish the first benchmark for semantic 3D motion transfer and demonstrate superior motion fidelity and structural consistency compared to adapted baselines. Code and…
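The "condition inversion" step in the abstract can be read as optimising a motion embedding while the generative model stays frozen, so that the model reproduces the source video. The sketch below illustrates that idea only in a toy setting: a fixed random linear map stands in for the frozen video model, and the names `invert_condition`, `frozen_decoder`, `true_motion`, and `source_video` are hypothetical. It is not the authors' implementation, and the view-aware anchors and 4D reconstruction stages are not modelled.

```python
import numpy as np


def invert_condition(frozen_decoder, target, dim, steps=500, lr=0.1):
    """Recover an embedding by gradient descent; the decoder is never updated.

    frozen_decoder: (n, dim) matrix, a toy stand-in for a frozen video model.
    target: (n,) vector, a toy stand-in for the observed source video.
    """
    embedding = np.zeros(dim)
    for _ in range(steps):
        residual = frozen_decoder @ embedding - target
        # Gradient of the mean squared reconstruction error w.r.t. the embedding.
        grad = 2.0 * frozen_decoder.T @ residual / target.size
        embedding -= lr * grad
    return embedding


rng = np.random.default_rng(0)
frozen_decoder = rng.normal(size=(32, 4))   # assumed toy model, fixed weights
true_motion = rng.normal(size=4)            # the embedding we hope to recover
source_video = frozen_decoder @ true_motion

recovered = invert_condition(frozen_decoder, source_video, dim=4)
print(np.allclose(recovered, true_motion, atol=1e-4))
```

In this toy setting the recovered embedding matches the one that generated the target, because the problem is linear and well conditioned. With a real video model the objective is non-convex, and the recovered embedding is only as good as the optimisation.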

Taxonomy

Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis