360Anything: Geometry-Free Lifting of Images and Videos to 360°

Ziyi Wu; Daniel Watson; Andrea Tagliasacchi; David J. Fleet; Marcus A. Brubaker; Saurabh Saxena

arXiv:2601.16192 · cs.CV · January 23, 2026

Open Access

TL;DR

360Anything is a data-driven, geometry-free framework that uses pre-trained diffusion transformers to lift images and videos to 360-degree panoramas without needing camera metadata, achieving state-of-the-art results.

Contribution

It introduces a novel geometry-free approach using diffusion transformers for perspective-to-360 panorama conversion, removing the dependence on camera calibration data.
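For context on what dependence on camera calibration means in practice, the following is a minimal sketch of the explicit geometric alignment that a geometry-free method avoids: mapping each perspective pixel to its location in an equirectangular panorama. It is not taken from the paper. The function name `perspective_to_erp_coords`, the pinhole model, and the yaw/pitch convention are assumptions chosen for illustration.

```python
import numpy as np

def perspective_to_erp_coords(width, height, fov_deg, yaw_deg=0.0, pitch_deg=0.0):
    """Map perspective pixels to (u, v) in a unit equirectangular image.

    Hypothetical helper for illustration. Assumes a pinhole camera with a
    known horizontal field of view and a known yaw/pitch orientation.
    Returns an array of shape (height, width, 2) with u, v in [0, 1].
    """
    # Focal length in pixels, derived from the horizontal field of view.
    f = 0.5 * width / np.tan(np.radians(fov_deg) / 2.0)

    # Pixel-centre grid with the origin at the image centre (x right, y down).
    xs = np.arange(width) + 0.5 - width / 2.0
    ys = np.arange(height) + 0.5 - height / 2.0
    x, y = np.meshgrid(xs, ys)

    # Unit rays in the camera frame: x right, y up, z forward.
    rays = np.stack([x, -y, np.full_like(x, f)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)

    # Rotate by pitch (positive looks up), then by yaw (positive turns right).
    p, t = np.radians(pitch_deg), np.radians(yaw_deg)
    rot_x = np.array([[1.0, 0.0, 0.0],
                      [0.0, np.cos(p), np.sin(p)],
                      [0.0, -np.sin(p), np.cos(p)]])
    rot_y = np.array([[np.cos(t), 0.0, np.sin(t)],
                      [0.0, 1.0, 0.0],
                      [-np.sin(t), 0.0, np.cos(t)]])
    rays = rays @ (rot_y @ rot_x).T

    # Convert ray directions to longitude/latitude, then to unit ERP coords.
    lon = np.arctan2(rays[..., 0], rays[..., 2])            # [-pi, pi]
    lat = np.arcsin(np.clip(rays[..., 1], -1.0, 1.0))       # [-pi/2, pi/2]
    u = lon / (2.0 * np.pi) + 0.5
    v = 0.5 - lat / np.pi
    return np.stack([u, v], axis=-1)

# Example call: a 90-degree camera looking straight ahead.
uv = perspective_to_erp_coords(101, 101, fov_deg=90.0)
```

Every output coordinate depends on `fov_deg`, `yaw_deg`, and `pitch_deg`, so an error in any of them misplaces the input on the panorama. This is the metadata that is typically absent or noisy for in-the-wild images.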

Findings

01  State-of-the-art performance on image and video perspective-to-360 generation

02  Outperforms methods requiring ground-truth camera information

03  Demonstrates strong zero-shot camera FoV and orientation estimation

Abstract

Lifting perspective images and videos to 360° panoramas enables immersive 3D world generation. Existing approaches often rely on explicit geometric alignment between the perspective and the equirectangular projection (ERP) space. However, this requires known camera metadata, which hinders application to in-the-wild data where such calibration is typically absent or noisy. We propose 360Anything, a geometry-free framework built upon pre-trained diffusion transformers. By treating the perspective input and the panorama target simply as token sequences, 360Anything learns the perspective-to-equirectangular mapping in a purely data-driven way, eliminating the need for camera information. Our approach achieves state-of-the-art performance on both image and video perspective-to-360° generation, outperforming prior works that use ground-truth camera information. We also trace the root…
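The token-sequence idea can be illustrated with a minimal sketch, assuming raw pixel patches rather than the latent tokens a pre-trained diffusion transformer would actually consume. The names `patchify` and `build_joint_sequence`, the patch size, and the boolean target mask are hypothetical; the abstract does not specify how the conditioning is wired into the model.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into a sequence of flattened patch tokens."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    grid = image.reshape(h // patch, patch, w // patch, patch, c)
    # Bring the two patch-grid axes to the front, then flatten each patch.
    return grid.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)

def build_joint_sequence(perspective, noisy_panorama, patch=16):
    """Concatenate condition and target tokens along the sequence axis.

    No warping into panorama space is applied: the perspective image is
    tokenized as-is, so no field of view or orientation is required.
    """
    cond = patchify(perspective, patch)
    target = patchify(noisy_panorama, patch)
    tokens = np.concatenate([cond, target], axis=0)
    # Mask marking which tokens belong to the panorama being denoised.
    is_target = np.concatenate([np.zeros(len(cond), dtype=bool),
                                np.ones(len(target), dtype=bool)])
    return tokens, is_target

# Example call with random stand-in images (2:1 panorama aspect ratio).
rng = np.random.default_rng(0)
tokens, is_target = build_joint_sequence(rng.random((64, 64, 3)),
                                         rng.random((64, 128, 3)))
```

In a setup like this, a transformer attending over the joint sequence would have to infer where the perspective view sits on the panorama from data alone, which is the sense in which the mapping is learned rather than computed from camera parameters.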

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis