TL;DR
PlaneCycle is a training-free, adapter-free operator that lifts pretrained 2D foundation models to 3D volumetric tasks by cycling spatial aggregation across the orthogonal HW, DW, and DH planes through network depth, with no added parameters or architectural changes.
Contribution
The paper introduces a parameter-free operator that lifts pretrained 2D models to 3D without training, is applicable to arbitrary 2D networks, and achieves competitive 3D performance.
Findings
Without any training, the lifted models exhibit intrinsic 3D fusion capability.
Under linear probing, the lifted models outperform slice-wise 2D baselines.
After full fine-tuning, the lifted models match standard 3D architectures.
Abstract
Large-scale 2D foundation models exhibit strong transferable representations, yet extending them to 3D volumetric data typically requires retraining, adapters, or architectural redesign. We introduce PlaneCycle, a training-free, adapter-free operator for architecture-agnostic 2D-to-3D lifting of foundation models. PlaneCycle reuses the original pretrained 2D backbone by cyclically distributing spatial aggregation across orthogonal HW, DW, and DH planes throughout network depth, enabling progressive 3D fusion while preserving pretrained inductive biases. The method introduces no additional parameters and is applicable to arbitrary 2D networks. Using pretrained DINOv3 models, we evaluate PlaneCycle on six 3D classification and three 3D segmentation benchmarks. Without any training, the lifted models exhibit intrinsic 3D fusion capability and, under linear probing, outperform slice-wise 2D…
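The plane-cycling idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a single-channel volume laid out as (D, H, W), and it uses a 3x3 mean filter (`box_blur_2d`, a hypothetical stand-in) in place of a pretrained 2D block. The helper names `apply_on_plane` and `plane_cycle` are likewise invented for illustration. Only the HW, DW, DH plane order comes from the abstract.

```python
import numpy as np

# Plane order taken from the abstract: HW, DW, DH.
# Axis indices assume a (D, H, W) layout, which is an assumption of this sketch.
PLANES = [("HW", (1, 2)), ("DW", (0, 2)), ("DH", (0, 1))]


def box_blur_2d(slices):
    """Hypothetical stand-in for a pretrained 2D block.

    Applies a zero-padded 3x3 mean filter over the last two axes of an
    array shaped (N, A, B), treating the first axis as a batch of slices.
    """
    padded = np.pad(slices, ((0, 0), (1, 1), (1, 1)), mode="constant")
    out = np.zeros_like(slices, dtype=float)
    a, b = slices.shape[1], slices.shape[2]
    for da in range(3):
        for db in range(3):
            out += padded[:, da:da + a, db:db + b]
    return out / 9.0


def apply_on_plane(volume, plane_axes, block_2d):
    """Run a 2D block on every slice of `volume` that lies in `plane_axes`."""
    # The axis not in the plane becomes the batch axis.
    batch_axis = ({0, 1, 2} - set(plane_axes)).pop()
    order = (batch_axis, *plane_axes)
    slices = np.transpose(volume, order)
    out = block_2d(slices)
    # argsort of a permutation gives its inverse, restoring (D, H, W).
    return np.transpose(out, np.argsort(order))


def plane_cycle(volume, num_layers, block_2d=box_blur_2d):
    """Apply the same 2D block layer by layer, cycling HW -> DW -> DH."""
    x = volume.astype(float)
    for layer in range(num_layers):
        _, axes = PLANES[layer % len(PLANES)]
        x = apply_on_plane(x, axes, block_2d)
    return x


if __name__ == "__main__":
    vol = np.zeros((7, 7, 7))
    vol[3, 3, 3] = 1.0  # single nonzero voxel at the centre
    for n in (1, 3):
        out = plane_cycle(vol, n)
        depth_slices = np.count_nonzero(out.sum(axis=(1, 2)))
        print(f"layers={n}: nonzero depth slices = {depth_slices}")
```

On a 7x7x7 volume with one nonzero voxel at the centre, a single layer spreads information only within one HW slice, while three layers reach neighbours along all three axes. No 3D operator is used at any point, which is the sense in which 3D fusion emerges from purely 2D aggregation in this toy setting.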
