Dual-Domain Representation Alignment: Bridging 2D and 3D Vision via Geometry-Aware Architecture Search

Haoyu Zhang; Zhihao Yu; Rui Wang; Yaochu Jin; Qiqi Liu; Ran Cheng

arXiv:2603.19563 · cs.CV · March 23, 2026

TL;DR

This paper introduces EvoNAS, a multi-objective neural architecture search framework that efficiently designs 2D and 3D vision models balancing accuracy and speed, suitable for resource-limited devices.

Contribution

The paper proposes a hybrid supernet optimized with a novel knowledge distillation strategy, together with a distributed evaluation framework, enabling reliable and efficient architecture search.

Findings

1. EvoNets outperform traditional models in inference latency and throughput.
2. The proposed methods reduce evaluation costs by over 70%.
3. EvoNets maintain strong performance across multiple vision tasks.

Abstract

Modern computer vision requires balancing predictive accuracy with real-time efficiency, yet the high inference cost of large vision models (LVMs) limits deployment on resource-constrained edge devices. Although Evolutionary Neural Architecture Search (ENAS) is well suited for multi-objective optimization, its practical use is hindered by two issues: expensive candidate evaluation and ranking inconsistency among subnetworks. To address them, we propose EvoNAS, an efficient distributed framework for multi-objective evolutionary architecture search. We build a hybrid supernet that integrates Vision State Space and Vision Transformer (VSS-ViT) modules, and optimize it with a Cross-Architecture Dual-Domain Knowledge Distillation (CA-DDKD) strategy. By coupling the computational efficiency of VSS blocks with the semantic expressiveness of ViT modules, CA-DDKD improves the representational…
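The multi-objective trade-off the abstract describes (accuracy against inference cost) is usually resolved by keeping only non-dominated candidates. The following is a minimal sketch of that Pareto filtering step, not the EvoNAS implementation: the `Candidate` tuple layout, the function names, and all accuracy and latency values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate record: (name, accuracy, latency_ms).
# Higher accuracy is better; lower latency is better.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and strictly better on one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates (input order preserved)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Invented values, for illustration only.
population = [
    ("net_a", 0.80, 12.0),
    ("net_b", 0.78, 9.0),
    ("net_c", 0.76, 11.0),  # dominated by net_b: less accurate and slower
    ("net_d", 0.82, 20.0),
]

front = pareto_front(population)
print([name for name, _, _ in front])  # prints ['net_a', 'net_b', 'net_d']
```

An evolutionary search would typically apply a filter like this each generation and then mutate or recombine the surviving architectures. The quadratic pairwise comparison is acceptable for small populations; larger ones usually call for a dedicated non-dominated sorting routine.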

Code & Models

Models

🤗 kujimili/EvoNAS (model)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Vision and Imaging · Domain Adaptation and Few-Shot Learning