DC-ViT: Modulating Spatial and Channel Interactions for Multi-Channel Images
Umar Marikkar, Syed Sameed Husain, Muhammad Awais, Sara Atito

TL;DR
DC-ViT introduces a novel transformer architecture that decouples spatial and channel interactions, improving multi-channel image analysis by preserving channel-specific features and adaptively integrating cross-channel information.
Contribution
The paper proposes the Decoupled Vision Transformer (DC-ViT), built on Decoupled Self-Attention (DSA) and Decoupled Aggregation, to better handle heterogeneous multi-channel images.
Findings
Consistent performance improvements over existing Multi-Channel Vision Transformer (MC-ViT) methods.
Effective preservation of channel-specific semantics.
Enhanced task-specific channel importance learning.
Abstract
Training and evaluation in multi-channel imaging (MCI) remain challenging due to heterogeneous channel configurations arising from varying staining protocols, sensor types, and acquisition settings. This heterogeneity limits the applicability of fixed-channel encoders commonly used in general computer vision. Recent Multi-Channel Vision Transformers (MC-ViTs) address this by enabling flexible channel inputs, typically by jointly encoding patch tokens from all channels within a unified attention space. However, unrestricted token interactions across channels can lead to feature dilution, reducing the ability to preserve channel-specific semantics that are critical in MCI data. To address this, we propose Decoupled Vision Transformer (DC-ViT), which explicitly regulates information sharing through Decoupled Self-Attention (DSA). DSA decomposes token updates into two complementary…
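The contrast the abstract draws, between unrestricted joint attention over all channel tokens and more regulated information sharing, can be illustrated with attention masks. The sketch below is not the paper's DSA (the abstract is truncated before it describes the two components); it is a minimal illustration under assumptions. The function names `token_layout`, `attention_mask`, and `masked_attention`, and the three modes `"joint"`, `"spatial"`, and `"channel"`, are hypothetical and chosen only for this example.

```python
import math


def token_layout(num_channels, patches_per_channel):
    """Return (channel, patch) ids for tokens concatenated channel by channel."""
    return [(c, p) for c in range(num_channels) for p in range(patches_per_channel)]


def attention_mask(layout, mode):
    """Boolean mask[i][j]: may token i attend to token j?

    mode="joint":   every token sees every token (unified attention space)
    mode="spatial": only tokens from the same channel (illustrative assumption)
    mode="channel": only tokens at the same patch position (illustrative assumption)
    """
    def allowed(a, b):
        if mode == "joint":
            return True
        if mode == "spatial":
            return a[0] == b[0]
        if mode == "channel":
            return a[1] == b[1]
        raise ValueError(f"unknown mode: {mode}")

    return [[allowed(a, b) for b in layout] for a in layout]


def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention over lists of vectors."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        # Disallowed pairs get a score of -inf, so their softmax weight is 0.
        scores = [
            scale * sum(a * b for a, b in zip(qi, kj)) if mask[i][j] else -math.inf
            for j, kj in enumerate(k)
        ]
        top = max(scores)  # finite: every mode lets a token attend to itself
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[t] for w, vj in zip(weights, v)) / total
            for t in range(len(v[0]))
        ])
    return out
```

With identical queries and keys, the `"joint"` mask averages values over every channel, which is one simple way to picture the feature dilution the abstract describes. The `"spatial"` mask keeps each output a mixture of tokens from its own channel only.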
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices · EEG and Brain-Computer Interfaces
