A Computationally Efficient Multidimensional Vision Transformer
Alaa El Ichi, Khalide Jbilou

TL;DR
This paper introduces a tensor-based Vision Transformer framework that reduces computational costs by exploiting multilinear structures and cosine transforms, achieving significant parameter reduction while maintaining accuracy.
Contribution
The paper presents a novel tensor cosine product framework integrated into Vision Transformers, enabling efficient attention mechanisms and structured feature representations.
Findings
Achieves a uniform 1/C parameter reduction in Vision Transformers.
Maintains competitive accuracy on classification and segmentation benchmarks.
Provides theoretical analysis of tensor cosine product properties.
Abstract
Vision Transformers have achieved state-of-the-art performance in a wide range of computer vision tasks, but their practical deployment is limited by high computational and memory costs. In this paper, we introduce a novel tensor-based framework for Vision Transformers built upon the Tensor Cosine Product (Cproduct). By exploiting multilinear structures inherent in image data and the orthogonality of cosine transforms, the proposed approach enables efficient attention mechanisms and structured feature representations. We develop the theoretical foundations of the tensor cosine product, analyze its algebraic properties, and integrate it into a new Cproduct-based Vision Transformer architecture (TCP-ViT). Numerical experiments on standard classification and segmentation benchmarks demonstrate that the proposed method achieves a uniform 1/C parameter reduction (where…
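The abstract as reproduced here does not define the tensor cosine product. As a rough illustration, the sketch below assumes one common transform-domain formulation: apply an orthonormal DCT-II along the third mode of each tensor, multiply matching frontal slices as ordinary matrices, then invert the transform. The names `dct_matrix` and `cosine_product` are hypothetical, and the paper's exact definition may differ (for example, in the transform matrix it uses).

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (assumed transform for this sketch)."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    j = np.arange(n)[None, :]  # sample index (columns)
    M = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    M[0, :] /= np.sqrt(2.0)    # rescale the first row so that M @ M.T = I
    return M


def cosine_product(A, B):
    """Transform-domain product of A (n1, n2, n3) and B (n2, n4, n3).

    Hypothetical helper: a sketch of a DCT-based tensor product,
    not the paper's reference implementation.
    """
    if A.shape[1] != B.shape[0] or A.shape[2] != B.shape[2]:
        raise ValueError("incompatible tensor shapes")
    M = dct_matrix(A.shape[2])
    # Forward transform along the third mode.
    A_hat = np.einsum("kt,ijt->ijk", M, A)
    B_hat = np.einsum("kt,ijt->ijk", M, B)
    # Independent matrix product for each frontal slice in the transform domain.
    C_hat = np.einsum("ijk,jlk->ilk", A_hat, B_hat)
    # Inverse transform: M is orthogonal, so its inverse is its transpose.
    return np.einsum("kt,ijk->ijt", M, C_hat)


rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 5))
B = rng.standard_normal((4, 2, 5))
P = cosine_product(A, B)  # shape (3, 2, 5)
```

Because the transform is orthogonal, the product decouples into independent slice-wise matrix multiplications in the transform domain, which is the kind of structure the abstract refers to when it mentions the orthogonality of cosine transforms.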
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Tensor decomposition and applications · Advanced Neural Network Applications
