Vision Transformer Based User Equipment Positioning

Parshwa Shah; Dhaval K. Patel; Brijesh Soni; Miguel López-Benítez; Siddhartan Govindasamy

arXiv:2511.08549·cs.CV·November 12, 2025

Open Access

TL;DR

This paper introduces a Vision Transformer (ViT) based model for user equipment (UE) positioning from Channel State Information (CSI), reporting roughly 38% better accuracy than state-of-the-art schemes across indoor and outdoor ray-tracing scenarios.

Contribution

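The paper presents a ViT architecture that operates on the Angle Delay Profile (ADP) derived from the CSI matrix. It targets two limitations of earlier deep learning models: they give equal attention to the entire input, and they are poorly suited to non-sequential data such as instantaneous CSI.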

Findings

01

Achieves an RMSE of 0.55 m indoors and 13.59 m outdoors on DeepMIMO

02

Outperforms state-of-the-art schemes by approximately 38%

03

Substantially better distribution of error distance than the other approaches considered
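
The two metrics named in these findings, RMSE and the distribution of error distance, can be illustrated with a minimal sketch. It assumes 2D positions and takes RMSE as the root of the mean squared Euclidean error distance; the paper's exact evaluation code is not shown here, and the function names and sample coordinates below are hypothetical.

```python
import math

def error_distances(true_xy, pred_xy):
    """Euclidean distance between each true and predicted position (metres)."""
    return [math.dist(t, p) for t, p in zip(true_xy, pred_xy)]

def rmse(distances):
    """Root of the mean squared error distance."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

def empirical_cdf(distances, threshold):
    """Fraction of samples whose error distance is at most `threshold`."""
    return sum(d <= threshold for d in distances) / len(distances)

# Hypothetical positions, for illustration only (not data from the paper).
true_xy = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (5.0, 5.0)]
pred_xy = [(0.3, 0.4), (1.0, 1.0), (2.0, 1.0), (8.0, 9.0)]

d = error_distances(true_xy, pred_xy)
print(rmse(d))                # dominated by the single 5 m outlier
print(empirical_cdf(d, 1.0))  # share of samples within 1 m
```

A single large error inflates the RMSE while leaving most of the CDF unchanged, which is why the two metrics are reported separately.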

Abstract

Recently, Deep Learning (DL) techniques have been used for User Equipment (UE) positioning. However, the key shortcomings of such models are that: i) they give equal attention to the entire input; and ii) they are not well suited to non-sequential data, e.g., when only instantaneous Channel State Information (CSI) is available. In this context, we propose an attention-based Vision Transformer (ViT) architecture that focuses on the Angle Delay Profile (ADP) obtained from the CSI matrix. Our approach, validated on the 'DeepMIMO' and 'ViWi' ray-tracing datasets, achieves a Root Mean Squared Error (RMSE) of 0.55 m indoors and 13.59 m outdoors in DeepMIMO, and 3.45 m in ViWi's outdoor blockage scenario. The proposed scheme outperforms state-of-the-art schemes by approximately 38%. It also performs substantially better than the other approaches we considered in terms of the distribution of error distance.
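
The abstract does not give the ADP definition, so the following is only a sketch of one common construction, not necessarily the authors' preprocessing. It assumes a uniform linear array and a CSI matrix of shape (antennas, subcarriers): a DFT across antennas maps to the angular domain, and an inverse DFT across subcarriers maps frequency to delay. The function name `angle_delay_profile` and the synthetic single-path channel are hypothetical.

```python
import numpy as np

def angle_delay_profile(H):
    """Magnitude of the angle-delay representation of a CSI matrix.

    H: complex array of shape (n_antennas, n_subcarriers).
    Assumes a uniform linear array, so a DFT over antennas gives angle bins.
    """
    G = np.fft.fft(H, axis=0, norm="ortho")   # antennas  -> angle bins
    G = np.fft.ifft(G, axis=1, norm="ortho")  # frequency -> delay taps
    return np.abs(G)

# Synthetic single-path channel (illustration only): one angle bin, one delay tap.
n_ant, n_sub = 8, 16
angle_bin, delay_tap = 3, 5
m = np.arange(n_ant)[:, None]   # antenna index
n = np.arange(n_sub)[None, :]   # subcarrier index
H = np.exp(2j * np.pi * angle_bin * m / n_ant) * np.exp(-2j * np.pi * delay_tap * n / n_sub)

adp = angle_delay_profile(H)
peak = np.unravel_index(np.argmax(adp), adp.shape)
print(peak)  # location of the dominant path in (angle, delay)
```

The resulting non-negative 2D array can be treated like a single-channel image, which is what makes an image-style model such as a ViT a natural fit for this input.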

Taxonomy

Topics: Indoor and Outdoor Localization Technologies · Advanced Neural Network Applications · Robotics and Sensor-Based Localization