Vision Transformer Based User Equipment Positioning
By: Parshwa Shah, Dhaval K. Patel, Brijesh Soni, and more
Potential Business Impact:
Finds your phone's location more accurately.
Recently, Deep Learning (DL) techniques have been used for User Equipment (UE) positioning. However, such models have two key shortcomings: i) they assign equal attention to the entire input; ii) they are not well suited to non-sequential data, e.g., when only instantaneous Channel State Information (CSI) is available. In this context, we propose an attention-based Vision Transformer (ViT) architecture that focuses on the Angle Delay Profile (ADP) derived from the CSI matrix. Our approach, validated on the DeepMIMO and ViWi ray-tracing datasets, achieves a Root Mean Squared Error (RMSE) of 0.55 m indoors and 13.59 m outdoors in DeepMIMO, and 3.45 m in ViWi's outdoor blockage scenario. The proposed scheme outperforms state-of-the-art schemes by ~38%. It also performs substantially better than the other approaches we considered in terms of the distribution of error distance.
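For a concrete picture of the pipeline the abstract describes, the following is a minimal PyTorch sketch: it forms an ADP from a CSI matrix via a 2D DFT (antenna axis mapping to angle bins, subcarrier axis to delay bins) and feeds it to a small ViT-style regressor that outputs a 2D position. All shapes, layer sizes, and names are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn

def csi_to_adp(H: torch.Tensor) -> torch.Tensor:
    """Map a complex CSI matrix H (num_antennas x num_subcarriers) to an ADP.

    The ADP is the magnitude of a 2D DFT of H: the antenna axis maps to
    angle bins and the subcarrier axis maps to delay bins.
    """
    return torch.fft.fft2(H).abs()

class ViTPositioner(nn.Module):
    """Tiny ViT-style regressor: patchify the ADP "image", add a learnable
    position embedding, run transformer encoder layers, and regress (x, y)."""

    def __init__(self, img_size=64, patch=8, dim=128, depth=4, heads=4):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Patch embedding as a strided convolution, as in standard ViTs
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches, dim))
        enc = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, depth)
        self.head = nn.Linear(dim, 2)  # regress (x, y) coordinates

    def forward(self, adp):               # adp: (B, 1, img_size, img_size)
        x = self.patch_embed(adp)         # (B, dim, H', W')
        x = x.flatten(2).transpose(1, 2)  # (B, n_patches, dim)
        x = self.encoder(x + self.pos_embed)
        return self.head(x.mean(dim=1))   # mean-pool tokens, predict position

# Example: one random complex CSI matrix -> ADP -> predicted position.
H = torch.randn(64, 64, dtype=torch.cfloat)
adp = csi_to_adp(H).unsqueeze(0).unsqueeze(0)  # (1, 1, 64, 64)
pos = ViTPositioner()(adp)
print(pos.shape)  # torch.Size([1, 2])

In a real training setup the ADP would typically be normalized and the model trained with an MSE loss against ground-truth positions from the ray-tracing datasets; the mean-pooled token readout here is one common ViT design choice among several (a class token would be another).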
Similar Papers
Neural Positioning Without External Reference
Signal Processing
Locates phones using Wi-Fi signals, no extra gear needed.
3D Dynamic Radio Map Prediction Using Vision Transformers for Low-Altitude Wireless Networks
Machine Learning (CS)
Helps drones stay connected in the air.
Edge-Enhanced Vision Transformer Framework for Accurate AI-Generated Image Detection
CV and Pattern Recognition
Finds fake pictures made by computers.