Estimating Vehicle Speed on Roadways Using RNNs and Transformers: A Video-based Approach
Sai Krishna Reddy Mareddy, Dhanush Upplapati, Dhanush Kumar Antharam

TL;DR
This paper investigates the use of advanced neural network architectures, including LSTM, GRU, and Transformers, for vehicle speed estimation from video data, aiming to improve accuracy, scalability, and real-time applicability over traditional methods.
Contribution
It introduces a novel application of LSTM, GRU, and Transformer models for video-based vehicle speed estimation, demonstrating superior performance and robustness compared to basic RNNs.
Findings
Transformers outperform RNNs and GRUs in speed estimation accuracy.
Increasing sequence length improves model performance.
Transformers show robustness across diverse traffic conditions.
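The sequence-length finding can be made concrete with a small sketch of how fixed-length input sequences might be cut from per-frame features. This is an illustration only: the helper `make_windows`, its `seq_len` and `stride` parameters, and the choice to label each window with the speed at its last frame are assumptions, not details taken from the paper.

```python
def make_windows(features, speeds, seq_len, stride=1):
    """Cut per-frame features into fixed-length sequences.

    Each window of `seq_len` consecutive frames is paired with the speed
    label of its last frame (an assumed labelling convention).
    """
    if seq_len < 1 or stride < 1:
        raise ValueError("seq_len and stride must be positive")
    if len(features) != len(speeds):
        raise ValueError("need one speed label per frame")
    windows, targets = [], []
    for start in range(0, len(features) - seq_len + 1, stride):
        end = start + seq_len
        windows.append(features[start:end])
        targets.append(speeds[end - 1])
    return windows, targets


# Toy data: six frames with one made-up feature each, and made-up speeds.
frames = [[float(i)] for i in range(6)]
speeds = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

short_w, short_t = make_windows(frames, speeds, seq_len=2)
long_w, long_t = make_windows(frames, speeds, seq_len=4)
```

A longer `seq_len` gives each training example more temporal context but yields fewer windows from the same clip, which is the trade-off behind any sequence-length comparison.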
Abstract
This project explores the application of advanced machine learning models, specifically Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Transformers, to the task of vehicle speed estimation using video data. Traditional methods of speed estimation, such as radar and manual systems, are often constrained by high costs, limited coverage, and potential disruptions. In contrast, leveraging existing surveillance infrastructure and cutting-edge neural network architectures presents a non-intrusive, scalable solution. Our approach utilizes LSTM and GRU to effectively manage long-term dependencies within the temporal sequence of video frames, while Transformers are employed to harness their self-attention mechanisms, enabling the processing of entire sequences in parallel and focusing on the most informative segments of the data. This study demonstrates that both LSTM and GRU…
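The self-attention idea mentioned in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration of scaled dot-product self-attention over a sequence of per-frame feature vectors, not the authors' model: it assumes identity projections (queries, keys, and values are all the input itself), a single head, and made-up feature values.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with Q = K = V = seq (assumed).

    Returns the attended sequence and the attention weights; row i of the
    weights says how much frame i draws on every frame in the sequence.
    """
    d = len(seq[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in seq:
        # Similarity of this frame to every frame in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in seq]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all frame features.
        outputs.append(
            [sum(w[j] * seq[j][c] for j in range(len(seq))) for c in range(d)]
        )
    return outputs, weights


# Three frames with two hypothetical features each.
frame_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = self_attention(frame_features)
```

Each output row depends only on the full input sequence, not on the previous output row, so all positions can be computed in parallel. A recurrent model such as an LSTM or GRU instead has to step through the frames in order.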
Taxonomy
Methods: Tanh Activation · Gated Recurrent Unit · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Sigmoid Activation · Long Short-Term Memory
