ViFiT: Reconstructing Vision Trajectories from IMU and Wi-Fi Fine Time Measurements
Bryan Bo Cao, Abrar Alali, Hansi Liu, Nicholas Meegan, Marco Gruteser, Kristin Dana, Ashwin Ashok, Shubham Jain

TL;DR
ViFiT is a transformer-based model that reconstructs vision object trajectories from IMU and Wi-Fi data, significantly reducing bandwidth needs in IoT applications while maintaining tracking accuracy.
Contribution
This paper introduces ViFiT, a novel transformer-based approach for reconstructing vision trajectories from phone sensor data, addressing bandwidth constraints in IoT systems.
Findings
Achieves a Minimum Required Frames Ratio of 0.65, outperforming previous methods.
Reaches a frame reduction rate of 97.76%, greatly reducing bandwidth usage.
Demonstrates effectiveness across diverse indoor and outdoor environments.
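To give a sense of scale for the frame reduction rate reported above, the arithmetic can be sketched as follows. This is a rough illustration only: it assumes the rate is the fraction of camera frames that no longer need to be transmitted, which this summary does not define, and the function name `frames_to_send` and the 10,000-frame clip are hypothetical.

```python
def frames_to_send(total_frames: int, reduction_rate: float) -> int:
    """Approximate number of frames still transmitted.

    Assumption (not stated in the summary): reduction_rate is the
    fraction of frames that can be skipped entirely.
    """
    if not 0.0 <= reduction_rate <= 1.0:
        raise ValueError("reduction_rate must be in [0, 1]")
    # Round to the nearest whole frame to avoid floating-point drift.
    return round(total_frames * (1.0 - reduction_rate))


# Hypothetical 10,000-frame clip at the reported 97.76% reduction rate.
sent = frames_to_send(10_000, 0.9776)
print(f"frames sent: {sent} of 10000")
```

Under that assumption, roughly 224 of every 10,000 frames would still cross the network, with the rest reconstructed from phone data.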
Abstract
Tracking subjects in videos is one of the most widely used functions in camera-based IoT applications such as security surveillance, smart city traffic safety enhancement, vehicle to pedestrian communication and so on. In the computer vision domain, tracking is usually achieved by first detecting subjects with bounding boxes, then associating detected bounding boxes across video frames. For many IoT systems, images captured by cameras are usually sent over the network to be processed at a different site that has more powerful computing resources than edge devices. However, sending entire frames through the network causes significant bandwidth consumption that may exceed the system bandwidth constraints. To tackle this problem, we propose ViFiT, a transformer-based model that reconstructs vision bounding box trajectories from phone data (IMU and Fine Time Measurements). It leverages a…
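The conventional detect-then-associate tracking that the abstract describes can be sketched as below. This is a minimal illustration of the vision-only baseline idea, not ViFiT itself: it assumes detections arrive as `(x1, y1, x2, y2)` boxes and uses a simple greedy intersection-over-union (IoU) match, with the threshold of 0.3 and the names `iou` and `associate` chosen purely for the example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(prev: Dict[int, Box], detections: List[Box],
              threshold: float = 0.3) -> Dict[int, Box]:
    """Greedily link the previous frame's tracks to new detections by IoU.

    Unmatched detections start new tracks with fresh IDs.
    """
    # Score every (track, detection) pair, best overlap first.
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in prev.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    tracks: Dict[int, Box] = {}
    used = set()
    for score, tid, j in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to be the same subject
        if tid in tracks or j in used:
            continue
        tracks[tid] = detections[j]
        used.add(j)
    next_id = max(prev, default=-1) + 1
    for j, det in enumerate(detections):
        if j not in used:
            tracks[next_id] = det
            next_id += 1
    return tracks


prev_frame = {0: (0, 0, 10, 10), 1: (50, 50, 60, 60)}
new_boxes = [(51, 51, 61, 61), (1, 1, 11, 11), (100, 100, 110, 110)]
print(associate(prev_frame, new_boxes))
```

Every frame must reach the processing site for this kind of association to run, which is the bandwidth cost the abstract points to.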
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
