MEX: Memory-efficient Approach to Referring Multi-Object Tracking
Huu-Thien Tran, Phuoc-Sang Pham, Thai-Son Tran, Khoa Luu

TL;DR
This paper introduces MEX, a memory-efficient module that enhances referring multi-object tracking by improving inference efficiency and memory usage, enabling real-time performance on low-memory devices.
Contribution
We propose MEX, a novel memory-efficient module that can be integrated into existing trackers like iKUN, significantly improving their efficiency and performance during inference.
Findings
Effective during inference on a single 4 GB GPU
Significant improvements in memory usage and processing speed
Enhanced HOTA tracking scores on Refer-KITTI dataset
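For readers unfamiliar with the HOTA metric named above, the sketch below follows its standard general definition: at a single localization threshold, HOTA is the geometric mean of detection accuracy and association accuracy, and the final score averages over thresholds. It is not taken from this paper's evaluation code; the function names and the input layout (one tuple of association counts per true-positive detection) are assumptions made for illustration.

```python
import math

def hota_alpha(tp, fn, fp, assoc_counts):
    """HOTA at one localization threshold (illustrative helper, hypothetical name).

    tp, fn, fp   -- detection-level true positives, false negatives, false positives
    assoc_counts -- one (tpa, fna, fpa) tuple per true-positive detection, giving the
                    association-level counts for that detection's track pair
    """
    if tp == 0:
        return 0.0
    if len(assoc_counts) != tp:
        raise ValueError("expected one association tuple per true positive")
    # Detection accuracy: Jaccard index over detections.
    det_a = tp / (tp + fn + fp)
    # Association accuracy: mean association Jaccard over true positives.
    ass_a = sum(tpa / (tpa + fna + fpa) for tpa, fna, fpa in assoc_counts) / tp
    return math.sqrt(det_a * ass_a)

def hota(per_threshold_scores):
    """Final HOTA: mean of the per-threshold scores (commonly 0.05 to 0.95 in steps of 0.05)."""
    return sum(per_threshold_scores) / len(per_threshold_scores)
```

Because the score is a geometric mean, a tracker cannot compensate for poor association with strong detection alone, or the reverse.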
Abstract
Referring Multi-Object Tracking (RMOT) is a relatively new concept that has rapidly gained traction as a promising research direction at the intersection of computer vision and natural language processing. Unlike traditional multi-object tracking, RMOT identifies and tracks objects and incorporates textual descriptions for object class names, making the approach more intuitive. Various techniques have been proposed to address this challenging problem; however, most require the training of the entire network due to their end-to-end nature. Among these methods, iKUN has emerged as a particularly promising solution. Therefore, we further explore its pipeline and enhance its performance. In this paper, we introduce a practical module dubbed Memory-Efficient Cross-modality (MEX). This memory-efficient technique can be directly applied to off-the-shelf trackers like iKUN, resulting in…
Taxonomy
Topics: Robotics and Automated Systems · Energy Efficient Wireless Sensor Networks · Target Tracking and Data Fusion in Sensor Networks
