Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024
Peixi Wu, Bosong Chai, Xuan Nie, Longquan Yan, Zeyu Wang, Qifan Zhou, Boning Wang, Yansong Peng, Hebei Li

TL;DR
This paper introduces improved methods for vast vocabulary object detection, modifying the network structure, loss function, and training strategy to raise detection accuracy and leaderboard ranking in the V3Det Challenge 2024.
Contribution
We developed novel adjustments to network structure, loss functions, and training strategies tailored for complex, large-vocabulary object detection tasks.
Findings
Achieved superior performance over baseline models.
Achieved excellent leaderboard rankings in both the Supervised and Open Vocabulary Object Detection (OVD) tracks of the V3Det Challenge 2024.
Demonstrated effectiveness of proposed improvements.
Abstract
In this technical report, we present our findings from research conducted on the Vast Vocabulary Visual Detection (V3Det) dataset for the Supervised Vast Vocabulary Visual Detection task. Handling the complex categories and detection boxes is a key difficulty in this track, and the original supervised detector is not well suited to it. We designed a series of improvements: adjusting the network structure, changing the loss function, and designing training strategies. Our model improves over the baseline and achieved excellent rankings on the leaderboard for both the Vast Vocabulary Object Detection (Supervised) track and the Open Vocabulary Object Detection (OVD) track of the V3Det Challenge 2024.
Taxonomy
Topics: Advanced Neural Network Applications
