Optimizing Camera Configurations for Multi-View Pedestrian Detection
Yunzhong Hou, Xingjian Leng, Tom Gedeon, Liang Zheng

TL;DR
This paper introduces a transformer-based reinforcement learning approach that automatically optimizes multi-view camera setups for pedestrian detection, outperforming heuristic and human-designed configurations.
Contribution
A novel transformer-based reinforcement learning method for autonomous camera configuration optimization in multi-view pedestrian detection systems.
Findings
Generated configurations outperform heuristic and human-designed setups.
The generator learns strategies such as maximizing coverage, minimizing occlusion, and promoting collaboration.
Generated configurations improve detection accuracy across multiple simulation scenarios.
Abstract
Jointly considering multiple camera views (multi-view) is very effective for pedestrian detection under occlusion. For such multi-view systems, it is critical to have well-designed camera configurations, including camera locations, directions, and fields-of-view (FoVs). Usually, these configurations are crafted based on human experience or heuristics. In this work, we present a novel solution that features a transformer-based camera configuration generator. Using reinforcement learning, this generator autonomously explores vast combinations within the action space and searches for configurations that give the highest detection accuracy according to the training dataset. The generator learns advanced techniques like maximizing coverage, minimizing occlusion, and promoting collaboration. Across multiple simulation scenarios, the configurations generated by our transformer-based model…
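The search loop described in the abstract can be illustrated with a deliberately small toy, which is not the authors' implementation. It assumes a 10 x 10 ground plane, a hypothetical list of candidate mounting points (`CANDIDATES`), a fixed sensing radius, and ground coverage as a stand-in reward in place of detection accuracy. It places cameras by location only, ignoring direction and FoV. A per-camera softmax policy trained with REINFORCE replaces the transformer generator.

```python
import math
import random

GRID = 10  # toy scene: 10 x 10 ground-plane cells (assumption)
# Hypothetical candidate camera locations; the paper's action space also
# includes direction and FoV, which this sketch leaves out.
CANDIDATES = [(x, y) for x in (0, 5, 9) for y in (0, 5, 9)]
RADIUS = 5.0  # assumed sensing range, in cells


def coverage(cameras):
    """Fraction of ground cells within RADIUS of at least one camera.

    Stand-in reward: the paper optimizes detection accuracy instead.
    """
    covered = 0
    for gx in range(GRID):
        for gy in range(GRID):
            if any(math.hypot(gx - cx, gy - cy) <= RADIUS for cx, cy in cameras):
                covered += 1
    return covered / (GRID * GRID)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(num_cameras=3, steps=2000, lr=0.5, seed=0):
    """REINFORCE over one categorical policy per camera slot."""
    rng = random.Random(seed)
    logits = [[0.0] * len(CANDIDATES) for _ in range(num_cameras)]
    baseline = 0.0  # running mean of rewards, used to reduce variance
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        picks = [rng.choices(range(len(CANDIDATES)), weights=p)[0] for p in probs]
        reward = coverage([CANDIDATES[i] for i in picks])
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for row, p, pick in zip(logits, probs, picks):
            for j in range(len(row)):
                # Gradient of log softmax probability of the sampled action.
                grad = (1.0 if j == pick else 0.0) - p[j]
                row[j] += lr * advantage * grad
    # Return the most likely location for each camera slot.
    return [CANDIDATES[max(range(len(row)), key=row.__getitem__)] for row in logits]


if __name__ == "__main__":
    config = train()
    print(config, coverage(config))
```

In this toy the reward is cheap to evaluate exactly. In the setting the abstract describes, each candidate configuration would instead be scored by the detection accuracy it achieves on the training data.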
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Image Enhancement Techniques
