MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios
Jiacheng Ruan, Wenzhen Yuan, Zehao Lin, Ning Liao, Zhiyu Li, Feiyu Xiong, Ting Liu, Yuzhuo Fu

TL;DR
This paper introduces MM-CamObj, a multimodal dataset for camouflaged object scenarios, together with CamObj-Llava, an LVLM trained on it, to address the scarcity of camouflaged-scene samples in the training data of existing LVLMs.
Contribution
The paper creates the first comprehensive camouflaged scene dataset and develops CamObj-Llava, a specialized LVLM, along with a curriculum learning strategy and evaluation benchmark.
Findings
CamObj-Llava outperforms existing LVLMs on camouflaged scene tasks.
The dataset enables significant improvements in recognition and localization of camouflaged objects.
CamObj-Llava achieves a 25.84% improvement on key tasks compared to GPT-4o.
Abstract
Large visual-language models (LVLMs) have achieved great success in multiple applications. However, they still encounter challenges in complex scenes, especially those involving camouflaged objects. This is primarily due to the lack of samples related to camouflaged scenes in the training dataset. To mitigate this issue, we construct the MM-CamObj dataset for the first time, comprising two subsets: CamObj-Align and CamObj-Instruct. Specifically, CamObj-Align contains 11,363 image-text pairs, and it is designed for VL alignment and injecting rich knowledge of camouflaged scenes into LVLMs. CamObj-Instruct is collected for fine-tuning the LVLMs with improved instruction-following capabilities, and it includes 11,363 images and 68,849 conversations with diverse instructions. Based on the MM-CamObj dataset, we propose the CamObj-Llava, an LVLM specifically designed for addressing tasks in…
Taxonomy
Topics: Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods
