Lightweight Modular Parameter-Efficient Tuning for Open-Vocabulary Object Detection
Bilal Faye, Hanane Azzag, Mustapha Lebbah

TL;DR
UniProj-Det is a lightweight, modular framework for open-vocabulary object detection that significantly reduces training costs by freezing backbones and only training a small subset of parameters, while maintaining high performance.
Contribution
It introduces a universal projection module with a learnable modality token, enabling efficient vision-language adaptation with minimal parameter updates.
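One way to picture a projection module shared across modalities is a single weight matrix applied to frozen backbone features, with a learned token prepended to mark which modality the sequence came from. The sketch below is a toy under stated assumptions, not the paper's implementation: the class name `UniversalProjection`, the one-token-per-modality design, the equal input dimensions, and the tiny sizes are all hypothetical, and plain Python lists stand in for tensors.

```python
import random


class UniversalProjection:
    """Toy shared projection (hypothetical design, not the authors' code).

    One weight matrix is reused for every modality; a per-modality token is
    prepended to the input sequence before projection.
    """

    def __init__(self, dim_in, dim_out, modalities=("vision", "text"), seed=0):
        rng = random.Random(seed)
        # Shared projection weights, shape (dim_out, dim_in).
        self.weight = [
            [rng.gauss(0.0, 0.02) for _ in range(dim_in)] for _ in range(dim_out)
        ]
        # One token vector per modality (assumption).
        self.modality_token = {
            m: [rng.gauss(0.0, 0.02) for _ in range(dim_in)] for m in modalities
        }

    def num_params(self):
        """Count the parameters that would be trained in this toy module."""
        weight_count = len(self.weight) * len(self.weight[0])
        token_count = sum(len(t) for t in self.modality_token.values())
        return weight_count + token_count

    def __call__(self, features, modality):
        # Prepend the modality token, then project every vector in the sequence.
        sequence = [self.modality_token[modality]] + features
        return [
            [sum(w * x for w, x in zip(row, vec)) for row in self.weight]
            for vec in sequence
        ]


# Stand-in for features produced by a frozen backbone (made-up values).
frozen_features = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]

proj = UniversalProjection(dim_in=4, dim_out=3)
vision_out = proj(frozen_features, "vision")
text_out = proj(frozen_features, "text")
```

In this toy, the same weights project both calls, so the outputs differ only in the first position, where the modality token sits.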
Findings
Trains only 2-5% of parameters while achieving competitive performance.
Reduces FLOPs, memory, and latency compared to full fine-tuning.
Demonstrates effectiveness across phrase grounding, referring expression, and segmentation tasks.
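The 2-5% figure is a ratio of trainable to total parameters once the backbones are frozen. A minimal sketch of that bookkeeping follows; the component names and parameter counts are hypothetical, chosen only so the arithmetic is easy to follow, and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    num_params: int
    trainable: bool


def trainable_fraction(components):
    """Return trainable parameters divided by total parameters."""
    total = sum(c.num_params for c in components)
    trainable = sum(c.num_params for c in components if c.trainable)
    return trainable / total


# Hypothetical sizes, for illustration only.
model = [
    Component("vision_backbone", 40_000_000, trainable=False),
    Component("text_backbone", 125_000_000, trainable=False),
    Component("universal_projection", 5_000_000, trainable=True),
]

fraction = trainable_fraction(model)
print(f"trainable share: {fraction:.1%}")
```

With these made-up sizes the trainable share falls inside the 2-5% range quoted above; only the component flagged as trainable would receive gradient updates.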
Abstract
Open-vocabulary object detection (OVD) extends recognition beyond fixed taxonomies by aligning visual and textual features, as in MDETR, GLIP, or RegionCLIP. While effective, these models require updating all parameters of large vision-language backbones, leading to prohibitive training cost. Recent efficient OVD approaches, inspired by parameter-efficient fine-tuning methods such as LoRA or adapters, reduce trainable parameters but often face challenges in selecting which layers to adapt and in balancing efficiency with accuracy. We propose UniProj-Det, a lightweight modular framework for parameter-efficient OVD. UniProj-Det freezes pretrained backbones and introduces a Universal Projection module with a learnable modality token, enabling unified vision-language adaptation at minimal cost. Applied to MDETR, our framework trains only about 2-5% of parameters while achieving…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
Methods: MDETR
