TL;DR
BEM is a lightweight, training-free module that improves real-time fixed-background detection by reducing false positives through background embedding memory and inverse similarity scoring.
Contribution
We introduce BEM, a novel training-free background embedding memory module that enhances pretrained detectors in fixed-background scenarios by suppressing false positives.
Findings
BEM reduces false positives across YOLO and RT-DETR detectors.
Background-frame cosine similarity correlates with object count and detection precision.
BEM maintains real-time performance while improving detection accuracy.
Abstract
Pretrained detectors perform well on benchmarks but often suffer performance degradation in real-world deployments due to distribution gaps between training data and target environments. COCO-like benchmarks emphasize category diversity rather than instance density, causing detectors trained under per-class sparsity to struggle in dense, single- or few-class scenes such as surveillance and traffic monitoring. In fixed-camera environments, the quasi-static background provides a stable, label-free prior that can be exploited at inference to suppress spurious detections. To address this issue, we propose Background Embedding Memory (BEM), a lightweight, training-free, weight-frozen module that can be attached to pretrained detectors during inference. BEM estimates clean background embeddings, maintains a prototype memory, and re-scores detection logits with an inverse-similarity,…
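The abstract is cut off before it gives the re-scoring rule, so the following is only a minimal sketch of the general idea (a background prototype memory plus a penalty that grows with cosine similarity to the background), not the authors' formulation. The class `BackgroundMemory`, the function `rescore_logits`, the EMA update, the per-cell layout, and the parameters `momentum` and `alpha` are all hypothetical names and choices introduced for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    x = np.asarray(x, dtype=np.float64)
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


class BackgroundMemory:
    """Hypothetical prototype memory: one background embedding per spatial cell.

    Assumption: prototypes are refreshed with an exponential moving average.
    The source does not say how BEM estimates or updates its prototypes.
    """

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.prototypes = None  # shape (num_cells, dim) once initialised

    def update(self, embeddings):
        """Blend embeddings from a frame believed to be object-free."""
        emb = l2_normalize(embeddings)
        if self.prototypes is None:
            self.prototypes = emb
        else:
            mixed = self.momentum * self.prototypes + (1.0 - self.momentum) * emb
            self.prototypes = l2_normalize(mixed)

    def similarity(self, embeddings):
        """Per-cell cosine similarity between a new frame and the memory."""
        return np.sum(l2_normalize(embeddings) * self.prototypes, axis=-1)


def rescore_logits(logits, similarity, alpha=4.0):
    """Assumed inverse-similarity rule: the more a cell resembles the stored
    background, the more its detection logit is lowered. Detector weights
    are never touched; only the output logits change."""
    penalty = alpha * np.clip(similarity, 0.0, 1.0)
    return np.asarray(logits, dtype=np.float64) - penalty


# Toy usage with two cells and 2-D embeddings.
memory = BackgroundMemory()
memory.update(np.array([[1.0, 0.0], [0.0, 1.0]]))  # background frame

frame = np.array([[1.0, 0.0], [1.0, 0.0]])  # cell 0 matches background, cell 1 does not
sim = memory.similarity(frame)
new_logits = rescore_logits(np.array([2.0, 2.0]), sim)
```

In this toy case the logit of the cell that matches its background prototype is pushed down by roughly `alpha`, while the cell whose embedding is orthogonal to its prototype keeps its original logit. A real deployment would take embeddings from the detector's feature maps and would need a strategy for deciding which frames count as clean background.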
