Look Beyond Saliency: Low-Attention Guided Dual Encoding for Video Semantic Search

Faisal Aljehrai; Mohammed A. Alkhrashi; Alreem Almuhrij; Sarah Abuhimed; Noorh Aldossary; Abdullah Aldwyish; Raied Aljadaany; Huda Alamri; Muhammad Kamran J Khan

arXiv:2605.06229·cs.CV·May 8, 2026

TL;DR

This paper introduces a novel inverse attention embedding mechanism that improves video semantic search in crowded scenes by emphasizing background regions often ignored by traditional models.

Contribution

The work presents a new inverse attention embedding technique that enhances semantic retrieval in crowded videos without requiring extra training.

Findings

01

Improved recall in video semantic search in crowded environments.

02

Ablation studies confirm the effectiveness of inverse attention embeddings.

03

Significant performance gains over existing methods.

Abstract

Video semantic search in densely crowded scenes remains challenging because visual encoders tend to prioritize salient foreground regions while neglecting contextually important background areas. We propose an Inverse Attention Embedding mechanism that explicitly captures and highlights these overlooked regions. By combining inverse attention embeddings with traditional visual embeddings, our method significantly enhances semantic retrieval performance without additional training. Initial experiments and ablation studies demonstrate promising improvements in recall over existing approaches for video semantic search in crowded environments.
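The abstract does not spell out how the inverse attention embedding is computed or fused, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a frozen encoder exposes per-patch feature vectors and one non-negative attention weight per patch, that "inverse attention" means re-weighting patches by one minus their normalized attention, and that the two embeddings are fused by a weighted sum. The function names and the `alpha` parameter are hypothetical.

```python
import math

def normalize_weights(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]

def inverse_attention(weights):
    """Assumed inversion: flip normalized attention so low-attention patches dominate."""
    probs = normalize_weights(weights)
    flipped = [1.0 - p for p in probs]
    if sum(flipped) <= 0:  # single-patch edge case: nothing to invert
        return probs
    return normalize_weights(flipped)

def weighted_pool(patches, weights):
    """Weighted sum of patch vectors, one output value per feature dimension."""
    dim = len(patches[0])
    return [sum(w * p[d] for w, p in zip(weights, patches)) for d in range(dim)]

def l2_normalize(vec):
    """Return a unit-length copy of vec (unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dual_embedding(patches, attention, alpha=0.5):
    """Fuse a salient embedding with an inverse-attention embedding.

    alpha is a hypothetical mixing weight: 0 keeps only the salient view,
    1 keeps only the low-attention (background) view.
    """
    salient = l2_normalize(weighted_pool(patches, normalize_weights(attention)))
    background = l2_normalize(weighted_pool(patches, inverse_attention(attention)))
    fused = [(1 - alpha) * s + alpha * b for s, b in zip(salient, background)]
    return l2_normalize(fused)

def cosine(a, b):
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))
```

Because the sketch only re-pools features from an existing encoder, it involves no trainable parameters, which is consistent with the abstract's claim that no additional training is needed. In a retrieval setting, each frame's fused embedding would be compared against a query embedding with `cosine`; the right fusion rule and mixing weight are open choices that this sketch does not settle.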
