Efficient Vision Language Model Fine-tuning for Text-based Person Anomaly Search

Jiayi He; Shengeng Tang; Ao Liu; Lechao Cheng; Jingjing Wu; Yanyan Wei

arXiv:2502.03230 · cs.CV · February 6, 2025

TL;DR

This paper introduces a fine-tuning approach for vision-language models tailored to text-based person anomaly search. Its central component is a Similarity Coverage Analysis (SCA) strategy, which targets the difficulty of telling search results apart when their text descriptions are similar.

Contribution

The paper proposes the SCA strategy for better handling subtle text differences, enhancing model accuracy in text-based person anomaly search tasks.

Findings

01 Achieved high accuracy in the TPAS challenge

02 Enhanced recognition of subtle description differences

03 Improved model reliability on large datasets

Abstract

This paper presents the HFUT-LMC team's solution to the WWW 2025 challenge on Text-based Person Anomaly Search (TPAS). The primary objective of this challenge is to accurately identify pedestrians exhibiting either normal or abnormal behavior within a large library of pedestrian images. Unlike traditional video analysis tasks, TPAS significantly emphasizes understanding and interpreting the subtle relationships between text descriptions and visual data. The complexity of this task lies in the model's need to not only match individuals to text descriptions in massive image datasets but also accurately differentiate between search results when faced with similar descriptions. To overcome these challenges, we introduce the Similarity Coverage Analysis (SCA) strategy to address the recognition difficulty caused by similar text descriptions. This strategy effectively enhances the model's…
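The abstract above is truncated and does not say how SCA works, so the following is not the authors' method. It is only a minimal sketch of the generic retrieval step the task description implies: ranking a gallery of image embeddings by cosine similarity to a text-query embedding. The function names, image ids, and embedding values are all hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_gallery(text_vec, gallery):
    """Return image ids sorted by descending similarity to the text query."""
    scored = [(cosine(text_vec, vec), image_id) for image_id, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]


# Toy, made-up embeddings (assumption: a real system would obtain these
# from the text and image encoders of a vision-language model).
gallery = {
    "img_running": [0.9, 0.1, 0.0],
    "img_falling": [0.1, 0.9, 0.1],
    "img_walking": [0.7, 0.2, 0.1],
}
query = [0.2, 0.95, 0.05]  # hypothetical embedding of a text description

ranking = rank_gallery(query, gallery)
print(ranking)
```

In this toy setup, two gallery entries with nearly identical embeddings would receive nearly identical scores, which is the kind of ambiguity between similar descriptions that the abstract says SCA is meant to address.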

Taxonomy

Topics: Human Mobility and Location-Based Analysis · Video Surveillance and Tracking Methods · Data-Driven Disease Surveillance

MethodsLib