# Multi-Focused Video Group Activities Hashing

**Authors:** Zhongmiao Qi, Yan Jiang, Bolin Zhang, Chong Wang, Lijun Guo, Pengjiang Qian, Jiangbo Qian

arXiv: 2509.00490 · 2025-12-30

## TL;DR

This paper introduces a novel spatiotemporal video hashing framework, STVH, and its enhanced version M-STVH, for efficient retrieval of group activities at different granularities by modeling object dynamics and interactions.

## Contribution

The paper presents the first unified spatiotemporal hashing method, STVH, and an advanced multi-focused version, M-STVH, for flexible, fine-grained group activity video retrieval.

## Key findings

- Both STVH and M-STVH achieve excellent results on public datasets.
- M-STVH supports retrieval by either activity features or object visual features.
- The methods outperform existing approaches in group activity video retrieval.

## Abstract

With the explosive growth of video data in complex scenarios, fast retrieval of group activities has become an urgent problem. However, many existing methods can only retrieve at the level of an entire video, not at the granularity of individual activities. To solve this problem, we propose, for the first time, a spatiotemporal interleaved video hashing (STVH) technique. Within a unified framework, STVH simultaneously models individual object dynamics and group interactions, capturing the spatiotemporal evolution of both group visual features and positional features. Moreover, real-life video retrieval sometimes requires activity features and at other times requires the visual features of objects. We therefore further propose multi-focused spatiotemporal video hashing (M-STVH), an enhanced version that handles this harder task. M-STVH incorporates hierarchical feature integration through multi-focused representation learning, allowing the model to focus jointly on activity semantic features and object visual features. In comparative experiments on publicly available datasets, both STVH and M-STVH achieve excellent results.
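To make the retrieval setting concrete, here is a minimal sketch of hash-based retrieval with two focuses. It is not the authors' model: this summary does not describe the networks or losses that produce the codes. The sketch assumes each video is already represented by two short binary codes, one for activity semantics and one for object visual features. The names `binarize`, `multi_focus_distance`, `retrieve`, and the `activity_weight` parameter are hypothetical and used only for illustration.

```python
from typing import List, Sequence, Tuple

Code = List[int]
VideoCodes = Tuple[Code, Code]  # (activity_code, object_code); an assumed layout


def binarize(features: Sequence[float]) -> Code:
    """Sign-threshold real-valued features into a 0/1 hash code."""
    return [1 if x > 0 else 0 for x in features]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def multi_focus_distance(query: VideoCodes, item: VideoCodes,
                         activity_weight: float) -> float:
    """Blend the two Hamming distances (hypothetical weighting scheme).

    activity_weight = 1.0 ranks purely by activity code,
    activity_weight = 0.0 ranks purely by object visual code.
    """
    if not 0.0 <= activity_weight <= 1.0:
        raise ValueError("activity_weight must be in [0, 1]")
    d_activity = hamming(query[0], item[0])
    d_object = hamming(query[1], item[1])
    return activity_weight * d_activity + (1.0 - activity_weight) * d_object


def retrieve(query: VideoCodes, database: Sequence[VideoCodes],
             activity_weight: float = 0.5, top_k: int = 3) -> List[int]:
    """Return indices of the top_k nearest database items (ties keep index order)."""
    ranked = sorted(
        range(len(database)),
        key=lambda i: multi_focus_distance(query, database[i], activity_weight),
    )
    return ranked[:top_k]


# Toy features standing in for the outputs of a trained model (made-up values).
database = [
    (binarize([0.9, -0.2, 0.4, -0.7]), binarize([0.3, 0.8, -0.5, -0.1])),
    (binarize([0.8, -0.1, 0.5, -0.6]), binarize([-0.4, -0.9, 0.6, 0.2])),
    (binarize([-0.7, 0.6, -0.3, 0.5]), binarize([0.2, 0.7, -0.6, -0.3])),
]
query = (binarize([0.7, -0.3, 0.2, -0.9]), binarize([0.1, 0.9, -0.2, -0.4]))

by_activity = retrieve(query, database, activity_weight=1.0, top_k=2)
by_object = retrieve(query, database, activity_weight=0.0, top_k=2)
```

In this toy data, item 1 shares the query's activity code but not its object code, and item 2 is the reverse, so changing `activity_weight` changes which of them ranks next to item 0. Comparing compact binary codes by Hamming distance is what makes hashing attractive for large video collections; how STVH and M-STVH actually learn and combine their codes is covered in the full paper.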

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00490/full.md

## Figures

26 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00490/full.md

---
Source: https://tomesphere.com/paper/2509.00490