Feature Hallucination for Self-supervised Action Recognition

Lei Wang; Piotr Koniusz

arXiv:2506.20342·cs.CV·June 26, 2025

TL;DR

This paper introduces a multimodal self-supervised framework for action recognition that combines feature hallucination, two novel descriptors, and uncertainty modeling to improve accuracy without extra computational cost at test time.

Contribution

It proposes a new deep translational framework with domain-specific descriptors and uncertainty-aware hallucination for enhanced action recognition.

Findings

01. Achieves state-of-the-art results on the Kinetics-400, Kinetics-600, and Something-Something V2 datasets.

02. Effectively integrates multimodal features and auxiliary cues for improved recognition.

03. Demonstrates robustness to feature noise through uncertainty modeling.

Abstract

Understanding human actions in videos requires more than raw pixel analysis; it relies on high-level semantic reasoning and effective integration of multimodal features. We propose a deep translational action recognition framework that enhances recognition accuracy by jointly predicting action concepts and auxiliary features from RGB video frames. At test time, hallucination streams infer missing cues, enriching feature representations without increasing computational overhead. To focus on action-relevant regions beyond raw pixels, we introduce two novel domain-specific descriptors. Object Detection Features (ODF) aggregate outputs from multiple object detectors to capture contextual cues, while Saliency Detection Features (SDF) highlight spatial and intensity patterns crucial for action recognition. Our framework seamlessly integrates these descriptors with auxiliary modalities such as…
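
The hallucination idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' method: the paper describes a deep framework, whereas the example below swaps in a linear ridge regression as the "hallucination stream", uses synthetic random features in place of real RGB and auxiliary descriptors, and all names (`fit_hallucinator`, `hallucinate_and_fuse`, the feature dimensions) are hypothetical. It only shows the general pattern: learn to predict auxiliary features from RGB features while the auxiliary features are available, then infer them from RGB alone at test time and fuse the two.

```python
import numpy as np

def fit_hallucinator(rgb_feats, aux_feats, ridge=1e-3):
    """Fit a linear map W so that rgb_feats @ W approximates aux_feats.

    Assumption: a ridge regression stands in for the deep hallucination
    stream described in the abstract.
    """
    d = rgb_feats.shape[1]
    gram = rgb_feats.T @ rgb_feats + ridge * np.eye(d)
    return np.linalg.solve(gram, rgb_feats.T @ aux_feats)

def hallucinate_and_fuse(rgb_feats, W):
    """Predict auxiliary features from RGB features and concatenate both."""
    aux_hat = rgb_feats @ W
    return np.concatenate([rgb_feats, aux_hat], axis=1)

# Synthetic stand-ins for backbone features (hypothetical sizes).
rng = np.random.default_rng(0)
n_train, n_test, d_rgb, d_aux = 200, 5, 16, 8
true_map = rng.normal(size=(d_rgb, d_aux))

# Training: auxiliary features (e.g. a descriptor such as ODF) are available.
rgb_train = rng.normal(size=(n_train, d_rgb))
aux_train = rgb_train @ true_map + 0.01 * rng.normal(size=(n_train, d_aux))
W = fit_hallucinator(rgb_train, aux_train)

# Test: only RGB features are given; auxiliary cues are hallucinated.
rgb_test = rng.normal(size=(n_test, d_rgb))
fused = hallucinate_and_fuse(rgb_test, W)
print(fused.shape)  # (5, 24)
```

In this toy setting the auxiliary extractor is never run at test time; the fused representation is built from RGB features and their hallucinated counterpart only, which is the property the abstract attributes to the hallucination streams.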

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: Dropout · Dense Connections · Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Transformer · Focus