Curvelet-enhanced transformer architecture for blurred action fine-grained detection
Yuxiang Ren, Zhetao Guo, Wei Zhang, Yushi Shen, Ying Xing

TL;DR
This paper introduces a new network for recognizing human actions in videos, especially under challenging conditions like motion blur.
Contribution
The novel Multi Curvelet Transformer Network (MCTN) uses curvelet transforms to enhance motion clarity and improve action detection.
Findings
MCTN achieves a mean average precision (mAP) of 0.822 on benchmark datasets.
The curvelet-based attention mechanisms improve spatial-temporal feature extraction.
The network shows potential for real-time intelligent video analysis and human-computer interaction.
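For readers unfamiliar with the metric quoted in the findings, mAP is conventionally the mean of per-class average precision (AP). The page does not state the paper's evaluation protocol (for example IoU thresholds or interpolation), so the following is only a minimal sketch that assumes non-interpolated AP over ranked confidence scores. The function names and the input layout are hypothetical.

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP for one class (assumed protocol, not the paper's).

    scores: confidence per prediction; labels: 1 if the prediction is correct, else 0.
    """
    # Rank predictions from most to least confident.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(labels, dtype=float)[order]
    if hits.sum() == 0:
        return 0.0  # no positives for this class
    # Precision at each rank k = (true positives so far) / k.
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    # Average the precision values taken at the ranks of the true positives.
    return float((precision_at_k * hits).sum() / hits.sum())

def mean_average_precision(per_class):
    """per_class: iterable of (scores, labels) pairs, one pair per action class."""
    return float(np.mean([average_precision(s, l) for s, l in per_class]))
```

Under this definition, a reported mAP of 0.822 would mean that the per-class AP values average to 0.822 across the action classes.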
Abstract
This study proposes a novel Multi Curvelet Transformer Network (MCTN) for fine-grained human behavior recognition in dynamic video scenarios. A key challenge in this field lies in accurately identifying human actions under adverse conditions such as motion blur, occlusion, and varying illumination. To address this, we introduce a motion blur restoration module leveraging the curvelet transform to enhance motion image clarity, thereby improving downstream behavior detection. Furthermore, we enhance the Transformer architecture by embedding curvelet-based multi-scale attention mechanisms, which significantly improve the model’s ability to extract spatial-temporal features at different resolutions. The proposed network also adopts a multi-curvelet transform structure to deepen semantic representation. Experimental results on benchmark datasets, including an action recognition dataset and…
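The abstract describes the curvelet-based multi-scale attention only at a high level. As a rough illustration of the general idea of attending across resolutions, and not the authors' implementation, the sketch below lets full-resolution tokens attend to progressively coarser copies of the same sequence. Average pooling is swapped in for the curvelet decomposition, there are no learned projections, and every name (`multi_scale_attention`, `avg_pool_tokens`, `factors`) is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention for 2-D arrays: q is (Tq, D), k and v are (Tk, D)."""
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v

def avg_pool_tokens(x, factor):
    """Coarsen a (T, D) token sequence by averaging non-overlapping groups of `factor` tokens.

    Stand-in for a curvelet subband; assumes T >= factor.
    """
    t = (x.shape[0] // factor) * factor  # drop any trailing remainder
    return x[:t].reshape(-1, factor, x.shape[1]).mean(axis=1)

def multi_scale_attention(x, factors=(1, 2, 4)):
    """Full-resolution queries attend to keys/values at several scales; results are averaged."""
    outputs = []
    for f in factors:
        coarse = avg_pool_tokens(x, f)
        outputs.append(attention(x, coarse, coarse))
    return np.mean(outputs, axis=0)
```

The output keeps the input shape `(T, D)`, so a block like this could in principle be dropped in where a single-scale attention layer would sit. How MCTN actually combines its scales is not specified in the text shown here.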
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Technologies in Various Fields · Emotion and Mood Recognition
