Hierarchically Learned View-Invariant Representations for Cross-View Action Recognition
Yang Liu, Zhaoyang Lu, Jing Li, Tao Yang

TL;DR
This paper introduces a hierarchical method combining joint sparse representation and distribution adaptation to learn view-invariant features for cross-view action recognition, effectively handling large view differences.
Contribution
The paper proposes Joint Sparse Representation and Distribution Adaptation (JSRDA), a hierarchical approach that combines joint sparse representation with distribution adaptation for robust cross-view action recognition.
Findings
Outperforms state-of-the-art methods on four multiview datasets.
Effectively reduces distribution differences across views.
Learns view-invariant features robust to large view variations.
Abstract
Recognizing human actions from varied views is challenging due to huge appearance variations in different views. The key to this problem is to learn discriminant view-invariant representations generalizing well across views. In this paper, we address this problem by learning view-invariant representations hierarchically using a novel method, referred to as Joint Sparse Representation and Distribution Adaptation (JSRDA). To obtain robust and informative feature representations, we first incorporate a sample-affinity matrix into the marginalized stacked denoising Autoencoder (mSDA) to obtain shared features, which are then combined with the private features. In order to make the feature representations of videos across views transferable, we then learn a transferable dictionary pair simultaneously from pairs of videos taken at different views to encourage each action video across views to…
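As a rough illustration of the mSDA building block mentioned in the abstract, the sketch below implements the standard closed-form marginalized denoising autoencoder layer and stacks several of them. It is not the authors' method: the sample-affinity matrix that the paper incorporates is omitted, and the function names (`mda_layer`, `msda`), the corruption probability `p`, and the ridge term `reg` are assumptions made for this example only.

```python
import numpy as np

def mda_layer(X, p=0.5, reg=1e-5):
    """One marginalized denoising autoencoder layer (closed form).

    X has shape (d, n): d features, n samples. p is the assumed
    feature-corruption probability; reg is a small ridge term added
    for numerical stability (both are illustrative defaults).
    """
    d, n = X.shape
    Xb = np.vstack([X, np.ones((1, n))])      # append a constant bias row
    S = Xb @ Xb.T                             # scatter matrix, (d+1, d+1)
    q = np.full(d + 1, 1.0 - p)               # survival probability per feature
    q[-1] = 1.0                               # the bias row is never corrupted
    Q = S * np.outer(q, q)                    # off-diagonal: S_ij * q_i * q_j
    np.fill_diagonal(Q, q * np.diag(S))       # diagonal: S_ii * q_i
    P = S[:d, :] * q                          # P_ij = S_ij * q_j
    Q += reg * np.eye(d + 1)
    W = np.linalg.solve(Q, P.T).T             # W = P Q^{-1}, using Q symmetric
    H = np.tanh(W @ Xb)                       # nonlinear hidden representation
    return W, H

def msda(X, n_layers=3, p=0.5):
    """Stack mDA layers and concatenate the input with every layer's output."""
    feats = [X]
    H = X
    for _ in range(n_layers):
        _, H = mda_layer(H, p)
        feats.append(H)
    return np.vstack(feats)
```

With `p=0` nothing is corrupted, so the learned mapping reduces to (approximately) the identity on the input features, which is a convenient sanity check for the closed-form solve.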
Taxonomy
Methods: Denoising Autoencoder
