Unsupervised Discriminative Embedding for Sub-Action Learning in Complex Activities
Sirnam Swetha, Hilde Kuehne, Yogesh S Rawat, Mubarak Shah

TL;DR
This paper introduces an unsupervised method for learning sub-actions in complex activities by mapping visual and temporal data into a latent space where sub-actions are discriminatively learned without explicit clustering.
Contribution
It proposes a novel end-to-end discriminative latent concept learning approach that jointly embeds visual and temporal features for unsupervised sub-action discovery.
Findings
Effective on three benchmark datasets
Outperforms existing unsupervised methods
Learns robust sub-action representations
Abstract
Action recognition and detection in long untrimmed video sequences have seen increased attention from the research community. However, annotating complex activities is usually time-consuming and challenging in practice. Therefore, recent works have started to tackle the problem of unsupervised learning of sub-actions in complex activities. This paper proposes a novel approach for unsupervised sub-action learning in complex activities. The proposed method maps both visual and temporal representations to a latent space where the sub-actions are learnt discriminatively in an end-to-end fashion. To this end, we propose to learn sub-actions as latent concepts, and a novel discriminative latent concept learning (DLCL) module aids in learning them. The proposed DLCL module builds on the idea of latent concepts to learn compact representations in the latent embedding space…
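To make the idea of a joint visual-temporal embedding with latent concepts more concrete, here is a minimal NumPy sketch. It is not the paper's DLCL module: the function name `embed_frames`, the projection matrices, the `tanh` nonlinearity, and the softmax assignment are all illustrative assumptions, and the weights are random rather than trained. It only shows how per-frame visual features and a relative timestamp could be projected into one latent space and compared against a set of K concept vectors.

```python
import numpy as np

def embed_frames(visual, concepts, w_visual, w_time):
    """Hypothetical sketch: embed frames and softly assign them to latent concepts.

    visual:   (T, F) per-frame visual features
    concepts: (K, D) latent concept vectors (assumed learnable in a real model)
    w_visual: (F, D) projection of visual features (assumption)
    w_time:   (1, D) projection of the relative timestamp (assumption)
    """
    n_frames = visual.shape[0]
    # Relative position of each frame in the video, in [0, 1].
    t = np.linspace(0.0, 1.0, n_frames)[:, None]
    # Joint latent embedding of visual and temporal information.
    z = np.tanh(visual @ w_visual + t @ w_time)
    # Similarity of every frame to every latent concept.
    logits = z @ concepts.T
    # Numerically stable softmax over concepts.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return z, probs

# Toy usage with random (untrained) parameters.
rng = np.random.default_rng(0)
visual = rng.normal(size=(12, 64))               # 12 frames, 64-d features
w_visual = rng.normal(scale=0.1, size=(64, 16))
w_time = rng.normal(scale=0.1, size=(1, 16))
concepts = rng.normal(size=(5, 16))              # 5 hypothetical sub-actions

z, probs = embed_frames(visual, concepts, w_visual, w_time)
labels = probs.argmax(axis=1)                    # hard sub-action label per frame
```

In a trained model the projections and concept vectors would be optimised end to end; here the per-frame labels are meaningless and only demonstrate the shapes and data flow.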
