Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning
Michael S. Ryoo, Kiyoon Kim, Hyun Jong Yang

TL;DR
This paper introduces a multi-Siamese neural network for recognizing human activities in extremely low resolution videos, improving accuracy while preserving privacy.
Contribution
It proposes a two-stream multi-Siamese CNN that learns a shared embedding space in which low resolution (LR) videos with the same content map to the same location, regardless of the LR transformation that produced them.
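The idea of a shared embedding space can be illustrated with a pairwise embedding loss. The sketch below is a minimal illustration only: this summary does not state the loss the paper actually uses, so the standard contrastive loss is used here as a stand-in, and the names `contrastive_loss`, `multi_siamese_loss`, and `margin` are hypothetical. Embeddings are plain lists of floats, standing in for the outputs of CNN branches with shared weights.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors (lists of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_content, margin=1.0):
    """Standard contrastive loss for one pair (an assumed stand-in loss).

    Pairs with the same content are pulled together (squared distance);
    pairs with different content are pushed at least `margin` apart.
    """
    d = euclidean(emb_a, emb_b)
    if same_content:
        return d ** 2
    return max(0.0, margin - d) ** 2


def multi_siamese_loss(embeddings, content_ids, margin=1.0):
    """Average the pairwise loss over every pair of branches.

    embeddings[i] is the embedding produced by branch i, and content_ids[i]
    identifies which original scene that branch's LR input came from.
    """
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            same = content_ids[i] == content_ids[j]
            total += contrastive_loss(embeddings[i], embeddings[j], same, margin)
            pairs += 1
    return total / pairs
```

With this loss, two LR versions of the same scene that land at the same point contribute zero, while versions of different scenes contribute only when they are closer than `margin`.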
Findings
Outperforms previous state-of-the-art methods on standard datasets
Effectively captures transformation-invariant features in low resolution videos
Demonstrates robustness in privacy-preserving activity recognition
Abstract
This paper presents an approach for recognizing human activities from extreme low resolution (e.g., 16x12) videos. Extreme low resolution recognition is not only necessary for analyzing actions at a distance but also is crucial for enabling privacy-preserving recognition of human activities. We design a new two-stream multi-Siamese convolutional neural network. The idea is to explicitly capture the inherent property of low resolution (LR) videos that two images originated from the exact same scene often have totally different pixel values depending on their LR transformations. Our approach learns the shared embedding space that maps LR videos with the same content to the same location regardless of their transformations. We experimentally confirm that our approach of jointly learning such transform robust LR video representation and the classifier outperforms the previous…
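The property the abstract describes, that LR images of the same scene can have quite different pixel values depending on the LR transformation, can be seen in a toy example. The sketch below assumes the transformation is a sub-pixel shift of the sampling grid followed by average pooling; this is one plausible kind of LR transformation, not necessarily the set used in the paper, and `downsample`, `factor`, `dx`, and `dy` are hypothetical names.

```python
def downsample(image, factor, dx=0, dy=0):
    """Average-pool a 2D list `image` by `factor`, after shifting the
    sampling grid by (dx, dy) high-resolution pixels."""
    height = (len(image) - dy) // factor
    width = (len(image[0]) - dx) // factor
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            # Collect the factor x factor block of high-res pixels
            # that falls into this low-res pixel.
            block = [
                image[dy + r * factor + i][dx + c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# A 9x9 "scene" with a vertical edge: dark on the left, bright from column 3 on.
scene = [[1.0 if col >= 3 else 0.0 for col in range(9)] for _ in range(9)]

lr_a = downsample(scene, factor=4)        # sampling grid aligned to column 0
lr_b = downsample(scene, factor=4, dx=1)  # same scene, grid shifted by one pixel

print(lr_a)  # [[0.25, 1.0], [0.25, 1.0]]
print(lr_b)  # [[0.5, 1.0], [0.5, 1.0]]
```

A one-pixel shift in the high-resolution grid doubles the value of the left-hand LR pixels, even though both outputs depict the same scene. A shared embedding of the kind the abstract describes would aim to map `lr_a` and `lr_b` to the same point.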
Taxonomy
Topics: Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning
