Benchmarking Conventional Vision Models on Neuromorphic Fall Detection and Action Recognition Dataset

Karthik Sivarama Krishnan; Koushik Sivarama Krishnan

arXiv:2201.12285·cs.CV·April 12, 2022

TL;DR

This paper benchmarks fine-tuned conventional vision models on neuromorphic datasets for fall detection and action recognition, demonstrating that the MViT architecture achieves the highest accuracy and F1 score among tested models.

Contribution

It introduces a benchmarking framework for conventional vision models on neuromorphic data and identifies the MViT architecture as the most effective for this application.

Findings

01. DVS-MViT achieves 95.8% accuracy and F1 score.

02. DVS-C2D achieves 91.6% accuracy and F1 score.

03. DVS-CSN and DVS-X3D perform less effectively.

Abstract

Neuromorphic vision-based sensors have gained popularity in recent years for their ability to capture spatio-temporal events with low-power sensing. Unlike traditional cameras, these sensors record events or spikes, which helps preserve the privacy of the subject being recorded. Events are captured as per-pixel brightness changes, and the output data stream encodes the time, location, and pixel intensity change of each event. This paper proposes and benchmarks fine-tuned conventional vision models on neuromorphic human action recognition and fall detection datasets. The spatio-temporal event streams from Dynamic Vision Sensing cameras are encoded into a standard sequence of image frames. These video frames are used to benchmark conventional deep learning-based architectures. In the proposed approach, we fine-tuned state-of-the-art vision models for…
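The abstract says event streams are encoded into a sequence of image frames but, as excerpted here, does not give the encoding itself. The sketch below shows one common way to do this, assuming each event is a (t, x, y, polarity) tuple and that events are binned into equal time windows and counted per pixel and polarity. The function name `events_to_frames`, the two-channel layout, and the equal-width binning are illustrative assumptions, not the authors' documented method.

```python
import numpy as np


def events_to_frames(events, height, width, num_frames):
    """Accumulate DVS events into per-polarity count frames (hypothetical helper).

    events: array-like of shape (N, 4) with columns (t, x, y, polarity),
            where polarity > 0 means a brightness increase.
    Returns an array of shape (num_frames, 2, height, width).
    """
    events = np.asarray(events, dtype=np.float64)
    frames = np.zeros((num_frames, 2, height, width), dtype=np.float32)
    if events.size == 0:
        return frames

    t = events[:, 0]
    x = events[:, 1].astype(np.int64)
    y = events[:, 2].astype(np.int64)
    p = (events[:, 3] > 0).astype(np.int64)  # channel 0: decrease, 1: increase

    # Assumption: split the recording into equal-width time bins.
    span = max(t.max() - t.min(), 1e-9)
    idx = ((t - t.min()) / span * num_frames).astype(np.int64)
    idx = np.minimum(idx, num_frames - 1)  # put the last timestamp in the last bin

    # np.add.at handles repeated indices, so co-located events are all counted.
    np.add.at(frames, (idx, p, y, x), 1.0)
    return frames
```

Frames produced this way can be normalized and stacked into clips for a video model. Other encodings, such as time surfaces or fixed event-count windows, are equally plausible readings of the abstract.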

Taxonomy

Methods: Multiscale Vision Transformer