S3-Net: A Fast and Lightweight Video Scene Understanding Network by Single-shot Segmentation

Yuan Cheng; Yuchao Yang; Hai-Bao Chen; Ngai Wong; Hao Yu

arXiv:2011.02265 · cs.CV · November 5, 2020 · 1 citation

Open Access

TL;DR

S3-Net is a fast, lightweight video scene understanding network that performs single-shot segmentation and uses structured features for real-time applications, optimized for edge computing.

Contribution

It introduces S3-Net, a novel single-shot segmentation approach combined with tensorization and quantization for efficient, real-time video scene understanding on edge devices.

Findings

01. Achieves 8.1% higher accuracy than the 3D-CNN based approach on UCF11.

02. Reduces storage by a factor of 6.9.

03. Runs inference at 22.8 FPS on CityScapes with a GTX1080Ti GPU.

Abstract

Real-time video understanding is crucial in AI applications such as autonomous driving. This work presents a fast single-shot segmentation strategy for video scene understanding. The proposed network, called S3-Net, quickly locates and segments target sub-scenes while extracting structured time-series semantic features as inputs to an LSTM-based spatio-temporal model. Using tensorization and quantization techniques, S3-Net is designed to be lightweight for edge computing. Experiments on the CityScapes, UCF11, HMDB51 and MOMENTS datasets demonstrate that S3-Net achieves an accuracy improvement of 8.1% over the 3D-CNN based approach on UCF11, a storage reduction of 6.9x, and an inference speed of 22.8 FPS on CityScapes with a GTX1080Ti GPU.
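The abstract names quantization as one of the two compression techniques but does not give the scheme or bit width. As a rough illustration of the general idea only, the sketch below applies symmetric uniform 8-bit quantization to a handful of made-up weights. The function names, the 8-bit setting, and the sample values are assumptions for illustration and are not taken from the paper; the sketch does not reproduce the reported 6.9x figure, which the abstract attributes to tensorization and quantization together.

```python
def quantize_uniform(weights, num_bits=8):
    """Symmetric uniform quantization (illustrative, not the paper's scheme).

    Maps each float to a signed integer code in [-qmax, qmax] and returns
    the codes together with the scale needed to map them back.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def storage_ratio(float_bits=32, num_bits=8):
    """Per-weight storage saving, ignoring the small cost of storing the scale."""
    return float_bits / num_bits


# Hypothetical weights, chosen only to make the example concrete.
weights = [0.50, -1.27, 0.03, 0.99, -0.64]
codes, scale = quantize_uniform(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Under these assumptions each weight shrinks from 32 bits to 8, and the reconstruction error of any single weight is bounded by half the quantization step (`scale / 2`). Lower bit widths trade more error for more compression.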


Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Video Surveillance and Tracking Methods