RVM+: An AI-Driven Vision Sensor Framework for High-Precision, Real-Time Video Portrait Segmentation with Enhanced Temporal Consistency and Optimized Model Design

Na Tang; Yuehui Liao; Yu Chen; Guang Yang; Xiaobo Lai; Jing Chen

PMC · DOI: 10.3390/s25051278 · February 20, 2025

Open Access

TL;DR

RVM+ is an AI framework that improves real-time video portrait segmentation with better accuracy and efficiency for applications like AR and robotics.

Contribution

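RVM+ builds on the Robust Video Matting (RVM) architecture, incorporating Convolutional Gated Recurrent Units (ConvGRU) to improve temporal consistency and a knowledge distillation strategy to reduce computational cost in video portrait segmentation.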
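For readers unfamiliar with ConvGRU, a commonly used formulation of the cell update is sketched below. This is the standard form from the literature and may differ from the exact variant used in RVM+. The notation is chosen here for illustration: x_t is the input feature map at frame t, h_t the hidden state, * a 2D convolution, ⊙ an element-wise product, and σ the sigmoid function.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z * x_t + U_z * h_{t-1} + b_z\right) \\
r_t &= \sigma\left(W_r * x_t + U_r * h_{t-1} + b_r\right) \\
\tilde{h}_t &= \tanh\left(W_h * x_t + U_h * (r_t \odot h_{t-1}) + b_h\right) \\
h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t
\end{aligned}
```

Because the gates are computed by convolutions rather than dense layers, the hidden state keeps its spatial layout, which is what lets information from earlier frames be carried forward pixel by pixel.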

Findings

01. RVM+ outperforms state-of-the-art methods in segmentation accuracy and temporal consistency.

02. Knowledge distillation reduces computational demands with minimal accuracy loss.

03. Key metrics like MIoU, SAD, and dtSSD confirm the model's robustness and efficiency.
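As a rough illustration of the three metrics named in the findings, the sketch below computes them with NumPy on alpha mattes in [0, 1]. The definitions follow conventions commonly used in matting and segmentation work; the paper's exact formulas, thresholds, and scaling factors are not given on this page, so the function names, the 0.5 threshold, and the absence of any scaling are assumptions.

```python
import numpy as np

def sad(pred, gt):
    """Sum of absolute differences between two alpha mattes (no scaling applied)."""
    return float(np.abs(pred - gt).sum())

def miou(pred, gt, threshold=0.5):
    """Mean IoU over foreground and background after thresholding (assumed 0.5)."""
    pred_fg = pred >= threshold
    gt_fg = gt >= threshold
    ious = []
    for p, g in ((pred_fg, gt_fg), (~pred_fg, ~gt_fg)):
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent from both masks, skip it
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))

def dtssd(pred_seq, gt_seq):
    """Temporal-consistency error on (T, H, W) sequences.

    Takes the frame-to-frame change of prediction and ground truth,
    then the root mean squared difference per frame pair, averaged over pairs.
    """
    d_pred = np.diff(pred_seq, axis=0)
    d_gt = np.diff(gt_seq, axis=0)
    per_pair = np.sqrt(((d_pred - d_gt) ** 2).mean(axis=(1, 2)))
    return float(per_pair.mean())
```

Lower SAD and dtSSD are better, while higher MIoU is better. dtSSD is the only one of the three that looks across frames, so it is the one that reflects flicker.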

Abstract

Video portrait segmentation is essential for intelligent sensing systems, including human-computer interaction, autonomous navigation, and augmented reality. However, dynamic video environments introduce significant challenges, such as temporal variations, occlusions, and computational constraints. This study introduces RVM+, an enhanced video segmentation framework based on the Robust Video Matting (RVM) architecture. By incorporating Convolutional Gated Recurrent Units (ConvGRU), RVM+ improves temporal consistency and captures intricate temporal dynamics across video frames. Additionally, a novel knowledge distillation strategy reduces computational demands while maintaining high segmentation accuracy, making the framework ideal for real-time applications in resource-constrained environments. Comprehensive evaluations on challenging datasets show that RVM+ outperforms state-of-the-art…
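The abstract does not spell out the distillation strategy, so the following is only a generic sketch of how a smaller student model can be trained against both ground truth and a larger teacher's output. The function name `distillation_loss`, the L1 and L2 terms, and the `weight` parameter are all hypothetical choices for illustration, not the authors' method.

```python
import numpy as np

def distillation_loss(student_alpha, teacher_alpha, gt_alpha, weight=0.5):
    """Blend a supervised term with a teacher-mimicking term (illustrative only).

    student_alpha, teacher_alpha, gt_alpha: arrays of the same shape in [0, 1].
    weight: assumed mixing factor; 0 ignores the teacher, 1 ignores ground truth.
    """
    # Supervised term: mean absolute error against the ground-truth matte.
    supervised = np.abs(student_alpha - gt_alpha).mean()
    # Distillation term: mean squared error against the teacher's prediction.
    mimic = ((student_alpha - teacher_alpha) ** 2).mean()
    return float((1.0 - weight) * supervised + weight * mimic)
```

In a setup like this, the teacher is run only during training, so the cost at inference time is that of the student alone.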

Figures: 13

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Image Enhancement Techniques