Reducing latency and bandwidth for video streaming using keypoint extraction and digital puppetry

Roshan Prabhakar; Shubham Chandak; Carina Chiu; Renee Liang; Huong Nguyen; Kedar Tatwawadi; Tsachy Weissman

arXiv:2011.03800 · eess.IV · January 11, 2021 · DCC

TL;DR

This paper introduces a keypoint-based encoding method for video streaming that significantly reduces bandwidth and latency by transmitting body and face keypoints for real-time digital puppetry, especially useful in poor network conditions.

Contribution

The paper presents a novel keypoint-centric encoder and decoder for video streaming that achieves lower bandwidth usage and latency compared to traditional codecs, enabling real-time digital puppetry.

Findings

01 Bandwidth requirement is below 35 kbps, an order of magnitude lower than typical video calling systems.

02 Computational latency for mesh extraction and animation is under 120 ms on a standard laptop.

03 The prototype demonstrates real-time video communication that preserves the semantic content of the input feed.
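
To put the 35 kbps figure in per-frame terms, the short sketch below converts a bitrate into an average payload budget per frame. The frame rates in the loop are illustrative assumptions, not values taken from the paper, and `bytes_per_frame` is a hypothetical helper name.

```python
def bytes_per_frame(bitrate_kbps: float, fps: float) -> float:
    """Average payload budget per frame, in bytes, for a given bitrate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    bits_per_second = bitrate_kbps * 1000  # 1 kbps = 1000 bits/s
    return bits_per_second / fps / 8       # 8 bits per byte


# Frame rates here are assumed for illustration only.
for fps in (10, 15, 30):
    print(f"{fps} fps -> {bytes_per_frame(35, fps):.1f} bytes/frame")
```

At these assumed frame rates the budget is a few hundred bytes per frame at most, which is why sending a compact list of keypoints instead of pixel data is attractive.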

Abstract

COVID-19 has made video communication one of the most important modes of information exchange. While extensive research has been conducted on the optimization of the video streaming pipeline, in particular the development of novel video codecs, further improvement in the video quality and latency is required, especially under poor network conditions. This paper proposes an alternative to the conventional codec through the implementation of a keypoint-centric encoder relying on the transmission of keypoint information from within a video feed. The decoder uses the streamed keypoints to generate a reconstruction preserving the semantic features in the input feed. Focusing on video calling applications, we detect and transmit the body pose and face mesh information through the network, which are displayed at the receiver in the form of animated puppets. Using efficient pose and face mesh…
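
The idea of streaming keypoints rather than pixels can be illustrated with a minimal sketch of a per-frame wire format. This is not the authors' encoder: the 16-bit quantization, the byte layout, and the names `quantize`, `encode_frame`, and `decode_frame` are all assumptions made for illustration, and keypoints are assumed to arrive as (x, y) pairs normalized to [0, 1].

```python
import struct
from typing import List, Tuple

Keypoint = Tuple[float, float]  # (x, y), each assumed normalized to [0, 1]


def quantize(v: float, bits: int = 16) -> int:
    """Clamp v to [0, 1] and map it to an unsigned integer of `bits` bits."""
    v = min(max(v, 0.0), 1.0)
    return round(v * ((1 << bits) - 1))


def encode_frame(keypoints: List[Keypoint]) -> bytes:
    """Pack one frame: a 2-byte keypoint count, then 4 bytes per keypoint."""
    header = struct.pack("<H", len(keypoints))
    body = b"".join(
        struct.pack("<HH", quantize(x), quantize(y)) for x, y in keypoints
    )
    return header + body


def decode_frame(payload: bytes) -> List[Keypoint]:
    """Inverse of encode_frame, up to quantization error."""
    (count,) = struct.unpack_from("<H", payload, 0)
    scale = float((1 << 16) - 1)
    points = []
    for i in range(count):
        qx, qy = struct.unpack_from("<HH", payload, 2 + 4 * i)
        points.append((qx / scale, qy / scale))
    return points
```

Under this hypothetical layout, a frame with 17 keypoints occupies 2 + 4 × 17 = 70 bytes. A receiver would pass the decoded coordinates to whatever renderer animates the puppet. A real system would likely add further compression, such as sending differences between consecutive frames.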

Code & Models

Repositories

shubhamchandak94/digital-puppetry · tf · Official
