A new way of video compression via forward-referencing using deep learning
S.M.A.K. Rajin, M. Murshed, M. Paul, S.W. Teng, J. Ma

TL;DR
This paper introduces a novel video compression method that leverages deep learning to model human pose trajectories, using generated future frames as forward references to improve compression efficiency, especially in high-motion videos.
Contribution
The paper proposes a video coding approach that models human pose to generate future frames and uses them as forward references, overcoming the limitations of traditional backward referencing in high-motion scenarios.
Findings
Achieves up to 2.83 dB PSNR gain
Realizes 25.93% bitrate savings
Effective in high-motion video sequences
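The two headline metrics can be illustrated with their textbook definitions. The sketch below is not taken from the paper: the function names `psnr` and `bitrate_saving_percent` are hypothetical, 8-bit samples (peak value 255) are assumed, and the summary does not state how the reported figures were measured or against which baseline codec.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def bitrate_saving_percent(baseline_bits, proposed_bits):
    """Relative bitrate reduction of the proposed codec versus a baseline, in percent."""
    if baseline_bits <= 0:
        raise ValueError("baseline_bits must be positive")
    return 100.0 * (baseline_bits - proposed_bits) / baseline_bits


# Hypothetical numbers, for illustration only.
frame = [10, 20, 30, 40]
decoded = [12, 18, 30, 41]
print(round(psnr(frame, decoded), 2))
print(bitrate_saving_percent(1000, 750))
```

A PSNR gain is the difference between two such values at comparable bitrates, so a higher gain means the reconstruction is closer to the original for the same number of bits.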
Abstract
To exploit high temporal correlations in video frames of the same scene, the current frame is predicted from the already-encoded reference frames using block-based motion estimation and compensation techniques. While this approach can efficiently exploit the translational motion of moving objects, it copes poorly with other types of affine motion and with object occlusion/deocclusion. Recently, deep learning has been used to model the high-level structure of human pose in specific actions from short videos and then generate virtual future frames by predicting the pose using a generative adversarial network (GAN). Modelling the high-level structure of human pose can therefore exploit semantic correlation by predicting human actions and determining their trajectories. Video surveillance applications will benefit as large volumes of stored surveillance data can be compressed by estimating…
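The conventional block-based motion estimation that the abstract contrasts against can be sketched as a full-search block matcher. This is a minimal illustration of the backward-referencing baseline, not the paper's method: the function names, the sum-of-absolute-differences (SAD) cost, the 4x4 block size, and the search range of 2 pixels are all assumptions chosen for the example.

```python
def sad(cur, ref, cy, cx, ry, rx, block):
    """Sum of absolute differences between a block in cur and a block in ref."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(block)
        for j in range(block)
    )


def best_motion_vector(cur, ref, cy, cx, block=4, search=2):
    """Full-search block matching: return ((dy, dx), cost) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidate blocks that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + block > height or rx + block > width:
                continue
            cost = sad(cur, ref, cy, cx, ry, rx, block)
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best


# Toy 8x8 frames: the current frame is the reference shifted right by one pixel,
# i.e. a pure translation, which is the case block matching handles well.
ref = [[(7 * y + 13 * x * x) % 251 for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

mv, cost = best_motion_vector(cur, ref, cy=2, cx=2)
print(mv, cost)  # (0, -1) 0: the block is found one pixel to the left in ref
```

A pure translation yields a zero-cost match here. Rotation, scaling, or newly revealed content has no exact counterpart in an earlier frame, which is the weakness the abstract attributes to this approach.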
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Image Processing Techniques · Advanced Vision and Imaging
