MVFlow: Deep Optical Flow Estimation of Compressed Videos with Motion Vector Prior
Shili Zhou, Xuhao Jiang, Weimin Tan, Ruian He, Bo Yan

TL;DR
MVFlow leverages motion vectors from compressed videos to enhance optical flow estimation, achieving faster processing and improved accuracy by utilizing pre-computed compression information.
Contribution
Introduces MVFlow, an optical flow model for compressed videos that converts the motion vectors stored in the bitstream into the optical flow domain, so that the flow estimator can use them as a prior.
Findings
Reduces AEPE (average end-point error) by 1.09 compared to existing models.
Saves 52% processing time while maintaining accuracy.
Constructs four new datasets for optical flow estimation on compressed videos.
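AEPE, the metric quoted above, is conventionally the mean Euclidean distance between predicted and ground-truth flow vectors over all pixels. The sketch below shows that computation on a toy 2x2 field. The function name `aepe` and the nested-list layout are assumptions made for illustration, not taken from the paper's code.

```python
import math

def aepe(pred, gt):
    """Average end-point error between two flow fields.

    pred, gt: H x W nested lists of (u, v) displacements in pixels.
    (Hypothetical layout chosen for this sketch.)
    """
    total, count = 0.0, 0
    for row_p, row_g in zip(pred, gt):
        for (up, vp), (ug, vg) in zip(row_p, row_g):
            # End-point error of one pixel: length of the difference vector.
            total += math.hypot(up - ug, vp - vg)
            count += 1
    return total / count

# Toy example: per-pixel errors are 1, 0, 5 and 2, so the mean is 2.0.
pred = [[(1.0, 0.0), (0.0, 0.0)],
        [(3.0, 4.0), (0.0, 2.0)]]
gt = [[(0.0, 0.0), (0.0, 0.0)],
      [(0.0, 0.0), (0.0, 0.0)]]
print(aepe(pred, gt))
```

Because AEPE is measured in pixels, a reduction of 1.09 means the predicted vectors land, on average, about one pixel closer to the ground truth.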
Abstract
In recent years, many deep learning-based methods have been proposed to tackle the problem of optical flow estimation and have achieved promising results. However, they rarely consider that most videos are compressed and thus ignore the pre-computed information in compressed video streams. Motion vectors, one type of compression information, record the motion of the video frames. They can be directly extracted from the compression code stream without computational cost and serve as a solid prior for optical flow estimation. Therefore, we propose an optical flow model, MVFlow, which uses motion vectors to improve the speed and accuracy of optical flow estimation for compressed videos. In detail, MVFlow includes a key Motion-Vector Converting Module, which ensures that the motion vectors can be transformed into the same domain as optical flow and then be utilized fully by the flow estimation…
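To make the domain gap concrete: codecs typically store one motion vector per block, often in sub-pixel units, whereas optical flow is a per-pixel field in pixel units. The sketch below is a naive nearest-neighbour expansion of block vectors into such a field. It is not the paper's Motion-Vector Converting Module, and the function name, the fixed `block_size`, and the `units_per_pixel` parameter are all assumptions made for illustration. Real streams also have variable block sizes, intra-coded blocks with no vector, and codec-specific sign and reference-frame conventions, all of which this ignores.

```python
def mv_to_dense(block_mvs, block_size, units_per_pixel=1):
    """Expand one motion vector per block into a per-pixel field.

    block_mvs: rows of (mx, my) vectors, one per block (hypothetical layout).
    block_size: side length of each square block in pixels (assumed fixed).
    units_per_pixel: sub-pixel resolution of the stored vectors,
        e.g. 4 for quarter-pixel units (an assumption about the input).
    """
    dense = []
    for row in block_mvs:
        dense_row = []
        for mx, my in row:
            # Convert stored units to pixels, then repeat across the block width.
            vec = (mx / units_per_pixel, my / units_per_pixel)
            dense_row.extend([vec] * block_size)
        # Repeat the expanded row for every pixel row the block covers.
        for _ in range(block_size):
            dense.append(list(dense_row))
    return dense

# Two 2x2 blocks side by side, with vectors stored in quarter-pixel units.
block_mvs = [[(8, 0), (0, -4)]]
dense = mv_to_dense(block_mvs, block_size=2, units_per_pixel=4)
for row in dense:
    print(row)
```

A field built this way is piecewise constant and blocky, which is one reason a learned conversion step is plausible before the vectors are handed to a flow network.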
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Enhancement Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
