InternVQA: Advancing Compressed Video Quality Assessment with Distilling Large Foundation Model
Fengbin Guan, Zihao Yu, Yiting Lu, Xin Li, Zhibo Chen

TL;DR
This paper introduces InternVQA, a lightweight video quality assessment model distilled from a large foundation model, demonstrating strong performance in assessing compressed video quality.
Contribution
It presents a novel distillation approach to transfer knowledge from InternVideo2 to a smaller model for video quality assessment under compression.
Findings
The distilled model outperforms other methods in compressed video quality assessment.
Different backbone choices impact the distillation effectiveness.
The approach effectively leverages large foundation models for specialized tasks.
Abstract
Video quality assessment tasks rely heavily on the rich features required for video understanding, such as semantic information, texture, and temporal motion. The existing video foundation model, InternVideo2, has demonstrated strong potential in video understanding tasks due to its large parameter size and large-scale multimodal data pretraining. Building on this, we explored the transferability of InternVideo2 to video quality assessment under compression scenarios. To design a lightweight model suitable for this task, we proposed a distillation method to equip the smaller model with rich compression quality priors. Additionally, we examined the performance of different backbones during the distillation process. The results showed that, compared to other methods, our lightweight model distilled from InternVideo2 achieved excellent performance in compressed video quality assessment.
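The abstract does not spell out the distillation objective, so the following is only a minimal sketch of a generic feature-distillation loss, not the authors' method. It assumes the student is trained to match the teacher's features while also regressing a quality score. The function names, the `alpha` weighting, and the use of mean squared error for both terms are all hypothetical choices made for illustration.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length, non-empty vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    student_feat: Sequence[float],
    teacher_feat: Sequence[float],
    student_score: float,
    target_score: float,
    alpha: float = 0.5,
) -> float:
    """Hypothetical combined objective (an assumption, not from the paper).

    alpha weights the feature-matching term (student vs. teacher features);
    (1 - alpha) weights the quality-regression term (predicted vs. labeled score).
    """
    feature_term = mse(student_feat, teacher_feat)
    quality_term = (student_score - target_score) ** 2
    return alpha * feature_term + (1.0 - alpha) * quality_term


# Toy values: 2-D features and a single quality score.
loss = distillation_loss([0.2, 0.4], [0.0, 0.0], 3.0, 4.0, alpha=0.5)
```

In a real setup the teacher features would come from a frozen InternVideo2 and the student features from the lightweight backbone, with the two terms computed over batches of tensors rather than Python lists.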
Taxonomy
Topics: Image and Video Quality Assessment · Video Analysis and Summarization · Video Coding and Compression Technologies
