Beyond VMAF: Towards Application-Specific Metrics for Teleoperation Video
Ines Trautmannsheimer, Richard Grauberger, Frank Diermeyer

TL;DR
This paper adapts the VMAF video quality metric to teleoperation by retraining it on subjective ratings of compressed driving video. The retrained model agrees more closely with human ratings (lower RMSE and MAD) than the original, which underlines the value of application-specific metrics.
Contribution
The authors retrain VMAF on subjective quality ratings collected in an online study of compressed video sequences from the Zenseact Dataset, producing a teleoperation-specific model that better predicts human-perceived quality in safety-critical scenarios.
Findings
Retrained VMAF shows about 15% lower RMSE (10.36 to 8.83) and about 27% lower MAD (8.71 to 6.38) than the original 4K VMAF model.
Domain-specific training improves the metric's alignment with human ratings.
Outliers reveal cases where high objective scores mask critical degradations.
Abstract
Automated driving has made remarkable progress, yet situations still arise where human intervention is necessary. Teleoperation provides a scalable solution to address such cases, enabling remote operators to support vehicles without being physically present. In this context, video transmission forms the operator's primary source of situational awareness, making video quality a decisive factor for both safety and task performance. In an online study, participants provided subjective quality ratings for compressed video sequences from the Zenseact Dataset. These ratings were then used to retrain the Video Multi-Method Assessment Fusion (VMAF) model, yielding an adapted variant tailored to teleoperation. The retrained model demonstrated improved alignment with human ratings compared to the original 4K VMAF. In particular, RMSE decreased from 10.36 to 8.83, and MAD from 8.71 to 6.38,…
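The percentage reductions listed under Findings follow directly from the absolute RMSE and MAD values in the abstract. The short sketch below reproduces that arithmetic and, for illustration only, shows how an RMSE between metric predictions and human ratings is typically computed. The helper names and the toy score lists are assumptions made for this example and do not come from the paper. The abstract does not spell out which definition of MAD is used, so MAD appears here only through its reported values.

```python
import math


def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


def rmse(predicted, observed):
    """Root mean squared error between two equal-length score lists."""
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Values reported in the abstract (original 4K VMAF vs. retrained model).
rmse_drop = relative_reduction(10.36, 8.83)
mad_drop = relative_reduction(8.71, 6.38)
print(f"RMSE reduction: {rmse_drop:.1f}%")  # prints "RMSE reduction: 14.8%"
print(f"MAD reduction: {mad_drop:.1f}%")    # prints "MAD reduction: 26.8%"

# Hypothetical scores on a 0-100 scale, purely to illustrate the RMSE formula.
toy_predicted = [80.0, 60.0, 40.0]
toy_human = [70.0, 65.0, 45.0]
toy_rmse = rmse(toy_predicted, toy_human)
print(f"Toy RMSE: {toy_rmse:.2f}")
```

Rounded to whole percentages, these reductions match the 15% and 27% figures quoted in the Findings.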
