RF-GML: Reference-Free Generative Machine Listener
Arijit Biswas, Guanxin Jiang

TL;DR
RF-GML is a reference-free audio quality metric that predicts subjective quality scores for coded mono, stereo, and binaural audio across diverse content types and codecs, using transfer learning from a full-reference model.
Contribution
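The paper introduces RF-GML, a reference-free audio quality assessment model obtained by transfer learning from the full-reference Generative Machine Listener (GML) with minimal architectural modifications, and reports accurate predictions across diverse audio content and codecs.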
Findings
Outperforms existing RF models in predicting subjective quality.
Effectively distinguishes different levels of coding artifacts.
Works well across mono, stereo, and binaural audio at 48 kHz.
Abstract
This paper introduces a novel reference-free (RF) audio quality metric called the RF-Generative Machine Listener (RF-GML), designed to evaluate coded mono, stereo, and binaural audio at a 48 kHz sample rate. RF-GML leverages transfer learning from a state-of-the-art full-reference (FR) Generative Machine Listener (GML) with minimal architectural modifications. The term "generative" refers to the model's ability to generate an arbitrary number of simulated listening scores. Unlike existing RF models, RF-GML accurately predicts subjective quality scores across diverse content types and codecs. Extensive evaluations demonstrate its superiority in rating unencoded audio and distinguishing different levels of coding artifacts. RF-GML's performance and versatility make it a valuable tool for coded audio quality assessment and monitoring in various applications, all without the need for a…
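The "generative" idea in the abstract, producing an arbitrary number of simulated listening scores, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' method: the sketch supposes the model outputs a mean and a standard deviation for one audio clip, that scores follow a Gaussian, and that they live on a 0 to 100 scale. The function names and the example parameters (72.0, 8.0) are hypothetical.

```python
import random
import statistics


def simulate_listener_scores(mean, std, n_listeners, seed=None, lo=0.0, hi=100.0):
    """Draw n_listeners simulated scores from a Gaussian, clipped to [lo, hi].

    ASSUMPTION: a Gaussian with predicted (mean, std) and a 0-100 scale are
    illustrative choices; the source does not specify the distribution.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible per seed
    return [min(hi, max(lo, rng.gauss(mean, std))) for _ in range(n_listeners)]


def summarize(scores):
    """Return the mean score and the half-width of an approximate 95%
    confidence interval (normal approximation)."""
    m = statistics.fmean(scores)
    if len(scores) < 2:
        return m, 0.0
    half_width = 1.96 * statistics.stdev(scores) / len(scores) ** 0.5
    return m, half_width


# Hypothetical predicted parameters for a single coded clip.
scores = simulate_listener_scores(mean=72.0, std=8.0, n_listeners=1000, seed=0)
mean_score, ci_half_width = summarize(scores)
print(f"simulated mean = {mean_score:.1f} +/- {ci_half_width:.1f}")
```

Because the scores are sampled rather than read off as a single number, the same predicted parameters can stand in for a listening panel of any size, and the confidence interval narrows as the number of simulated listeners grows.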
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
