Towards HRTF Personalization using Denoising Diffusion Models
Juan Camilo Albarracín Sánchez, Luca Comanducci, Mirco Pezzoli, Fabio Antonacci

TL;DR
This paper introduces a novel approach using denoising diffusion probabilistic models conditioned on anthropometric data to generate personalized HRTFs, demonstrating promising results comparable to current state-of-the-art methods.
Contribution
The paper proposes a first approach to HRTF personalization with DDPMs conditioned on anthropometric measurements, aimed at more accurate personalized binaural rendering.
Findings
DDPMs conditioned on anthropometric measurements can generate personalized HRIRs.
Performance is in line with existing state-of-the-art models.
The results indicate that DDPM-based HRTF personalization is feasible.
Abstract
Head-Related Transfer Functions (HRTFs) have fundamental applications for realistic rendering in immersive audio scenarios. However, they are strongly subject-dependent, as they vary considerably with the shape of the ears, head and torso. Thus, personalization procedures are required for accurate binaural rendering. Recently, Denoising Diffusion Probabilistic Models (DDPMs), a class of generative learning techniques, have been applied to a variety of signal processing problems. In this paper, we propose a first approach that uses DDPMs conditioned on anthropometric measurements to generate personalized Head-Related Impulse Responses (HRIRs), the time-domain representation of HRTFs. The results show the feasibility of DDPMs for HRTF personalization, with performance in line with state-of-the-art models.
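The two technical ideas in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It shows the standard closed-form DDPM forward (noising) step applied to an impulse response, and the discrete Fourier transform that relates an HRIR to the magnitude of its HRTF. The linear noise schedule, the 1000 diffusion steps, the toy decaying-sinusoid "HRIR", and all function names are assumptions chosen for illustration; the abstract does not specify any of them. The denoising network and the anthropometric conditioning are omitted.

```python
import cmath
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (values are assumptions)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).

    Returns the noisy signal and the noise that was added. In a DDPM, a
    network is trained to predict that noise from x_t, t, and (here,
    hypothetically) a vector of anthropometric measurements.
    """
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]
    return xt, noise


def hrir_to_hrtf_magnitude(hrir):
    """Magnitude of the DFT of an HRIR for bins 0..N/2 (naive O(N^2) DFT)."""
    n = len(hrir)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(h * cmath.exp(-2j * math.pi * k * i / n)
                  for i, h in enumerate(hrir))
        mags.append(abs(acc))
    return mags


# Toy stand-in for a measured HRIR (hypothetical data, not from the paper).
toy_hrir = [math.exp(-i / 8.0) * math.sin(2 * math.pi * i / 16.0)
            for i in range(64)]

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

# Early step: the signal is barely perturbed. Last step: almost pure noise.
slightly_noisy, _ = forward_diffuse(toy_hrir, 10, alpha_bars, rng)
almost_pure_noise, _ = forward_diffuse(toy_hrir, 999, alpha_bars, rng)

# Frequency-domain view of the clean toy response.
hrtf_mag = hrir_to_hrtf_magnitude(toy_hrir)
```

Generation runs this process in reverse: starting from Gaussian noise, the trained network removes noise step by step until an HRIR for the given subject remains. Working in the time domain means the HRTF can always be recovered afterwards with a Fourier transform, as `hrir_to_hrtf_magnitude` does for the magnitude.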
Taxonomy
Topics: AI and HR Technologies
