RFOP: Rethinking Fusion and Orthogonal Projection for Face-Voice Association

Abdul Hannan; Furqan Malik; Hina Jabbar; Syed Suleman Sadiq; Mubashir Noman

arXiv:2512.02860·cs.CV·December 3, 2025

TL;DR

This paper presents RFOP, an approach to face-voice association in multilingual settings that revisits fusion and orthogonal projection to focus on the relevant semantic information in the face and voice modalities. It ranked 3rd in the FAME 2026 challenge with an EER of 33.1.

Contribution

The paper introduces RFOP, a method that revisits fusion and orthogonal projection for face-voice association so that the model focuses on the relevant semantic information shared by the two modalities.

Findings

01 Achieved 33.1% EER on English-German face-voice data

02 Ranked 3rd in the FAME 2026 challenge

03 Effective focus on relevant semantic information improves performance

Abstract

The Face-voice Association in Multilingual Environments (FAME) 2026 challenge investigates the face-voice association task in multilingual scenarios. The challenge introduces English-German face-voice pairs for use in the evaluation phase. To this end, we revisit fusion and orthogonal projection for face-voice association by focusing on the relevant semantic information within the two modalities. Our method performs favorably on the English-German data split and ranked 3rd in the FAME 2026 challenge, achieving an EER of 33.1.
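
For readers unfamiliar with the reported metric: the equal error rate (EER) is the operating point at which the false-acceptance rate equals the false-rejection rate, so lower is better. The sketch below shows one common way to estimate it from verification scores. The function name `equal_error_rate`, the threshold sweep, and the toy scores are illustrative assumptions; this is not the challenge's official scoring code.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Estimate the EER (as a fraction in [0, 1]) from pair scores.

    scores: similarity per face-voice pair; higher means "more likely the same identity".
    labels: 1 for a matching pair, 0 for a non-matching pair.
    Assumes at least one matching and one non-matching pair.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = labels == 1
    neg = ~pos

    # Sweep every observed score as a threshold, plus +inf (reject everything).
    thresholds = np.concatenate([np.unique(scores), [np.inf]])

    # A pair is accepted when its score is >= the threshold.
    far = np.array([(scores[neg] >= t).mean() for t in thresholds])  # false accepts
    frr = np.array([(scores[pos] < t).mean() for t in thresholds])   # false rejects

    # Take the threshold where the two error rates are closest and average them.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


# Toy example (made-up scores): one non-matching pair outscores one matching pair.
toy_scores = [0.9, 0.3, 0.6, 0.1]
toy_labels = [1, 1, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

An EER reported as 33.1 on a percentage scale corresponds to a return value of about 0.331 from a function like this one. Evaluation toolkits often interpolate between thresholds instead of picking the nearest one, so values can differ slightly on small score sets.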

Taxonomy

Topics: Face recognition and analysis · Emotion and Mood Recognition · Face Recognition and Perception