Bidirectional Cross-Modal Prompting for Event-Frame Asymmetric Stereo
Ninghui Xu, Fabio Tosi, Lihui Wang, Jiawei Han, Luca Bartolomei, Zhiting Yao, Matteo Poggi, Stefano Mattoccia

TL;DR
This paper presents Bi-CMPStereo, a bidirectional cross-modal prompting framework that enhances event-frame stereo matching by leveraging semantic and structural features from both modalities.
Contribution
It introduces a bidirectional cross-modal prompting method that learns aligned stereo representations within a target canonical space and integrates domain-specific cues from both modalities, improving accuracy and generalization in event-frame stereo matching.
Findings
Outperforms state-of-the-art event-frame stereo matching methods in accuracy.
Demonstrates strong generalization.
Effectively exploits semantic and structural features from both modalities.
Abstract
Conventional frame-based cameras capture rich contextual information but suffer from limited temporal resolution and motion blur in dynamic scenes. Event cameras offer an alternative visual representation with higher dynamic range free from such limitations. The complementary characteristics of the two modalities make event-frame asymmetric stereo promising for reliable 3D perception under fast motion and challenging illumination. However, the modality gap often leads to marginalization of domain-specific cues essential for cross-modal stereo matching. In this paper, we introduce Bi-CMPStereo, a novel bidirectional cross-modal prompting framework that fully exploits semantic and structural features from both domains for robust matching. Our approach learns finely aligned stereo representations within a target canonical space and integrates complementary representations by projecting…
