{S\textsuperscript{2}M\textsuperscript{2}}: Scalable Stereo Matching Model for Reliable Depth Estimation
Junhong Min, Youngpil Jeon, Jimin Kim, Minyong Choi

TL;DR
The paper introduces {S\textsuperscript{2}M\textsuperscript{2}}, a scalable global stereo matching model that achieves state-of-the-art accuracy and efficiency across diverse datasets without dataset-specific tuning.
Contribution
It presents a novel global matching architecture using a multi-resolution transformer and a new loss function, overcoming computational barriers of previous global methods.
Findings
State-of-the-art accuracy on Middlebury v3 and ETH3D benchmarks
Significantly outperforms prior methods in most metrics
Reconstructs high-quality details with competitive efficiency
Abstract
The pursuit of a generalizable stereo matching model, capable of performing well across varying resolutions and disparity ranges without dataset-specific fine-tuning, has revealed a fundamental trade-off. Iterative local search methods achieve high scores on constrained benchmarks, but their core mechanism inherently limits the global consistency required for true generalization. However, global matching architectures, while theoretically more robust, have historically been rendered infeasible by prohibitive computational and memory costs. We resolve this dilemma with {S\textsuperscript{2}M\textsuperscript{2}}: a global matching architecture that achieves state-of-the-art accuracy and high efficiency without relying on cost volume filtering or deep refinement stacks. Our design integrates a multi-resolution transformer for robust long-range correspondence, trained with a novel loss…
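To make the term "global matching" concrete, here is a minimal sketch for a single rectified scanline. It is a generic illustration, not the paper's architecture: the function name `global_scanline_match`, the dot-product similarity, the `temperature` parameter, and the soft-argmax readout are all assumptions chosen for this example. Each left pixel is compared against every candidate on the same row of the right image, rather than searching only a local window around a current estimate.

```python
import math


def global_scanline_match(left_feats, right_feats, temperature=1.0):
    """Expected disparity per pixel for one rectified scanline.

    Hypothetical helper for illustration only (not the authors' method).
    left_feats, right_feats: lists of W feature vectors (lists of floats).
    Returns a list of W non-negative expected disparities.
    """
    width = len(left_feats)
    disparities = []
    for x in range(width):
        # Compare the left pixel with ALL right pixels xr <= x on the row,
        # i.e. every non-negative disparity d = x - xr, not a local window.
        scores = [
            sum(a * b for a, b in zip(left_feats[x], right_feats[xr])) / temperature
            for xr in range(x + 1)
        ]
        # Numerically stable softmax over the full candidate set.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Soft-argmax: expectation of disparity under the match distribution.
        expected = sum(w * (x - xr) for xr, w in enumerate(weights)) / total
        disparities.append(expected)
    return disparities


if __name__ == "__main__":
    # Toy data: one-hot features, right scanline shifted by 2 pixels.
    feats = [[20.0 if i == x else 0.0 for i in range(8)] for x in range(8)]
    left, right = feats[:6], feats[2:8]
    print(global_scanline_match(left, right))
```

In the toy data, every left pixel at position 2 or later has its true match two pixels to the left in the right scanline, so its expected disparity comes out very close to 2. The first two pixels have no valid match and fall back to a uniform average over their candidates. The nested loop also shows the cost issue the abstract refers to: comparing every pixel against every candidate on the row scales quadratically with image width.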
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Medical Image Segmentation Techniques
