Learning Semantic Correspondence with Sparse Annotations
Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava

TL;DR
This paper introduces a teacher-student framework with denoising strategies to improve dense semantic correspondence using sparse keypoint annotations, achieving state-of-the-art results.
Contribution
It proposes a novel paradigm with pseudo-label generation and denoising strategies, including spatial priors and dynamic label selection, for learning from sparse annotations.
Findings
Achieves state-of-the-art performance on semantic correspondence benchmarks.
Effective pseudo-label denoising improves correspondence accuracy.
Two learning strategies demonstrate robustness and versatility.
Abstract
Finding dense semantic correspondence is a fundamental problem in computer vision, which remains challenging in complex scenes due to background clutter, extreme intra-class variation, and a severe lack of ground truth. In this paper, we aim to address the challenge of label sparsity in semantic correspondence by enriching supervision signals from sparse keypoint annotations. To this end, we first propose a teacher-student learning paradigm for generating dense pseudo-labels and then develop two novel strategies for denoising pseudo-labels. In particular, we use spatial priors around the sparse annotations to suppress the noisy pseudo-labels. In addition, we introduce a loss-driven dynamic label selection strategy for label denoising. We instantiate our paradigm with two variants of learning strategies: a single offline teacher setting and a mutual online teachers setting. Our approach…
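The two denoising ideas named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify either mechanism, so everything below is an assumption made for illustration, not the authors' implementation: the spatial prior is modeled as a disc of hypothetical `radius` around each annotated keypoint, and the loss-driven selection is modeled as keeping the hypothetical `keep_ratio` fraction of candidate pseudo-labels with the smallest per-pixel loss. The function names are likewise invented here.

```python
import numpy as np


def spatial_prior_mask(height, width, keypoints, radius):
    """Return a boolean (height, width) mask that is True within `radius`
    pixels of any annotated keypoint (x, y).

    Assumption: the spatial prior is a simple disc around each keypoint.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    mask = np.zeros((height, width), dtype=bool)
    for x, y in keypoints:
        mask |= (xs - x) ** 2 + (ys - y) ** 2 <= radius ** 2
    return mask


def select_small_loss(per_pixel_loss, candidate_mask, keep_ratio):
    """Keep the candidate pixels whose loss is among the smallest
    `keep_ratio` fraction of all candidates.

    Assumption: "loss-driven selection" is modeled as small-loss selection.
    Ties at the threshold are all kept, so slightly more pixels than the
    requested fraction may survive.
    """
    candidate_losses = per_pixel_loss[candidate_mask]
    if candidate_losses.size == 0:
        return np.zeros_like(candidate_mask)
    k = max(1, int(round(keep_ratio * candidate_losses.size)))
    # k-th smallest loss among the candidates
    threshold = np.partition(candidate_losses, k - 1)[k - 1]
    return candidate_mask & (per_pixel_loss <= threshold)


# Toy usage: one keypoint at (x=2, y=2) on a 5x5 grid, synthetic losses.
prior = spatial_prior_mask(5, 5, keypoints=[(2, 2)], radius=1)
loss = np.arange(25, dtype=float).reshape(5, 5)
selected = select_small_loss(loss, prior, keep_ratio=0.4)
```

In this sketch the student would be supervised only at pixels where `selected` is True; recomputing `selected` at each training step from the current loss is what would make the selection dynamic.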
Taxonomy
Topics: Image Retrieval and Classification Techniques · Music and Audio Processing · Video Analysis and Summarization
