DCTM: Discrete-Continuous Transformation Matching for Semantic Flow
Seungryong Kim, Dongbo Min, Stephen Lin, and Kwanghoon Sohn

TL;DR
This paper introduces DCTM, a framework for dense semantic correspondence that handles complex deformations such as affine transformations by combining discrete label optimization with continuous regularization, and that outperforms prior methods on benchmark datasets.
Contribution
The paper proposes a discrete-continuous transformation matching framework that efficiently infers dense affine transformation fields for semantic correspondence.
Findings
Outperforms state-of-the-art methods on benchmark datasets.
Computes efficiently through constant-time edge-aware filtering and an affine-varying CNN-based descriptor.
Effectively handles complex geometric variations like affine transformations.
Abstract
Techniques for dense semantic correspondence have provided limited ability to deal with the geometric variations that commonly exist between semantically similar images. While variations due to scale and rotation have been examined, practical solutions are lacking for more complex deformations such as affine transformations because of the tremendous size of the associated solution space. To address this problem, we present a discrete-continuous transformation matching (DCTM) framework where dense affine transformation fields are inferred through a discrete label optimization in which the labels are iteratively updated via continuous regularization. In this way, our approach draws solutions from the continuous space of affine transformations in a manner that can be computed efficiently through constant-time edge-aware filtering and a proposed affine-varying CNN-based descriptor.…
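To make the phrase "dense affine transformation field" concrete, the following NumPy sketch stores one 2×3 affine matrix per pixel and converts the field into a dense flow. It is only an illustration under stated assumptions: the (H, W, 2, 3) layout and the function names identity_affine_field and field_to_flow are hypothetical and not taken from the paper, and the sketch does not implement DCTM's label optimization, edge-aware filtering, or descriptor.

```python
import numpy as np

def identity_affine_field(height, width):
    """Return an (H, W, 2, 3) field holding the identity affine map at every pixel.

    The storage layout is an assumption made for this illustration.
    """
    field = np.zeros((height, width, 2, 3), dtype=np.float64)
    field[..., 0, 0] = 1.0
    field[..., 1, 1] = 1.0
    return field

def field_to_flow(field):
    """Convert a per-pixel affine field into a dense flow (dx, dy) of shape (H, W, 2)."""
    height, width = field.shape[:2]
    ys, xs = np.mgrid[0:height, 0:width]
    # Homogeneous pixel coordinates [x, y, 1] for every pixel: shape (H, W, 3).
    homog = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(np.float64)
    # Apply each pixel's own 2x3 matrix to that pixel's coordinates.
    mapped = np.einsum("hwij,hwj->hwi", field, homog)
    # Flow is the displacement from the source position to the mapped position.
    return mapped - homog[..., :2]

field = identity_affine_field(4, 5)
field[..., 0, 2] = 2.0                 # shift every pixel by +2 in x
field[1, 3, :, :2] = 2.0 * np.eye(2)   # pixel (x=3, y=1) is also scaled by 2
flow = field_to_flow(field)
```

Because every pixel carries six free parameters instead of a two-parameter translation, even a coarse discretization of this field produces a very large label space, which is the difficulty the abstract refers to.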
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
