TopicFM: Robust and Interpretable Topic-Assisted Feature Matching
Khang Truong Giang, Soohwan Song, Sungho Jo

TL;DR
TopicFM is an image-matching method that uses topic modeling to encode high-level semantic context, which improves robustness and interpretability in challenging scenes with large variations or little texture.
Contribution
The paper presents TopicFM as the first method to apply topic modeling for explicit high-level context encoding in image matching, improving robustness and making the matching results interpretable.
Findings
Outperforms state-of-the-art methods on outdoor and indoor datasets.
Shows significant robustness in challenging scenes with large variations.
Provides explainable matching results through inferred topics.
Abstract
This study addresses an image-matching problem in challenging cases, such as large scene variations or textureless scenes. To gain robustness to such situations, most previous studies have attempted to encode the global contexts of a scene via graph neural networks or transformers. However, these contexts do not explicitly represent high-level contextual information, such as structural shapes or semantic instances; therefore, the encoded features are still not sufficiently discriminative in challenging scenes. We propose a novel image-matching method that applies a topic-modeling strategy to encode high-level contexts in images. The proposed method trains latent semantic instances called topics. It explicitly models an image as a multinomial distribution of topics, and then performs probabilistic feature matching. This approach improves the robustness of matching by focusing on the same…
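The idea described in the abstract (features assigned to latent topics, an image summarized as a distribution over those topics, and matching weighted by topic agreement) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the softmax-based topic assignment, the dual-softmax appearance score, the mutual-best selection, and the `temperature` parameter are all assumptions chosen for illustration, and in the paper the topics are learned rather than fixed by hand.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def topic_distributions(features, topics):
    """Per-feature distribution over K topics, shape (N, K) (assumed form)."""
    return softmax(features @ topics.T, axis=-1)


def image_topic_distribution(theta):
    """Image-level distribution over topics: mean of the per-feature rows."""
    return theta.mean(axis=0)


def topic_assisted_match(feat_a, feat_b, topics, temperature=0.1):
    """Return mutual-best matches (i, j) and the full score matrix.

    Hypothetical helper: scores combine appearance similarity with the
    probability that two features belong to the same topic.
    """
    theta_a = topic_distributions(feat_a, topics)
    theta_b = topic_distributions(feat_b, topics)
    # Probability that feature i of A and feature j of B fall in the same topic.
    same_topic = theta_a @ theta_b.T
    # Appearance confidence via a dual softmax over rows and columns.
    sim = (feat_a @ feat_b.T) / temperature
    appearance = softmax(sim, axis=1) * softmax(sim, axis=0)
    score = appearance * same_topic
    # Keep only pairs that are each other's best candidate.
    best_for_a = score.argmax(axis=1)
    best_for_b = score.argmax(axis=0)
    matches = [(i, int(j)) for i, j in enumerate(best_for_a) if best_for_b[j] == i]
    return matches, score


# Toy usage: image B holds the same four descriptors as image A, reordered.
feat_a = np.eye(4)
feat_b = feat_a[[2, 0, 3, 1]]
# Two hand-made topics (assumption): one covers dims 0-1, the other dims 2-3.
topics = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]) / np.sqrt(2.0)
matches, score = topic_assisted_match(feat_a, feat_b, topics)
print(matches)
print(image_topic_distribution(topic_distributions(feat_a, topics)))
```

In this toy setup the topic term only rescales the appearance score. The point of the sketch is the structure: pairs whose features are likely to share a topic receive more weight, and the per-feature topic assignments give a human-readable account of why a pair was matched.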
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention · Residual Connection · Dense Connections · Byte Pair Encoding · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing
