# Learning Local Feature Descriptor with Motion Attribute for Vision-based   Localization

**Authors:** Yafei Song, Di Zhu, Jia Li, Yonghong Tian, Mingyang Li

arXiv: 1908.01180 · 2019-08-08

## TL;DR

This paper introduces MD-Net, a convolutional network that jointly estimates motion attributes and local feature descriptors to enhance vision-based localization accuracy, especially by filtering out unsteady features.

## Contribution

The paper presents a novel FCN architecture that simultaneously learns local feature descriptors and motion attributes, improving localization performance without significant extra computation.

## Key findings

- Outperforms existing methods in feature description and motion attribute estimation.
- Enhances localization accuracy by integrating motion attributes.
- Efficiently learns features using a shared backbone network.

## Abstract

In recent years, camera-based localization has been widely used for robotic applications, and most proposed algorithms rely on local features extracted from recorded images. For better performance, the features used for open-loop localization are required to be short-term globally static, and the ones used for re-localization or loop closure detection need to be long-term static. Therefore, the motion attribute of a local feature point could be exploited to improve localization performance, e.g., the feature points extracted from moving persons or vehicles can be excluded from these systems due to their unsteadiness. In this paper, we design a fully convolutional network (FCN), named MD-Net, to perform motion attribute estimation and feature description simultaneously. MD-Net has a shared backbone network to extract features from the input image and two network branches to complete each sub-task. With MD-Net, we can obtain the motion attribute while avoiding increasing much more computation. Experimental results demonstrate that the proposed method can learn distinct local feature descriptor along with motion attribute only using an FCN, by outperforming competing methods by a wide margin. We also show that the proposed algorithm can be integrated into a vision-based localization algorithm to improve estimation accuracy significantly.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.01180/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1908.01180/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/1908.01180/full.md

---
Source: https://tomesphere.com/paper/1908.01180