Hybrid Feature Embedding For Automatic Building Outline Extraction
Weihang Ran, Wei Yuan, Xiaodan Shi, Zipei Fan, Ryosuke Shibasaki

TL;DR
This paper introduces a hybrid CNN-Transformer model with an active contour component and a triple-branch decoder for precise building outline extraction from aerial images. It reaches 91.1% mIoU on Vaihingen and 83.8% mIoU on Bing huts, ahead of the baseline models it is compared with.
Contribution
It presents a novel combined CNN-Transformer architecture with a triple-branch decoder and active contour integration for enhanced building outline detection.
Findings
Achieved 91.1% mIoU on the Vaihingen dataset
Achieved 83.8% mIoU on the Bing huts dataset
Outperformed the baseline models on both datasets
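The figures above are reported as mIoU (mean intersection over union). As a minimal sketch of how such a score is commonly computed, the snippet below averages per-class IoU over two small label masks. The function name `mean_iou`, the two-class (background/building) setup, and the toy masks are illustrative assumptions; the paper's exact evaluation protocol is not described on this page.

```python
def mean_iou(pred, target, num_classes=2):
    """Mean IoU over classes for two equally shaped 2D label masks.

    Assumption: masks are lists of rows holding integer class labels
    (0 = background, 1 = building in this toy setup).
    """
    ious = []
    for c in range(num_classes):
        inter = 0  # pixels labelled c in both masks
        union = 0  # pixels labelled c in at least one mask
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred = (p == c)
                in_target = (t == c)
                inter += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 3x3 example: the prediction has one extra building pixel.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
target = [[0, 1, 1],
          [0, 1, 0],
          [0, 0, 0]]

miou = mean_iou(pred, target)
print(f"mIoU = {miou:.4f}")
```

In this toy case the building class has IoU 3/4 and the background class has IoU 5/6, so the mean is roughly 0.79. Whether a benchmark averages over both classes, or per image versus over the whole dataset, changes the number, so scores are only comparable under the same protocol.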
Abstract
Building outlines extracted from high-resolution aerial images can be used in application fields such as change detection and disaster assessment. However, traditional CNN models cannot recognize contours very precisely from the original images. In this paper, we propose a CNN- and Transformer-based model, combined with an active contour model, to deal with this problem. We also design a triple-branch decoder structure to handle the different features generated by the encoder. Experimental results show that our model outperforms the baseline models on two datasets, achieving 91.1% mIoU on Vaihingen and 83.8% mIoU on Bing huts.
Taxonomy
Topics: Automated Road and Building Extraction · Remote-Sensing Image Classification · Remote Sensing and LiDAR Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Softmax · Dense Connections · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Residual Connection
