Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images
Qinfeng Zhu, Yuanzhi Cai, Lei Fan

TL;DR
This paper evaluates xLSTM, in its Vision-LSTM form, for semantic segmentation of remotely sensed images and finds that it generally underperforms Transformer-based models.
Contribution
The first study to assess Vision-LSTM for semantic segmentation of remotely sensed images, using a novel encoder-decoder architecture called Seg-LSTM.
Findings
Vision-LSTM generally underperforms Vision Transformer-based models on this task.
Seg-LSTM provides a baseline for future improvements.
The study introduces a new evaluation framework for Vision-LSTM in segmentation.
Abstract
Recent advancements in autoregressive networks with linear complexity have driven significant research progress, demonstrating exceptional performance in large language models. A representative model is the Extended Long Short-Term Memory (xLSTM), which incorporates gating mechanisms and memory structures, performing comparably to Transformer architectures in long-sequence language tasks. Autoregressive networks such as xLSTM can utilize image serialization to extend their application to visual tasks such as classification and segmentation. Although existing studies have demonstrated Vision-LSTM's impressive results in image classification, its performance in image semantic segmentation remains unverified. Our study represents the first attempt to evaluate the effectiveness of Vision-LSTM in the semantic segmentation of remotely sensed images. This evaluation is based on a specifically…
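The "image serialization" mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain row-major scan over non-overlapping square patches, and the function name `serialize_image` and the toy 4 x 4 image are hypothetical, chosen only to show how a 2D image becomes a 1D token sequence that a sequence model could consume.

```python
def serialize_image(image, patch):
    """Turn an H x W image (a list of rows) into a sequence of flat patch tokens.

    Assumption: non-overlapping patch x patch tiles visited in row-major order.
    Real vision backbones typically add a learned projection and positional
    information on top of this step.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")

    tokens = []
    for top in range(0, height, patch):          # walk patch rows
        for left in range(0, width, patch):      # walk patch columns
            # Flatten one tile into a single token, row by row.
            token = [image[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            tokens.append(token)
    return tokens


# Toy single-channel image with pixel values 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
sequence = serialize_image(image, patch=2)
```

With a patch size of 2, the 4 x 4 toy image yields four tokens of four values each, ordered left to right and then top to bottom. The scan order is a design choice; other traversal orders are possible and this sketch does not claim to match the one used in the paper.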
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Linear Layer · Multi-Head Attention · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam
