Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images

Qinfeng Zhu; Yuanzhi Cai; Lei Fan

arXiv:2406.14086 · cs.CV · January 13, 2026 · 3 cites

TL;DR

This paper evaluates the effectiveness of the xLSTM model for semantic segmentation of remotely sensed images and finds that it generally underperforms Transformer-based models.

Contribution

First to assess Vision-LSTM's performance in remote sensing image segmentation using a novel encoder-decoder architecture called Seg-LSTM.

Findings

01. Vision-LSTM generally underperforms Vision Transformers.

02. Seg-LSTM provides a baseline for future improvements.

03. The study introduces a new evaluation framework for Vision-LSTM in segmentation.

Abstract

Recent advancements in autoregressive networks with linear complexity have driven significant research progress, demonstrating exceptional performance in large language models. A representative model is the Extended Long Short-Term Memory (xLSTM), which incorporates gating mechanisms and memory structures, performing comparably to Transformer architectures in long-sequence language tasks. Autoregressive networks such as xLSTM can utilize image serialization to extend their application to visual tasks such as classification and segmentation. Although existing studies have demonstrated Vision-LSTM's impressive results in image classification, its performance in image semantic segmentation remains unverified. Our study represents the first attempt to evaluate the effectiveness of Vision-LSTM in the semantic segmentation of remotely sensed images. This evaluation is based on a specifically…
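The image serialization mentioned in the abstract can be illustrated with a minimal sketch. It assumes non-overlapping square patches read in row-major (raster) order, which is one common choice and not necessarily the scheme used in Seg-LSTM. The function names `serialize_image` and `deserialize_tokens` are hypothetical and exist only for this illustration.

```python
import numpy as np


def serialize_image(image: np.ndarray, patch: int) -> np.ndarray:
    """Turn an (H, W, C) image into a (num_patches, patch*patch*C) sequence.

    Assumption: non-overlapping patches, raster (row-major) scan order.
    """
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("height and width must be divisible by the patch size")
    rows, cols = h // patch, w // patch
    # Split both spatial axes into (patch index, offset inside patch),
    # then bring the two patch indices to the front.
    grid = image.reshape(rows, patch, cols, patch, c).transpose(0, 2, 1, 3, 4)
    return grid.reshape(rows * cols, patch * patch * c)


def deserialize_tokens(tokens: np.ndarray, h: int, w: int, c: int, patch: int) -> np.ndarray:
    """Inverse of serialize_image: rebuild the (H, W, C) spatial layout."""
    rows, cols = h // patch, w // patch
    grid = tokens.reshape(rows, cols, patch, patch, c).transpose(0, 2, 1, 3, 4)
    return grid.reshape(h, w, c)


if __name__ == "__main__":
    img = np.arange(8 * 8 * 3).reshape(8, 8, 3)
    seq = serialize_image(img, patch=4)
    print(seq.shape)  # one row per patch
    restored = deserialize_tokens(seq, 8, 8, 3, patch=4)
    print(np.array_equal(restored, img))
```

A sequence model then processes the patch rows as tokens. For dense prediction such as segmentation, the per-token outputs have to be mapped back onto the spatial grid, which is what the inverse function sketches.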

Code & Models

Repositories

zhuqinfeng1999/seg-lstm (Official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Linear Layer · Multi-Head Attention · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam