# Long Short-Term Memory Spatial Transformer Network

**Authors:** Shiyang Feng, Tianyue Chen, Hao Sun

arXiv: 1901.02273 · 2019-09-02

## TL;DR

This paper combines an LSTM with a spatial transformer network (STN) to classify digits in sequences built from MNIST elements. The LSTM supplies top-down attention so that the STN transforms each element separately instead of distorting the whole sequence, giving a 1.6% single-digit error versus 2.2% for a CNN with an STN layer.

## Contribution

The paper presents a combined LSTM and spatial transformer network (LSTM-STN) for digit sequence classification. The LSTM layer provides top-down attention, letting the STN transform sequence elements individually and so reducing the spatial distortion passed on to the classifier.

## Key findings

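- Achieved a 1.6% single-digit error rate on sequences formed from MNIST elements.
- Outperformed a convolutional neural network with an STN layer, which had a 2.2% error rate (0.6 percentage points higher).
- Avoided the distortion that can arise when an entire sequence is spatially transformed at once, by transforming elements independently.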

## Abstract

Spatial transformer networks have been used in a layered form in conjunction with convolutional networks to enable a model to transform data spatially. In this paper, we propose combining a spatial transformer network (STN) with a Long Short-Term Memory network (LSTM) to classify digits in sequences formed from MNIST elements. This LSTM-STN model gains a top-down attention mechanism from the LSTM layer, so that the STN layer can spatially transform the elements of the sequence independently, one at a time, thus avoiding the distortion that may be caused when the entire sequence is spatially transformed at once. It also avoids the influence of this distortion on the subsequent classification by convolutional neural networks, and achieves a single-digit error of 1.6%, compared with 2.2% for a convolutional neural network with an STN layer.
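
The per-element transformation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not specify the architecture, so the function names (`affine_sample`, `_bilinear`), the toy 2×6 "sequence", and the hand-set affine parameters in `thetas` are all assumptions. In an LSTM-STN, each step's parameters would presumably be predicted by the recurrent layer; here they are fixed values chosen to crop one element per step.

```python
import math
from typing import List, Sequence

Image = List[List[float]]
Theta = Sequence[float]  # (a, b, tx, c, d, ty): row-major 2x3 affine matrix


def _bilinear(image: Image, px: float, py: float) -> float:
    """Bilinear interpolation at pixel coordinates (px, py); zero outside."""
    in_h, in_w = len(image), len(image[0])
    x0, y0 = math.floor(px), math.floor(py)
    wx, wy = px - x0, py - y0

    def pix(y: int, x: int) -> float:
        return image[y][x] if 0 <= y < in_h and 0 <= x < in_w else 0.0

    return ((1 - wx) * (1 - wy) * pix(y0, x0)
            + wx * (1 - wy) * pix(y0, x0 + 1)
            + (1 - wx) * wy * pix(y0 + 1, x0)
            + wx * wy * pix(y0 + 1, x0 + 1))


def affine_sample(image: Image, theta: Theta, out_h: int, out_w: int) -> Image:
    """Sample `image` on an affine-transformed grid, STN-style.

    Source and target coordinates are normalized to [-1, 1].
    """
    in_h, in_w = len(image), len(image[0])
    a, b, tx, c, d, ty = theta
    out: Image = []
    for i in range(out_h):
        yt = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            xt = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            # Map target coordinates to source coordinates.
            xs = a * xt + b * yt + tx
            ys = c * xt + d * yt + ty
            # Convert normalized source coordinates back to pixel indices.
            px = (xs + 1.0) * (in_w - 1) / 2.0
            py = (ys + 1.0) * (in_h - 1) / 2.0
            row.append(_bilinear(image, px, py))
        out.append(row)
    return out


# Toy "sequence": three 2x2 elements with constant values 1, 2 and 3.
sequence: Image = [[1.0, 1.0, 2.0, 2.0, 3.0, 3.0] for _ in range(2)]

# Hypothetical per-step parameters (stand-ins for LSTM outputs): shrink the
# horizontal extent to one element and shift the window along the sequence.
thetas = [(0.2, 0.0, tx, 0.0, 1.0, 0.0) for tx in (-0.8, 0.0, 0.8)]

# One independent spatial transformation per time step.
crops = [affine_sample(sequence, theta, 2, 2) for theta in thetas]
```

Each entry of `crops` is a 2×2 window covering a single element, so a downstream classifier would see one undistorted element per step rather than a single transformation of the whole sequence.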

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.02273/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1901.02273/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1901.02273/full.md

---
Source: https://tomesphere.com/paper/1901.02273