# DA-LSTM: A Long Short-Term Memory with Depth Adaptive to Non-uniform Information Flow in Sequential Data

**Authors:** Yifeng Zhang, Ka-Ho Chow, S.-H. Gary Chan

arXiv: 1903.02082 · 2019-03-07

## TL;DR

DA-LSTM dynamically adjusts the depth of an LSTM according to how information is distributed across a sequence, so it can model non-uniform information flow while converging faster and using less computation than fixed-depth Stacked and Deep Transition LSTMs.

## Contribution

The paper proposes a depth-adaptive LSTM architecture that adjusts its structure to the information distribution of the input, without prior knowledge of that distribution.

## Key findings

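- Reduces convergence time by 41.78% compared with Stacked LSTM and by 46.01% compared with Deep Transition LSTM.
- Uses much less computation than these fixed-depth baselines.
- Outperforms Stacked LSTM and Deep Transition LSTM on both measures in experiments on real-world datasets.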

## Abstract

Much sequential data exhibits highly non-uniform information distribution. This cannot be correctly modeled by traditional Long Short-Term Memory (LSTM). To address that, recent works have extended LSTM by adding more activations between adjacent inputs. However, these approaches often use a fixed depth, set by the step with the most information content. This one-size-fits-all worst-case approach is not satisfactory, because when little information is distributed to some steps, shallow structures can achieve faster convergence and consume fewer computational resources. In this paper, we develop a Depth-Adaptive Long Short-Term Memory (DA-LSTM) architecture, which can dynamically adjust its structure depending on the information distribution without prior knowledge. Experimental results on real-world datasets show that DA-LSTM uses far fewer computational resources and substantially reduces convergence time, by $41.78\%$ and $46.01\%$ compared with Stacked LSTM and Deep Transition LSTM, respectively.
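
The cost argument in the abstract (a fixed depth sized for the most informative step versus a depth that varies per step) can be illustrated with a toy sketch. This is not the DA-LSTM architecture: the scalar `tanh` recurrence, the weights `w_in` and `w_rec`, the sample `sequence`, and `required_depth` (a hypothetical rule that maps input magnitude to a depth) are all assumptions made for illustration, since this summary does not describe how the paper chooses the depth.

```python
import math


def required_depth(x, max_depth=4):
    """Hypothetical stand-in for an information measure: larger |x| asks for more depth."""
    return max(1, min(max_depth, 1 + int(abs(x))))


def run(sequence, depth_fn, w_in=0.5, w_rec=0.9):
    """Toy scalar recurrence (not an LSTM).

    Returns the final state and the total number of transition activations,
    which serves here as a rough proxy for computation.
    """
    h = 0.0
    activations = 0
    for x in sequence:
        depth = depth_fn(x)
        # The first transition consumes the input.
        h = math.tanh(w_in * x + w_rec * h)
        # Any remaining transitions only refine the state between inputs.
        for _ in range(depth - 1):
            h = math.tanh(w_rec * h)
        activations += depth
    return h, activations


# Illustrative sequence: mostly low-magnitude steps with two larger ones.
sequence = [0.1, 0.2, 3.5, 0.0, 0.1, 2.2, 0.3, 0.1]

# Fixed-depth baseline: every step uses the depth needed by the most demanding step.
worst_case = max(required_depth(x) for x in sequence)
_, fixed_cost = run(sequence, lambda x: worst_case)

# Adaptive variant: each step uses only the depth its own input asks for.
_, adaptive_cost = run(sequence, required_depth)

print("fixed-depth activations:", fixed_cost)
print("adaptive-depth activations:", adaptive_cost)
```

In this sketch the fixed-depth run pays the worst-case depth at every step, while the adaptive run pays it only at the steps that ask for it. The size of the gap depends entirely on the made-up sequence and depth rule, so it says nothing about the savings reported in the paper.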

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.02082/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1903.02082/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1903.02082/full.md

---
Source: https://tomesphere.com/paper/1903.02082