Physics-inspired Energy Transition Neural Network for Sequence Learning
Zhou Wu, Junyi An, Baile Xu, Furao Shen, Jian Zhao

TL;DR
This paper introduces PETNN, a physics-inspired recurrent neural network that captures long-term dependencies, outperforms Transformer-based methods on sequence tasks, and has lower computational complexity, challenging the dominance of Transformers in sequence modeling.
Contribution
The paper proposes PETNN, a novel physics-inspired recurrent architecture that enhances long-term sequence learning and demonstrates superior performance and efficiency over transformer-based models.
Findings
PETNN outperforms transformer-based methods on various sequence tasks.
PETNN exhibits significantly lower computational complexity than Transformer-based methods.
The study highlights the potential of physics-inspired recurrent models in sequence learning.
Abstract
Recently, the superior performance of Transformers has made them a more robust and scalable solution for sequence modeling than traditional recurrent neural networks (RNNs). However, the effectiveness of Transformers in capturing long-term dependencies is primarily attributed to their comprehensive pair-modeling process rather than inherent inductive biases toward sequence semantics. In this study, we explore the capabilities of pure RNNs and reassess their long-term learning mechanisms. Inspired by physics energy transition models that track energy changes over time, we propose an effective recurrent structure called the "Physics-inspired Energy Transition Neural Network" (PETNN). We demonstrate that PETNN's memory mechanism effectively stores information over long-term dependencies. Experimental results indicate that PETNN outperforms Transformer-based methods across various…
Taxonomy
Topics: Neural Networks and Applications
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · Adam · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax
