Transformer with Memory Replay
Rui Liu, Barzan Mozafari

TL;DR
This paper introduces Transformer with Memory Replay (TMR), enhancing sample efficiency and runtime performance in NLP tasks by integrating memory replay mechanisms into transformer models.
Contribution
It presents a novel integration of memory replay with transformers, improving sample efficiency and runtime performance in NLP pretraining.
Findings
At least a 1% point improvement over the baseline transformer on GLUE and SQuAD when pretrained with the same number of examples.
Achieves better runtime efficiency through optimized memory replay design.
Demonstrates effective sample reuse in transformer pretraining.
Abstract
Transformers achieve state-of-the-art performance for natural language processing tasks by pre-training on large-scale text corpora. They are extremely compute-intensive and have very high sample complexity. Memory replay is a mechanism that remembers and reuses past examples by saving to and replaying from a memory buffer. It has been successfully used in reinforcement learning and GANs due to better sample efficiency. In this paper, we propose Transformer with Memory Replay (TMR), which integrates memory replay with the transformer, making it more sample-efficient. Experiments on GLUE and SQuAD benchmark datasets show that Transformer with Memory Replay achieves at least a 1% point increase compared to the baseline transformer model when pretrained with the same number of examples. Further, by adopting a careful design that reduces the wall-clock time overhead of memory…
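The abstract describes memory replay as saving past examples to a buffer and replaying them later. The sketch below illustrates only that general idea and is not the paper's design: the class ReplayBuffer, the helper batches_with_replay, the FIFO eviction policy, and uniform random sampling are all assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity buffer of past training examples (illustrative only)."""

    def __init__(self, capacity, seed=0):
        # Assumption: FIFO eviction; the oldest examples are dropped first.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def save(self, examples):
        """Store a batch of examples for later reuse."""
        self.buffer.extend(examples)

    def replay(self, k):
        """Return up to k stored examples, sampled uniformly without replacement."""
        k = min(k, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def batches_with_replay(stream, buffer, replay_size):
    """Yield each fresh batch extended with examples replayed from the buffer."""
    for fresh in stream:
        replayed = buffer.replay(replay_size)  # drawn before saving the new batch
        buffer.save(fresh)
        yield list(fresh) + replayed


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=4)
    for batch in batches_with_replay([[1, 2], [3, 4], [5, 6]], buf, replay_size=2):
        print(batch)
```

In a real pretraining loop the stored items would be tokenized sequences rather than integers, and the choice of which examples to keep and replay would matter for both sample efficiency and wall-clock overhead.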
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
Methods: Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Byte Pair Encoding · Dense Connections · Dropout · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam
