Effective Batching for Recurrent Neural Network Grammars
Hiroshi Noji, Yohei Oseki

TL;DR
This paper introduces an effective batching method for recurrent neural network grammars (RNNGs), significantly improving training and inference speed while maintaining strong syntactic generalization performance.
Contribution
It proposes a parallel batching technique for RNNGs, enabling efficient GPU utilization and substantial speedups in training and inference compared to previous implementations.
Findings
Achieves a 6x speedup over the existing C++ DyNet implementation, which relies on model-independent auto-batching
Runs beam search inference 20-150x faster, depending on beam size
Maintains competitive syntactic generalization performance
Abstract
As a language model that integrates traditional symbolic operations and flexible neural representations, recurrent neural network grammars (RNNGs) have attracted great attention from both scientific and engineering perspectives. However, RNNGs are known to be harder to scale due to the difficulty of batched training. In this paper, we propose effective batching for RNNGs, where every operation is computed in parallel with tensors across multiple sentences. Our PyTorch implementation effectively employs a GPU and achieves a 6x speedup compared to the existing C++ DyNet implementation with model-independent auto-batching. Moreover, our batched RNNG also accelerates inference and achieves a 20-150x speedup for beam search, depending on beam size. Finally, we evaluate the syntactic generalization performance of the scaled RNNG against the LSTM baseline, based on the large training data of 100M…
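The phrase "every operation is computed in parallel with tensors across multiple sentences" can be illustrated with a small sketch. This is not the authors' PyTorch code. It is a minimal NumPy illustration that assumes each sentence's stack lives in one shared fixed-capacity array, and that sentences taking different actions at the same step are handled with boolean masks. The names `BatchedStack`, `step`, `SHIFT`, and `REDUCE` are hypothetical, and the reduce here simply averages the top two stack items as a stand-in for a learned composition function.

```python
import numpy as np

SHIFT, REDUCE = 0, 1  # hypothetical action ids, for illustration only


class BatchedStack:
    """Fixed-capacity stacks for a whole batch, stored in one array."""

    def __init__(self, batch_size, capacity, dim):
        self.data = np.zeros((batch_size, capacity, dim))
        # top[b] is the number of items currently on stack b
        self.top = np.zeros(batch_size, dtype=np.int64)

    def push(self, vecs, mask):
        """Push vecs[b] onto stack b for every b where mask[b] is True."""
        rows = np.nonzero(mask)[0]
        self.data[rows, self.top[rows]] = vecs[rows]
        self.top[rows] += 1


def step(stack, actions, inputs):
    """Apply one action per sentence without looping over sentences."""
    reduce_rows = np.nonzero(actions == REDUCE)[0]

    # REDUCE (simplified assumption): pop the top two items and push
    # their average, standing in for a learned composition function.
    right = stack.data[reduce_rows, stack.top[reduce_rows] - 1]
    left = stack.data[reduce_rows, stack.top[reduce_rows] - 2]
    stack.top[reduce_rows] -= 2
    composed = np.zeros_like(inputs)
    composed[reduce_rows] = 0.5 * (left + right)
    stack.push(composed, actions == REDUCE)

    # SHIFT: push the next input vector onto the remaining stacks.
    stack.push(inputs, actions == SHIFT)


# Two sentences, 1-dimensional "embeddings" to keep the values readable.
stack = BatchedStack(batch_size=2, capacity=4, dim=1)
step(stack, np.array([SHIFT, SHIFT]), np.array([[1.0], [2.0]]))
step(stack, np.array([SHIFT, SHIFT]), np.array([[3.0], [4.0]]))
# Sentence 0 reduces while sentence 1 shifts, in the same step.
step(stack, np.array([REDUCE, SHIFT]), np.array([[0.0], [6.0]]))
```

After the third step, sentence 0 holds a single composed item and sentence 1 holds three shifted items, even though both were updated by the same array operations. A real RNNG reduce pops a variable number of children back to an open nonterminal, which this sketch does not attempt to model.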
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
