Neuron Interaction Based Representation Composition for Neural Machine Translation
Jian Li, Xing Wang, Baosong Yang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

TL;DR
This paper introduces a novel neuron interaction-based method for neural machine translation that models pairwise neuron interactions to improve translation quality, outperforming the Transformer baseline on standard benchmarks.
Contribution
It proposes a bilinear pooling approach to model neuron interactions in NMT, enhancing linguistic representation and translation performance.
Findings
Improved translation accuracy over Transformer baseline.
Captures more syntactic and semantic information.
Effective modeling of neuron interactions enhances representations.
Abstract
Recent NLP studies reveal that substantial linguistic information can be attributed to single neurons, i.e., individual dimensions of the representation vectors. We hypothesize that modeling strong interactions among neurons helps to better capture complex information by composing the linguistic properties embedded in individual neurons. Starting from this intuition, we propose a novel approach to compose representations learned by different components in neural machine translation (e.g., multi-layer networks or multi-head attention), based on modeling strong interactions among neurons in the representation vectors. Specifically, we leverage bilinear pooling to model pairwise multiplicative interactions among individual neurons, and a low-rank approximation to make the model computationally feasible. We further propose extended bilinear pooling to incorporate first-order…
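The low-rank idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a generic rank-r factorization of the bilinear weight tensor; the names `U`, `V`, `P`, the dimensions, and the random inputs are illustrative assumptions, not the paper's actual parameterization. The sketch compares full bilinear pooling (one weight matrix per output dimension, covering every pair of neurons) with a factorized form that never materializes the full tensor.

```python
import numpy as np

def full_bilinear(x, y, W):
    # W has shape (d_out, d_x, d_y); output o is x^T W[o] y,
    # i.e. a weighted sum over all pairwise products x_i * y_j.
    return np.einsum("i,oij,j->o", x, W, y)

def low_rank_bilinear(x, y, U, V, P):
    # Assumed factorization: project x and y to r dimensions,
    # multiply element-wise, then project to d_out dimensions.
    # U: (d_x, r), V: (d_y, r), P: (r, d_out)
    return ((x @ U) * (y @ V)) @ P

rng = np.random.default_rng(0)
d_x, d_y, r, d_out = 6, 5, 3, 4  # hypothetical sizes

x = rng.normal(size=d_x)
y = rng.normal(size=d_y)
U = rng.normal(size=(d_x, r))
V = rng.normal(size=(d_y, r))
P = rng.normal(size=(r, d_out))

# Full tensor implied by the factors:
# W[o] = sum_k P[k, o] * outer(U[:, k], V[:, k])
W = np.einsum("ko,ik,jk->oij", P, U, V)

z_full = full_bilinear(x, y, W)
z_low = low_rank_bilinear(x, y, U, V, P)

# Both forms compute the same pairwise-interaction output.
print(np.allclose(z_full, z_low))
```

In this sketch the full tensor needs d_out · d_x · d_y parameters while the factors need r · (d_x + d_y + d_out), which is why a factorized form is attractive when the representation dimension is large.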
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
