Data Noising as Smoothing in Neural Network Language Models
Ziang Xie, Sida I. Wang, Jiwei Li, Daniel Lévy, Aiming Nie, Dan Jurafsky, Andrew Y. Ng

TL;DR
This paper establishes a connection between data noising in neural network language models and smoothing in n-gram models, leading to improved language modeling and machine translation performance.
Contribution
It introduces noising schemes inspired by smoothing techniques, bridging discrete sequence regularization with established n-gram smoothing methods.
Findings
Performance gains in language modeling
Enhanced machine translation results
Empirical validation of noising-smoothing relationship
Abstract
Data noising is an effective technique for regularizing neural network models. While noising is widely adopted in application domains such as vision and speech, commonly used noising primitives have not been developed for discrete sequence-level settings such as language modeling. In this paper, we derive a connection between input noising in neural network language models and smoothing in n-gram models. Using this connection, we draw upon ideas from smoothing to develop effective noising schemes. We demonstrate performance gains when applying the proposed schemes to language modeling and machine translation. Finally, we provide empirical analysis validating the relationship between noising and smoothing.
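The abstract does not spell out the noising schemes themselves. As an illustration only, the sketch below shows one simple way to noise a discrete token sequence: each input token is replaced, independently with probability `gamma`, by a word drawn from the corpus unigram distribution. The function name `unigram_noise`, the parameter `gamma`, and the toy corpus are assumptions made for this example, not taken from the text above. The intuition is that occasionally swapping a context word for a frequency-weighted random word plays a role loosely similar to interpolating with a lower-order distribution in n-gram smoothing.

```python
import random
from collections import Counter


def unigram_noise(tokens, unigram_counts, gamma, rng):
    """Return a copy of `tokens` in which each token is replaced,
    independently with probability `gamma`, by a word sampled from the
    unigram distribution implied by `unigram_counts`.

    Hypothetical helper for illustration; not the paper's reference code.
    """
    vocab = list(unigram_counts)
    weights = [unigram_counts[w] for w in vocab]
    noised = []
    for tok in tokens:
        if rng.random() < gamma:
            # Sample a replacement in proportion to corpus frequency.
            noised.append(rng.choices(vocab, weights=weights, k=1)[0])
        else:
            noised.append(tok)
    return noised


# Toy corpus (assumed for the example).
corpus = "the cat sat on the mat".split()
counts = Counter(corpus)
rng = random.Random(0)

noised = unigram_noise(corpus, counts, gamma=0.3, rng=rng)
unchanged = unigram_noise(corpus, counts, gamma=0.0, rng=rng)
```

With `gamma=0.0` the sequence is returned unchanged; larger values of `gamma` inject more noise. In training, such a function would typically be applied afresh to each input sequence at every pass over the data, so the model sees a different noised copy each time.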
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
