Symmetric Regularization based BERT for Pair-wise Semantic Reasoning
Weidi Xu, Xingyi Cheng, Kunlong Chen, Wei Wang, Bin Bi, Ming Yan, Chen Wu, Luo Si, Wei Chu, Taifeng Wang

TL;DR
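This paper enhances BERT's sentence-pair reasoning by extending next sentence prediction (NSP) to a 3-class task that adds previous sentence prediction (PSP), combined with a symmetric regularization, so that the model better distinguishes entailment from shallow correlation and improves on natural language inference (NLI) and machine reading comprehension (MRC) tasks.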
Contribution
It proposes a novel symmetric regularization approach with a 3-class categorization task for BERT, incorporating document-level context and label smoothing to improve semantic reasoning.
Findings
Significant performance improvements on NLI and MRC benchmarks.
Enhanced ability to distinguish entailment from shallow correlation.
Effective use of document-level information in pre-training.
Abstract
The ability of semantic reasoning over the sentence pair is essential for many natural language understanding tasks, e.g., natural language inference and machine reading comprehension. A recent significant improvement in these tasks comes from BERT. As reported, the next sentence prediction (NSP) in BERT, which learns the contextual relationship between two sentences, is of great significance for downstream problems with sentence-pair input. Despite the effectiveness of NSP, we suggest that NSP still lacks the essential signal to distinguish between entailment and shallow correlation. To remedy this, we propose to augment the NSP task to a 3-class categorization task, which includes a category for previous sentence prediction (PSP). The involvement of PSP encourages the model to focus on the informative semantics to determine the sentence order, thereby improving the ability of semantic…
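To make the 3-class idea concrete, here is a minimal sketch of how sentence pairs for such a task might be sampled from a corpus. It is an illustration under stated assumptions, not the authors' data pipeline: the function name `make_pair_example`, the label ids, and the uniform choice among the available categories are all hypothetical, since the abstract only names the three categories (next sentence, previous sentence, unrelated sentence).

```python
import random

# Hypothetical label ids; the source names the categories but not their encoding.
LABEL_NEXT, LABEL_PREV, LABEL_RANDOM = 0, 1, 2


def make_pair_example(docs, doc_idx, sent_idx, rng):
    """Build one (sentence_a, sentence_b, label) triple for a 3-class task.

    docs is a list of documents, each a list of sentences.
    Assumption: the label is drawn uniformly from the categories that are
    possible at this position; the actual sampling ratios are not given.
    """
    doc = docs[doc_idx]
    sentence_a = doc[sent_idx]

    # Collect the categories that can be built from this position.
    candidates = []
    if sent_idx + 1 < len(doc):
        candidates.append(LABEL_NEXT)
    if sent_idx >= 1:
        candidates.append(LABEL_PREV)
    other_docs = [d for i, d in enumerate(docs) if i != doc_idx and d]
    if other_docs:
        candidates.append(LABEL_RANDOM)
    if not candidates:
        raise ValueError("no pair can be built from a single one-sentence corpus")

    label = rng.choice(candidates)
    if label == LABEL_NEXT:
        # sentence_b directly follows sentence_a (the original NSP positive).
        sentence_b = doc[sent_idx + 1]
    elif label == LABEL_PREV:
        # sentence_b directly precedes sentence_a (the added PSP category).
        sentence_b = doc[sent_idx - 1]
    else:
        # sentence_b comes from a different document (unrelated pair).
        sentence_b = rng.choice(rng.choice(other_docs))
    return sentence_a, sentence_b, label


if __name__ == "__main__":
    toy_docs = [["a1", "a2", "a3"], ["b1", "b2"]]
    print(make_pair_example(toy_docs, 0, 1, random.Random(0)))
```

Under this framing, an adjacent pair receives a different label depending on which sentence comes first, so the classifier has to attend to direction rather than topical overlap alone, which is the intuition the abstract gives for adding PSP.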
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Weight Decay · Residual Connection · Adam · Layer Normalization · Softmax · Attention Is All You Need · Dropout · Multi-Head Attention
