Enhancing Natural Language Representation with Large-Scale Out-of-Domain Commonsense
Wanyun Cui, Xingran Chen

TL;DR
This paper introduces OK-Transformer, a method that effectively incorporates large-scale out-of-domain commonsense knowledge into language models to improve text understanding without catastrophic forgetting.
Contribution
The paper proposes OK-Transformer, a novel approach that seamlessly integrates out-of-domain commonsense into existing Transformer models without additional pre-training.
Findings
Improves performance in commonsense reasoning tasks.
Enhances general text classification accuracy.
Effective in low-resource commonsense scenarios.
Abstract
We study how to enhance text representation via textual commonsense. We point out that commonsense has the nature of domain discrepancy. Namely, commonsense has different data formats and is domain-independent from the downstream task. This nature brings challenges to introducing commonsense in general text understanding tasks. A typical method of introducing textual knowledge is continuing pre-training over the commonsense corpus. However, this causes catastrophic forgetting on the downstream task due to the domain discrepancy. In addition, previous methods that directly use textual descriptions as extra input information cannot be applied to large-scale commonsense. In this paper, we propose to use large-scale out-of-domain commonsense to enhance text representation. In order to effectively incorporate the commonsense, we propose OK-Transformer (Out-of-domain…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Linear Warmup With Linear Decay · WordPiece · Attention Dropout · BERT · Absolute Position Encodings
