Learning Robust Negation Text Representations
Thinh Hung Truong, Karin Verspoor, Trevor Cohn, Timothy Baldwin

TL;DR
This paper improves negation understanding in text encoders by fine-tuning them with contrastive learning on data distilled from large language models, gaining negation robustness while keeping performance on general benchmarks competitive.
Contribution
It presents a novel distillation and fine-tuning approach specifically targeting negation robustness in both BERT-based models and large language models.
Findings
Large improvement in the negation understanding of a strong BERT-based encoder.
Competitive performance is maintained on general text benchmarks.
The method can be adapted to large language models, improving their performance on negation benchmarks.
Abstract
Despite the rapid adoption of autoregressive large language models (LLMs), smaller text encoders still play an important role in text understanding tasks that require rich contextualized representations. Negation is an important semantic function that is still not properly captured by such methods, affecting many downstream applications relying on text embeddings. We propose a strategy to improve the negation robustness of text encoders by distilling data from large language models using diverse patterns of negation and hedging. We adopt a standard contrastive learning strategy to finetune a strong BERT-based model, and observe large improvements in negation understanding capabilities while maintaining competitive performance on general benchmarks. We also show that our method can be adapted to LLMs, leading to improved performance on negation benchmarks.
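The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes each training example is a triplet of embeddings: an anchor sentence, a paraphrase (positive), and a negated version (hard negative), and it scores them with a temperature-scaled softmax over cosine similarities. The function names, the toy vectors, and the temperature of 0.05 are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(anchor, positive, negative, temperature=0.05):
    """Softmax-style contrastive loss for one (anchor, positive, negative) triplet.

    Assumed setup (not taken from the paper): the positive is a paraphrase
    of the anchor and the negative is its negated counterpart. The loss is
    -log( exp(s_pos) / (exp(s_pos) + exp(s_neg)) ), computed stably.
    """
    s_pos = cosine(anchor, positive) / temperature
    s_neg = cosine(anchor, negative) / temperature
    # Log-sum-exp with the maximum subtracted to avoid overflow.
    m = max(s_pos, s_neg)
    log_z = m + math.log(math.exp(s_pos - m) + math.exp(s_neg - m))
    return log_z - s_pos

# Toy embeddings standing in for encoder outputs (hypothetical values).
anchor = [1.0, 0.0]
paraphrase = [0.9, 0.1]
negated = [0.1, 0.9]

well_separated = contrastive_loss(anchor, paraphrase, negated)
confused = contrastive_loss(anchor, negated, paraphrase)
```

In this sketch the loss is small when the anchor sits closer to its paraphrase than to its negated counterpart and large in the opposite case, which is the pressure that pushes an encoder to separate a sentence from its negation.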
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
