Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence

Myeongjun Jang; Frank Mtumbuka; Thomas Lukasiewicz

arXiv:2205.03815·cs.CL·August 12, 2022

TL;DR

This paper investigates the logical negation property in large language models, finds they often violate it, and proposes a novel meaning-matching training task to improve their lexical semantic understanding and downstream task performance.

Contribution

The paper introduces a new meaning-matching training task that enhances language models' understanding of lexical semantics and logical negation, outperforming previous methods.

Findings

01. PLMs frequently violate the logical negation property.

02. The meaning-matching task improves lexical semantic learning.

03. Fine-tuning with the task maintains or improves downstream performance.

Abstract

The logical negation property (LNP), which implies generating different predictions for semantically opposite inputs, is an important property that a trustworthy language model must satisfy. However, much recent evidence shows that large-size pre-trained language models (PLMs) do not satisfy this property. In this paper, we perform experiments using probing tasks to assess PLMs' understanding of the LNP. Unlike previous studies that only examined negation expressions, we expand the boundary of the investigation to lexical semantics. Through experiments, we observe that PLMs violate the LNP frequently. To alleviate the issue, we propose a novel intermediate training task, named meaning-matching, designed to directly learn a meaning-text correspondence, instead of relying on the distributional hypothesis. Through multiple experiments, we find that the task enables PLMs to learn lexical semantic…
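
As a rough illustration of what "violating the LNP" means in practice, the sketch below counts how often a classifier returns the same prediction for an input and its semantic opposite. It is not the paper's probing setup: the function name `lnp_violation_rate`, the keyword-based `toy_predict` stand-in, and the example sentence pairs are all assumptions made for illustration only.

```python
from typing import Callable, Iterable, Tuple


def lnp_violation_rate(
    predict: Callable[[str], str],
    pairs: Iterable[Tuple[str, str]],
) -> float:
    """Return the fraction of (original, opposite) pairs with identical predictions.

    Under the LNP, semantically opposite inputs should receive different
    predictions, so every identical prediction counts as one violation.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (original, opposite) pair")
    violations = sum(
        1 for original, opposite in pairs if predict(original) == predict(opposite)
    )
    return violations / len(pairs)


# Hypothetical stand-in for a model: a keyword classifier that ignores negation.
def toy_predict(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"


# Illustrative pairs (not taken from the paper): one negation, one antonym.
pairs = [
    ("The movie is good.", "The movie is not good."),
    ("The movie is good.", "The movie is bad."),
]

rate = lnp_violation_rate(toy_predict, pairs)
print(f"LNP violation rate: {rate:.2f}")
```

Here the toy classifier handles the antonym pair but gives the same label to the negated sentence, so one of the two pairs counts as a violation. Swapping `toy_predict` for a real model's prediction function would give the corresponding rate for that model on whatever pairs are supplied.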

Code & Models

Repositories

mj-jang/beyond-distributional (PyTorch, official)