Attention-based Contrastive Learning for Winograd Schemas

Tassilo Klein; Moin Nabi

arXiv:2109.05108 · cs.CL · September 14, 2021

TL;DR

This paper introduces a novel self-supervised contrastive learning framework applied directly to Transformer self-attention to improve commonsense reasoning on Winograd Schemas, outperforming existing unsupervised methods.

Contribution

It presents a novel contrastive learning approach that operates at the level of self-attention for Winograd Schema reasoning, improving commonsense reasoning in the unsupervised setting.

Findings

01 Outperforms all comparable unsupervised approaches
02 Occasionally surpasses supervised methods
03 Demonstrates superior commonsense reasoning on multiple datasets

Abstract

Self-supervised learning has recently attracted considerable attention in the NLP community for its ability to learn discriminative features using a contrastive objective. This paper investigates whether contrastive learning can be extended to Transformer attention to tackle the Winograd Schema Challenge. To this end, we propose a novel self-supervised framework, leveraging a contrastive loss directly at the level of self-attention. Experimental analysis of our attention-based models on multiple datasets demonstrates superior commonsense reasoning capabilities. The proposed approach outperforms all comparable unsupervised approaches while occasionally surpassing supervised ones.
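
To make the idea of a contrastive loss on self-attention more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`attention_mass`, `contrastive_attention_loss`), the hinge-style margin formulation, and the toy attention values are all assumptions made for illustration. The sketch treats the attention that a pronoun places on each of two candidate referents as a score and penalizes the case where the two scores are not separated by a margin, without using a gold label.

```python
from typing import Sequence


def attention_mass(attn_row: Sequence[float], span: Sequence[int]) -> float:
    """Sum the attention a query token (e.g. the pronoun) places on a candidate's token positions."""
    return sum(attn_row[i] for i in span)


def contrastive_attention_loss(a_first: float, a_second: float, margin: float = 0.5) -> float:
    """Hypothetical hinge penalty: zero once the two attention masses differ by at least `margin`."""
    return max(0.0, margin - abs(a_first - a_second))


# Toy attention row for the pronoun "it" over the first eight tokens of
# "The trophy doesn't fit in the suitcase because it is too big."
# The values are made up for illustration and are not taken from the paper.
pronoun_attention = [0.05, 0.40, 0.05, 0.05, 0.05, 0.05, 0.30, 0.05]
trophy_span = [1]     # position of "trophy"
suitcase_span = [6]   # position of "suitcase"

a_trophy = attention_mass(pronoun_attention, trophy_span)
a_suitcase = attention_mass(pronoun_attention, suitcase_span)

# The loss is positive here because the two candidates are only weakly separated.
loss = contrastive_attention_loss(a_trophy, a_suitcase)

# At inference time, the candidate receiving more attention would be selected.
prediction = "trophy" if a_trophy > a_suitcase else "suitcase"
```

In a real model the attention row would come from one or more Transformer heads and the loss would be minimized by gradient descent; the plain-float version above only illustrates the shape of such an objective.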

Code & Models

Repositories

sap-samples/emnlp2021-attention-contrastive-learning (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques

Methods: Contrastive Learning