Free Lunch for Efficient Textual Commonsense Integration in Language Models

Wanyun Cui; Xingran Chen

arXiv:2305.15516 · cs.CL · May 26, 2023 · 1 citation

Open Access

TL;DR

This paper introduces an efficient batching method for integrating textual commonsense knowledge into language models, reducing computational costs without sacrificing performance, especially beneficial for large datasets and memory-constrained devices.

Contribution

It proposes a spectral clustering-based batching approach that reuses commonsense descriptions across samples, optimizing efficiency without altering the language model.
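
To make the reuse idea concrete, here is a minimal sketch of how the number of description encodings could be counted for a given batching. It assumes each training sample is represented as a list of commonsense description IDs and that a description shared within a batch is encoded only once; the function name `encoder_calls` and the IDs `d1` to `d5` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def encoder_calls(batches: Sequence[Sequence[Sequence[str]]], reuse: bool) -> int:
    """Count how many description encodings the given batches require.

    Assumption: with reuse, each distinct description is encoded once per
    batch; without reuse, it is encoded once per sample that uses it.
    """
    total = 0
    for batch in batches:
        if reuse:
            unique = set()
            for sample in batch:
                unique.update(sample)
            total += len(unique)
        else:
            total += sum(len(sample) for sample in batch)
    return total


# Four toy samples, each listing the (hypothetical) descriptions it needs.
s0: List[str] = ["d1", "d2"]
s1: List[str] = ["d3", "d4"]
s2: List[str] = ["d1", "d2", "d5"]
s3: List[str] = ["d3", "d4", "d5"]

grouped = [[s0, s2], [s1, s3]]  # similar samples share a batch
mixed = [[s0, s1], [s2, s3]]    # samples batched in their original order

print(encoder_calls(grouped, reuse=False))  # 10
print(encoder_calls(grouped, reuse=True))   # 6
print(encoder_calls(mixed, reuse=True))     # 9
```

In this toy setting the grouped batching needs 6 encodings instead of 10, while the mixed batching saves only one, which illustrates why the choice of batch partition matters.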

Findings

01. Significant reduction in computational cost observed.

02. Efficiency gains are larger on bigger datasets.

03. Performance is preserved despite batching optimization.

Abstract

Recent years have witnessed the emergence of textual commonsense knowledge bases, aimed at providing more nuanced and context-rich knowledge. The integration of external commonsense into language models has been shown to be a key enabler in advancing the state-of-the-art for a wide range of NLP tasks. However, incorporating textual commonsense descriptions is computationally expensive, as compared to encoding conventional symbolic knowledge. In this paper, we propose a method to improve its efficiency without modifying the model. We group training samples with similar commonsense descriptions into a single batch, thus reusing the encoded description across multiple samples. One key observation is that the upper bound of batch partitioning can be reduced to the classic graph k-cut problem. Consequently, we propose a spectral clustering-based algorithm to solve this problem.…
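
The batching idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: as a simplified stand-in for spectral clustering, it builds a graph whose edge weights count the descriptions two samples share, orders the samples by the Fiedler vector of the graph Laplacian (computed by shifted power iteration), and cuts that ordering into equal-sized batches. All function names and the sample data are hypothetical.

```python
import math
from typing import List, Sequence


def overlap_graph(samples: Sequence[Sequence[str]]) -> List[List[int]]:
    """Edge weight = number of commonsense descriptions two samples share."""
    sets = [set(s) for s in samples]
    n = len(sets)
    return [[0 if i == j else len(sets[i] & sets[j]) for j in range(n)]
            for i in range(n)]


def fiedler_vector(W: List[List[int]], iters: int = 300) -> List[float]:
    """Approximate the Laplacian eigenvector of the second-smallest eigenvalue.

    Runs power iteration on (shift * I - L) while projecting out the
    constant vector, so the iterate converges toward the Fiedler vector.
    """
    n = len(W)
    if n < 2:
        return [0.0] * n
    deg = [sum(row) for row in W]
    shift = 2 * max(deg) + 1  # exceeds the largest Laplacian eigenvalue
    v = [i - (n - 1) / 2 for i in range(n)]  # non-constant, zero-mean start
    for _ in range(iters):
        # (shift * I - L) v = (shift - deg) * v + W v
        w = [(shift - deg[i]) * v[i] + sum(W[i][j] * v[j] for j in range(n))
             for i in range(n)]
        mean = sum(w) / n
        w = [x - mean for x in w]  # remove the constant component
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def spectral_batches(samples: Sequence[Sequence[str]],
                     batch_size: int) -> List[List[int]]:
    """Order samples along the Fiedler vector, then chunk into batches."""
    f = fiedler_vector(overlap_graph(samples))
    order = sorted(range(len(samples)), key=lambda i: f[i])
    return [order[k:k + batch_size] for k in range(0, len(order), batch_size)]


samples = [["d1", "d2"], ["d3", "d4"], ["d1", "d2", "d5"], ["d3", "d4", "d5"]]
batches = spectral_batches(samples, batch_size=2)
print(batches)  # samples 0 and 2 share a batch, as do samples 1 and 3
```

Chunking a one-dimensional spectral ordering keeps every batch the same size, which a plain clustering step would not guarantee; a fuller treatment would use several eigenvectors and an explicit balance constraint.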

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks