Span Fine-tuning for Pre-trained Language Models

Rongzhou Bao; Zhuosheng Zhang; Hai Zhao

arXiv:2108.12848·cs.CL·September 16, 2021

TL;DR

This paper introduces a flexible span fine-tuning method for pre-trained language models that adaptively incorporates span-level information during downstream task training, improving performance and efficiency.

Contribution

It proposes a novel span fine-tuning approach in which span settings are determined adaptively during task-specific fine-tuning, unlike earlier methods that introduce and fix span-level information during pre-training.

Findings

01

Significantly improves PrLM performance on the GLUE benchmark

02

Offers flexible span setting during fine-tuning

03

Enhances efficiency over previous span-level methods

Abstract

Pre-trained language models (PrLMs) have to carefully manage input units when training on a very large text with a vocabulary consisting of millions of words. Previous works have shown that incorporating span-level information over consecutive words in pre-training can further improve the performance of PrLMs. However, because span-level clues are introduced and fixed in pre-training, previous methods are time-consuming and lack flexibility. To alleviate this inconvenience, this paper presents a novel span fine-tuning method for PrLMs, which allows the span setting to be adaptively determined by the specific downstream task during the fine-tuning phase. In detail, any sentence processed by the PrLM is segmented into multiple spans according to a pre-sampled dictionary. The segmentation information is then sent through a hierarchical CNN module together with the…
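The segmentation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the pre-sampled dictionary is matched against a sentence, so the greedy longest-match strategy below is an assumption, as are the function name `segment_into_spans`, the `max_len` parameter, and the toy dictionary. It is meant only to show what "segmenting a sentence into spans according to a dictionary" could look like, not to reproduce the authors' implementation.

```python
from typing import List, Set, Tuple


def segment_into_spans(
    tokens: List[str],
    span_dict: Set[Tuple[str, ...]],
    max_len: int = 4,  # hypothetical cap on span length, not from the paper
) -> List[Tuple[int, int]]:
    """Split tokens into contiguous spans, returned as half-open (start, end) pairs.

    Assumption: greedy longest-match against the dictionary; tokens that match
    no dictionary entry become single-token spans.
    """
    spans = []
    i = 0
    while i < len(tokens):
        end = i + 1  # fall back to a single-token span
        # Try the longest candidate first, down to length 2.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in span_dict:
                end = i + n
                break
        spans.append((i, end))
        i = end
    return spans


# Toy dictionary of multi-word spans (hypothetical entries).
toy_dict = {("new", "york"), ("big", "city")}
tokens = "new york is a big city".split()
print(segment_into_spans(tokens, toy_dict))
# → [(0, 2), (2, 3), (3, 4), (4, 6)]
```

The resulting index pairs cover every token exactly once, so they could serve as the segmentation information that a downstream module consumes alongside the PrLM's token representations.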
