Optimising Language Models for Downstream Tasks: A Post-Training Perspective

Zhengyan Shi

arXiv:2506.20917 · cs.CL · June 27, 2025

TL;DR

This thesis introduces novel post-training techniques and evaluation methods to improve language models' efficiency, robustness, and adaptability for diverse downstream NLP tasks, addressing limitations of traditional fine-tuning.

Contribution

It presents a series of methods including a novel continued pre-training approach, parameter-efficient fine-tuning, and new benchmarks for better LM adaptation and evaluation.

Findings

1. Outperforms state-of-the-art semi-supervised approaches

2. Reduces memory and compute costs significantly

3. Enhances performance on instruction-following and reasoning tasks

Abstract

Language models (LMs) have demonstrated remarkable capabilities in NLP, yet adapting them efficiently and robustly to specific tasks remains challenging. As their scale and complexity grow, fine-tuning LMs on labelled data often underutilizes available unlabelled data, leads to overfitting on small task-specific sets, and imposes significant computational costs. These limitations hamper their application to the open-ended landscape of real-world language tasks. This thesis proposes a series of methods to better adapt LMs to downstream applications. First, we explore strategies for extracting task-relevant knowledge from unlabelled data, introducing a novel continued pre-training technique that outperforms state-of-the-art semi-supervised approaches. Next, we present a parameter-efficient fine-tuning method that substantially reduces memory and compute costs while maintaining…

Taxonomy

Topics: Topic Modeling · Intelligent Tutoring Systems and Adaptive Learning