Improving Data and Parameter Efficiency of Neural Language Models Using Representation Analysis

Josip Jukić

arXiv:2507.12004·cs.CL·July 17, 2025

Open Access

TL;DR

This paper introduces novel representation analysis techniques and optimization strategies to improve data and parameter efficiency in neural language models, demonstrating significant performance gains across NLP tasks.

Contribution

It presents new methods based on representation smoothness, active learning integration, and in-context weak supervision to enhance efficiency and robustness of language models.

Findings

01. Outperforms traditional methods in efficiency and stability
02. Reduces labeling efforts with active learning and early stopping
03. Enhances low-resource model performance with in-context learning

Abstract

This thesis addresses challenges related to data and parameter efficiency in neural language models, with a focus on representation analysis and the introduction of new optimization techniques. The first part examines the properties and dynamics of language representations within neural models, emphasizing their significance in enhancing robustness and generalization. It proposes innovative approaches based on representation smoothness, including regularization strategies that utilize Jacobian and Hessian matrices to stabilize training and mitigate sensitivity to input perturbations. The second part focuses on methods to significantly enhance data and parameter efficiency by integrating active learning strategies with parameter-efficient fine-tuning, guided by insights from representation smoothness analysis. It presents smoothness-informed early-stopping techniques designed to…

Taxonomy

Topics: Neural Networks and Applications