Charting Empirical Laws for LLM Fine-Tuning in Scientific Multi-Discipline Learning

Lintao Wang; Zhuqiang Lu; Yilin Zhu; Kun Hu; Zhenfei Yin; Shixiang Tang; Zhiyong Wang; Wanli Ouyang; Xinzhu Ma

arXiv:2602.11215 · cs.LG · February 13, 2026

Open Access

TL;DR

This paper systematically studies multi-disciplinary fine-tuning of large language models, revealing empirical laws that guide effective training across scientific domains for improved generalization.

Contribution

It introduces the first comprehensive analysis of multi-disciplinary LLM fine-tuning and formulates four empirical laws to optimize cross-domain learning.

Findings

01

Multi-disciplinary learning is substantially more variable than single-discipline training.

02

Four empirical laws guide effective multi-disciplinary fine-tuning.

03

Asymmetric LoRA-MoE achieves robust gains with minimal parameters.

Abstract

While large language models (LLMs) have achieved strong performance through fine-tuning within individual scientific domains, their learning dynamics in multi-disciplinary contexts remain poorly understood, despite the promise of improved generalization and broader applicability through cross-domain knowledge synergy. In this work, we present the first systematic study of multi-disciplinary LLM fine-tuning, constructing a five-discipline corpus and analyzing learning patterns of full fine-tuning, LoRA, LoRA-MoE, and LoRA compositions. In particular, our study shows that multi-disciplinary learning is substantially more variable than single-discipline training and distills four consistent empirical laws: (1) Balance-then-Diversity: low-resource disciplines degrade performance unless mitigated via diversity-aware upsampling; (2) Merge-then-Align: restoring instruction-following ability is…

Taxonomy

Topics: Machine Learning in Materials Science · Topic Modeling · Computational and Text Analysis Methods