StatLLaMA: Multi-Stage training for domain-optimized statistical large language models
Jing-Yi Zeng, Guan-Hua Huang

TL;DR
This paper presents StatLLaMA, a domain-specific large language model for statistics, developed through multi-stage training that balances statistical reasoning and general knowledge. Pipelines that start from an instruction-tuned model outperform those that start from a base foundation model.
Contribution
It introduces a multi-stage training pipeline for domain-optimized statistical LLMs, demonstrating the importance of starting from an instruction-tuned model for effective specialization.
Findings
Pipelines that start from a base foundation model fail to develop meaningful statistical reasoning, even after extensive instruction tuning, SFT, or RLHF alignment.
Instruction-tuned models enable effective domain specialization.
Direct preference optimization offers stable RLHF alignment.
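As a rough illustration of the preference-alignment finding above, the standard direct preference optimization (DPO) objective for a single preference pair can be sketched in a few lines. This is the generic published DPO loss, not the authors' training code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example, and the paper's actual hyperparameters are not stated in this summary.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Generic DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an illustrative default, not a value taken from the paper.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) equals softplus(-x); this form avoids overflow.
    z = -logits
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# The loss is log(2) when the policy matches the reference, and it drops
# as the policy shifts probability toward the chosen response.
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)
improved = dpo_loss(-4.0, -6.0, -5.0, -5.0)
```

Because the objective is an ordinary classification-style loss over log-probabilities, it needs no separate reward model or sampling loop, which is the usual explanation for DPO being more stable than policy-gradient RLHF.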
Abstract
This study investigates how to efficiently build a domain-specialized large language model (LLM) for statistics using the lightweight LLaMA-3.2-3B family as the foundation model (FM). We systematically compare three multi-stage training pipelines--starting from a base FM with no instruction-following capability, a base FM augmented with post-hoc instruction tuning, and an instruction-tuned FM with strong general reasoning abilities--across continual pretraining, supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF) preference alignment, and downstream task fine-tuning (DTFT). Results show that pipelines beginning with a base FM fail to develop meaningful statistical reasoning, even after extensive instruction tuning, SFT, or RLHF alignment. In contrast, starting from LLaMA-3.2-3B-Instruct enables effective domain specialization. A comprehensive evaluation of…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
