Is Micro Domain-Adaptive Pre-Training Effective for Real-World Operations? Multi-Step Evaluation Reveals Potential and Bottlenecks

Masaya Tsunokake; Yuta Koreeda; Terufumi Morishita; Koichi Nagatsuka; Hikaru Tomonari; Yasuhiro Sogawa

arXiv:2602.04466 · cs.CL · February 5, 2026

TL;DR

This study evaluates micro domain-adaptive pre-training (mDAPT) for large language models in real-world enterprise tasks, revealing its strengths in knowledge elicitation but highlighting bottlenecks in reasoning and answer composition.

Contribution

The paper provides a multi-step evaluation framework for assessing mDAPT's effectiveness on generative tasks in micro domains, and it identifies specific strengths and limitations.

Findings

01 mDAPT improves fact elicitation from LLMs
02 It does not significantly enhance reasoning or answer composition
03 Enhancing reasoning is key to better performance

Abstract

When applying LLMs to real-world enterprise operations, LLMs need to handle proprietary knowledge in small domains of specific operations (micro domains). A previous study shows micro domain-adaptive pre-training (mDAPT) with fewer documents is effective, similarly to DAPT in larger domains. However, it evaluates mDAPT only on multiple-choice questions; thus, its effectiveness for generative tasks in real-world operations remains unknown. We aim to reveal the potential and bottlenecks of mDAPT for generative tasks. To this end, we disentangle the answering process into three subtasks and evaluate the performance of each subtask: (1) eliciting facts relevant to questions from an LLM's own knowledge, (2) reasoning over the facts to obtain conclusions, and (3) composing long-form answers based on the conclusions. We verified mDAPT on…
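The three-subtask decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Example` fields, the prompts, the `generate` callable, and the token-recall metric are all hypothetical placeholders. The sketch also assumes that the reasoning and composing steps receive reference inputs so that errors from an earlier step do not leak into a later score; the source does not state how the subtasks are isolated.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Example:
    # All fields are hypothetical; the source does not describe its data schema.
    question: str
    reference_facts: str
    reference_conclusion: str
    reference_answer: str


def token_recall(prediction: str, reference: str) -> float:
    """Fraction of distinct reference tokens that also appear in the prediction.

    A placeholder metric chosen for simplicity, not the metric used in the paper.
    """
    ref_tokens = set(reference.lower().split())
    if not ref_tokens:
        return 0.0
    pred_tokens = set(prediction.lower().split())
    return len(ref_tokens & pred_tokens) / len(ref_tokens)


def evaluate_subtasks(generate: Callable[[str], str], ex: Example) -> Dict[str, float]:
    """Score eliciting, reasoning, and composing separately for one example."""
    # (1) Eliciting: the model sees only the question and must recall facts.
    facts = generate(
        f"List the facts relevant to this question.\nQuestion: {ex.question}"
    )
    # (2) Reasoning: reference facts are supplied (an assumption of this sketch).
    conclusion = generate(
        f"Facts: {ex.reference_facts}\nQuestion: {ex.question}\nState the conclusion."
    )
    # (3) Composing: the reference conclusion is supplied (also an assumption).
    answer = generate(
        f"Conclusion: {ex.reference_conclusion}\nQuestion: {ex.question}\n"
        "Write a full answer."
    )
    return {
        "eliciting": token_recall(facts, ex.reference_facts),
        "reasoning": token_recall(conclusion, ex.reference_conclusion),
        "composing": token_recall(answer, ex.reference_answer),
    }
```

Here `generate` stands for any function that maps a prompt string to a model response, so a base model and its mDAPT-adapted variant could be passed in turn and their per-subtask scores compared side by side.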

Taxonomy

Topics: Topic Modeling · AI-based Problem Solving and Planning · Software Engineering Research