Full-Stack Domain Enhancement for Combustion LLMs: Construction and Optimization
Quanjia Xiao, Weimin Ouyang, Zonglin Yang, Tianhao Wu, Qingguo Zhou, Runze Mao, Zhi X. Chen

TL;DR
This paper introduces a domain-specific workflow for training combustion science LLMs that improves their physical reasoning and reduces hallucinations, together with a dedicated evaluation benchmark.
Contribution
It presents the first full-stack domain-enhanced LLM workflow for combustion science, integrating automated domain corpus construction, incremental pre-training, instruction fine-tuning, and verifiable reward-based reinforcement learning.
Findings
The resulting model outperforms general-purpose LLMs on combustion reasoning tasks.
The workflow effectively reduces hallucinations and improves physical law adherence.
FlameBench provides a standardized benchmark for combustion science reasoning.
Abstract
Large language models (LLMs) show significant application potential when adapted and enhanced for professional fields. Nevertheless, for complex physical systems such as combustion science, general-purpose LLMs often hallucinate severely because they lack domain knowledge and cannot adhere to physical conservation laws. To address this issue, we propose the first full-stack domain-enhanced LLM workflow tailored to combustion science, which integrates automated domain corpus construction, incremental pre-training, instruction fine-tuning, and verifiable reward-based reinforcement learning. This workflow ensures that the model truly internalizes physical laws rather than merely learning textual statistical patterns. We also release FlameBench, a standardized evaluation benchmark specifically designed for…
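To make the idea of a verifiable reward tied to a conservation law concrete, here is a minimal Python sketch. It is not the authors' reward design, which this summary does not specify. The function names `count_atoms` and `conservation_reward`, the `A + B -> C + D` equation format, and the binary 0/1 scoring are all assumptions chosen for illustration. The check simply tests whether a model-written reaction equation conserves atoms of each element.

```python
import re
from collections import Counter


def count_atoms(side: str) -> Counter:
    """Count atoms per element on one side of a reaction, e.g. 'CH4 + 2 O2'.

    Hypothetical helper: supports only simple formulas without parentheses,
    charges, or phase labels.
    """
    totals = Counter()
    for term in side.split("+"):
        # Optional integer coefficient, then a formula starting with an element.
        match = re.fullmatch(r"(\d+)?\s*([A-Z][A-Za-z0-9]*)", term.strip())
        if match is None:
            raise ValueError(f"cannot parse term: {term!r}")
        coeff = int(match.group(1)) if match.group(1) else 1
        # Each element symbol is followed by an optional count (default 1).
        for element, n in re.findall(r"([A-Z][a-z]?)(\d*)", match.group(2)):
            totals[element] += coeff * (int(n) if n else 1)
    return totals


def conservation_reward(equation: str) -> float:
    """Return 1.0 if the equation conserves every element, else 0.0.

    Assumed reward shape: unparseable output is scored 0.0 rather than raising.
    """
    try:
        lhs, rhs = equation.split("->")
        return 1.0 if count_atoms(lhs) == count_atoms(rhs) else 0.0
    except ValueError:
        return 0.0


if __name__ == "__main__":
    print(conservation_reward("CH4 + 2 O2 -> CO2 + 2 H2O"))  # balanced
    print(conservation_reward("CH4 + O2 -> CO2 + H2O"))      # not balanced
```

A reward of this kind can be computed automatically from the model output, without a human grader. A practical system would likely also need checks for energy and mass balances and for numerical answers, which this sketch does not attempt.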
Taxonomy
Topics: Machine Learning in Materials Science · Topic Modeling · Natural Language Processing Techniques
