PACuna: Automated Fine-Tuning of Language Models for Particle Accelerators
Antonin Sulc, Raimund Kammering, Annika Eichler, Tim Wilksen

TL;DR
PACuna is a fine-tuned language model designed to assist with complex questions about particle accelerators, using automated data collection and domain-specific training to improve understanding and support scientific facilities.
Contribution
This work introduces PACuna, a novel approach for fine-tuning language models on accelerator-specific data with automated data collection and question generation.
Findings
PACuna answers complex accelerator questions, and experts validated its answers.
Automated data collection and question generation reduce expert involvement.
PACuna answers some specialized accelerator questions that commercially available assistants cannot.
Abstract
Navigating the landscape of particle accelerators has become increasingly challenging with recent surges in contributions. These intricate devices challenge comprehension, even within individual facilities. To address this, we introduce PACuna, a fine-tuned language model refined through publicly available accelerator resources like conferences, pre-prints, and books. We automated data collection and question generation to minimize expert involvement and make the data publicly available. PACuna demonstrates proficiency in addressing intricate accelerator questions, validated by experts. Our approach shows that adapting language models to scientific domains by fine-tuning on technical texts and auto-generated corpora capturing the latest developments can further produce pre-trained models that answer some intricate questions that commercially available assistants cannot and can serve as…
Taxonomy
Topics: Topic Modeling · Oil and Gas Production Techniques
