Large Language Models -- the Future of Fundamental Physics?
Caroline Heneka, Florian Nieser, Ayodele Ore, Tilman Plehn, Daniel Schiller

TL;DR
This paper explores using a pretrained Large Language Model, Qwen2.5, to analyze and generate SKA data, specifically 3D maps of the cosmological large-scale structure. With pretrained weights, the model outperforms the same architecture with standard initialization and compares favorably with dedicated networks of matching size.
Contribution
It introduces Lightcone LLM (L3M), which combines the pretrained Qwen2.5 LLM with connector networks and applies it to two tasks: cosmological parameter regression and lightcone generation.
Findings
L3M with Qwen2.5 weights outperforms standard initialization for cosmological parameter regression and lightcone generation.
L3M compares favorably with dedicated networks of similar size.
Qwen2.5 enables effective out-of-domain transfer learning for cosmology.
Abstract
For many fundamental physics applications, transformers, as the state of the art in learning complex correlations, benefit from pretraining on quasi-out-of-domain data. The obvious question is whether we can exploit Large Language Models, requiring proper out-of-domain transfer learning. We show how the Qwen2.5 LLM can be used to analyze and generate SKA data, specifically 3D maps of the cosmological large-scale structure for a large part of the observable Universe. We combine the LLM with connector networks and show, for cosmological parameter regression and lightcone generation, that this Lightcone LLM (L3M) with Qwen2.5 weights outperforms standard initialization and compares favorably with dedicated networks of matching size.
