BloombergGPT: A Large Language Model for Finance
Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann, Prabhanjan Kambadur, David Rosenberg, Gideon Mann

TL;DR
BloombergGPT is a 50-billion-parameter language model trained on extensive financial and general data, outperforming existing models on financial tasks while maintaining general language understanding.
Contribution
This work introduces BloombergGPT, which the authors present as the first reported large language model specialized for the financial domain, and shows that it outperforms existing models on financial NLP tasks.
Findings
Outperforms existing models on financial NLP benchmarks
Maintains strong performance on general language tasks
Trains on a mixed corpus of 363 billion tokens of financial data and 345 billion tokens from general-purpose datasets
Abstract
The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms…
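The token counts in the abstract imply a roughly even split between financial and general-purpose data. The short sketch below works out that arithmetic. It uses only the two figures quoted above; the function name `corpus_mix` is a hypothetical helper introduced for illustration and is not taken from the paper.

```python
# Token counts as quoted in the abstract, in billions of tokens.
FINANCIAL_TOKENS_B = 363
GENERAL_TOKENS_B = 345


def corpus_mix(financial_b: float, general_b: float) -> dict:
    """Return the total size and the share of each source (hypothetical helper)."""
    total = financial_b + general_b
    return {
        "total_b": total,
        "financial_share": financial_b / total,
        "general_share": general_b / total,
    }


mix = corpus_mix(FINANCIAL_TOKENS_B, GENERAL_TOKENS_B)
print(f"total: {mix['total_b']}B tokens")
print(f"financial: {mix['financial_share']:.1%}, general: {mix['general_share']:.1%}")
```

Under these figures the combined corpus is 708 billion tokens, of which slightly more than half is financial data.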
Taxonomy
Topics: Topic Modeling · Stock Market Forecasting Methods · Natural Language Processing Techniques
