KodeXv0.1: A Family of State-of-the-Art Financial Large Language Models
Neel Rajani, Lilli Kiessling, Aleksandr Ogaltsov, Claus Lang

TL;DR
KodeXv0.1 is a new family of financial large language models that outperform GPT-4 and other state-of-the-art models in financial question answering through domain-specific training on financial documents.
Contribution
Introduces KodeXv0.1 models trained on a large financial dataset, surpassing GPT-4 in financial question answering tasks.
Findings
KodeX-8Bv0.1 outperforms similar models by up to 9.24%.
KodeX-70Bv0.1 exceeds GPT-4 on all benchmarks.
Models are trained with RAG-aware 4bit LoRA instruction tuning.
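For readers unfamiliar with the last finding, the standard LoRA update can be written as below. This is the general textbook formulation, not a statement of the authors' exact setup: the rank r, the scaling factor α, and the quantisation and dequantisation functions are not specified in this summary and appear here only as symbols.

```latex
% Standard LoRA forward pass for one linear layer (general formulation,
% not taken from the KodeX paper). W_0 is the frozen pretrained weight;
% only A and B are trained.
\[
  h = \widehat{W}_0\, x + \frac{\alpha}{r}\, B A\, x,
  \qquad
  B \in \mathbb{R}^{d \times r},\;
  A \in \mathbb{R}^{r \times k},\;
  r \ll \min(d, k)
\]
% In a 4-bit setup the frozen weight is stored quantised and dequantised
% on the fly; the scheme is an assumption and is left abstract here.
\[
  \widehat{W}_0 = \operatorname{dequant}\bigl(\operatorname{quant}_{4\text{-bit}}(W_0)\bigr),
  \qquad
  W_0 \in \mathbb{R}^{d \times k}
\]
```

Because only A and B receive gradients, the number of trainable parameters per layer is r(d + k) rather than dk.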
Abstract
Although powerful, current cutting-edge LLMs may not fulfil the needs of highly specialised sectors. We introduce KodeXv0.1, a family of large language models that outclass GPT-4 in financial question answering. We utilise the base variants of Llama 3.1 8B and 70B and adapt them to the financial domain through a custom training regime. To this end, we collect and process a large number of publicly available financial documents such as earnings calls and business reports. These are used to generate a high-quality, synthetic dataset consisting of Context-Question-Answer triplets which closely mirror real-world financial tasks. Using the train split of this dataset, we perform RAG-aware 4bit LoRA instruction tuning runs of Llama 3.1 base variants to produce KodeX-8Bv0.1 and KodeX-70Bv0.1. We then complete extensive model evaluations using FinanceBench, FinQABench and the withheld test…
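As a rough illustration of how a Context-Question-Answer triplet might be turned into an instruction-tuning example, here is a minimal sketch. The prompt template, the `Triplet` class, the `format_example` function, and the sample text are all hypothetical; the summary does not state the template the authors used.

```python
from dataclasses import dataclass


@dataclass
class Triplet:
    """One Context-Question-Answer record (field names are assumptions)."""
    context: str
    question: str
    answer: str


def format_example(t: Triplet) -> dict:
    """Build a prompt/completion pair in which the retrieved context is
    placed in the prompt, so the model learns to answer from it."""
    prompt = (
        f"Context:\n{t.context.strip()}\n\n"
        f"Question: {t.question.strip()}\n"
        "Answer:"
    )
    # Leading space keeps the completion separated from "Answer:".
    return {"prompt": prompt, "completion": " " + t.answer.strip()}


# Made-up sample record, for illustration only.
sample = Triplet(
    context="ExampleCo reported revenue of 10 million in Q1.",
    question="What was ExampleCo's Q1 revenue?",
    answer="10 million.",
)
example = format_example(sample)
print(example["prompt"] + example["completion"])
```

Placing the context inside the prompt is what makes such an example "RAG-aware" in the loose sense used here: at inference time the same slot would be filled with retrieved passages.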
Taxonomy
Topics: FinTech, Crowdfunding, Digital Finance · Stock Market Forecasting Methods · Banking stability, regulation, efficiency
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · LLaMA · Softmax · Layer Normalization · Dropout
