KodeXv0.1: A Family of State-of-the-Art Financial Large Language Models

Neel Rajani; Lilli Kiessling; Aleksandr Ogaltsov; Claus Lang

arXiv:2409.13749 · cs.CL · September 24, 2024

TL;DR

KodeXv0.1 is a family of financial large language models, adapted from Llama 3.1 8B and 70B through domain-specific training on data derived from financial documents, that outperforms GPT-4 and other state-of-the-art models in financial question answering.

Contribution

Introduces the KodeXv0.1 models, trained on a synthetic dataset of Context-Question-Answer triplets generated from publicly available financial documents, which surpass GPT-4 in financial question answering tasks.

Findings

01. KodeX-8Bv0.1 outperforms similar models by up to 9.24%.

02. KodeX-70Bv0.1 exceeds GPT-4 on all benchmarks.

03. Models are trained with RAG-aware 4bit LoRA instruction tuning.
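
For readers unfamiliar with LoRA, the standard low-rank update can be written as below. This is the generic formulation, not a description of the authors' specific configuration: the rank r, the scaling factor α, and the choice of which weight matrices receive adapters are not stated on this page. In 4-bit variants the frozen base weight W_0 is typically stored in quantized form while the adapter matrices A and B are trained in higher precision; that detail is likewise an assumption here.

```latex
% Standard LoRA reparameterisation (generic; hyperparameters are not from this page).
% W_0 is the frozen pretrained weight; only A and B are trained.
\[
  W = W_0 + \frac{\alpha}{r}\, B A,
  \qquad
  W_0 \in \mathbb{R}^{d \times k},\quad
  B \in \mathbb{R}^{d \times r},\quad
  A \in \mathbb{R}^{r \times k},\quad
  r \ll \min(d, k).
\]
```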

Abstract

Although powerful, current cutting-edge LLMs may not fulfil the needs of highly specialised sectors. We introduce KodeXv0.1, a family of large language models that outclass GPT-4 in financial question answering. We utilise the base variants of Llama 3.1 8B and 70B and adapt them to the financial domain through a custom training regime. To this end, we collect and process a large number of publicly available financial documents such as earnings calls and business reports. These are used to generate a high-quality, synthetic dataset consisting of Context-Question-Answer triplets which closely mirror real-world financial tasks. Using the train split of this dataset, we perform RAG-aware 4bit LoRA instruction tuning runs of Llama 3.1 base variants to produce KodeX-8Bv0.1 and KodeX-70Bv0.1. We then complete extensive model evaluations using FinanceBench, FinQABench and the withheld test…
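
As a rough illustration of what a RAG-aware instruction-tuning example built from a Context-Question-Answer triplet might look like, the sketch below turns one triplet into a prompt/completion pair in which the retrieved context is part of the prompt. Everything specific is an assumption made for illustration: the `Triplet` class, the `PROMPT_TEMPLATE` wording, the `to_training_example` helper, and the toy revenue figures are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Triplet:
    """One Context-Question-Answer example (field names are illustrative)."""
    context: str
    question: str
    answer: str


# Hypothetical prompt template; the template actually used for KodeX is not given here.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def to_training_example(t: Triplet) -> dict:
    """Turn a triplet into a prompt/completion pair for instruction tuning."""
    prompt = PROMPT_TEMPLATE.format(
        context=t.context.strip(),
        question=t.question.strip(),
    )
    # The leading space separates the completion from the trailing "Answer:".
    return {"prompt": prompt, "completion": " " + t.answer.strip()}


# Toy data invented for this sketch, not taken from any real filing.
example = to_training_example(
    Triplet(
        context="Revenue for the quarter was $10 million, up from $8 million a year earlier.",
        question="By how much did quarterly revenue grow year over year?",
        answer="Revenue grew by $2 million, or 25%.",
    )
)
print(example["prompt"] + example["completion"])
```

Placing the context inside the prompt is what makes the example "RAG-aware" in this sketch: the model is tuned to answer from supplied passages rather than from memory, which matches how it would be used behind a retriever.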

Taxonomy

Topics: FinTech, Crowdfunding, Digital Finance · Stock Market Forecasting Methods · Banking stability, regulation, efficiency

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · LLaMA · Softmax · Layer Normalization · Dropout