Evaluating Large Language Models on Financial Report Summarization: An Empirical Study

Xinqi Yang; Scott Zang; Yong Ren; Dingjie Peng; Zheng Wen

arXiv:2411.06852·cs.CL·November 12, 2024

TL;DR

This paper empirically evaluates three state-of-the-art large language models for financial report summarization, introducing a comprehensive benchmarking framework and releasing a relevant dataset for future research.

Contribution

It provides a novel evaluation framework combining quantitative and qualitative metrics for financial text summarization with publicly available datasets.

Findings

01. GLM-4 outperforms others in ROUGE-1 scores
02. LLaMA3.1 shows higher contextual relevance
03. Mistral-NeMo demonstrates robustness in accuracy
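
For readers unfamiliar with the ROUGE-1 metric named in the first finding, the following is a minimal sketch of how a unigram-overlap score can be computed. It is not the paper's evaluation code: the function name `rouge1`, the lowercase word-level tokenization, and the example sentences are assumptions made for illustration, and published ROUGE implementations often add stemming and other normalization that is omitted here.

```python
import re
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two texts.

    Assumption: tokens are lowercase runs of word characters; no stemming
    or stopword handling is applied.
    """
    cand_counts = Counter(re.findall(r"\w+", candidate.lower()))
    ref_counts = Counter(re.findall(r"\w+", reference.lower()))

    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand_counts & ref_counts).values())
    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical model summary and reference summary.
    scores = rouge1("revenue rose 10 percent", "revenue rose 12 percent")
    print(scores)
```

Because ROUGE-1 only counts shared words, a summary that gets a figure wrong (as in the hypothetical example, where only one number differs) can still score highly, which is one reason a study of financial summaries would pair it with measures of relevance and accuracy.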

Abstract

In recent years, Large Language Models (LLMs) have demonstrated remarkable versatility across various applications, including natural language understanding and domain-specific knowledge tasks. However, applying LLMs to complex, high-stakes domains like finance requires rigorous evaluation to ensure reliability, accuracy, and compliance with industry standards. To address this need, we conduct a comprehensive comparative study of three state-of-the-art LLMs (GLM-4, Mistral-NeMo, and LLaMA3.1), focusing on their effectiveness in generating automated financial reports. Our primary motivation is to explore how these models can be harnessed within finance, a field demanding precision, contextual relevance, and robustness against erroneous or misleading information. By examining each model's capabilities, we aim to provide an insightful assessment of their strengths and limitations. Our…

Taxonomy

Topics: Stock Market Forecasting Methods · Advanced Text Analysis Techniques · Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Dropout · Linear Warmup With Linear Decay · WordPiece · Dense Connections · Layer Normalization · Adam · Attention Dropout