Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports
Tianyu Cao, Natraj Raman, Danial Dervovic, Chenhao Tan

TL;DR
This paper systematically analyzes how several large language models handle long-form multimodal summarization of financial reports, revealing their strengths and biases and finding that Claude 2 performs best among the models studied.
Contribution
It introduces a computational framework for characterizing multimodal long-form summarization and provides a comparative analysis of multiple LLMs on financial report summarization.
Findings
GPT-3.5 and Cohere fail to perform the summarization task meaningfully
Claude 2 and GPT-4 show different behaviors in extractiveness and bias
Claude 2 effectively recognizes important information after input shuffling
Abstract
As large language models (LLMs) expand the power of natural language processing to handle long inputs, rigorous and systematic analyses are necessary to understand their abilities and behavior. A salient application is summarization, due to its ubiquity and controversy (e.g., researchers have declared the death of summarization). In this paper, we use financial report summarization as a case study because financial reports are not only long but also use numbers and tables extensively. We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Cohere. We find that GPT-3.5 and Cohere fail to perform this summarization task meaningfully. For Claude 2 and GPT-4, we analyze the extractiveness of the summary and identify a position bias in LLMs. This position bias disappears after shuffling the…
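To make the idea of position bias concrete, the following is a minimal sketch, not the authors' framework. It assumes that each summary sentence can be traced to its most similar source sentence using plain token overlap (Jaccard similarity), and it reports where in the source that sentence sits on a 0 to 1 scale. The function names (`tokenize`, `jaccard`, `source_positions`) and the similarity measure are illustrative assumptions; the paper's actual alignment method may differ.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens as a set (a crude stand-in for real alignment)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def source_positions(source_sents: List[str], summary_sents: List[str]) -> List[float]:
    """For each summary sentence, return the relative position (0.0 = start,
    1.0 = end) of the source sentence it overlaps with most."""
    source_tokens = [tokenize(s) for s in source_sents]
    last_index = max(len(source_sents) - 1, 1)  # avoid division by zero
    positions = []
    for sent in summary_sents:
        toks = tokenize(sent)
        scores = [jaccard(toks, st) for st in source_tokens]
        # Index of the best-matching source sentence (first one wins on ties).
        best = max(range(len(scores)), key=scores.__getitem__)
        positions.append(best / last_index)
    return positions


if __name__ == "__main__":
    # Hypothetical toy report and summary, invented for illustration only.
    source = [
        "Revenue rose 12 percent to 4.1 billion.",
        "The company opened three new offices.",
        "Operating costs were flat.",
        "Net income fell 3 percent due to one-off charges.",
    ]
    summary = ["Revenue rose 12 percent.", "Net income fell 3 percent."]
    print(source_positions(source, summary))
```

Under these assumptions, a model with a strong position bias would produce positions clustered near one end of the range. Repeating the measurement after shuffling the source sections, while tracking each section's original index, would show whether the model follows position or content.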
Taxonomy
Topics: Advanced Text Analysis Techniques · Data Mining Algorithms and Applications · Topic Modeling
Methods: Label Smoothing · Cosine Annealing · Weight Decay · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Linear Layer
