APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning

Jiashuo Sun; Hang Zhang; Chen Lin; Xiangdong Su; Yeyun Gong; Jian Guo

arXiv:2212.07249 · cs.CL · March 13, 2024 · 1 cites

TL;DR

APOLLO is an optimized training approach for long-form numerical reasoning in financial analysis. It improves both fact retrieval and program generation, raising program accuracy and diversity and reaching state-of-the-art results on FinQA and ConvFinQA.

Contribution

The paper improves the retriever-generator framework in two ways: a number-aware negative sampling strategy for the retriever, and consistency-based reinforcement learning for the generator.

Findings

01. Achieved new state-of-the-art results on the FinQA and ConvFinQA datasets.

02. Improved the retriever's ability to discriminate key numerical facts.

03. Enhanced program diversity and accuracy through consistency-based training.
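
To make "consistency" concrete, here is a minimal sketch of one plausible reading: a generated program earns a reward when it executes to the gold answer, even if its tokens differ from the reference program. The program syntax (`add(a, b)`, `#0` for the result of an earlier step), the function names `execute` and `consistency_reward`, and the 0/1 reward are all assumptions made for illustration, not details taken from the paper.

```python
import math
import re

# Hypothetical four-operation DSL, assumed for illustration only.
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}
# Matches one step such as "divide(#0, 5735)".
STEP_RE = re.compile(r"(\w+)\(([^,()]+),([^,()]+)\)")


def execute(program: str) -> float:
    """Run steps left to right; '#i' refers to the result of step i."""
    results = []
    for op, left, right in STEP_RE.findall(program):
        args = []
        for token in (left.strip(), right.strip()):
            if token.startswith("#"):
                args.append(results[int(token[1:])])
            else:
                args.append(float(token))
        results.append(OPS[op](*args))
    if not results:
        raise ValueError("no executable steps found")
    return results[-1]


def consistency_reward(predicted: str, gold_answer: float, tol: float = 1e-6) -> float:
    """Return 1.0 if the program executes to the gold answer, else 0.0."""
    try:
        value = execute(predicted)
    except (KeyError, ValueError, IndexError, ZeroDivisionError):
        return 0.0  # unknown op, bad operand, bad reference, or division by zero
    return 1.0 if math.isclose(value, gold_answer, rel_tol=tol, abs_tol=tol) else 0.0
```

Under this reading, `add(2, 3)` and `add(3, 2)` receive the same reward for a gold answer of 5, which is how an execution-based signal can tolerate diverse but equivalent programs where token-level supervision would penalize one of them.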

Abstract

Long-form numerical reasoning in financial analysis aims to generate a reasoning program that calculates the correct answer to a given question. Previous work followed a retriever-generator framework, in which the retriever selects key facts from a long-form document and the generator produces a reasoning program from the retrieved facts. However, these approaches treated all facts equally, without considering the different contributions of facts with and without numbers. Meanwhile, program consistency was ignored under supervised training, resulting in lower training accuracy and diversity. To solve these problems, we propose APOLLO to improve the long-form numerical reasoning framework. For the retriever, we adopt a number-aware negative sampling strategy to make the retriever more discriminative on key numerical facts. For the generator, we design consistency-based reinforcement…
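
The number-aware negative sampling idea can be sketched as follows. This is a minimal illustration that assumes "number-aware" means drawing retriever negatives preferentially from non-gold facts that contain digits. The function names, the `number_weight` parameter, and the weighting scheme are hypothetical and not taken from the paper.

```python
import random
import re

NUMBER_RE = re.compile(r"\d")


def has_number(fact: str) -> bool:
    """True if the fact contains at least one digit."""
    return bool(NUMBER_RE.search(fact))


def sample_negatives(facts, gold_indices, k, number_weight=3.0, rng=None):
    """Sample up to k non-gold fact indices without replacement.

    Facts containing numbers are number_weight times as likely to be
    drawn as facts without numbers (an assumed weighting, for illustration).
    """
    rng = rng or random.Random()
    gold = set(gold_indices)
    candidates = [i for i in range(len(facts)) if i not in gold]
    weights = [number_weight if has_number(facts[i]) else 1.0 for i in candidates]
    chosen = []
    while candidates and len(chosen) < k:
        # Draw one position by weight, then remove it so it cannot repeat.
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return chosen
```

With `number_weight` above 1, the retriever sees more hard negatives that look like numerical evidence, which is one way a model could be pushed to separate the key numerical facts from merely numeric ones.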

Taxonomy

Topics: Stock Market Forecasting Methods · Topic Modeling · Advanced Text Analysis Techniques

Methods: Adaptive Parameter-wise Diagonal Quasi-Newton Method