Architecture Matters More Than Scale: A Comparative Study of Retrieval and Memory Augmentation for Financial QA Under SME Compute Constraints

Jianan Liu; Jing Yang; Xianyou Li; Weiran Yan; Yichao Wu; Penghao Liang; Mengwei Yuan

arXiv:2604.17979 · cs.IR · May 12, 2026

TL;DR

This study compares four LLM-based reasoning architectures for financial question answering under SME compute constraints and argues that, for practical deployment, the choice of architecture matters more than model scale.

Contribution

The paper introduces an SME-constrained evaluation setting, built around a locally hosted 8B-parameter instruction-tuned model, and systematically compares four reasoning architectures to show where each is strongest across financial tasks.

Findings

01. Structured memory improves precision in deterministic tasks.

02. Retrieval-based approaches outperform memory methods in conversational settings.

03. A hybrid framework balances accuracy, auditability, and efficiency.
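
One way to picture how the findings above could combine is a small router that sends single-turn, calculation-style questions to a structured-memory backend and conversational follow-ups to a retrieval backend. The sketch below is only an illustration under stated assumptions: the cue-word list, the `RoutingDecision` type, and the `route` function are hypothetical names introduced here, and the paper's actual hybrid framework is not described in this summary.

```python
import re
from dataclasses import dataclass

# Hypothetical cue words for "deterministic" questions; not taken from the paper.
DETERMINISTIC_CUES = {"total", "sum", "ratio", "margin", "calculate", "compute"}


@dataclass(frozen=True)
class RoutingDecision:
    backend: str  # "memory" or "retrieval"
    reason: str   # recorded so each decision can be audited later


def route(question: str, has_history: bool) -> RoutingDecision:
    """Pick a backend for one question (illustrative heuristic only)."""
    if has_history:
        # Multi-turn context: the findings favor retrieval in conversational settings.
        return RoutingDecision("retrieval", "conversational follow-up")
    # Match whole words so that "summarize" does not trigger the "sum" cue.
    tokens = set(re.findall(r"[a-z]+", question.lower()))
    matched = sorted(tokens & DETERMINISTIC_CUES)
    if matched:
        # Single-turn calculation or lookup: the findings favor structured memory.
        return RoutingDecision("memory", "cue words: " + ", ".join(matched))
    return RoutingDecision("retrieval", "no deterministic cue found")
```

Keeping the `reason` string alongside each decision is a cheap way to support the auditability goal: every answer can be traced back to the rule that selected its backend. A keyword rule is an assumption made for brevity; a real system might use a classifier or task metadata instead.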

Abstract

The rapid adoption of artificial intelligence (AI) and large language models (LLMs) is transforming financial analytics by enabling natural language interfaces for reporting, decision support, and automated reasoning. However, limited empirical understanding exists regarding how different LLM-based reasoning architectures perform across realistic financial workflows, particularly under the cost, accuracy, and compliance constraints faced by small and medium-sized enterprises (SMEs). SMEs typically operate within severe infrastructure constraints, lacking cloud GPU budgets, dedicated AI teams, and API-scale inference capacity, making architectural efficiency a first-class concern. To ensure practical relevance, we introduce an explicit SME-constrained evaluation setting in which all experiments are conducted using a locally hosted 8B-parameter instruction-tuned model without cloud-scale…
