Fine-Tuning Vision-Language Models for Markdown Conversion of Financial Tables in Malaysian Audited Financial Reports
Jin Khye Tan (Faculty of Computer Science and Information Technology, Universiti Malaya), En Jun Choong, Ethan Jeremiah Chitty, Yan Pheng Choo, John Hsin Yang Wong, Chern Eu Cheah

TL;DR
This paper presents a vision-language model fine-tuned to convert complex financial tables from Malaysian audited financial reports into Markdown. The model reaches high accuracy and structural fidelity, outperforms larger models, and reduces inference time.
Contribution
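The study introduces a specialized fine-tuning approach for VLMs: supervised fine-tuning of Qwen2.5-VL-7B with LoRA on a curated dataset of 2,152 image-text pairs, evaluated with a criteria-based LLM-as-a-judge and a novel Markdown TEDS metric. The resulting model converts financial tables more accurately than the larger and proprietary models it is compared against.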
Findings
Achieved 92.20% accuracy on the criteria-based LLM-as-a-judge assessment.
Attained a 96.53% Markdown TEDS score, surpassing larger models.
Outperformed proprietary models such as GPT-4o and Gemini 2.5 Flash.
Abstract
Accurately extracting and representing the structure of tabular data from financial documents remains a critical challenge in document understanding, particularly for regulatory and analytical use cases. This study addresses the complexity of converting financial tables from Malaysian audited financial reports into Markdown format, a task complicated by rotated layouts, multi-level headers, and implicit structural cues. We propose a fine-tuned vision-language model (VLM), based on Qwen2.5-VL-7B, optimized for high-fidelity Markdown generation from document images. Our approach includes a curated dataset of 2,152 image-text pairs with augmentations and a supervised fine-tuning strategy using LoRA. To assess performance, we evaluated our model on 100 out-of-sample tables using a dual framework: a criteria-based LLM-as-a-judge for fine-grained accuracy and our novel Markdown…
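The abstract is cut off before it describes the Markdown TEDS metric, so the paper's exact definition is not available here. As a rough illustration of the general idea behind TEDS-style scores (similarity = 1 minus a tree edit distance, normalized by the size of the larger tree), the following Python sketch compares two Markdown tables. It is a simplified stand-in, not the authors' metric: it treats a table as a shallow tree (table, rows, cells) and aligns rows and cells with a nested sequence edit distance rather than a general tree edit distance. The function names `edit_distance`, `parse_markdown_table`, and `markdown_table_similarity` are hypothetical, and the parser assumes simple pipe-delimited tables without escaped pipes.

```python
import re
from typing import Callable, List, Sequence

SEPARATOR_CELL = re.compile(r":?-+:?")


def edit_distance(a: Sequence, b: Sequence,
                  sub_cost: Callable, indel_cost: Callable) -> float:
    """Weighted edit distance between two sequences (row-by-row DP)."""
    prev = [0.0]
    for y in b:
        prev.append(prev[-1] + indel_cost(y))
    for x in a:
        cur = [prev[0] + indel_cost(x)]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + indel_cost(x),        # delete x
                           cur[j - 1] + indel_cost(y),     # insert y
                           prev[j - 1] + sub_cost(x, y)))  # substitute
        prev = cur
    return prev[-1]


def parse_markdown_table(md: str) -> List[List[str]]:
    """Parse a pipe-delimited Markdown table into rows of cell strings.

    Assumption: every table line starts with '|' and cells contain no
    escaped pipes. The header separator row (e.g. |---|:--:|) is dropped.
    """
    rows: List[List[str]] = []
    separator_seen = False
    for line in md.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        inner = line[1:-1] if line.endswith("|") and len(line) > 1 else line[1:]
        cells = [c.strip() for c in inner.split("|")]
        if (not separator_seen and len(rows) == 1
                and all(SEPARATOR_CELL.fullmatch(c) for c in cells)):
            separator_seen = True
            continue
        rows.append(cells)
    return rows


def _cell_cost(x: str, y: str) -> float:
    """0 for identical text, else character edit distance scaled to (0, 1]."""
    if x == y:
        return 0.0
    chars = edit_distance(x, y, lambda p, q: 0.0 if p == q else 1.0,
                          lambda _: 1.0)
    return chars / max(len(x), len(y))


def _row_cost(r1: List[str], r2: List[str]) -> float:
    """Cost of turning one row into another by editing its cells."""
    return edit_distance(r1, r2, _cell_cost, lambda _: 1.0)


def _tree_size(rows: List[List[str]]) -> int:
    """Node count: one table node, one node per row, one node per cell."""
    return 1 + len(rows) + sum(len(r) for r in rows)


def markdown_table_similarity(pred_md: str, ref_md: str) -> float:
    """TEDS-style score in [0, 1]; 1.0 means identical structure and text."""
    pred, ref = parse_markdown_table(pred_md), parse_markdown_table(ref_md)
    # Inserting or deleting a row removes the row node and all of its cells.
    dist = edit_distance(pred, ref, _row_cost, lambda r: 1.0 + len(r))
    # Clamp at 0: this simplified distance can exceed the larger tree size.
    return max(0.0, 1.0 - dist / max(_tree_size(pred), _tree_size(ref)))


if __name__ == "__main__":
    reference = "| Item | 2023 |\n|---|---|\n| Revenue | 1,234 |\n| Cost | - |"
    prediction = reference.replace("1,234", "1,235")
    print(markdown_table_similarity(prediction, reference))
```

Because a single wrong digit only changes one cell's text, the score drops slightly, while a missing row or column costs one unit per lost node. Handling the merged or multi-level headers that the abstract mentions would require a richer tree than this sketch builds.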
Taxonomy
Topics: Financial Reporting and XBRL · Stock Market Forecasting Methods
