ExpliCIT-QA: Explainable Code-Based Image Table Question Answering
Maximiliano Hormazábal Lagos, Álvaro Bueno Sáez, Pedro Alonso Doval, Jorge Alcalde Vesteiro, Héctor Cerezo-Costas

TL;DR
ExpliCIT-QA is a modular, explainable system for multimodal table question answering that generates step-by-step reasoning, code, and explanations to improve transparency and auditability in complex table image analysis.
Contribution
It introduces a novel multimodal pipeline with explainability features, including reasoning steps and code generation, advancing transparency in table-based visual question answering.
Findings
Improved interpretability over existing baselines.
Enhanced transparency with intermediate outputs available for inspection.
Demonstrated effectiveness on the TableVQA-Bench benchmark.
Abstract
We present ExpliCIT-QA, a system that extends our previous MRT approach for tabular question answering into a multimodal pipeline capable of handling complex table images and providing explainable answers. ExpliCIT-QA follows a modular design, consisting of: (1) Multimodal Table Understanding, which uses a Chain-of-Thought approach to extract and transform content from table images; (2) Language-based Reasoning, where a step-by-step explanation in natural language is generated to solve the problem; (3) Automatic Code Generation, where Python/Pandas scripts are created based on the reasoning steps, with feedback for handling errors; (4) Code Execution to compute the final answer; and (5) Natural Language Explanation that describes how the answer was computed. The system is built for transparency and auditability: all intermediate outputs, parsed tables, reasoning steps, generated code,…
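Stages (3) and (4) of the abstract, code generation with error feedback followed by execution, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `run_with_feedback` and `stub_generator`, the retry limit, and the shape of the trace dictionary are all assumptions made for illustration. A stub function stands in for the language model, and a plain list of dictionaries stands in for a Pandas DataFrame so that the example has no dependencies.

```python
from typing import Callable, Optional


def run_with_feedback(
    generate_code: Callable[[str, Optional[str]], str],
    table: list,
    question: str,
    max_attempts: int = 3,
) -> dict:
    """Generate code, execute it, and feed any error back to the generator.

    Hypothetical helper: every attempt is recorded so the full trace
    (code plus error message) stays available for inspection.
    """
    trace = {"question": question, "attempts": [], "answer": None}
    error: Optional[str] = None
    for _ in range(max_attempts):
        code = generate_code(question, error)
        namespace = {"table": table}  # the parsed table exposed to the script
        try:
            # NOTE: exec on untrusted code is unsafe; a real system would sandbox this.
            exec(code, namespace)
            answer = namespace["answer"]  # assumed convention: script sets `answer`
        except Exception as exc:
            error = f"{type(exc).__name__}: {exc}"
            trace["attempts"].append({"code": code, "error": error})
            continue
        trace["attempts"].append({"code": code, "error": None})
        trace["answer"] = answer
        return trace
    return trace


def stub_generator(question: str, error: Optional[str]) -> str:
    """Stand-in for an LLM call: the first draft has a wrong column name,
    and the draft produced after seeing an error message is corrected."""
    if error is None:
        return "answer = sum(row['Revenue'] for row in table)"
    return "answer = sum(row['revenue'] for row in table)"


table = [
    {"region": "North", "revenue": 120.0},
    {"region": "South", "revenue": 80.0},
]
result = run_with_feedback(stub_generator, table, "What is the total revenue?")
print(result["answer"])
for attempt in result["attempts"]:
    print(attempt["error"], "|", attempt["code"])
```

In this sketch the first attempt fails with a `KeyError`, the error string is passed back to the generator, and the second attempt succeeds. Keeping both attempts in the trace mirrors the abstract's emphasis on exposing intermediate outputs for auditing.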
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Topic Modeling
