ITUNLP at SemEval-2025 Task 8: Question-Answering over Tabular Data: A Zero-Shot Approach using LLM-Driven Code Generation

Atakan Site; Emre Hakan Erdemir; Gülşen Eryiğit

arXiv:2508.00762·cs.CL·August 4, 2025

TL;DR

This paper introduces a zero-shot question-answering system for tabular data using LLM-driven Python code generation, achieving competitive results in SemEval-2025 Task 8.

Contribution

It presents a novel zero-shot approach leveraging LLMs for code generation to perform tabular question answering, emphasizing optimized prompting strategies.

Findings

01

Different LLMs vary in effectiveness for code generation.

02

Python code generation outperforms alternative approaches to tabular question answering.

03

The system ranked eighth and sixth in the two subtasks.

Abstract

This paper presents our system for SemEval-2025 Task 8: DataBench, Question-Answering over Tabular Data. The primary objective of this task is to perform question answering on given tabular datasets from diverse domains under two subtasks: DataBench QA (Subtask I) and DataBench Lite QA (Subtask II). To tackle both subtasks, we developed a zero-shot solution with a particular emphasis on leveraging Large Language Model (LLM)-based code generation. Specifically, we propose a Python code generation framework utilizing state-of-the-art open-source LLMs to generate executable Pandas code via optimized prompting strategies. Our experiments reveal that different LLMs exhibit varying levels of effectiveness in Python code generation. Additionally, results show that Python code generation achieves superior performance in tabular question answering compared to alternative approaches. Although our…
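The general idea of zero-shot Pandas code generation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the function names (`build_prompt`, `fake_llm`, `run_generated_code`), and the toy table are all hypothetical, and the LLM call is replaced by a stub that returns a canned completion so the example runs offline.

```python
import pandas as pd


def build_prompt(df: pd.DataFrame, question: str) -> str:
    """Build a zero-shot prompt from the table schema and the question.

    The prompt wording is an assumption made for illustration only.
    """
    schema = "\n".join(f"- {col} ({dtype})" for col, dtype in df.dtypes.items())
    return (
        "You are given a pandas DataFrame named `df` with these columns:\n"
        f"{schema}\n\n"
        f"Question: {question}\n"
        "Complete the function body with a single return statement.\n"
        "def answer(df):\n"
    )


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM; returns a fixed function body."""
    return "    return df.loc[df['age'] > 30, 'name'].tolist()"


def run_generated_code(df: pd.DataFrame, completion: str):
    """Wrap the completion in a function, execute it, and return the result.

    Errors in the generated code are reported as a string instead of raised.
    """
    source = "def answer(df):\n" + completion + "\n"
    namespace = {"pd": pd}
    try:
        exec(source, namespace)  # unsafe for untrusted code; see note below
        return namespace["answer"](df)
    except Exception as exc:
        return f"error: {exc}"


# Toy table and question (made up for this sketch).
df = pd.DataFrame({"name": ["Ada", "Bo", "Cy"], "age": [36, 25, 41]})
prompt = build_prompt(df, "Which people are older than 30?")
result = run_generated_code(df, fake_llm(prompt))
print(result)
```

Calling `exec` on model output is only acceptable in a toy setting like this one. A real system would presumably run generated code in an isolated process or sandbox with a timeout.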

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications