NALA_MAINZ at BLP-2025 Task 2: A Multi-agent Approach for Bangla Instruction to Python Code Generation

Hossain Shaikh Saadi; Faria Alam; Mario Sanz-Guerrero; Minh Duc Bui; Manuel Mager; Katharina von der Wense

arXiv:2511.16787 · cs.CL · November 24, 2025



TL;DR

This paper introduces a multi-agent system that generates Python code from Bangla instructions. The system iteratively tests and debugs candidate programs, reaching a Pass@1 score of 95.4 and first place in the BLP-2025 shared task.

Contribution

The paper presents a multi-agent pipeline in which a code-generation agent proposes a solution and a debugger agent revises it, conditioning on the failing unit tests, their error messages, and the current program.

Findings

01 The submission achieved a Pass@1 score of 95.4 in the shared task.

02 The system took first place in the BLP-2025 code generation challenge.

03 The code is publicly available for further research.
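As a rough illustration of the Pass@1 figure reported above: assuming one generated program per task, and assuming a task counts as solved only when all of its unit tests pass, the score is the percentage of solved tasks. The function name and the toy results below are hypothetical and are not taken from the paper or the shared task's official scorer.

```python
from typing import List


def pass_at_1(solved: List[bool]) -> float:
    """Percentage of tasks whose single generated program passed all tests.

    Assumption: exactly one candidate program is scored per task.
    """
    if not solved:
        raise ValueError("need at least one task result")
    return 100.0 * sum(solved) / len(solved)


# Toy, made-up results for four tasks (three solved, one failed).
toy_results = [True, True, False, True]
toy_score = pass_at_1(toy_results)
```

With several samples per task, the commonly used unbiased pass@k estimator would be needed instead; this sketch covers only the single-sample case.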

Abstract

This paper presents JGU Mainz's winning system for the BLP-2025 Shared Task on Code Generation from Bangla Instructions. We propose a multi-agent-based pipeline. First, a code-generation agent produces an initial solution from the input instruction. The candidate program is then executed against the provided unit tests (pytest-style, assert-based). Only the failing cases are forwarded to a debugger agent, which reruns the tests, extracts error traces, and, conditioning on the error messages, the current program, and the relevant test cases, generates a revised solution. Using this approach, our submission achieved first place in the shared task with a Pass@1 score of 95.4. We also make our code public.
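The generate, test, and debug loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two agents are stand-in Python callables (in the real system they would be LLM calls), the names `run_tests`, `solve`, `toy_generate`, and `toy_debug` are hypothetical, the `max_rounds` limit is an assumption, and the tests are plain assert statements run with `exec` rather than through pytest. It also runs code without any sandboxing, which a real system would need.

```python
from typing import Callable, List, Tuple

Failure = Tuple[str, str]  # (failing test source, error message)


def run_tests(program: str, tests: List[str]) -> List[Failure]:
    """Run each assert-based test against the program; collect only failures."""
    failures: List[Failure] = []
    for test in tests:
        namespace: dict = {}
        try:
            exec(program, namespace)  # define the candidate solution
            exec(test, namespace)     # run one assert statement against it
        except Exception as exc:
            failures.append((test, f"{type(exc).__name__}: {exc}"))
    return failures


def solve(
    instruction: str,
    tests: List[str],
    generate: Callable[[str], str],
    debug: Callable[[str, str, List[Failure]], str],
    max_rounds: int = 3,  # assumed limit, not stated in the abstract
) -> str:
    """Generate a program, then revise it while some tests still fail."""
    program = generate(instruction)
    for _ in range(max_rounds):
        failures = run_tests(program, tests)
        if not failures:
            break
        # Only the failing cases and their errors are passed to the debugger.
        program = debug(instruction, program, failures)
    return program


# Stand-in agents: the first draft has a deliberate bug, the debugger fixes it.
def toy_generate(instruction: str) -> str:
    return "def add(a, b):\n    return a - b\n"


def toy_debug(instruction: str, program: str, failures: List[Failure]) -> str:
    return program.replace("a - b", "a + b")


tests = ["assert add(2, 3) == 5", "assert add(0, 0) == 0"]
final_program = solve("Add two numbers.", tests, toy_generate, toy_debug)
```

In this toy run the first draft fails one of the two tests, so only that test and its error reach the debugger, and the revised program passes both.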


Taxonomy

Topics: Software Engineering Research · Topic Modeling · Software Testing and Debugging Techniques