TL;DR
This paper introduces an end-to-end speech translation system that combines the Whisper ASR model with the Krutrim LLM for English-to-Indic and Indic-to-English translation. It reports average BLEU scores of 28.88 and 27.86 for the two directions and explores Chain-of-Thought prompting to improve translation quality.
Contribution
The paper presents a novel integration of pre-trained ASR and Indic-specific LLMs for low-resource speech translation, and investigates Chain-of-Thought prompting effects.
Findings
Average BLEU scores of 28.88 (English-to-Indic) and 27.86 (Indic-to-English)
Chain-of-Thought prompting improved Tamil-to-English BLEU by 13.84 on successfully parsed outputs
Challenges in maintaining consistent CoT output format
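The format-consistency challenge in the last finding can be made concrete with a small sketch. The summary does not state which output format the authors asked the model to follow, so the `<translation>` tags, the function names, and the parse-rate metric below are all hypothetical and serve only to illustrate the idea: a CoT response is usable only if the final translation can be extracted from it.

```python
import re
from typing import List, Optional

# ASSUMPTION: the model is prompted to reason freely and then place its final
# answer between <translation> tags. This tag format is hypothetical and is
# not taken from the paper.
_TRANSLATION_RE = re.compile(
    r"<translation>(.*?)</translation>", re.DOTALL | re.IGNORECASE
)


def extract_translation(cot_output: str) -> Optional[str]:
    """Return the final translation, or None if the format was not followed."""
    matches = _TRANSLATION_RE.findall(cot_output)
    if not matches:
        return None
    # Use the last tagged span, since earlier ones may be drafts in the reasoning.
    text = matches[-1].strip()
    return text or None


def parse_rate(outputs: List[str]) -> float:
    """Fraction of model outputs from which a translation could be extracted."""
    if not outputs:
        return 0.0
    parsed = sum(1 for out in outputs if extract_translation(out) is not None)
    return parsed / len(outputs)
```

A score computed only over outputs where `extract_translation` succeeds can look strong while the parse rate stays low, so the two numbers are best read together.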
Abstract
This paper presents HITSZ's submission for the IWSLT 2025 Indic track, focusing on speech-to-text translation (ST) for English-to-Indic and Indic-to-English language pairs. To enhance translation quality in this low-resource scenario, we propose an end-to-end system integrating the pre-trained Whisper automated speech recognition (ASR) model with Krutrim, an Indic-specialized large language model (LLM). Experimental results demonstrate that our end-to-end system achieved average BLEU scores of 28.88 for English-to-Indic directions and 27.86 for Indic-to-English directions. Furthermore, we investigated the Chain-of-Thought (CoT) method. While this method showed potential for significant translation quality improvements on successfully parsed outputs (e.g. a 13.84 BLEU increase for Tamil-to-English), we observed challenges in ensuring the model consistently adheres to the required…