Synthesizing the Virtual Advocate: A Multi-Persona Speech Generation Framework for Diverse Linguistic Jurisdictions in Indic Languages

Aniket Deroy

arXiv:2602.11172·cs.CL·February 13, 2026

TL;DR

This paper evaluates multilingual TTS models for synthetic courtroom speech in Indic languages, proposes a prompting framework to generate advocate personas, and discusses current limitations in emotional expressiveness and phonological diversity.

Contribution

It introduces a prompting framework leveraging Gemini 2.5 models for multi-language legal speech synthesis and analyzes their performance and challenges in diverse linguistic contexts.
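
As a rough illustration of what a persona-oriented prompt might look like, the sketch below composes a style instruction and the text to be spoken into a single prompt string. The paper's actual prompt template is not given here, so the `AdvocatePersona` fields, the `build_tts_prompt` helper, and the wording of the instruction are all assumptions made for illustration. No TTS API is called.

```python
from dataclasses import dataclass

# The five languages named in the abstract.
LANGUAGES = ("Tamil", "Telugu", "Bengali", "Hindi", "Gujarati")


@dataclass(frozen=True)
class AdvocatePersona:
    """Hypothetical persona description; field names are illustrative only."""
    role: str    # e.g. "senior defence counsel"
    tone: str    # e.g. "authoritative"
    pacing: str  # e.g. "measured, with pauses before key points"


def build_tts_prompt(persona: AdvocatePersona, language: str, speech_text: str) -> str:
    """Compose a natural-language style instruction followed by the text to speak."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    style = (
        f"Speak in {language} as a {persona.role}. "
        f"Tone: {persona.tone}. Pacing: {persona.pacing}."
    )
    # The style instruction and the speech text are separated by a blank line.
    return f"{style}\n\n{speech_text}"
```

The resulting string would then be passed to whichever TTS model is under evaluation; keeping the persona separate from the speech text makes it easy to hold the text fixed while varying tone or pacing across languages.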

Findings

01

Models perform well in procedural speech delivery.

02

Models struggle with emotional modulation and vocal dynamics.

03

Performance varies across languages, with Bengali and Gujarati showing lower quality.

Abstract

Legal advocacy requires a unique combination of authoritative tone, rhythmic pausing for emphasis, and emotional intelligence. This study investigates the performance of the Gemini 2.5 Flash TTS and Gemini 2.5 Pro TTS models in generating synthetic courtroom speeches across five Indic languages: Tamil, Telugu, Bengali, Hindi, and Gujarati. We propose a prompting framework that utilizes Gemini 2.5's native support for 5 languages and its context-aware pacing to produce distinct advocate personas. The evolution of Large Language Models (LLMs) has shifted the focus of Text-to-Speech (TTS) technology from basic intelligibility to context-aware, expressive synthesis. In the legal domain, synthetic speech must convey authority and a specific professional persona, a task that becomes significantly more complex in the linguistically diverse landscape of India. The models exhibit a "monotone…

Taxonomy

Topics: Topic Modeling · AI in Service Interactions · Multimodal Machine Learning Applications