NADIR: Differential Attention Flow for Non-Autoregressive Transliteration in Indic Languages

Lakshya Tomar; Vinayak Abrol; Puneet Agarwal

arXiv:2601.12389 · cs.CL · January 21, 2026

TL;DR

NADIR is a non-autoregressive model for Indic language transliteration that trades a small amount of accuracy for speed: it runs more than 13x faster than an autoregressive baseline while keeping a competitive character error rate and reducing common transliteration errors.

Contribution

Introduces NADIR, a new non-autoregressive (NAR) architecture that combines a Differential Transformer with a Mixture-of-Experts mechanism to balance speed and accuracy in Indic language transliteration.

Findings

01. Over 13x speed-up compared to the AR baseline

02. Competitive Character Error Rate (CER) of 15.78%

03. Reduces common transliteration errors significantly
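
For readers unfamiliar with the metric in the second finding, the sketch below shows one common way to compute a corpus-level Character Error Rate: total Levenshtein edit distance divided by total reference length. It is only an illustration. The function names `levenshtein` and `cer` are hypothetical, and the paper's exact evaluation setup (for example, whether it counts Unicode code points or grapheme clusters for Indic scripts, or averages per word) is not stated on this page.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn `hyp` into `ref` (counted over Unicode code points)."""
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # One substitution over four reference characters.
    print(cer(["abcd"], ["abed"]))
```

Under this definition, a CER of 15.78% would mean roughly 16 character edits per 100 reference characters. Note that Python strings iterate over code points, so combining vowel signs in Indic scripts are counted as separate characters here.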

Abstract

In this work, we argue that not all sequence-to-sequence tasks require the strong inductive biases of autoregressive (AR) models. Tasks like multilingual transliteration, code refactoring, grammatical correction or text normalization often rely on local dependencies where the full modeling capacity of AR models can be overkill, creating a trade-off between their high accuracy and high inference latency. While non-autoregressive (NAR) models offer speed, they typically suffer from hallucinations and poor length control. To explore this trade-off, we focus on the multilingual transliteration task in Indic languages and introduce NADIR, a novel NAR architecture designed to strike a balance between speed and accuracy. NADIR integrates a Differential Transformer and a Mixture-of-Experts mechanism, enabling it to robustly model complex character mappings without sequential dependencies. NADIR…
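
The abstract names a Differential Transformer as one of NADIR's building blocks. As a rough illustration of the general idea behind differential attention (the difference of two softmax attention maps, which is meant to cancel attention noise common to both), here is a minimal pure-Python sketch. It is not NADIR's implementation: the function names, the fixed scalar `lam`, and the absence of multi-head handling, normalization, and learned projections are all simplifying assumptions. No causal mask is applied, which matches a non-autoregressive setting where every position can attend to every other.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x m) matrix by an (m x d) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention_map(q, k):
    """Unmasked scaled dot-product attention weights, softmax(Q K^T / sqrt(d))."""
    scale = 1.0 / math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    return [softmax([s * scale for s in row]) for row in matmul(q, k_t)]


def differential_attention(q1, k1, q2, k2, v, lam):
    """(softmax(Q1 K1^T / sqrt(d)) - lam * softmax(Q2 K2^T / sqrt(d))) V.

    `lam` is a plain scalar here (an assumption for this sketch); in the
    Differential Transformer literature it is a learned quantity.
    """
    a1 = attention_map(q1, k1)
    a2 = attention_map(q2, k2)
    diff = [[x - lam * y for x, y in zip(r1, r2)] for r1, r2 in zip(a1, a2)]
    return matmul(diff, v)


if __name__ == "__main__":
    q = [[1.0, 0.0], [0.0, 1.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    # With lam = 0 this reduces to ordinary unmasked attention.
    print(differential_attention(q, k, q, k, v, lam=0.0))
```

Two sanity checks follow from the formula: with `lam = 0` the result is ordinary attention, and when both query/key pairs are identical and `lam = 1` the two maps cancel and the output is all zeros.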

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques