When Alignment Hurts: Decoupling Representational Spaces in Multilingual Models

Ahmed Elshabrawy; Hour Kaing; Haiyue Song; Alham Fikri Aji; Hideki Tanaka; Masao Utiyama; Raj Dabre

arXiv:2508.12803·cs.CL·August 19, 2025

Open Access 3 Reviews

TL;DR

This paper demonstrates that excessive alignment with high-resource languages in multilingual models can hinder low-resource language generation, and proposes a causal intervention method to improve performance by decoupling these representations.

Contribution

The paper introduces an online variational probing framework for decoupling high-resource language representations in LLMs, validated through Arabic dialect modeling.

Findings

01

Intervention improves dialectal translation quality by up to +4.9 chrF++

02

Decoupling high-resource language subspaces enhances low-resource language generation

03

Tradeoff observed in standard-language performance after intervention

Abstract

Alignment with high-resource standard languages is often assumed to aid the modeling of related low-resource varieties. We challenge this assumption by demonstrating that excessive representational entanglement with a dominant variety, such as Modern Standard Arabic (MSA) in relation to Arabic dialects, can actively hinder generative modeling. We present the first comprehensive causal study of this phenomenon by analyzing and directly intervening in the internal representation geometry of large language models (LLMs). Our key contribution is an online variational probing framework that continuously estimates the subspace of the standard variety during fine-tuning, enabling projection-based decoupling from this space. While our study uses Arabic as a case due to its unusually rich parallel resources across 25 dialects, the broader motivation is methodological: dialectal MT serves as a…
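The projection-based decoupling mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard-variety subspace is given as the row space of a probe weight matrix and that decoupling means removing the component of each hidden state that lies in that subspace. The names `probe_weights`, `orthonormal_basis`, `remove_subspace`, and `subspace_energy` are hypothetical. The paper's actual variational probe and training objective are not specified here and may differ.

```python
import numpy as np


def orthonormal_basis(probe_weights):
    """Return a (d, k) orthonormal basis for the row space of a (k, d) matrix.

    Assumption: the probe's weight rows span the subspace to be removed.
    """
    q, _ = np.linalg.qr(probe_weights.T)  # reduced QR: q has shape (d, k)
    return q


def remove_subspace(hidden, basis):
    """Project (n, d) hidden states onto the orthogonal complement of the basis."""
    return hidden - hidden @ basis @ basis.T


def subspace_energy(hidden, basis):
    """Mean squared norm of the hidden states' component inside the subspace.

    A quantity like this could serve as a soft penalty instead of a hard projection.
    """
    coords = hidden @ basis
    return float(np.mean(np.sum(coords ** 2, axis=1)))


# Toy data standing in for probe weights and model hidden states (illustrative only).
rng = np.random.default_rng(0)
d, k, n = 16, 3, 8
probe_weights = rng.normal(size=(k, d))
hidden = rng.normal(size=(n, d))

basis = orthonormal_basis(probe_weights)
cleaned = remove_subspace(hidden, basis)
```

In an online setting, the basis would be re-estimated as the probe is updated during fine-tuning. Whether the removal is applied as a hard projection or as a loss term is a design choice this sketch leaves open.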

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 3

Strengths

The paper is relatively well written and nice to read (although I believe key information is left unspecified; see weaknesses below). The paper studies an interesting problem: how alignment between languages’ representations affects model performance. The paper proposes an interesting method to change the "amount" of alignment between languages, which it leverages to study the relationship between alignment and performance.

Weaknesses

I believe this paper is very interesting: it proposes an interesting method, it runs interesting experiments, and has interesting conclusions. Issues in the writing, however, stop me from recommending its acceptance. Specifically:

1. First, one of the key contributions of this paper is unclear from the paper’s description: the online subspace decoupling method. The way the paper “Identifies Higher-resource Variety Subspace” is unclear. The paper states:

> We train a variational linear probe (

Reviewer 02 · Rating 8 · Confidence 3

Strengths

- The specific problem with using multilingual models and co-training with higher resource related languages for dialects is an understudied yet significant problem.
- I am more familiar with multilingual representation research and think this connection to dialect MT is interesting.
- The results are reasonable and intuitive

Weaknesses

- I think a control of a few unrelated languages at least for analysis could strengthen your claims
- Seeing how these findings change with changing scale would have been great

Presentation etc:

- You should make it more clear early in the paper that you also introduce a technique
- nit: From what I understand, you used an existing technique for analysis but introduced a new technique but the abstract and intro reads a bit differently
- Figure 4 is super unclear

Reviewer 03 · Rating 2 · Confidence 3

Strengths

1. The proposed online decoupling method is well-motivated and its formulation is well explained. It is also general in its formulation and potentially applicable outside of the inter-variety MT task as well.
2. The random subspace experiment represented in Table 3, where the question of generic vs targeted hidden space regularization task, is a great analysis to isolate the effect of the specific method proposed.
3. The paper includes substantial testing across a number of related language g

Weaknesses

1. Motivation and contribution mismatch: The introduction to the paper focuses on the problem of "generative quality" and makes a claim that cross-lingual alignment sometimes can impede this quality. However, it is revealed much later that the test-bed of this paper is solely the inter-variety translation task. It is much more clearly the case that cross-lingual alignment would indeed harm something like inter-variety translation, but this is not clear from the motivation. Basically, if this wer

Taxonomy

Topics: Natural Language Processing Techniques