BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs

Nicolas Boizard; Théo Deschamps-Berger; Hippolyte Gisserot-Boukhlef; Céline Hudelot; Pierre Colombo

arXiv:2604.02045·cs.CL·April 3, 2026

11 Models 7 Datasets

TL;DR

This paper introduces BidirLM, a method for converting causal language models into bidirectional encoders. It reports stronger results than existing models across text, vision, and audio, obtained through new training strategies and model merging techniques.

Contribution

It presents a systematic approach for adapting causal LLMs into bidirectional encoders, including a new training objective, mitigation of catastrophic forgetting, and integration with specialized models.
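
The structural difference between a causal decoder and a bidirectional encoder shows up in the attention mask. The following is a minimal sketch of that general idea only, not the paper's adaptation procedure; the function names `causal_mask` and `bidirectional_mask` are hypothetical, and a value of 1 means "position i may attend to position j".

```python
from typing import List


def causal_mask(n: int) -> List[List[int]]:
    """Lower-triangular mask: token i sees only tokens 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n: int) -> List[List[int]]:
    """Full mask: every token sees every other token."""
    return [[1] * n for _ in range(n)]


# Print both masks for a 3-token sequence, one row per query position.
for row in causal_mask(3):
    print(row)
for row in bidirectional_mask(3):
    print(row)
```

Switching from the first mask to the second changes what each token can condition on, which is why further training is generally needed after the switch.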

Findings

01

BidirLM outperforms existing models on text, vision, and audio benchmarks.

02

A prior masking phase, often omitted, is identified as critical to successful adaptation.

03

The adaptation process scales without access to the original pre-training data.

Abstract

Transforming causal generative language models into bidirectional encoders offers a powerful alternative to BERT-style architectures. However, current approaches remain limited: they lack consensus on optimal training objectives, suffer from catastrophic forgetting at scale, and fail to flexibly integrate the vast ecosystem of specialized generative models. In this work, through systematic ablations on the Gemma3 and Qwen3 families, we identify the key factors driving successful adaptation, highlighting the critical role of an often-omitted prior masking phase. To scale this process without original pre-training data, we introduce a dual strategy combining linear weight merging with a lightweight multi-domain data mixture that mitigates catastrophic forgetting. Finally, we augment our encoders by merging them with specialized causal models, seamlessly transferring modality- and…
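
The linear weight merging mentioned in the abstract can be illustrated with a minimal sketch. It assumes two checkpoints with identical parameter names and shapes, represented here as plain dictionaries of float lists. The function name `linear_merge`, the coefficient `alpha`, and the toy parameter names are hypothetical and are not taken from the paper, whose exact merging recipe is not given on this page.

```python
from typing import Dict, List

# Hypothetical flat parameter store: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def linear_merge(a: Weights, b: Weights, alpha: float) -> Weights:
    """Return alpha * a + (1 - alpha) * b, parameter by parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if a.keys() != b.keys():
        raise ValueError("checkpoints must share parameter names")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return merged


# Toy checkpoints standing in for an adapted encoder and its source model.
adapted = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
original = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0]}
merged = linear_merge(adapted, original, alpha=0.5)
print(merged)
```

With `alpha` closer to 1 the result stays nearer the first checkpoint, and with `alpha` closer to 0 it stays nearer the second. In this sketch the coefficient is a free parameter rather than a value reported by the authors.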
