The Bicameral Model: Bidirectional Hidden-State Coupling Between Parallel Language Models
Cedric Flamant, Udaya Ghai, Kanna Shimizu

TL;DR
The paper introduces the Bicameral Model, which enables two pretrained language models to coordinate via a trainable neural interface on their hidden states, improving task performance without explicit output communication.
Contribution
It proposes a novel bidirectional coupling mechanism between language models through a learned neural interface, allowing dynamic, task-specific communication.
Findings
Coupling two 0.5B models with a calculator increases arithmetic accuracy from 36% to 96%.
Coupling two 0.6B models with a Z3 solver improves logic puzzle performance by 1.7 times.
Using a Python sandbox, the auxiliary model generates code from hidden states alone, without seeing the problem text.
Abstract
Existing multi-model and tool-augmented systems communicate by generating text, serializing every exchange through the output vocabulary. Can two pretrained language models instead coordinate through a continuous, concurrent channel? The Bicameral Model couples two frozen language models through a trainable neural interface on their intermediate hidden states. At every generation step, both models run in lockstep: a primary model drives the task while an auxiliary model operates tools, solves constraints, or executes code, with both conditioning on each other's activations through a translation network and a learned suppression gate (1% of combined parameters). The gate learns a selective communication protocol from task loss alone, without a prescribed format. We demonstrate the mechanism across three tool backends. On arithmetic, coupling two 0.5B models with a calculator…
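The coupling described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the single linear translation map, the scalar sigmoid gate, the additive update, and every name below (`CouplingDirection`, `coupled_step`, `gate_bias`) are illustrative assumptions, and the toy vectors stand in for hidden states that would really come from two frozen language models. Training of the interface is omitted.

```python
import math
import random


def linear(weights, vec):
    """Multiply a (rows x cols) weight matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class CouplingDirection:
    """One direction of a hypothetical interface: sender state -> receiver state.

    Assumed layout (not from the paper): one linear "translation" map plus a
    scalar gate computed from the receiver's own hidden state.
    """

    def __init__(self, sender_dim, receiver_dim, rng, gate_bias=-2.0):
        self.translate = [
            [rng.gauss(0.0, 0.1) for _ in range(sender_dim)]
            for _ in range(receiver_dim)
        ]
        self.gate_weights = [rng.gauss(0.0, 0.1) for _ in range(receiver_dim)]
        # A negative bias means the channel starts mostly suppressed.
        self.gate_bias = gate_bias

    def gate(self, receiver_hidden):
        score = sum(w * h for w, h in zip(self.gate_weights, receiver_hidden))
        return sigmoid(score + self.gate_bias)

    def apply(self, sender_hidden, receiver_hidden):
        # Translate the sender's activations into the receiver's space,
        # then add them to the receiver's state, scaled by the gate.
        message = linear(self.translate, sender_hidden)
        g = self.gate(receiver_hidden)
        return [h + g * m for h, m in zip(receiver_hidden, message)]


def coupled_step(primary_hidden, auxiliary_hidden, aux_to_primary, primary_to_aux):
    """Exchange messages in both directions using the pre-exchange states."""
    new_primary = aux_to_primary.apply(auxiliary_hidden, primary_hidden)
    new_auxiliary = primary_to_aux.apply(primary_hidden, auxiliary_hidden)
    return new_primary, new_auxiliary


# Toy usage: a 4-dimensional "primary" state and a 3-dimensional "auxiliary" state.
rng = random.Random(0)
primary_hidden = [rng.gauss(0.0, 1.0) for _ in range(4)]
auxiliary_hidden = [rng.gauss(0.0, 1.0) for _ in range(3)]
aux_to_primary = CouplingDirection(sender_dim=3, receiver_dim=4, rng=rng)
primary_to_aux = CouplingDirection(sender_dim=4, receiver_dim=3, rng=rng)
new_primary, new_auxiliary = coupled_step(
    primary_hidden, auxiliary_hidden, aux_to_primary, primary_to_aux
)
```

In this sketch a gate value near 0 leaves the receiver's state untouched, which is one simple way to read "suppression": the interface can stay silent at steps where the other model has nothing useful to contribute.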
