Many Minds from One Model: Bayesian-Inspired Transformers for Population Diversity

Diji Yang; Yi Zhang

arXiv:2512.25063 · cs.LG · January 19, 2026

TL;DR

This paper introduces Population Bayesian Transformers (B-Trans), a method for sampling diverse yet coherent transformer model instances from a single pre-trained model by injecting stochasticity into its normalization layers. The approach is motivated by an analogy to human populations, in which population-level intelligence emerges from diverse individual behaviors.

Contribution

The paper proposes a novel Bayesian-inspired approach to produce diverse transformer model instances from one pre-trained model, enhancing response diversity and task performance.

Findings

01. B-Trans generates diverse yet coherent model instances.

02. B-Trans improves response diversity in zero-shot tasks.

03. B-Trans outperforms deterministic baselines in task performance.

Abstract

Despite their scale and success, modern transformers are usually trained as single-minded systems: optimization produces a deterministic set of parameters, representing a single functional hypothesis about the data. Motivated by the analogy to human populations, in which population-level intelligence emerges from diverse individual behaviors, we propose Population Bayesian Transformers (B-Trans), which enable sampling diverse yet coherent transformer large language model instances (hereafter referred to as a 'mind') from a single pre-trained LLM. B-Trans introduces a Bayesian-inspired posterior proxy by injecting stochasticity directly into normalization layers, avoiding the prohibitive cost of training full Bayesian neural networks. Sampling from this proxy yields a population of minds with diverse behaviors while maintaining general competence. During the generation of each response,…
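
As a rough illustration of the idea in the abstract, the sketch below perturbs the gain of a normalization layer to obtain several model variants from one set of pre-trained parameters. It is not the authors' implementation: the choice of RMSNorm, the additive Gaussian noise, the `sigma` value, and the function names `rms_norm` and `sample_mind` are all assumptions made for illustration, since the abstract does not specify the noise distribution or how it is parameterized.

```python
import math
import random


def rms_norm(x, gain, eps=1e-6):
    """Plain RMSNorm: rescale x to unit root-mean-square, then apply a per-channel gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for g, v in zip(gain, x)]


def sample_mind(gain, sigma, seed):
    """Draw one perturbed copy of the gain vector.

    Assumption: additive Gaussian noise with a shared scale `sigma`.
    The seed fixes the draw, so the same seed always gives the same variant.
    """
    rng = random.Random(seed)
    return [g + rng.gauss(0.0, sigma) for g in gain]


# Hypothetical pre-trained gain and a single hidden vector.
pretrained_gain = [1.0, 1.0, 1.0, 1.0]
hidden = [0.5, -1.0, 2.0, 0.25]

# Deterministic baseline: the unperturbed normalization layer.
baseline = rms_norm(hidden, pretrained_gain)

# A small population: each seed yields a different perturbed gain vector.
minds = [sample_mind(pretrained_gain, sigma=0.05, seed=s) for s in range(3)]
outputs = [rms_norm(hidden, g) for g in minds]

for seed, out in enumerate(outputs):
    print(seed, [round(v, 4) for v in out])
```

Each seed produces a slightly different output for the same input, while a small `sigma` keeps every variant close to the deterministic baseline. Setting `sigma` to zero recovers the pre-trained layer exactly.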

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis