Group Cognition Learning: Making Everything Better Through Governed Two-Stage Agents Collaboration

Chunlei Meng; Pengbin Feng; Rong Fu; Hoi Leong Lee; Xiaojing Du; Zhaolu Kang; Zeyu Zhang; Weilin Zhou; Chun Ouyang; Zhongxue Gan

arXiv:2605.00370·cs.LG·May 12, 2026

TL;DR

This paper introduces Group Cognition Learning (GCL), a governed two-stage collaboration framework for multimodal learning that mitigates modality dominance and spurious modality coupling. It reports state-of-the-art results on CMU-MOSI, CMU-MOSEI, and MIntRec.

Contribution

The paper proposes a governed two-stage collaboration paradigm with explicit agent modules: a Routing Agent and an Auditing Agent for Selective Interaction (Stage 1), and a Public-Factor Agent for Consensus Formation (Stage 2).

Findings

01

GCL achieves state-of-the-art results on CMU-MOSI, CMU-MOSEI, and MIntRec datasets.

02

GCL effectively reduces modality dominance and spurious coupling.

03

Extensive experiments validate the design's effectiveness.

Abstract

Centralized multimodal learning commonly compresses language, acoustic, and visual signals into a single fused representation for prediction. While effective, this paradigm suffers from two limitations: modality dominance, where optimization gravitates towards the path of least resistance, ignoring weaker but informative modalities, and spurious modality coupling, where models overfit to incidental cross-modal correlations. To address these, we propose Group Cognition Learning (GCL), a governed collaboration paradigm that applies a two-stage protocol after modality-specific encoding. In Stage 1 (Selective Interaction), a Routing Agent proposes directed interaction routes, and an Auditing Agent assigns sample-wise gates to emphasize exchanges that yield positive marginal predictive gain while suppressing redundant coupling. In Stage 2 (Consensus Formation), a Public-Factor Agent…
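The Stage 1 description above (a Routing Agent proposing directed routes, an Auditing Agent assigning sample-wise gates by marginal predictive gain) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the all-pairs routing rule, the linear prediction head, the squared-error gain, and the sigmoid gate with a `sharpness` parameter are all assumptions made for illustration, and the gate here uses the label directly, which a real model would have to approximate.

```python
import math

MODALITIES = ("language", "acoustic", "visual")

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(features, weights):
    """Toy linear head (assumption): dot product of features and weights."""
    return sum(f * w for f, w in zip(features, weights))

def propose_routes(modalities):
    """Hypothetical routing step: propose every directed (source, target) pair."""
    return [(s, t) for s in modalities for t in modalities if s != t]

def audit_gate(target_feat, source_feat, weights, label, sharpness=5.0):
    """Hypothetical auditing step: gate a route by its marginal predictive gain.

    Gain is the drop in squared error when the source features are added to
    the target features. Positive gain pushes the gate toward 1, negative
    gain pushes it toward 0.
    """
    base_err = (predict(target_feat, weights) - label) ** 2
    mixed = [t + s for t, s in zip(target_feat, source_feat)]
    mixed_err = (predict(mixed, weights) - label) ** 2
    return sigmoid(sharpness * (base_err - mixed_err))

def selective_interaction(sample, weights, label):
    """Apply gated directed exchanges for one sample; return features and gates."""
    updated = {m: list(v) for m, v in sample.items()}
    gates = {}
    for src, tgt in propose_routes(MODALITIES):
        # Gates are computed from the original (pre-exchange) features.
        g = audit_gate(sample[tgt], sample[src], weights, label)
        gates[(src, tgt)] = g
        updated[tgt] = [u + g * s for u, s in zip(updated[tgt], sample[src])]
    return updated, gates

# Made-up two-dimensional features for a single sample.
sample = {
    "language": [1.0, 0.0],
    "acoustic": [0.0, 1.0],
    "visual": [-1.0, 0.0],
}
updated, gates = selective_interaction(sample, weights=[1.0, 1.0], label=2.0)
for route, gate in sorted(gates.items()):
    print(route, round(gate, 3))
```

In this toy sample, the acoustic-to-language route moves the language prediction closer to the label and receives a gate near 1, while the visual-to-language route moves it away and is suppressed. The gate is computed per sample, so the same route can be open for one input and closed for another.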
