Group Cognition Learning: Making Everything Better Through Governed Two-Stage Agents Collaboration
Chunlei Meng, Pengbin Feng, Rong Fu, Hoi Leong Lee, Xiaojing Du, Zhaolu Kang, Zeyu Zhang, Weilin Zhou, Chun Ouyang, Zhongxue Gan

TL;DR
This paper introduces Group Cognition Learning (GCL), a governed two-stage collaboration framework for multimodal learning that mitigates modality dominance and spurious modality coupling, and reports state-of-the-art results on CMU-MOSI, CMU-MOSEI, and MIntRec.
Contribution
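The paper proposes a governed two-stage collaboration paradigm, applied after modality-specific encoding, built from explicit agents: a Routing Agent and an Auditing Agent in Stage 1 (Selective Interaction), and a Public-Factor Agent in Stage 2 (Consensus Formation).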
Findings
GCL achieves state-of-the-art results on CMU-MOSI, CMU-MOSEI, and MIntRec datasets.
GCL effectively reduces modality dominance and spurious coupling.
Extensive experiments validate the design's effectiveness.
Abstract
Centralized multimodal learning commonly compresses language, acoustic, and visual signals into a single fused representation for prediction. While effective, this paradigm suffers from two limitations: modality dominance, where optimization gravitates towards the path of least resistance, ignoring weaker but informative modalities, and spurious modality coupling, where models overfit to incidental cross-modal correlations. To address these, we propose Group Cognition Learning (GCL), a governed collaboration paradigm that applies a two-stage protocol after modality-specific encoding. In Stage 1 (Selective Interaction), a Routing Agent proposes directed interaction routes, and an Auditing Agent assigns sample-wise gates to emphasize exchanges that yield positive marginal predictive gain while suppressing redundant coupling. In Stage 2 (Consensus Formation), a Public-Factor Agent…
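The Stage 1 description above can be made concrete with a small sketch. This is not the authors' implementation: the function name `selective_interaction`, the linear scoring of concatenated features, the softmax over incoming routes, and the sigmoid gate are all assumptions chosen for illustration, and the training signal that would tie gates to marginal predictive gain is not shown. The sketch only illustrates the data flow: a routing step weights each directed route (source modality to target modality), and an auditing step assigns a per-sample gate in [0, 1] that scales how much of the source is mixed into the target.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def selective_interaction(feats, route_w, gate_w):
    """Hypothetical Stage 1 sketch (not the paper's code).

    feats:   dict modality -> (batch, d) encoded features
    route_w: dict (src, dst) -> (2d,) routing weights   (assumed form)
    gate_w:  dict (src, dst) -> (2d,) auditing weights  (assumed form)
    """
    names = list(feats)
    updated, gates = {}, {}
    for dst in names:
        incoming = [(src, dst) for src in names if src != dst]
        # Pair features: source and target concatenated per sample.
        pair = {r: np.concatenate([feats[r[0]], feats[dst]], axis=1)
                for r in incoming}
        # "Routing": softmax over the incoming routes of this target.
        scores = np.stack([pair[r] @ route_w[r] for r in incoming], axis=1)
        scores -= scores.max(axis=1, keepdims=True)
        probs = np.exp(scores)
        probs /= probs.sum(axis=1, keepdims=True)
        out = feats[dst].copy()
        for i, r in enumerate(incoming):
            # "Auditing": one gate value per sample for this route.
            gates[r] = sigmoid(pair[r] @ gate_w[r])
            out += (probs[:, i] * gates[r])[:, None] * feats[r[0]]
        updated[dst] = out
    return updated, gates

# Toy usage with random features for three modalities.
rng = np.random.default_rng(0)
batch, d = 4, 8
modalities = ("language", "acoustic", "visual")
feats = {m: rng.normal(size=(batch, d)) for m in modalities}
routes = [(s, t) for s in modalities for t in modalities if s != t]
route_w = {r: rng.normal(size=2 * d) for r in routes}
gate_w = {r: rng.normal(size=2 * d) for r in routes}
updated, gates = selective_interaction(feats, route_w, gate_w)
```

In this sketch a gate near 0 leaves the target modality's features almost unchanged for that sample, which is one simple way to picture "suppressing redundant coupling"; how the paper actually parameterizes and supervises the agents is not stated in the text available here.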
