Connector-S: A Survey of Connectors in Multi-modal Large Language Models
Xun Zhu, Zheng Zhang, Xi Chen, Yiming Shi, Miao Li, Ji Wu

TL;DR
This survey reviews the design and taxonomy of connectors in multi-modal large language models (MLLMs), along with the open challenges they pose, with the aim of guiding future research on this critical component.
Contribution
It provides a structured taxonomy of connectors, analyzes their technical progress, and discusses future research challenges in multi-modal large language models.
Findings
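The taxonomy categorizes connectors into atomic operations (mapping, compression, mixture of experts) and holistic designs (multi-layer, multi-encoder, multi-modal scenarios).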
Highlights recent advancements and technical contributions.
Identifies key challenges and future research directions, including high-resolution input, dynamic compression, and guide information selection.
Abstract
With the rapid advancements in multi-modal large language models (MLLMs), connectors play a pivotal role in bridging diverse modalities and enhancing model performance. However, the design and evolution of connectors have not been comprehensively analyzed, leaving gaps in understanding how these components function and hindering the development of more powerful connectors. In this survey, we systematically review the current progress of connectors in MLLMs and present a structured taxonomy that categorizes connectors into atomic operations (mapping, compression, mixture of experts) and holistic designs (multi-layer, multi-encoder, multi-modal scenarios), highlighting their technical contributions and advancements. Furthermore, we discuss several promising research frontiers and challenges, including high-resolution input, dynamic compression, guide information selection, combination…
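The three atomic operations named in the abstract can be illustrated with a small sketch. This is not the survey's implementation: the abstract only names the categories, so the concrete choices here (a single linear projection for mapping, average pooling over consecutive tokens for compression, a dense softmax gate for the mixture of experts) are assumptions, and every function name, dimension, and random weight is hypothetical. Pure Python lists stand in for tensors to keep the example dependency-free.

```python
import math
import random
from typing import List

Matrix = List[List[float]]  # rows are tokens, columns are feature dimensions


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Hypothetical stand-in for learned weights or encoder features."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def mapping(tokens: Matrix, weight: Matrix) -> Matrix:
    """Mapping (assumed linear): project encoder features to the LLM width."""
    return matmul(tokens, weight)


def compression(tokens: Matrix, window: int) -> Matrix:
    """Compression (assumed average pooling): merge each run of `window` tokens."""
    out = []
    for start in range(0, len(tokens), window):
        group = tokens[start:start + window]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(tokens: Matrix, gate: Matrix, experts: List[Matrix]) -> Matrix:
    """Mixture of experts (assumed dense gating): weight several mapping experts."""
    gate_scores = matmul(tokens, gate)  # one score per expert for each token
    expert_outputs = [matmul(tokens, w) for w in experts]
    mixed = []
    for t, scores in enumerate(gate_scores):
        probs = softmax(scores)
        width = len(expert_outputs[0][t])
        mixed.append([
            sum(p * expert_outputs[e][t][d] for e, p in enumerate(probs))
            for d in range(width)
        ])
    return mixed


rng = random.Random(0)
vision_dim, llm_dim, num_tokens, num_experts = 8, 16, 12, 3
visual_tokens = random_matrix(num_tokens, vision_dim, rng)

mapped = mapping(visual_tokens, random_matrix(vision_dim, llm_dim, rng))
compressed = compression(mapped, window=4)
routed = mixture_of_experts(
    visual_tokens,
    gate=random_matrix(vision_dim, num_experts, rng),
    experts=[random_matrix(vision_dim, llm_dim, rng) for _ in range(num_experts)],
)
print(len(mapped), len(compressed), len(routed))  # prints: 12 3 12
```

Mapping and the mixture of experts leave the token count unchanged and only alter the feature width, whereas compression shortens the sequence (12 tokens become 3 with a window of 4). That difference is one plausible reason to treat them as separate building blocks.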
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
