KCM: KAN-Based Collaboration Models Enhance Pretrained Large Models
Guangyu Dai, Siliang Tang, Yueting Zhuang

TL;DR
This paper introduces KCM, a collaboration model built on the KAN (Kolmogorov-Arnold Network) architecture. It aims to improve large-small model collaboration by reducing resource use and mitigating catastrophic forgetting across language, vision, and cross-modal tasks.
Contribution
The paper proposes KCM, an innovative KAN-based approach that enhances interpretability, mitigates forgetting, and improves efficiency in large-small model collaborations.
Findings
KCM reduces large model inference calls significantly.
KCM maintains near-identical accuracy with lower resource consumption.
KCM outperforms MLP-based models in all evaluated metrics.
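
The first finding can be illustrated with a minimal sketch of gated large-small collaboration. This is not the paper's method: the excerpt here does not say how KCM decides when to invoke the large model, so a plain confidence threshold stands in for that decision, and `route`, `toy_small`, `toy_large`, and `threshold` are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def route(
    inputs: Sequence[str],
    small_model: Callable[[str], Tuple[str, float]],
    large_model: Callable[[str], str],
    threshold: float = 0.8,
) -> Tuple[List[str], int]:
    """Answer each input, calling the large model only when the small one is unsure.

    Assumption: a confidence threshold is the gate. In KCM this role would be
    played by a KAN-based model whose details are not given in this excerpt.
    """
    outputs: List[str] = []
    large_calls = 0
    for x in inputs:
        label, confidence = small_model(x)
        if confidence >= threshold:
            # Small model is confident enough: skip the expensive call.
            outputs.append(label)
        else:
            # Fall back to the large model and count the call.
            outputs.append(large_model(x))
            large_calls += 1
    return outputs, large_calls


# Toy stand-ins (hypothetical): the small model is only confident on short inputs.
def toy_small(x: str) -> Tuple[str, float]:
    return ("short", 0.95) if len(x) < 10 else ("short", 0.40)


def toy_large(x: str) -> str:
    return "long" if len(x) >= 10 else "short"


preds, calls = route(["hi", "a much longer input", "ok"], toy_small, toy_large)
print(preds, calls)
```

In this toy run only one of the three inputs reaches the large model. Raising `threshold` sends more inputs to the large model, which is the accuracy versus cost trade-off the findings refer to.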
Abstract
In recent years, researchers working on Pretrained Large Models (PLMs) have proposed large-small model collaboration frameworks, which leverage easily trainable small models to assist large models. These frameworks aim to (1) significantly reduce computational resource consumption while maintaining comparable accuracy, and (2) enhance large model performance on specialized domain tasks. However, this collaborative paradigm suffers from issues such as significant accuracy degradation, exacerbated catastrophic forgetting, and amplified hallucination problems induced by small model knowledge. To address these challenges, we propose a KAN-based Collaborative Model (KCM) as an improved approach to large-small model collaboration. The KAN used in KCM is an alternative neural network architecture distinct from conventional MLPs. Compared to MLPs, KAN offers superior visualizability and interpretability while mitigating…
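
For readers unfamiliar with KANs, the contrast with an MLP can be written compactly. The notation below follows the general KAN literature rather than this paper, whose own formulation is not shown in this excerpt; the symbols $W$, $b$, $\sigma$, and $\phi$ are introduced here for illustration only.

```latex
% MLP layer: fixed nonlinearity \sigma applied after a learned affine map
x^{(l+1)} = \sigma\!\left( W^{(l)} x^{(l)} + b^{(l)} \right)

% KAN layer: a learnable univariate function \phi on every edge, summed at each node
x^{(l+1)}_j = \sum_{i=1}^{n_l} \phi^{(l)}_{j,i}\!\left( x^{(l)}_i \right),
\qquad j = 1, \dots, n_{l+1}
```

Each $\phi^{(l)}_{j,i}$ is a one-dimensional function, typically parameterized as a spline, so it can be plotted directly. That is the usual basis for describing KANs as easier to visualize and interpret than MLPs.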
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis
