Knowledge Amalgamation for Object Detection with Transformers

Haofei Zhang; Feng Mao; Mengqi Xue; Gongfan Fang; Zunlei Feng; Jie Song; Mingli Song

arXiv:2203.03187 · cs.CV · October 28, 2024

TL;DR

This paper introduces a knowledge amalgamation approach tailored to transformer-based object detection models. It transfers knowledge from multiple teachers to a single compact student and reports results comparable or superior to the teachers on PASCAL VOC and COCO.

Contribution

The paper proposes two amalgamation methods designed for transformer architectures in object detection, sequence-level amalgamation (SA) and task-level amalgamation (TA), to make knowledge transfer from the teachers more efficient and more effective.

Findings

01 Sequence-level amalgamation significantly boosts student performance.

02 Transformer-based students learn heterogeneous tasks rapidly.

03 Achieves comparable or superior results to teacher models on PASCAL VOC and COCO.

Abstract

Knowledge amalgamation (KA) is a novel deep model reusing task aiming to transfer knowledge from several well-trained teachers to a multi-talented and compact student. Currently, most of these approaches are tailored for convolutional neural networks (CNNs). However, there is a tendency that transformers, with a completely different architecture, are starting to challenge the domination of CNNs in many computer vision tasks. Nevertheless, directly applying the previous KA methods to transformers leads to severe performance degradation. In this work, we explore a more effective KA scheme for transformer-based object detection models. Specifically, considering the architecture characteristics of transformers, we propose to dissolve the KA into two aspects: sequence-level amalgamation (SA) and task-level amalgamation (TA). In particular, a hint is generated within the sequence-level…

Code & Models

Repositories

zju-vipa/KamalEngine (PyTorch)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning