Kaggle Kinship Recognition Challenge: Introduction of Convolution-Free Model to boost conventional
Mingchuan Tian, Guangway Teng, Yipeng Bao

TL;DR
This paper introduces a convolution-free Vision Transformer model as part of an ensemble classifier to improve kinship recognition accuracy, demonstrating significant performance gains over traditional CNN-only ensembles.
Contribution
It proposes combining Vision Transformers with CNNs for ensemble classification, a novel approach in kinship recognition tasks.
Findings
Combined models outperform CNN-only ensembles in ROC scores.
Vision Transformer predictions correlate only weakly with CNN predictions, which adds diversity to CNN ensembles.
Optimized Vision Transformer variants boost overall ensemble performance.
Abstract
This work explores a convolution-free base classifier that can broaden the diversity of conventional ensemble classifiers. Specifically, we propose Vision Transformers as base classifiers to combine with CNNs for a unique ensemble solution in Kaggle kinship recognition. We verify the proposed idea by implementing and optimizing variants of the Vision Transformer model on top of existing CNN models. The combined models achieve better scores than conventional ensemble classifiers based solely on CNN variants. We demonstrate that highly optimized CNN ensembles publicly available on the Kaggle Discussion board can easily achieve a significant boost in ROC score simply by ensembling them with variants of the Vision Transformer model, because the Vision Transformer predictions have low correlation with those of the CNNs.
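The low-correlation argument in the abstract can be illustrated with a small synthetic sketch. Nothing below comes from the paper: the scores are simulated rather than produced by real CNN or Vision Transformer models, the noise levels are arbitrary, and the ensemble is assumed to be a plain unweighted mean of scores (the abstract does not state how the authors combine models). The names `cnn_a`, `cnn_b`, and `vit` are hypothetical stand-ins. Two "CNN" scorers share most of their error, while the "ViT" scorer has independent error of similar size.

```python
import numpy as np


def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) statistic.

    Assumes binary 0/1 labels and no tied scores.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u_stat = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


rng = np.random.default_rng(0)
n = 20000
labels = rng.integers(0, 2, size=n)  # 1 = kin pair, 0 = non-kin pair

# Simulated scores (assumption): both CNNs share one large error term,
# so their mistakes are highly correlated.
shared_error = rng.normal(size=n)
cnn_a = labels + shared_error + 0.3 * rng.normal(size=n)
cnn_b = labels + shared_error + 0.3 * rng.normal(size=n)
# The ViT stand-in is about as accurate alone, but its error is independent.
vit = labels + 1.05 * rng.normal(size=n)

cnn_ensemble = (cnn_a + cnn_b) / 2
mixed_ensemble = (cnn_a + cnn_b + vit) / 3

corr_cnn_pair = np.corrcoef(cnn_a, cnn_b)[0, 1]
corr_cnn_vit = np.corrcoef(cnn_ensemble, vit)[0, 1]

auc_cnn_a = roc_auc(labels, cnn_a)
auc_vit = roc_auc(labels, vit)
auc_cnn_ensemble = roc_auc(labels, cnn_ensemble)
auc_mixed_ensemble = roc_auc(labels, mixed_ensemble)

print(f"corr(cnn_a, cnn_b)      = {corr_cnn_pair:.3f}")
print(f"corr(cnn_ensemble, vit) = {corr_cnn_vit:.3f}")
print(f"AUC single CNN          = {auc_cnn_a:.3f}")
print(f"AUC single ViT          = {auc_vit:.3f}")
print(f"AUC CNN-only ensemble   = {auc_cnn_ensemble:.3f}")
print(f"AUC CNN + ViT ensemble  = {auc_mixed_ensemble:.3f}")
```

In this toy setup, averaging the two correlated CNN scorers barely changes the AUC because their shared error does not cancel, whereas adding the weakly correlated scorer averages away part of the error and raises the AUC noticeably, even though that scorer is no better on its own.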
Taxonomy
Topics: Face recognition and analysis · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Softmax · Absolute Position Encodings · Dropout · Adam · Byte Pair Encoding · Residual Connection
