Solution for Large-scale Long-tailed Recognition with Noisy Labels
Yuqiao Xian, Jia-Xin Zhuang, Fufu Yu

TL;DR
This paper presents a solution for large-scale, long-tailed commodity image recognition with noisy labels, combining CNN and Transformer architectures with iterative data cleaning to improve accuracy in the CVPR 2021 AliProducts Challenge.
Contribution
It introduces a novel combination of model architectures and data processing strategies specifically designed for noisy, imbalanced, and fine-grained recognition tasks.
Findings
Achieved a 6.4365% mean class error rate on the competition leaderboard with an ensemble model
Identified key techniques: iterative data cleaning, classifier weight normalization, high-resolution finetuning, test-time augmentation
Demonstrated effectiveness of CNN and Transformer models in large-scale recognition
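The headline metric above is a mean class error rate rather than an overall error rate. A minimal sketch of how such a metric is commonly computed follows, assuming it is the unweighted average of per-class error rates (the challenge's exact definition is not given here, and the function name and toy labels are hypothetical):

```python
from collections import defaultdict

def mean_class_error(y_true, y_pred):
    """Average of per-class error rates, so every class counts equally."""
    total = defaultdict(int)
    wrong = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t != p:
            wrong[t] += 1
    per_class = [wrong[c] / total[c] for c in total]
    return sum(per_class) / len(per_class)

# Toy long-tailed data (illustrative only): class 0 has 8 samples, class 1 has 2.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # one of the two tail samples is misclassified

overall_error = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
mce = mean_class_error(y_true, y_pred)
print(overall_error, mce)
```

Under this assumed definition, a single mistake on a rare class moves the metric far more than a mistake on a frequent class, which is why techniques aimed at tail classes matter for this kind of leaderboard.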
Abstract
This is a technical report for the CVPR 2021 AliProducts Challenge. The AliProducts Challenge is a competition proposed for studying the large-scale and fine-grained commodity image recognition problem encountered by world-leading ecommerce companies. Large-scale product recognition simultaneously faces the challenges of noisy annotations, imbalanced (long-tailed) data distribution, and fine-grained classification. In our solution, we adopt state-of-the-art model architectures of both CNNs and Transformers, including ResNeSt, EfficientNetV2, and DeiT. We found that iterative data cleaning, classifier weight normalization, high-resolution finetuning, and test-time augmentation are key components for improving the performance of training with the noisy and imbalanced dataset. Finally, we obtain a 6.4365% mean class error rate on the leaderboard with our ensemble model.
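The abstract names classifier weight normalization without giving its form. As an illustration only, the sketch below assumes one common variant from the long-tailed recognition literature: dividing each class weight vector by its L2 norm raised to a power tau. The function names, weights, and feature values are hypothetical and are not taken from the report.

```python
import math

def normalize_classifier(weights, tau=1.0):
    """Rescale each class weight vector w_c by 1 / ||w_c||**tau (assumed variant)."""
    out = []
    for w in weights:
        norm = math.sqrt(sum(x * x for x in w))
        scale = norm ** tau if norm > 0 else 1.0
        out.append([x / scale for x in w])
    return out

def logits(weights, feature):
    """Bias-free linear classifier: one dot product per class."""
    return [sum(wi * fi for wi, fi in zip(w, feature)) for w in weights]

# Hypothetical 2-class classifier: the head class (index 0) has a much larger
# weight norm than the tail class (index 1), as often happens on imbalanced data.
W = [[4.0, 0.0], [0.0, 1.0]]
feature = [0.5, 1.0]

raw = logits(W, feature)
normed = logits(normalize_classifier(W), feature)
raw_pred = raw.index(max(raw))
normed_pred = normed.index(max(normed))
print(raw_pred, normed_pred)
```

In this toy case the raw classifier favors the head class purely because of its larger weight norm, while the normalized classifier decides on direction alone. Setting tau between 0 and 1 would interpolate between the two behaviors.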
Taxonomy
Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Image and Object Detection Techniques
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Pointwise Convolution · Depthwise Convolution · Depthwise Separable Convolution · Inverted Residual Block · EfficientNetV2 · Average Pooling · Global Average Pooling
