Exploring the Robustness of In-Context Learning with Noisy Labels
Chen Cheng, Xinzhi Yu, Haodong Wen, Jingsong Sun, Guanzhang Yue, Yihao Zhang, Zeming Wei

TL;DR
This paper investigates how robust Transformer-based models are to noisy labels during in-context learning, finding that they are resilient to various noise types and that training with noise can further enhance this robustness.
Contribution
It provides a comprehensive analysis of Transformer robustness to noisy labels in ICL and demonstrates that training with noise can improve inference resilience.
Findings
Transformers show notable resilience to diverse label noise during ICL.
Introducing noise into training data can enhance robustness during inference.
The study offers insights into the resilience of Transformers in noisy natural language processing environments.
Abstract
Recently, the mysterious In-Context Learning (ICL) ability exhibited by Transformer architectures, especially in large language models (LLMs), has sparked significant research interest. However, the resilience of Transformers' in-context learning capabilities in the presence of noisy samples, prevalent in both training corpora and prompt demonstrations, remains underexplored. In this paper, inspired by prior research that studies ICL ability using simple function classes, we take a closer look at this problem by investigating the robustness of Transformers against noisy labels. Specifically, we first conduct a thorough evaluation and analysis of the robustness of Transformers against noisy labels during in-context learning and show that they exhibit notable resilience against diverse types of noise in demonstration labels. Furthermore, we delve deeper into this problem by exploring…
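To make the setup in the abstract concrete, the following is a minimal sketch of building one in-context prompt from a simple function class and corrupting its demonstration labels. The abstract does not specify these details, so the function class (linear regression), the noise model (additive Gaussian), and every name and default here (`make_noisy_prompt`, `dim`, `n_demos`, `noise_std`) are illustrative assumptions rather than the authors' implementation.

```python
import random


def make_noisy_prompt(dim=5, n_demos=10, noise_std=0.5, seed=0):
    """Build one in-context regression prompt with noisy demonstration labels.

    Assumed setup (not taken from the paper): a linear function class
    f(x) = <w, x> and additive Gaussian label noise.
    """
    rng = random.Random(seed)

    # Hidden task vector; the model only ever sees (x, y) demonstrations.
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def f(x):
        return sum(wi * xi for wi, xi in zip(w, x))

    demos = []
    for _ in range(n_demos):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        clean_y = f(x)
        # Corrupt the label only; the input x stays clean.
        noisy_y = clean_y + rng.gauss(0.0, noise_std)
        demos.append((x, noisy_y, clean_y))

    # The query label is kept clean so predictions are scored against f.
    query_x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return demos, query_x, f(query_x)


demos, query_x, query_y = make_noisy_prompt(noise_std=0.5)
```

Under these assumptions, a robustness evaluation would feed the `(x, noisy_y)` pairs and `query_x` to the model and compare its prediction with the clean `query_y` as `noise_std` grows. Other noise types would replace only the line that computes `noisy_y`.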
Taxonomy
Topics: Water Systems and Optimization · Wireless Sensor Networks and IoT · Machine Learning and Data Classification
Methods: Attention Is All You Need · Dropout · Residual Connection · Softmax · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Linear Layer · Dense Connections · Label Smoothing
