READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises
Chenglei Si, Zhengyan Zhang, Yingfa Chen, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun

TL;DR
READIN is a comprehensive Chinese benchmark with realistic input noises from speech and typing errors, designed to evaluate and improve the robustness of NLP models in real-world scenarios.
Contribution
It introduces the first large-scale Chinese benchmark with diverse, realistic input noises from user-generated data, covering multiple tasks and input methods.
Findings
Models show significant performance drops on noisy data.
Robust training methods have limited effectiveness against realistic noises.
The benchmark highlights the need for improved robustness techniques.
Abstract
For many real-world applications, user-generated inputs usually contain various noises due to speech recognition errors caused by linguistic variations or typographical errors (typos). Thus, it is crucial to test model performance on data with realistic input noises to ensure robustness and fairness. However, little work has been done to construct such benchmarks for Chinese, where various language-specific input noises occur in the real world. To fill this important gap, we construct READIN: a Chinese multi-task benchmark with REalistic And Diverse Input Noises. READIN contains four diverse tasks and asks annotators to re-enter the original test data with two commonly used Chinese input methods: Pinyin input and speech input. We designed our annotation pipeline to maximize diversity, for example by instructing the annotators to use diverse input method editors…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Test
