The Catalan Language CLUB
Carlos Rodriguez-Penagos, Carme Armentano-Oller, Marta Villegas, Maite Melero, Aitor Gonzalez, Ona de Gibert Bonet, and Casimiro Carrino Pio

TL;DR
The paper introduces the Catalan Language Understanding Benchmark (CLUB), a GLUE-style suite of datasets for evaluating Catalan language models across multiple NLU tasks, supporting AI development for Catalan through standardized evaluation.
Contribution
It presents the first dedicated Catalan language understanding benchmark, enabling consistent evaluation of models on diverse NLU tasks.
Findings
Provides a suite of datasets for Catalan NLU tasks
Enables standardized evaluation of Catalan language models
Supports AI development for Catalan language
Abstract
The Catalan Language Understanding Benchmark (CLUB) encompasses various datasets representative of different NLU tasks that enable accurate evaluations of language models, following the General Language Understanding Evaluation (GLUE) example. It is part of AINA and PlanTL, two public funding initiatives to empower the Catalan language in the Artificial Intelligence era.
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling
