The National Corpus of Contemporary Welsh: Project Report | Y Corpws Cenedlaethol Cymraeg Cyfoes: Adroddiad y Prosiect
Dawn Knight, Steve Morris, Tess Fitzpatrick, Paul Rayson, Irena Spasić, Enlli Môn Thomas

TL;DR
This paper reports on the development of the CorCenCC online corpus resource for contemporary Welsh, discussing its theoretical basis, operational decisions, and potential applications for various user groups.
Contribution
It introduces a comprehensive Welsh language corpus built on a solid theoretical foundation, addressing operational challenges and demonstrating its utility for linguistic research and language preservation.
Findings
Development of a large, accessible Welsh corpus
Operational strategies for corpus-building discussed
Potential applications in linguistic research and language policy
Abstract
This report provides an overview of the CorCenCC project and the online corpus resource that was developed as a result of work on the project. The report lays out the theoretical underpinnings of the research, demonstrating how the project has built on and extended this theory. We also raise and discuss some of the key operational questions that arose during the course of the project, outlining the ways in which they were answered, the impact of these decisions on the resource that has been produced and the longer-term contribution they will make to practices in corpus-building. Finally, we discuss some of the applications and the utility of the work, outlining the impact that CorCenCC is set to have on a range of different individuals and user groups.
Taxonomy
Topics: Digital Communication and Language · Linguistic Variation and Morphology · Linguistics, Language Diversity, and Identity
