The Open Language Archives Community and Asian Language Resources
Steven Bird, Gary Simons, Chu-Ren Huang

TL;DR
This paper introduces the Open Language Archives Community (OLAC), a worldwide system of federated language archives, and presents it to the language resources community in Asia so that language resources and tools can be documented and discovered more easily.
Contribution
It describes OLAC and the OLAC Metadata Set, then discusses two key issues in the Asian context: language classification and multilingual resource classification.
Findings
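OLAC aims to make language resources discoverable worldwide through a system of federated archives.
The OLAC Metadata Set, based on Dublin Core, gives language technologists and linguists a common way to document their tools and data.
Language classification and multilingual resource classification are key issues in the Asian context.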
Abstract
The Open Language Archives Community (OLAC) is a new project to build a worldwide system of federated language archives based on the Open Archives Initiative and the Dublin Core Metadata Initiative. This paper aims to disseminate the OLAC vision to the language resources community in Asia, and to show language technologists and linguists how they can document their tools and data in such a way that others can easily discover them. We describe OLAC and the OLAC Metadata Set, then discuss two key issues in the Asian context: language classification and multilingual resource classification.
Taxonomy
Topics: Natural Language Processing Techniques
