The Open Language Archives Community and Asian Language Resources

Steven Bird; Gary Simons; Chu-Ren Huang

arXiv:cs/0110014 · cs.CL · May 23, 2007 · 3 cites

TL;DR

This paper introduces the Open Language Archives Community (OLAC), a worldwide system of federated language archives, and presents it to the language resources community in Asia so that tools and data can be documented in a way that makes them easy for others to discover.

Contribution

It describes OLAC and the OLAC Metadata Set, then discusses two key issues in the Asian context: language classification and multilingual resource classification.

Findings

01. OLAC facilitates discovery of language resources globally.

02. Metadata standards improve resource documentation.

03. Asian language classification presents unique challenges.

Abstract

The Open Language Archives Community (OLAC) is a new project to build a worldwide system of federated language archives based on the Open Archives Initiative and the Dublin Core Metadata Initiative. This paper aims to disseminate the OLAC vision to the language resources community in Asia, and to show language technologists and linguists how they can document their tools and data in such a way that others can easily discover them. We describe OLAC and the OLAC Metadata Set, then discuss two key issues in the Asian context: language classification and multilingual resource classification.
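
To make the idea of documenting a resource for discovery more concrete, the following is a minimal sketch of a flat Dublin Core-style description built with Python's standard library. It is an illustration only, not the OLAC Metadata Set itself: the `record` wrapper element, the `build_record` helper, and every field value are assumptions made for this example, and OLAC-specific elements and refinements are not shown. Only the Dublin Core element namespace and the element names `title`, `creator`, `language`, and `subject` come from the Dublin Core standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build a flat record from (element_name, value) pairs.

    The <record> wrapper is a hypothetical container chosen for this
    sketch; it is not an element defined by OLAC or Dublin Core.
    """
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Hypothetical resource description; all values are invented.
example_fields = [
    ("title", "Example Mandarin Word List"),
    ("creator", "Example Archive"),
    ("language", "zh"),      # placeholder language code
    ("subject", "lexicon"),  # placeholder subject term
]

record = build_record(example_fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A service that harvests descriptions like this from many archives could index them by fields such as `language`, which is roughly the kind of discovery the abstract describes. How a single code such as `zh` should identify a language, and how a resource covering several languages should be described, are the classification questions the paper raises for the Asian context.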

Taxonomy

Topics: Natural Language Processing Techniques