Ontology-Based Annotation of Multimedia Language Data for the Semantic Web
Artem Chebotko, Shiyong Lu, Farshad Fotouhi, Anthony Aristar

TL;DR
This paper presents OntoELAN, a tool that uses ontologies to annotate multimedia language data, enhancing searchability and preservation of endangered languages by making implicit content explicit for computational use.
Contribution
It introduces an ontology-based annotation system for multimedia language data, facilitating structured, searchable metadata for endangered language preservation.
Findings
Enables structured annotation of multimedia language data
Improves searchability and cataloging of language resources
Supports preservation efforts for endangered languages
Abstract
There is increasing interest and effort in preserving and documenting endangered languages. Language data are valuable only when they are well-cataloged, indexed, and searchable. Many language data, particularly those of lesser-spoken languages, are collected as audio and video recordings. While multimedia data provide more channels and dimensions to describe a language's function, and give a better presentation of the cultural system associated with the language of that community, they are in binary format, neither text-based nor structured, and their semantics are implicit in their content. The content is thus easy for a human being to understand, but difficult for computers to interpret. Hence, there is a great need for a powerful and user-friendly system to annotate multimedia data with text-based, well-structured and searchable metadata. This chapter describes an ontology-based…
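To make the idea of text-based, well-structured, searchable metadata concrete, here is a minimal sketch in Python. It is not OntoELAN's data model: the `Annotation` class, the `find_by_term` helper, the `ex:` term identifiers, and the sample values are all hypothetical, chosen only to show how a time-aligned annotation tagged with an ontology term makes a recording queryable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One time-aligned annotation on a recording (hypothetical structure)."""
    start_ms: int       # start of the annotated segment, in milliseconds
    end_ms: int         # end of the annotated segment, in milliseconds
    text: str           # free-text transcription or gloss
    ontology_term: str  # identifier of an ontology class (assumed naming)


def find_by_term(annotations, term):
    """Return the annotations tagged with the given ontology term."""
    return [a for a in annotations if a.ontology_term == term]


# Placeholder annotations for a single recording; values are illustrative only.
annotations = [
    Annotation(0, 1200, "segment one", "ex:Noun"),
    Annotation(1200, 2500, "segment two", "ex:Verb"),
    Annotation(2500, 3100, "segment three", "ex:Noun"),
]

# The binary media itself is not searchable, but its annotations are.
nouns = find_by_term(annotations, "ex:Noun")
```

Because each annotation carries a term from a shared vocabulary rather than free text alone, the same query could in principle be run across recordings annotated by different people.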
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Music and Audio Processing
