Language Independent Acquisition of Abbreviations
Michael R. Glass, Md Faisal Mahbub Chowdhury, Alfio M. Gliozzo

TL;DR
This paper presents a multilingual approach to automatic abbreviation extraction and expansion that leverages Wikipedia data. It also introduces a new multilingual resource for evaluation and reports improved performance across languages.
Contribution
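It introduces a multilingual resource of abbreviations and their expansions, together with a machine learning method for scoring expansion candidates. The resource addresses a limitation of previous work, which used only redirect pages and therefore gave every abbreviation a single expansion.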
Findings
Improved abbreviation expansion performance across seven languages.
Effective use of Wikipedia redirect and disambiguation pages for resource creation.
Enhanced performance in non-Latin script languages.
Abstract
This paper addresses automatic extraction of abbreviations (encompassing acronyms and initialisms) and corresponding long-form expansions from plain unstructured text. We create and are going to release a multilingual resource for abbreviations and their corresponding expansions, built automatically by exploiting Wikipedia redirect and disambiguation pages, that can be used as a benchmark for evaluation. We address a shortcoming of previous work where only the redirect pages were used, and so every abbreviation had only a single expansion, even though multiple different expansions are possible for many of the abbreviations. We also develop a principled machine learning based approach to scoring expansion candidates using different techniques such as indicators of near synonymy, topical relatedness, and surface similarity. We show improved performance over seven languages, including two…
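The idea of keeping several expansions per abbreviation, filtered by a surface-similarity signal, can be illustrated with a minimal sketch. This is not the authors' method: the subsequence-of-initials test below is a simple stand-in for one possible surface-similarity indicator, and the function names and sample pairs are hypothetical.

```python
from collections import defaultdict


def is_subsequence(short: str, long: str) -> bool:
    """Return True if the characters of `short` appear in `long` in order."""
    remaining = iter(long)
    return all(ch in remaining for ch in short)


def surface_match(abbreviation: str, expansion: str) -> bool:
    """Hypothetical surface-similarity check (an assumption, not from the paper):
    the abbreviation's letters must be a subsequence of the initials of the
    expansion's words, ignoring case."""
    initials = "".join(word[0] for word in expansion.split())
    return is_subsequence(abbreviation.casefold(), initials.casefold())


def build_resource(pairs):
    """Group (abbreviation, expansion) pairs so that one abbreviation can keep
    several expansions instead of a single one."""
    resource = defaultdict(set)
    for abbreviation, expansion in pairs:
        if surface_match(abbreviation, expansion):
            resource[abbreviation].add(expansion)
    return dict(resource)


# Hypothetical candidate pairs, standing in for entries that might be
# harvested from redirect and disambiguation pages.
candidate_pairs = [
    ("ACL", "Association for Computational Linguistics"),
    ("ACL", "Access Control List"),
    ("ACL", "anterior cruciate ligament"),
    ("ACL", "Los Angeles"),  # fails the surface check and is dropped
]

resource = build_resource(candidate_pairs)
```

In this sketch a single abbreviation maps to a set of expansions, which is the property the abstract says was missing when only redirect pages were used. A learned scorer, as described in the abstract, would combine a signal like this with indicators of near synonymy and topical relatedness rather than apply it as a hard filter.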
Taxonomy
Topics: Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
