Correcting Errors in Digital Lexicographic Resources Using a Dictionary Manipulation Language
David Zajic, Michael Maxwell, David Doermann, Paul Rodrigues, and Michael Bloodgood

TL;DR
This paper introduces a paradigm that combines manual and automatic error correction of digital lexicographic resources. Corrections are expressed in a simple, interpreted programming language called Dictionary Manipulation Language (DML) and applied as command sequences to the source version of the lexicon.
Contribution
It presents DML, a language for expressing corrections to the structure and text of noisy structured lexicographic data, so that both hand-written and automatically generated fixes share one representation.
Findings
DML commands identify nodes by unique identifiers and manipulate them with simple commands such as create, move, and set text.
DML commands can be generated automatically to correct recurring errors.
DML commands can be written manually to repair one-off errors.
Abstract
We describe a paradigm for combining manual and automatic error correction of noisy structured lexicographic data. Modifications to the structure and underlying text of the lexicographic data are expressed in a simple, interpreted programming language. Dictionary Manipulation Language (DML) commands identify nodes by unique identifiers, and manipulations are performed using simple commands such as create, move, set text, etc. Corrected lexicons are produced by applying sequences of DML commands to the source version of the lexicon. DML commands can be written manually to repair one-off errors or generated automatically to correct recurring problems. We discuss advantages of the paradigm for the task of editing digital bilingual dictionaries.
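
To make the paradigm concrete, here is a minimal sketch in Python of how a sequence of DML-like commands might be applied to a lexicon. The abstract does not give DML's concrete syntax, so everything here is an assumption made for illustration: the tuple encoding of commands, the names `Lexicon`, `apply_commands`, and `generate_cleanup_commands`, the node identifiers, and the sample entry. Only the three operations (create, move, set text) and the idea of addressing nodes by unique identifier come from the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One node of the lexicon tree, addressed by a unique identifier."""
    node_id: str
    tag: str
    text: str = ""
    parent: Optional[str] = None
    children: List[str] = field(default_factory=list)


class Lexicon:
    """Hypothetical in-memory lexicon; not the authors' data model."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def create(self, node_id: str, tag: str, parent_id: Optional[str] = None) -> None:
        if node_id in self.nodes:
            raise ValueError(f"duplicate node id: {node_id}")
        self.nodes[node_id] = Node(node_id, tag, parent=parent_id)
        if parent_id is not None:
            self.nodes[parent_id].children.append(node_id)

    def move(self, node_id: str, new_parent_id: str) -> None:
        node = self.nodes[node_id]
        if node.parent is not None:
            self.nodes[node.parent].children.remove(node_id)
        node.parent = new_parent_id
        self.nodes[new_parent_id].children.append(node_id)

    def set_text(self, node_id: str, text: str) -> None:
        self.nodes[node_id].text = text


Command = Tuple[str, ...]


def apply_commands(lexicon: Lexicon, commands: List[Command]) -> None:
    """Interpret commands in order; each is (operation, *arguments)."""
    dispatch = {
        "create": lexicon.create,
        "move": lexicon.move,
        "set_text": lexicon.set_text,
    }
    for op, *args in commands:
        dispatch[op](*args)


def generate_cleanup_commands(lexicon: Lexicon) -> List[Command]:
    """Automatically emit set_text commands for one recurring error:
    stray spaces or semicolons at the edges of a node's text
    (an invented example of a recurring problem)."""
    commands: List[Command] = []
    for node in lexicon.nodes.values():
        cleaned = node.text.strip(" ;")
        if cleaned != node.text:
            commands.append(("set_text", node.node_id, cleaned))
    return commands


# Source version of a tiny, noisy entry (invented data).
lex = Lexicon()
apply_commands(lex, [
    ("create", "e1", "entry"),
    ("create", "h1", "headword", "e1"),
    ("set_text", "h1", "kitab ;"),
    ("create", "s1", "sense", "e1"),
    ("create", "t1", "translation", "e1"),  # wrongly attached to the entry
    ("set_text", "t1", "book"),
])

# Manual, one-off repair: the translation belongs under the sense.
manual_fix: List[Command] = [("move", "t1", "s1")]

# Automatic repair of the recurring problem, generated from the data.
auto_fix = generate_cleanup_commands(lex)

apply_commands(lex, manual_fix + auto_fix)
```

Because both kinds of repair end up as the same kind of command list, they can be stored, reviewed, and replayed against the source lexicon together. A fuller implementation would also need to validate commands, for example rejecting a move that places a node under one of its own descendants.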
Taxonomy
Topics: Natural Language Processing Techniques · Lexicography and Language Studies · Mathematics, Computing, and Information Processing
