Serialising the ISO SynAF Syntactic Object Model
Laurent Romary (IDSL, INRIA Saclay - Ile de France), Amir Zeldes, Florian Zipser (IDSL, INRIA Saclay - Ile de France)

TL;DR
This paper presents an XML serialization format for the ISO SynAF syntactic object model, supporting various syntactic phenomena and integrating with other standards, demonstrated through a German Treebank case study.
Contribution
It introduces a comprehensive XML format for SynAF that supports diverse syntactic structures and interfaces with existing standards, enhancing interoperability.
Findings
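Serialises constituent and dependency structures, binding, and node types such as compounds and empty elements
Defines interfaces to the Morpho-syntactic Annotation Framework MAF and the ISOCat Data Category Registry
Applied to the German treebank TueBa-D/Z, handling constituent structures, topological fields and coreference annotation together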
Abstract
This paper introduces an XML format developed to serialise the object model defined by the ISO Syntactic Annotation Framework SynAF. Based on widespread best practices, we adapt a popular XML format for syntactic annotation, TigerXML, with additional features to support a variety of syntactic phenomena, including constituent and dependency structures, binding, and different node types such as compounds or empty elements. We also define interfaces to other formats and standards, including the Morpho-syntactic Annotation Framework MAF and the ISOCat Data Category Registry. Finally, a case study of the German treebank TueBa-D/Z is presented, showcasing the handling of constituent structures, topological fields and coreference annotation in tandem.
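As a rough illustration of the kind of structure such a format encodes, the sketch below reads a simplified TigerXML-style fragment in Python: terminal nodes carry the tokens, non-terminal nodes carry a category, and labelled edges link a non-terminal to its children. The sample sentence, the identifiers, and the helper names `read_graph` and `yield_of` are assumptions made for this example; the fragment follows the general TigerXML layout and is not the extended format described in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TigerXML-style fragment (not taken from the paper).
SAMPLE = """
<s id="s1">
  <graph root="s1_500">
    <terminals>
      <t id="s1_1" word="Peter" pos="NE"/>
      <t id="s1_2" word="lacht" pos="VVFIN"/>
    </terminals>
    <nonterminals>
      <nt id="s1_500" cat="S">
        <edge label="SB" idref="s1_1"/>
        <edge label="HD" idref="s1_2"/>
      </nt>
    </nonterminals>
  </graph>
</s>
"""


def read_graph(xml_text):
    """Return (root id, terminal words by id, labelled child edges by node id)."""
    sentence = ET.fromstring(xml_text)
    graph = sentence.find("graph")
    # Map each terminal id to its word form.
    terminals = {t.get("id"): t.get("word") for t in graph.iter("t")}
    # Map each non-terminal id to its outgoing (label, target id) edges.
    children = {
        nt.get("id"): [(e.get("label"), e.get("idref")) for e in nt.findall("edge")]
        for nt in graph.iter("nt")
    }
    return graph.get("root"), terminals, children


def yield_of(node_id, terminals, children):
    """Collect the words dominated by a node, following edges in document order."""
    if node_id in terminals:
        return [terminals[node_id]]
    words = []
    for _label, child_id in children.get(node_id, []):
        words.extend(yield_of(child_id, terminals, children))
    return words


root_id, terminals, children = read_graph(SAMPLE)
print(root_id, yield_of(root_id, terminals, children))
```

Keeping terminals and non-terminals in separate lists and connecting them through id references, rather than nesting elements, is what lets a graph of this kind hold edges that a strictly nested XML tree could not express.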
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Advanced Database Systems and Queries
