TL;DR
This paper introduces the first Vietnamese semantic role labeling (SRL) corpus and system, combining a novel constituent extraction algorithm with distributed word features, two statistical classifiers, and integer linear programming inference to reach an F1 score of 74.77%.
Contribution
It presents Vietnamese PropBank, the first Vietnamese SRL corpus, together with a novel constituent extraction algorithm for argument candidate identification and a software system that integrates it with a machine learning framework.
Findings
Achieved an F1 score of 74.77% on Vietnamese SRL.
Developed the first Vietnamese SRL corpus and software system.
Proposed a new constituent extraction algorithm for argument identification.
Abstract
In this paper, we study semantic role labelling (SRL), a subtask of semantic parsing of natural language sentences, and its application to the Vietnamese language. We present our effort in building Vietnamese PropBank, the first Vietnamese SRL corpus, and a software system for labelling semantic roles in Vietnamese text. In particular, we present a novel constituent extraction algorithm for the argument candidate identification step, which is more suitable and more accurate than the common node-mapping method. In the machine learning part, our system integrates distributed word features produced by two recent unsupervised learning models into two learned statistical classifiers and uses an integer linear programming inference procedure to improve accuracy. The system is evaluated in a series of experiments and achieves a good result, an F1 score of 74.77%. Our system, including…
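For readers unfamiliar with how a figure such as the 74.77% F1 score is typically obtained in SRL, the following is a minimal sketch. It assumes exact-match scoring of labeled argument spans, which is the common convention; the abstract does not state the paper's precise evaluation protocol. The `Argument` tuple, the role labels, and the toy data are hypothetical and are not taken from Vietnamese PropBank.

```python
from typing import NamedTuple, Set, Tuple


class Argument(NamedTuple):
    """One labeled argument (hypothetical representation, not the paper's format)."""
    predicate_index: int  # token index of the predicate
    start: int            # first token of the argument span (inclusive)
    end: int              # last token of the argument span (inclusive)
    role: str             # illustrative role label, e.g. "Arg0"


def precision_recall_f1(
    gold: Set[Argument], predicted: Set[Argument]
) -> Tuple[float, float, float]:
    """Score predictions against gold arguments with exact span and role match."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (assumed data): the second predicted span has a wrong boundary,
# so it counts as both a false positive and a false negative.
gold = {
    Argument(2, 0, 1, "Arg0"),
    Argument(2, 3, 5, "Arg1"),
    Argument(2, 6, 7, "ArgM-TMP"),
}
predicted = {
    Argument(2, 0, 1, "Arg0"),
    Argument(2, 3, 4, "Arg1"),
    Argument(2, 6, 7, "ArgM-TMP"),
}

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"P={p:.4f} R={r:.4f} F1={f1:.4f}")
```

Under exact-match scoring, a span with even one misplaced boundary earns no credit, which is why the accuracy of the argument candidate identification step directly affects the final score.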
