Verb Knowledge Injection for Multilingual Event Processing

Olga Majewska; Ivan Vulić; Goran Glavaš; Edoardo M. Ponti; Anna Korhonen

arXiv:2012.15421 · cs.CL · October 19, 2021

TL;DR

This paper enhances multilingual event extraction by injecting explicit verb knowledge into Transformer models, improving their understanding and transferability across languages through dedicated adapter modules.

Contribution

It introduces verb adapters that incorporate curated lexical verb knowledge into multilingual Transformers, boosting event extraction performance and cross-lingual transfer capabilities.
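To make the adapter idea concrete, here is a minimal sketch of a generic bottleneck adapter applied to a single hidden state. This summary does not specify the architecture of the verb adapters, so the design below (down-projection, GELU, up-projection, residual connection, zero-initialised up-projection) is an assumption based on common adapter layouts, and the name `BottleneckAdapter` and its parameters are hypothetical.

```python
import math
import random
from typing import List


def gelu(x: float) -> float:
    """Exact GELU, computed with the Gaussian error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


class BottleneckAdapter:
    """Hypothetical adapter: down-project, nonlinearity, up-project, residual add.

    Assumption: this mirrors a common adapter design, not necessarily the
    exact module used in the paper.
    """

    def __init__(self, hidden_size: int, bottleneck_size: int, seed: int = 0):
        rng = random.Random(seed)
        # Small random down-projection weights (hidden_size -> bottleneck_size).
        self.w_down = [
            [rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
            for _ in range(bottleneck_size)
        ]
        # Zero-initialised up-projection, so the untrained adapter is the identity.
        self.w_up = [[0.0] * bottleneck_size for _ in range(hidden_size)]

    def forward(self, hidden: List[float]) -> List[float]:
        # Compress the hidden state into the bottleneck and apply GELU.
        down = [gelu(sum(w * h for w, h in zip(row, hidden))) for row in self.w_down]
        # Expand back to the hidden size.
        up = [sum(w * d for w, d in zip(row, down)) for row in self.w_up]
        # Residual connection: the adapter only adds a correction to the input.
        return [h + u for h, u in zip(hidden, up)]


adapter = BottleneckAdapter(hidden_size=8, bottleneck_size=2)
state = [0.1 * i for i in range(8)]
output = adapter.forward(state)
```

In a typical adapter setup, only weights such as `w_down` and `w_up` are trained while the pretrained Transformer stays frozen, which is what lets the added knowledge live in a separate, swappable module.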

Findings

01

Verb knowledge injection improves English event extraction accuracy.

02

Multilingual verb adapters enhance zero-shot transfer in other languages.

03

Verb knowledge injection still helps when the lexical constraints are obtained through noisy automatic translation.

Abstract

In parallel to their overwhelming success across NLP tasks, the language ability of deep Transformer networks, pretrained via language modeling (LM) objectives, has undergone extensive scrutiny. While probing revealed that these models encode a range of syntactic and semantic properties of a language, they are still prone to fall back on superficial cues and simple heuristics to solve downstream tasks, rather than leverage deeper linguistic knowledge. In this paper, we target one such area of their deficiency, verbal reasoning. We investigate whether injecting explicit information on verbs' semantic-syntactic behaviour improves the performance of LM-pretrained Transformers in event extraction tasks -- downstream tasks for which accurate verb processing is paramount. Concretely, we impart the verb knowledge from curated lexical resources into dedicated adapter modules (dubbed verb adapters),…

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Multi-Head Attention · Dropout · Softmax · Dense Connections · Label Smoothing · Attention Is All You Need