Impact of Subword Pooling Strategy on Cross-lingual Event Detection
Shantanu Agarwal, Steven Fincke, Chris Jenkins, Scott Miller, Elizabeth Boschee

TL;DR
This paper investigates how different subword pooling strategies affect cross-lingual event detection performance, revealing that attention pooling consistently offers robust and near-optimal results across multiple languages and datasets.
Contribution
The paper systematically evaluates the impact of subword pooling strategies on cross-lingual event detection and shows that attention pooling is a more robust choice than conventional strategies such as using only the first subword.
Findings
Performance varies by up to 16 absolute F1 points depending on the pooling strategy (training in English, testing in Arabic on ACE).
Attention pooling is consistently robust and near-optimal across languages.
Using only the first subword is often sub-optimal.
Abstract
Pre-trained multilingual language models (e.g., mBERT, XLM-RoBERTa) have significantly advanced the state of the art for zero-shot cross-lingual information extraction. These language models ubiquitously rely on word segmentation techniques that break a word into smaller constituent subwords. Therefore, all word labeling tasks (e.g., named entity recognition and event detection) necessitate a pooling strategy that takes the subword representations as input and outputs a representation for the entire word. Taking the task of cross-lingual event detection as a motivating example, we show that the choice of pooling strategy can have a significant impact on the target language performance. For example, the performance varies by up to 16 absolute points depending on the pooling strategy when training in English and testing in Arabic on the ACE task. We carry out our analysis with…
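To make the notion of a pooling strategy concrete, the following is a minimal sketch of four common ways to turn the subword vectors of one word into a single word vector. It is an illustration only, not the authors' implementation: the function names, the toy vectors, and the dot-product attention with a single `query` vector are assumptions, and in a real model the query would be learned during training.

```python
import math
from typing import List, Sequence

Vector = List[float]


def first_pool(subwords: Sequence[Vector]) -> Vector:
    """Represent the word by its first subword only."""
    return list(subwords[0])


def mean_pool(subwords: Sequence[Vector]) -> Vector:
    """Average the subword vectors dimension by dimension."""
    n = len(subwords)
    return [sum(dim) / n for dim in zip(*subwords)]


def max_pool(subwords: Sequence[Vector]) -> Vector:
    """Take the element-wise maximum over the subword vectors."""
    return [max(dim) for dim in zip(*subwords)]


def attention_pool(subwords: Sequence[Vector], query: Vector) -> Vector:
    """Weighted average of subword vectors.

    Assumption: weights come from a softmax over dot products with a
    query vector (hypothetical here; learned in a trained model).
    """
    scores = [sum(x * q for x, q in zip(vec, query)) for vec in subwords]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * x for w, x in zip(weights, dim)) for dim in zip(*subwords)]


# Toy example: one word split into three subwords, each a 2-dimensional vector.
subwords = [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
zero_query = [0.0, 0.0]  # all scores equal, so attention reduces to the mean

first_vec = first_pool(subwords)
mean_vec = mean_pool(subwords)
max_vec = max_pool(subwords)
attn_vec = attention_pool(subwords, zero_query)
```

With an all-zero query every subword receives the same weight, so attention pooling coincides with mean pooling; a non-zero query lets the pooled vector emphasize some subwords over others.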
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques
Methods: Attention Pooling · mBERT
