Multilingual Hierarchical Attention Networks for Document Classification

Nikolaos Pappas; Andrei Popescu-Belis

arXiv:1707.00896 · cs.CL · September 18, 2017 · 48 cites

TL;DR

This paper introduces multilingual hierarchical attention networks that share components across languages, enabling effective document classification with fewer parameters and improved transfer, demonstrated on a large multilingual news dataset.

Contribution

The paper proposes a novel multilingual hierarchical attention network architecture with shared encoders and attention mechanisms, enhancing cross-language transfer and efficiency.
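As a rough illustration of what sharing an attention mechanism across languages can look like, the sketch below keeps one encoder per language and reuses a single attention context vector for all of them. It is a simplified stand-in, not the paper's implementation: the dense `tanh` encoder, the dot-product scoring, the hidden size `DIM`, the language codes `en` and `de`, and all function names are assumptions made for this example. Only the word level is shown; a hierarchical model would apply the same pooling again over sentence vectors.

```python
import math
import random

DIM = 4  # hypothetical hidden size, chosen only for this sketch


def make_matrix(rng, rows, cols):
    """Return a small random weight matrix as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode(word_vectors, weight):
    """Language-specific encoder: one dense layer with tanh per word vector."""
    return [
        [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weight]
        for vec in word_vectors
    ]


def attention_pool(states, context):
    """Softmax-weighted average of states, scored against a context vector."""
    scores = [sum(h * c for h, c in zip(state, context)) for state in states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [
        sum(w * state[d] for w, state in zip(weights, states))
        for d in range(len(context))
    ]
    return pooled, weights


rng = random.Random(0)

# Assumption: one encoder per language, one attention context shared by all.
encoders = {"en": make_matrix(rng, DIM, DIM), "de": make_matrix(rng, DIM, DIM)}
shared_context = [rng.uniform(-0.5, 0.5) for _ in range(DIM)]


def represent(lang, word_vectors):
    """Encode with the language's own encoder, then pool with shared attention."""
    states = encode(word_vectors, encoders[lang])
    return attention_pool(states, shared_context)


# Toy inputs standing in for word embeddings from an aligned semantic space.
sentence_en = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(3)]
sentence_de = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]

vec_en, weights_en = represent("en", sentence_en)
vec_de, weights_de = represent("de", sentence_de)
```

Sharing the encoders instead would amount to replacing the `encoders` dictionary with a single matrix used for every language, and sharing both would leave only the classification layers language-specific.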

Findings

01 Multilingual models outperform monolingual models in low-resource settings.

02 Shared models use fewer parameters than separately trained per-language models.

03 Models achieve high accuracy on a large multilingual news dataset (600k documents, 8 languages, 5k labels).
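The parameter saving can be made concrete with a small back-of-the-envelope count. The sketch below is illustrative only: the per-component sizes (100,000 encoder parameters, 10,000 attention parameters) are invented for the example and are not taken from the paper, and the function name `count_parameters` is hypothetical. Classification layers are left out because, with disjoint label sets, they would remain language-specific in every configuration.

```python
def count_parameters(num_languages, encoder_params, attention_params,
                     share_encoders, share_attention):
    """Count encoder and attention parameters across all languages.

    A shared component is counted once; an unshared component is
    counted once per language.
    """
    encoder_copies = 1 if share_encoders else num_languages
    attention_copies = 1 if share_attention else num_languages
    return encoder_params * encoder_copies + attention_params * attention_copies


NUM_LANGUAGES = 8          # the dataset covers 8 languages
ENCODER_PARAMS = 100_000   # hypothetical size, for illustration only
ATTENTION_PARAMS = 10_000  # hypothetical size, for illustration only

separate = count_parameters(NUM_LANGUAGES, ENCODER_PARAMS, ATTENTION_PARAMS,
                            share_encoders=False, share_attention=False)
shared_attention = count_parameters(NUM_LANGUAGES, ENCODER_PARAMS, ATTENTION_PARAMS,
                                    share_encoders=False, share_attention=True)
shared_both = count_parameters(NUM_LANGUAGES, ENCODER_PARAMS, ATTENTION_PARAMS,
                               share_encoders=True, share_attention=True)

print(separate)          # 880000
print(shared_attention)  # 810000
print(shared_both)       # 110000
```

Under these assumed sizes, separate models grow linearly with the number of languages, while a fully shared model stays at the cost of a single language.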

Abstract

Hierarchical attention networks have recently achieved remarkable performance for document classification in a given language. However, when multilingual document collections are considered, training such models separately for each language entails linear parameter growth and lack of cross-language transfer. Learning a single multilingual model with fewer parameters is therefore a challenging but potentially beneficial objective. To this end, we propose multilingual hierarchical attention networks for learning document structures, with shared encoders and/or shared attention mechanisms across languages, using multi-task learning and an aligned semantic space as input. We evaluate the proposed models on multilingual document classification with disjoint label sets, on a large dataset which we provide, with 600k news documents in 8 languages, and 5k labels. The multilingual models…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies