Exploring Swedish & English fastText Embeddings for NER with the Transformer

Tosin P. Adewumi; Foteini Liwicki; Marcus Liwicki

arXiv:2007.16007 · cs.CL · April 20, 2021 · 1 citation

TL;DR

This study demonstrates that well-trained embeddings from relatively smaller corpora can outperform embeddings from larger corpora on NER with the Transformer, and it introduces a new Swedish analogy test set to aid future research.

Contribution

It shows that smaller corpora can produce effective embeddings for NER and provides a new Swedish analogy test set for benchmarking.

Findings

01 Embeddings trained on relatively smaller corpora can outperform those trained on larger corpora in NER tasks.

02 Character n-grams improve Swedish embedding performance.

03 Swedish and English embeddings trained on less data achieve better results on the downstream NER task.
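
Finding 02 refers to the character n-grams that fastText uses to build subword-aware word vectors. A minimal sketch of how such n-grams can be extracted follows, assuming fastText's usual boundary markers `<` and `>` and its default n-gram lengths of 3 to 6. The function name `char_ngrams` and the example word are illustrative and are not taken from the paper or its repository.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return fastText-style character n-grams of `word`.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    are distinguishable from word-internal character sequences.
    """
    token = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the marked token.
        for i in range(len(token) - n + 1):
            grams.append(token[i:i + n])
    return grams


# Trigrams only, for the English word "where".
trigrams = char_ngrams("where", min_n=3, max_n=3)
print(trigrams)  # ['<wh', 'whe', 'her', 'ere', 're>']

# With the default range (3 to 6), the same word yields more subwords.
all_grams = char_ngrams("where")
print(len(all_grams))  # 14
```

In fastText, a word's vector is formed from the vectors of these n-grams together with a vector for the whole word, which is one plausible reason subword information helps with morphologically rich vocabulary such as Swedish compounds.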

Abstract

Our main contributions in this paper are twofold: we show that embeddings from relatively smaller corpora can outperform ones from larger corpora, and we make a new Swedish analogy test set publicly available. To achieve good network performance in natural language processing (NLP) downstream tasks, several factors play important roles: dataset size, the right hyper-parameters, and well-trained embeddings. We show that, with the right set of hyper-parameters, good network performance can be reached even on smaller datasets. We evaluate the embeddings at both the intrinsic and extrinsic levels. The embeddings are deployed with the Transformer in a named entity recognition (NER) task, and significance tests are conducted. This is done for both Swedish and English. We obtain better performance in both languages on the downstream task with smaller training data, compared to recently released, Common Crawl…
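
The intrinsic evaluation mentioned in the abstract relies on an analogy test set. The abstract does not describe the scoring procedure, so the sketch below assumes the common vector-offset approach (often called 3CosAdd): for a question "a is to b as c is to ?", pick the vocabulary word whose vector is closest in cosine similarity to b - a + c, excluding the three question words. The toy vectors and the helper names `solve_analogy` and `analogy_accuracy` are hypothetical and are not the paper's embeddings or code.

```python
import math

# Hand-made toy vectors (hypothetical; for illustration only).
TOY_VECTORS = {
    "man":   [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king":  [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.0, 1.0],
    "car":   [-1.0, 0.0, 0.0],
}


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method."""
    va, vb, vc = unit(vectors[a]), unit(vectors[b]), unit(vectors[c])
    target = unit([y - x + z for x, y, z in zip(va, vb, vc)])
    best_word, best_score = None, -math.inf
    for word, vec in vectors.items():
        if word in (a, b, c):  # question words are not valid answers
            continue
        score = dot(unit(vec), target)  # cosine similarity
        if score > best_score:
            best_word, best_score = word, score
    return best_word


def analogy_accuracy(vectors, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(solve_analogy(vectors, a, b, c) == d for a, b, c, d in questions)
    return correct / len(questions)


questions = [
    ("man", "woman", "king", "queen"),
    ("king", "queen", "man", "woman"),
]
print(solve_analogy(TOY_VECTORS, "man", "woman", "king"))  # queen
print(analogy_accuracy(TOY_VECTORS, questions))            # 1.0
```

With real embeddings, the same loop would run over every line of an analogy test file, and questions containing out-of-vocabulary words would need a policy (skip or count as wrong) that this sketch does not address.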

Code & Models

Repositories

tosingithub/tdesk (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Attention Is All You Need · Label Smoothing · Adam · Dropout · Multi-Head Attention · Softmax