NorNE: Annotating Named Entities for Norwegian

Fredrik Jørgensen; Tobias Aasmoe; Anne-Stine Ruud Husevåg; Lilja Øvrelid; Erik Velldal

arXiv:1911.12146 · cs.CL · March 9, 2020 · 1 citation

TL;DR

This paper introduces NorNE, a comprehensive manually annotated corpus of Norwegian named entities covering Bokmål and Nynorsk, designed to support NLP tasks with detailed entity annotations and an analysis of annotation quality and neural model performance.

Contribution

The paper presents NorNE, the first extensive Norwegian named entity corpus with detailed annotations for multiple entity types and an evaluation of neural sequence labeling methods.

Findings

01 High inter-annotator agreement achieved

02 Effective neural models demonstrated on the corpus

03 Rich set of entity annotations enhances Norwegian NLP resources

Abstract

This paper presents NorNE, a manually annotated corpus of named entities which extends the annotation of the existing Norwegian Dependency Treebank. Comprising both of the official standards of written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000 tokens and annotates a rich set of entity types including persons, organizations, locations, geo-political entities, products, and events, in addition to a class corresponding to nominals derived from names. We here present details on the annotation effort, guidelines, inter-annotator agreement and an experimental analysis of the corpus using a neural sequence labeling architecture.
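As a rough illustration of what token-level entity annotation and sequence labeling involve, the sketch below decodes per-token tags into entity spans. It assumes a BIO tagging scheme and uses hypothetical label strings (`PER`, `GPE`) for two of the categories named in the abstract; the abstract does not state the actual tag names or encoding used in the corpus files, and the example sentence is invented.

```python
from typing import List, Tuple

# A span is (label, start_index, end_index_exclusive, surface_text).
Span = Tuple[str, int, int, str]


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Span]:
    """Decode BIO tags into entity spans.

    Assumption: tags look like "B-PER", "I-PER", or "O". The label strings
    are illustrative only and are not taken from the NorNE data files.
    """
    spans: List[Span] = []
    start = None  # index where the currently open span begins
    label = None  # label of the currently open span

    for i, tag in enumerate(tags):
        # The open span continues only on an "I-" tag with the same label.
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((label, start, i, " ".join(tokens[start:i])))
            start, label = None, None
        # Open a new span on "B-", or leniently on a stray "I-".
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]

    # Close a span that runs to the end of the sentence.
    if start is not None:
        spans.append((label, start, len(tags), " ".join(tokens[start:])))
    return spans


# Invented example sentence ("Kari Nordmann visited Oslo.").
tokens = ["Kari", "Nordmann", "besøkte", "Oslo", "."]
tags = ["B-PER", "I-PER", "O", "B-GPE", "O"]
print(bio_to_spans(tokens, tags))
```

A sequence labeling model predicts one such tag per token, and a decoder of this kind turns the predictions back into entity mentions that can be compared against the manual annotations.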

Code & Models

Repositories

ltgoslo/norne
none · Official

Models

saattrupdan/nbailab-base-ner-scandi
model · 61k dl · ♡ 24

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Biomedical Text Mining and Ontologies