NoReC: The Norwegian Review Corpus

Erik Velldal; Lilja Øvrelid; Eivind Alexander Bergem; Cathrine Stadsnes; Samia Touileb; Fredrik Jørgensen

arXiv:1710.05370·cs.CL·October 17, 2017·1 cites

TL;DR

The paper introduces NoReC, a Norwegian review corpus of more than 35,000 full-text reviews from a range of domains, each labeled with a 1-6 rating, intended for training and evaluating document-level sentiment analysis models for Norwegian.

Contribution

It provides the first large-scale annotated Norwegian review dataset, distributed in the CoNLL-U format with accompanying metadata, to support sentiment analysis research and development.

Findings

01 The first release comprises more than 35,000 reviews

02 Domains covered include literature, movies, video games, restaurants, music, theater, and products

03 The resource supports sentiment analysis and opinion mining for Norwegian

Abstract

This paper presents the Norwegian Review Corpus (NoReC), created for training and evaluating models for document-level sentiment analysis. The full-text reviews have been collected from major Norwegian news sources and cover a range of different domains, including literature, movies, video games, restaurants, music and theater, in addition to product reviews across a range of categories. Each review is labeled with a manually assigned score of 1-6, as provided by the rating of the original author. This first release of the corpus comprises more than 35,000 reviews. It is distributed using the CoNLL-U format, pre-processed using UDPipe, along with a rich set of metadata. The work reported in this paper forms part of the SANT initiative (Sentiment Analysis for Norwegian Text), a project seeking to provide resources and tools for sentiment analysis and opinion mining for Norwegian. As…
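As a rough illustration of the format described in the abstract, the sketch below reads CoNLL-U text into token dictionaries and tallies 1-6 ratings. The ten-column layout follows the general CoNLL-U convention; the sample sentence, the `METADATA` dictionary with its `rating` and `category` keys, and the function names are assumptions made for this example and are not taken from the NoReC release.

```python
from collections import Counter

# The ten tab-separated CoNLL-U columns, in order.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text):
    """Yield each sentence in a CoNLL-U string as a list of token dicts."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # Comment lines (e.g. "# text = ...") are skipped in this sketch.
            continue
        else:
            sentence.append(dict(zip(CONLLU_FIELDS, line.split("\t"))))
    if sentence:
        yield sentence


def rating_distribution(metadata):
    """Count how many documents carry each rating."""
    return Counter(entry["rating"] for entry in metadata.values())


# Hand-written toy sentence, not corpus data.
SAMPLE = (
    "# text = En god film.\n"
    "1\tEn\ten\tDET\t_\t_\t3\tdet\t_\t_\n"
    "2\tgod\tgod\tADJ\t_\t_\t3\tamod\t_\t_\n"
    "3\tfilm\tfilm\tNOUN\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
    "\n"
)

# Hypothetical per-document metadata; the keys are assumed for illustration.
METADATA = {
    "doc-a": {"rating": 5, "category": "screen"},
    "doc-b": {"rating": 2, "category": "music"},
    "doc-c": {"rating": 5, "category": "literature"},
}

if __name__ == "__main__":
    for sent in parse_conllu(SAMPLE):
        print([tok["form"] for tok in sent])
    print(sorted(rating_distribution(METADATA).items()))
```

Keeping the parser as a generator means a large file can be processed sentence by sentence; a real pipeline would likely also handle multiword-token ranges and sentence-level comments, which this sketch ignores.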

Code & Models

Repositories

ltgoslo/norec (Official)

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Computational and Text Analysis Methods