MELO: An Evaluation Benchmark for Multilingual Entity Linking of Occupations

Federico Retyk; Luis Gasco; Casimiro Pio Carrino; Daniel Deniz; Rabih Zbib

arXiv:2410.08319 · cs.CL · October 14, 2024

TL;DR

MELO is a comprehensive multilingual benchmark with 48 datasets for evaluating entity linking to the ESCO Occupations taxonomy across 21 languages, facilitating future research in multilingual occupation entity linking.

Contribution

The paper introduces MELO, a new multilingual benchmark dataset for entity linking of occupations, along with baseline experiments using lexical models and sentence encoders.

Findings

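1. Baseline lexical models and general-purpose sentence encoders, evaluated as bi-encoders in a zero-shot setup, provide initial performance figures.

2. MELO covers 21 languages and is built from high-quality, pre-existing human annotations.

3. The datasets and the source code for standardized evaluation are publicly available to support future research.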

Abstract

We present the Multilingual Entity Linking of Occupations (MELO) Benchmark, a new collection of 48 datasets for evaluating the linking of entity mentions in 21 languages to the ESCO Occupations multilingual taxonomy. MELO was built using high-quality, pre-existent human annotations. We conduct experiments with simple lexical models and general-purpose sentence encoders, evaluated as bi-encoders in a zero-shot setup, to establish baselines for future research. The datasets and source code for standardized evaluation are publicly available at https://github.com/Avature/melo-benchmark
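
The bi-encoder, zero-shot setup described in the abstract can be illustrated with a minimal sketch: the mention and every candidate occupation name are encoded independently, and candidates are ranked by similarity to the mention. The sketch below is not the benchmark's code. It assumes a character-trigram encoder as a stand-in for a simple lexical model, uses mean reciprocal rank as an illustrative metric (the abstract does not name the metrics used), and the entity identifiers and names are hypothetical rather than real ESCO entries.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Encode a string as a bag of character n-grams (assumed lexical encoder)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_entities(mention, entities, encode=char_ngrams):
    """Rank entity ids by similarity to the mention; each side is encoded on its own."""
    query_vec = encode(mention)
    scored = [(cosine(query_vec, encode(name)), eid) for eid, name in entities.items()]
    # Sort by descending score, breaking ties by entity id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [eid for _, eid in scored]


def mean_reciprocal_rank(queries, entities):
    """Average of 1 / rank of the gold entity over all (mention, gold_id) pairs."""
    total = 0.0
    for mention, gold in queries:
        ranking = rank_entities(mention, entities)
        total += 1.0 / (ranking.index(gold) + 1)
    return total / len(queries)


# Hypothetical toy taxonomy and queries, for illustration only.
entities = {
    "occ-1": "software developer",
    "occ-2": "registered nurse",
    "occ-3": "primary school teacher",
}
queries = [
    ("senior software developer", "occ-1"),
    ("school teacher", "occ-3"),
]

if __name__ == "__main__":
    print(rank_entities("senior software developer", entities))
    print(mean_reciprocal_rank(queries, entities))
```

Swapping `char_ngrams` for a function that returns dense sentence-encoder embeddings (together with a matching similarity function) would give the sentence-encoder variant of the same setup. No training on the target data is involved in either case, which is what makes the evaluation zero-shot.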

Code & Models

Repositories

avature/melo-benchmark (official)

Taxonomy

Topics: Occupational Health and Safety Research