Fine-tuning the SwissBERT Encoder Model for Embedding Sentences and Documents

Juri Grosjean; Jannis Vamvas

arXiv:2405.07513 · cs.CL · May 14, 2024

TL;DR

This paper presents SentenceSwissBERT, a version of the SwissBERT encoder fine-tuned for embedding sentences and short documents. In multilingual, Switzerland-specific experiments on document retrieval and text classification, it is more accurate than the original SwissBERT and a comparable baseline.

Contribution

The authors fine-tuned SwissBERT with contrastive learning on a subset of the news articles used for its pre-training, so that it produces sentence and document embeddings for German, French, Italian, and Romansh.
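
The summary on this page does not spell out the exact training objective, so the following is only a minimal sketch of one common form of contrastive learning: an in-batch, InfoNCE-style loss. The function names, the temperature value, and the toy vectors are assumptions made for illustration and are not taken from the paper. Each anchor embedding is pulled towards its own positive and pushed away from the other positives in the batch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(anchors, positives, temperature=0.05):
    """Mean cross-entropy where positives[i] is the match for anchors[i].

    All other positives[j] in the batch act as negatives for anchors[i].
    The temperature is an assumed hyperparameter, not a value from the paper.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


# Toy 2-d "embeddings" standing in for encoder outputs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = in_batch_contrastive_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_contrastive_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

With correctly paired vectors the loss is close to zero, while swapping the pairs makes it large, which is the signal a fine-tuning run of this kind would minimize.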

Findings

01

SentenceSwissBERT outperforms the original SwissBERT and a comparable baseline in document retrieval.

02

The model also improves text classification accuracy in a multilingual, Switzerland-specific setting.

03

The model is openly available for research use.

Abstract

Encoder models trained for the embedding of sentences or short documents have proven useful for tasks such as semantic search and topic modeling. In this paper, we present a version of the SwissBERT encoder model that we specifically fine-tuned for this purpose. SwissBERT contains language adapters for the four national languages of Switzerland (German, French, Italian, and Romansh) and has been pre-trained on a large number of news articles in those languages. Using contrastive learning based on a subset of these articles, we trained a fine-tuned version, which we call SentenceSwissBERT. Multilingual experiments on document retrieval and text classification in a Switzerland-specific setting show that SentenceSwissBERT surpasses the accuracy of the original SwissBERT model and of a comparable baseline. The model is openly available for research use.
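
To illustrate how sentence or document embeddings support semantic search and document retrieval, here is a small sketch that ranks documents by cosine similarity to a query. The vectors are hand-written stand-ins for encoder outputs, and `rank_documents` is a hypothetical helper, not part of the released model or of any library. In practice the embeddings would come from the fine-tuned encoder.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    q = normalize(query_vec)
    scores = [
        sum(a * b for a, b in zip(q, normalize(d)))
        for d in doc_vecs
    ]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy 3-d vectors; real embeddings would have hundreds of dimensions.
docs = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.2],
    [0.1, 0.0, 1.0],
]
query = [1.0, 0.2, 0.0]

ranking = rank_documents(query, docs)
print(ranking)  # → [0, 1, 2]
```

Because the query and documents only need to share an embedding space, the same ranking step works when they are written in different languages, provided the encoder maps them to comparable vectors.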

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Contrastive Learning