Approaches to Semantic Textual Similarity in Slovak Language: From Algorithms to Transformers

Lukas Radosky; Miroslav Blstak; Matej Krajcovic; Ivan Polasek

arXiv:2602.04659·cs.CL·February 5, 2026

TL;DR

This paper evaluates various semantic textual similarity methods for Slovak, comparing traditional algorithms, machine learning models, and deep learning tools, highlighting their strengths and trade-offs.

Contribution

It provides a comprehensive comparison of STS approaches for Slovak, including a novel use of artificial bee colony optimization to jointly guide feature selection and hyperparameter tuning.

Findings

01. Traditional algorithms have limited accuracy.

02. Deep learning models outperform classical methods.

03. Optimization improves machine learning model performance.

Abstract

Semantic textual similarity (STS) plays a crucial role in many natural language processing tasks. While it has been studied extensively in high-resource languages, STS remains challenging for under-resourced languages such as Slovak. This paper presents a comparative evaluation of sentence-level STS methods applied to Slovak, including traditional algorithms, supervised machine learning models, and third-party deep learning tools. We trained several machine learning models using the outputs of traditional algorithms as features, with feature selection and hyperparameter tuning jointly guided by artificial bee colony optimization. Finally, we evaluated several third-party tools, including a fine-tuned model by CloudNLP, OpenAI's embedding models, the GPT-4 model, and the pretrained SlovakBERT model. Our findings highlight the trade-offs between the different approaches.
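
To illustrate the idea of using outputs of traditional algorithms as features for a supervised model, the following is a minimal sketch rather than the authors' pipeline. The three measures chosen here (token-level Jaccard overlap, a character-level sequence ratio from Python's standard library, and a word-count ratio), the function names, and the example sentences are all assumptions made for illustration; this listing does not specify which algorithms the paper actually uses.

```python
from difflib import SequenceMatcher


def jaccard_similarity(a: str, b: str) -> float:
    """Token-set overlap |A & B| / |A | B| (illustrative measure, an assumption)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def sequence_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] computed by difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def length_ratio(a: str, b: str) -> float:
    """Ratio of the shorter to the longer sentence, measured in words."""
    len_a, len_b = len(a.split()), len(b.split())
    if max(len_a, len_b) == 0:
        return 1.0
    return min(len_a, len_b) / max(len_a, len_b)


def similarity_features(a: str, b: str) -> list[float]:
    """Hypothetical feature vector: one score per traditional measure."""
    return [jaccard_similarity(a, b), sequence_ratio(a, b), length_ratio(a, b)]


if __name__ == "__main__":
    # Two made-up Slovak sentences used only as sample input.
    print(similarity_features("pes beží v parku", "pes spí v parku"))
```

A vector like this, computed for each sentence pair, could then be passed to any regression model trained on human similarity scores; selecting which features to keep and tuning the model's hyperparameters is the step the abstract assigns to artificial bee colony optimization.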

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Language and cultural evolution