Stop Jostling: Adaptive Negative Sampling Reduces the Marginalization of Low-Resource Language Tokens by Cross-Entropy Loss

Galim Turumtaev

arXiv:2601.22439 · cs.CL · February 2, 2026

Open Access

TL;DR

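This paper introduces an adaptive negative sampling method, implemented as a thresholding technique, that reduces the marginalization of rare tokens from low-resource languages during training. In experiments with a character-level language model, it significantly improves performance on low-resource language validation data.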

Contribution

The paper presents a novel thresholding technique for negative sampling that limits token marginalization, so that rare tokens from low-resource languages can learn more effectively.

Findings

01. Improved validation performance on low-resource languages

02. First application of negative sampling to address token marginalization

03. Significant gains over baseline models in low-resource scenarios

Abstract

Neural language models often struggle with low-resource languages due to the limited availability of training data, making tokens from these languages rare in the training set. This paper addresses a specific challenge during training: rare tokens are disproportionately affected by marginalization, which prevents them from learning effectively. We propose a thresholding technique that reduces the impact of this marginalization, allowing rare tokens to benefit from more meaningful alignment. Through experiments with a character-level language model, we demonstrate that this method significantly improves performance on low-resource language validation data. This work is the first to show how negative sampling can be applied to improve the representation of rare tokens by limiting the harmful influence of excessive marginalization, offering a new approach to enhancing language model…
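
To make the idea of thresholding concrete, here is a minimal sketch of a cross-entropy loss that drops weak negatives from the softmax denominator. The abstract does not spell out the exact rule, so everything specific here is an assumption for illustration: the function name `thresholded_cross_entropy`, the `margin` parameter, and the choice to keep a non-target token as a negative only when its logit exceeds the target logit minus `margin`. It should not be read as the paper's implementation.

```python
import math


def thresholded_cross_entropy(logits, target, margin):
    """Cross-entropy over the target plus only the 'strong' negatives.

    Assumed rule (hypothetical, not taken from the paper): a non-target
    token stays in the softmax denominator only if its logit is above
    logits[target] - margin. Dropped tokens receive zero gradient, so
    they are not pushed down any further by this training example.
    """
    cutoff = logits[target] - margin
    kept = [i for i, z in enumerate(logits) if i == target or z > cutoff]

    # Numerically stable log-sum-exp over the kept tokens only.
    peak = max(logits[i] for i in kept)
    log_z = peak + math.log(sum(math.exp(logits[i] - peak) for i in kept))
    loss = log_z - logits[target]

    # Gradient of the loss w.r.t. each logit, treating the kept set as fixed.
    grad = [0.0] * len(logits)
    for i in kept:
        grad[i] = math.exp(logits[i] - log_z)
    grad[target] -= 1.0
    return loss, grad


# Tokens 2 and 3 stand in for rare tokens that already have low logits.
example_logits = [2.0, 1.5, -3.0, -4.0]
loss, grad = thresholded_cross_entropy(example_logits, target=0, margin=2.0)
full_loss, full_grad = thresholded_cross_entropy(example_logits, target=0, margin=math.inf)
```

With `margin=math.inf` every token is kept and the function reduces to ordinary cross-entropy, where the low-logit tokens still receive a small positive gradient that pushes them down. With a finite margin, those tokens fall below the cutoff and their gradient is exactly zero, which is one way to read "limiting the harmful influence of excessive marginalization".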

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods