Sequence Repetition Enhances Token Embeddings and Improves Sequence Labeling with Decoder-only Language Models

Matija Luka Kukić; Marko Čuljak; David Dukić; Martin Tutek; Jan Šnajder

arXiv:2601.17585 · cs.CL · January 27, 2026

TL;DR

This paper introduces sequence repetition as a simple method to make decoder-only language models bidirectional, enhancing token embeddings and sequence labeling performance without major model modifications.

Contribution

The paper demonstrates that sequence repetition induces bidirectionality in decoder-only models, improving both token-level embeddings and sequence labeling accuracy.

Findings

01 Sequence repetition improves token embedding quality.

02 Sequence repetition (SR) surpasses encoder-only models in sequence labeling.

03 Intermediate-layer embeddings are as effective as final-layer embeddings.

Abstract

Modern language models (LMs) are trained in an autoregressive manner, conditioned only on the prefix. In contrast, sequence labeling (SL) tasks assign labels to each individual input token, naturally benefiting from bidirectional context. This discrepancy has historically led SL to rely on inherently bidirectional encoder-only models. However, the rapid development of decoder-only models has raised the question of whether they can be adapted to SL. While causal mask removal has emerged as a viable technique for adapting decoder-only models to leverage the full context for SL, it requires considerable changes to the base model functionality. In this work, we explore sequence repetition (SR) as a less invasive alternative for enabling bidirectionality in decoder-only models. Through fine-tuning experiments, we show that SR inherently makes decoders bidirectional, improving the quality of…
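The effect described in the abstract can be illustrated with a small, model-free sketch. It assumes the simplest reading of sequence repetition: the input tokens are concatenated with a copy of themselves, and the representation of each token is read from its last copy. Which copy is used, and how copies are separated, is not stated in the text above, so the function names `repeat_sequence` and `visible_tokens` and the choice of the last copy are illustrative assumptions, not the authors' implementation.

```python
def repeat_sequence(tokens: list[str], repeats: int = 2) -> list[str]:
    """Concatenate the token sequence with itself `repeats` times in total."""
    return tokens * repeats


def visible_tokens(n_tokens: int, repeats: int = 2) -> list[set[int]]:
    """For each original token i, return the set of original-token indices
    that the LAST copy of token i can attend to under a causal mask.

    Assumption (hypothetical): the embedding for token i is read from its
    last copy, at position (repeats - 1) * n_tokens + i.
    """
    visible = []
    for i in range(n_tokens):
        pos = (repeats - 1) * n_tokens + i
        # A causal mask lets position `pos` attend to positions 0..pos.
        # Map each attended position back to its original token index.
        visible.append({p % n_tokens for p in range(pos + 1)})
    return visible


tokens = ["Jan", "lives", "in", "Zagreb"]
repeated = repeat_sequence(tokens)

# Without repetition, token i only sees tokens 0..i (its prefix).
no_sr = visible_tokens(len(tokens), repeats=1)

# With one repetition, every token's second copy sees the whole sequence.
with_sr = visible_tokens(len(tokens), repeats=2)
```

In this sketch the causal mask is left untouched; the second copy of each token simply has the entire first copy in its prefix, which is why repetition is less invasive than removing the mask.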

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education