Set-LLM: A Permutation-Invariant LLM

Beni Egressy; Jan Stühmer

arXiv:2505.15433 · cs.LG · May 22, 2025

TL;DR

This paper introduces Set-LLM, an architecture that makes large language models permutation-invariant, reducing order bias and improving robustness in tasks involving set inputs without sacrificing performance or runtime.

Contribution

Set-LLM is the first approach to adapt pretrained LLMs for permutation invariance through novel attention masks and positional encodings, with theoretical guarantees and practical effectiveness.
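To illustrate why attention and positional information matter here, the following is a minimal sketch of a general principle, not the paper's construction: a softmax attention readout that sees no positional signal returns the same result for every ordering of its inputs, while appending a position feature breaks that property. The names `attention_pool` and `with_positions`, and the toy vectors, are assumptions made for this illustration only.

```python
import math
from itertools import permutations


def attention_pool(query, items):
    """Softmax-weighted average of `items`, scored against `query` by dot product.

    No positional information enters the computation, so the result depends
    only on which items are present, not on their order.
    """
    scores = [sum(q * x for q, x in zip(query, item)) for item in items]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(items[0])
    return [
        sum(w * item[d] for w, item in zip(weights, items)) / total
        for d in range(dim)
    ]


def with_positions(items):
    """Toy positional signal (hypothetical): append each item's index as a feature."""
    return [list(item) + [float(i)] for i, item in enumerate(items)]


if __name__ == "__main__":
    items = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    query = [0.5, -0.2]

    # Without positions: every ordering gives the same pooled vector
    # (up to floating-point rounding).
    for ordering in permutations(items):
        print(attention_pool(query, list(ordering)))

    # With positions: swapping the first two items changes the result.
    pos_query = query + [1.0]
    swapped = [items[1], items[0], items[2]]
    print(attention_pool(pos_query, with_positions(items)))
    print(attention_pool(pos_query, with_positions(swapped)))
```

A pretrained LLM uses causal masking and sequential positions, both of which depend on input order, which is presumably why the adaptation targets exactly those two components.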

Findings

1. Set-LLM achieves permutation invariance in LLMs.

2. Set-LLM maintains or improves performance on set-based tasks.

3. Set-LLM does not increase runtime compared to original models.

Abstract

While large language models (LLMs) demonstrate impressive capabilities across numerous applications, their robustness remains a critical concern. This paper is motivated by a specific vulnerability: the order sensitivity of LLMs. This vulnerability manifests itself as the order bias observed when LLMs decide between possible options (for example, a preference for the first option) and the tendency of LLMs to provide different answers when options are reordered. The use cases for this scenario extend beyond the classical case of multiple-choice question answering to the use of LLMs as automated evaluators in AI pipelines, comparing output generated by different models. We introduce Set-LLM, a novel architectural adaptation for pretrained LLMs that enables the processing of mixed set-text inputs with permutation invariance guarantees. The adaptations involve a new attention mask and new…
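The order sensitivity described in the abstract can be checked empirically by asking the same question under every ordering of the options and counting the distinct answers. The sketch below assumes a hypothetical `choose(question, options)` callable standing in for a model; the two toy choosers are invented for illustration and do not represent any real LLM or the paper's evaluation protocol.

```python
from collections import Counter
from itertools import permutations


def first_option_picker(question, options):
    """Hypothetical order-biased chooser: always returns the first option."""
    return options[0]


def longest_option_picker(question, options):
    """Hypothetical order-independent chooser: returns the longest option,
    breaking ties alphabetically."""
    return max(sorted(options), key=len)


def answer_distribution(choose, question, options):
    """Count how often each answer is selected across all orderings of `options`."""
    counts = Counter()
    for ordering in permutations(options):
        counts[choose(question, list(ordering))] += 1
    return counts


def is_order_invariant(choose, question, options):
    """True if `choose` gives the same answer for every ordering of `options`."""
    return len(answer_distribution(choose, question, options)) == 1


if __name__ == "__main__":
    question = "Which city is the capital of Germany?"
    options = ["Paris", "Rome", "Berlin"]
    print(answer_distribution(first_option_picker, question, options))
    print(is_order_invariant(first_option_picker, question, options))
    print(is_order_invariant(longest_option_picker, question, options))
```

Enumerating all orderings grows factorially with the number of options, so for larger sets a random sample of permutations would be the practical choice.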

Videos

Set-LLM: A Permutation-Invariant LLM · SlidesLive

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques

Methods: Softmax · Attention Is All You Need