Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs)
Anna Neumann, Elisabeth Kirsten, Muhammad Bilal Zafar, Jatinder Singh

TL;DR
This paper investigates how the placement of information in system prompts of large language models influences bias and behavior, revealing significant biases and advocating for transparency and auditing of system prompts.
Contribution
It provides an empirical analysis of biases introduced by system prompts in LLMs and highlights the need for transparency and auditing in their deployment.
Findings
Significant biases linked to prompt placement across models and demographic groups
Opaque system prompts can cause unintended biases and harms
Recommends incorporating system prompt analysis into AI auditing
Abstract
System prompts in Large Language Models (LLMs) are predefined directives that guide model behaviour, taking precedence over user inputs in text processing and generation. LLM deployers increasingly use them to ensure consistent responses across contexts. While model providers set a foundation of system prompts, deployers and third-party developers can append additional prompts without visibility into others' additions, while this layered implementation remains entirely hidden from end-users. As system prompts become more complex, they can directly or indirectly introduce unaccounted for side effects. This lack of transparency raises fundamental questions about how the position of information in different directives shapes model outputs. As such, this work examines how the placement of information affects model behaviour. To this end, we compare how models process demographic information…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsSoftmax · Attention Is All You Need · Sparse Evolutionary Training
