Many-Tier Instruction Hierarchy in LLM Agents

Jingyu Zhang; Tianjian Li; William Jurayj; Hongyuan Zhan; Benjamin Van Durme; Daniel Khashabi

arXiv:2604.09443·cs.CL·April 15, 2026

TL;DR

This paper introduces ManyIH, a new paradigm and benchmark for resolving instruction conflicts across many privilege levels in large language model agents, highlighting current models' limitations.

Contribution

The paper proposes ManyIH, a scalable instruction hierarchy framework, and introduces ManyIH-Bench, a benchmark with realistic, multi-level conflicting instructions for LLM agents.

Findings

01 Current models perform poorly (~40% accuracy) on ManyIH-Bench.

02 ManyIH-Bench includes 853 tasks with up to 12 conflicting instruction levels.

03 The benchmark spans 46 real-world agent scenarios.

Abstract

Large language model agents receive instructions from many sources (system messages, user prompts, tool outputs, other agents, and more), each carrying different levels of trust and authority. When these instructions conflict, agents must reliably follow the highest-privilege instruction to remain safe and effective. The dominant paradigm, instruction hierarchy (IH), assumes a fixed, small set of privilege levels (typically fewer than five) defined by rigid role labels (e.g., system > user). This is inadequate for real-world agentic settings, where conflicts can arise across far more sources and contexts. In this work, we propose Many-Tier Instruction Hierarchy (ManyIH), a paradigm for resolving instruction conflicts among instructions with arbitrarily many privilege levels. We introduce ManyIH-Bench, the first benchmark for ManyIH. ManyIH-Bench requires models to navigate up to 12 levels…
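To make the core rule concrete (when instructions conflict, follow the one with the highest privilege), here is a minimal Python sketch. It is an illustration only, not the authors' method or the ManyIH-Bench format: the `Instruction` fields, the integer privilege scale (larger means more authority), the `topic` key used to detect conflicts, and the tie-break are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    source: str      # hypothetical label for where the instruction came from
    privilege: int   # assumption: a larger number means more authority
    topic: str       # hypothetical key naming what the instruction constrains
    text: str


def resolve(instructions):
    """For each topic, keep the instruction with the highest privilege.

    Ties keep the earliest instruction; that tie-break is an arbitrary
    choice for this sketch.
    """
    winners = {}
    for inst in instructions:
        current = winners.get(inst.topic)
        if current is None or inst.privilege > current.privilege:
            winners[inst.topic] = inst
    return winners


# Four instructions from four sources, conflicting on two topics.
conflicting = [
    Instruction("tool_output", 2, "language", "Reply in French."),
    Instruction("system", 12, "language", "Reply in English."),
    Instruction("user", 7, "length", "Keep it under 50 words."),
    Instruction("peer_agent", 4, "length", "Write at least 500 words."),
]
resolved = resolve(conflicting)
for topic, inst in resolved.items():
    print(f"{topic}: {inst.text} (from {inst.source}, level {inst.privilege})")
```

Because privilege is an arbitrary integer rather than a fixed role label, the same function handles two levels or twelve without changes, which is the contrast with a fixed system > user ordering that the abstract draws.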

Code & Models

Repositories

jhu-clsp/ManyIH (GitHub)
