How Useful Is Cross-Domain Generalization for Training LLM Monitors?

Sam Martin; Fabien Roger

arXiv:2605.12265 · cs.AI · May 13, 2026

TL;DR

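This paper investigates whether fine-tuning language models on multiple classification tasks, each with its own prompt, improves classification on new domains with new prompts. Such training partially generalizes to adjacent domains, but the fine-tuned models can fail to follow a prompt that changes completely while the data domain stays the same as in training.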

Contribution

It demonstrates that multi-task classification training partially generalizes to new domains and prompts, and proposes combining it with instruction-following training to improve robustness.

Findings

1. Training on multiple classification tasks improves performance on unseen domains.

2. Models struggle when classification prompts change drastically within the same domain.

3. Combining classification training with instruction-following training mitigates generalization failures.

Abstract

Using prompted language models as classifiers enables classification in domains with limited training data, but misses some of the robustness and performance benefits that fine-tuning can bring. We study whether training on multiple classification tasks, each with its own prompt, improves performance on new domains with new classification prompts. We show that such training partially generalizes to adjacent domains, improving classification performance on tasks that are unseen during training. However, we identify specific edge cases where the fine-tuned models fail to follow prompts, such as when the classification prompt changes completely while the data domain remains the same as during training. We show that classification training can be mixed with general instruction following training, and that (when done well) such training keeps the benefits of classification training and…
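The setup described in the abstract (several classification tasks, each with its own prompt, some tasks held out for evaluation, and general instruction-following data mixed in) can be illustrated with a minimal data-preparation sketch. This is not the authors' pipeline: the `Example` type, the prompt template, the `build_mixture` helper, and the `instruct_fraction` parameter are all hypothetical names and choices made here for illustration, and the page does not state what mixing ratio the paper uses.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    prompt: str  # full text shown to the model
    target: str  # expected completion
    source: str  # task name, or "instruct" for instruction-following data


def format_classification(task_prompt: str, text: str, label: str, task: str) -> Example:
    """Wrap one labeled input in its task-specific prompt (hypothetical template)."""
    return Example(
        prompt=f"{task_prompt}\n\nInput: {text}\nAnswer:",
        target=f" {label}",
        source=task,
    )


def build_mixture(classification, instruction, instruct_fraction, held_out, seed=0):
    """Return (train, held_out_eval).

    classification: dict mapping task name -> list of Example
    instruction: list of Example with general instruction-following data
    instruct_fraction: assumed share of instruction data in the training set
    held_out: set of task names excluded from training and used for evaluation
    """
    if not 0.0 <= instruct_fraction < 1.0:
        raise ValueError("instruct_fraction must be in [0, 1)")
    rng = random.Random(seed)

    train_cls = [ex for task, exs in classification.items()
                 if task not in held_out for ex in exs]
    eval_cls = [ex for task, exs in classification.items()
                if task in held_out for ex in exs]

    # Number of instruction examples needed to reach the requested share,
    # capped by how many instruction examples are available.
    n_instruct = round(len(train_cls) * instruct_fraction / (1.0 - instruct_fraction))
    n_instruct = min(n_instruct, len(instruction))

    train = train_cls + rng.sample(instruction, n_instruct)
    rng.shuffle(train)
    return train, eval_cls
```

Holding out whole tasks, rather than random examples, is what makes the evaluation a test of generalization to a new domain with a new prompt. Setting `instruct_fraction` to 0 recovers pure classification training for comparison.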
