TL;DR
Jinx is an open-weight helpful-only language model designed to assist researchers in probing and understanding alignment failures without safety filters, thus enabling systematic safety evaluation.
Contribution
It introduces Jinx, a helpful-only variant of open-weight LLMs that responds to all queries, facilitating research on alignment failures and safety boundaries.
Findings
Provides a tool for systematic safety failure analysis
Maintains reasoning and instruction-following capabilities
Enables comparison with safety-aligned models
Abstract
Unlimited, or so-called helpful-only language models are trained without safety alignment constraints and never refuse user queries. They are widely used by leading AI companies as internal tools for red teaming and alignment evaluation. For example, if a safety-aligned model produces harmful outputs similar to an unlimited model, this indicates alignment failures that require further attention. Despite their essential role in assessing alignment, such models are not available to the research community. We introduce Jinx, a helpful-only variant of popular open-weight LLMs. Jinx responds to all queries without refusals or safety filtering, while preserving the base model's capabilities in reasoning and instruction following. It provides researchers with an accessible tool for probing alignment failures, evaluating safety boundaries, and systematically studying failure modes in language…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
- 🤗Jinx-org/Jinx-Qwen3-0.6Bmodel· ♡ 2♡ 2
- 🤗Jinx-org/Jinx-Qwen3-1.7Bmodel· 29 dl· ♡ 229 dl♡ 2
- 🤗Jinx-org/Jinx-Qwen3-4Bmodel· 1 dl· ♡ 11 dl♡ 1
- 🤗Jinx-org/Jinx-Qwen3-8Bmodel· 26 dl· ♡ 326 dl♡ 3
- 🤗Jinx-org/Jinx-Qwen3-14Bmodel· 50 dl· ♡ 150 dl♡ 1
- 🤗Jinx-org/Jinx-Qwen3-30B-A3B-Thinking-2507model· ♡ 10♡ 10
- 🤗Jinx-org/Jinx-Qwen3-32Bmodel· 147 dl· ♡ 8147 dl♡ 8
- 🤗Jinx-org/Jinx-Qwen3-235B-A22B-Thinking-2507model· ♡ 4♡ 4
- 🤗Jinx-org/Jinx-gpt-oss-20bmodel· 33 dl· ♡ 9333 dl♡ 93
- 🤗Jinx-org/Jinx-DeepSeek-R1-0528model· ♡ 6♡ 6
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
