What if Pinocchio Were a Reinforcement Learning Agent: A Normative End-to-End Pipeline

Benoît Alcaraz

arXiv:2603.16651·cs.AI·April 14, 2026


TL;DR

This paper introduces a pipeline for developing norm-compliant, context-aware reinforcement learning agents supervised by argumentation-based advisors, addressing societal rule adherence in AI systems.

Contribution

It presents a novel hybrid model combining RL with normative argumentation advisors and a new algorithm for extracting normative arguments automatically.

Findings

01. Empirical evaluation of each pipeline component.

02. Demonstration of norm avoidance mitigation strategies.

03. Analysis of the pipeline's effectiveness in normative compliance.

Abstract

Artificial intelligence (AI) has developed quickly over the past decade. This rapid progress has created a need for systems that comply with the rules and norms of our society, so that they can be integrated into our daily lives successfully and safely. Inspired by the story of Pinocchio in "Le avventure di Pinocchio - Storia di un burattino", this thesis proposes a pipeline that addresses the problem of developing norm-compliant and context-aware agents. Building on the AJAR, Jiminy, and NGRL architectures, the work introduces \pino, a hybrid model in which reinforcement learning agents are supervised by argumentation-based normative advisors. To make this pipeline operational, this thesis also presents a novel algorithm for automatically extracting the arguments and relationships that underlie the advisors' decisions. Finally, this thesis investigates the…
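To make the idea of an argumentation-based normative advisor supervising a reinforcement learning agent more concrete, here is a minimal sketch. It is not the thesis's implementation: the abstract does not say which argumentation semantics the advisors use, so the grounded semantics of a Dung-style abstract argumentation framework is assumed here, and the argument names (`violation`, `emergency`, `no_patient`), the `shaped_reward` helper, and the penalty value are all hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

Attack = Tuple[str, str]  # (attacker, target)


def grounded_extension(arguments: Set[str], attacks: Set[Attack]) -> FrozenSet[str]:
    """Grounded extension of an abstract argumentation framework (assumed semantics)."""
    attackers: Dict[str, Set[str]] = {a: set() for a in arguments}
    for src, dst in attacks:
        attackers[dst].add(src)

    accepted: Set[str] = set()
    while True:
        # Arguments attacked by something that is already accepted.
        defeated = {dst for src, dst in attacks if src in accepted}
        # An argument is acceptable when all of its attackers are defeated.
        acceptable = {a for a in arguments if attackers[a] <= defeated}
        if acceptable == accepted:
            return frozenset(accepted)
        accepted = acceptable


def shaped_reward(env_reward: float, extension: FrozenSet[str], penalty: float = 1.0) -> float:
    """Hypothetical advisor signal: penalise the agent if 'violation' is accepted."""
    return env_reward - penalty if "violation" in extension else env_reward


# Context 1: speeding violates a norm, but an emergency justifies it.
ctx1 = grounded_extension({"violation", "emergency"}, {("emergency", "violation")})

# Context 2: a further argument (no patient on board) defeats the emergency claim.
ctx2 = grounded_extension(
    {"violation", "emergency", "no_patient"},
    {("emergency", "violation"), ("no_patient", "emergency")},
)

reward1 = shaped_reward(0.5, ctx1)  # no penalty: the violation is not accepted
reward2 = shaped_reward(0.5, ctx2)  # penalty: the violation is reinstated
```

The same action is judged differently in the two contexts because the set of applicable arguments changes, which is one way to read "context-aware" norm compliance. Passing the verdict to the agent as a reward penalty is only one possible design and is an assumption of this sketch.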
