Privacy-Preserving Reinforcement Learning Beyond Expectation

Arezoo Rajabi; Bhaskar Ramasubramanian; Abdullah Al Maruf; Radha Poovendran

arXiv:2203.10165·cs.LG·March 22, 2022

Open Access

TL;DR

This paper introduces a reinforcement learning framework that models human-like risk assessment with cumulative prospect theory (CPT) and protects the agent's decision making with differential privacy, so that agents can learn human-aligned behaviors without revealing sensitive information.

Contribution

The paper develops an RL algorithm that combines CPT-based objectives with differential privacy guarantees, addressing risk modeling and privacy preservation together.

Findings

01. The algorithm effectively balances privacy and utility in learning policies.

02. Agents can learn human-aligned behaviors while maintaining privacy guarantees.

03. Empirical results demonstrate a clear privacy-utility tradeoff.

Abstract

Cyber and cyber-physical systems equipped with machine learning algorithms, such as autonomous cars, share environments with humans. In such a setting, it is important to align system (or agent) behaviors with the preferences of one or more human users. We consider the case when an agent has to learn behaviors in an unknown environment. Our goal is to capture two defining characteristics of humans: i) a tendency to assess and quantify risk, and ii) a desire to keep decision making hidden from external parties. For the former, we incorporate cumulative prospect theory (CPT) into the objective of a reinforcement learning (RL) problem. For the latter, we use differential privacy. We design an algorithm to enable an RL agent to learn policies to maximize a CPT-based objective in a privacy-preserving manner and establish guarantees on the privacy of value functions learned by the algorithm when…
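The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a standard quantile-based estimator of the CPT value from sampled returns, Tversky-Kahneman utility and probability-weighting functions with commonly cited illustrative parameters, and the classical Gaussian mechanism for (epsilon, delta)-differential privacy as a stand-in privacy mechanism. All function names, default parameters, and the sensitivity value are hypothetical.

```python
import math
import random


def tk_weight(p, gamma):
    """Tversky-Kahneman probability weighting; gamma = 1 gives w(p) = p."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def cpt_estimate(samples, alpha=0.88, lam=2.25, gamma_gain=0.61, gamma_loss=0.69):
    """Quantile-based estimate of the CPT value of sampled returns.

    Assumed forms (illustrative, not taken from the paper):
    gains use u+(x) = x**alpha, losses use u-(x) = lam * (-x)**alpha.
    """
    xs = sorted(samples)
    n = len(xs)
    gains = 0.0
    losses = 0.0
    for i, x in enumerate(xs, start=1):
        if x > 0:
            # Weight of the i-th order statistic under the distorted upper tail.
            w = tk_weight((n + 1 - i) / n, gamma_gain) - tk_weight((n - i) / n, gamma_gain)
            gains += (x ** alpha) * w
        elif x < 0:
            # Weight of the i-th order statistic under the distorted lower tail.
            w = tk_weight(i / n, gamma_loss) - tk_weight((i - 1) / n, gamma_loss)
            losses += lam * ((-x) ** alpha) * w
    return gains - losses


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (stated for epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(value, sensitivity, epsilon, delta, rng):
    """Release value plus zero-mean Gaussian noise calibrated to the sensitivity."""
    return value + rng.gauss(0.0, gaussian_sigma(sensitivity, epsilon, delta))


if __name__ == "__main__":
    rng = random.Random(0)
    returns = [rng.gauss(1.0, 2.0) for _ in range(1000)]  # hypothetical episode returns
    value = cpt_estimate(returns)
    # sensitivity=1.0 is an assumed bound, not derived from any real system.
    print(privatize(value, sensitivity=1.0, epsilon=0.5, delta=1e-5, rng=rng))
```

With alpha = lam = 1 and both gamma values set to 1, the weighting is the identity and the estimate reduces to the sample mean, which recovers the usual expectation-based objective. A smaller epsilon yields a larger noise scale, which is the privacy-utility tradeoff the summary refers to.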

Taxonomy

Topics: Autonomous Vehicle Technology and Safety

Methods: Gaussian Process