Reinforcement Learning with Segment Feedback

Yihan Du; Anna Winnicki; Gal Dalal; Shie Mannor; R. Srikant

arXiv:2502.01876 · cs.LG · June 18, 2025

TL;DR

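This paper introduces RL with segment feedback, a model in which the agent observes reward feedback only at the end of each of the $m$ segments of an episode. It analyzes, through theoretical bounds and experiments, how the feedback type (binary or sum) and the number of segments affect learning efficiency.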

Contribution

The paper proposes the RL with segment feedback model and provides algorithms and regret bounds for the binary and sum feedback settings, showing how the number of segments affects learning performance.

Findings

01. Under binary feedback, increasing the number of segments $m$ reduces regret exponentially.

02. Under sum feedback, the number of segments has little effect on regret.

03. Both the theoretical bounds and the experiments support these two behaviors.

Abstract

Standard reinforcement learning (RL) assumes that an agent can observe a reward for each state-action pair. However, in practical applications, it is often difficult and costly to collect a reward for each state-action pair. While there have been several works considering RL with trajectory feedback, it is unclear if trajectory feedback is inefficient for learning when trajectories are long. In this work, we consider a model named RL with segment feedback, which offers a general paradigm filling the gap between per-state-action feedback and trajectory feedback. In this model, we consider an episodic Markov decision process (MDP), where each episode is divided into $m$ segments, and the agent observes reward feedback only at the end of each segment. Under this model, we study two popular feedback settings: binary feedback and sum feedback, where the agent observes a binary outcome and a…
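To make the feedback model concrete, here is a minimal Python sketch of how one episode's per-step rewards could be collapsed into segment-level observations. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, segments are assumed to be of equal length, sum feedback is shown without observation noise, and the binary outcome is drawn through a logistic link on the segment's reward sum (the abstract above is cut off before it specifies how the binary outcome is generated, so that link is an assumption).

```python
import math
import random


def split_into_segments(rewards, m):
    """Split per-step rewards of one episode into m equal-length segments.

    Assumption: m divides the episode length exactly.
    """
    horizon = len(rewards)
    if m <= 0 or horizon % m != 0:
        raise ValueError("m must be positive and divide the episode length")
    seg_len = horizon // m
    return [rewards[i * seg_len:(i + 1) * seg_len] for i in range(m)]


def sum_feedback(rewards, m):
    """Noiseless sum feedback: one number per segment, the sum of its rewards."""
    return [sum(seg) for seg in split_into_segments(rewards, m)]


def binary_feedback(rewards, m, rng):
    """Binary feedback: one 0/1 outcome per segment.

    Assumption: the outcome is Bernoulli with probability given by a
    logistic function of the segment's reward sum.
    """
    outcomes = []
    for seg in split_into_segments(rewards, m):
        p = 1.0 / (1.0 + math.exp(-sum(seg)))
        outcomes.append(1 if rng.random() < p else 0)
    return outcomes


# Hypothetical per-step rewards for an episode of length 6.
rewards = [0.5, -0.25, 1.0, 0.0, 0.25, 0.5]

print(sum_feedback(rewards, 1))  # m = 1: trajectory feedback
print(sum_feedback(rewards, 3))  # m = 3: three segments of length 2
print(sum_feedback(rewards, 6))  # m = 6: per-step feedback
print(binary_feedback(rewards, 3, random.Random(0)))
```

Setting `m = 1` recovers trajectory feedback and setting `m` equal to the episode length recovers per-state-action feedback, which is the sense in which the segment model sits between the two.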


