Data-Driven Offline Decision-Making via Invariant Representation Learning
Han Qi, Yi Su, Aviral Kumar, Sergey Levine

TL;DR
This paper introduces invariant representation learning for offline decision-making, framing the problem as domain adaptation to improve prediction accuracy under distributional shift without active data collection.
Contribution
It proposes invariant objective models (IOM) that enforce invariance in learned representations to address distributional shift in offline decision-making tasks.
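To make "enforcing invariance in learned representations" concrete, here is a minimal sketch of one generic domain-adaptation style penalty: the squared distance between the mean representation of dataset inputs and that of candidate (optimized) inputs, added to a prediction loss. This is an illustration under stated assumptions, not the paper's IOM objective; the names `representation`, `invariance_penalty`, `combined_objective`, the fixed `tanh` feature map, and the weight `lam` are all hypothetical.

```python
import math

def representation(x, weights):
    # Hypothetical fixed feature map: one tanh unit per weight.
    return [math.tanh(w * x) for w in weights]

def mean_representation(xs, weights):
    # Average each feature dimension over a batch of scalar inputs.
    reps = [representation(x, weights) for x in xs]
    return [sum(col) / len(reps) for col in zip(*reps)]

def invariance_penalty(dataset_xs, optimized_xs, weights):
    # Squared distance between the two mean feature vectors.
    # It is zero when both batches look identical in representation space.
    mu_data = mean_representation(dataset_xs, weights)
    mu_opt = mean_representation(optimized_xs, weights)
    return sum((a - b) ** 2 for a, b in zip(mu_data, mu_opt))

def combined_objective(prediction_loss, penalty, lam):
    # Assumed trade-off: fit the data while keeping representations aligned.
    return prediction_loss + lam * penalty

weights = [0.5, 1.0, 2.0]
dataset_xs = [-1.0, -0.5, 0.0, 0.5]
near_xs = [-0.9, -0.4, 0.1, 0.6]   # candidates close to the dataset
far_xs = [3.0, 3.5, 4.0, 4.5]      # candidates far from the dataset

penalty_near = invariance_penalty(dataset_xs, near_xs, weights)
penalty_far = invariance_penalty(dataset_xs, far_xs, weights)
print(penalty_near, penalty_far)
```

In this toy setup the penalty grows as the candidate inputs move away from the dataset, which is the kind of signal a representation-matching term is meant to provide.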
Findings
IOM effectively mitigates distributional shift in offline RL, bandits, and MBO.
The approach trades off staying close to the training data against optimizing decisions.
Experimental results show improved decision quality over prior methods.
Abstract
The goal in offline data-driven decision-making is to synthesize decisions that optimize a black-box utility function, using a previously-collected static dataset, with no active interaction. These problems appear in many forms: offline reinforcement learning (RL), where we must produce actions that optimize the long-term reward, bandits from logged data, where the goal is to determine the correct arm, and offline model-based optimization (MBO) problems, where we must find the optimal design provided access to only a static dataset. A key challenge in all these settings is distributional shift: when we optimize with respect to the input into a model trained from offline data, it is easy to produce an out-of-distribution (OOD) input that appears erroneously good. In contrast to prior approaches that utilize pessimism or conservatism to tackle this problem, in this paper, we formulate…
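The distributional-shift failure described in the abstract can be reproduced in a toy one-dimensional setting. The sketch below is illustrative only: the quadratic `true_utility`, the linear surrogate, and the candidate grid are assumptions chosen for clarity, not anything taken from the paper. A model fit on a narrow range of inputs is optimized over a wider range, and the chosen input looks excellent to the model while being poor under the true utility.

```python
def true_utility(x):
    # Hypothetical black-box utility with its optimum at x = 1.
    return -(x - 1.0) ** 2

def fit_linear(xs, ys):
    # Ordinary least squares for a one-dimensional linear model.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Static offline dataset: inputs only cover roughly [-1.0, 0.5].
data_x = [-1.0 + 0.1 * i for i in range(16)]
data_y = [true_utility(x) for x in data_x]

slope, intercept = fit_linear(data_x, data_y)

def model(x):
    # Learned surrogate of the utility.
    return slope * x + intercept

# Optimize the surrogate over a range much wider than the data support.
candidates = [-1.0 + 0.1 * i for i in range(61)]
best_x = max(candidates, key=model)

print("chosen input:", best_x)
print("predicted utility:", model(best_x))
print("true utility:", true_utility(best_x))
print("best utility in dataset:", max(data_y))
```

Because the surrogate keeps increasing outside the data range, the optimizer selects the far edge of the candidate grid, where the predicted utility exceeds anything in the dataset while the true utility is worse than every logged point.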
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Multi-Objective Optimization Algorithms · Reinforcement Learning in Robotics
Methods: Attentive Walk-Aggregating Graph Neural Network
