Latent-Predictive Empowerment: Measuring Empowerment without a Simulator
Andrew Levy, Alessandro Allievi, George Konidaris

TL;DR
Latent-Predictive Empowerment (LPE) enables agents to learn diverse skills by maximizing a predictive objective that does not require a full environment model, making empowerment scalable in complex, stochastic settings.
Contribution
LPE introduces a practical empowerment method using a latent-predictive model, removing the need for a full environment simulator, and demonstrates comparable or superior skill learning.
Findings
LPE learns skillsets similar in size to model-based methods.
LPE outperforms other model-based empowerment approaches.
Effective in high-dimensional and stochastic environments.
Abstract
Empowerment has the potential to help agents learn large skillsets, but is not yet a scalable solution for training general-purpose agents. Recent empowerment methods learn diverse skillsets by maximizing the mutual information between skills and states; however, these approaches require a model of the transition dynamics, which can be challenging to learn in realistic settings with high-dimensional and stochastic observations. We present Latent-Predictive Empowerment (LPE), an algorithm that can compute empowerment in a more practical manner. LPE learns large skillsets by maximizing an objective that is a principled replacement for the mutual information between skills and states and that only requires a simpler latent-predictive model rather than a full simulator of the environment. We show empirically in a variety of settings--including ones with high-dimensional observations and…
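For readers unfamiliar with the quantity the abstract refers to, the following is a minimal sketch of the standard mutual information between skills and states, I(Z; S), for a small discrete case. It is not LPE's objective and not the authors' code. The function name `mutual_information` and the two toy distributions are assumptions made for illustration only.

```python
import math

def mutual_information(p_z, p_s_given_z):
    """Return I(Z; S) in nats for discrete skills z and states s.

    p_z: list of skill probabilities p(z).
    p_s_given_z: p_s_given_z[z][s] is the probability that skill z ends in state s.
    Hypothetical helper for illustration; not taken from the paper.
    """
    n_states = len(p_s_given_z[0])
    # Marginal state distribution: p(s) = sum_z p(z) p(s | z)
    p_s = [
        sum(p_z[z] * p_s_given_z[z][s] for z in range(len(p_z)))
        for s in range(n_states)
    ]
    mi = 0.0
    for z, pz in enumerate(p_z):
        for s, psz in enumerate(p_s_given_z[z]):
            if pz > 0 and psz > 0:  # zero-probability terms contribute nothing
                mi += pz * psz * math.log(psz / p_s[s])
    return mi

# Two skills that reach different states: the skillset is maximally distinguishable.
distinct = mutual_information([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])

# Two skills with identical state distributions: the skill carries no information.
overlapping = mutual_information([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])
```

In this toy setting, `distinct` equals log 2 and `overlapping` equals 0, which matches the intuition that a larger mutual information corresponds to a larger set of distinguishable skills.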
Peer Reviews
Decision·Submitted to ICLR 2025
The motivation of measuring empowerment without requiring a full simulator of the environment’s transition dynamics makes sense to me.
1. The paper looks highly similar to the paper “Learning Large Skillsets in Stochastic Settings with Empowerment”. Whole paragraphs and equations, e.g. in Background and Results, are the same with only a few words changed. Also, the paper is difficult to read, and the narrative can be simplified and improved. 2. The experiments are toy examples. The baselines are not strong or comprehensive enough, and the performance improvement is marginal. More environments and more complex tasks may
1. The proposed algorithm, LPE, sounds effective as a scalable method to compute the empowerment of states. 2. Experimental results show that LPE outperforms the fundamental baseline algorithm on the tasks.
1. It seems that the environments in the experiments are still simple. Since LPE is proposed to function as a scalable method, it would be better to validate the effectiveness of LPE in more complex and practical environments, especially in tasks with higher-dimensional state spaces and more complex dynamics. 2. The authors are expected to improve writing for easier reading, especially in 3.1. Besides, there are some typos. For example, “defined” in Line 127 and “the” in Line 246 are repeated.
* Unlike its predecessor SE, which requires a transition dynamics model $p(s' \mid s, a)$, the proposed method does not require one, and thus scales better to high-dimensional image-based domains.
* It is unclear how good LPE is compared to other empowerment or unsupervised skill discovery methods. The authors only compare LPE with SE (the closest work) and its ablations, and the domains are mostly simple and a bit "contrived". For example, it is unclear how it works in more complex environments, such as Gym MuJoCo locomotion tasks, URLB tasks, robotic manipulation domains, or similar realistic environments, and how good it is compared to other existing unsupervised skill learning methods
Taxonomy
Topics: Community Health and Development
