Jointly Optimal Policies for Remote Estimation of Autoregressive Markov Processes over Time-Correlated Fading Channel
Manali Dutta, Rahul Singh, and Shalabh Bhatnagar

TL;DR
This paper develops jointly optimal transmission and estimation strategies for remote estimation of an autoregressive Markov process over a time-correlated fading channel, using a POMDP formulation and, when the channel parameters are unknown, reinforcement learning.
Contribution
It introduces a structured optimal policy framework for joint transmission and estimation in a decentralized setting with unknown channel parameters, combining POMDP analysis and reinforcement learning.
Findings
Optimal transmission follows a threshold policy based on channel belief.
The estimation strategy uses a Kalman-like update rule.
Reinforcement learning achieves near-optimal performance, with a 5.5% gap.
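As a rough illustration of what a threshold policy on the channel belief can look like, the sketch below tracks the sensor's belief that a two-state Markov channel is in its good state. The names `p_gg` and `p_bg` (probabilities of moving to the good state from the good and bad states), the function names, and the rule "transmit when the belief is at least the threshold" are assumptions made for this example, not details taken from the paper.

```python
def propagate_belief(belief_good, p_gg, p_bg):
    """Predict P(channel is good) one step ahead when nothing is observed."""
    return belief_good * p_gg + (1.0 - belief_good) * p_bg


def update_belief(belief_good, transmitted, ack, p_gg, p_bg):
    """Next-step belief after an optional transmission attempt.

    Assumption: an ACK means the channel was good and a NACK means it was bad,
    so after an attempt the belief collapses to a transition probability.
    Without an attempt, the belief only drifts through the Markov chain.
    """
    if transmitted:
        return p_gg if ack else p_bg
    return propagate_belief(belief_good, p_gg, p_bg)


def threshold_policy(belief_good, threshold):
    """Hypothetical rule: attempt a transmission when the belief is high enough."""
    return belief_good >= threshold


if __name__ == "__main__":
    belief = 0.2  # e.g. right after a NACK with p_bg = 0.2
    for step in range(5):
        attempt = threshold_policy(belief, threshold=0.5)
        print(step, round(belief, 4), attempt)
        if attempt:
            break
        belief = update_belief(belief, False, False, p_gg=0.9, p_bg=0.2)
```

With these assumed numbers the belief drifts upward after a NACK until it crosses the threshold, at which point the sensor attempts a transmission again.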
Abstract
We study a remote estimation setup with an autoregressive (AR) Markov process, a sensor, and a remote estimator. The sensor observes the process and sends encoded observations to the estimator as packets over an unreliable communication channel modeled as the Gilbert-Elliott (GE) channel. We assume that the sensor gets to observe the channel state by the ACK/NACK feedback mechanism only when it attempts a transmission, while it does not observe the channel state when no transmission attempt is made. The objective is to design a transmission scheduling strategy for the sensor, and an estimation strategy for the estimator that are jointly optimal, i.e., they minimize the expected value of an infinite-horizon cumulative discounted cost defined as the sum of squared estimation error over time and the sensor's transmission power. Since the sensor and the estimator have access to different…
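The setup in the abstract can be made concrete with a small simulation. This is a minimal sketch under several assumptions that do not come from the paper: the process is scalar AR(1), packets are delivered in the good channel state and lost in the bad one, the sensor uses a fixed belief threshold, the estimator takes the received value or else propagates its previous estimate, and the horizon is truncated. All parameter names and default values are hypothetical.

```python
import random


def simulate(a=0.9, noise_std=1.0, p_gg=0.9, p_bg=0.2, power_cost=0.5,
             discount=0.95, threshold=0.5, horizon=200, seed=0):
    """Return the truncated discounted cost of one simulated run."""
    rng = random.Random(seed)
    x, x_hat = 0.0, 0.0              # process state and remote estimate
    channel_good = True
    # Start from the stationary probability of the good state
    # (assumes the chain is not degenerate, i.e. 1 - p_gg + p_bg > 0).
    belief_good = p_bg / (1.0 - p_gg + p_bg)
    total_cost = 0.0
    for t in range(horizon):
        transmit = belief_good >= threshold
        if transmit and channel_good:
            x_hat = x                # packet delivered: estimator learns x
        error = x - x_hat
        # Per-step cost: squared estimation error plus transmission power.
        total_cost += discount ** t * (error ** 2 + power_cost * transmit)
        # The sensor learns the channel state only when it attempts to send.
        if transmit:
            belief_good = p_gg if channel_good else p_bg
        else:
            belief_good = belief_good * p_gg + (1.0 - belief_good) * p_bg
        # Advance the process, the estimator's prediction, and the channel.
        x = a * x + rng.gauss(0.0, noise_std)
        x_hat = a * x_hat
        channel_good = rng.random() < (p_gg if channel_good else p_bg)
    return total_cost


if __name__ == "__main__":
    for th in (0.0, 0.5, 2.0):       # always send, belief threshold, never send
        print(th, round(simulate(threshold=th), 3))
```

Sweeping `threshold` in this way only gives a feel for the trade-off between estimation error and transmission power; it is not a substitute for the jointly optimal policies studied in the paper.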
Taxonomy
Topics: Reinforcement Learning in Robotics · Age of Information Optimization · Adaptive Dynamic Programming Control
