Tiny, On-Device Decision Makers with the MiniConv Library
Carlos Purves

TL;DR
This paper presents MiniConv, a split-policy RL architecture that makes visual decision-making on resource-constrained edge devices more efficient by reducing both transmitted data and decision latency.
Contribution
It introduces a split-policy architecture whose small on-device encoder runs as OpenGL fragment-shader passes on embedded GPUs, easing RL deployment on edge devices with minimal performance trade-offs.
Findings
Reduces data transmission by transforming observations into compact features.
Lowers decision latency in bandwidth-limited environments.
Achieves comparable RL performance to traditional methods.
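The first two findings can be illustrated with a back-of-the-envelope latency model. This is a minimal sketch, not the paper's measurement method: the function names, the frame and feature sizes, the link speed, and every timing below are hypothetical values chosen only to show how payload size enters the closed-loop decision latency.

```python
def transfer_ms(payload_bytes: int, bandwidth_mbps: float) -> float:
    """Time to push a payload over a link, in milliseconds."""
    return payload_bytes * 8 / (bandwidth_mbps * 1e6) * 1e3


def decision_latency_ms(payload_bytes: int, bandwidth_mbps: float,
                        rtt_ms: float, device_ms: float, server_ms: float) -> float:
    """Simplified closed-loop latency: on-device compute + upload + round trip + server compute."""
    return device_ms + transfer_ms(payload_bytes, bandwidth_mbps) + rtt_ms + server_ms


# All numbers below are assumptions for illustration, not results from the paper.
RAW_BYTES = 84 * 84 * 3    # hypothetical uint8 RGB observation
FEAT_BYTES = 9 * 9 * 4     # hypothetical uint8 feature tensor
BANDWIDTH_MBPS = 1.0       # hypothetical bandwidth-limited link
RTT_MS = 20.0              # hypothetical network round-trip time

# Full offload: no on-device work, the whole frame is uploaded.
offload_ms = decision_latency_ms(RAW_BYTES, BANDWIDTH_MBPS, RTT_MS,
                                 device_ms=0.0, server_ms=5.0)
# Split policy: some on-device encoding time, a much smaller upload.
split_ms = decision_latency_ms(FEAT_BYTES, BANDWIDTH_MBPS, RTT_MS,
                               device_ms=4.0, server_ms=2.0)

print(f"payload reduction: {RAW_BYTES / FEAT_BYTES:.1f}x")
print(f"offload: {offload_ms:.1f} ms, split: {split_ms:.1f} ms")
```

Under this model the benefit depends on the link: as bandwidth grows the upload term shrinks and the on-device encoding time dominates, which is consistent with the finding being stated for bandwidth-limited environments.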
Abstract
Reinforcement learning (RL) has achieved strong results, but deploying visual policies on resource-constrained edge devices remains challenging due to computational cost and communication latency. Many deployments therefore offload policy inference to a remote server, incurring network round trips and requiring transmission of high-dimensional observations. We introduce a split-policy architecture in which a small on-device encoder, implemented as OpenGL fragment-shader passes for broad embedded GPU support, transforms each observation into a compact feature tensor that is transmitted to a remote policy head. In RL, this communication overhead manifests as closed-loop decision latency rather than only per-request inference latency. The proposed approach reduces transmitted data, lowers decision latency in bandwidth-limited settings, and reduces server-side compute per request, whilst…
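The split described in the abstract can be sketched in NumPy. This is an illustrative stand-in under stated assumptions, not the MiniConv library's API: the paper's encoder runs as OpenGL fragment-shader passes, whereas here each pass is imitated by a plain strided convolution on the CPU. The layer sizes, the random weights, the uint8 quantisation of the transmitted features, and the linear policy head are all hypothetical.

```python
import numpy as np


def conv2d_relu(x: np.ndarray, w: np.ndarray, stride: int) -> np.ndarray:
    """Valid strided convolution followed by ReLU.

    x has shape (H, W, C_in); w has shape (k, k, C_in, C_out).
    """
    k, _, _, c_out = w.shape
    h_out = (x.shape[0] - k) // stride + 1
    w_out = (x.shape[1] - k) // stride + 1
    out = np.empty((h_out, w_out, c_out), dtype=np.float32)
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i * stride:i * stride + k, j * stride:j * stride + k, :]
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return np.maximum(out, 0.0)


def encode(obs_u8: np.ndarray, weights: list) -> tuple:
    """On-device side: a few small conv passes, then quantise for transmission (assumed)."""
    x = obs_u8.astype(np.float32) / 255.0
    for w in weights:
        x = conv2d_relu(x, w, stride=2)
    scale = max(float(x.max()), 1e-8)  # guard against an all-zero feature map
    q = np.round(x / scale * 255.0).astype(np.uint8)
    return q, scale


def policy_head(q: np.ndarray, scale: float, w_out: np.ndarray) -> int:
    """Server side: dequantise the features and pick an action with a linear head (assumed)."""
    feats = q.astype(np.float32) / 255.0 * scale
    logits = feats.reshape(-1) @ w_out
    return int(np.argmax(logits))


rng = np.random.default_rng(0)
obs = rng.integers(0, 256, size=(84, 84, 3), dtype=np.uint8)  # hypothetical frame
weights = [rng.normal(size=(3, 3, c_in, c_out)) * 0.1          # hypothetical layers
           for c_in, c_out in [(3, 8), (8, 8), (8, 4)]]
q, scale = encode(obs, weights)                                # runs on the device
w_head = rng.normal(size=(q.size, 4))                          # 4 hypothetical actions
action = policy_head(q, scale, w_head)                         # runs on the server

print(f"sent {q.nbytes} bytes instead of {obs.nbytes}")
```

Only `q` and `scale` cross the network in this sketch, so the server never sees the raw frame and its per-request work is limited to the policy head.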
Taxonomy
Topics: IoT and Edge/Fog Computing · Software-Defined Networks and 5G · Advanced Neural Network Applications
