Efficient Implementation of LinearUCB through Algorithmic Improvements and Vector Computing Acceleration for Embedded Learning Systems
Marco Angioli, Marcello Barbirotta, Abdallah Cheikh, Antonio Mastrandrea, Francesco Menichelli, Mauro Olivieri

TL;DR
This paper introduces combined algorithmic and hardware optimizations to efficiently implement LinearUCB algorithms on resource-limited embedded devices, enabling real-time autonomous decision-making in IoT applications.
Contribution
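The paper proposes algorithmic modifications based on the Sherman-Morrison-Woodbury formula, together with vector computing acceleration of matrix operations, to make two LinearUCB Contextual Bandits algorithms efficient on resource-constrained embedded systems.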
Findings
Significant reduction in execution time.
Lower energy consumption.
Enhanced suitability for real-time IoT applications.
Abstract
As the Internet of Things expands, embedding Artificial Intelligence algorithms in resource-constrained devices has become increasingly important to enable real-time, autonomous decision-making without relying on centralized cloud servers. However, implementing and executing complex algorithms in embedded devices poses significant challenges due to limited computational power, memory, and energy resources. This paper presents algorithmic and hardware techniques to efficiently implement two LinearUCB Contextual Bandits algorithms on resource-constrained embedded devices. Algorithmic modifications based on the Sherman-Morrison-Woodbury formula streamline model complexity, while vector acceleration is harnessed to speed up matrix operations. We analyze the impact of each optimization individually and then combine them in a two-pronged strategy. The results show notable improvements in…
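The abstract names the Sherman-Morrison-Woodbury formula as the basis of the algorithmic modifications. As a minimal sketch of the general idea, and not the authors' implementation, the rank-one case of that identity lets a LinearUCB arm keep the inverse of its design matrix up to date without re-inverting it after each observation: (A + x xᵀ)⁻¹ = A⁻¹ − (A⁻¹ x)(A⁻¹ x)ᵀ / (1 + xᵀ A⁻¹ x), which holds when A is symmetric. The function names `sherman_morrison_update` and `ucb_score`, the exploration parameter `alpha`, and the plain-list matrix layout are assumptions made for illustration.

```python
import math


def sherman_morrison_update(A_inv, x):
    """Return (A + x x^T)^-1 given A_inv = A^-1, assuming A_inv is symmetric.

    Costs O(d^2) per update instead of the O(d^3) of a fresh inversion.
    """
    d = len(x)
    # u = A^-1 x
    u = [sum(A_inv[i][j] * x[j] for j in range(d)) for i in range(d)]
    # Scalar denominator 1 + x^T A^-1 x
    denom = 1.0 + sum(x[i] * u[i] for i in range(d))
    # Because A_inv is symmetric, A^-1 x x^T A^-1 equals the outer product u u^T
    return [[A_inv[i][j] - u[i] * u[j] / denom for j in range(d)]
            for i in range(d)]


def ucb_score(A_inv, b, x, alpha):
    """Upper confidence bound for one arm: theta^T x + alpha * sqrt(x^T A^-1 x)."""
    d = len(x)
    theta = [sum(A_inv[i][j] * b[j] for j in range(d)) for i in range(d)]
    u = [sum(A_inv[i][j] * x[j] for j in range(d)) for i in range(d)]
    mean = sum(theta[i] * x[i] for i in range(d))
    width = math.sqrt(sum(x[i] * u[i] for i in range(d)))
    return mean + alpha * width


# Hypothetical usage: start from the identity matrix and observe one context.
A_inv = [[1.0, 0.0], [0.0, 1.0]]
A_inv = sherman_morrison_update(A_inv, [1.0, 2.0])
```

In this sketch the update touches only matrix-vector and outer products, which are the kind of regular operations that vector hardware tends to accelerate well.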
Taxonomy
Topics: Neural Networks and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
