Exploration of TPUs for AI Applications
Diego Sanmartín Carrión, Vera Prohaska

TL;DR
This paper explores the architecture, applications, and performance of TPUs in cloud and edge AI computing, highlighting their advantages and the need for further optimization and benchmarking standards.
Contribution
It provides a comprehensive overview of TPU design, compares their performance with other chips, and emphasizes the importance of optimization and benchmarking for edge AI deployment.
Findings
TPUs offer significant performance improvements in cloud and edge AI.
Performance varies across chip architectures, and the comparison favors TPUs.
Further research is needed for optimization and benchmarking standards.
Abstract
Tensor Processing Units (TPUs) are specialized hardware accelerators for deep learning developed by Google. This paper explores TPUs in cloud and edge computing, focusing on their applications in AI. We provide an overview of TPUs: their general architecture, their design in relation to neural networks, compilation techniques, and supporting frameworks. Furthermore, we provide a comparative analysis of Cloud and Edge TPU performance against counterpart chip architectures. Our results show that TPUs can provide significant performance improvements in both cloud and edge computing. Additionally, this paper underscores the need for further research into optimization techniques for efficient deployment of AI architectures on the Edge TPU, and into benchmarking standards for a more robust comparative analysis in edge computing scenarios. The primary motivation behind…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Stochastic Gradient Optimization Techniques · Quantum Computing Algorithms and Architecture
Methods: Focus · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
