Exploration of TPUs for AI Applications

Diego Sanmartín Carrión; Vera Prohaska

arXiv:2309.08918·cs.AR·November 15, 2023·1 cites

Open Access

TL;DR

This paper explores the architecture, applications, and performance of TPUs in cloud and edge AI computing, highlighting their advantages and the need for further optimization and benchmarking standards.

Contribution

It provides a comprehensive overview of TPU design, compares TPU performance with that of other chip architectures, and emphasizes the importance of optimization and benchmarking for edge AI deployment.

Findings

01

TPUs offer significant performance improvements in cloud and edge AI.

02

Performance varies across chip architectures, and the comparison favors TPUs over their counterpart chips.

03

Further research is needed on optimization techniques for deploying AI architectures on the Edge TPU and on benchmarking standards for edge computing.

Abstract

Tensor Processing Units (TPUs) are specialized hardware accelerators for deep learning developed by Google. This paper explores TPUs in cloud and edge computing, focusing on their applications in AI. We provide an overview of TPUs: their general architecture, their design in relation to neural networks, compilation techniques, and supporting frameworks. Furthermore, we provide a comparative analysis of Cloud and Edge TPU performance against counterpart chip architectures. Our results show that TPUs can provide significant performance improvements in both cloud and edge computing. Additionally, this paper underscores the need for further research into optimization techniques for efficient deployment of AI architectures on the Edge TPU, and into benchmarking standards for a more robust comparative analysis in edge computing scenarios. The primary motivation behind…

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Stochastic Gradient Optimization Techniques · Quantum Computing Algorithms and Architecture

Methods: Focus · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings