Advances in Transformers for Robotic Applications: A Review
Nikunj Sanghai, Nik Bear Brown

TL;DR
This review paper discusses recent advances and trends in applying Transformer architectures to robotics, highlighting their integration into perception, planning, control, and reinforcement learning for autonomous systems.
Contribution
It provides a comprehensive overview of Transformer adaptations in robotics, covering recent research, applications, challenges, and future directions.
Findings
Transformers enhance robotic perception and decision-making.
Integration with Deep Reinforcement Learning improves autonomous system performance.
Transformers are adapted for reliable planning and human-robot interaction.
Abstract
The introduction of the Transformer architecture has brought about significant breakthroughs in Deep Learning (DL), particularly within Natural Language Processing (NLP). Since their inception, Transformers have outperformed many traditional neural network architectures due to their "self-attention" mechanism and their scalability across various applications. In this paper, we cover the use of Transformers in Robotics. We go through recent advances and trends in Transformer architectures and examine their integration into robotic perception, planning, and control for autonomous systems. Furthermore, we review past work and recent research on the use of Transformers in Robotics as pre-trained foundation models and the integration of Transformers with Deep Reinforcement Learning (DRL) for autonomous systems. We discuss how different Transformer variants are being adapted in robotics for reliable…
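For readers unfamiliar with the "self-attention" mechanism the abstract refers to, the following is a minimal sketch of single-head scaled dot-product self-attention using only the Python standard library. It is illustrative and not taken from the paper: the function names (`softmax`, `matmul`, `transpose`, `self_attention`), the projection matrices `w_q`, `w_k`, `w_v`, and the toy input are all assumptions chosen for the example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x is an (n_tokens x d_model) matrix; w_q, w_k, w_v are hypothetical
    (d_model x d_k) projection matrices. Returns (output, attention_weights).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each output token is a weighted average of the value vectors.
    return matmul(weights, v), weights


# Toy input: three tokens with two features each, identity projections.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out, weights = self_attention(tokens, identity, identity, identity)
print(out)
```

Each row of `weights` sums to 1, so every output token is a convex combination of the value vectors. Practical Transformer implementations add multiple heads, learned projections, masking, and positional information on top of this core step.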
Taxonomy
Topics: Robot Manipulation and Learning · Soft Robotics and Applications
Methods: Linear Layer · Dropout · Attention Is All You Need · Dense Connections · Byte Pair Encoding · Multi-Head Attention · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Label Smoothing
