PathFormer: A Transformer with 3D Grid Constraints for Digital Twin Robot-Arm Trajectory Generation
Ahmed Alanazi, Duy Ho, and Yugyung Lee

TL;DR
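PathFormer is a transformer for robot-arm trajectory generation that decodes under 3D grid constraints, so generated paths stay lattice-adjacent and within workspace bounds. It reports 89.44% stepwise accuracy and, on an xArm Lite 6 with a depth-camera digital twin, 86.7% end-to-end success across 60 language-specified tasks in cluttered scenes.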
Contribution
It proposes a path-based transformer that pairs a 3-grid (where/what/when) motion representation with constraint-masked decoding, which enforces lattice-adjacent moves and workspace bounds during trajectory generation.
Findings
Achieves 89.44% stepwise accuracy in trajectory prediction, with 99.99% of paths legal by construction.
Attains up to 97.5% reach and 92.5% pick success in controlled tests on an xArm Lite 6.
Completes 86.7% of 60 language-specified tasks end to end in cluttered scenes.
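This summary does not define how the stepwise accuracy figure above is computed. One common reading is the fraction of ground-truth steps that the predicted path matches position by position. The sketch below illustrates that reading only; the function name `stepwise_accuracy` and the cell-tuple encoding are hypothetical, not the authors' code.

```python
from typing import Sequence, Tuple

Cell = Tuple[int, int, int]  # hypothetical encoding: one lattice cell per step


def stepwise_accuracy(
    pred_paths: Sequence[Sequence[Cell]],
    true_paths: Sequence[Sequence[Cell]],
) -> float:
    """Fraction of ground-truth steps matched at the same position.

    Assumption: steps are compared index by index, and any ground-truth
    step missing from a shorter prediction counts as an error.
    """
    matched = 0
    total = 0
    for pred, true in zip(pred_paths, true_paths):
        total += len(true)
        matched += sum(p == t for p, t in zip(pred, true))
    return matched / total if total else 0.0


# Two toy paths: the first diverges at its last step, the second is exact.
predicted = [[(0, 0, 0), (1, 0, 0), (1, 1, 0)], [(0, 0, 0), (0, 1, 0)]]
reference = [[(0, 0, 0), (1, 0, 0), (2, 0, 0)], [(0, 0, 0), (0, 1, 0)]]
score = stepwise_accuracy(predicted, reference)
```

Here four of the five reference steps match, so `score` is 0.8.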
Abstract
Robotic arms require precise, task-aware trajectory planning, yet sequence models that ignore motion structure often yield invalid or inefficient executions. We present a Path-based Transformer that encodes robot motion with a 3-grid (where/what/when) representation and constraint-masked decoding, enforcing lattice-adjacent moves and workspace bounds while reasoning over task graphs and action order. Trained on 53,755 trajectories (80% train / 20% validation), the model aligns closely with ground truth -- 89.44% stepwise accuracy, 93.32% precision, 89.44% recall, and 90.40% F1 -- with 99.99% of paths legal by construction. Compiled to motor primitives on an xArm Lite 6 with a depth-camera digital twin, it attains up to 97.5% reach and 92.5% pick success in controlled tests, and 86.7% end-to-end success across 60 language-specified tasks in cluttered scenes, absorbing slips and…
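The abstract describes constraint-masked decoding as enforcing lattice-adjacent moves and workspace bounds. A minimal sketch of the general idea follows, assuming a six-move axis-aligned action vocabulary and greedy selection. The names `MOVES`, `legal_moves`, and `masked_greedy_step` are hypothetical, and the paper's actual vocabulary, adjacency definition, and decoder may differ.

```python
import math
from typing import Dict, List, Tuple

Cell = Tuple[int, int, int]

# Assumed action vocabulary: six axis-aligned unit moves on the lattice.
MOVES: Dict[str, Cell] = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}


def legal_moves(cell: Cell, grid_shape: Cell) -> List[str]:
    """Return the moves whose target cell stays inside the workspace grid."""
    legal = []
    for name, (dx, dy, dz) in MOVES.items():
        nxt = (cell[0] + dx, cell[1] + dy, cell[2] + dz)
        if all(0 <= c < n for c, n in zip(nxt, grid_shape)):
            legal.append(name)
    return legal


def masked_greedy_step(
    logits: Dict[str, float], cell: Cell, grid_shape: Cell
) -> Tuple[str, Cell]:
    """Pick the highest-scoring legal move and return it with the next cell."""
    allowed = set(legal_moves(cell, grid_shape))
    if not allowed:
        raise ValueError("no legal move from this cell")
    # Illegal moves get -inf so they can never be selected.
    masked = {
        name: (score if name in allowed else -math.inf)
        for name, score in logits.items()
    }
    best = max(masked, key=lambda name: masked[name])
    dx, dy, dz = MOVES[best]
    return best, (cell[0] + dx, cell[1] + dy, cell[2] + dz)


# At the grid corner, "-x" has the top raw score but would leave the workspace.
raw_scores = {"+x": 0.0, "-x": 5.0, "+y": 2.0, "-y": 0.0, "+z": 0.0, "-z": 0.0}
move, next_cell = masked_greedy_step(raw_scores, (0, 0, 0), (3, 3, 3))
```

In this toy case the mask removes `-x`, `-y`, and `-z`, so the step selects `+y` and lands on `(0, 1, 0)`. Because every candidate is filtered before selection, each emitted step is legal by construction, which is the property the abstract attributes to the decoder.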
Taxonomy
Topics: Robot Manipulation and Learning · Robotic Path Planning Algorithms · Robotic Mechanisms and Dynamics
