VGQ-CNN: Moving Beyond Fixed Cameras and Top-Grasps for Grasp Quality Prediction
A. Konrad, J. McDonald, R. Villing

TL;DR
VGQ-CNN is a grasp quality prediction network for 6-DOF grasps that generalizes across a wide range of camera poses, enabling more flexible robotic grasping (including on mobile robots) without retraining.
Contribution
The paper introduces VGQ-CNN, a novel grasp quality prediction network that explicitly incorporates grasp orientation as input, allowing evaluation of 6-DOF grasps across diverse camera viewpoints.
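To make the 4-DOF versus 6-DOF distinction concrete, here is a minimal sketch in Python. The class names, the roll/pitch/yaw parameterization, and the `lift_top_grasp` helper are illustrative assumptions and not the paper's actual input encoding. The sketch only shows that a top-grasp fixes two of the three rotational degrees of freedom, while a 6-DOF grasp leaves all of them free.

```python
from dataclasses import dataclass, fields
import math


@dataclass(frozen=True)
class Grasp4DOF:
    """Top-grasp: 3D position plus one rotation about the approach axis."""
    x: float
    y: float
    z: float
    angle: float  # radians, rotation about the camera's viewing axis


@dataclass(frozen=True)
class Grasp6DOF:
    """Full grasp pose: 3D position plus a free 3D orientation.

    Roll/pitch/yaw is a hypothetical choice for illustration only.
    """
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


def lift_top_grasp(g: Grasp4DOF) -> Grasp6DOF:
    """Embed a 4-DOF top-grasp in the 6-DOF space.

    Assumes the approach axis coincides with the camera's viewing axis,
    so roll and pitch are zero and only yaw varies.
    """
    return Grasp6DOF(g.x, g.y, g.z, roll=0.0, pitch=0.0, yaw=g.angle)


top = Grasp4DOF(0.1, 0.2, 0.5, math.pi / 2)
full = lift_top_grasp(top)
```

Under this parameterization, every top-grasp is a special case of a 6-DOF grasp, which is consistent with a 6-DOF network also being evaluated on the overhead-camera, top-grasp setting.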
Findings
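Achieves 82.1% balanced accuracy on the VG-dset test split.
Generalizes to a wide range of camera poses without retraining.
Remains competitive for overhead cameras and top-grasps: 74.2% balanced accuracy versus 76.6% for GQ-CNN.
Fast inference: 128 grasp quality predictions in 12 ms on a CPU (about 0.09 ms per prediction on average).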
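The headline numbers are reported as balanced accuracy. As a reminder of what that metric measures, the sketch below computes it for binary labels as the mean of the true positive rate and the true negative rate. This is the standard definition, not code from the paper, and the function name and toy data are illustrative assumptions.

```python
def balanced_accuracy(labels, predictions):
    """Mean of true positive rate and true negative rate for binary labels (0/1)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in labels")
    tpr = tp / (tp + fn)  # recall on successful grasps
    tnr = tn / (tn + fp)  # recall on failed grasps
    return 0.5 * (tpr + tnr)


# Toy data: 2 positive and 8 negative grasps, classifier always predicts "fail".
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_predictions = [0] * 10

plain = sum(y == p for y, p in zip(toy_labels, toy_predictions)) / len(toy_labels)
balanced = balanced_accuracy(toy_labels, toy_predictions)
```

On the toy data, plain accuracy is 0.8 while balanced accuracy is 0.5, which shows why the balanced metric is the more informative one when successful and failed grasps are unevenly represented.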
Abstract
We present the Versatile Grasp Quality Convolutional Neural Network (VGQ-CNN), a grasp quality prediction network for 6-DOF grasps. VGQ-CNN can be used when evaluating grasps for objects seen from a wide range of camera poses or mobile robots without the need to retrain the network. By defining the grasp orientation explicitly as an input to the network, VGQ-CNN can evaluate 6-DOF grasp poses, moving beyond the 4-DOF grasps used in most image-based grasp evaluation methods like GQ-CNN. To train VGQ-CNN, we generate the new Versatile Grasp dataset (VG-dset) containing 6-DOF grasps observed from a wide range of camera poses. VGQ-CNN achieves a balanced accuracy of 82.1% on our test-split while generalising to a variety of camera poses. Meanwhile, it achieves competitive performance for overhead cameras and top-grasps with a balanced accuracy of 74.2% compared to GQ-CNN's 76.6%. We also…
