Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning