SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation