Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze