Proposal Report for the 2nd SciCAP Competition 2024
Pengpeng Li, Tingmin Li, Jingyuan Wang, Boyuan Wang, Yang Yang

TL;DR
This paper presents a document summarization method that uses auxiliary information, namely high-quality OCR data and text extracted from the original document, to summarize the content describing specific images, tables, and appendices. It achieved the top scores in both tracks of the 2024 SciCAP competition.
Contribution
It introduces an auxiliary-information-based summarization approach and adds auxiliary branches to popular text generation models, which improves summarization performance on scientific captioning tasks.
Findings
Achieved top scores of 4.33 and 4.66 in the long caption and short caption tracks, respectively, of the 2024 SciCAP competition.
Effectively leverages OCR data and auxiliary information for summarization.
Enhanced models outperform baseline methods in competition.
Abstract
In this paper, we propose a method for document summarization using auxiliary information. This approach effectively summarizes descriptions related to specific images, tables, and appendices within lengthy texts. Our experiments demonstrate that leveraging high-quality OCR data and initially extracted information from the original text enables efficient summarization of the content related to described objects. Based on these findings, we enhanced popular text generation models by incorporating additional auxiliary branches to improve summarization performance. Our method achieved top scores of 4.33 and 4.66 in the long caption and short caption tracks, respectively, of the 2024 SciCAP competition, ranking highest in both categories.
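The abstract does not give implementation details, so the following is only a minimal sketch of how OCR data and text extracted from the original document might be combined into a single input for a text generation model. The function names (`find_mentions`, `build_model_input`), the `target:/ocr:/context:` layout, and the `max_chars` limit are all hypothetical. The sketch also does not model the auxiliary branches the authors describe.

```python
import re


def find_mentions(paragraphs, label):
    """Return the paragraphs that refer to `label`, e.g. 'Figure 3' or 'Table 2'.

    Hypothetical helper: the paper does not say how mentions are extracted.
    """
    kind, number = label.split()
    if kind.lower().startswith("fig"):
        # Accept 'Figure 3', 'Fig. 3', and 'Fig 3'; the trailing \b rejects 'Figure 31'.
        pattern = rf"\bfig(?:ure|\.)?\s*{re.escape(number)}\b"
    else:
        pattern = rf"\b{re.escape(kind)}\s*{re.escape(number)}\b"
    rx = re.compile(pattern, re.IGNORECASE)
    return [p for p in paragraphs if rx.search(p)]


def build_model_input(paragraphs, label, ocr_tokens, max_chars=2000):
    """Combine OCR text and mentioning paragraphs into one prompt string.

    The 'target/ocr/context' layout is an assumption made for illustration.
    """
    context = " ".join(find_mentions(paragraphs, label))[:max_chars]
    ocr_text = " ".join(ocr_tokens)
    return f"target: {label}\nocr: {ocr_text}\ncontext: {context}"
```

Under these assumptions, the resulting string would be passed to whichever summarization model is used. Only paragraphs that mention the target object are kept, so the model sees a short, relevant context instead of the full document.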
Taxonomy
Topics: Biomedical and Engineering Education · Innovation Policy and R&D
