Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach
Feihu Jiang, Chuan Qin, Jingshuai Zhang, Kaichun Yao, Xi Chen, Dazhong Shen, Chen Zhu, Hengshu Zhu, Hui Xiong

TL;DR
This paper introduces ERU, a multi-modal transformer model that efficiently captures hierarchical and multi-granular information in resumes for automatic structured data extraction, outperforming traditional methods.
Contribution
The paper presents a novel multi-modal, layout-aware transformer with self-supervised pre-training and multi-granularity fine-tuning for improved resume understanding.
Findings
ERU significantly outperforms baseline models on real-world datasets.
The multi-modal fusion approach effectively integrates textual, visual, and layout information.
Self-supervised pre-training enhances the model's ability to understand resume structures.
Abstract
In the contemporary era of widespread online recruitment, resume understanding has been widely acknowledged as a fundamental and crucial task, which aims to extract structured information from resume documents automatically. Compared to the traditional rule-based approaches, the utilization of recently proposed pre-trained document understanding models can greatly enhance the effectiveness of resume understanding. The present approaches have, however, disregarded the hierarchical relations within the structured information presented in resumes, and have difficulty parsing resumes in an efficient manner. To this end, in this paper, we propose a novel model, namely ERU, to achieve efficient resume understanding. Specifically, we first introduce a layout-aware multi-modal fusion transformer for encoding the segments in the resume with integrated textual, visual, and layout information.…
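The abstract describes encoding each resume segment from three sources: textual, visual, and layout information. As a rough illustration of what such a per-segment input could look like, the sketch below builds one fused vector per segment. It is not the ERU architecture: plain concatenation stands in for the learned fusion transformer, and the `Segment` class, the `layout_features` and `fuse_segment` helpers, and the six normalized bounding-box features are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One resume segment (hypothetical container, not from the paper)."""
    text_vec: List[float]    # assumed: embedding of the segment's text
    visual_vec: List[float]  # assumed: embedding of the segment's image crop
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page pixels


def layout_features(bbox: Tuple[float, float, float, float],
                    page_w: float, page_h: float) -> List[float]:
    """Normalize a bounding box to page-relative coordinates plus width and height."""
    x0, y0, x1, y1 = bbox
    return [
        x0 / page_w, y0 / page_h,   # top-left corner
        x1 / page_w, y1 / page_h,   # bottom-right corner
        (x1 - x0) / page_w,         # relative width
        (y1 - y0) / page_h,         # relative height
    ]


def fuse_segment(seg: Segment, page_w: float, page_h: float) -> List[float]:
    """Concatenate text, visual, and layout vectors (a stand-in for learned fusion)."""
    return seg.text_vec + seg.visual_vec + layout_features(seg.bbox, page_w, page_h)


# Toy usage with made-up numbers.
example = Segment(text_vec=[0.2, -0.1, 0.7],
                  visual_vec=[0.5, 0.5],
                  bbox=(100.0, 200.0, 500.0, 400.0))
fused = fuse_segment(example, page_w=1000.0, page_h=2000.0)
```

In a real model the three vectors would typically be projected to a shared dimension and mixed by attention layers; concatenation is used here only to keep the example dependency-free.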
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Topic Modeling · Natural Language Processing Techniques
Methods: Depthwise Convolution · Depthwise Separable Convolution · Dilated Convolution · 1x1 Convolution · Convolution · Grouped Convolution · Sigmoid Activation · Depthwise Dilated Separable Convolution · Pointwise Convolution · Hierarchical Feature Fusion
