Learning to Match Job Candidates Using Multilingual Bi-Encoder BERT
Dor Lavi

TL;DR
This paper presents a multilingual BERT-based bi-encoder model trained on Randstad's candidate placement data to improve CV-vacancy matching, addressing language barriers and vocabulary gaps for scalable recruitment solutions.
Contribution
It introduces a multilingual BERT model, fine-tuned in a bi-encoder structure with a cosine similarity log loss, for scalable, language-agnostic candidate-job matching.
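As a rough illustration of why a bi-encoder scales: CVs and vacancies are embedded independently, so vacancy vectors can be computed once and reused, and matching reduces to a cosine-similarity ranking. The sketch below assumes the embeddings already exist; the toy vectors, the vacancy names, and the `rank_vacancies` helper are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_vacancies(
    cv_vec: Sequence[float],
    vacancy_vecs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (vacancy_id, score) pairs, best match first."""
    scored = [(vid, cosine(cv_vec, vec)) for vid, vec in vacancy_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for encoder outputs (assumption:
# a real encoder would produce much higher-dimensional embeddings).
vacancies = {
    "nurse": [0.9, 0.1, 0.0],
    "driver": [0.0, 0.2, 0.9],
}
cv = [0.8, 0.3, 0.1]

ranking = rank_vacancies(cv, vacancies)
```

In a real pipeline the dictionary of vacancy vectors would typically live in a vector index, so only the incoming CV needs to be encoded at query time.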
Findings
Enhanced semantic understanding of CVs and vacancies.
Effective bridging of vocabulary gaps across languages.
Potential reduction of bias through multilingual transformers.
Abstract
In this talk, we will show how we used Randstad's history of candidate placements to generate a labeled dataset of CV-vacancy pairs. We then fine-tune a multilingual BERT model in a bi-encoder structure on this dataset, adding a cosine similarity log loss layer. We will explain how this structure helps us overcome most of the challenges described above, and how it enables us to build a maintainable and scalable pipeline for matching CVs and vacancies. In addition, we show how we gain a better semantic understanding and learn to bridge the vocabulary gap. Finally, we highlight how multilingual transformers help us handle cross-language barriers and might reduce discrimination.
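The abstract does not spell out the form of the cosine similarity log loss, so the following is only a minimal sketch of one plausible reading: the cosine similarity of the CV and vacancy embeddings is rescaled from [-1, 1] to [0, 1], treated as a match probability, and scored with binary cross-entropy against the placement label. The rescaling, the `eps` clamp, and the function names are assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_log_loss(
    cv_vec: Sequence[float],
    vacancy_vec: Sequence[float],
    label: int,
    eps: float = 1e-7,
) -> float:
    """Binary log loss on a cosine-derived match probability.

    label is 1 for a placed (matching) pair and 0 for a non-matching pair.
    """
    # Assumption: map cosine from [-1, 1] to [0, 1] to read it as a probability.
    p = (cosine_similarity(cv_vec, vacancy_vec) + 1.0) / 2.0
    # Clamp away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Orthogonal embeddings give p = 0.5, so the loss equals ln(2) for either label.
loss_orthogonal = cosine_log_loss([1.0, 0.0], [0.0, 1.0], label=1)
# Identical embeddings with a positive label give a loss close to zero.
loss_match = cosine_log_loss([1.0, 2.0], [1.0, 2.0], label=1)
```

During fine-tuning, a loss of this kind would be averaged over a batch of labeled pairs and backpropagated through both encoder passes, pulling placed pairs together and pushing non-matching pairs apart.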
Taxonomy
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Linear Warmup With Linear Decay · Weight Decay · Adam · Residual Connection · Multi-Head Attention · Softmax
