Evaluating Token-Level and Passage-Level Dense Retrieval Models for Math Information Retrieval

Wei Zhong; Jheng-Hong Yang; Yuqing Xie; and Jimmy Lin

arXiv:2203.11163·cs.IR·October 24, 2022



TL;DR

This paper evaluates token-level and passage-level dense retrieval models for math information retrieval, demonstrating their complementarity to traditional methods and advancing state-of-the-art performance.

Contribution

It introduces a combined approach using structure search and dense retrieval models for improved math formula search.

Findings

01. Bi-encoder models complement structure search methods.

02. Dense retrieval models improve the MIR state of the art.

03. Token-level and passage-level models are both effective.

Abstract

With the recent success of dense retrieval methods based on bi-encoders, studies have applied this approach to various interesting downstream retrieval tasks with good efficiency and in-domain effectiveness. Recently, we have also seen the presence of dense retrieval models in Math Information Retrieval (MIR) tasks, but the most effective systems remain classic retrieval methods that consider hand-crafted structure features. In this work, we try to combine the best of both worlds: a well-defined structure search method for effective formula search and efficient bi-encoder dense retrieval models to capture contextual similarities. Specifically, we have evaluated two representative bi-encoder models for token-level and passage-level dense retrieval on recent MIR tasks. Our results show that bi-encoder models are highly complementary to existing structure search methods, and we are able…
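To make the token-level versus passage-level distinction concrete, the sketch below contrasts the two scoring styles commonly used by bi-encoders: a single dot product between one query vector and one passage vector, and a late-interaction sum of per-token maximum similarities. The abstract does not specify the scoring functions or how dense scores are merged with structure search, so the function names, the MaxSim form, and the linear interpolation with weight `alpha` are all illustrative assumptions, not the authors' method.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def passage_level_score(query_vec: Vector, passage_vec: Vector) -> float:
    """Passage-level scoring: one vector per query and one per passage."""
    return dot(query_vec, passage_vec)


def token_level_score(query_tokens: Sequence[Vector],
                      passage_tokens: Sequence[Vector]) -> float:
    """Token-level scoring (assumed late-interaction form): for each query
    token vector, take its best match among passage token vectors, then sum."""
    return sum(max(dot(q, p) for p in passage_tokens) for q in query_tokens)


def interpolate(structure_score: float, dense_score: float,
                alpha: float = 0.5) -> float:
    """Hypothetical fusion rule: weighted sum of a structure search score
    and a dense retrieval score. The weight alpha is an assumption."""
    return alpha * structure_score + (1 - alpha) * dense_score


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, made up for illustration only.
    print(passage_level_score([1.0, 0.0], [0.5, 0.5]))
    print(token_level_score([[1.0, 0.0], [0.0, 1.0]],
                            [[1.0, 0.0], [0.5, 0.5]]))
    print(interpolate(2.0, 4.0, alpha=0.25))
```

In practice the two score types live on different scales, so some normalization would likely be needed before any fusion; that step is omitted here.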


Code & Models

Repositories

approach0/math-dense-retrievers (official)


Taxonomy

Topics: Handwritten Text Recognition Techniques · Mathematics, Computing, and Information Processing · Natural Language Processing Techniques